Notes on TheSchwartz

I've been playing around with SixApart's TheSchwartz for the last few days. TheSchwartz is a lightweight reliable job queue, typically used for handling relatively high latency jobs that you don't want to try and handle from a web process e.g. for sending out emails, placing orders into some external system, etc. Basically interacting with anything which might be down or slow or which you don't really need right away.

Actually, TheSchwartz is a job queue library rather than a job queue system, so some assembly is required. Like most Danga/SixApart software, it's lightweight, performant, and well-designed, but also pretty light on documentation. If you're not comfortable reading the (perl) source, it might be a challenging environment to setup.

Notes from the last few days:

  • Don't use the version on CPAN, get the latest code from subversion instead. At the moment the CPAN version is 1.04, but current svn is at 1.07, and has some significant additional functionality.

  • Conceptually TheSchwartz is very simple - jobs with opaque function names and arguments are inserted into a database for workers with a particular 'ability'; workers periodically check the database for jobs matching the abilities they have, and grab and execute them. Jobs that succeed are marked completed and removed from the queue; jobs that fail are logged and left on the queue to be retried after some time period up to a configurable number of retries.

  • TheSchwartz has two kinds of clients - those that submit jobs, and workers that perform jobs. Both are considered clients, which is confusing if you're thinking in terms of client-server interaction. TheSchwartz considers both sides to be clients.

  • There are three main classes to deal with: TheSchwartz, which is the main client functionality class; TheSchwartz::Job, which models the jobs that are submitted to the job queue; and TheSchwartz::Worker, which is a role-type class modelling a particular ability that a worker is able to perform.

  • New worker abilities are defined by subclassing TheSchwartz::Worker and defining your new functionality in a work() method. work() receives the job object from the queue as its only argument and does its stuff, marking the job as completed or failed after processing. A useful real example worker is TheSchwartz::Worker::SendEmail (also by Brad Fitzpatrick, and available on CPAN) for sending emails from TheSchwartz.

  • Depending on your application, it may make sense for workers to just have a single ability, or for them to have multiple abilities and service more than one type of job. In the latter case, TheSchwartz tries to use unused abilities whenever it can to avoid certain kinds of jobs getting starved.

  • You can also subclass TheSchwartz itself to modify the standard functionality, and I've found that useful where I've wanted more visibility of what workers are doing that you get out of the box. You don't appear at this point to be able to subclass TheSchwartz::Job however - TheSchwartz always uses this as the class when autovivifying jobs for workers.

  • There are a bunch of other features I haven't played with yet, including job priorities, the ability to coalesce jobs into groups to be processed together, and the ability to delay jobs until a certain time.

I've actually been using it to setup a job queue system for a cluster, which is a slightly different application that it was intended for, but so far it's been working really well.

I'm still feeling like I'm still getting to grips with the breadth of things it could be used for though - more experimentation required. I'd be interested in hearing of examples of what people are using it for as well.

Recommended.

blog comments powered by Disqus