Using RabbitMQ as a distributed broker - how to serialize jobs per user


Each Job in my system belongs to a specific userid and can be put in rabbitmq from multiple sources. My requirements:

  • No more than one job should be running per user at any given time.
  • Jobs for other users should not experience any delay because of jobs piling up for a specific user.
  • Each job should be executed at least once. Each job has a maximum retry count and is re-queued (possibly after a delay) if it fails.
  • Maintaining the sequence of jobs (per user) is desirable but not compulsory.
  • Jobs should probably be persisted, since each must execute at least once. Jobs have no expiry time.
  • Any of the workers should be able to run jobs for any user.
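For illustration, a job meeting these requirements might be modeled roughly like this (field names and the retry bookkeeping are my own assumptions, not part of the question):

```python
from dataclasses import dataclass, field
import itertools

_ids = itertools.count(1)

@dataclass
class Job:
    user_id: str               # every job belongs to exactly one user
    payload: dict              # whatever work the job carries
    max_retries: int = 3       # re-queued on failure until this is exhausted
    retries: int = 0
    job_id: int = field(default_factory=lambda: next(_ids))

    def can_retry(self) -> bool:
        return self.retries < self.max_retries
```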

With these requirements, I think maintaining a single queue for each individual user makes sense. I would also need all workers to watch all user queues and execute a job for any user who does not currently have a job running anywhere (i.e., no more than one job per user).

Would this solution work with RabbitMQ in a cluster setup? Since the number of queues would be large, I am not sure whether having each worker watch every user queue would cause significant overhead. Any help is appreciated.

As @dectarin has mentioned, having multiple workers listen to multiple job queues will make it hard to ensure that only one job per user is being executed.

I think it'd work better if the jobs go through a couple of steps.

  1. User submits job
  2. Job gets queued per user until no other job for that user is running
  3. Coordinator puts job on the active job queue that is consumed by the workers
  4. Worker picks up the job and executes it
  5. Worker posts the results in a result queue
  6. The results get sent to the user
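The steps above can be simulated with plain in-memory queues (a rough sketch; in production the queues would be RabbitMQ queues and the coordinator a separate process, and the user names here are made up):

```python
from collections import deque

per_user = {"alice": deque(["job-a1", "job-a2"]), "bob": deque(["job-b1"])}
active_jobs = deque()   # step 3: consumed by the workers
results = deque()       # step 5: posted by the workers
running = set()         # users with a job currently in flight

def coordinate():
    # step 3: move at most one job per user onto the active queue
    for user, jobs in per_user.items():
        if jobs and user not in running:
            running.add(user)
            active_jobs.append((user, jobs.popleft()))

def work():
    # steps 4-5: a worker executes the job and posts a result
    while active_jobs:
        user, job = active_jobs.popleft()
        results.append((user, f"{job} done"))
        running.discard(user)

coordinate()
work()
coordinate()  # alice's second job only becomes active now
work()
```

Note that alice's second job never enters the active queue while her first one is still running, which is exactly the one-job-per-user constraint.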

I don't know how jobs are submitted to the system, so it's hard to tell whether actual per-user message queues would be the best way to hold the waiting jobs. If the jobs already sit in a mailbox, that might work as well, for instance. Or store the queued jobs in a database; as a bonus, that would let you write a small front end for users to inspect and manage their queued jobs.
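If you go the database route, a minimal table for queued jobs might look like this (a sketch using SQLite; the table and column names are my own assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE jobs (
        id      INTEGER PRIMARY KEY,
        user_id TEXT NOT NULL,
        payload TEXT NOT NULL,
        status  TEXT NOT NULL DEFAULT 'queued',  -- queued / active / done
        retries INTEGER NOT NULL DEFAULT 0
    )
""")
conn.execute("INSERT INTO jobs (user_id, payload) VALUES (?, ?)",
             ("alice", "resize-image"))
conn.commit()

# A front end could list a user's pending work with a simple query:
rows = conn.execute(
    "SELECT id, payload, status FROM jobs WHERE user_id = ?", ("alice",)
).fetchall()
```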

Depending on what you choose, you can then find an elegant way to coordinate the single job per user constraint.

For instance, if the jobs sit in a database, the database keeps things synchronised and multiple coordinator workers could go through the following loop:

while( true ) {
    if incoming job present for any user {
        pick up first job from queue
        put job in database, marking it active if no other active job is present
        if job was marked active {
            put job on active job queue
        }
    }
    if result is present for any user {
        pick up first result from result queue
        send results to user
        mark job as done in database
        if this user has job waiting in database, mark it as active
        if job was marked active {
            put job on active job queue
        }
    }
}
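Translated into Python, one pass of that loop might look like the sketch below, using SQLite as the job store and a deque standing in for the RabbitMQ active-job queue (helper names and the status values are assumptions):

```python
import sqlite3
from collections import deque

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, user_id TEXT, status TEXT)")
active_queue = deque()  # stands in for the RabbitMQ active-job queue

def handle_incoming(job_id, user_id):
    # put job in database, marking it active if no other active job is present
    has_active = conn.execute(
        "SELECT 1 FROM jobs WHERE user_id=? AND status='active'", (user_id,)
    ).fetchone()
    status = "waiting" if has_active else "active"
    conn.execute("INSERT INTO jobs VALUES (?,?,?)", (job_id, user_id, status))
    if status == "active":
        active_queue.append(job_id)

def handle_result(job_id, user_id):
    # mark job done; promote the user's oldest waiting job, if any
    conn.execute("UPDATE jobs SET status='done' WHERE id=?", (job_id,))
    row = conn.execute(
        "SELECT id FROM jobs WHERE user_id=? AND status='waiting' ORDER BY id LIMIT 1",
        (user_id,),
    ).fetchone()
    if row:
        conn.execute("UPDATE jobs SET status='active' WHERE id=?", (row[0],))
        active_queue.append(row[0])

handle_incoming(1, "alice")   # alice idle -> goes straight to the active queue
handle_incoming(2, "alice")   # alice busy -> waits in the database
handle_result(1, "alice")     # job 1 done -> job 2 promoted to active
```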

Or, if the waiting jobs sit in per-user message queues, transactions will be easier, and a single coordinator going through the loop won't need to worry about multi-threading.

Making things fully transactional across the database and the queues may be hard, but it need not be necessary. Introducing a pending state should allow you to err on the side of caution, making sure no jobs get lost if a step fails.
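As an illustration of that pending state (a sketch; names and status values are assumptions): record the job as 'pending' before publishing, and only flip it to 'queued' afterwards, so a recovery pass can re-publish anything that got stuck in 'pending'. A crash between publish and confirm may publish the job twice on recovery, which matches the at-least-once requirement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT)")

def submit(job_id, publish):
    # 1. record intent before touching the queue
    conn.execute("INSERT INTO jobs VALUES (?, 'pending')", (job_id,))
    conn.commit()
    # 2. publish to the broker (the process may crash here)
    publish(job_id)
    # 3. confirm: only now is the job considered safely queued
    conn.execute("UPDATE jobs SET status='queued' WHERE id=?", (job_id,))
    conn.commit()

published = []
submit(1, published.append)

# A recovery pass would re-publish any job still marked 'pending':
stuck = conn.execute("SELECT id FROM jobs WHERE status='pending'").fetchall()
```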