I am currently working on software that uses RabbitMQ to deliver messages to workers. In the basic case this is easy: you use a single direct exchange with a single queue, and all the workers consume from that queue. Job done.
But now things start to get complicated. I have two additional requirements that make me think:
- The messages are not all equal, i.e. messages have a "flag", which is used to group messages. Now the first requirement is that all messages with the same "flag" are processed in-order.
- It should be possible to "pause" processing of all messages with a specific flag.
The first one is quite easy to solve: you simply have to make sure that messages with the same flag are always processed by the same worker, and set that worker's prefetch to 1. To ensure this, you can use an x-consistent-hash exchange.
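To illustrate the idea of pinning a flag to one queue, here is a minimal sketch of a stable flag-to-queue mapping. It is not the algorithm the x-consistent-hash exchange uses internally (a plain hash-modulo is swapped in for brevity), and `queue_for_flag`, `NUM_QUEUES`, and the `work.N` queue names are made up for the example. The only point is that the same flag always lands in the same queue, so a single worker on that queue sees the flag's messages in order.

```python
import hashlib

NUM_QUEUES = 8  # hypothetical number of worker queues


def queue_for_flag(flag: str, num_queues: int = NUM_QUEUES) -> str:
    """Map a flag to one of a fixed set of queues, deterministically."""
    # Use a stable digest: Python's built-in hash() is salted per process,
    # so it would give different results in different workers.
    digest = hashlib.sha256(flag.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % num_queues
    return f"work.{index}"
```

Several flags share a queue here, which keeps the queue count fixed, but it also means a queue can no longer be paused for one flag alone.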
The second one is also easy to solve: you simply use a separate queue for each flag; then you can stop processing a given queue.
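The pausing behaviour can be sketched with an in-memory model. This is not a RabbitMQ client; `FlagQueues` and its methods are hypothetical names that only show how "one queue per flag" makes pausing a matter of skipping that queue while its messages stay in place and in order.

```python
from collections import deque


class FlagQueues:
    """In-memory stand-in for one queue per flag (not a RabbitMQ client)."""

    def __init__(self):
        self.queues = {}     # flag -> deque of pending messages
        self.paused = set()  # flags whose queues are currently not consumed

    def publish(self, flag, message):
        self.queues.setdefault(flag, deque()).append(message)

    def pause(self, flag):
        self.paused.add(flag)

    def resume(self, flag):
        self.paused.discard(flag)

    def next_message(self):
        """Return (flag, message) from the first non-paused, non-empty queue, or None."""
        for flag, queue in self.queues.items():
            if flag not in self.paused and queue:
                return flag, queue.popleft()
        return None
```

With a real broker, pausing would probably mean cancelling the consumer on that flag's queue and resuming would mean consuming from it again; the messages stay queued in the meantime.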
Now, while this basically works, it introduces a few problems, and I'm not too sure how to solve them:
- If I set the prefetch option to 1, things become slower, as I can't process messages in parallel. This significantly decreases performance.
- If I use a separate queue for each flag, I end up with a high number of queues (> 10k). While this does not seem to be a problem for RabbitMQ to handle, I wonder whether it's a good idea.
- Additionally, I now have multiple worker "steps", i.e. once a message has been handled by a worker, the worker publishes it to RabbitMQ again for the next step, and the same thing starts over. This means that if I do not use separate instances of RabbitMQ for each step, I end up with lots of queues (n exchanges and n times 10k queues).
Is there a better way to solve this? If so, how? Any ideas, hints, …?
To process all messages with the same "flag" in order, I recommend setting up an exchange with multiple queues, where the exchange routes messages into the different queues based on the routing key (this is what the routing key is designed for). This way you don't need to set prefetch to 1, because each queue is dedicated to one routing key (as long as each queue has a single consumer, its messages are handled in the order they arrived).
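A minimal sketch of that topology, assuming a pika-style channel: the three calls used (`exchange_declare`, `queue_declare`, `queue_bind`) are the ones pika's channel offers, while `declare_flag_routing`, the `<exchange>.<flag>` queue naming, and `RecordingChannel` are made up for this example. The recording class only exists so the sketch can be run without a broker.

```python
def declare_flag_routing(channel, exchange, flags):
    """Declare a direct exchange and one queue per flag, bound by routing key."""
    channel.exchange_declare(exchange=exchange, exchange_type="direct", durable=True)
    queue_names = []
    for flag in flags:
        queue = f"{exchange}.{flag}"  # hypothetical naming scheme
        channel.queue_declare(queue=queue, durable=True)
        # The flag itself is the routing key, so every message published
        # with routing_key=flag ends up in that flag's queue.
        channel.queue_bind(queue=queue, exchange=exchange, routing_key=flag)
        queue_names.append(queue)
    return queue_names


class RecordingChannel:
    """Hypothetical stand-in that records calls instead of talking to a broker."""

    def __init__(self):
        self.calls = []

    def exchange_declare(self, **kwargs):
        self.calls.append(("exchange_declare", kwargs))

    def queue_declare(self, **kwargs):
        self.calls.append(("queue_declare", kwargs))

    def queue_bind(self, **kwargs):
        self.calls.append(("queue_bind", kwargs))
```

Against a real broker, the `channel` argument would be a channel obtained from a pika connection rather than `RecordingChannel`.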
If you have 10k+ "flags", you run into different problems. Having that many queues is possible, but not very maintainable. Could you perhaps split the traffic by priority, so that important flags are routed to a handful of "high priority" queues and all the remaining, non-important flags are routed into a single "low priority" queue?
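One way this split could look on the publishing side is sketched below. The function name `routing_key_for`, the `high.N` / `low` routing keys, and the default of four high priority queues are all assumptions for illustration; which flags count as important is something only the application can decide.

```python
import zlib


def routing_key_for(flag, important_flags, high_queues=4):
    """Pick a routing key: important flags are spread over a few queues, the rest share one."""
    if flag in important_flags:
        # crc32 is stable across processes, so a flag always maps to the same queue.
        shard = zlib.crc32(flag.encode("utf-8")) % high_queues
        return f"high.{shard}"
    return "low"
```

One trade-off worth noting: flags that share the low priority queue can no longer be paused individually by stopping a queue.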