I have a consumer service retrieving messages from a RabbitMQ queue using an EasyNetQ subscriber. Each message takes tens of seconds to process, and I need to run them in parallel to keep up with the producer. However, each message has a property, call it groupingId. It's important that messages with the same groupingId are not processed concurrently, as this causes resource collisions.
It's likely that there are many hundreds of groupingIds, and in normal operation not many messages share the same Id at any one time. However, the data can be bursty, leading to clusters of hundreds of messages with the same Id arriving at once.
I thought maybe TPL Dataflow might be a good fit, but I'm not that familiar with it, and not sure how to achieve what I need with it. Any guidance would be appreciated.
Create a dictionary that maps each grouping ID to a lock object, and lock on that object while processing a message.
First, create the dictionary somewhere, probably as a member variable.
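A minimal sketch of what that member might look like. It assumes groupingId is a string and that a `ConcurrentDictionary` is used so several consumer threads can touch it safely; the class and field names are made up for illustration.

```csharp
using System.Collections.Concurrent;

// Hypothetical consumer class; only the dictionary member is shown here.
public class MessageConsumer
{
    // Maps each groupingId (assumed to be a string) to the object used as its lock.
    // Entries are added on demand and, in this sketch, never removed.
    private readonly ConcurrentDictionary<string, object> _groupLocks =
        new ConcurrentDictionary<string, object>();
}
```

With many hundreds of groupingIds the dictionary stays small, so leaving entries in place is probably acceptable.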
When you need to process a message, use this logic.
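One way that logic could look, as a sketch rather than a definitive implementation. `GroupedProcessor`, `Process`, and the `work` delegate are hypothetical names, and groupingId is again assumed to be a string. The field repeats the dictionary described above so the example is complete on its own.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical wrapper that serializes work per groupingId.
public class GroupedProcessor
{
    // One lock object per groupingId, created the first time the id is seen.
    private readonly ConcurrentDictionary<string, object> _groupLocks =
        new ConcurrentDictionary<string, object>();

    // Runs 'work' while holding the lock for 'groupingId'. Calls with the same id
    // run one at a time; calls with different ids can run in parallel.
    public void Process(string groupingId, Action work)
    {
        // GetOrAdd returns the same stored object to every caller for a given key.
        object gate = _groupLocks.GetOrAdd(groupingId, _ => new object());

        lock (gate)
        {
            work();
        }
    }
}
```

Usage would be something like `processor.Process(message.GroupingId, () => Handle(message));`, where `message.GroupingId` and `Handle` stand in for your own message type and handler. Note that `lock` blocks the calling thread and cannot contain an `await`, so during a burst of one id the waiting threads sit idle. If your handler is asynchronous, a per-id `SemaphoreSlim(1, 1)` with `WaitAsync` would be the usual substitute.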