I am new to OpenMP, so it took me some time to figure out the right way to ask this question so that it is easier for experts to understand my queries. After a few previous attempts at formulating the problem, I think I have found the most parsimonious way to ask it:
Q: How can we implement a parallel set of FIFO task queues?
Each queue can execute in parallel with the others, but within a queue the tasks must execute in FIFO order, i.e. sequentially.
We basically need a master thread that feeds the FIFO queues and a pool of worker threads that pick tasks from these queues and execute them as threads become available.
Hopefully this is a better way to ask than pseudo-code examples.
I don't think OpenMP tasks are a good match for this, chiefly because OpenMP gives you no control over the order in which queued tasks are executed; the task priority clause is only a scheduling hint, so the per-queue FIFO requirement cannot be enforced.
Therefore I suggest you use a normal FIFO setup. You can still use OpenMP's thread-pool for this. Here is a quick outline of how it may look:
We start with a few placeholders for your sensor data. For functions I follow pthread's convention: return 0 on success, error code otherwise.
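A sketch of those placeholders; the names and fields (sensor_data_t, read_sensor_data, process_sensor_data) are made up here, so adjust them to your actual data:

```c
#include <stddef.h>

/* Hypothetical stand-in for one sensor reading. */
typedef struct {
    int sensor_id;
    double value;
} sensor_data_t;

/* Producer step: fetch one reading. Returns 0 on success,
 * an error code otherwise (pthread convention). */
int read_sensor_data(sensor_data_t *out)
{
    out->sensor_id = 0; /* real code would read from the device or file */
    out->value = 0.0;
    return 0;
}

/* Consumer step: handle one reading. Same return convention. */
int process_sensor_data(const sensor_data_t *in)
{
    (void)in; /* real code would do the per-sensor computation */
    return 0;
}
```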
For the FIFO I use a textbook blocking ring buffer. Blocking means we need condition variables, which (unless I'm mistaken) are not supported by OpenMP, so I use pthreads for this. We could instead go for spinlocks or lock-free designs. However, since there is a good chance that either the producer will outperform the consumer or vice versa, putting one side to sleep might boost the clock speed of the other side. Therefore I think a blocking / sleeping FIFO is good, at least as a baseline for benchmarking other approaches.
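A sketch of such a buffer. The function names are my own; the closed flag and the EPIPE return value anticipate the shutdown rules discussed further down:

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Element type is a stand-in; in practice this would be your sensor record. */
typedef int fifo_item_t;

enum { FIFO_CAPACITY = 64 };

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t not_empty; /* signalled after a push */
    pthread_cond_t not_full;  /* signalled after a pop */
    fifo_item_t buf[FIFO_CAPACITY];
    size_t head, tail, count; /* ring indices and fill level */
    bool closed;
} fifo_t;

int fifo_init(fifo_t *f)
{
    int err;
    if ((err = pthread_mutex_init(&f->mutex, NULL)) != 0) return err;
    if ((err = pthread_cond_init(&f->not_empty, NULL)) != 0) return err;
    if ((err = pthread_cond_init(&f->not_full, NULL)) != 0) return err;
    f->head = f->tail = f->count = 0;
    f->closed = false;
    return 0;
}

/* Sleeps while the FIFO is full; returns EPIPE once it has been closed. */
int fifo_push(fifo_t *f, fifo_item_t item)
{
    pthread_mutex_lock(&f->mutex);
    while (f->count == FIFO_CAPACITY && !f->closed)
        pthread_cond_wait(&f->not_full, &f->mutex);
    if (f->closed) {
        pthread_mutex_unlock(&f->mutex);
        return EPIPE;
    }
    f->buf[f->tail] = item;
    f->tail = (f->tail + 1) % FIFO_CAPACITY;
    f->count++;
    pthread_cond_signal(&f->not_empty);
    pthread_mutex_unlock(&f->mutex);
    return 0;
}

/* Sleeps while the FIFO is empty; after a close, keeps draining the
 * remaining elements and only then returns EPIPE. */
int fifo_pop(fifo_t *f, fifo_item_t *out)
{
    pthread_mutex_lock(&f->mutex);
    while (f->count == 0 && !f->closed)
        pthread_cond_wait(&f->not_empty, &f->mutex);
    if (f->count == 0) { /* closed and fully drained */
        pthread_mutex_unlock(&f->mutex);
        return EPIPE;
    }
    *out = f->buf[f->head];
    f->head = (f->head + 1) % FIFO_CAPACITY;
    f->count--;
    pthread_cond_signal(&f->not_full);
    pthread_mutex_unlock(&f->mutex);
    return 0;
}

/* Either side may close the FIFO; all sleeping threads are woken up. */
void fifo_close(fifo_t *f)
{
    pthread_mutex_lock(&f->mutex);
    f->closed = true;
    pthread_cond_broadcast(&f->not_empty);
    pthread_cond_broadcast(&f->not_full);
    pthread_mutex_unlock(&f->mutex);
}
```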
Now all we need is one FIFO per consumer thread and we can let the producer distribute the work among them.
Again, we can use OpenMP's thread pool to launch the whole endeavor.
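A minimal sketch of that launch, assuming one FIFO (and hence one consumer thread) per sensor; produce_into and consume_from are stubs standing in for the real producer and consumer loops:

```c
#ifdef _OPENMP
#include <omp.h>
#else /* fallback so the sketch also builds and runs (serially) without OpenMP */
static int omp_get_thread_num(void) { return 0; }
static void omp_set_num_threads(int n) { (void)n; }
#endif

enum { NCONSUMERS = 3 }; /* e.g. the number of sensors */

static int fed_queues = 0;       /* written by the producer thread */
static int active_consumers = 0; /* incremented by each consumer thread */

/* Stand-ins for the real producer / consumer loops. */
static void produce_into(int nqueues) { fed_queues = nqueues; }
static void consume_from(int queue)
{
    (void)queue; /* would drain fifos[queue] here */
#pragma omp atomic
    active_consumers++;
}

/* One producer plus NCONSUMERS consumers, all drawn from OpenMP's pool. */
void launch(void)
{
    omp_set_num_threads(NCONSUMERS + 1);
#pragma omp parallel
    {
        int tid = omp_get_thread_num();
        if (tid == 0)
            produce_into(NCONSUMERS); /* master feeds the FIFOs */
        else
            consume_from(tid - 1);    /* everyone else drains one FIFO */
    }
}
```

Without OpenMP the pragmas are simply ignored and the whole thing degenerates into a serial run, which is handy for debugging.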
Follow-up questions
We need a way to stop the process and also to deal with errors, unless we care about neither. In particular, the producer may run out of elements to produce. In either case, the side of the FIFO that does not continue closes said FIFO.
At that point, the rules are pretty straightforward: The producer can no longer push elements into the FIFO since there is no guarantee a consumer will get to them. On the other hand, a consumer will still read the elements remaining in the FIFO before getting the return value indicating the closed FIFO. That means the remaining elements in the FIFO are drained once the producer stops.
The closed condition is signalled with the error return value EPIPE, just to mimic the similar semantics of a Unix pipe.

Not a good idea. You want to overlap production and consumption as much as possible because this maximizes parallelization. It also prevents potential deadlocks when the FIFOs fill up before your launch criterion is reached.
Sure. You didn't specify whether these values are little-endian or big-endian so I assume the native machine order.
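For illustration, here is a reader for one assumed record layout (an int sensor ID followed by a double value, as raw native-order bytes); the record_t name and the layout are guesses on my part, so adjust them to your actual file format:

```c
#include <stdio.h>

/* Assumed on-disk layout: one int sensor ID, then one double value,
 * both in native machine byte order. */
typedef struct {
    int sensor_id;
    double value;
} record_t;

/* Reads one record; returns 0 on success, EOF when the file runs out. */
int read_record(FILE *fp, record_t *out)
{
    if (fread(&out->sensor_id, sizeof out->sensor_id, 1, fp) != 1)
        return EOF;
    if (fread(&out->value, sizeof out->value, 1, fp) != 1)
        return EOF;
    return 0;
}
```

Note that reading field by field sidesteps struct padding; if the file was written as whole structs on the same machine, a single fread of the struct works just as well.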
Side note: A-priori knowledge of the number of sensors isn't needed. We might as well launch like this:
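For instance like this (a sketch; the consumer side is elided since it works exactly as before):

```c
#ifdef _OPENMP
#include <omp.h>
#else /* serial fallback */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

static int queues_used = -1; /* just to observe what happened */

/* No a-priori consumer count: take whatever the thread pool offers and
 * use (num_threads - 1) queues. The producer only learns the count
 * inside the parallel region. */
void launch_auto(void)
{
#pragma omp parallel
    {
        int tid = omp_get_thread_num();
        int nconsumers = omp_get_num_threads() - 1;
        if (tid == 0)
            queues_used = nconsumers; /* create and feed this many FIFOs */
        /* else: consume FIFO (tid - 1), exactly as before */
    }
}
```

The producer then distributes by sensor ID modulo nconsumers instead of assuming one queue per sensor.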
That shouldn't happen assuming you launch producers and consumers at the same time. Here is a simple producer without file IO that I built and that works fine on my 16 thread system.
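Something along these lines; to keep the snippet self-contained it repeats a condensed version of the ring buffer, and the producer fabricates sensor IDs instead of doing file IO:

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#else /* fallback: the demo also builds and runs (serially) without OpenMP */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

enum { CAP = 64, MAX_CONSUMERS = 64, NITEMS = 32 };

/* Condensed copy of the blocking ring buffer, items are plain ints. */
typedef struct {
    pthread_mutex_t mtx;
    pthread_cond_t not_empty, not_full;
    int buf[CAP];
    size_t head, tail, count;
    bool closed;
} fifo_t;

static void fifo_init(fifo_t *f)
{
    pthread_mutex_init(&f->mtx, NULL);
    pthread_cond_init(&f->not_empty, NULL); pthread_cond_init(&f->not_full, NULL);
    f->head = f->tail = f->count = 0; f->closed = false;
}
static int fifo_push(fifo_t *f, int v)
{
    pthread_mutex_lock(&f->mtx);
    while (f->count == CAP && !f->closed) pthread_cond_wait(&f->not_full, &f->mtx);
    int err = f->closed ? EPIPE : 0;
    if (!err) { f->buf[f->tail] = v; f->tail = (f->tail + 1) % CAP; f->count++; }
    pthread_cond_signal(&f->not_empty);
    pthread_mutex_unlock(&f->mtx);
    return err;
}
static int fifo_pop(fifo_t *f, int *out)
{
    pthread_mutex_lock(&f->mtx);
    while (f->count == 0 && !f->closed) pthread_cond_wait(&f->not_empty, &f->mtx);
    int err = f->count == 0 ? EPIPE : 0; /* closed and fully drained */
    if (!err) { *out = f->buf[f->head]; f->head = (f->head + 1) % CAP; f->count--; }
    pthread_cond_signal(&f->not_full);
    pthread_mutex_unlock(&f->mtx);
    return err;
}
static void fifo_close(fifo_t *f)
{
    pthread_mutex_lock(&f->mtx);
    f->closed = true;
    pthread_cond_broadcast(&f->not_empty); pthread_cond_broadcast(&f->not_full);
    pthread_mutex_unlock(&f->mtx);
}

/* Synthetic producer: fake sensor IDs instead of file IO. Items with the
 * same ID always land in the same queue, so per-sensor order stays FIFO. */
static void produce(fifo_t *fifos, int ncons)
{
    for (int i = 0; i < NITEMS; i++) {
        int sensor_id = i % 5;
        if (fifo_push(&fifos[sensor_id % ncons], i) != 0)
            break; /* EPIPE: a consumer closed its queue early */
    }
    for (int i = 0; i < ncons; i++)
        fifo_close(&fifos[i]); /* no more input: let consumers drain and exit */
}

/* Consumer: drains its own queue in FIFO order until closed and empty. */
static long consume(fifo_t *f)
{
    long sum = 0;
    for (int v; fifo_pop(f, &v) == 0;)
        sum += v; /* stand-in for real processing */
    return sum;
}

/* Thread 0 produces, every other pool thread consumes one queue.
 * One-shot demo: the static FIFOs are initialized exactly once. */
long run_demo(void)
{
    static fifo_t fifos[MAX_CONSUMERS];
    long total = 0;
#pragma omp parallel
    {
        int tid = omp_get_thread_num();
        int ncons = omp_get_num_threads() - 1;
        if (ncons < 1) ncons = 1; /* serial fallback: one thread does both */
        if (ncons > MAX_CONSUMERS) ncons = MAX_CONSUMERS;
#pragma omp for
        for (int i = 0; i < MAX_CONSUMERS; i++) /* implicit barrier afterwards */
            fifo_init(&fifos[i]);
        long sum = 0;
        if (tid == 0) {
            produce(fifos, ncons);
            if (omp_get_num_threads() == 1) /* no consumer threads around */
                sum = consume(&fifos[0]);
        } else if (tid - 1 < ncons) {
            sum = consume(&fifos[tid - 1]);
        }
#pragma omp atomic
        total += sum;
    }
    return total; /* sum of 0..NITEMS-1 */
}
```

Compile with -fopenmp -pthread; without OpenMP it degenerates into a serial produce-then-consume run, which gives the same total.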
The worst that can happen is that you get the same sensor ID (or the same sensor ID modulo the consumer count) more than 64 times in a row. Then the producer would fill one FIFO to the brim and would have to wait until its consumer drains it, while all the other consumers sit idle.
In that case, increase the FIFO size until it is larger than the longest run you expect, or switch to a dynamically growing FIFO. But then you have to be careful again not to overload memory.