How to set up thread-local accumulators for Java parallel streams that are visible to the main thread

309 Views Asked by At

How do I set up some thread-local storage for each worker thread in the default (common) fork-join pool, and how do I access that TLS from the main thread?

I need to implement the "collector" pattern for Java parallel streams, which is like reduce in MapReduce, but with partial reductions of associative operations grouped within each thread, prior to one final reduction step at the end. (Note that I do not want to use the MapReduce pattern directly, for performance reasons -- collectors cut down on the amount of data sent from mappers to reducers.)

Basically I need each thread in the common fork-join thread pool to have a large array of accumulators associated with it, and as the mappers run using Stream.parallel().forEach(...), each thread should collect the resulting value by updating some bin in its own accumulator in a lock-free way. At the end of the operation, I want the calling thread (the main thread) to have access to the accumulators from each worker thread, so that it can do a final reduce to combine all the accumulators into a single accumulator array.

My idea is to use a ConcurrentSkipListMap indexed by thread name to store each of the thread-local accumulators, but there is quite a lot of overhead to this, so it's not ideal.

0

There are 0 best solutions below