The HashMap has two key-value pairs, but they are not processed in parallel by different threads.
import java.util.HashMap;
import java.util.Map;

class Ideone {
    public static void main(String[] args) throws Exception {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.values().parallelStream()
            .peek(x -> System.out.println("processing " + x + " in " + Thread.currentThread()))
            .forEach(System.out::println);
    }
}
Output:
processing 1 in Thread[main,5,main]
1
processing 2 in Thread[main,5,main]
2
URL: https://ideone.com/Hkxkoz
The ValueSpliterator should have tried to split the backing array of the HashMap into slots of size 1, which means the two elements should be processed in different threads.
Source: https://www.codota.com/code/java/methods/java8.util.HMSpliterators$ValueSpliterator/%3Cinit%3E
After wrapping the values in an ArrayList, it works as expected:
new ArrayList<>(map.values()).parallelStream()
    .peek(x -> System.out.println("processing " + x + " in " + Thread.currentThread()))
    .forEach(System.out::println);
Output:
processing 1 in Thread[ForkJoinPool.commonPool-worker-3,5,main]
1
processing 2 in Thread[main,5,main]
2
As explained in this answer, the issue is connected with the fact that a HashMap has a capacity potentially larger than its size, and the actual values are distributed over the backing array based on their hash codes. The splitting logic is basically the same for all array-based spliterators, whether you stream over an array, an ArrayList, or a HashMap. To get balanced splits on a best-effort basis, each split will halve the (index) range, but in the case of HashMap, the number of actual elements within the range differs from the range size.

In principle, every range-based spliterator can split down to single elements; however, the client code, i.e. the Stream API implementation, might not split that far. The decision to even attempt a split is driven by the expected number of elements and the number of CPU cores.
A program that manually splits the map's spliterator as far as trySplit() allows (runnable on ideone) shows that it can indeed be split down to single slots of the backing array.
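Such a program might look like the following sketch, which exhausts trySplit() by hand and counts the resulting leaf spliterators (the class and method names here are my own, not from the linked example):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Spliterator;

class SplitDemo {
    // Splits the given spliterator as far as trySplit() allows and
    // returns the number of leaf spliterators produced.
    static int countLeafSplits(Spliterator<Integer> root) {
        Deque<Spliterator<Integer>> pending = new ArrayDeque<>();
        pending.push(root);
        int leaves = 0;
        while (!pending.isEmpty()) {
            Spliterator<Integer> sp = pending.pop();
            Spliterator<Integer> prefix = sp.trySplit();
            if (prefix != null) {
                // Both halves may still be splittable; process them later.
                pending.push(prefix);
                pending.push(sp);
            } else {
                leaves++;
            }
        }
        return leaves;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        // There are far more leaf spliterators than elements: most of
        // them cover empty slots of the HashMap's backing table.
        System.out.println("leaf spliterators: "
            + countLeafSplits(map.values().spliterator()));
    }
}
```

For a two-element map with the default table capacity, the leaf count exceeds the element count by a wide margin, which is exactly why most of those leaves cover empty ranges.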
So, as said, the spliterator can split down to individual elements if we split deep enough; however, the estimated size of two elements does not suggest that it's worth doing that. On each split, it will halve the estimate, and while you might say that it's wrong for the elements you're interested in, it's actually correct for most spliterators here: when going down to the maximum level, most spliterators represent an empty range, and splitting them turns out to be a waste of resources.
As said in the other answer, the decision is about balancing the work of splitting (or preparation in general) against the expected work to parallelize, which the Stream implementation can't know in advance. If you know in advance that the per-element workload will be very high enough to justify more preparation work, you can use, e.g., new ArrayList<>(map.[keySet|entrySet|values]()).parallelStream() to enforce balanced splits. Usually, the problem will be much smaller for larger maps anyway.