Encounter order collections - best practice & example

1.1k Views Asked by At

Ordering

Streams may or may not have a defined encounter order. Whether or not a stream has an encounter order depends on the source and the intermediate operations. Certain stream sources (such as List or arrays) are intrinsically ordered, whereas others (such as HashSet) are not. Some intermediate operations, such as sorted(), may impose an encounter order on an otherwise unordered stream, and others may render an ordered stream unordered, such as BaseStream.unordered(). Further, some terminal operations may ignore encounter order, such as forEach().

  1. Are there any other types that does not have the encounter order property but HashSet?
  2. In case I'm not intersted in keeping the exsisting order or any sort of sorting, is it considered to be best practice by explicitly calling to unordered intermidiate operation on each stream that will be computed in parallel?
3

There are 3 best solutions below

2
On BEST ANSWER

Besides the HashSet and HashMap’s collection views, Stream.generate() will generate an unordered stream.

Needless to say, streams generated by a Random are unordered too. Also, Stream.empty() does not report to have an encounter order, but that has not much consequences…

If you know that you don’t need the Stream to maintain the encounter order, it’s a good practice to use unordered()—even if it doesn’t improve the performance, as with most operations in the current implementation, it doesn’t harm and will document that you don’t care for the order. That does not only apply to parallel streams, some operations, like distinct(), may benefit from unorderedness even in the sequential case.

In some cases, choosing the right terminal operation, like findAny() instead of findFirst() documents that intent more concise and will also have a higher impact on the performance, given the current implementation.

3
On

Encounter order is nothing but the order of the source. E.g. In ArrayList, the elements are sorted in insertion order so streaming over it will give you elements in that order only.

  1. Apart from HashSet, HashMap is also unordered.
  2. If you are only interested in collect operation and don't care about ordering then you don't need to worry about it. Just stream() will be fine. E.g. if you want to let's say calculate the sum then you would do something like this:

    List<Integer> list = Arrays.asList(1,2,3);
    int sum = list.stream().collect(Collectors.summingInt(e -> e));
    

    In this case, it doesn't matter in which order the elements are flowing into the stream.

0
On

For your second question yes. If you don't care about those, unordered will help. I sort had the same question a while ago, here.

Now think about un-ordered. When you have a List and you multiply each element by two and collect that back to a List and you do that in parallel. Each merge of the intermediate List has to happen in such a way that the order is preserved in the result List. You might have computed the 4-th and the first intermediate result and now need to merge them. If you care about order, you can't merge them directly, since this will obviously break the order; so you need to compute the other intermediate results and merge them in the same order.

You might imagine this like traversing a List from left to right; from index zero to the last one.

On the other hand if you don't care about order, this merging can happened in any order whenever it's ready.You can even read any element in any order, since this is irrelevant.

findFirst and findAny work with the same idea in the background. Suppose you have a List with 8 elements, process it in parallel and need to return the first one only. You might have already processed 7 last elements already, but because you need the first, it does not matter - you still have to wait for the first to process. It gets obvious why findAny is better...