Coming from this article, it says:
When we are using any of the modify methods – such as add() or remove() – the whole content of the CopyOnWriteArrayList is copied into the new internal copy.
Due to this simple fact, we can iterate over the list in a safe way, even when concurrent modification is happening.
When we're calling the iterator() method on the CopyOnWriteArrayList, we get back an Iterator backed up by the immutable snapshot of the content of the CopyOnWriteArrayList.
Its content is an exact copy of data that is inside an ArrayList from the time when the Iterator was created. Even if in the meantime some other thread adds or removes an element from the list, that modification is making a fresh copy of the data that will be used in any further data lookup from that list.
Simple question to ask myself next is why both? Basically, from my understanding write operations are made on a new copy, while read operation are done on a clone of a collection.
If for example writes are done on a new copy, that means I can iterate over "original" collection - meaning it won't be affected. So why add overhead of storing elements inside another copy (snapshot)? Or the opposite direction, if I store elements inside copy (snapshot), why writes need to be done on a copy, when I am literally iterating over clone and not "original" collection (meaning snapshot will never change)?
I hope the question is legal to ask, as I literally checked every possible source on the internet and not a single article helped me clear this confusion. What am I missing here?
CopyOnWriteArrayList
does not create a copy of the array when you calliterator
, as the docs says:Note the word "reference".
This sentence is worded rather poorly:
It doesn't mean that a copy of the array is created when you call
iterator()
. It should have said:The more important point of that paragraph is:
What this means is that if you create an iterator, then proceeds to mutate the list in some way, the iterator won't see those changes. Why? Because the mutations are done by making a new array that has the mutations, but the iterator is iterating through the old array, which doesn't have the mutations. This is why we say that the iterator takes a "snapshot".
Here's some code from OpenJDK to illustrate.
In
iterator()
, it simply creates aCOWIterator
withgetArray()
, which gets the snapshot by returning the volatilearray
field:And mutator methods, such as
add
, sets thearray
field:I've removed the (un)locking code to make it easier to see what's happening.