I recently found out that multiset<T>
implementation in STL actually keeps different copies of the same repeated elements in the tree.
My expectation before was for it to internally use a map<T, int>
and just keep the count of the repeated elements.
What is the scenario where this implementation can be beneficial compared to just keeping the count? Is there any use case for multiset
where the code will break if the internal implementation changes? Or is there any operation which will increase in complexity if it changes?
I was wondering what is the thought process behind that choice?
By default
std::multiset
usesoperator<
to decide if two elements are equivalent (note: equiavent not equal!). Now consider this element type:Two elements are considered equivalent when
!(a < b) && !(b < a)
, that does in general not imply thata == b
. Hence, storing only the count is not sufficient.Moreoever, even equality does not imply identity (
a==b
does not imply&a==&b
). As the container owns its elements, two elements in the container cannot possibily be identical, they are always two different objects. Also, in the abovefoo
we could add a memberz
that is neither considered foroperator<
nor foroperator==
but can be accessed via a getter.The reason sorted containers typically check equivalence not equality is that they need a way to compare elements anyhow (they are sorted). Being able to compare two elements via
<
is sufficient to check equivalence, while requiring==
in addition would make the container less generic without apparent gain. If you like you can always use a comparator that is such that equivalence does imply equality (in the above example: compare alsoy
inoperator<
).As already mentioned in a comment, if you want to have the behavior you describe you could use a
std::map<foo,size_t>
(usesoperator<
as default comparator for the keys as well). Instead of only storing thefoo
s you can store a pair offoo
and the count. Now, because twofoo
s with samex
are considered equivalent, they do map to the same key:As a result,
counter
will hold 1 element: a copy off
and the count for that key will be2
.