Count in Multiset

2.3k Views Asked by At

I have been using the C++ STL for a while but never really got around to using multisets (or multimaps). I have a question based on counting the number of elements with the same key. For eg. Here is an unordered_multiset {0, 2, 5, 1, 1, 2, 7, 5}

If I say, count(5), it should return 2. There are two ways to achieve that using the C++11 standards of unordered_multiset. 1) count 2) equal_range and then subtracting the resulting iterators.

1) is said to take linear time in number of occurrences but 2) is constant time. Why is that?

3

There are 3 best solutions below

0
On

Firstly, the complexity of equal_range is documented at the link you yourself provide as:

Average case: constant.
Worst case: linear in container size.

Secondly, the logical operation of "subtracting the resulting iterators" has to be implemented using linear iteration with complexity O(bucket_size(bucket(key))), stepping through a list or vector of hash-colliding values checking for the matches, so...

"2) equal_range and then subtracting the resulting iterator"..."is constant time"

...is not a well-founded assertion.

As for "1) count", it's complexity is likewise documented- in this case:

Average case: linear in the number of elements counted.
Worst case: linear in container size.

Which again may differ from your "linear time in number of occurences". The reason that's the average is that normally with max_load_factor at the default of 1.0 and a good hash function, there will only be collisions as for a random scattering - something around the 10-20% mark, so most of the time the only keys hashing to a specific bucket will be those you're counting - with the average being a constant multiple around 1.1x or 1.2x that - hence linear.

0
On

The problem is equal_range returns "forward iterator" (this is mentioned in the link you provided), which has only increment operation defined in contrast to random access iterators. So to calculate difference between two we have to increment the first one until it becomes equal to the second - this gives linear count time.

And for instance in gcc standard library count is implemented in this exact way:

count(const _Key& __k) const
{
  pair<const_iterator, const_iterator> __p = equal_range(__k);
  const size_type __n = std::distance(__p.first, __p.second);
  return __n;
}
0
On

1)Count

It's Complexity: Logarithmic in size and linear in the number of matches

Ex : mymultiset.count(73)

Logarithmic in size: To find the element in the very first same as binary search

Linear in the number of matches: Since the element is found it will flow linearly to know the number of matches because as we know sets are sorted

2)Equal range

It's Complexity: Logarithmic in size

" The function returns a pair, whose member pair::first is the lower bound of the range (the same as lower_bound), and pair::second is the upper bound (the same as upper_bound). "

Check least complexity to get upper/lower bound u will find it (Logarithmic in size) You can check that link also : http://www.cplusplus.com/reference/set/multiset/equal_range/