I have a C++ function that needs to insert a range of consecutive integers into a set and, for each element that is new to the set, append it to the end of a deque in the same order as the iteration. Below is a solution that is approximately O(n log n), because each of the repeated inserts is O(log n). I would like to get an O(n) solution. I would like to use the set::insert() overload that takes a hint iterator, but if I do that, I don't see how to determine in constant time whether the item was already in the set or not.
    #include <deque>
    #include <set>

    void
    insertUnique(const int beginOffset,
                 const int endOffset,
                 std::set<int> &sent,
                 std::deque<int> &recent)
    {
        for (int offset = beginOffset; offset < endOffset; ++offset) {
            const bool inserted = sent.insert(offset).second;
            if (inserted) {
                recent.push_back(offset);
            }
        }
    }
Is there a way to refactor this to be O(n) and accomplish the same work while leaving the arguments to the function unchanged? Is there a way to insert with an iterator hint and also know whether or not the item was inserted?
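One possible way to get both, sketched under the assumption of C++11 or later (where a hinted insert is amortized constant when the new element goes just before the hint): do a single lower_bound() for beginOffset and then walk that iterator alongside the loop. Because the offsets are consecutive, the iterator always sits on the first element not less than the current offset, so it doubles as the "already present?" test and as the insertion hint. The name insertUniqueHinted is hypothetical; the arguments are unchanged.

```cpp
#include <cassert>
#include <deque>
#include <set>

// Hypothetical name; same arguments as insertUnique.
void
insertUniqueHinted(const int beginOffset,
                   const int endOffset,
                   std::set<int> &sent,
                   std::deque<int> &recent)
{
    // One O(log n) search for the first element >= beginOffset.
    std::set<int>::iterator it = sent.lower_bound(beginOffset);
    for (int offset = beginOffset; offset < endOffset; ++offset) {
        // Invariant: it is the first element >= offset (or end()).
        assert(it == sent.end() || *it >= offset);
        if (it != sent.end() && *it == offset) {
            ++it;  // already in the set, nothing to queue
        } else {
            // offset belongs immediately before it, so it is the ideal hint.
            // Set iterators stay valid across insert, so it is left untouched.
            sent.insert(it, offset);
            recent.push_back(offset);
        }
    }
}
```

Under those assumptions the whole call costs O(log n + m) for a range of m offsets. If walking the iterator is not an option, comparing sent.size() before and after a hinted insert also tells you in constant time whether anything was added.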
If sent is merely being used to determine whether the integer has been queued, then I would suggest using a std::unordered_set, since all insertions and searches have average constant time.

However, unless your set is going to become huge, it's unlikely to make much difference.
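A minimal sketch of that suggestion, assuming you are free to change the type of sent (which the question says it would rather not do). The name insertUniqueHashed is made up for illustration; the loop body is the same as in the question.

```cpp
#include <cassert>
#include <deque>
#include <unordered_set>

// Hypothetical variant: sent is a hash set instead of std::set.
void
insertUniqueHashed(const int beginOffset,
                   const int endOffset,
                   std::unordered_set<int> &sent,
                   std::deque<int> &recent)
{
    for (int offset = beginOffset; offset < endOffset; ++offset) {
        // insert() still returns pair<iterator, bool>; average O(1) here.
        if (sent.insert(offset).second) {
            recent.push_back(offset);
        }
    }
    // Sanity check: a non-empty range leaves its first offset recorded.
    assert(beginOffset >= endOffset || sent.count(beginOffset) == 1);
}
```

Note that an unordered_set no longer iterates in sorted order, so this only fits if nothing else relies on the ordering of sent.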
In fact, if the number of different integers being recorded is less than ~1000 then you might even get better real performance using a vector, particularly if you keep it sorted, since std::lower_bound() on a sorted vector is a binary search, which is O(log N) time but without pointer dereferences and with good memory locality. (std::find() itself is a linear scan.)

EDIT:
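A rough sketch of the sorted-vector idea, again assuming the type of sent may change. It uses std::lower_bound for the binary search; the function name is hypothetical. Each insertion shifts the tail of the vector, which is O(n) in theory but tends to be cheap for small, contiguous data.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical variant: sent is a vector that is kept sorted.
void
insertUniqueSorted(const int beginOffset,
                   const int endOffset,
                   std::vector<int> &sent,
                   std::deque<int> &recent)
{
    // Precondition: the binary search is only valid on sorted data.
    assert(std::is_sorted(sent.begin(), sent.end()));
    for (int offset = beginOffset; offset < endOffset; ++offset) {
        // First position whose value is >= offset.
        std::vector<int>::iterator pos =
            std::lower_bound(sent.begin(), sent.end(), offset);
        if (pos == sent.end() || *pos != offset) {
            sent.insert(pos, offset);  // keeps the vector sorted
            recent.push_back(offset);
        }
    }
}
```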
Just for fun, I had a couple of attempts, but the problem is that sent is a set<>, where an insert without a good hint costs O(log N).

This one copies the set to an unordered_set (average constant time per element), but the final insert is log N :-(
This one may have some promise, if you can stomach it. It seeks to use set_difference (O(2n), i.e. linear) to reduce the number of items that must be sent and then cached, so in theory it increases in efficiency as the sent set expands.
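The original listing for this attempt is not shown here, so the following is only a sketch of how the set_difference idea might look, with hypothetical names. It materialises the requested range, subtracts the overlapping slice of sent in one linear pass, and then inserts only the survivors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <iterator>
#include <numeric>
#include <set>
#include <vector>

// Hypothetical name; same arguments as insertUnique.
void
insertUniqueViaDifference(const int beginOffset,
                          const int endOffset,
                          std::set<int> &sent,
                          std::deque<int> &recent)
{
    if (beginOffset >= endOffset) {
        return;  // empty range, nothing to do
    }
    // The requested offsets, already in ascending order.
    std::vector<int> wanted(static_cast<std::size_t>(endOffset - beginOffset));
    std::iota(wanted.begin(), wanted.end(), beginOffset);

    // Only the slice of sent inside [beginOffset, endOffset) can overlap.
    const std::set<int>::const_iterator first = sent.lower_bound(beginOffset);
    const std::set<int>::const_iterator last = sent.lower_bound(endOffset);

    // fresh = wanted minus sent, computed in one linear merge-style pass.
    std::vector<int> fresh;
    std::set_difference(wanted.begin(), wanted.end(), first, last,
                        std::back_inserter(fresh));

    const std::size_t before = sent.size();
    recent.insert(recent.end(), fresh.begin(), fresh.end());
    sent.insert(fresh.begin(), fresh.end());  // still O(log n) per new item
    assert(sent.size() == before + fresh.size());
}
```

The more of the range is already in sent, the fewer logarithmic inserts remain, which is where the claimed gain would come from.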