I've made a simple Huffman encoding program to output individual encodings for characters and save the encoded file. This was for an assignment, and I was told that using const_cast
on heap.top()
is considered undefined behavior if we heap.pop()
afterwards, but I'm not sure I understand why.
I've read cppreference regarding the std::pop_heap
which is the underlying function called when we call heap.pop()
and I believe that a nullptr
in the comparison is still defined and understood. It doesn't seem to function abnormally to me when I debugged it.
Here's an example
#include <functional>
#include <queue>
#include <vector>
#include <iostream>
#include <memory>
template<typename T> void print_queue_constcast(T& q) {
while(!q.empty()) {
auto temp = std::move(const_cast<int&>(q.top()));
std::cout << temp << " ";
q.pop();
}
std::cout << '\n';
}
template<typename T> void print_queue(T& q) {
while(!q.empty()) {
std::cout << q.top() << " ";
q.pop();
}
std::cout << '\n';
}
int main() {
std::priority_queue<int> q1;
std::priority_queue<int> q2;
for(int n : {1,8,5,6,3,4,0,9,7,2}){
q1.push(n);
q2.push(n);
}
print_queue(q1);
print_queue_constcast(q2);
}
Could anyone explain what is actually going in the backgroun that'd be undefined behavior or that would cause this to fail under certain circumstances?
tl;dr: Maybe; maybe not.
Language-level safety
Like a set, a priority_queue is in charge of ordering its elements. Any modification to an element would potentially "break" the ordering, so the only safe way to do that is via the container's own mutating methods. (In fact, neither one actually provides such a thing.) Directly modifying elements is dangerous. To enforce this, these containers only expose
const
access to your elements.Now, at the language level, the objects won't actually have a static type of
const T
; most likely they're justT
s. So modifying them (after aconst_cast
to cheat the type system) doesn't have undefined behaviour in that sense.Library-level safety
However, you are potentially breaking a condition of using the container. The rules for priority_queue don't ever actually say this, but since its mutating operations are defined in terms of functions like
push_heap
andpop_heap
, your use of such operations will break preconditions of those functions if the container's ordering is no longer satisfied after your direct mutation.Thus your program will have undefined behaviour if you break the ordering and later mutate the priority_queue in such a way that depends on the ordering being intact. If you don't, technically your program's behaviour is well-defined; however, in general, you'd still be playing with fire. A
const_cast
should be a measure of last resort.So, where do we stand?
The question is: did you break the ordering? What's the state of the element after moving from it, and is the ordering satisfied by having an object in that state at the top of the queue?
Your original example uses
shared_ptr
s, and we know from the documentation that a moved-fromshared_ptr
turns safely into a null pointer.The default
priority_queue
ordering is defined bystd::less
, which yields a strict total order over raw pointers;std::less
on ashared_ptr
will actually invoke its base case ofoperator<
, but that in turn is defined to invokestd::less
on its raw pointer equivalent.Unfortunately, that doesn't mean that a null shared_ptr is ordered "first": though
std::less
's pointer ordering is strict and total, where null pointers land in this ordering is unspecified.So, it is unspecified as to whether your mutation will break the ordering, and therefore it is unspecified as to whether your
pop()
will have undefined behaviour.(The MCVE example with
int
is safe becausestd::move
on anint
has no work to do: it'll just copy theint
. So, the ordering remains unaffected.)Conclusion
I would agree with what was presumably your driving rationale, that it is unfortunate
pop()
doesn't return you the popped thing, which you could thenmove
from. Similar restrictions with sets and maps are why we now have node splicing features for those containers. There is not such a thing for a priority_queue, which is just a wrapper around another container like a vector. If you need more fine-grained control, you can substitute that for your own which has the features you need.Anyway, for the sake of a
shared_ptr
increment (as in your original code), I'd probably just take the hit of the copy, unless you have some really extreme performance requirements. That way, you know everything will be well-defined.Certainly, for the sake of an
int
copy (as in your MCVE), astd::move
is entirely pointless (there are no indirect resources to steal!) and you're doing a copy anyway, so the point is rather moot and all you've done is to create more complex code for no reason.I would also recommend not writing code where you have to ask whether it's well-defined, even if it turns out it is. That's not ideal for readability or maintainability.