In C++ there is the infamous problem of self-assignment: when implementing operator=(const T &other), one has to be careful of the this == &other case to not destroy this's data before copying it from other.
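For reference, the classic guard can be sketched as follows. `Buffer` is a hypothetical owning type invented purely for this illustration; it is not taken from any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical owning buffer, used only to illustrate the self-assignment check.
class Buffer {
public:
    explicit Buffer(const char* text)
        : size_((assert(text != nullptr), std::strlen(text))),
          data_(new char[size_ + 1]) {
        std::memcpy(data_, text, size_ + 1);
    }
    Buffer(const Buffer& other)
        : size_(other.size_), data_(new char[other.size_ + 1]) {
        std::memcpy(data_, other.data_, size_ + 1);
    }
    Buffer& operator=(const Buffer& other) {
        if (this == &other) {
            return *this;  // self-assignment: nothing to do
        }
        // Allocate and copy before releasing anything, so a throwing `new`
        // leaves *this untouched.
        char* fresh = new char[other.size_ + 1];
        std::memcpy(fresh, other.data_, other.size_ + 1);
        delete[] data_;  // only now release the old storage
        data_ = fresh;
        size_ = other.size_;
        return *this;
    }
    ~Buffer() { delete[] data_; }

    const char* c_str() const { return data_; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;  // declared first: data_'s initializer reads it
    char* data_;
};
```

Note that this check only detects the case where both operands are the very same object; it says nothing about one operand living inside the other.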
However, *this and other may interact in more interesting ways than being the same object. Namely, one may contain the other. Consider the following code:
#include <iostream>
#include <string>
#include <utility>
#include <vector>
struct Foo {
    std::string s = "hello world very long string";
    std::vector<Foo> children;
};
int main() {
    std::vector<Foo> f(4);
    f[0].children.resize(2);
    f = f[0].children; // (1)
    // auto tmp = f[0].children; f = std::move(tmp); // (2)
    std::cout << f.size() << "\n";
}
I'd expect lines (1) and (2) to be equivalent, with the program well-defined and printing 2. However, I have yet to find a compiler and standard library combination that works with line (1) when Address Sanitizer is enabled: GCC+libstdc++, Clang+libc++ and Visual Studio+Microsoft STL all crash.
Curiously, disabling Address Sanitizer removes the crash and the program starts printing 2.
Why is this operation prohibited or permitted in standard C++?
Extra question: same, but with f[0].children = f. Extra-extra question: use std::any instead of std::vector<Foo>.
I'm not convinced that (1) is well-defined, because in order to copy a new value into f[0], the old object residing at that location must first be destroyed, or is at the very least modified while under the contract of being const.
From std::vector<T,Allocator>::operator= (emphasis mine):
So it would be expected that in all scenarios above, it's possible the object is destroyed before it has been copied, and you fall into the territory of behavior that is either undefined or specific to an implementation.
In practical terms, for the vector to reuse this memory it generally has to destroy the old element (an explicit destructor call) and then construct the new one with placement-new, and in these cases the referenced object being copied is once again destroyed in the process.
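To make the destroy-then-construct pattern concrete, here is a minimal sketch on a single `std::string` slot. It is not how any particular standard library implements `vector`; `rebuild_in_place` is a hypothetical helper, and it assumes C++17 for `std::destroy_at`. The local copy taken up front is what keeps it safe when `value` refers to the object stored in `slot`.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <string>
#include <utility>

// Hypothetical helper: reuse the storage at `slot` for a copy of `value`.
std::string* rebuild_in_place(std::string* slot, const std::string& value) {
    assert(slot != nullptr);
    // Copy first: if `value` aliases *slot, reading it after the destructor
    // call below would be a use of a destroyed object.
    std::string copy = value;
    std::destroy_at(slot);  // end the lifetime of the old object
    // Construct the new object in the same storage with placement-new.
    return ::new (static_cast<void*>(slot)) std::string(std::move(copy));
}
```

Dropping the local `copy` and constructing directly from `value` would be fine for unrelated objects, but would read a dead object whenever `&value == slot`, which is the same kind of hazard as line (1).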
Even in the most lenient scenario (i.e. "replaced by element-wise copy-assignment") you begin with Foo::operator=(const Foo&) invoked on f[0] to replace it with a copy of f[0].children[0]. The vector f[0].children[0].children is empty, and so the copy will result in both elements of f[0].children being destroyed but leaving the target vector's capacity (which is 2) unchanged. Before even getting to the next element, the const Foo& that was originally being copied has been modified, breaking its contract and all bets are off.
I don't think there's any automatic way to protect against that without maybe using some kind of custom garbage-collecting allocator. You simply need to recognize the self-referential problem and avoid it. You worked around the problem in (2) by introducing a copy, and that is at least well-defined. It can be taken one step further by moving the data out of the container first:
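A minimal sketch of that idea, reusing `Foo` from the question. The helper name `replace_with_first_children` is hypothetical, and the function assumes `f` is not empty.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct Foo {
    std::string s = "hello world very long string";
    std::vector<Foo> children;
};

// Hypothetical helper: replace `f` with the children of its first element
// without reading from storage that the assignment is about to destroy.
void replace_with_first_children(std::vector<Foo>& f) {
    assert(!f.empty());
    // Take over the child vector's buffer; after this, tmp owns the two
    // children and nothing inside f refers to them any more.
    std::vector<Foo> tmp = std::move(f[0].children);
    // tmp does not alias anything owned by f, so destroying f's old
    // elements during this assignment cannot touch the source.
    f = std::move(tmp);
}
```

Compared with (2), this avoids the deep copy of the children, at the cost of leaving `f[0].children` moved-from for the brief moment before `f[0]` itself is destroyed.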
Perhaps the problem can be more generally worked around with careful application of std::shared_ptr, since your main issue is the destruction of data that you expected to still be referenced.
I think the whole contract-breaking-of-const-object stuff is really the key to answering your "extra" question about f[0].children = f without getting too deep in details. In this case, children may be reallocated due to the required increase in capacity, and in doing so modifies f which was supposed to be const.
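The same avoidance strategy applies to the "extra" case. The sketch below takes a snapshot of `f` before touching it; the helper name `set_first_children_to_copy_of_self` is hypothetical, and `f` is assumed to be non-empty.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct Foo {
    std::string s = "hello world very long string";
    std::vector<Foo> children;
};

// Hypothetical helper: make f[0].children a copy of f as it was on entry.
void set_first_children_to_copy_of_self(std::vector<Foo>& f) {
    assert(!f.empty());
    // Deep-copy f while nothing is being modified.
    std::vector<Foo> snapshot = f;
    // snapshot is independent of f, so the source of this assignment is
    // never altered while it is being read.
    f[0].children = std::move(snapshot);
}
```

With the question's setup (four elements, the first having two children), `f` keeps its four elements afterwards, `f[0].children` holds four elements, and `f[0].children[0].children` still holds the original two.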