Range-based for loop on a temporary range

5.6k Views Asked by At

Thanks to some segmentation faults and warnings in valgrind, I found that this code is incorrect and has some sort of dangling reference in the for-range loop.

#include<numeric>
#include<vector>

auto f(){
    std::vector<std::vector<double>> v(10, std::vector<double>(3));
    iota(v[5].begin(), v[5].end(), 0);
    return v;
}

int main(){
    for(auto e : f()[5])
        std::cout << e << std::endl;
    return 0;
}

It looks as if the begin and end is taken from a temporary and lost in the loop.

Of course, a way around is to do

    auto r = f()[5];
    for(auto e : r)
        std::cout << e << std::endl;

However, I wonder exactly why for(auto e : f()[5]) is an error and also if there is a better way around or some way to design f or the even the container (std::vector) to avoid this pitfall.

With iterator loops is more obvious why this problem happens (begin and end come from different temporary objects)

for(auto it = f()[5].begin(); it != f()[5].end(); ++it)

But in a for-range loop, as in the first example, it seems very easy to make this mistake.

2

There are 2 best solutions below

9
On BEST ANSWER

Note that using a temporary as the range expression directly is fine, its lefetime will be extended. But for f()[5], what f() returns is the temporary and it's constructed within the expression, and it'll be destroyed after the whole expression where it's constructed.

From C++20, you can use init-statement for range-based for loop to solve such problems.

(emphasis mine)

If range_expression returns a temporary, its lifetime is extended until the end of the loop, as indicated by binding to the rvalue reference __range, but beware that the lifetime of any temporary within range_expression is not extended.

This problem may be worked around using init-statement:

for (auto& x : foo().items()) { /* .. */ } // undefined behavior if foo() returns by value
for (T thing = foo(); auto& x : thing.items()) { /* ... */ } // OK

e.g.

for(auto thing = f(); auto e : thing[5])
    std::cout << e << std::endl;
4
On

I wonder exactly why for(auto e : f()[5]) is an error

I'll just answer this part. The reason is that range-based for statements are just syntactic sugar for, approximately:

{
    auto&& __range = f()[5]; // (*)
    auto __begin = __range.begin(); // not exactly, but close enough
    auto __end = __range.end();     // in C++17, these types can be different
    for (; __begin != __end; ++__begin) {
        auto e = *__begin;
        // rest of body
    }
}

Take a look at that first line. What happens? operator[] on a vector returns a reference into that object, so __range is bound to that internal reference. But then the temporary goes out of scope at the end of the line, destroying all of its internals, and __range is immediately a dangling reference. There is no lifetime extension here, we are never binding a reference to a temporary.

In the more normal case, for(auto e : f()), we'd bind __range to f() directly, which is binding a reference to a temporary, so that temporary would get its lifetime extended to the lifetime of the reference, which would be the full for statement.

To add more wrinkles, there are other cases where indirect binding like this would still do lifetime extension. Like, say:

struct X {
    std::vector<int> v;
};
X foo();

for (auto e : foo().v) {
    // ok!
}

But rather than try to keep track of all of these little cases, it's much better to, as songyuanyao suggests, use the new for statement with initializer... all the time:

for (auto&& range = f(); auto e : range[5]) {
    // rest of body
}

Although in a way this gives a false sense of security, since if you did it twice, you'd still have the same issue...

for (auto&& range = f().g(); auto e : range[5]) {
    // still dangling reference
}