Weird behavior with std::string reference class member

Given this code:

#include <iostream>
#include <string>

class Foo {
    public:
        Foo(const std::string& label) : label_(label) {}

        void print() {
            std::cout << label_;
        }

    private:
        const std::string& label_;
};

int main() {
    auto x = new Foo("Hello World");
    x->print();
}

I get

Hello World

when I run it. If I modify it like this:

// g++ -o test test.cpp -std=c++17
#include <iostream>
#include <string>

class Base {
    public:
        Base(const std::string& label) : label_(label) {}

        void print() {
            std::cout << label_;
        }

    private:
        const std::string& label_;
};

class Derived : public Base {
    public:
        Derived(const std::string& label) : Base(label) {}
};

int main() {
    auto x = new Derived("Hello World");
    x->print();
}

I still get:

Hello World

but if I modify it like this:

// g++ -o test test.cpp -std=c++17
#include <iostream>
#include <string>

class Base {
    public:
        Base(const std::string& label) : label_(label) {}

        void print() {
            std::cout << label_;
        }

    private:
        const std::string& label_;
};

class Derived : public Base {
    public:
        Derived() : Base("Hello World") {}
};

int main() {
    auto x = new Derived();
    x->print();
}

I do not get any output. Can anyone explain this to me? I am compiling the program like this:

g++ -o test test.cpp -std=c++17

This is on Mac if it makes a difference.

There are 2 best solutions below

4
On BEST ANSWER

All three pieces of code are incorrect: each has undefined behavior. label_ is a reference (in effect, a pointer) to a temporary std::string created from "Hello World". Because that object is a temporary, nothing guarantees that the string is still at the location label_ refers to by the time x->print() runs.

The compiler will issue dangling reference warnings if we enable optimization; curiously, only then does it become aware of the problem.

Using compiler flags -Wall -Wextra -O3 with gcc 13.2:

https://godbolt.org/z/9xjsxhrTT

Speculating: in the first two cases the temporary is created in main, where the object is declared, so although it is only an argument it may happen to survive long enough for the print to appear to work. In the third case the temporary is created inside Derived's constructor and passed directly to the Base constructor, so it may be discarded before x->print() runs. main, where the action takes place, has no knowledge of the temporary.

Coming from Java or C#, where everything except primitive types is passed by reference with no worries, this may cause some confusion. C++ does not work that way: the choice is up to the programmer. A reference class member does not keep the object it refers to alive; if that object is a temporary, it is destroyed when its own lifetime ends, whatever the member still refers to. In this case, as stated in the comment section, you should store the data by value, not by reference: Foo is the owner of label_, so that is where the string is supposed to be stored.

0
On

Under "normal" circumstances, binding a const reference to a temporary will extend the lifetime of the temporary to the lifetime of the reference. For example, consider code like this:

#include <iostream>
#include <string>

std::string foo() { return "Hello World"; }

void bar() {
    std::string const& extended_life = foo();
    std::cout << extended_life << "\n";
}

The string returned by foo is a temporary object whose lifetime would normally expire at the end of the full expression in which it was created (here, the initialization of extended_life).

But, because we bind it to a const reference, its lifetime is extended to the lifetime of the reference, so when bar prints it out, the behavior is completely defined.

That doesn't apply when the reference involved is a member of a class though. The standard doesn't directly explain why that's the case, but I suspect it's mostly a matter of what's easy or difficult to implement.
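The difference can be observed without ever touching a dangling reference, by logging when a temporary is constructed and destroyed. This is only a sketch: Noisy, Holder, and the two trace functions are hypothetical names introduced for illustration, and Holder deliberately has a constructor, like the classes in the question.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: records its construction and destruction in a log.
struct Noisy {
    std::string name;
    std::vector<std::string>* log;

    Noisy(std::string n, std::vector<std::string>& l)
        : name(std::move(n)), log(&l) {
        log->push_back("ctor " + name);
    }
    ~Noisy() { log->push_back("dtor " + name); }
};

// Hypothetical class with a reference member initialized by a constructor,
// mirroring Foo/Base from the question.
struct Holder {
    const Noisy& ref;
    explicit Holder(const Noisy& r) : ref(r) {}
};

// Local const reference: the temporary lives until the reference goes
// out of scope.
std::vector<std::string> trace_local_reference() {
    std::vector<std::string> log;
    {
        const Noisy& r = Noisy("local", log);
        (void)r;  // only the lifetime matters here
        log.push_back("after binding");
    }
    return log;
}

// Reference member: the temporary dies at the end of the full expression
// that constructs the Holder. h.ref dangles afterwards and is never read.
std::vector<std::string> trace_member_reference() {
    std::vector<std::string> log;
    {
        Holder h(Noisy("member", log));
        (void)h;
        log.push_back("after constructing holder");
    }
    return log;
}
```

trace_local_reference() yields "ctor local", "after binding", "dtor local", whereas trace_member_reference() yields "ctor member", "dtor member", "after constructing holder": in the second case the temporary is already gone before the next statement runs.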

Where I have something like Foo const &foo = bar();, the compiler has to "know" the declaration of bar(), and from that its return type. It also directly "knows" that foo is a reference to const, so the connection between what was returned and the lifetime extension is fairly direct and straightforward.

When you're storing something internally in a class, however, the compiler (at least potentially) has no access to the internals of that class. For example, in your third case, the compiler could compile main having seen only this much about Base and Derived:

class Base {
    public:
        Base(const std::string& label);
        void print();
    private:
        const std::string& label_;
};

class Derived : public Base {
    public:
        Derived();
};

Based on this, the compiler has no way to know that the string passed to the ctor is related in any way to label_ or that print() uses label_.

It's only by analyzing the data flow through the contents of the classes (which may not be available when compiling the calling code) that it can figure out what label_ stores or how it's used. Requiring the compiler to analyze that code when it's potentially not available would lead to a language that couldn't be implemented. Even if all the source code were available, the relationship could be arbitrarily complex, and at some point the compiler is no longer going to be able to determine what's going on and figure out what it needs to do.
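Since the compiler will not extend the temporary's lifetime here, one defensive option, if a reference member is really wanted, is to make the class reject temporaries at compile time by deleting the rvalue overload of the constructor. This is an addition, not something either answer proposes; CheckedFoo and its label() accessor are hypothetical names, and the caller is still responsible for keeping the named string alive for as long as the object is used.

```cpp
#include <string>
#include <type_traits>

// Hypothetical variant of Foo that refuses to bind its reference member
// to a temporary std::string.
class CheckedFoo {
public:
    // Accepts named (lvalue) strings only.
    explicit CheckedFoo(const std::string& label) : label_(label) {}

    // Temporaries prefer this overload, and selecting a deleted function
    // is a compile-time error.
    CheckedFoo(std::string&&) = delete;

    const std::string& label() const { return label_; }

private:
    const std::string& label_;  // non-owning; the caller keeps the string alive
};

// A named string is accepted; a temporary std::string is rejected.
static_assert(std::is_constructible<CheckedFoo, std::string&>::value,
              "lvalue strings should be accepted");
static_assert(!std::is_constructible<CheckedFoo, std::string>::value,
              "temporary strings should be rejected");
```

A call such as std::string s = "Hello World"; CheckedFoo f(s); compiles, and f.label() refers to s itself rather than to a copy.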