Correctly propagating a `decltype(auto)` variable from a function


(This is a follow-up from "Are there any realistic use cases for `decltype(auto)` variables?")

Consider the following scenario - I want to pass a function f to another function invoke_log_return which will:

  1. Invoke f;

  2. Print something to stdout;

  3. Return the result of f, avoiding unnecessary copies/moves and allowing copy elision.

Note that, if f throws, nothing should be printed to stdout. This is what I have so far:

template <typename F>
decltype(auto) invoke_log_return(F&& f)
{
    decltype(auto) result{std::forward<F>(f)()};
    std::printf("    ...logging here...\n");

    if constexpr(std::is_reference_v<decltype(result)>)
    {
        return decltype(result)(result);
    }
    else
    {
        return result;
    }
}

Let's consider the various possibilities:

  • When f returns a prvalue:

    • result will be an object;

    • invoke_log_return(f) will be a prvalue (eligible for copy elision).

  • When f returns an lvalue or xvalue:

    • result will be a reference;

    • invoke_log_return(f) will be an lvalue or xvalue.
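The cases listed above can be checked at compile time. The following is a minimal sketch, assuming C++17; the type widget and the callables make_prvalue, make_lvalue and make_xvalue are hypothetical stand-ins and are not part of the linked test application:

```cpp
#include <cassert>
#include <cstdio>
#include <type_traits>
#include <utility>

template <typename F>
decltype(auto) invoke_log_return(F&& f)
{
    decltype(auto) result{std::forward<F>(f)()};
    std::printf("    ...logging here...\n");

    if constexpr(std::is_reference_v<decltype(result)>)
    {
        return decltype(result)(result);
    }
    else
    {
        return result;
    }
}

// Hypothetical type and callables, one per value category.
struct widget { int value = 42; };
inline widget global_widget{};

inline widget   make_prvalue() { return widget{}; }
inline widget&  make_lvalue()  { return global_widget; }
inline widget&& make_xvalue()  { return std::move(global_widget); }

// The wrapper's return type mirrors the return type of the wrapped callable.
static_assert(std::is_same_v<decltype(invoke_log_return(make_prvalue)), widget>);
static_assert(std::is_same_v<decltype(invoke_log_return(make_lvalue)),  widget&>);
static_assert(std::is_same_v<decltype(invoke_log_return(make_xvalue)),  widget&&>);
```

If the static_asserts compile, the wrapper preserves the value category in all three cases; calling invoke_log_return(make_lvalue) at run time yields a reference to global_widget itself.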

You can see a test application here on godbolt.org. As you can see, g++ performs NRVO for the prvalue case, while clang++ doesn't.

Questions:

  • Is this the shortest possible way of "perfectly" returning a decltype(auto) variable out of a function? Is there a simpler way to achieve what I want?

  • Can the if constexpr { ... } else { ... } pattern be extracted to a separate function? The only way to extract it seems to be a macro.

  • Is there any good reason why clang++ does not perform NRVO for the prvalue case above? Should it be reported as a potential enhancement, or is g++'s NRVO optimization not legal here?


Here's an alternative using an on_scope_success helper (as suggested by Barry Revzin):

template <typename F>
struct on_scope_success : F
{
    int _uncaught{std::uncaught_exceptions()};

    on_scope_success(F&& f) : F{std::forward<F>(f)} { }

    ~on_scope_success()
    {
        if(_uncaught == std::uncaught_exceptions()) {
            (*this)();
        }
    }
};

template <typename F>
decltype(auto) invoke_log_return_scope(F&& f)
{
    on_scope_success _{[]{ std::printf("    ...logging here...\n"); }};
    return std::forward<F>(f)();
}

While invoke_log_return_scope is much shorter, this requires a different mental model of the function behavior and the implementation of a new abstraction. Surprisingly, both g++ and clang++ perform RVO/copy-elision with this solution.

live example on godbolt.org

One major drawback of this approach, as mentioned by Ben Voigt, is that the return value of f cannot be part of the log message.
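To make the behaviour of the scope-guard variant concrete, here is a minimal sketch, assuming C++17. The counter log_count replaces the printf call so the effect can be observed, and the names returns_normally, throws and logs_on_success_only are hypothetical:

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <utility>

template <typename F>
struct on_scope_success : F
{
    int _uncaught{std::uncaught_exceptions()};

    on_scope_success(F&& f) : F{std::forward<F>(f)} { }

    ~on_scope_success()
    {
        // Run the callback only if no new exception is unwinding the stack.
        if(_uncaught == std::uncaught_exceptions()) {
            (*this)();
        }
    }
};

// Hypothetical counter standing in for the log statement.
inline int log_count = 0;

template <typename F>
decltype(auto) invoke_log_return_scope(F&& f)
{
    on_scope_success _{[]{ ++log_count; }};
    return std::forward<F>(f)();
}

inline int returns_normally() { return 7; }
inline int throws()           { throw std::runtime_error{"boom"}; }

inline bool logs_on_success_only()
{
    log_count = 0;
    int value = invoke_log_return_scope(returns_normally);  // guard fires once
    try { invoke_log_return_scope(throws); }                 // guard stays silent
    catch(const std::runtime_error&) { }
    return value == 7 && log_count == 1;
}
```

Note that the callback takes no arguments: it runs in the guard's destructor, where the value returned by f is not accessible, which is the drawback described above.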


There are 4 best solutions below

2
On

We can use a modified version of std::forward: (the name forward is avoided to prevent ADL problems)

template <typename T>
T my_forward(std::remove_reference_t<T>& arg)
{
    return std::forward<T>(arg);
}

This function template is used to forward a decltype(auto) variable. It can be used like this:

template <typename F>
decltype(auto) invoke_log_return(F&& f)
{
    decltype(auto) result{std::forward<F>(f)()};
    std::printf("    ...logging here...\n");
    return my_forward<decltype(result)>(result);
}

This way, if std::forward<F>(f)() returns

  • a prvalue, then result is a non-reference, and invoke_log_return returns a non-reference type;

  • an lvalue, then result is an lvalue-reference, and invoke_log_return returns an lvalue reference type;

  • an xvalue, then result is an rvalue-reference, and invoke_log_return returns an rvalue reference type.
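The three cases can be checked with a small sketch, assuming C++17. The instrumented type counted and the callables below are hypothetical. The sketch also counts special member calls for the prvalue case, because my_forward returns T by value from an xvalue there, so at least one move construction is expected:

```cpp
#include <cassert>
#include <cstdio>
#include <type_traits>
#include <utility>

template <typename T>
T my_forward(std::remove_reference_t<T>& arg)
{
    return std::forward<T>(arg);
}

template <typename F>
decltype(auto) invoke_log_return(F&& f)
{
    decltype(auto) result{std::forward<F>(f)()};
    std::printf("    ...logging here...\n");
    return my_forward<decltype(result)>(result);
}

// Hypothetical type that counts copy and move constructions.
struct counted
{
    static inline int copies = 0;
    static inline int moves  = 0;
    counted() = default;
    counted(const counted&)     { ++copies; }
    counted(counted&&) noexcept { ++moves; }
};

inline counted global_counted{};
inline counted   make_prvalue() { return counted{}; }
inline counted&  make_lvalue()  { return global_counted; }
inline counted&& make_xvalue()  { return std::move(global_counted); }

static_assert(std::is_same_v<decltype(invoke_log_return(make_prvalue)), counted>);
static_assert(std::is_same_v<decltype(invoke_log_return(make_lvalue)),  counted&>);
static_assert(std::is_same_v<decltype(invoke_log_return(make_xvalue)),  counted&&>);

// Returns the number of move constructions seen for the prvalue case.
inline int moves_for_prvalue()
{
    counted::copies = 0;
    counted::moves  = 0;
    counted c = invoke_log_return(make_prvalue);
    (void)c;
    return counted::moves;
}
```

So the return types are forwarded exactly, but in the prvalue case the object is moved out of result rather than constructed in place.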

(Essentially copied from my https://stackoverflow.com/a/57440814)

2
On

This is the simplest and clearest way to write it:

template <typename F>
auto invoke_log_return(F&& f)
{ 
    auto result = f();
    std::printf("    ...logging here... %s\n", result.foo());    
    return result;
}

GCC produces the expected result (no needless copies or moves):

    s()

in main

prvalue
    s()
    ...logging here... Foo!

lvalue
    s(const s&)
    ...logging here... Foo!

xvalue
    s(s&&)
    ...logging here... Foo!

So if the code is clear and has the same functionality, but one compiler does not optimize it as well as its competitors do, that is a compiler optimization failure, and clang should fix it. This kind of problem makes far more sense to solve in the tool than in the application-layer implementation.

https://gcc.godbolt.org/z/50u-hT

0
On

Since P2266R3 was accepted in C++23, it's become as simple as:

template <typename F>
decltype(auto) invoke_log_return(F&& f)
{
    decltype(auto) result(std::forward<F>(f)());
    std::printf("    ...logging here...\n");
    return result;
}

Which will return an lvalue, xvalue or prvalue accordingly.

As for why clang is misbehaving, I've observed auto and decltype(auto) functions not performing NRVO before. It also doesn't seem to like the constexpr if. This is a clang quality-of-implementation issue. The following shows the desired elision in clang (C++23):

template <typename F>
decltype(std::declval<F>()()) invoke_log_return(F&& f)
{
    decltype(auto) result(std::forward<F>(f)());
    std::printf("    ...logging here...\n");
    return result;
}

https://gcc.godbolt.org/z/sKv3vcGbh

See this other great answer https://stackoverflow.com/a/63320152/5754656 for C++17/20. Their invoke_return "fixes" the NRVO in clang because it doesn't use decltype(auto).

0
On

Q1: "Is this the shortest possible way of "perfectly" returning a decltype(auto) variable out of a function? Is there a simpler way to achieve what I want?"

Well, proving optimality is always hard, but your first solution is already very short. Really the only thing that you could hope to remove is the if constexpr - everything else is necessary (w/o changing the point of the question).

Your second solution solves this at the expense of some additional mental contortion and the inability to use the variable inside the log statement - or, more generally, it only enables you to perform an operation that has nothing to do with your result.

The simple solution by @david-kennedy solves this problem in a neat way by creating a prvalue that can then be copy-elided into its final storage location. If your use-case supports this model and you use GCC, it is pretty much the best possible solution:

template <typename F>
auto invoke_log_return(F&& f)
{ 
    auto result = f();
    std::printf("    ...logging here...\n");    
    return result;
}

However, this solution does not implement perfect forwarding at all, as its return value has a different type than that of the wrapped function (it strips references). In addition to being a source of potential bugs (int& a = f(); vs. int& a = wrapper(f);), this also causes at least one copy to be performed.
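The reference stripping can be shown with a short sketch, assuming C++17; the names global_int, get_ref and wrapper_returns_a_copy are hypothetical, and the log statement is omitted for brevity:

```cpp
#include <cassert>
#include <type_traits>

template <typename F>
auto invoke_log_return(F&& f)
{
    auto result = f();   // auto drops the reference: result is a copy
    return result;
}

// Hypothetical callable that returns a reference to a global.
inline int  global_int = 1;
inline int& get_ref() { return global_int; }

static_assert(std::is_same_v<decltype(get_ref()), int&>);
static_assert(std::is_same_v<decltype(invoke_log_return(get_ref)), int>);

inline bool wrapper_returns_a_copy()
{
    int& direct = get_ref();                 // aliases global_int
    direct = 2;
    int copy = invoke_log_return(get_ref);   // snapshot of the value 2
    // int& bad = invoke_log_return(get_ref);  // would not compile: prvalue to int&
    global_int = 3;
    return direct == 3 && copy == 2;
}
```

The direct call keeps tracking global_int, while the value obtained through the wrapper is an independent copy.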

To show this, I have modified the test harness to not perform any copies itself. This GCC output therefore displays the copies done by the wrapper itself (clang performs even more copy/move operations):

    s()
in main

prvalue
    s()
    ...logging here...

lvalue
    s(const s&)
    ...logging here...

xvalue
    s(s&&)
    ...logging here...

https://gcc.godbolt.org/z/dfrYT8

It is, however, possible to create a solution that performs zero copy/move operations on both GCC and clang, by getting rid of the if constexpr and moving the differing implementations into two functions that are discriminated via enable_if:

template <typename F>
auto invoke_log_return(F&& f)
    -> std::enable_if_t<
        std::is_reference_v<decltype(std::forward<F>(f)())>,
        decltype(std::forward<F>(f)())
    >
{
    decltype(auto) result{std::forward<F>(f)()};
    std::printf("    ...logging glvalue...\n");
    return decltype(result)(result);
}

template <typename F>
auto invoke_log_return(F&& f)
    -> std::enable_if_t<
        !std::is_reference_v<decltype(std::forward<F>(f)())>,
        decltype(std::forward<F>(f)())
    >
{
    decltype(auto) result{std::forward<F>(f)()};
    std::printf("    ...logging prvalue...\n");
    return result;
}

Zero copies:

    s()
in main

prvalue
    s()
    ...logging prvalue...

lvalue
    ...logging glvalue...

xvalue
    ...logging glvalue...

https://gcc.godbolt.org/z/YKrhbs

Now, of course, this increases the number of lines versus the original solution, even though it returns the variable arguably "more perfectly" (in the sense that NRVO is performed by both compilers). Extracting the functionality into a utility function leads to your second question.

Q2: "Can the if constexpr { ... } else { ... } pattern be extracted to a separate function? The only way to extract it seems to be a macro."

Nope, as you cannot elide passing a prvalue into the function, which means that passing result into the function will cause a copy/move. For glvalues this is not a problem (as is shown by std::forward).

However, it is possible to change the control flow of the previous solution a bit, so that it itself can be used as a library function:

template <typename F>
decltype(auto) invoke_log_return(F&& f) {
    return invoke_return(std::forward<F>(f), [](auto&& s) {
        std::printf("    ...logging value at %p...", static_cast<void*>(&s));
    });
}

https://gcc.godbolt.org/z/c5q93c

The idea is to use the enable_if solution to provide a function that takes a generator function and an additional function that can then operate on the temporary value - be it prvalue, xvalue or lvalue. The library function could look like this:

template <typename F, typename G>
auto invoke_return(F&& f, G&& g)
    -> std::enable_if_t<
        std::is_reference_v<decltype(std::forward<F>(f)())>,
        decltype(std::forward<F>(f)())
    >
{
    decltype(auto) result{std::forward<F>(f)()};
    std::forward<G>(g)(decltype(result)(result));
    return decltype(result)(result);
}

template <typename F, typename G>
auto invoke_return(F&& f, G&& g)
    -> std::enable_if_t<
        !std::is_reference_v<decltype(std::forward<F>(f)())>,
        decltype(std::forward<F>(f)())
    >
{
    decltype(auto) result{std::forward<F>(f)()};
    std::forward<G>(g)(result);
    return result;
}

Q3: "Is there any good reason why clang++ does not perform NRVO for the prvalue case above? Should it be reported as a potential enhancement, or is g++'s NRVO optimization not legal here?"

Checking my C++2a draft (N4835 §11.10.5/1.1 [class.copy.elision]), NRVO is stated really quite simply:

  • in a return statement [check] in a function [check] with a class return type [the function template instantiates into a function that returns s, so check], when the expression is the name of a non-volatile [check] automatic [check] object (other than a function parameter or a variable introduced by the exception-declaration of a *handler* (14.4) [check]) with the same type (ignoring cv-qualification) as the function return type [check], the copy/move operation can be omitted by constructing the automatic object directly into the function call's return object.

I am not aware of any other reason why this should be invalid.
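Whether a given compiler applies this optional elision can be observed with a small sketch, assuming C++17. The type counted and the functions make_counted, invoke_and_return and moves_observed are hypothetical. Because NRVO is permitted but not required, the sketch reports the count instead of assuming a particular value:

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts copy and move constructions.
struct counted
{
    static inline int copies = 0;
    static inline int moves  = 0;
    counted() = default;
    counted(const counted&)     { ++copies; }
    counted(counted&&) noexcept { ++moves; }
};

inline counted make_counted() { return counted{}; }

template <typename F>
decltype(auto) invoke_and_return(F&& f)
{
    decltype(auto) result(std::forward<F>(f)());
    // result is a non-volatile automatic object whose type matches the
    // return type, so this return statement is an NRVO candidate.
    return result;
}

// Returns 0 if the compiler elided the move, 1 if it fell back to a move.
inline int moves_observed()
{
    counted::copies = 0;
    counted::moves  = 0;
    counted c = invoke_and_return(make_counted);
    (void)c;
    return counted::moves;
}
```

Even without NRVO, returning the name of a local object uses the move constructor, so no copy construction is expected in either case.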