I could not find a flag that controls the named return value optimization (NRVO) for the C language. For C++ it seems to be -fno-elide-constructors.
The source code implementing it is here, but since it lives in the middle end, it gives away nothing about front-end flags, not even in comments. The manual section did not help much either. However, the disassembly shows that the optimization is off at -O0 and on at -O1, so it must be one of the following:
-fauto-inc-dec
-fcprop-registers
-fdce
-fdefer-pop
-fdelayed-branch
-fdse
-fguess-branch-probability
-fif-conversion2
-fif-conversion
-finline-small-functions
-fipa-pure-const
-fipa-reference
-fmerge-constants
-fsplit-wide-types
-ftree-builtin-call-dce
-ftree-ccp
-ftree-ch
-ftree-copyrename
-ftree-dce
-ftree-dominator-opts
-ftree-dse
-ftree-fre
-ftree-sra
-ftree-ter
-funit-at-a-time
C code:
struct p {
    long x;
    long y;
    long z;
};
__attribute__((noinline))
struct p f(void) {
    struct p copy;
    copy.x = 1;
    copy.y = 2;
    copy.z = 3;
    return copy;
}
int main(int argc, char** argv) {
    volatile struct p inst = f();
    return 0;
}
Compiled with -O0, the 'copy' structure is naively allocated on f's own stack frame and then copied field by field into the caller's buffer, whose address arrives in rdi:
00000000004004b6 <f>:
4004b6: 55 push rbp
4004b7: 48 89 e5 mov rbp,rsp
4004ba: 48 89 7d d8 mov QWORD PTR [rbp-0x28],rdi
4004be: 48 c7 45 e0 01 00 00 mov QWORD PTR [rbp-0x20],0x1
4004c5: 00
4004c6: 48 c7 45 e8 02 00 00 mov QWORD PTR [rbp-0x18],0x2
4004cd: 00
4004ce: 48 c7 45 f0 03 00 00 mov QWORD PTR [rbp-0x10],0x3
4004d5: 00
4004d6: 48 8b 45 d8 mov rax,QWORD PTR [rbp-0x28]
4004da: 48 8b 55 e0 mov rdx,QWORD PTR [rbp-0x20]
4004de: 48 89 10 mov QWORD PTR [rax],rdx
4004e1: 48 8b 55 e8 mov rdx,QWORD PTR [rbp-0x18]
4004e5: 48 89 50 08 mov QWORD PTR [rax+0x8],rdx
4004e9: 48 8b 55 f0 mov rdx,QWORD PTR [rbp-0x10]
4004ed: 48 89 50 10 mov QWORD PTR [rax+0x10],rdx
4004f1: 48 8b 45 d8 mov rax,QWORD PTR [rbp-0x28]
4004f5: 5d pop rbp
4004f6: c3 ret
Compiled with -O1, no local copy is allocated: f writes the fields directly through the pointer to the caller's buffer, which is passed in rdi as an implicit argument:
00000000004004b6 <f>:
4004b6: 48 89 f8 mov rax,rdi
4004b9: 48 c7 07 01 00 00 00 mov QWORD PTR [rdi],0x1
4004c0: 48 c7 47 08 02 00 00 mov QWORD PTR [rdi+0x8],0x2
4004c7: 00
4004c8: 48 c7 47 10 03 00 00 mov QWORD PTR [rdi+0x10],0x3
4004cf: 00
4004d0: c3 ret
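To make the difference between the two listings easier to see, here is a rough source-level sketch of what each one does once the hidden return-slot pointer is written out explicitly. This is only an illustration of the generated code, not what GCC does internally; the names f_with_copy, f_in_place and check_equivalence are made up for this example.

```c
#include <assert.h>

struct p {
    long x;
    long y;
    long z;
};

/* Hypothetical name. Mirrors the -O0 listing: build a local object,
   then copy it field by field into the caller's return slot. */
struct p *f_with_copy(struct p *ret)
{
    struct p copy;
    copy.x = 1;
    copy.y = 2;
    copy.z = 3;
    ret->x = copy.x;
    ret->y = copy.y;
    ret->z = copy.z;
    return ret; /* the slot address is handed back, as rax is in the listing */
}

/* Hypothetical name. Mirrors the -O1 listing: no local object,
   the fields are stored straight into the caller's return slot. */
struct p *f_in_place(struct p *ret)
{
    ret->x = 1;
    ret->y = 2;
    ret->z = 3;
    return ret;
}

/* Hypothetical helper: both variants must leave the same values behind. */
void check_equivalence(void)
{
    struct p a, b;
    f_with_copy(&a);
    f_in_place(&b);
    assert(a.x == b.x && a.y == b.y && a.z == b.z);
}
```

The caller sees the same result either way; the only difference is whether the temporary 'copy' object exists at all.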
The closest thing to that in GCC (i.e. a switch for copy elision) is -fcprop-registers. Copy elision doesn't exist in C, but this is the most similar feature to it. From the man page: