I am trying to understand the assembly of this simple C program.
#include<stdio.h>
#include<unistd.h>
#include<fcntl.h>
#include<string.h>
void foobar(char *a){
char c = a[0];
}
int main(){
int fd = open("file.txt", O_RDONLY);
char buf1[100]="\0";
char buf[100];
int aa=0,b=1,c=2,d=3,f=2,g=3;
read(fd,buf1,104);
if(strlen(buf1) > 100){
}else{
strcpy(buf,buf1);
}
//strcpy(buf,buf1);
foobar(buf1);
}
Disassembling the executable with gdb gave the following for foobar:
0x000000000040067d <+0>: push rbp
0x000000000040067e <+1>: mov rbp,rsp
0x0000000000400681 <+4>: mov QWORD PTR [rbp-0x18],rdi
0x0000000000400685 <+8>: mov rax,QWORD PTR [rbp-0x18]
0x0000000000400689 <+12>: movzx eax,BYTE PTR [rax]
0x000000000040068c <+15>: mov BYTE PTR [rbp-0x1],al
0x000000000040068f <+18>: pop rbp
0x0000000000400690 <+19>: ret
main disassembly around the call to foobar:
0x0000000000400784 <+243>: lea rax,[rbp-0xf0]
0x000000000040078b <+250>: mov rdi,rax
0x000000000040078e <+253>: call 0x40067d <foobar>
0x0000000000400793 <+258>: mov rbx,QWORD PTR [rbp-0x18]
0x0000000000400797 <+262>: xor rbx,QWORD PTR fs:0x28
0x00000000004007a0 <+271>: je 0x4007a7 <main+278>
Now, I have a question regarding the disassembly of foobar:
0x0000000000400681 <+4>: mov QWORD PTR [rbp-0x18],rdi
0x0000000000400685 <+8>: mov rax,QWORD PTR [rbp-0x18]
Wouldn't the instruction
mov rax, rdi
do the work of the two instructions above? Why use the extra memory location rbp-0x18 for rdi?
Is it related to pass by reference?
Edit:
Another question: why is foobar accessing a location (rbp-0x18) that is not in foobar's own frame?
My gcc version is gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2
Edit: After compiling with the -O1, -O2, or -O3 optimization flag, the foobar assembly changes to
0x0000000000400670 <+0>: repz ret
and with the -O3 flag, part of the disassembly of main is
0x0000000000400551 <+81>: rep stos QWORD PTR es:[rdi],rax
0x0000000000400554 <+84>: mov DWORD PTR [rdi],0x0
0x000000000040055a <+90>: mov cl,0x64
0x000000000040055c <+92>: mov edi,r8d
0x000000000040055f <+95>: call 0x4004b0 <__read_chk@plt>
0x0000000000400564 <+100>: mov rdx,QWORD PTR [rsp+0x68]
0x0000000000400569 <+105>: xor rdx,QWORD PTR fs:0x28
0x0000000000400572 <+114>: jne 0x400579 <main+121>
0x0000000000400574 <+116>: add rsp,0x78
0x0000000000400578 <+120>: ret
0x0000000000400579 <+121>: call 0x4004c0 <__stack_chk_fail@plt>
I can't find any call to foobar in main.
This is a good question. I commend you for "peeking under the hood", so to speak.
Tons of research has gone into compiling code. Sometimes you want code to run fast, sometimes you want it to be small, and sometimes you want it to compile quickly. Thanks to compiler research, a compiler can generate code that favors any of these goals. To let users pick which one they want, gcc has command-line options that control the level of optimization.
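By default, gcc uses -O0, which performs almost no optimization and instead favors fast compilation and predictable debugging. At this level every variable, including a function parameter such as a, gets its own slot in memory, so the compiler stores rdi to the stack and immediately loads it back instead of emitting mov rax, rdi. This has nothing to do with pass by reference. Because of this, you will often find inefficient instruction sequences at -O0.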
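To make the -O0 behavior concrete, here is a minimal sketch that annotates a variant of foobar with the instructions from your disassembly. The name first_char and the return value are my additions (assumptions for illustration, so the result can be observed); with the extra return, the real output would contain a few more instructions than shown in the comments.

```c
/* Hypothetical variant of foobar: it returns the byte so the result is
 * observable. The instruction comments are copied from the question's
 * -O0 disassembly of foobar. */
char first_char(const char *a)
{
    /* On entry, the parameter a (passed in rdi) is stored to a stack slot:
     *     mov   QWORD PTR [rbp-0x18],rdi
     */

    char c = a[0];
    /* Reading a reloads it from that slot, then the byte it points to is
     * loaded and stored into the slot for c:
     *     mov   rax,QWORD PTR [rbp-0x18]
     *     movzx eax,BYTE PTR [rax]
     *     mov   BYTE PTR [rbp-0x1],al
     */

    return c;
}
```

Compiling something like this with gcc -O0 -S and again with -O2 -S should show the store and reload disappearing once optimization is on.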
When you turn on an optimization flag such as -O3, the compiler can inline the code for foobar. As you know, function calls take time, so if a function is short enough, the compiler copies its body into the caller instead of calling it, thereby eliminating the need for the call and ret instructions. This makes the code a tiny bit faster, but it can also make it bigger.
Consider a 100-instruction function that is called from 100 places. If this function is inlined everywhere, the code size will increase drastically, for not much extra speed. So the compiler generally inlines a function only when optimization is enabled and its heuristics judge the function small enough to be worth it.
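As a rough illustration of what inlining means, the sketch below shows a small helper called in a loop, next to a hand-written version of what the compiler effectively produces after inlining. All names here (square, sum_squares_call, sum_squares_inlined) are hypothetical, and whether gcc actually inlines a given call depends on its heuristics and flags.

```c
/* Small helper: a typical inlining candidate. */
static inline int square(int x)
{
    return x * x;
}

/* Version with a call: without inlining, each iteration executes a
 * call and a ret in addition to the multiply. */
int sum_squares_call(int n)
{
    int total = 0;
    int i;
    for (i = 1; i <= n; i++) {
        total += square(i);
    }
    return total;
}

/* Hand-inlined version: the body of square is pasted into the loop,
 * so no call or ret is needed. */
int sum_squares_inlined(int n)
{
    int total = 0;
    int i;
    for (i = 1; i <= n; i++) {
        total += i * i;
    }
    return total;
}
```

Both functions compute the same result; the only difference is whether a call is involved, which is the trade-off between speed and code size described above.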
You have probably noticed that nothing appears in main in place of the call to foobar, and that foobar itself has shrunk to a single repz ret. Its body has been "optimized out", meaning that the compiler deleted it. This is because the compiler can tell that foobar doesn't do anything useful: it reads a[0] into a local variable that is never used, so it has no side effects. The empty function is still emitted because code in another file could call it. At -O0, gcc does not remove such code. At higher optimization levels, it removes computations with no side effects, which saves both time and space.
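Here is a small sketch of the difference between code with and without side effects. The function names and the calls_seen counter are hypothetical, made up for illustration. The first function has the same shape as foobar, so an optimizing compiler is free to drop its body and any calls to it. The other two do something observable (a volatile store, a write to a global), so that work has to stay.

```c
/* Global state: modifying it is a side effect the compiler must keep. */
static int calls_seen = 0;

/* Same shape as foobar: the byte is read into a local that is never
 * used, so with optimization on the body may be removed entirely. */
void no_effect(const char *a)
{
    char c = a[0];
    (void)c;
}

/* Accesses to a volatile object count as observable behavior, so the
 * store to c may not be removed, even though c is never read. */
void volatile_store(const char *a)
{
    volatile char c = a[0];
    (void)c;
}

/* Updates a global and returns a value the caller can use. */
int counted(const char *a)
{
    calls_seen++;
    return a[0];
}

/* Accessor so the effect of counted can be checked from outside. */
int get_calls_seen(void)
{
    return calls_seen;
}
```

If you want foobar to survive optimization while you study the assembly, giving it an observable effect like one of these is one way to do it.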
I haven't written x86 assembly in a few years (just ARM nowadays), but if I recall correctly, repz ret is practically a more efficient form of ret on some AMD processors, because of how their branch predictors handle a ret that is the target of a branch. More info can be found here.
I have to go to sleep now. If you still have questions, I will respond later :).