I tried to compile C code to assembly using gcc -S -fasm foo.c.
The C code declares a global variable and a local variable in the main function, as shown below:
int y=6;
int main()
{
    int x=4;
    x=x+y;
    return 0;
}
Then I looked at the assembly code generated from this C code and saw that the global variable y is accessed using the value of the rip instruction pointer.
I thought that only const global variables are stored in the text segment, but this example seems to show that regular global variables are also stored in the text segment, which is very weird.
I guess that some assumption I made is wrong, so can someone please explain it to me?
The assembly code generated by the C compiler:
    .file "foo.c"
    .text
    .globl y
    .data
    .align 4
    .type y, @object
    .size y, 4
y:
    .long 6
    .text
    .globl main
    .type main, @function
main:
.LFB0:
    .cfi_startproc
    pushq %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq %rsp, %rbp
    .cfi_def_cfa_register 6
    movl $4, -4(%rbp)
    movl y(%rip), %eax
    addl %eax, -4(%rbp)
    movl $0, %eax
    popq %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
The offsets between different sections of your executable are link-time constants, so RIP-relative addressing is usable for any section (including .data, where your non-const globals are). Note the .data in your asm output.
This applies even in a PIE executable or shared library, where the absolute addresses are not known until runtime (ASLR).
Runtime ASLR for position-independent executables (PIE) randomizes one base address for the entire program, not individual segment start addresses relative to each other.
All access to static variables uses RIP-relative addressing because that's most efficient, even in a position-dependent executable where absolute addressing is an option (because absolute addresses of static code/data are link-time constants, not relocated by dynamic linking in that case.)
In 32-bit x86, there are 2 redundant ways to encode an addressing mode with no registers and a disp32 absolute address (with and without a SIB byte). x86-64 repurposed the shorter one as RIP+rel32, so mov foo, %eax is 1 byte longer than mov foo(%rip), %eax.
64-bit absolute addressing would take even more space, and is only available for mov to/from RAX/EAX/AX/AL unless you use a separate instruction to get the address into a register first.
(In x86-64 Linux PIE/PIC, 64-bit absolute addressing is allowed, and handled via load-time fixups to put the right address into the code or jump table or statically-initialized function pointer. So code doesn't technically have to be position-independent, but normally it's more efficient to be. And 32-bit absolute addressing isn't allowed, because ASLR isn't limited to the low 31 bits of virtual address space.)
Note that in a non-PIE Linux executable, gcc will use 32-bit absolute addressing for putting the address of static data in a register. For example, puts("hello"); will typically load the string's address with mov $.LC0, %edi before the call.
In the default non-PIE memory model, static code and data get linked into the low 2 GiB of virtual address space, so 32-bit absolute addresses work whether they're zero- or sign-extended to 64-bit. This is handy for indexing static arrays, too, like mov array(%rax), %edx ; add $4, %eax for example.
See "32-bit absolute addresses no longer allowed in x86-64 Linux?" for more about PIE executables, which use position-independent code for everything, including RIP-relative LEA like 7-byte lea .LC0(%rip), %rdi instead of 5-byte mov $.LC0, %edi. See also "How to load address of function or label into register".
I mention Linux because it looks from the .cfi directives like you're compiling for a non-Windows platform.