Motivation
Suppose I have:
int some_bss_values[8];
int some_data_values[] = {1,2,3,4,5,6,7,8};
int const some_rodata_values[] = {9,10,11,12,13,14,15,16};
//Some silly code that shows usage of bss, data, and rodata
int some_function(int x) {
    some_bss_values[x] = some_rodata_values[x];
    return some_data_values[x]++;
}
My ultimate goal is to compile this to a binary blob that I could load at runtime (this is an embedded system, so no dynamic linkers or even ELF loaders). Specifically, I want to be able to load this blob, data included, at any address and jump to some_function.
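For context, the loader side I have in mind would look roughly like the sketch below. The names load_blob and entry_fn are hypothetical, and I am assuming that the entry point's offset within the image is known and that the destination memory is executable on the target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Signature of the blob's entry point (some_function in the example above).
using entry_fn = int (*)(int);

// Hypothetical helper: copy the raw image to `dest` and return a pointer to
// the entry point, which sits `entry_offset` bytes into the image.
// The caller must make sure `dest` is writable, executable and large enough.
inline entry_fn load_blob(void* dest, const void* image, std::size_t size,
                          std::size_t entry_offset) {
    assert(entry_offset < size);
    std::memcpy(dest, image, size);
    unsigned char* entry = static_cast<unsigned char*>(dest) + entry_offset;
    // Data-to-function pointer casts are only conditionally supported by the
    // C++ standard; GCC accepts them on the usual targets.
    return reinterpret_cast<entry_fn>(entry);
}
```

This only works if nothing inside the image depends on the address it was linked at, which is exactly the property I am trying to get.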
What I tried
I tried writing a simple linker script:
SECTIONS {
    .text : {
        *(.text);
    }
    .data : {
        *(.bss); /* Placed here because I explicitly want the BSS to be part of the image */
        *(.data);
    } =0
    .rodata : { *(.rodata); }
}
I compiled the example code to an ELF with:
# Using -g and -O0 so we can have a readable disassembly. I'm actually using
# a cross-compiler but we'll use regular gcc with x86 for the sake of the question
gcc -Wl,-esome_function -o t.elf -fPIC -nostdlib -T my_linker_script.ld -g -O0 my_code.cpp
Then I generated an image with:
objcopy -Obinary -j.text -j.data -j.rodata t.elf t.bin
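As far as I understand objcopy's binary output, the image starts at the lowest load address among the copied sections, so an object's offset inside t.bin is its VMA minus that lowest address. The sketch below works this out for the addresses that appear in the objdump output further down (0x24, 0x80, 0xa0, 0xe0). The helper names are hypothetical, and the mapping of each address to a symbol is my reading of the listing.

```cpp
#include <cassert>
#include <cstdint>

// VMAs read from the objdump listing in this question.
constexpr std::uint64_t kTextVma   = 0x24;  // .text, i.e. some_function
constexpr std::uint64_t kBssVma    = 0x80;  // some_bss_values (start of .data)
constexpr std::uint64_t kDataVma   = 0xa0;  // some_data_values
constexpr std::uint64_t kRodataVma = 0xe0;  // some_rodata_values

// Offset of an object inside the raw image, assuming the image begins at
// `lowest_vma` (here .text, since .note.gnu.build-id is not copied).
constexpr std::uint64_t blob_offset(std::uint64_t vma, std::uint64_t lowest_vma) {
    return vma - lowest_vma;
}

// Where the object ends up once the image is copied to `load_address`.
inline std::uint64_t runtime_address(std::uint64_t load_address, std::uint64_t vma) {
    assert(vma >= kTextVma);
    return load_address + blob_offset(vma, kTextVma);
}
```

Under these assumptions some_function sits at offset 0 of the image, so the loader can jump straight to the start of the blob.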
The problem
The above commands produce a binary blob for me to use, including the explicit zeroes for the BSS, but looking at the disassembly highlights a problem:
objdump -sSxC t.elf
...
Sections:
Idx Name               Size      VMA               LMA               File off  Algn
  0 .note.gnu.build-id 00000024  0000000000000000  0000000000000000  00200000  2**2
                       CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text              00000053  0000000000000024  0000000000000024  00200024  2**0
                       CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .data              00000040  0000000000000080  0000000000000080  00200080  2**5
                       CONTENTS, ALLOC, LOAD, DATA
...
Disassembly of section .text:
0000000000000024 <some_function(int)>:
int some_bss_values[8];
int some_data_values[] = {1,2,3,4,5,6,7,8};
int const some_rodata_values[] = {9,10,11,12,13,14,15,16};
//Some silly code that shows usage of bss, data, and rodata
int some_function(int x) {
24: 55 push %rbp
25: 48 89 e5 mov %rsp,%rbp
28: 89 7d fc mov %edi,-0x4(%rbp)
some_bss_values[x] = some_rodata_values[x];
2b: 8b 45 fc mov -0x4(%rbp),%eax
2e: 48 98 cltq
30: 48 8d 14 85 00 00 00 lea 0x0(,%rax,4),%rdx
37: 00
38: 48 8d 05 a1 00 00 00 lea 0xa1(%rip),%rax # e0 <some_rodata_values>
3f: 8b 0c 02 mov (%rdx,%rax,1),%ecx
42: 48 c7 c0 80 00 00 00 mov $0x80,%rax
49: 8b 55 fc mov -0x4(%rbp),%edx
4c: 48 63 d2 movslq %edx,%rdx
4f: 89 0c 90 mov %ecx,(%rax,%rdx,4)
return some_data_values[x]++;
52: 48 c7 c0 a0 00 00 00 mov $0xa0,%rax
59: 8b 55 fc mov -0x4(%rbp),%edx
5c: 48 63 d2 movslq %edx,%rdx
5f: 8b 04 90 mov (%rax,%rdx,4),%eax
62: 8d 70 01 lea 0x1(%rax),%esi
65: 48 c7 c2 a0 00 00 00 mov $0xa0,%rdx
6c: 8b 4d fc mov -0x4(%rbp),%ecx
6f: 48 63 c9 movslq %ecx,%rcx
72: 89 34 8a mov %esi,(%rdx,%rcx,4)
}
75: 5d pop %rbp
76: c3 retq
Here we see that the read from rodata correctly uses instruction-relative (RIP-relative) addressing. However, the BSS array is accessed through a hardcoded absolute address of 0x80, and the data array through a hardcoded absolute address of 0xA0.
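To make the difference concrete, here is a small sketch of the address arithmetic, using values from the listing above. kLoadBase is a hypothetical load address chosen only for illustration, and the function names are mine.

```cpp
#include <cassert>
#include <cstdint>

// Effective address of a RIP-relative operand: the address of the *next*
// instruction plus the signed 32-bit displacement stored in the instruction.
constexpr std::uint64_t rip_relative_target(std::uint64_t next_insn_addr,
                                            std::int32_t disp) {
    return next_insn_addr +
           static_cast<std::uint64_t>(static_cast<std::int64_t>(disp));
}

// An absolute immediate is used as-is, wherever the code was loaded.
constexpr std::uint64_t absolute_target(std::uint64_t imm) { return imm; }

// Hypothetical amount by which the whole image is shifted at load time.
constexpr std::uint64_t kLoadBase = 0x20000000;

inline void check_against_listing() {
    // "lea 0xa1(%rip),%rax" at 0x38 is 7 bytes long, so RIP is 0x3f.
    assert(rip_relative_target(0x3f, 0xa1) == 0xe0);
    // Shifting the code shifts the RIP-relative target by the same amount.
    assert(rip_relative_target(kLoadBase + 0x3f, 0xa1) == kLoadBase + 0xe0);
    // "mov $0x80,%rax" still yields 0x80, not kLoadBase + 0x80.
    assert(absolute_target(0x80) == 0x80);
}
```

The first two checks are why the rodata access survives relocation, and the last one is why the BSS and data accesses do not.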
How can I instruct gcc/ld to use instruction-relative addressing for data in the BSS and .data sections?