following this instructions I have managed to produce only 528 bytes in size a.out (when gcc main.c gave me 8539 bytes big file initially).
main.c was:
int main(int argc, char** argv) {
return 42;
}
but I have built a.out from this assembly file instead:
main.s:
; tiny.asm
BITS 64
GLOBAL _start
SECTION .text
_start:
mov eax, 1
mov ebx, 42
int 0x80
with:
me@comp# nasm -f elf64 tiny.s
me@comp# gcc -Wall -s -nostartfiles -nostdlib tiny.o
me@comp# ./a.out ; echo $?
42
me@comp# wc -c a.out
528 a.out
because I need machine code I do:
objdump -d a.out
a.out: file format elf64-x86-64
Disassembly of section .text:
00000000004000e0 <.text>:
4000e0: b8 01 00 00 00 mov $0x1,%eax
4000e5: bb 2a 00 00 00 mov $0x2a,%ebx
4000ea: cd 80 int $0x80
># objdump -hrt a.out
a.out: file format elf64-x86-64
Sections:
Idx Name Size VMA LMA File off Algn
0 .note.gnu.build-id 00000024 00000000004000b0 00000000004000b0 000000b0 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
1 .text 0000000c 00000000004000e0 00000000004000e0 000000e0 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
SYMBOL TABLE:
no symbols
file is in little endian convention:
me@comp# readelf -a a.out
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x4000e0
Start of program headers: 64 (bytes into file)
Start of section headers: 272 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 2
Size of section headers: 64 (bytes)
Number of section headers: 4
Section header string table index: 3
now I want to execute this like this:
#include <unistd.h>
// which version is (more) correct?
// this might be related to endiannes (???)
char code[] = "\x01\xb8\x00\x00\xbb\x00\x00\x2a\x00\x00\x80\xcd\x00";
char code_v1[] = "\xb8\x01\x00\x00\x00\xbb\x2a\x00\x00\x00\xcd\x80\x00";
int main(int argc, char **argv)
{
/*creating a function pointer*/
int (*func)();
func = (int (*)()) code;
(int)(*func)();
return 0;
}
however I get segmentation fault. My question is: is this section of text
4000e0: b8 01 00 00 00 mov $0x1,%eax
4000e5: bb 2a 00 00 00 mov $0x2a,%ebx
4000ea: cd 80 int $0x80
(this machine code) all I really need? What I do wrong (endiannes??), maybe I just need to call this in different way since SIGSEGV?
The code must be in a page with execute permission. By default, stack and read-write static data (like non-const globals) are in pages mapped without exec permission, for security reasons.
The simplest way is to compile with
gcc -z execstack
, which links your program such that stack and global variables (static storage) get mapped in executable pages, and so do allocations withmalloc
.Another way to do it without making everything executable is to copy this binary machine code into an executable buffer.
Without
__builtin___clear_cache
, this could break with optimization enabled because gcc would think thememcpy
was a dead store and optimize it away. When compiling for x86,__builtin___clear_cache
does not actually clear any cache; there are zero extra instructions; it just marks the memory as "used" so stores to it aren't considered "dead". (See the gcc manual.)Another option would be to
mprotect
the page containing thechar code[]
array, giving itPROT_READ|PROT_WRITE|PROT_EXEC
. This works whether it's a local array (on the stack) or global in the.data
.Or if it's
const char code[]
in the.rodata
section, you might just give itPROT_READ|PROT_EXEC
.(In versions of binutils
ld
from before about 2019, the.rodata
got linked as part of the same segment as.text
, and was already mapped executable. But recentld
gives it a separate segment so it can be mapped without exec permission soconst char code[]
doesn't give you an executable array anymore, but it used to so you may this old advice in other places.)