I'm having trouble understanding a very basic x86 instruction. The instruction is
0x080491d7 <+1>: mov %esp,%ebp
I know that it moves the value of esp into ebp, but I'm trying to understand the encoding. The instruction is 2 bytes long, and I would have expected it to be only 1 byte.
The memory for this instruction is:
0x80491d7 <main+1>: 0x89 0xe5
I know that 0x89 is one of the opcodes for MOV. I've been reading the Intel manuals. I don't know what 0xe5 is for. Is it a suffix, another opcode value, or something else? The Intel manual is a little confusing.
The C program is compiled for 32-bit x86, and the Linux server is x86_64.
You found that the mov %esp, %ebp instruction got encoded with 2 bytes: 0x89, 0xE5.
Consulting the Intel manuals is the right thing to do, but I would advise looking at your instruction in Intel syntax, mov ebp, esp, because the manuals list the destination operand first. It might save you from an inadvertent error when interpreting the opcode tables.
Looking up 89h in the one-byte opcode table, you find the operand entry "Ev, Gv".
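The "Using opcode tables" chapter explains what these character combinations mean. "E" says that a ModR/M byte follows the opcode and specifies the operand, which is either a general-purpose register or a memory address. "G" says that the reg field of that ModR/M byte selects a general-purpose register. "v" says that the operand is a word or a doubleword (or a quadword in 64-bit mode), depending on the operand-size attribute.
So that second byte is a ModR/M byte.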
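Your ModR/M byte is E5h, or 11'100'101b in binary notation following the grouping 'mod-reg-r/m'. A mod field of 11b means that the r/m field names a register rather than a memory operand, so both operands are registers: register number 100b in the reg field and register number 101b in the r/m field.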
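To make the 'mod-reg-r/m' grouping concrete, here is a minimal Python sketch that splits a ModR/M byte into its three fields with shifts and masks. The helper name split_modrm is made up for this illustration; it is not part of any library.

```python
def split_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) bit fields."""
    mod = (byte >> 6) & 0b11   # bits 7..6: addressing mode (11b = register)
    reg = (byte >> 3) & 0b111  # bits 5..3: register number
    rm = byte & 0b111          # bits 2..0: register number or memory form
    return mod, reg, rm


mod, reg, rm = split_modrm(0xE5)
print(f"mod={mod:02b} reg={reg:03b} r/m={rm:03b}")  # prints "mod=11 reg=100 r/m=101"
```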
Which registers? For that we look at the opcode 89h or 100010'0'1b in binary notation following the grouping 'TTTTTT-d-w'.
Bit 0 (w) is set, which tells us this is a (d)word-sized operation rather than a byte-sized one (which accords with the "v" above). Since this is 32-bit code and no operand-size prefix (0x66) was used, the operands are 32 bits wide, so register numbers 100b and 101b can only mean ESP and EBP.
Bit 1 (d) tells us which of these operands is the source and which is the destination (which accords with the order "E, G" above). Since this bit is 0, the reg field (ESP) indicates the source and the r/m field (EBP) indicates the destination. With the d bit set it would be the other way round, which means the bytes 0x8B, 0xEC would also be a perfectly valid encoding for your instruction mov %esp, %ebp.
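The whole decoding can be checked with a short Python sketch. It is deliberately narrow: it assumes 32-bit code with no prefixes and handles only the register-to-register forms of opcodes 0x89 and 0x8B. The names REG32 and decode_mov_reg_reg are invented for this example, and the result is printed in Intel syntax (destination first).

```python
# 32-bit general-purpose registers, indexed by their 3-bit register number.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_reg_reg(code):
    """Decode a 2-byte register-to-register MOV (opcode 0x89 or 0x8B).

    Assumes 32-bit code and no prefixes; other inputs raise ValueError.
    """
    opcode, modrm = code
    if opcode not in (0x89, 0x8B):
        raise ValueError("only opcodes 0x89 and 0x8B are handled")
    mod = (modrm >> 6) & 0b11
    reg = (modrm >> 3) & 0b111
    rm = modrm & 0b111
    if mod != 0b11:
        raise ValueError("r/m is a memory operand, not handled here")
    d = (opcode >> 1) & 1  # direction bit: 0 = reg is source, 1 = reg is destination
    dst, src = (reg, rm) if d else (rm, reg)
    return f"mov {REG32[dst]}, {REG32[src]}"


print(decode_mov_reg_reg(bytes([0x89, 0xE5])))  # prints "mov ebp, esp"
print(decode_mov_reg_reg(bytes([0x8B, 0xEC])))  # prints "mov ebp, esp"
```

Both byte sequences decode to the same instruction, which is why an assembler is free to pick either one.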