I have this code:
    global main
    [BITS 64]
    section .text
    main:
        mov r13, 0x1234
        mov rax, 60
        mov rdi, 0
        syscall
When I encode the instruction mov r13, 0x1234 by hand, I get the hexadecimal code 0x48_BD_34_12_00_00.
The opcode of the instruction is REX.W + B8+ rd io (I guess).
When I assemble my file on Linux, the resulting hexadecimal encoding is 0x41_BD_34_12_00_00.
41h is 0100_0001b. But REX.W means W = 1, so it should be 0100_1001b.
So I don't understand why the REX prefix is 41h and not 49h.
There are two reasons for this.
First, the instruction NASM encodes is actually mov r13d, 0x1234 instead of mov r13, 0x1234. This is because the former instruction is shorter but does the same thing: writing a 32-bit register zero-extends the result into the full 64-bit register.
Now why do we see the encoding 41 BD 34 12 00 00? Here's an explanation:
The register we want to encode has number 13 (1101b). The low 3 bits of this register number (101b) are encoded in the opcode byte, giving B8h + 5 = BDh. The high bit is encoded in the REX.B bit. Hence, a REX.B prefix (41h) is needed.
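To make the bit-splitting concrete, here is a minimal Python sketch of how the bytes for mov r32, imm32 could be assembled by hand. The function name encode_mov_r32_imm32 is made up for illustration and is not part of NASM or any library; it only covers this one instruction form.

```python
import struct

def encode_mov_r32_imm32(reg: int, imm: int) -> bytes:
    """Hand-encode `mov r32, imm32` (opcode B8+rd) for register numbers 0..15."""
    rex_b = reg >> 3      # high bit of the register number goes into REX.B
    low3 = reg & 0b111    # low 3 bits are added to the opcode byte B8h
    out = bytearray()
    if rex_b:
        # REX byte layout is 0100WRXB; here W = R = X = 0 and B = 1
        out.append(0x40 | rex_b)
    out.append(0xB8 + low3)
    out += struct.pack("<I", imm)  # 32-bit little-endian immediate
    return bytes(out)

# r13d is register number 13
print(encode_mov_r32_imm32(13, 0x1234).hex(" "))  # 41 bd 34 12 00 00
```

For a register below 8 (for example eax, number 0) the sketch emits no REX prefix at all, which is why only r8d through r15d need the extra 41h byte in this form.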
If we wanted to encode mov r13, 0x1234 as nasm -O0 would, like mov r13, strict qword 0x1234, it would look like this:
49 BD 34 12 00 00 00 00 00 00
Here we have a REX.BW prefix 49 to encode both the additional register bit and the 64-bit operand width. This is the mov r64, imm64 encoding, same opcode as mov r32, imm32 but with a REX.W.
Assemblers that don't optimize to a 32-bit register but do pick the shortest encoding for what you wrote (e.g. YASM or GAS) would use the mov r/m64, sign_extended_imm32 encoding, which you can get from NASM with mov r13, strict dword 0x1234:
49 C7 C5 34 12 00 00
The C7 and C5 bytes are opcode and Mod/RM, followed by a 4-byte immediate.
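As a rough cross-check of the two 64-bit forms, the following Python sketch builds both encodings by hand so their lengths can be compared. The helper names (rex, mov_r64_imm64, mov_rm64_simm32) are hypothetical and only handle the register-direct case discussed here.

```python
import struct

def rex(w: int, r: int, x: int, b: int) -> int:
    """Build a REX prefix byte with the layout 0100WRXB."""
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b

def mov_r64_imm64(reg: int, imm: int) -> bytes:
    """`mov r64, imm64`: REX.W + B8+rd, followed by an 8-byte immediate."""
    low3, high = reg & 7, reg >> 3
    return bytes([rex(1, 0, 0, high), 0xB8 + low3]) + struct.pack("<Q", imm)

def mov_rm64_simm32(reg: int, imm: int) -> bytes:
    """`mov r/m64, imm32` (sign-extended): REX.W + C7 /0, register-direct only."""
    low3, high = reg & 7, reg >> 3
    modrm = 0b11_000_000 | low3  # mod = 11 (register), reg field = /0, rm = low 3 bits
    return bytes([rex(1, 0, 0, high), 0xC7, modrm]) + struct.pack("<i", imm)

print(mov_r64_imm64(13, 0x1234).hex(" "))    # 49 bd 34 12 00 00 00 00 00 00
print(mov_rm64_simm32(13, 0x1234).hex(" "))  # 49 c7 c5 34 12 00 00
```

The imm64 form comes out at 10 bytes and the sign-extended imm32 form at 7 bytes, compared with 6 bytes for the mov r13d, 0x1234 encoding that NASM picked.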