I have this code:
global main
[BITS 64]
section .text
main:
mov r13, 0x1234
mov rax, 60
mov rdi, 0
syscall
When I translate the instruction mov r13, 0x1234 by hand, I get the hexadecimal code 0x48_BD_34_12_00_00.
The opcode of the instruction is REX.W + B8+ rd io (I guess).
When I assemble my file on Linux, the resulting hexadecimal encoding is 0x41_BD_34_12_00_00.
41 is 0100_0001b. But REX.W says that W = 1, so it should be 0100_1001b.
So I don't understand why the REX prefix is 41h and not 49h.
There are two reasons for this.
First, the instruction NASM encodes is actually mov r13d, 0x1234 instead of mov r13, 0x1234. This is because the former instruction is shorter but does the same thing: writing a 32-bit register zero-extends the result into the full 64-bit register.
Now why do we see the encoding 41 BD 34 12 00 00? Here's an explanation:
The register we want to encode has number 13 (1101b). The low 3 bits of this register number (101b) are encoded in the opcode byte, giving B8 + 5 = BD. The high bit is encoded in the REX.B bit. Hence, a REX.B prefix (41) is needed. Since the operand size is 32 bits, REX.W stays clear.
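The split of the register number between the opcode byte and REX.B can be checked with a few lines of Python. This is only an illustrative sketch of the mov r32, imm32 form, not how NASM is implemented; the helper name encode_mov_r32_imm32 is made up for this example.

```python
import struct

def encode_mov_r32_imm32(reg: int, imm: int) -> bytes:
    """Encode `mov r32, imm32` (opcode B8+rd id) for register number 0..15.

    Hypothetical helper, for illustration only.
    """
    low3 = reg & 0b111       # low 3 bits are added to the opcode byte B8
    high = (reg >> 3) & 1    # high bit goes into REX.B
    out = bytearray()
    if high:
        # REX byte layout is 0100WRXB; only B is set here, so W stays 0.
        out.append(0x40 | high)
    out.append(0xB8 + low3)
    out += struct.pack("<I", imm & 0xFFFFFFFF)  # little-endian imm32
    return bytes(out)

r13d = encode_mov_r32_imm32(13, 0x1234)
print(r13d.hex(" "))  # 41 bd 34 12 00 00
```

For registers 0 to 7 (eax to edi) the high bit is 0, so the sketch emits no REX prefix at all.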
If we wanted to encode mov r13, 0x1234 as nasm -O0 would, like mov r13, strict qword 0x1234, it would look like this: 49 BD 34 12 00 00 00 00 00 00.
Here we have a REX.BW prefix 49 to encode both the additional register bit and the 64-bit operand width. This is the mov r64, imm64 encoding, same opcode as mov r32, imm32 but with a REX.W.
Assemblers that don't optimize to a 32-bit register but do pick the shortest encoding for what you wrote (e.g. YASM or GAS) would use the mov r/m64, sign_extended_imm32 encoding, which you can get from NASM with mov r13, strict dword 0x1234: 49 C7 C5 34 12 00 00.
The C7 and C5 bytes are opcode and Mod/RM, followed by a 4-byte immediate.
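To compare the two 64-bit forms side by side, here is a small Python sketch that builds both byte sequences for a register-direct destination. The function names (rex, mov_r64_imm64, mov_rm64_imm32) are hypothetical and exist only for this example; the sketch covers just these two encodings.

```python
import struct

REX_BASE = 0x40  # REX byte layout: 0100WRXB
REX_W = 0x08     # 64-bit operand size
REX_B = 0x01     # extends the register number in the opcode or ModRM.rm

def rex(w: bool, reg: int) -> int:
    """Build a REX prefix for a single register operand (number 0..15)."""
    return REX_BASE | (REX_W if w else 0) | (REX_B if reg & 8 else 0)

def mov_r64_imm64(reg: int, imm: int) -> bytes:
    """REX.W + B8+rd io: full 8-byte immediate."""
    opcode = 0xB8 + (reg & 7)
    return bytes([rex(True, reg), opcode]) + struct.pack("<Q", imm & 0xFFFFFFFFFFFFFFFF)

def mov_rm64_imm32(reg: int, imm: int) -> bytes:
    """REX.W + C7 /0 id: 4-byte immediate, sign-extended to 64 bits."""
    modrm = 0b11_000_000 | (reg & 7)  # mod=11 (register), reg field=/0, rm=low 3 bits
    return bytes([rex(True, reg), 0xC7, modrm]) + struct.pack("<i", imm)

long_form = mov_r64_imm64(13, 0x1234)
short_form = mov_rm64_imm32(13, 0x1234)
print(long_form.hex(" "))   # 49 bd 34 12 00 00 00 00 00 00
print(short_form.hex(" "))  # 49 c7 c5 34 12 00 00
```

The imm64 form costs 10 bytes and the sign-extended imm32 form 7, against 6 for mov r13d, 0x1234, which is why an optimizing assembler prefers the 32-bit form when the constant fits.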