Hex machine code for mov immediate to 64-bit register doesn't have a REX.W prefix?

I have this code:

global main
[BITS 64]

section .text
main:
     mov r13, 0x1234

     mov rax, 60
     mov rdi, 0
     syscall

When I encode the instruction mov r13, 0x1234 by hand, I get the hexadecimal code 0x48_BD_34_12_00_00.

The opcode of the instruction is REX.W + B8+ rd io (I guess).

When I assemble my file on Linux, the resulting machine code is 0x41_BD_34_12_00_00.

41h is 0100_0001b. But REX.W means that W = 1, so the prefix should be 0100_1001b.

So I don't understand why the REX prefix is 41h and not 49h.

There is 1 answer below.

There are two reasons for this.

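First, the instruction NASM actually encodes is mov r13d, 0x1234 rather than mov r13, 0x1234. Writing a 32-bit register zero-extends the result into the full 64-bit register, so for an immediate like 0x1234 that fits in 32 bits as an unsigned value, the shorter instruction has the same effect.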

Now why do we see this encoding? Here's an explanation:

41 bd 34 12 00 00
|| || |||||||||||
|| || ```````````-- immediate value
|| ``-------------- opcode b8 + reg (5)
``----------------- REX.B prefix

The register we want to encode has number 13 (1101b). The low 3 bits of this register number (101b = 5) are encoded in the opcode byte, giving b8 + 5 = bd. The high bit is encoded in the REX.B bit. Hence, a REX.B prefix (41) is needed, even though the operand size is only 32 bits.
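The bit arithmetic described above can be sketched in Python. This is only an illustration under stated assumptions: the helper names rex and mov_r32_imm32 are made up for this example and are not part of NASM or any library, and the sketch handles only register numbers 0 to 15 with the B8+rd form.

```python
def rex(w=0, r=0, x=0, b=0):
    """Build a REX prefix byte: fixed high nibble 0100, then the W, R, X, B bits."""
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b


def mov_r32_imm32(reg, imm):
    """Encode mov r32, imm32 (opcode B8+rd). Hypothetical helper, reg in 0..15."""
    out = bytearray()
    if reg >= 8:
        out.append(rex(b=1))          # high bit of the register number goes in REX.B
    out.append(0xB8 + (reg & 7))      # low 3 bits of the register number go in the opcode
    out += imm.to_bytes(4, "little")  # 32-bit immediate, little-endian
    return bytes(out)


encoded = mov_r32_imm32(13, 0x1234)
print(encoded.hex(" "))  # 41 bd 34 12 00 00
```

Note that rex(b=1) is 41h while rex(w=1, b=1) is 49h: the only difference between the two prefixes is the W bit.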

If we wanted to encode mov r13, 0x1234 literally, as nasm -O0 would, or as you can force with mov r13, strict qword 0x1234, it would look like this:

49 bd 34 12 00 00 00 00 00 00

Here we have a REX.BW prefix 49 to encode both the additional register bit and the 64-bit operand size. This is the mov r64, imm64 encoding: the same opcode as mov r32, imm32, but with REX.W set and an 8-byte immediate.
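As a rough sketch of how that 10-byte form is assembled from its parts, the following Python builds it byte by byte. The function name mov_r64_imm64 is hypothetical, and the code assumes a register number in the range 0 to 15.

```python
def mov_r64_imm64(reg, imm):
    """Encode mov r64, imm64 (REX.W + B8+rd io). Hypothetical helper, reg in 0..15."""
    prefix = 0x48 | (reg >> 3)       # 0100 1000b with REX.B = high bit of the register number
    opcode = 0xB8 + (reg & 7)        # low 3 bits of the register number
    # Mask to 64 bits so negative values wrap to their two's-complement form.
    immediate = (imm & 0xFFFFFFFFFFFFFFFF).to_bytes(8, "little")
    return bytes([prefix, opcode]) + immediate


encoded = mov_r64_imm64(13, 0x1234)
print(encoded.hex(" "))  # 49 bd 34 12 00 00 00 00 00 00
```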

Assemblers that don't optimize to a 32-bit register but do pick the shortest encoding for what you wrote (e.g. YASM or GAS) would use the mov r/m64, sign_extended_imm32 encoding, which you can get from NASM with mov r13, strict dword 0x1234. After the REX.BW prefix 49, the C7 and C5 bytes are the opcode and the ModR/M byte, followed by a 4-byte immediate.

49 c7 c5 34 12 00 00
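The ModR/M byte in this last form can be derived the same way. The Python below is a minimal sketch, assuming a register-direct destination and a register number in the range 0 to 15. The name mov_rm64_imm32 is invented for this example.

```python
def mov_rm64_imm32(reg, imm):
    """Encode mov r/m64, imm32 (REX.W + C7 /0 id) with a register destination."""
    prefix = 0x48 | (reg >> 3)       # REX.W, plus REX.B for r8..r15
    modrm = 0xC0 | (reg & 7)         # mod=11 (register direct), reg field=0 (/0), rm=low 3 bits
    # The immediate is 32 bits and is sign-extended to 64 bits by the CPU.
    immediate = imm.to_bytes(4, "little", signed=True)
    return bytes([prefix, 0xC7, modrm]) + immediate


encoded = mov_rm64_imm32(13, 0x1234)
print(encoded.hex(" "))  # 49 c7 c5 34 12 00 00

# Lengths of the three encodings discussed in this answer.
for text in ("41 bd 34 12 00 00",
             "49 c7 c5 34 12 00 00",
             "49 bd 34 12 00 00 00 00 00 00"):
    print(len(bytes.fromhex(text)), "bytes:", text)
```

The comparison shows 6, 7, and 10 bytes respectively, which is why NASM prefers the mov r13d form when optimization is enabled.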