How is address arithmetic handled in NASM for x86 in hardware?


If I have an address in the rbx register and use an instruction like

mov rax, [rbx+1]

Is rbx+1 computed in hardware at runtime? If so, are some registers used, or is there a dedicated piece of hardware?

I figured doing the same instruction but with a symbol instead of a register like so

string: db "I'm lost", 0

mov rax, [string+1]

would allow the calculation to be done at assembly time, since string already has a location in memory reserved, whereas the value of rbx is variable and unknown until runtime.


There are 2 best solutions below

0
On BEST ANSWER

All CPUs, even the original 8086, have some temporary storage separate from the architectural registers. 8086 used the main ALU for address math, so add ax, [bx + si + 1] would need to use that temporary storage; the address math doesn't affect the software-visible value in the BX or RBX register.

Old CPUs like 8086 handled even simple x86 instructions by running a sequence of internal microcode instructions. Modern CPUs decode an instruction like mov eax, [rbx+1] to a single micro-op (uop) for a load execution unit. (They do still have buffering between pipeline stages, and even some temporary registers for instructions like xchg eax, ecx to use; on Intel that's a 3-uop instruction that's something like mov internal_tmp, ecx / mov ecx, eax / mov eax, internal_tmp.)

Modern CPUs have dedicated address-generation units (AGUs) as part of load and store-address execution units, separate from their ALU execution units. See https://realworldtech.com/sandy-bridge/10 for a block diagram.
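To make the "address math doesn't change RBX" point concrete, here is a minimal sketch, assuming Linux x86-64 and a NASM + ld toolchain. The file name demo.asm and the use of rdx as a scratch copy are illustrative choices only.

```nasm
; Assumed build: nasm -f elf64 demo.asm && ld -o demo demo.o
default rel                 ; make [string] a RIP-relative reference

section .rodata
string: db "I'm lost", 0

section .text
global _start
_start:
    lea   rbx, [string]     ; rbx = runtime address of string
    mov   rdx, rbx          ; keep a copy of rbx for comparison
    mov   rax, [rbx+1]      ; hardware adds 1 to rbx internally, loads 8 bytes
    cmp   rbx, rdx          ; rbx itself was not modified by the load
    setne dil               ; dil = 1 only if rbx had changed
    movzx edi, dil          ; exit status: 0 means rbx is unchanged
    mov   eax, 60           ; Linux x86-64 exit system call number
    syscall
```

The sum rbx+1 exists only inside the load unit for the duration of that one instruction; no architectural register ever holds it.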


Note that LEA is a separate animal; its result is written to a register, not used for a load or store, so modern CPUs run it on an ALU execution unit as just a shift-and-add instruction. (Some of the ALUs that support it do not support the shift part, or support only one add, depending on the CPU model.) See Using LEA on values that aren't addresses / pointers? - although this is true regardless of whether the integer value happens to be a valid pointer or not.
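As a small sketch of LEA used purely as shift-and-add arithmetic: the function name times5_plus7 is hypothetical, and the System V x86-64 calling convention (first integer argument in rdi, result in rax) is assumed.

```nasm
bits 64
section .text
global times5_plus7
times5_plus7:
    ; rdi holds an ordinary integer n, not a pointer.
    lea rax, [rdi + rdi*4 + 7]   ; rax = n + n*4 + 7 = 5*n + 7, no memory access
    ret
```

Nothing is loaded or stored here; the bracket syntax only selects the addressing-mode encoding whose arithmetic LEA writes to the destination register.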

3
On

The operand [rbx+1] represents a valid addressing mode. If it weren't, your assembler would issue an error, because it would not know what to do with it: it involves a register, whose value will only be known at runtime. Addressing modes are resolved at runtime by the hardware, and the address computation itself never modifies any register, including the ones that appear inside the brackets.
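For reference, a short assemble-only fragment showing the shapes a 64-bit addressing mode can take (an optional base register, an optional index register scaled by 1, 2, 4 or 8, and an optional constant displacement). The register choices are arbitrary, and the commented-out lines are ones NASM is expected to reject.

```nasm
bits 64
section .text
    mov rax, [rbx]                ; base
    mov rax, [rbx + 1]            ; base + displacement
    mov rax, [rbx + rcx*8]        ; base + index*scale
    mov rax, [rbx + rcx*8 + 16]   ; base + index*scale + displacement
    mov rax, [rcx*4 + 0x100]      ; index*scale + displacement, no base

    ; Not encodable, so the assembler reports an invalid effective address:
    ; mov rax, [rbx + rcx + rdx]  ; three registers
    ; mov rax, [rbx + rcx*16]     ; scale must be 1, 2, 4 or 8
    ; mov rax, [rbx * rcx]        ; product of two registers
```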

The operand [string+1] is (as far as I know) not an addressing mode in its own right. But that is okay, because your assembler knows what to do with it: the address of string is a constant known at assembly time (or, in a relocatable object file, filled in later by the linker), so the assembler adds one to it and replaces [string+1] with [actual_address], which is a valid addressing mode. NASM in fact allows an entire expression in place of +1; as long as the expression can be evaluated at assembly time, all is good.

The same would hold true with [string+bx+1]; strictly speaking, this is not a valid addressing mode, but the assembler knows how to combine all the offsets that are known at assembly time into just one offset, which results in [computed_offset+bx], which is a valid addressing mode.
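The folding described in this answer can be sketched as follows. This is an assemble-only fragment that uses rbx rather than bx to match the 64-bit question. It assumes NASM's default absolute addressing, which suits a non-PIE static executable; position-independent code would normally use default rel for the register-free forms.

```nasm
bits 64
section .data
string: db "I'm lost", 0

section .text
    mov rax, [string + 1]         ; one displacement: address of string, plus 1
    mov rax, [string + 2*4 - 7]   ; 2*4-7 folds to 1, so same encoding as above
    mov rax, [string + rbx + 1]   ; becomes [rbx + disp32], disp32 = string + 1
```

In every case the instruction that reaches the CPU contains a single constant displacement; only the addition of rbx in the last line is left for the hardware to do at runtime.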