The 8086 documentation sites seem a bit vague when the MODR/M byte is mentioned and it's really difficult to comprehend what it is and does.
What are all the bits used for in the MODR/M byte and what are the possible options?
Some documentation I've found: https://www.scs.stanford.edu/05au-cs240c/lab/i386/s17_02.htm
The ModR/M byte contains three fields of information:
The mod field, which occupies the two most significant bits of the byte, combines with the r/m field to form 32 possible values: eight registers and 24 indexing modes
The reg field, which occupies the next three bits following the mod field, specifies either a register number or three more bits of opcode information. The meaning of the reg field is determined by the first (opcode) byte of the instruction.
The r/m field, which occupies the three least significant bits of the byte, can specify a register as the location of an operand, or can form part of the addressing-mode encoding in combination with the mod field as described above
What is an indexing mode? What is a register number? How is a register represented? etc.
Intel's own PDF manuals document this in detail; see vol.2 of the SDM, specifically the intro chapters before the entries for each instruction.
There are also detailed descriptions on various sites like https://wiki.osdev.org/X86-64_Instruction_Encoding#ModR.2FM_and_SIB_bytes (which covers 16-bit ModRM, so it's not just talking about x86-64 long mode.) Modern x86 uses the same instruction encoding (in 16-bit real mode) as 8086; that backwards compat is the whole point of x86, and why it's so nasty.
And of course you can find PDF copies of the actual 8086 manual itself, in case it's more helpful to leave out material that's only relevant to other modes.
The 8086 primer from page 23 onwards covers instruction encoding of operands. It's written as a book, not just a technical manual. It's available for free on the web site of Stephen Morse (https://stevemorse.org/8086/), who designed the 8086 when he was at Intel.
But maybe it would help to describe the basic overview of the purpose of ModRM, so you know what to look for in those docs.
ModR/M purpose and basics
Most (but not all) x86 instructions have one ModRM byte. It can code for 2 operands: either both registers, or one register and one memory operand. e.g. add cx, ax or add cx, [bx+si].
The opcode itself determines which of the r/m and r operands are the source and/or destination, or whether the /r field acts as extra opcode bits. (That's the case for shifts, for example, which is why they can't copy-and-shift or use a count register other than CL.) add [bx+si], cx has the same ModRM byte as add cx, [bx+si] but a different opcode.
The register-only operand is coded by the 3-bit /r field. 3 bits can code for any of x86's 8 general-purpose registers. This is a "register number": like in any normal ISA with 2^n registers, groups of n bits in each instruction code for register operands.
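As a rough illustration of the three fields and of register numbers, here is a minimal Python sketch. The names split_modrm, REG16 and REG8 are made up for this example; the register numbering is the standard x86 one (0 = AX/AL through 7 = DI/BH).

```python
# Register names indexed by 3-bit register number.
# Which table applies depends on the operand size chosen by the opcode.
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]


def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    mod = (byte >> 6) & 0b11    # top 2 bits
    reg = (byte >> 3) & 0b111   # middle 3 bits
    rm = byte & 0b111           # low 3 bits
    return mod, reg, rm


# 0xC1 = 11 000 001: mod=3 (r/m is a register), reg=0 (ax), rm=1 (cx)
mod, reg, rm = split_modrm(0xC1)
print(mod, REG16[reg], REG16[rm])  # 3 ax cx
```

With opcode 01 (ADD r/m16, r16) in front of it, that byte is one encoding of add cx, ax.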
The r/m operand can also be a register, but the 2-bit "mode" field determines whether the 3-bit r/m field is a register number (mod=0b11) or a memory addressing mode. (A memory operand can also have an 8 or 16-bit displacement, so coding for disp0/8/16 uses up the other 3 encodings of the mode field.)
https://wiki.osdev.org/X86-64_Instruction_Encoding#ModR.2FM_and_SIB_bytes shows the fields and interpretation for 16-bit address-size, including register numbers.
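To make the mod field's role concrete, here is a small sketch that works out how many displacement bytes follow a 16-bit ModRM byte. The function name disp_size is hypothetical. It also handles one special case not mentioned above: with mod=00, r/m=110 means a bare 16-bit address with no base register, rather than [bp].

```python
def disp_size(modrm):
    """Number of displacement bytes after a ModRM byte (16-bit address-size)."""
    mod = (modrm >> 6) & 0b11
    rm = modrm & 0b111
    if mod == 0b11:
        return 0                        # r/m is a register, no memory operand
    if mod == 0b00:
        return 2 if rm == 0b110 else 0  # [disp16] special case, else no disp
    if mod == 0b01:
        return 1                        # disp8, sign-extended to 16 bits
    return 2                            # mod == 0b10: disp16


for b in (0x08, 0x48, 0x88, 0xC1, 0x06):
    print(f"{b:02x}: {disp_size(b)} displacement byte(s)")
```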
So there are only 3 bits to specify a register or combination of registers for the memory address. 386 added an escape code for a SIB byte, allowing a full selection of addressing modes like [eax + ecx*4], but on 8086 (and with 16-bit address-size on any CPU) an addressing mode must be some subset of [BX|BP] + [SI|DI] + disp0/8/16.
See Differences between general purpose registers in 8086: [bx] works, [cx] doesn't? / Why don't x86 16-bit addressing modes have a scale factor, while the 32-bit version has it?
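The 8 possible r/m values for a memory operand can be written out as a lookup table. This is a sketch with made-up names (RM16, rm_operand), following the standard 16-bit ModRM table; the caller is assumed to pass the displacement already sign-extended.

```python
# r/m field -> base/index registers, for mod != 0b11 (16-bit address-size).
RM16 = ["bx+si", "bx+di", "bp+si", "bp+di", "si", "di", "bp", "bx"]
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]


def rm_operand(modrm, disp=0):
    """Format the r/m operand of a ModRM byte as assembly-like text."""
    mod = (modrm >> 6) & 0b11
    rm = modrm & 0b111
    if mod == 0b11:
        return REG16[rm]                   # plain register operand
    if mod == 0b00 and rm == 0b110:
        return f"[0x{disp:04x}]"           # absolute address, no registers
    if mod == 0b00:
        return f"[{RM16[rm]}]"             # no displacement
    return f"[{RM16[rm]}{disp:+d}]"        # disp8 or disp16, shown signed


print(rm_operand(0x08))        # [bx+si]
print(rm_operand(0x46, -2))    # [bp-2]
print(rm_operand(0xC1))        # cx
```

Note that no table entry involves CX, AX, DX or SP, which is why [cx] isn't encodable as a 16-bit addressing mode.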
Examples can be produced by assembling foo.asm and then running ndisasm -b16 foo, or by asking NASM itself to make a listing with nasm -l/dev/stdout foo.asm, then editing to simplify the output fields. To create more examples, use an assembler to create machine code yourself.
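If you don't have an assembler handy, you can also build the bytes by hand. The sketch below assembles the three add instructions discussed earlier, using opcode 01 (ADD r/m16, r16) and opcode 03 (ADD r16, r/m16). The helper name modrm and the constants are made up for this example, and the hex output is computed from the field layout, not copied from an assembler listing.

```python
def modrm(mod, reg, rm):
    """Pack the three fields into one ModRM byte."""
    return (mod << 6) | (reg << 3) | rm


AX, CX = 0, 1   # register numbers
BX_SI = 0       # r/m value meaning [bx+si] when mod != 0b11

# add cx, ax       : opcode 01, r/m = cx (register), reg = ax
add_cx_ax = bytes([0x01, modrm(0b11, AX, CX)])
# add cx, [bx+si]  : opcode 03, reg = cx (dest), r/m = [bx+si]
add_cx_mem = bytes([0x03, modrm(0b00, CX, BX_SI)])
# add [bx+si], cx  : opcode 01, same ModRM byte as above
add_mem_cx = bytes([0x01, modrm(0b00, CX, BX_SI)])

print(add_cx_ax.hex(), add_cx_mem.hex(), add_mem_cx.hex())  # 01c1 0308 0108
```

The register-to-register form also has a second valid encoding (03 C8, with the operands swapped between the reg and r/m fields); which one an assembler emits is up to the assembler.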
See also: the /r field of ModRM is used as extra opcode bits in some instructions, like FF /2, CALL r/m16 (Call near, absolute indirect).
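For opcode FF, the middle 3 bits of ModRM select the instruction instead of naming a register. A minimal sketch, with hypothetical names GROUP_FF and ff_mnemonic, using the assignments listed for opcode FF in Intel's manuals (FF /7 is not defined):

```python
# Opcode FF: the reg field of ModRM is an opcode extension (/0 .. /6).
GROUP_FF = {
    0: "inc",       # FF /0  INC r/m16
    1: "dec",       # FF /1  DEC r/m16
    2: "call",      # FF /2  CALL r/m16 (near, absolute indirect)
    3: "call far",  # FF /3  CALL m16:16
    4: "jmp",       # FF /4  JMP r/m16
    5: "jmp far",   # FF /5  JMP m16:16
    6: "push",      # FF /6  PUSH r/m16
}


def ff_mnemonic(modrm):
    """Pick the instruction for opcode FF from the ModRM reg field."""
    return GROUP_FF.get((modrm >> 3) & 0b111, "(undefined)")


print(ff_mnemonic(0xD0))  # call   (FF D0 = call ax)
print(ff_mnemonic(0x17))  # call   (FF 17 = call [bx])
print(ff_mnemonic(0xE0))  # jmp    (FF E0 = jmp ax)
```

Since the reg field is taken by the opcode extension, these instructions have only the single r/m operand.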