Does the fetch phase in the x86 CPU increment eip(PC) to the next instruction?


During the fetch phase of the instruction cycle in an x86 CPU, does the EIP (PC) register get incremented to point at the next instruction at the end of the fetch phase, or only after the execution phase?

I know that MIPS CPUs increment the PC by the end of the fetch phase, but do x86 CPUs do the same?

I assume they do, because after looking at the compiled code of a program, I noticed that the displacement in the encoding of a relative call instruction is relative to the next instruction, not to the current instruction.
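That observation can be checked by hand. A minimal sketch in Python (the `E8` opcode and 5-byte length of a near relative call are real x86 encoding facts; the helper name is my own):

```python
import struct

def call_target(insn_addr, insn_bytes):
    """Target of a near relative CALL (opcode E8 + rel32).

    The signed rel32 displacement is relative to the address of the
    *next* instruction, i.e. insn_addr + this instruction's length.
    """
    assert insn_bytes[0] == 0xE8 and len(insn_bytes) == 5
    rel32 = struct.unpack('<i', insn_bytes[1:5])[0]  # little-endian signed
    next_addr = insn_addr + len(insn_bytes)          # address of next insn
    return (next_addr + rel32) & 0xFFFFFFFF

# E8 FB FE FF FF at 0x1000: rel32 = -0x105, so target = 0x1005 - 0x105
print(hex(call_target(0x1000, bytes([0xE8, 0xFB, 0xFE, 0xFF, 0xFF]))))  # -> 0xf00
```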

1 Answer

"fetch phase?" What kind of chip you got in there, a Dorito? e.g. a 386? Even 486 was pipelined and P5 Pentium was dual-issue superscalar. So 386 was the only non-pipelined x86 with an EIP, not just an IP (at least from Intel). Of course, all commercial MIPS CPUs were pipelined as well, that was literally the whole point of the RISC ISA design and name (Microprocessor without Interlocked Pipelines Stages).

x86 machine code is a byte-stream of variable-length x86 instructions, so you definitely can't know the end of an instruction until after decoding it.
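A toy decoder makes that concrete. In this sketch (the lengths for these three one-byte opcodes are real; the rest of the table and the function are invented), the start of instruction N+1 simply cannot be known until instruction N has been decoded:

```python
# nop, call rel32, ret: real lengths for these one-byte opcodes
LENGTHS = {0x90: 1, 0xE8: 5, 0xC3: 1}

def decode_stream(code, start=0):
    """Yield (start, end) byte offsets for each instruction in order."""
    pc = start
    while pc < len(code):
        length = LENGTHS.get(code[pc])
        if length is None:
            raise ValueError(f"opcode {code[pc]:#x} not in this toy table")
        yield pc, pc + length
        pc += length  # only now do we know where the next instruction begins

print(list(decode_stream(bytes([0x90, 0xE8, 0, 0, 0, 0, 0xC3]))))
# -> [(0, 1), (1, 6), (6, 7)]
```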

For pipelined fetch/decode, x86 CPUs have to fetch a stream of blocks and decode a window of bytes from a fetch buffer. So the fetch address increments in the fetch stage (not phase), in parallel with decode and later stages working on the results of previous fetches. (Modern x86 CPUs have up to 4-wide legacy decode, e.g. in Zen 2, or Skylake decoding 4 instructions per clock into up to 5 uops, up from 4 insns -> 4 uops in Sandybridge; perhaps even wider in Alder Lake. Usually they depend on the uop cache of already-decoded instructions to feed pipelines that are 5 or 6 uops wide; legacy decode is too hard to scale up.)


As part of decode, any x86 CPU takes note of the end address (of each instruction decoded in parallel), because that's what relative jumps/calls are relative to, and the same goes for x86-64 RIP-relative addressing modes. It's also the return address that call has to push.
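RIP-relative addressing uses exactly that end address. A small sketch (the 7-byte length of `lea rax, [rip + disp32]`, encoded `48 8D 05` + disp32, is real x86-64; the function is mine):

```python
def rip_relative_ea(insn_addr, insn_len, disp32):
    """Effective address of a RIP-relative operand (x86-64).

    disp32 is added to the address of the *next* instruction -- the
    same end address that relative jumps and calls are relative to.
    """
    return (insn_addr + insn_len + disp32) & 0xFFFFFFFFFFFFFFFF

# lea rax, [rip + 0x200] at 0x401000 is 7 bytes, so the operand's
# effective address is 0x401000 + 7 + 0x200 = 0x401207
print(hex(rip_relative_ea(0x401000, 7, 0x200)))  # -> 0x401207
```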

The start address is only needed for some kinds of exceptions, where the address of the faulting instruction is pushed. (So the OS can repair the situation, e.g. for a #PF page fault, and return to user-space to re-run the instruction and hopefully have it succeed.) But given speculative execution, a modern x86 also has to note the start address of every instruction and track it throughout the pipeline, along with the end. (Or a start plus length, or an end plus length, since the length needs at most 4 bits instead of 64.)
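The start+length idea can be sketched as a packing scheme (purely illustrative; no claim that any real CPU lays out its internal fields this way):

```python
def pack_insn(end_addr, length):
    # max x86 instruction length is 15 bytes, so length fits in 4 bits
    assert 1 <= length <= 15
    return (end_addr << 4) | length

def unpack_insn(packed):
    length = packed & 0xF
    end_addr = packed >> 4
    return end_addr - length, end_addr  # (start, end)

# a 5-byte instruction ending at 0x1005 started at 0x1000
print([hex(a) for a in unpack_insn(pack_insn(0x1005, 5))])  # -> ['0x1000', '0x1005']
```

Storing end+length instead of start+end saves 60 bits per tracked instruction, which is the kind of trade-off the parenthetical above is pointing at.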

Even original 8086 had pipelined prefetch separate from decode, but yes, decode would increment IP as it decoded, so it had the end (but not the start) of the instruction.

8086 did not remember the start address of the instruction at all during decode (which could iterate over an arbitrary number of prefixes; the 15-byte max instruction-length limit wasn't instituted until later). It didn't have many of the exceptions that modern x86 has (not even a #UD illegal-instruction trap: every byte sequence executed as something).

Even 8086's #DE divide exception pushed the final address, unlike later x86. (And handling interrupts during interruptible instructions like rep cs movsb only pushed the address of the last prefix, not the first, so it would resume as cs movsb! Later x86 CPUs fixed that design flaw along with changing the #DE semantics.)