I thought that there were none. But I see this claim:
"Instructions with two memory operands are extremely rare."
I can't find anything that explains which instructions, though rare, do exist. What are the exceptions?
An x86 instruction can have at most one ModR/M + SIB + disp0/8/32. So there are zero instructions with two explicit memory operands.
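
To make the "at most one ModR/M" point a bit more concrete, here's a rough Python sketch of how the ModR/M byte splits into fields. The helper name `decode_modrm` is made up for illustration, and it assumes 32/64-bit addressing with no REX/VEX extension bits. The thing to notice is that only the `rm` field can ever describe memory; the `reg` field is always a register number or an opcode extension.

```python
def decode_modrm(modrm):
    """Split a ModR/M byte into fields (hypothetical helper, 32/64-bit addressing)."""
    mod = (modrm >> 6) & 0b11    # bits 7..6: addressing form
    reg = (modrm >> 3) & 0b111   # bits 5..3: register number or opcode extension
    rm = modrm & 0b111           # bits 2..0: register, or base of the memory operand

    rm_is_memory = mod != 0b11   # mod == 11 means r/m names a register
    has_sib = rm_is_memory and rm == 0b100  # rm == 100 selects a SIB byte

    if mod == 0b01:
        disp_bytes = 1           # disp8
    elif mod == 0b10 or (mod == 0b00 and rm == 0b101):
        disp_bytes = 4           # disp32 (mod=00, rm=101 is RIP-relative in 64-bit mode)
    else:
        disp_bytes = 0           # simplification: ignores the SIB base=101 special case

    return {
        "mod": mod,
        "reg": reg,
        "rm": rm,
        "rm_is_memory": rm_is_memory,
        "has_sib": has_sib,
        "disp_bytes": disp_bytes,
    }


# ModR/M bytes that follow opcode 8B (mov r32, r/m32) for three operand forms.
examples = {
    "mov eax, [rbx]": 0x03,
    "mov eax, ebx": 0xC3,
    "mov eax, [rbx+rcx*4+0x10]": 0x44,
}

for asm, modrm in examples.items():
    print(asm, decode_modrm(modrm))
```

However the byte is filled in, there is one `rm` slot and therefore at most one explicit memory operand per instruction.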
The x86 memory-memory instructions all have at least one implicit memory operand whose location is baked into the opcode, like `push`, which accesses the stack, or the string instructions `movs` and `cmps`.

I'll use `[mem]` to indicate a ModR/M addressing mode, which can be `[rdi]`, `[RIP+whatever]`, `[ebx+eax*4+1234]`, or whatever you like.

- `push [mem]`: reads `[mem]`, writes implicit `[rsp]` (after updating `rsp`).
- `pop [mem]`: reads implicit `[rsp]`, writes `[mem]`.
- `call [mem]`: reads a new RIP from `[mem]`, pushes a return address on the stack.
- `movsb/w/d/q`: reads `DS:(E)SI`, writes `ES:(E)DI` (or in 64-bit mode RSI and RDI). Both are implicit; only the `DS` segment reg is overridable. Usable with `rep`.
- `cmpsb/w/d/q`: reads `DS:(E)SI` and `ES:(E)DI` (or in 64-bit mode RSI and RDI). Both are implicit; only the `DS` segment reg is overridable. Usable with `repe` / `repne`.
- MPX `bndstx mib, bnd`: "Store the bounds in bnd and the pointer value in the index register of mib to a bound table entry (BTE) with address translation using the base of mib." The Operation section shows a load and a store, but I don't know enough about MPX to grok it.
- `movdir64b r16/r32/r64, m512`: has its own feature bit, available in upcoming Tremont (successor to Goldmont Plus Atom). Moves 64 bytes as direct-store (WC) with 64-byte write atomicity from source memory address to destination memory address. Destination operand is (aligned atomic) `es:/r` from ModRM, source is (unaligned non-atomic) the `/m` from ModRM.

  Uses write-combining for the store, see the description. It's the first time any x86 CPU vendor has guaranteed atomicity wider than 8 bytes outside of `lock cmpxchg16b`. But unfortunately it's not actually great for multithreading because it forces NT-like cache eviction/bypass behaviour, so other cores will have to read it from DRAM instead of a shared outer cache.

AVX2 gather and AVX512 scatter instructions are debatable. They obviously do multiple loads / stores, but all the pointers come from one SIMD vector (and a scalar base).
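
If it helps to see what "implicit memory operand" means in practice, here's a toy Python model of `rep movsb` and `push qword [mem]` acting on a flat byte array. The function names `rep_movsb` and `push_qword` are invented for this sketch, segment bases are ignored, and it assumes 64-bit mode; it's only meant to show where the loads and stores land, not to be a faithful emulator.

```python
def rep_movsb(mem, rsi, rdi, rcx, df=0):
    """Toy model of `rep movsb`: copy rcx bytes from [rsi] to [rdi].

    Neither address appears in the instruction encoding; both come from
    fixed registers. Returns the final (rsi, rdi, rcx).
    """
    step = -1 if df else 1       # the direction flag selects decrement or increment
    while rcx != 0:
        mem[rdi] = mem[rsi]      # implicit load from [rsi], implicit store to [rdi]
        rsi += step
        rdi += step
        rcx -= 1
    return rsi, rdi, rcx


def push_qword(mem, rsp, addr):
    """Toy model of `push qword [addr]`: explicit load, implicit store to the stack.

    Returns the updated rsp.
    """
    value = mem[addr:addr + 8]   # explicit memory operand (from ModR/M)
    rsp -= 8                     # rsp is updated first...
    mem[rsp:rsp + 8] = value     # ...then the implicit [rsp] is written
    return rsp


# Copy 5 bytes from offset 0 to offset 8 of a 16-byte buffer.
buf = bytearray(b"hello" + bytes(11))
regs = rep_movsb(buf, rsi=0, rdi=8, rcx=5)

# Push the qword at offset 0 onto a stack that starts at the top of a 32-byte buffer.
stack_mem = bytearray(32)
stack_mem[0:8] = (0x1122334455667788).to_bytes(8, "little")
new_rsp = push_qword(stack_mem, rsp=32, addr=0)
```

In both cases at most one address could come from a ModR/M byte; the other locations are fixed by the opcode.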

I'm not counting instructions like `pusha`, `fldenv`, `xsaveopt`, `iret`, or `enter` with nesting level > 1 that do multiple stores or loads to a contiguous block.

I'm also not counting the `ins` / `outs` string instructions, because they copy memory to/from I/O space. I/O space isn't memory.

I didn't look at VMX or SGX instructions on http://felixcloutier.com/x86/index.html, just the main list. I don't think I missed any, but I certainly could have.