How can I efficiently fill the most significant bit of one register with the least significant bit of another register in x64 assembly? The intended use is efficient division of a 128-bit value by two (essentially a cross-register shift).
The value is in RDX:RAX (the result of a MUL operation).
Use the `shrd` instruction to shift a bit from the source into the destination.

Alternatively, use a `shr` and an `rcr` instruction, but note that `rcr` is multiple uops, so this is slower on most CPUs.
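Both approaches can be sketched as follows. This is a minimal sketch in NASM/Intel syntax, assuming the 128-bit value is unsigned and laid out as in the question (high half in `rdx`, low half in `rax`). In the first variant the order matters: `shrd` has to run before `rdx` is modified, because it reads the bit it shifts in from `rdx`.

```nasm
; Variant 1: shrd + shr
; Assumes rdx:rax holds an unsigned 128-bit value (e.g. the product left by mul).
shrd rax, rdx, 1    ; rax >>= 1, with bit 0 of rdx shifted into bit 63 of rax
shr  rdx, 1         ; shift the high half; bit 63 of rdx becomes 0

; Variant 2: shr + rcr (same result, use one variant or the other)
shr  rdx, 1         ; shift the high half; the old bit 0 of rdx lands in CF
rcr  rax, 1         ; rotate right through carry: CF enters bit 63 of rax
```

In the second variant nothing that modifies the carry flag may sit between the two instructions, since `rcr` takes the bit it rotates in from CF.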