In the case of a 64-bit x86 register, is it possible to hold more than one value at a time in the same register, if the size of a value is small enough that multiple instructions could fit into a register? For example, fitting two 32-bit ints into one register. Would this be a bad thing to do if it is possible? I've been reading up on registers and I'm quite new to the concept.
Can a register hold multiple values at a time?
Asked by Luke Davis
paxdiablo
Registers don't tend to hold instructions; they hold data to be worked on by instructions.
However, if you wanted to store instructions as data, x86 instructions can be up to fifteen bytes (120 bits) long. So, no, the longest ones won't fit into a single 64-bit register.
In terms of holding multiple data values in a single register, that is certainly possible. This is even supported by the hardware, with even the earliest x86 chips having ah and al which together formed the ax register.
Even without that, you can certainly insert/extract "sub-registers" into/from registers, by using the bitwise operations (like and, or, not and xor), and the bit shift operations (like shl, shr, rol, and ror).
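As a rough illustration of the shift-and-mask approach, here is a minimal C sketch that packs two 32-bit values into one 64-bit word and reads or replaces either half. The function names (pack2x32, get_lane, set_lane) are hypothetical, chosen only for this example; a compiler would typically lower these to shl/shr/and/or on a 64-bit register.

```c
#include <assert.h>
#include <stdint.h>

/* Pack two 32-bit values into one 64-bit word:
   hi goes in bits 63..32, lo in bits 31..0. */
uint64_t pack2x32(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Extract lane 0 (low half) or lane 1 (high half). */
uint32_t get_lane(uint64_t packed, unsigned lane)
{
    assert(lane < 2);
    return (uint32_t)(packed >> (32 * lane));
}

/* Replace one lane and leave the other untouched. */
uint64_t set_lane(uint64_t packed, unsigned lane, uint32_t value)
{
    assert(lane < 2);
    unsigned shift = 32 * lane;
    uint64_t mask = (uint64_t)0xFFFFFFFFu << shift;
    return (packed & ~mask) | ((uint64_t)value << shift);
}
```

The cost is the extra shift/mask work on every access, which is usually why this is only done when register or memory space is genuinely tight.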
Registers don't hold instructions, but I'll assume you meant fitting multiple values into one register, so that you can add them both with one instruction.
Yes, this is called SIMD (Single Instruction, Multiple Data). On x86-64, SSE2 (Streaming SIMD Extensions 2) is guaranteed to be available, so you have sixteen different 16-byte registers (xmm0..15). And you have instructions that can do packed FP add/sub/mul/div/sqrt/cmp of 4x 32-bit float or 2x 64-bit double, and packed integer add/sub/cmp/shift/etc. for byte, word, dword, and qword operand-sizes.
(With some gaps; SSE2 is not very orthogonal, e.g. narrowest shift is 16-bit, packed min/max only available for certain sizes. Some of these gaps are filled in by SSE4.1).
And bitwise-boolean stuff where element width is irrelevant (until AVX512 with mask registers...)
See https://www.felixcloutier.com/x86/.
p... instructions like paddw are packed-integer. ...ps and ...pd are floating point packed-single or packed-double.

Compilers frequently use SSE/SSE2 instructions like movdqa to zero or copy memory in 16-byte chunks, as well as to "vectorize" (use SIMD computations) for loops over arrays. And GCC 7 or 8 and later know how to coalesce loads/stores of adjacent struct members or array elements into a scalar load or store using RAX, for example.

e.g. a loop that sums an array of int gets auto-vectorized by GCC9.3 -O3 for x86-64, as seen on the Godbolt compiler explorer.
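A minimal sketch of the kind of array-sum loop meant here, assuming a plain int array; the function name sum_array and its signature are hypothetical. With GCC at -O3 for x86-64, a loop like this is typically compiled to packed 32-bit adds (paddd) on xmm registers, followed by a horizontal sum of the vector accumulator.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n ints. Integer addition is associative, so the compiler is free
   to keep several partial sums in the lanes of a vector register and
   combine them at the end. */
int sum_array(const int *arr, size_t n)
{
    assert(arr != NULL || n == 0);
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += arr[i];
    }
    return sum;
}
```

Pasting this into the Godbolt compiler explorer with -O3 is an easy way to see which instructions a particular compiler version picks.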
Vectorization is sort of like parallelization, and for a reduction like this (summing an array down to a scalar) it requires associative operations. e.g. an FP version would only vectorize with -ffast-math or with OpenMP.

In a general purpose register like RAX, which doesn't have instructions to do SIMD addition without carry between byte boundaries (like paddb xmm0, xmm1 would), it's called SWAR (SIMD within a register).

This technique was more useful in the past, on ISAs without a proper SIMD instruction set like Alpha or MIPS64. But it's still possible, and SWAR techniques can be useful as part of something like a popcount without the popcnt instruction, e.g. masking out every other bit and shifting so you're effectively doing 32 separate additions (that can't overflow into each other) into 2-bit accumulators.

The popcnt bithack shown in How to count the number of set bits in a 32-bit integer? does that, widening to 4-bit counters then 8-bit, then using a multiply to shift-and-add by 4 different shifts and produce the sum in the high byte.
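The two SWAR ideas above can be sketched in portable C. This is an illustration under assumptions, not code from the linked question: the names swar_add_u8, popcount32, and swar_self_check are hypothetical. The first function emulates what paddb does for 8 bytes held in a 64-bit integer, by keeping the carry out of each byte from reaching its neighbour. The second is the usual form of the popcount bithack, here on a 32-bit value, so the first step produces 16 two-bit counters.

```c
#include <assert.h>
#include <stdint.h>

/* Add 8 packed bytes lane by lane, with wraparound inside each byte and
   no carry into the neighbouring byte. */
uint64_t swar_add_u8(uint64_t x, uint64_t y)
{
    const uint64_t H = 0x8080808080808080ULL;   /* top bit of every byte */
    uint64_t low7 = (x & ~H) + (y & ~H);        /* 7-bit adds: carry stays in the byte */
    return low7 ^ ((x ^ y) & H);                /* add the top bits without carry-out */
}

/* Count set bits without the popcnt instruction. */
unsigned popcount32(uint32_t v)
{
    v = v - ((v >> 1) & 0x55555555u);                 /* 16 two-bit counters */
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); /* 8 four-bit counters */
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 /* 4 eight-bit counters */
    return (v * 0x01010101u) >> 24;                   /* multiply sums them into the high byte */
}

/* Small usage check for both helpers. */
void swar_self_check(void)
{
    /* low byte: 0xFF + 0x01 wraps to 0x00; next byte: 0x01 + 0x01 = 0x02 */
    assert(swar_add_u8(0x01FFULL, 0x0101ULL) == 0x0200ULL);
    assert(popcount32(0xF0F0u) == 8);
}
```

On hardware with SSE2 or popcnt, the dedicated instructions are normally the better choice; the sketch only shows why the masking keeps the lanes independent.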