Do AArch64 SIMD instructions zero/sign extend results?

101 Views Asked by John Källén At 20 December 2023 at 12:50

I'm maintaining the Reko decompiler and working on bugfixes in its support for AArch64. I've been asked to fix an issue in an AArch64 binary that contains the following instruction:

0EA0B9BF abs v31.2s,v13.2s

I've compared the output above (which comes from Reko's AArch64 disassembler) with objdump and it matches. I've consulted the ARM documentation for the abs instruction to understand what this instruction is doing.

Note that the .2s suffixes imply that the instruction is operating on two 32-bit signed integers, but v31 and v13 are 128-bit. My initial guess was that the instruction leaves the upper 64 bits of v31 untouched, but I'm not certain my interpretation is correct. Consulting scripture reveals the following pseudocode for abs:

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

In the pseudocode, the result variable is not initialized with the original value of V[d], yet the final pseudo-statement appears to write back the whole 128 bits of the register.

So: is result actually zero-initialized, meaning that the upper 64 bits are cleared after execution of this instruction? And will this apply to all SIMD instructions whose outputs do not "cover" the full 128 bits of the destination SIMD register?

Unfortunately I don't have the appropriate silicon to test this myself, and available AArch64 emulators are crashing when I try using them with SIMD instructions.

fuz On 20 December 2023 at 13:19 BEST ANSWER

As per ARM Architecture Reference Manual (Armv8, for A-profile architecture), section C1.2.5:

SIMD and floating-point scalar register names

SIMD and floating-point instructions that operate on scalar data only access the lower bits of a SIMD and floating-point register. The unused high bits are ignored on a read and cleared to 0 on a write.

(...)

SIMD vector register names

If a register holds multiple data elements on which arithmetic is performed in a parallel, SIMD, manner, then a qualifier describes the vector shape. The vector shape is the element size and the number of elements or lanes. If the element size in bits multiplied by the number of lanes does not equal 128, then the upper 64 bits of the register are ignored on a read and cleared to zero on a write.

This confirms that a write to a 64-bit ASIMD vector zeroes the upper 64 bits of the 128-bit register. Note that the behaviour is different in AArch32 state, where 64-bit and 128-bit vectors have different numbering schemes and merely overlap. There, no other half exists to be zeroed out, as 64-bit operations target 64-bit vector registers.
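For a decompiler or emulator, this rule could be modelled roughly as follows. This is only a minimal sketch in Python, not Reko's or ARM's implementation: registers are held as 128-bit integers, and the helper names (write_vreg, abs_vector, elem_get, to_signed) are hypothetical. It mimics the abs pseudocode from the question and shows that a .2s write leaves the upper 64 bits of the destination as zero, regardless of what they held before.

```python
MASK128 = (1 << 128) - 1

def elem_get(value, index, esize):
    """Extract the unsigned esize-bit element at the given lane index."""
    return (value >> (index * esize)) & ((1 << esize) - 1)

def to_signed(value, esize):
    """Reinterpret an unsigned esize-bit value as two's complement (SInt)."""
    return value - (1 << esize) if value & (1 << (esize - 1)) else value

def write_vreg(regs, d, result, datasize):
    """Model V[d] = result: bits above datasize are cleared, not preserved."""
    regs[d] = result & ((1 << datasize) - 1)

def abs_vector(regs, d, n, esize, elements):
    """Model 'abs Vd.<T>, Vn.<T>' for a vector of `elements` lanes of `esize` bits."""
    datasize = esize * elements
    operand = regs[n] & ((1 << datasize) - 1)  # upper bits ignored on read
    result = 0
    for e in range(elements):
        element = abs(to_signed(elem_get(operand, e, esize), esize))
        # Truncate to esize bits, as element<esize-1:0> does in the pseudocode.
        result |= (element & ((1 << esize) - 1)) << (e * esize)
    write_vreg(regs, d, result, datasize)

# Hypothetical register contents, chosen only for illustration.
regs = [0] * 32
regs[31] = MASK128  # destination starts as all ones
regs[13] = (0xDEADBEEFCAFEBABE << 64) | (0xFFFFFFFF << 32) | 0x00000005

abs_vector(regs, 31, 13, esize=32, elements=2)  # abs v31.2s, v13.2s
print(f"{regs[31]:032x}")  # 00000000000000000000000100000005
```

The lanes hold -1 and 5, so the low 64 bits become 1 and 5, and the previously all-ones upper half of v31 reads back as zero. The same write_vreg helper would cover the scalar case by passing the scalar width as datasize.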
