Altivec vec_all_gt equivalent on arm neon


I am porting an application from Altivec to Neon.

I see a lot of intrinsics in Altivec which return scalar values.

Do we have any such intrinsics on ARM?

For instance, vec_all_gt.


There is 1 best solution below


There are no intrinsics that give scalar comparison results. This is because the common pattern for SIMD comparisons is to use branchless lane-masking and conditional selects to multiplex results, not branch-based control flow.
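To illustrate the branchless pattern described above, here is a minimal sketch of a per-lane select. The helper name `select_ge4` is hypothetical, and the scalar fallback is only there so the snippet also builds on non-ARM hosts; it is not part of the NEON idiom itself.

```cpp
#include <cstddef>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical helper: out[i] = (a[i] >= b[i]) ? a[i] : b[i] for 4 floats.
void select_ge4(const float* a, const float* b, float* out) {
#if defined(__ARM_NEON)
    float32x4_t va = vld1q_f32(a);
    float32x4_t vb = vld1q_f32(b);
    // Each lane of the mask is all ones where a >= b, all zeros otherwise.
    uint32x4_t mask = vcgeq_f32(va, vb);
    // Bitwise select: take va where the mask is set, vb elsewhere. No branch.
    vst1q_f32(out, vbslq_f32(mask, va, vb));
#else
    // Scalar fallback with the same semantics (assumption: non-NEON host).
    for (std::size_t i = 0; i < 4; ++i) {
        out[i] = (a[i] >= b[i]) ? a[i] : b[i];
    }
#endif
}
```

The comparison result never leaves the vector unit here, which is why a scalar "all lanes true" intrinsic is rarely needed in this style of code.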

You can build scalar comparison results yourself if you need them, though:

// Requires #include <arm_neon.h>; a and b are float32x4_t.
// Do a comparison of e.g. two vectors of floats
// (use vcgtq_f32 for a strict greater-than, as in vec_all_gt)
uint32x4_t compare = vcgeq_f32(a, b);

// Shift all compares down to a single bit in the LSB of each lane, other bits zero
uint32x4_t tmp = vshrq_n_u32(compare, 31);

// Shift compare results up so lane 0 = bit 0, lane 1 = bit 1, etc.
static const int32_t shifta[4] { 0, 1, 2, 3 };
static const int32x4_t shift = vld1q_s32(shifta);
tmp = vshlq_u32(tmp, shift);

// Horizontal add across the vector to merge the result into a scalar
// (vaddvq_u32 is only available on AArch64)
return vaddvq_u32(tmp);

... at which point you can define any() (mask is non-zero) and all() (mask is 0xF) comparisons if you need branchy logic.
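As a rough sketch of how those any() and all() helpers might look for a strict greater-than (the closest match to vec_all_gt), assuming AArch64 for `vaddvq_u32`: the names `mask_gt4`, `any_gt4` and `all_gt4` are hypothetical, and the scalar fallback exists only so the snippet builds on other hosts.

```cpp
#include <cstdint>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

// Hypothetical helper: bit i of the result is set when a[i] > b[i].
inline std::uint32_t mask_gt4(const float* a, const float* b) {
#if defined(__aarch64__)
    static const std::int32_t shifta[4] = {0, 1, 2, 3};
    // All-ones lanes where a > b.
    uint32x4_t cmp = vcgtq_f32(vld1q_f32(a), vld1q_f32(b));
    // Reduce each lane to 0 or 1, then move lane i's bit to position i.
    uint32x4_t bits = vshlq_u32(vshrq_n_u32(cmp, 31), vld1q_s32(shifta));
    // Horizontal add merges the four single-bit lanes into one scalar mask.
    return vaddvq_u32(bits);
#else
    // Scalar fallback with the same semantics (assumption: non-AArch64 host).
    std::uint32_t m = 0;
    for (int i = 0; i < 4; ++i) {
        m |= (a[i] > b[i] ? 1u : 0u) << i;
    }
    return m;
#endif
}

// True if at least one lane satisfies a > b.
inline bool any_gt4(const float* a, const float* b) {
    return mask_gt4(a, b) != 0;
}

// True only if all four lanes satisfy a > b (rough analogue of vec_all_gt).
inline bool all_gt4(const float* a, const float* b) {
    return mask_gt4(a, b) == 0xF;
}
```

A call site would then read something like `if (all_gt4(x, y)) { ... }`, at the cost of moving the result from the vector unit to a general-purpose register before branching.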