About vsubq_u16(uint16x8_t, uint16x8_t)

About

vsubq_u16(uint16x8_t a, uint16x8_t b)

The return value is also uint16x8_t. So if a lane of a is smaller than the corresponding lane of b, the result wraps around to a very large unsigned value instead of a negative one, which is not what I need.

If I have a requirement like this, where a and b are uint16_t:

uint16_t c = abs((int)a - (int)b);

How can I express this with NEON intrinsics? Thanks.

There are 2 best solutions below

0
On BEST ANSWER

It looks like you want the absolute difference between your inputs. If so, the following intrinsic does exactly this:

uint16x8_t vabdq_u16 (uint16x8_t, uint16x8_t) 
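
As a rough sketch of how this might be used over whole arrays: the helper name abs_diff_u16 and its signature are made up for illustration, and the scalar loop is only there so the same code also builds on targets without NEON. On a NEON target the main loop handles 8 lanes at a time with vabdq_u16.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = |a[i] - b[i]| for n uint16_t elements. */
void abs_diff_u16(const uint16_t *a, const uint16_t *b, uint16_t *out, int n)
{
    int i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* 8 lanes per iteration using the absolute-difference intrinsic. */
    for (; i + 8 <= n; i += 8) {
        uint16x8_t va = vld1q_u16(a + i);
        uint16x8_t vb = vld1q_u16(b + i);
        vst1q_u16(out + i, vabdq_u16(va, vb));
    }
#endif
    /* Scalar tail (and the whole array when NEON is not available). */
    for (; i < n; ++i) {
        out[i] = (uint16_t)(a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]);
    }
}
```

The result always fits in uint16_t, because the absolute difference of two values in [0, 65535] is itself in [0, 65535].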
2
On

I have seen a series of questions from you in the NEON section, and I suspect the instructions seem confusing because you are overthinking them. So I will give a more general answer to the question.

Some basics to be clear about before going deep into NEON intrinsics:

  1. Binary representation of negative and positive numbers.
  2. Range of unsigned char, signed char, unsigned int, signed int, etc.
    • Range of unsigned char -> 0 to 255
    • Range of signed char -> -128 to 127

These ranges always apply when using the instructions. When writing intrinsic code, we must first know the exact range of the results we may get.
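
To make the range point concrete, here is a small scalar sketch of what happens in a single lane of an unsigned subtraction. The function names lane_sub_u8 and lane_sub_u16 are made up for illustration; they only model the per-lane wraparound and are not NEON intrinsics.

```c
#include <stdint.h>

/* Models one lane of an 8-bit unsigned subtraction: result is reduced modulo 256. */
uint8_t lane_sub_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a - b);
}

/* Models one lane of a 16-bit unsigned subtraction: result is reduced modulo 65536. */
uint16_t lane_sub_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a - b);
}
```

With a = 3 and b = 5, the 8-bit version gives 254 and the 16-bit version gives 65534, which is the "very large" value from the question.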

int8x8_t c = vsub_s8(int8x8_t a, int8x8_t b)

Every variable in this expression, including the result c, must lie in the range -128 to 127.

uint8x8_t c = vsub_u8(uint8x8_t a, uint8x8_t b)

Every variable must be in the range [0, 255], and we have to be sure that the result is within that range too. Hence this expression gives the mathematically expected result only if b is less than or equal to a; otherwise the result wraps around modulo 256. In other words, if a and b are in [0, 255], then a - b is in [-255, 255]. Clearly that cannot be represented in 8 bits; the result needs a 16-bit representation. vsubl_u8 stores the result in 16-bit lanes (it returns uint16x8_t), which can be reinterpreted as int16x8_t to read the signed difference.
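
A minimal sketch of the widening subtraction described above: the helper name sub_widen_u8x8 is made up for illustration, and the scalar branch exists only so the code also builds on targets without NEON. On a NEON target it uses vsubl_u8 and reinterprets the 16-bit lanes as signed.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = a[i] - b[i] as a signed 16-bit value, 8 elements. */
void sub_widen_u8x8(const uint8_t a[8], const uint8_t b[8], int16_t out[8])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* vsubl_u8 widens each lane to 16 bits before subtracting (modulo 2^16). */
    uint16x8_t d = vsubl_u8(vld1_u8(a), vld1_u8(b));
    /* Same bits viewed as signed: the true difference in [-255, 255]. */
    vst1q_s16(out, vreinterpretq_s16_u16(d));
#else
    /* Scalar equivalent for targets without NEON. */
    for (int i = 0; i < 8; ++i) {
        out[i] = (int16_t)((int)a[i] - (int)b[i]);
    }
#endif
}
```

Since [-255, 255] fits comfortably in int16_t, no lane can overflow here, unlike the 8-bit vsub_u8 case.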

Visualizing arithmetic operations on binary numbers will help you get comfortable with intrinsic code. Do your own homework with NEON intrinsics: create a test project that loads two arrays, and inspect the output in a debugger. The intrinsics are not that complex, so there is nothing better than some good homework. :)