Sum of bytes in an __m128 register


I am trying to find the sum of all bytes in an __m128 register using SSE and SSE2.

So far what I have is

__m128i sum = _mm_sad_epu8(bytes, _mm_setzero_si128()); // each 64-bit half gets the sum of its 8 bytes
return _mm_cvtsi128_si32(sum) + _mm_extract_epi16(sum, 4); // low-half sum + high-half sum

where bytes is the __m128 value containing the bytes I want to sum.

This works; however, I am getting a lot of overflows, which lead to wrong values. Is there a way to do this without getting overflows?
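For reference, here is a minimal, self-contained sketch of the approach above. It assumes that bytes is actually an __m128i (the type _mm_sad_epu8 takes) and that the result is kept in a 32-bit integer. The function names sum_bytes_u8 and sum_bytes_i8 are made up for illustration. With unsigned bytes the total is at most 16 * 255 = 4080, so the sum itself fits easily in 16 bits. Wrong values could plausibly come from narrowing the result to an 8-bit type, or from the bytes being meant as signed, which the second function handles by biasing. Both causes are guesses, since the post does not say where the overflow appears.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Sum of 16 unsigned bytes; the result is at most 16 * 255 = 4080. */
static inline uint32_t sum_bytes_u8(__m128i bytes)
{
    /* Each 64-bit half receives the sum of its 8 bytes in its low 16 bits. */
    __m128i sad = _mm_sad_epu8(bytes, _mm_setzero_si128());
    /* Add the low-half sum and the high-half sum. */
    return (uint32_t)_mm_cvtsi128_si32(sad)
         + (uint32_t)_mm_extract_epi16(sad, 4);
}

/* Sum of 16 signed bytes; the result lies in [-2048, 2032].
   Assumption: the input bytes are meant to be interpreted as int8_t. */
static inline int32_t sum_bytes_i8(__m128i bytes)
{
    /* Flipping the top bit maps each signed byte x to the unsigned value x + 128. */
    __m128i biased = _mm_xor_si128(bytes, _mm_set1_epi8(-128));
    /* Remove the 16 * 128 bias after summing. */
    return (int32_t)sum_bytes_u8(biased) - 16 * 128;
}
```

Either function should be stored in an int or wider; assigning the result to a char or uint8_t would truncate it.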

Alternatively, I was thinking about storing them to an array and summing them up that way; however, I haven't been able to find a store method for bytes.
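As a rough sketch of the array idea: SSE2 has no byte-specific store, but _mm_storeu_si128 writes all 128 bits to memory, and the destination can be a 16-element byte array. The function name sum_bytes_via_store is hypothetical, and the sketch assumes the bytes are unsigned and that bytes is an __m128i.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Store the register to a byte array, then sum the bytes with scalar code. */
static inline uint32_t sum_bytes_via_store(__m128i bytes)
{
    uint8_t buf[16];
    /* Unaligned 128-bit store; buf needs no particular alignment. */
    _mm_storeu_si128((__m128i *)buf, bytes);

    uint32_t total = 0;  /* wide accumulator, so the sum cannot wrap */
    for (int i = 0; i < 16; ++i)
        total += buf[i];
    return total;
}
```

This is likely slower than the _mm_sad_epu8 version, but it makes the accumulator width explicit, which can help when tracking down where a value gets truncated.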

Unfortunately, I can only use SSE and SSE2 intrinsics.

Thank you for your help!
