I am trying to port an SSE function that computes the sum of absolute differences between two arrays of 8-bit unsigned integers. It looks like this:
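uint64_t AbsDiffSum(const uint8_t * a, const uint8_t * b, size_t size)
{
    assert(size%16 == 0);
    __m128i _sum = _mm_setzero_si128();
    for(size_t i = 0; i < size; i += 16)
    {
        // Unaligned 16-byte loads; the casts keep the pointers const.
        const __m128i _a = _mm_loadu_si128((const __m128i*)(a + i));
        const __m128i _b = _mm_loadu_si128((const __m128i*)(b + i));
        // _mm_sad_epu8 yields two 64-bit partial sums, one per 8-byte half.
        _sum = _mm_add_epi64(_sum, _mm_sad_epu8(_a, _b));
    }
    // Fold the upper 64-bit lane onto the lower one and extract the total.
    return _mm_cvtsi128_si64(_mm_add_epi64(_sum, _mm_srli_si128(_sum, 8)));
}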
The main work is done by the intrinsic function _mm_sad_epu8().
Is there an analogue of it for AltiVec?
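For reference, here is a minimal scalar sketch of what _mm_sad_epu8() computes, which is the behaviour any port has to reproduce. The names Sad128, SadEpu8Scalar and AbsDiffSumScalar are hypothetical and used only for illustration; they are not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a _mm_sad_epu8 result: two 64-bit lanes,
// each holding the sum of |a[i] - b[i]| over one 8-byte half.
struct Sad128
{
    uint64_t lo; // sum over bytes 0..7
    uint64_t hi; // sum over bytes 8..15
};

// Scalar equivalent of _mm_sad_epu8 applied to 16 bytes of a and b.
inline Sad128 SadEpu8Scalar(const uint8_t * a, const uint8_t * b)
{
    Sad128 r = {0, 0};
    for (size_t i = 0; i < 16; ++i)
    {
        // Unsigned absolute difference, computed without wrap-around.
        const unsigned d = a[i] > b[i] ? unsigned(a[i] - b[i]) : unsigned(b[i] - a[i]);
        if (i < 8)
            r.lo += d;
        else
            r.hi += d;
    }
    return r;
}

// Scalar equivalent of the whole function: accumulate both lanes per
// 16-byte block, then add the two lanes together at the end.
inline uint64_t AbsDiffSumScalar(const uint8_t * a, const uint8_t * b, size_t size)
{
    assert(size % 16 == 0);
    uint64_t lo = 0, hi = 0;
    for (size_t i = 0; i < size; i += 16)
    {
        const Sad128 s = SadEpu8Scalar(a + i, b + i);
        lo += s.lo;
        hi += s.hi;
    }
    return lo + hi;
}
```

Such a scalar version can also serve as a check when comparing the results of a vectorized port.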
Unfortunately, AltiVec has no direct analogue of the _mm_sad_epu8 intrinsic, but it can be emulated: