How to check if a register contains a zero byte without SIMD instructions

Given a 64-bit general-purpose register (not an xmm register) on the x64 architecture, filled with one-byte unsigned values, how can I check all of its bytes for a zero value simultaneously, without using SSE instructions?

Is there a way to do this in parallel, without iterating over the register in 8-bit steps?

I tried comparing it against certain 64-bit masks, but that did not work.

There is 1 best solution below

Technically, you could do something like this:

#include <cstdint>

// True if any of the 8 bytes in the integer is 0
bool anyZeroByte( uint64_t v )
{
    // Fold each byte so that its lowest bit holds the bitwise OR of all 8 bits
    v |= ( v >> 4 ) & 0x0F0F0F0F0F0F0F0Full;
    v |= ( v >> 2 ) & 0x0303030303030303ull;
    constexpr uint64_t lowMask = 0x0101010101010101ull;
    v |= ( v >> 1 ) & lowMask;
    // Isolate the lowest bit of each byte
    v &= lowMask;
    // Now these bits are 0 for zero bytes, 1 for non-zero;
    // Invert that bit
    v ^= lowMask;
    // Now these bits are 1 for zero bytes, 0 for non-zero
    // Compute the result
    return 0 != v;
}
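
One way to gain some confidence in the folding approach above is to compare it with a plain byte-by-byte loop. The sketch below repeats `anyZeroByte` from the answer so that it compiles on its own; `anyZeroByteNaive` and `checkAgainstNaive` are hypothetical helper names introduced only for this check, and the sample values are arbitrary.

```cpp
#include <cassert>
#include <cstdint>

// Folding version from the answer, repeated so this sketch is self-contained
bool anyZeroByte( uint64_t v )
{
    v |= ( v >> 4 ) & 0x0F0F0F0F0F0F0F0Full;
    v |= ( v >> 2 ) & 0x0303030303030303ull;
    constexpr uint64_t lowMask = 0x0101010101010101ull;
    v |= ( v >> 1 ) & lowMask;
    v &= lowMask;
    v ^= lowMask;
    return 0 != v;
}

// Hypothetical reference helper: test each of the 8 bytes in turn
bool anyZeroByteNaive( uint64_t v )
{
    for( int i = 0; i < 8; i++ )
        if( 0 == ( ( v >> ( i * 8 ) ) & 0xFF ) )
            return true;
    return false;
}

// Hypothetical helper: assert that both versions agree on a few sample values
void checkAgainstNaive()
{
    const uint64_t samples[] = {
        0ull,
        0xFFFFFFFFFFFFFFFFull,
        0x1122334455667788ull, // no zero byte
        0x1122330055667788ull, // one zero byte in the middle
        0x8080808080808080ull, // only the top bit of each byte is set
        0x0100000000000000ull, // only the top byte is non-zero
    };
    for( uint64_t s : samples )
        assert( anyZeroByte( s ) == anyZeroByteNaive( s ) );
}
```

Calling `checkAgainstNaive()` from `main` should pass silently. For instance, `anyZeroByte( 0x1122334455667788ull )` is false and `anyZeroByte( 0x1122330055667788ull )` is true.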

However, SIMD is going to be much faster. SSE is an absolute requirement of the x64 architecture: every AMD64 processor is required to support SSE1 and SSE2. Here’s an SSE2 version:

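#include <cstdint>
#include <emmintrin.h> // SSE2 intrinsics

bool anyZeroByteSse2( uint64_t v )
{
    // Move the integer into the low half of a vector; the high half is zeroed
    __m128i vec = _mm_cvtsi64_si128( (int64_t)v );
    __m128i zero = _mm_setzero_si128();
    // Compare all 16 bytes with 0
    __m128i eq = _mm_cmpeq_epi8( vec, zero );
    // Only the lowest 8 bits of the mask correspond to the bytes of v
    return 0 != ( _mm_movemask_epi8( eq ) & 0xFF );
}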

That’s 6 instructions instead of 16: link.
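
The `& 0xFF` in the SSE2 version is easy to overlook. `_mm_cvtsi64_si128` zeroes the upper 8 bytes of the vector, so the comparison always flags those lanes as zero. The sketch below, assuming an x86-64 compiler that provides `<emmintrin.h>`, shows the raw and the masked results side by side; `rawMaskSse2`, `zeroByteMaskSse2` and `checkZeroByteMask` are hypothetical helper names, not part of the answer.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h> // SSE2 intrinsics

// Hypothetical helper: the 16-bit comparison mask before the upper half is dropped
int rawMaskSse2( uint64_t v )
{
    const __m128i vec = _mm_cvtsi64_si128( (int64_t)v );
    const __m128i eq = _mm_cmpeq_epi8( vec, _mm_setzero_si128() );
    return _mm_movemask_epi8( eq );
}

// Hypothetical helper: bit i is set when byte i of v is zero,
// counting from the least significant byte
int zeroByteMaskSse2( uint64_t v )
{
    return rawMaskSse2( v ) & 0xFF;
}

// Hypothetical helper: a few hand-checked sample values
void checkZeroByteMask()
{
    // No zero byte in v, yet the 8 zero-filled upper lanes still compare equal to 0
    assert( rawMaskSse2( 0x1122334455667788ull ) == 0xFF00 );
    assert( zeroByteMaskSse2( 0x1122334455667788ull ) == 0 );
    // Bytes 0 and 4 are zero, so bits 0 and 4 are set
    assert( zeroByteMaskSse2( 0x1122330044556600ull ) == 0x11 );
}
```

Without the `& 0xFF`, the function would report a zero byte for every input, because bits 8 to 15 of the raw mask are always set.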