Does this simple code not containing any loop generate a loop in assembly?

156 Views Asked by At

I was playing with this code to check if an integer is a power of 4:

// C++ version

bool is_pow_4(unsigned a) {
    return (std::popcount(a) == 1) && (std::countr_zero(a) % 2 == 0);
}
// C version

int is_pow_4(unsigned a) {
   return (__builtin_popcount(a) == 1) && (__builtin_ctz(a) % 2 == 0);
}

Basically I check that there is just one bit set and it is on an odd position.

I was expecting a branchless code, however I see two jumps and a rep instruction. From what I recall a rep is basically a loop, but on assembly level. It seems that std::countr_zero/__builtin_ctz generates the rep instruction.

C++ output:

is_pow_4(unsigned int):
        lea     edx, [rdi-1]
        mov     ecx, edi
        xor     eax, eax
        xor     ecx, edx
        cmp     edx, ecx
        jb      .L7
.L1:
        ret
.L7:
        mov     eax, 1
        test    edi, edi
        je      .L1
        xor     eax, eax
        rep bsf eax, edi
        not     eax
        and     eax, 1
        ret

C output is similar.

I understand that the loop is bound by the width of the integer (32), so I think the complexity of the code is O(1), but I was still surprised to find a loop.

Is my understanding correct? Is this code a loop on x86? Is this because while there is a x86 popcount instruction, there is no count leading/trailing zeros x86 instruction?

0

There are 0 best solutions below