Bit Twiddling Hacks contains the following macros, which count the number of bytes in a word x that are less than, or greater than, n:
#define countless(x,n) \
(((~0UL/255*(127+(n))-((x)&~0UL/255*127))&~(x)&~0UL/255*128)/128%255)
#define countmore(x,n) \
(((((x)&~0UL/255*127)+~0UL/255*(127-(n))|(x))&~0UL/255*128)/128%255)
However, it doesn't explain why they work. What's the logic behind these macros?
 
                        
Let's try for intuition on
countmore.First,
~0UL/255*(127-n)is a clever way of copying the value127-nto all bytes in the word in parallel. Why does it work?~0is 255 in all bytes. Consequently,~0/255is1in all bytes. Multiplying by(127-n)does the "copying" mentioned at the outset.The term
~0UL/255*127is just a special case of the above wherenis zero. It copies 127 into all bytes. That's0x7f7f7f7fif words are 4 bytes. "Anding" withxzeros out the high order bit in each byte.That's the first term
(x)&~0UL/255*127). The result is the same asxexcept the high bit in each byte is zeroed.The second term
~0UL/255*(127-(n))is as above:127-ncopied to each byte.For any given byte
x[i], adding the two terms gives us127-n+x[i]ifx[i]<=127. This quantity will have the high order bit set wheneverx[i]>n. It's easiest to see this as adding two 7-bit unsigned numbers. The result "overflows" into the 8th bit because the result is 128 or more.So it looks like the algorithm is going to use the 8th bit of each byte as a boolean indicating
x[i]>n.So what about the other case,
x[i]>127? Here we know the byte is more thannbecause the algorithm stipulatesn<=127. The 8th bit ought to be always 1. Happily, the sum's 8th bit doesn't matter because the next step "or"s the result withx. Sincex[i]has the 8th bit set to 1 if and only if it's 128 or larger, this operation "forces" the 8th bit to 1 just when the sum might provide a bad value.To summarize so far, the "or" result has the 8th bit set to 1 in its i'th byte if and only if
x[i]>n. Nice.The next operation
&~0UL/255*128sets everything to zero except all those 8th bits of interest. It's "anding" with 0x80808080...Now the task is to find the number of these bits set to 1. For this,
countmoreuses some basic number theory. First it shifts right 7 bits so the bits of interest are b0, b8, b16... The value of this word isA beautiful fact is that 1 == 2^8 == 2^16 == ... mod 255. In other words, each 1 bit is 1 mod 255. It follows that finding mod 255 of the shifted result is the same as summing b0+b8+b16+...
Yikes. We're done.