Rewrite the code without branches

371 Views Asked by At

I have the following function:

uint16_t foo(uint8_t input)
{
     uint16_t N = 38;
     if (!(input&1))
     {
       N = 0;
     }
     else 
     {
        if ((input&2) >> 1)
        {
           N = ~N;
        }
     }
     return N;
}

And I would like to have it rewritten without ifs, just as an inline function which transforms 38 into either 0,38 or 65497 given input and using only standard C bit-twiddling operations.

The point is not that compiler could inline the function, or the function to be fast, but just to get rid of the branches and be constant-time regardless what the input is.

The first if is easy:

uint16_t c = ((input&1)^1)-1;
N &= c;

but I'm having troubles finding some simple way to do that conditional negation.

5

There are 5 best solutions below

1
On BEST ANSWER

7 operations and no multiplies, builds on njuffa's comment:

uint16_t bar(uint8_t input)
{
    return (0-(input&1)) & (38^(0-((input>>1)&1)));
}

Demo

4
On

Use a table lookup.

uint16_t foo(uint8_t input) {
    int index= input & 0x3;
    const uint16 table[4]= {0,~38,0,38};
    return table[index];
}
3
On

If I read your code in the right way, it says the following:

uint16_t foo1(uint8_t input)
{
     uint16_t N = 38;

     uint16_t bit0 = input & 1;
     uint16_t nbit0 = bit0 ^ 1;
     uint16_t bit1 = (input & 2) >> 1;
     uint16_t nbit1 = bit1 ^ 1;

     N = nbit0 * ( bit1 * ~N + nbit1 * N);

     return N;
}

Feel free to get rid of variables. They are just for readability.

1
On

To have guaranteed run-time and balanced code, will have to go assembler. As the MSP430 assembler is not that complicated, that will not be much of a problem. Note that things like multiplication are likely performed by a function which has not constant run-time.

There is little sense to avoid branches. Instead, you should balance the execution paths, e.g.using NOP (no operation). The MSP430 user's guide includes instruction timing. Note that the timing is fixed, which is different with larger CPUs like ARM where it depends on pipeline and memory timing.

General note: Doing timing through CPU cycles is most times a bad idea. Modern MCUs like MSP430 provide internal timers and connections to other peripherals like ADC to trigger e.g. conversions with a highly accurate timing. They can generate an interrupt so the CPU can prepare the next value or read the sample without caring about run-time of the code (unless it takes too long).

Using the CPU for such forbids for instance interrupt, as these will blow any timing. That makes maintenance of such a system a nightmare.

1
On

This should work.

#include<stdio.h>

int main(){
    int input;
    scanf("%d", &input);
    int N = (input&1)*(((input&2)*32729)+(38+((input&2)>>1)));
    printf("%d\n", N);
}