C bit counting causes unexpected behaviour


I was writing a Hamming weight function in C. I wanted to know if I did it correctly, so my plan was to test it on the number 5. I didn't want to google the binary for 5, so I wrote some code to do it myself. However, when I ran the code I got this:

testing code
bit length of 5: 32

five: -1354390872 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0

This didn't make sense, as nothing should be able to output such a large negative number. Here's my code:

#include <stdio.h>

int main() {
      printf("\ntesting code\n");
      int fv[32];
      int fiv = 5;
      for (int i = 0; i < 32; i++) {
            if (fiv & 1) fv[32 - i] = 1; else fv[32 - i] = 0;
            fiv >>= 1;
      }
      printf("bit length of 5: %ld\n\n", (sizeof(5) * 8));
      printf("five: ");
      for (int i = 0; i < 32; i++) {
            printf("%d ", fv[i]);
      }
      printf("\n");
      return 0;
}

I was expecting to populate the array with integers 0 or 1, so that the array would represent the binary value of the number. It certainly populated the array with 32 bit integers representing the number; however, -1354390872 is neither 0 nor 1. I've tried changing the loop bound to (sizeof(n) * 8) to account for the size of a byte, but that just hangs, and I couldn't figure out why, as there is no manpage for sizeof(). I also tried replacing the line where I set fv[32 - i] with this unoptimised mess: if (fiv & 1) fv[32 - i] = 1; else fv[32 - i] = 0; in hopes that it would change the output.

Can someone help? I'm sure there's a binary format that I could be using, but I honestly don't know it.

Edit: Running the code multiple times gives completely random outputs of very large numbers. Example:

testing code
bit length of 5: 32

five: -956652888 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
forward:~/cproblem/declareyourvariablesc $ ./main

testing code
bit length of 5: 32

five: -754855256 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
forward:~/cproblem/declareyourvariablesc $ ./main

testing code
bit length of 5: 32

five: 104133288 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
forward:~/cproblem/declareyourvariablesc $ ./main

testing code
bit length of 5: 32

five: -1662766424 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
forward:~/cproblem/declareyourvariablesc $ ./main

testing code
bit length of 5: 32

five: -103538008 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
forward:~/cproblem/declareyourvariablesc $ ./main

testing code
bit length of 5: 32

five: 1071841960 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0

Answer by chqrlie:

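Instead of fv[32 - i], you should use fv[32 - 1 - i] to avoid accessing beyond the end of the array. With i running from 0 to 31, 32 - i runs from 32 down to 1, so the loop writes fv[32], which is one past the last element, and never writes fv[0]. As posted, the code has undefined behavior, which explains the observed output, but anything else could happen too.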
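A minimal sketch of that index fix, keeping the ascending loop from the question. The helper name fill_bits and the macro NBITS are hypothetical, introduced only for illustration, and the value is taken as an unsigned type (an assumption, not in the original code) so that the right shift is well defined for any input.

```c
#define NBITS 32  /* assumed width, matching the hard-coded 32 in the question */

/* Store the NBITS low bits of value in out[], most significant bit first.
   out must have room for NBITS elements. */
void fill_bits(unsigned long value, int out[NBITS]) {
    for (int i = 0; i < NBITS; i++) {
        /* NBITS - 1 - i runs from 31 down to 0, so every write stays in bounds */
        out[NBITS - 1 - i] = (int)(value & 1u);
        value >>= 1;
    }
}
```

Calling fill_bits(5, fv) on an int fv[32] should leave fv[29] and fv[31] set to 1 and every other element set to 0, including fv[0].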

To avoid this index arithmetic, you can use a descending loop (note the use of the postdecrement operator in i--: the test compares the old value of i with 0, so the body sees i going from 31 down to 0):

      for (int i = 32; i-- > 0;) {
          if (fiv & 1)
              fv[i] = 1;
          else
              fv[i] = 0;
          fiv >>= 1;
      }

And you can simplify the loop body:

      for (int i = 32; i-- > 0;) {
          fv[i] = fiv & 1;
          fiv >>= 1;
      }

Also note that sizeof(5) is the size in bytes of type int and has type size_t, which may be different from unsigned long, so %ld (which expects a long) is not the right conversion. size_t requires the conversion %zu for printf. To avoid compatibility issues on legacy systems, I recommend casting the size_t expression to (unsigned) and using %u. Also note that * 8 assumes 8-bit bytes, which are ubiquitous today except on some DSPs and ancient CPUs. Purists would use CHAR_BIT, defined in <limits.h>.
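For comparison, here is a hedged sketch of the purist variant, assuming a C99 or later library where %zu is available. The function names int_bit_width and print_int_bit_width are hypothetical. Note that sizeof(int) * CHAR_BIT counts the full storage size, which would include padding bits on an implementation that has them.

```c
#include <limits.h>  /* CHAR_BIT */
#include <stdio.h>   /* printf, size_t */

/* Width of int in bits, without assuming 8-bit bytes. */
size_t int_bit_width(void) {
    return sizeof(int) * CHAR_BIT;
}

/* Print the width; %zu is the printf conversion for size_t. */
void print_int_bit_width(void) {
    printf("bit length of int: %zu\n", int_bit_width());
}
```

The cast to (unsigned) with %u remains the more portable choice when an old C library without %zu support has to be targeted.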

Here is the modified code:

#include <stdio.h>

int main(void) {
    printf("\ntesting code\n");

    int fv[32];
    int fiv = 5;

    for (int i = 32; i-- > 0;) {
        fv[i] = fiv & 1;
        fiv >>= 1;
    }
    printf("bit length of 5: %u\n\n", (unsigned)(sizeof(5) * 8));
    printf("five: ");
    for (int i = 0; i < 32; i++) {
        printf("%d ", fv[i]);
    }
    printf("\n");
    return 0;
}
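
Since the stated goal was a Hamming weight function to test on 5, here is one possible sketch, not necessarily what the asker wrote. It uses the bit-clearing trick often attributed to Brian Kernighan, and the name hamming_weight is hypothetical. Taking an unsigned argument is an assumption made to keep the arithmetic well defined.

```c
/* Count the set bits in x.
   Each x &= x - 1 clears the lowest set bit, so the loop runs once per 1 bit. */
unsigned hamming_weight(unsigned x) {
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
}
```

hamming_weight(5) returns 2, since 5 is 101 in binary.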

Answer by nick:

@chqrlie already gave the correct solution.

Just to explain what is happening here: since fv is defined locally in the main function and is not initialized, its elements hold indeterminate values. Your for loop runs i from 0 to 31, so the index 32 - i runs from 32 down to 1, and the first iteration sets fv at position 32 to 1. Position 32 is out of range, and writing to it results in undefined behavior.

In particular, position 0 was never written, so it kept its "random" value, which is the large number printed first.
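
To make the index pattern concrete, here is a small sketch that records which positions the question's loop touches. The helper mark_written and the macro NBITS are hypothetical names. The array is deliberately given one extra slot so that the write to index 32 is legal in this demonstration, unlike in the original code.

```c
#define NBITS 32  /* same hard-coded width as in the question */

/* Replay the question's indexing (NBITS - i) on an array of NBITS + 1 slots,
   setting written[k] to 1 for every index k the loop stores to. */
void mark_written(int written[NBITS + 1]) {
    for (int k = 0; k <= NBITS; k++) {
        written[k] = 0;          /* start with "never written" everywhere */
    }
    for (int i = 0; i < NBITS; i++) {
        written[NBITS - i] = 1;  /* indices 32 down to 1 */
    }
}
```

After the call, written[0] is 0 and written[1] through written[32] are 1: the original loop skips position 0 and touches position 32.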