Sequence of binary input file for NIST


I've developed an RNG program, and now I want to test whether its numbers are random. I decided to use the NIST Test Suite for this.

I'm still confused about the input file format. The documentation says: "The user may want to construct as many files of arbitrary length as desired. Files should contain binary sequences stored as either ASCII characters consisting of zeroes and ones, or as binary data where each byte contains eight bits worth of 0’s and 1’s"

My Python RNG program returns a sequence of numbers, one per line:

69
11
68
55
33
20
75
96

How can I convert them into a proper input file for NIST?

2

There are 2 best solutions below

4

Your first random number is 69, which is 1000101 in binary. You can put that in your test file either as the ASCII string "1000101" or as the seven bits 1000101... in a binary file. The ASCII option is probably easier, but the file will be eight times the size. In either case you may have to be careful with leading zeros in binary; I am not sure what NIST wants without reading a lot more of SP 800-22 than I currently have time for.
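Both options can be sketched in Python as follows. This is only an illustration, not something taken from SP 800-22: the helper names `to_ascii_bits` and `to_packed_bytes` are made up here, and the fixed width of 7 bits is an assumption that every number is below 128. A fixed width keeps the leading zeros, so each number contributes the same number of bits.

```python
def to_ascii_bits(numbers, width):
    """Concatenate fixed-width binary strings, keeping leading zeros."""
    for n in numbers:
        if not 0 <= n < 2 ** width:
            raise ValueError(f"{n} does not fit in {width} bits")
    return "".join(format(n, f"0{width}b") for n in numbers)


def to_packed_bytes(bit_string):
    """Pack a '0'/'1' string into bytes, eight bits per byte.

    Bits that do not fill a whole final byte are dropped rather than
    padded, so no non-random filler bits end up in the output.
    """
    usable = len(bit_string) - len(bit_string) % 8
    return bytes(int(bit_string[i:i + 8], 2) for i in range(0, usable, 8))


# Numbers from the question; width=7 is an assumption (all values < 128).
numbers = [69, 11, 68, 55, 33, 20, 75, 96]
bits = to_ascii_bits(numbers, width=7)
packed = to_packed_bytes(bits)
```

To produce the files, `bits` can be written with `open(path, "w")` for the ASCII variant, and `packed` with `open(path, "wb")` for the binary variant.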

0

The appropriate input for NIST is a set of binary sequences. You can simply convert your integers to binary, write them to a file, and use that file as input to the NIST program. However, converting to binary and feeding the result to NIST does not necessarily mean your RNG will pass all the tests there. To see why, let's try to answer the following question.

How many bits do you produce for each integer?

For example, let's say your RNG generates integers from 0 to 5 (uniform distribution, all values equiprobable). Since representing 5 requires at least 3 bits, we will use 3 bits for each integer.

0: 000
1: 001
2: 010
3: 011
4: 100
5: 101

Look at the first (most significant) bit for each of the numbers. Four of them are 0, and the remaining two are 1. So, whenever you pick a random integer from 0 to 5, the probability of the first bit being 0 is higher than it being 1. Remember that, for an RNG, we need p(0)=p(1)=0.5 for each of the bits.

Now, if an RNG produced values from 0 to 7 (uniformly), we could convert each of them to 3 bits and maintain p(0)=p(1)=0.5 at every index. Why is that? Because all 2^3 different values (i.e., 0 to 2^3-1) occur, so no index is biased: each one has an equal number of zeros and ones.
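The two cases can be checked by counting. The snippet below is a small sketch (the function name `probability_of_one` is made up for this illustration) that computes the probability of a 1 at each bit position, most significant bit first, for uniformly distributed integers from 0 to `num_values - 1`.

```python
from fractions import Fraction


def probability_of_one(num_values, width):
    """P(bit = 1) per position (MSB first) for uniform integers 0..num_values-1."""
    probs = []
    for pos in range(width - 1, -1, -1):
        # Count how many of the possible values have a 1 at this position.
        ones = sum((v >> pos) & 1 for v in range(num_values))
        probs.append(Fraction(ones, num_values))
    return probs


biased = probability_of_one(6, 3)    # values 0 to 5
unbiased = probability_of_one(8, 3)  # values 0 to 7
```

For 0 to 5 the two higher bits are 1 with probability 1/3 and only the last bit reaches 1/2; for 0 to 7 every position gives 1/2.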

The above discussion leads us to the conclusion that, if you have integer values coming out of an RNG that range from 0 to 2^n-1, and each of them is equiprobable, you can convert each one to n bits and concatenate them for NIST evaluation. If those conditions do not hold (e.g., the number of outcomes is not a power of 2), one way is to settle for the largest power of 2 that fits in the output range of your RNG and discard the rest of the values.
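A minimal sketch of that discard-and-convert approach is shown below. The function name `to_unbiased_bits` is hypothetical, and the range of 0 to 99 used in the example is an assumption: the question does not state the actual output range of the RNG.

```python
def to_unbiased_bits(numbers, range_size):
    """Keep values below the largest power of two <= range_size, n bits each.

    Assumes the inputs are uniform over 0..range_size-1.
    """
    if range_size < 2:
        raise ValueError("range_size must be at least 2")
    n = range_size.bit_length() - 1  # largest n with 2**n <= range_size
    limit = 1 << n                   # values >= limit are discarded
    return "".join(format(v, f"0{n}b") for v in numbers if v < limit)


# Numbers from the question; range_size=100 is an assumed range (0 to 99).
numbers = [69, 11, 68, 55, 33, 20, 75, 96]
bits = to_unbiased_bits(numbers, range_size=100)
```

With an assumed range of 100 values, n is 6, so only values below 64 are kept (11, 55, 33 and 20 here) and each contributes 6 bits. The cost is that part of the RNG output is thrown away, so more numbers are needed to reach a given sequence length.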