How to generate uniform single precision floating point random number between 0 and 1 in FPGA?

970 Views Asked by At

I am trying to generate single precision floating point random number using FPGA by generating number between 0 and 0x3f80000 (IEEE format for 1). But since there are more number of discreet points near to zero than 1, I am not getting uniform generation. Is there any transformation which I can apply to mimic uniform generation. I am using LFSR(32 Bit) and Xoshiro random number generation.

2

There are 2 best solutions below

0
On BEST ANSWER

A standard way to generate uniformly distributed floats in [0,1) from uniformly distributed 32-bit unsigned integers is to multiply the integers with 2-32. Obviously we wouldn't instantiate a floating-point multiplier on the FPGA just for this purpose, and we do not have to, since the multiplier is a power of two. In essence what is needed is a conversion of the integer to a floating-point number, then decrementing the exponent of the floating-point number by 32. This does not work for a zero input which has to be handled as a special case. In the ISO-C99 code below I am assuming that float is mapped to IEEE-754 binary32 type.

Other than for certain special cases, the significand of an IEEE-754 binary floating-point number is normalized to [1,2). To convert an integer into the significand, we need to normalize it, so the most significant bit is set. We can do this by counting the number of leading zero bits, then left shifting the number by that amount. The count of leading zeros is also needed to adjust the exponent.

The significand of a binary32 number comprises 24 bits, of which only 23 bits are stored; the most significant bit (the integer bit) is always one and therefore implicit. This means not all of the 32 bits of the integer can be incorporated into the binary32, so in converting a 32-bit unsigned integer one usually rounds to 24-bit precision. To simplify the implementation, in the code below I simply truncate by cutting off the least significant eight bits, which should have no noticeable effect on the uniform distribution. For the exponent part, we can combine the adjustments due to normalization step with the subtraction due to the scale factor of 2-32.

The code below is written using hardware-centric primitives. Extracting a bit is just a question of grabbing the correct wire, and shifts by fixed amounts are likewise simply wire shifts. The circuit needed to count the number of leading zeros is typically called a priority encoder.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>

#define USE_FP_MULTIPLY  (0)

uint32_t bit (uint32_t, uint32_t);
uint32_t mux (uint32_t, uint32_t, uint32_t);
uint32_t clz (uint32_t);
float uint32_as_float (uint32_t);

/* uniform float in [0, 1) from uniformly distributed random integers */
float uniform_rand_01 (uint32_t i)
{
    const uint32_t FP32_EXPO_BIAS = 127;
    const uint32_t FP32_MANT_BITS = 24;
    const uint32_t FP32_STORED_MANT_BITS = FP32_MANT_BITS - 1;
    uint32_t lz, r;

    // compute shift amount needed for normalization
    lz = clz (i);
    // normalize so that msb is set, except when input is zero
    i = mux (bit (lz, 4), i << 16, i);
    i = mux (bit (lz, 3), i <<  8, i);
    i = mux (bit (lz, 2), i <<  4, i);
    i = mux (bit (lz, 1), i <<  2, i);
    i = mux (bit (lz, 0), i <<  1, i);
    // build bit pattern for IEEE-754 binary32 floating-point number
    r = (((FP32_EXPO_BIAS - 2 - lz) << FP32_STORED_MANT_BITS) + 
         (i >> (32 - FP32_MANT_BITS)));
    // handle special case of zero input
    r = mux (i == 0, i, r);
    // treat bit-pattern as 'float'
    return uint32_as_float (r);
}

// extract bit i from x
uint32_t bit (uint32_t x, uint32_t i)
{
    return (x >> i) & 1;
}

// simulate 2-to-1 multiplexer: c ? a : b ; c must be in {0,1}
uint32_t mux (uint32_t c, uint32_t a, uint32_t b)
{
    uint32_t m = c * 0xffffffff;
    return (a & m) | (b & ~m);
}

// count leading zeros. A priority encoder in hardware.
uint32_t clz (uint32_t x)
{
    uint32_t m, c, y, n = 32;

    y = x >> 16; m = n - 16; c = (y != 0); n = mux (c, m, n); x = mux (c, y, x);
    y = x >>  8; m = n -  8; c = (y != 0); n = mux (c, m, n); x = mux (c, y, x);
    y = x >>  4; m = n -  4; c = (y != 0); n = mux (c, m, n); x = mux (c, y, x);
    y = x >>  2; m = n -  2; c = (y != 0); n = mux (c, m, n); x = mux (c, y, x);
    y = x >>  1; m = n -  2; c = (y != 0); n = mux (c, m, n - x);
    return n;
}

// re-interpret bit pattern of a 32-bit integer as an IEEE-754 binary32 
float uint32_as_float (uint32_t a)
{
    float r;
    memcpy (&r, &a, sizeof r);
    return r;
}

// George Marsaglia's KISS PRNG, period 2**123. Newsgroup sci.math, 21 Jan 1999
// Bug fix: Greg Rose, "KISS: A Bit Too Simple" http://eprint.iacr.org/2011/007
static uint32_t kiss_z=362436069, kiss_w=521288629;
static uint32_t kiss_jsr=123456789, kiss_jcong=380116160;
#define znew (kiss_z=36969*(kiss_z&65535)+(kiss_z>>16))
#define wnew (kiss_w=18000*(kiss_w&65535)+(kiss_w>>16))
#define MWC  ((znew<<16)+wnew )
#define SHR3 (kiss_jsr^=(kiss_jsr<<13),kiss_jsr^=(kiss_jsr>>17), \
              kiss_jsr^=(kiss_jsr<<5))
#define CONG (kiss_jcong=69069*kiss_jcong+1234567)
#define KISS ((MWC^CONG)+SHR3)

#define N 100

uint32_t bucket [N];

int main (void)
{
    for (int i = 0; i < 100000; i++) {
        uint32_t i = KISS;
#if USE_FP_MULTIPLY
        float r = i * 0x1.0p-32f;
#else // USE_FP_MULTIPLY
        float r = uniform_rand_01 (i);
#endif // USE_FP_MULTIPLY
        bucket [(int)(r * N)]++;
    }
    for (int i = 0; i < N; i++) {
        printf ("bucket [%2d]: [%.5f,%.5f): %u\n", 
                i, 1.0f*i/N, (i+1.0f)/N, bucket[i]);
    }
    return EXIT_SUCCESS;
}
0
On

Please check the xoshiro128+ here https://prng.di.unimi.it/xoshiro128plus.c

The VHDL code written by someone can be found here: https://github.com/jorisvr/vhdl_prng/tree/master/rtl

The seed value is generated from another random number generation algorithm so don't get confused by this.

Depending on the seed value used it should give a uniform distribution.