several random numbers c++

281 Views Asked by At

I am a physicist, writing a program that involves generating several (order of a few billions) random numbers, drawn from a Gaussian distribution. I am trying to use C++11. The generation of these random numbers is separated by an operation that should take very little time. My biggest worry is if the fact that I am generating so many random numbers, with such a little time gap, could potentially lead to sub-optimal performance. I am testing certain statistical properties, which rely heavily on the independence of the randomness of the numbers, so, my result is particularly sensitive to these issues. My question is, with the kinds of numbers I mention below in the code (a simplified version of my actual code), am I doing something obviously (or even, subtly) wrong?

#include <random>

// Several other includes, etc.

int main () {

  int dim_vec(400), nStats(1e8);
  vector<double> vec1(dim_vec), vec2(dim_vec);

  // Initialize the above vectors, which are order 1 numbers.

  random_device rd;
  mt19937 generator(rd());
  double y(0.0);
  double l(0.0);

  for (int i(0);i<nStats;i++)
    {
      for (int j(0);j<dim_vec;j++)
        {
          normal_distribution<double> distribution(0.0,1/sqrt(vec1[j]));
          l=distribution(generator);
          y+=l*vec2[j];
        }
      cout << y << endl;
      y=0.0;
    }
}
2

There are 2 best solutions below

5
On

The normal_distribution is allowed to have state. And with this particular distribution, it is common to generate numbers in pairs with every other call, and on the odd calls, return the second cached number. By constructing a new distribution on each call you are throwing away that cache.

Fortunately you can "shape" a single distribution by calling with different normal_distribution::param_type's:

 normal_distribution<double> distribution;
 using P = normal_distribution<double>::param_type;
 for (int i(0);i<nStats;i++)
    {
      for (int j(0);j<dim_vec;j++)
        {
          l=distribution(generator, P(0.0,1/sqrt(vec1[j])));
          y+=l*vec2[j];
        }
      cout << y << endl;
      y=0.0;
    }

I'm not familiar with all implementations of std::normal_distribution. However I wrote the one for libc++. So I can tell you with some amount of certainty that my slight rewrite of your code will have a positive performance impact. I am unsure what impact it will have on the quality, except to say that I know it won't degrade it.

Update

Regarding Severin Pappadeux's comment below about the legality of generating pairs of numbers at a time within a distribution: See N1452 where this very technique is discussed and allowed for:

Distributions sometimes store values from their associated source of random numbers across calls to their operator(). For example, a common method for generating normally distributed random numbers is to retrieve two uniformly distributed random numbers and compute two normally distributed random numbers out of them. In order to reset the distribution's random number cache to a defined state, each distribution has a reset member function. It should be called on a distribution whenever its associated engine is exchanged or restored.

0
On

Some thoughts on top of excellent HH answer

  1. Normal distribution (mu,sigma) is generated from normal (0,1) by shift and scale:

N(mu, sigma) = mu + N(0,1)*sigma

if your mean (mu) is always zero, you could simplify and speed-up (by not adding 0.0) your code by doing something like

normal_distribution<double> distribution;
for (int i(0);i<nStats;i++)
{
  for (int j(0);j<dim_vec;j++)
    {
      l  = distribution(generator);
      y += l*vec2[j]/sqrt(vec1[j]);
    }
  cout << y << endl;
  y=0.0;
}
  1. If speed is of utmost importance, I would try to precompute everything I can outside the main 10^8 loop. Is it possible to precompute sqrt(vec1[j]) so you save on sqrt() call? Is it possible to have vec2[j]/sqrt(vec1[j]) as a single vector?

  2. If it is not possible to precompute those vectors, I would try to save on memory access. Keeping pieces of vec2[j] and vec1[j] together might help with fetching one cache line instead of two. So declare vector<pair<double,double>> vec12(dim_vec); and use in sampling y+=l*vec12[j].first/sqrt(vec12[j].second)