RcppArmadillo, OpenMP and RNG seeds

197 Views Asked by At

What is currently the best solution to allow seed setting when using openMP?

Simple example:

#include "RcppArmadillo.h"
using namespace Rcpp;
using namespace arma;

// [[Rcpp::depends(RcppArmadillo)]]

// [[Rcpp::export]]
vec rngtest(int N, double seed, int cores){
  arma_rng::set_seed(seed);

  omp_set_num_threads(cores);
  vec out(N);
    
#pragma omp parallel for schedule(static)
  for(int k=0; k<N; k++){
    out(k)=randn(1)[0];
  }
  
  return(out);
}

Issues

Firstly, one can't set seeds in Armadillo (anymore?) when calling from:

rngtest(10,2,4)

Warning message:
In rngtest(10, 2, 4) :
  When called from R, the RNG seed has to be set at the R level via set.seed()

Now if I use set.seed and use let's say 4 cores, I do get deviations:

set.seed(123)
rngtest(10,2,4)
            [,1]
 [1,]  1.0034247
 [2,] -0.5099541
 [3,]  1.5406472
 [4,] -1.2821739
set.seed(123)
rngtest(10,2,4)
            [,1]
 [1,]  1.0034247
 [2,] -0.5099541
 [3,]  1.5406472
 [4,] -0.5437432

Some thoughts I think once can figure the sequence in which computations are assigned and try to arrange the results in the right order. How do I do that and is it generating computational overhead at all? Shouldn't. I think this is a general armadillo/openMP question, regardless of calling via Rcpp or not.

I'd also love to be able to set RNGs for armadillo independent from R, even while calling code from R. It'd be so nice when developing packages for different environments.

1

There are 1 best solutions below

2
On

There are several topics comingled here.

  1. We default to RNGs from R for convenience of R users, but there is a toggle you set via a #define to turn this off

  2. With OpenMP you can also simply ignore the R RNGs, use your own and rely one of the parallel RNGs such as sitmo or dqrng.