I am trying to port finite field CPU code to the GPU, and in the process I would like to generate random vectors to test the speed of my functions.
I need two random vectors of uint64_t (and the corresponding two vectors of double, holding a floating-point representation of the finite field elements), each of size N.
As far as I know, uint64_t types are not natively supported on the GPU and are emulated using two 32-bit registers.
These vectors will contain integers in the range 0 to p-1, where p is a prime number, e.g. (1<<25) - 39. (This prime uses 25 bits, but I still need 64 bits to store intermediate results before taking the remainder.)
I have tried to understand the cuRAND API and generate the random vectors with it. Here is my attempt:
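To make the remark about intermediate results concrete, here is a minimal sketch of the arithmetic. The helper names `reduce_mod` and `mul_mod` and the `HD` macro are hypothetical, introduced only for illustration. With p below 2^25, a product of two reduced elements stays below 2^50, so it overflows 32 bits but fits in a uint64_t (and is also below 2^53, so it is exactly representable in a double).

```cpp
#include <cstdint>

// Lets the same helpers compile as plain C++ or inside a .cu file.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical helper: map an arbitrary 64-bit value into [0, p-1].
HD inline uint64_t reduce_mod(uint64_t x, uint64_t p) {
    return x % p;
}

// Hypothetical helper: multiply two already-reduced elements.
// Assumes a, b < p < 2^25, so a * b < 2^50 and cannot overflow 64 bits.
HD inline uint64_t mul_mod(uint64_t a, uint64_t b, uint64_t p) {
    return (a * b) % p;
}
```

Applying `reduce_mod` to raw 64-bit random values gives elements in the required range. Note that a plain remainder introduces a very small bias when p does not divide 2^64, which is usually acceptable for speed tests.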
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <cuda.h>
#include <time.h>
#include <curand.h>

int main() {
    uint64_t p = (1 << 25) - 39;
    const uint32_t N = (1 << 27);
    uint64_t *au;
    double *ad;
    cudaError_t handle;
    handle = cudaMallocManaged(&au, N*sizeof(uint64_t));
    handle = cudaMallocManaged(&ad, N*sizeof(double));
    curandGenerator_t gen_type;
    curandCreateGenerator(&gen_type, CURAND_RNG_PSEUDO_MRG32K3A);
    curandSetPseudoRandomGeneratorSeed(gen_type, (uint64_t)time(NULL));
    curandGenerateLongLong(gen_type, au, p);
    cudaFree(au);
    cudaFree(ad);
    return 0;
}
nvcc reports that au has an incompatible type in the curandGenerateLongLong call.
According to the cuRAND API, I am constrained to use the SOBOL64 quasirandom generator. Why is that?
Is there a pseudorandom generator for uint64_t, or is a quasirandom generator suitable for my case?
If I want to avoid quasirandom generation, am I forced to generate the numbers on the CPU and copy my random vectors to the GPU? Can I use the device cuRAND library (curand_kernel.h) for my use case?
On 64-bit Linux platforms supported by CUDA (at least), there is no numerical difference between the representation and semantics of uint64_t and unsigned long long. I acknowledge that the types are different, but the difference isn't meaningful for the use case you have shown here. It should be fine for you to modify your code so that au is passed to curandGenerateLongLong as an unsigned long long * (for example, with a cast), and you will get an array of uint64_t generated. (On 64-bit Windows, I suspect you would not even get the error you are reporting, but I have not tested it.)
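A minimal sketch of such a modification, assuming the change is a pointer cast at the curandGenerateLongLong call. Beyond the cast, this sketch makes three further changes that are assumptions rather than part of the answer: it uses CURAND_RNG_QUASI_SOBOL64 (the question notes that the API ties this call to a 64-bit quasirandom generator), it passes the element count N as the third argument instead of p, and it uses a smaller N to keep the test quick. Seeding is omitted because curandSetPseudoRandomGeneratorSeed applies to pseudorandom generators.

```cpp
#include <cstdint>
#include <cstdio>
#include <cuda_runtime.h>
#include <curand.h>

// The cast below relies on both types being 64 bits wide.
static_assert(sizeof(uint64_t) == sizeof(unsigned long long),
              "uint64_t and unsigned long long must have the same size");

int main() {
    const size_t N = 1 << 20;  // assumption: smaller than the question's 1 << 27
    uint64_t *au = nullptr;

    if (cudaMallocManaged(&au, N * sizeof(uint64_t)) != cudaSuccess) {
        fprintf(stderr, "cudaMallocManaged failed\n");
        return 1;
    }

    curandGenerator_t gen;
    // Assumption: a 64-bit quasirandom generator, as discussed in the question.
    if (curandCreateGenerator(&gen, CURAND_RNG_QUASI_SOBOL64) != CURAND_STATUS_SUCCESS) {
        fprintf(stderr, "curandCreateGenerator failed\n");
        cudaFree(au);
        return 1;
    }

    // The cast: uint64_t* -> unsigned long long*.
    // The third argument is the number of values to generate.
    curandStatus_t st = curandGenerateLongLong(
        gen, reinterpret_cast<unsigned long long *>(au), N);
    if (st != CURAND_STATUS_SUCCESS) {
        fprintf(stderr, "curandGenerateLongLong failed with status %d\n", (int)st);
    }

    // Generation is asynchronous; wait before reading managed memory on the host.
    cudaDeviceSynchronize();
    if (st == CURAND_STATUS_SUCCESS) {
        printf("first value: %llu\n", (unsigned long long)au[0]);
    }

    curandDestroyGenerator(gen);
    cudaFree(au);
    return st == CURAND_STATUS_SUCCESS ? 0 : 1;
}
```

This can be built with something like `nvcc example.cu -lcurand`. The generated values span the full 64-bit range, so reducing them into the range 0 to p-1 remains a separate step.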