I have written a method, called from a .cpp file, that runs cudaMemcpy. The method is below:
void copy_to_device(uint32_t *host, uint32_t *device, int size)
{
    cudaError_t ret;
    ret = cudaMemcpy(device, host, size*sizeof(uint32_t), cudaMemcpyHostToDevice);
    if(ret == cudaErrorInvalidValue)
        printf("1!\n");
    else if(ret == cudaErrorInvalidDevicePointer)
        printf("2!\n");
    else if(ret == cudaErrorInvalidMemcpyDirection)
        printf("3!\n");
}
My .cpp file calls it like this:
uint32_t *input_device;
device_malloc(input_device, INPUT_HEIGHT*INPUT_WIDTH);
uint32_t *oneDinput = TwoDtoOneD(input, INPUT_HEIGHT, INPUT_WIDTH);
copy_to_device(oneDinput, input_device, INPUT_HEIGHT*INPUT_WIDTH);
All that TwoDtoOneD does is take a 2D array, convert it to a 1D array, and return it. Whenever I call copy_to_device, it returns cudaErrorInvalidValue, which isn't well documented on NVIDIA's website. Do you know what is wrong with the parameters I am passing to my function that causes this error? It causes problems later, during kernel execution. If you need any more details, please ask.
Here's the method device_malloc:
void device_malloc(uint32_t *buffer, int size)
{
    cudaMalloc((void **) &buffer, size*sizeof(uint32_t));
}
The problem is in the call device_malloc(input_device, INPUT_HEIGHT*INPUT_WIDTH).

Whatever device_malloc does, it does not modify the value of input_device, because the pointer is passed by value and the function only ever sees a copy of it. That is, unless the first argument is a reference to a pointer, but I am ready to bet it is not.

You need to change the first argument of device_malloc to a pointer to pointer and call it with the address of input_device, that is, device_malloc(&input_device, INPUT_HEIGHT*INPUT_WIDTH). Or just have device_malloc return a pointer to the allocated memory.

To answer your question more directly, cudaMemcpy returns an error because its first argument, device, is not a valid device pointer, which the CUDA runtime has a way of checking. It probably holds a garbage value, since it is never initialized because of the issue above.

As a side note, unrelated to the issue, you may want to use the cudaGetErrorString function for a more convenient way to print the status.
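
For illustration, here is a minimal sketch of the pointer-to-pointer version of device_malloc, with cudaGetErrorString used to report failures. The check helper and the bool return values are assumptions added for this example and are not part of the original code; the function names and parameters otherwise follow the question.

```cuda
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not in the original post): prints the runtime's
// description of a failed call and reports whether the call succeeded.
static bool check(cudaError_t status, const char *what)
{
    if (status != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(status));
        return false;
    }
    return true;
}

// Takes the ADDRESS of the caller's pointer, so cudaMalloc stores the
// device address in the caller's variable rather than in a local copy.
// Returning bool is an assumption; the original function returned void.
bool device_malloc(uint32_t **buffer, int size)
{
    size_t bytes = static_cast<size_t>(size) * sizeof(uint32_t);
    return check(cudaMalloc((void **)buffer, bytes), "cudaMalloc");
}

// Same copy as in the question, but any failure is reported by name
// instead of being mapped to "1!", "2!" or "3!".
bool copy_to_device(uint32_t *host, uint32_t *device, int size)
{
    size_t bytes = static_cast<size_t>(size) * sizeof(uint32_t);
    return check(cudaMemcpy(device, host, bytes, cudaMemcpyHostToDevice),
                 "cudaMemcpy");
}
```

With these definitions the caller would pass the address of its pointer, device_malloc(&input_device, INPUT_HEIGHT*INPUT_WIDTH), and the copy_to_device call from the question stays unchanged. The alternative mentioned above, returning the allocated pointer from device_malloc, avoids the double pointer but needs another way to signal an allocation failure.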