cudaMemcpy() gives segfault when using Type**

517 Views Asked by aagam94 At 28 July 2025 at 08:00

I want to copy a double-pointer object (Type**) from the host to the GPU device and compute over it there. When I cudaMemcpy the object to the device, it segfaults.

BMP Input;
Input.ReadFromFile( fileName );
WIDTH = Input.TellWidth();
HEIGHT = Input.TellHeight();
RGBApixel** imageData = new RGBApixel* [HEIGHT];
for (int i = 0; i < HEIGHT; i++) 
    imageData[i] = new RGBApixel [WIDTH];

for(int j=0;j<Input.TellHeight();j++){
    for(int i=0;i<Input.TellWidth();i++){
      imageData[j][i] = Input.GetPixel(i,j);
    }
  }
long long imageSize = WIDTH*HEIGHT*sizeof(RGBApixel *);

RGBApixel** dev_imgdata,dev_imgdata_out;
//Allocating cudaMemory
cudaMalloc( (void **) &dev_imgdata, imageSize );
cudaMalloc( (void **) &dev_imgdata_out, imageSize );

The line below then segfaults:

cudaMemcpy(dev_imgdata,imageData,imageSize,cudaMemcpyHostToDevice);


There are 2 best solutions below

VAndrei On 16 November 2014 at 13:32

When you declare RGBApixel** imageData = new RGBApixel* [HEIGHT]; and then allocate each row with its own new, you have absolutely no guarantee that the rows will occupy a contiguous block of memory.

cudaMemcpy copies contiguous blocks of memory into device RAM. Your statement tries to copy the start addresses of each matrix row but not the actual data. It also asks for WIDTH*HEIGHT pointers from a host array that holds only HEIGHT of them, so the copy reads past the end of imageData. If you keep the double-pointer layout, you also need to cudaMalloc each line separately, exactly as you did for the host buffer.

What you need to do is declare imageData as just an RGBApixel* - basically put the matrix in a single vector of WIDTH*HEIGHT pixels, index it as row*WIDTH + col, and it will work.

You can also copy one line at a time, but that's not very good practice, since every memory access will require an extra indirection and you will hurt caching efficiency.
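
A minimal sketch of the single-buffer layout suggested above, in host-side C++ only. Pixel is a stand-in for EasyBMP's RGBApixel (its layout here is an assumption), and flatIndex and flattenRows are hypothetical helper names, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for EasyBMP's RGBApixel (assumed layout, for illustration only).
struct Pixel {
    unsigned char Blue, Green, Red, Alpha;
};

// Row-major index of pixel (row, col) in a flat buffer with `width` columns.
inline std::size_t flatIndex(std::size_t row, std::size_t col, std::size_t width) {
    assert(col < width);  // a column past the row end would alias the next row
    return row * width + col;
}

// Copy separately allocated rows into one contiguous buffer.
std::vector<Pixel> flattenRows(Pixel* const* rows, std::size_t height, std::size_t width) {
    std::vector<Pixel> flat(height * width);
    for (std::size_t r = 0; r < height; ++r) {
        for (std::size_t c = 0; c < width; ++c) {
            flat[flatIndex(r, c, width)] = rows[r][c];
        }
    }
    return flat;
}
```

With the pixels in this form, one cudaMalloc of WIDTH*HEIGHT*sizeof(RGBApixel) bytes and one cudaMemcpy from flat.data() should be enough, and a kernel can read pixel (row, col) at row*WIDTH + col without following a second pointer.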

fcdimitr On 17 November 2014 at 07:23

Also, make sure that when you compile your program you use -arch sm_20 so that nvcc targets your graphics card's architecture (if it has compute capability 2.0). Without it, I believe you can't use double and the result is unpredictable (or the double is demoted to float).
