Say you have a cuArray for binding a surface object, something of the form:
// These are inputs to a function really.
cudaArray* d_cuArrSurf;
cudaSurfaceObject_t * surfImage;
const cudaExtent extent = make_cudaExtent(width, height, depth);
cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<float>();
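// cudaArraySurfaceLoadStore is needed for arrays that are bound to a surface object.
cudaMalloc3DArray(&d_cuArrSurf, &channelDesc, extent, cudaArraySurfaceLoadStore);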
// Bind to Surface
cudaResourceDesc surfRes;
memset(&surfRes, 0, sizeof(cudaResourceDesc));
surfRes.resType = cudaResourceTypeArray;
surfRes.res.array.array = d_cuArrSurf;
cudaCreateSurfaceObject(surfImage, &surfRes);
Now, I want to initialize this cuArray to zero. Apparently there is no memset for cuArray type of objects.
type of objects. What would be the best way to do this? Maybe multiple options are possible, and some may have better or worse features. Which are these options?
I can think of two options:
1. Allocate and zero host memory, then copy it to the array using cudaMemcpy3D().
2. Create an initialization kernel and write the zeros with surf3Dwrite().
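As a minimal sketch of the second option (an initialization kernel), something like the following could work. The kernel name zeroSurface3D, the helper countNonZeroAfterSurfaceFill, and the 8x8x8 block size are hypothetical choices for illustration, and the sketch assumes a float array allocated with cudaArraySurfaceLoadStore. The helper only exists to read the array back and check the result.

```cuda
#include <cuda_runtime.h>
#include <cstring>
#include <vector>

// Hypothetical kernel: each thread writes one zero through the surface object.
__global__ void zeroSurface3D(cudaSurfaceObject_t surf, int width, int height, int depth)
{
    const int x = blockIdx.x * blockDim.x + threadIdx.x;
    const int y = blockIdx.y * blockDim.y + threadIdx.y;
    const int z = blockIdx.z * blockDim.z + threadIdx.z;
    if (x < width && y < height && z < depth) {
        // The x coordinate of surf3Dwrite is a byte offset, hence the sizeof(float).
        surf3Dwrite(0.0f, surf, x * (int)sizeof(float), y, z);
    }
}

// Hypothetical self-check: allocate an array, zero it with the kernel, copy it
// back, and return the number of non-zero elements (-1 on any CUDA error).
long countNonZeroAfterSurfaceFill(int width, int height, int depth)
{
    cudaArray_t arr = nullptr;
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    const cudaExtent extent = make_cudaExtent(width, height, depth);
    if (cudaMalloc3DArray(&arr, &desc, extent, cudaArraySurfaceLoadStore) != cudaSuccess)
        return -1;

    cudaResourceDesc res;
    memset(&res, 0, sizeof(res));
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;
    cudaSurfaceObject_t surf = 0;
    if (cudaCreateSurfaceObject(&surf, &res) != cudaSuccess) {
        cudaFreeArray(arr);
        return -1;
    }

    // One thread per element; round the grid up so partial blocks are covered.
    const dim3 block(8, 8, 8);
    const dim3 grid((width + block.x - 1) / block.x,
                    (height + block.y - 1) / block.y,
                    (depth + block.z - 1) / block.z);
    zeroSurface3D<<<grid, block>>>(surf, width, height, depth);
    cudaError_t err = cudaDeviceSynchronize();

    // Host buffer starts at 1.0f so a copy that did nothing would be detected.
    std::vector<float> host((size_t)width * height * depth, 1.0f);
    if (err == cudaSuccess) {
        cudaMemcpy3DParms p;
        memset(&p, 0, sizeof(p));
        p.srcArray = arr;
        p.dstPtr = make_cudaPitchedPtr(host.data(), width * sizeof(float), width, height);
        p.extent = extent;  // in elements, because a cudaArray takes part in the copy
        p.kind = cudaMemcpyDeviceToHost;
        err = cudaMemcpy3D(&p);
    }

    cudaDestroySurfaceObject(surf);
    cudaFreeArray(arr);
    if (err != cudaSuccess) return -1;

    long nonZero = 0;
    for (float v : host) nonZero += (v != 0.0f);
    return nonZero;
}
```

This avoids any staging buffer, at the cost of writing and launching a kernel; whether it beats a copy-based approach would need to be measured for the sizes in question.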
Here is a rough example, roughly extending the previous rough example:
The (total) extent above is 256x256x256, so I chose a per-transfer extent of 256x256 (basically one z-slice) and ran 256 iterations of cudaMemcpy3D. It seems to pass the sniff test. I used 1 as the initializing value for device memory here "just because". If you want to make this faster and initialize to zero, skip the host->device copy and just use cudaMemset to initialize the linear memory (the source for the 3D transfer) to zero.
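The zero-initializing variant described above might look roughly like the sketch below. It is not the original example: the function names zeroArrayBySlices and countNonZeroAfterSliceFill are hypothetical, and the code assumes a float cudaArray. One z-slice of linear device memory is cleared with cudaMemset and then copied into every slice of the array with cudaMemcpy3D.

```cuda
#include <cuda_runtime.h>
#include <cstring>
#include <vector>

// Hypothetical helper: zero a 3D float cudaArray one z-slice at a time,
// using a cudaMemset-cleared linear buffer as the copy source.
cudaError_t zeroArrayBySlices(cudaArray_t arr, size_t width, size_t height, size_t depth)
{
    float* d_slice = nullptr;
    const size_t sliceBytes = width * height * sizeof(float);
    cudaError_t err = cudaMalloc(&d_slice, sliceBytes);
    if (err != cudaSuccess) return err;
    err = cudaMemset(d_slice, 0, sliceBytes);

    cudaMemcpy3DParms p;
    memset(&p, 0, sizeof(p));
    p.srcPtr = make_cudaPitchedPtr(d_slice, width * sizeof(float), width, height);
    p.dstArray = arr;
    p.extent = make_cudaExtent(width, height, 1);  // one slice, in elements
    p.kind = cudaMemcpyDeviceToDevice;

    // Re-use the same zeroed slice as the source for every destination z position.
    for (size_t z = 0; z < depth && err == cudaSuccess; ++z) {
        p.dstPos = make_cudaPos(0, 0, z);
        err = cudaMemcpy3D(&p);
    }
    cudaFree(d_slice);
    return err;
}

// Hypothetical self-check: returns the number of non-zero elements after the
// fill, or -1 on any CUDA error.
long countNonZeroAfterSliceFill(size_t width, size_t height, size_t depth)
{
    cudaArray_t arr = nullptr;
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    const cudaExtent extent = make_cudaExtent(width, height, depth);
    if (cudaMalloc3DArray(&arr, &desc, extent, cudaArraySurfaceLoadStore) != cudaSuccess)
        return -1;

    cudaError_t err = zeroArrayBySlices(arr, width, height, depth);

    // Host buffer starts at 1.0f so a copy that did nothing would be detected.
    std::vector<float> host(width * height * depth, 1.0f);
    if (err == cudaSuccess) {
        cudaMemcpy3DParms p;
        memset(&p, 0, sizeof(p));
        p.srcArray = arr;
        p.dstPtr = make_cudaPitchedPtr(host.data(), width * sizeof(float), width, height);
        p.extent = extent;
        p.kind = cudaMemcpyDeviceToHost;
        err = cudaMemcpy3D(&p);
    }
    cudaFreeArray(arr);
    if (err != cudaSuccess) return -1;

    long nonZero = 0;
    for (float v : host) nonZero += (v != 0.0f);
    return nonZero;
}
```

Copying slice by slice keeps the staging buffer to a single z-slice instead of a full volume, which is the main reason to prefer it over one large transfer when memory is tight.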