Is there any reason why clEnqueueNDRangeKernel may block?

Asked by Francisco on 30 June 2025 at 06:04

I am developing an application that uses OpenCL, targeting version 1.2. I use DX11 interoperability to display the kernel results. I have tried my code on Intel (iGPU) and Nvidia platforms, and I see the same behaviour on both.

My call to clEnqueueNDRangeKernel is blocking the CPU thread. I have checked the documentation and cannot find a statement describing the situations in which a kernel enqueue may block. I have read in some forums that this sometimes happens with some OpenCL implementations. The code works properly and outputs valid results. The API does not return an error at any point; everything seems to run smoothly.

I cannot paste the full source, but here is the in-loop part:

    size_t local = 64;
    size_t global = ctx->dec_in_host->horizontal_blocks * ctx->dec_in_host->vertical_blocks * local;

    print_if_error(clEnqueueWriteBuffer(ctx->queue, ctx->blocks_gpu, CL_TRUE, 0, sizeof(block_input) * TOTAL_BLOCKS, ctx->blocks_host, 0, NULL, &ctx->blocks_copy_status), "copying data");
    print_if_error(clEnqueueWriteBuffer(ctx->queue, ctx->dec_in_gpu, CL_TRUE, 0, sizeof(decoder_input), ctx->dec_in_host, 0, NULL, &ctx->frame_copy_status), "copying data");
    
    if (ctx->mode == nv_d3d11_sharing)
        print_if_error(ctx->fp_clEnqueueAcquireD3D11ObjectsNV(ctx->queue, 1, &(ctx->image_gpu), 0, NULL, NULL), "Acquiring texture");
    else if (ctx->mode == khr_d3d11_sharing)
        print_if_error(ctx->fp_clEnqueueAcquireD3D11ObjectsKHR(ctx->queue, 1, &(ctx->image_gpu), 0, NULL, NULL), "Acquiring texture");
    
    t1 = clock();
    print_if_error(clEnqueueNDRangeKernel(ctx->queue, ctx->kernel, 1, NULL, &global, &local, 0, NULL, &ctx->kernel_status), "kernel launch");
    t2 = clock();
    
    if (ctx->mode == nv_d3d11_sharing)
        print_if_error(ctx->fp_clEnqueueReleaseD3D11ObjectsNV(ctx->queue, 1, &(ctx->image_gpu), 0, NULL, NULL), "Releasing texture");
    else if (ctx->mode == khr_d3d11_sharing)
        print_if_error(ctx->fp_clEnqueueReleaseD3D11ObjectsKHR(ctx->queue, 1, &(ctx->image_gpu), 0, NULL, NULL), "Releasing texture");
    printf("Elapsed time %lf ms\n", (double)(t2 - t1)*1000 / CLOCKS_PER_SEC);
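
A note on the measurement itself: the C standard defines clock() as processor time, while the MSVC runtime returns elapsed wall-clock time. The figure printed above can therefore mean different things depending on the toolchain. To decide whether a call blocks the host thread, wall-clock time is the relevant quantity. Below is a minimal sketch of a wall-clock timer built on C11 timespec_get. The names elapsed_ms, wall_now and time_call_ms are hypothetical helpers and are not part of the code above. TIME_UTC is not guaranteed to be monotonic, so treat this as a rough measurement only.

```c
#include <time.h>

/* Hypothetical helper: milliseconds elapsed between two timespec samples. */
static double elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    double sec  = (double)(end->tv_sec - start->tv_sec);
    double nsec = (double)(end->tv_nsec - start->tv_nsec);
    return sec * 1000.0 + nsec / 1.0e6;
}

/* Hypothetical helper: sample the wall clock with C11 timespec_get. */
static struct timespec wall_now(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return ts;
}

/* Hypothetical helper: run fn(arg) once and return how long the host
 * thread spent inside the call, in wall-clock milliseconds. */
static double time_call_ms(void (*fn)(void *), void *arg)
{
    struct timespec w1 = wall_now();
    fn(arg);
    struct timespec w2 = wall_now();
    return elapsed_ms(&w1, &w2);
}
```

To use it here, the clEnqueueNDRangeKernel call would be wrapped in a small function taking ctx as its argument and passed to time_call_ms. If the reported time stays close to the kernel's own execution time, the host thread is probably waiting inside the enqueue call.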

So my questions are:

Do you know of any reason why clEnqueueNDRangeKernel would block?
Do you know whether the DX11 interop might cause this?
Do you know whether some OpenCL configuration can make a kernel launch synchronous?

Thank you :)

EDIT 1: Thanks to doqtor's comment, I realized that when I comment out parts of the kernel, the kernel launch becomes asynchronous. The result is then incorrect, but it gives me a hint for working out the answer.
