Why am I getting multiple OpenCL 'binaries' when I built my program for one device?

I'm building an OpenCL program using the OpenCL library from NVIDIA CUDA 11.2 (and its C++ bindings). After successfully invoking cl::Program::build() for a single device (passing a vector containing just that one device), I query the sizes of the generated "binaries" with built_program.getInfo<CL_PROGRAM_BINARY_SIZES>(). That call also succeeds, but it gives me 3 values: one non-zero value and two zeros. When I print the first binary, I see the PTX code I expect.

My question: Why am I given two (empty) extra binaries?

There is 1 best solution below

Even though the program is built only for the devices you specify (see the documentation for clBuildProgram), the binary query reports one entry for each device associated with the program, which is typically every device in the context. In your case, you probably have three GPUs on your system; you built the program for just one of them, so only that device has a non-empty binary (the PTX), and the other two report a size of zero.

Confusing? Sure. Convoluted? Yes. But is it entirely senseless? Admittedly, not really.

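Digging around a bit further, it seems this is even officially documented, in the description of CL_PROGRAM_BINARY_SIZES for clGetProgramInfo (note the phrases "for each device associated with program" and "a size of zero is returned"):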

Returns an array that contains the size in bytes of the program binary (could be an executable binary, compiled binary or library binary) for each device associated with program. The size of the array is the number of devices associated with program. If a binary is not available for a device(s), a size of zero is returned.

So: not every device for which you built, but every device associated with the program, which is probably every device in the OpenCL context with which you created the program.
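
To make the "one entry per associated device" behavior concrete, here is a minimal sketch of how the zero-sized entries could be filtered out. It deliberately avoids the OpenCL headers so it stands on its own: the helper devices_with_binaries is hypothetical (not part of OpenCL or its C++ bindings), device_names is a stand-in for whatever you extract from built_program.getInfo<CL_PROGRAM_DEVICES>(), and binary_sizes is a stand-in for the result of built_program.getInfo<CL_PROGRAM_BINARY_SIZES>(). It relies on the two queries listing devices in the same order, which the OpenCL specification describes as a one-to-one correspondence.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of OpenCL): pair each device with its binary
// size and keep only the devices that actually have a binary.
// `device_names` stands in for data derived from the CL_PROGRAM_DEVICES query;
// `binary_sizes` stands in for the result of the CL_PROGRAM_BINARY_SIZES query.
std::vector<std::pair<std::string, std::size_t>>
devices_with_binaries(const std::vector<std::string>& device_names,
                      const std::vector<std::size_t>& binary_sizes)
{
    // Both queries report one entry per device associated with the program,
    // so the two vectors are expected to have the same length.
    assert(device_names.size() == binary_sizes.size());

    std::vector<std::pair<std::string, std::size_t>> result;
    for (std::size_t i = 0; i < binary_sizes.size(); ++i) {
        if (binary_sizes[i] != 0) {  // a size of zero means "no binary for this device"
            result.emplace_back(device_names[i], binary_sizes[i]);
        }
    }
    return result;
}
```

Given three associated devices of which only the first was built for, this would return a single pair: the first device's name and the size of its PTX.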