I have tried this code on two different systems:
using System;
using System.Collections.Generic;
using System.Linq;
using Cloo;

namespace MinimalExample
{
    class Program
    {
        static void Main(string[] args)
        {
            var input = Enumerable.Range(0, 10).ToArray();
            var output = new int[input.Length];

            // Use the first platform found and create a context over all of its devices.
            var platform = ComputePlatform.Platforms.First();
            var context = new ComputeContext(
                platform.Devices,
                new ComputeContextPropertyList(platform),
                null,
                IntPtr.Zero
            );
            var queue = new ComputeCommandQueue(
                context,
                platform.Devices.First(),
                ComputeCommandQueueFlags.None
            );

            // The kernel copies each element of a into b.
            var program = new ComputeProgram(
                context,
                "void kernel some_test(constant int* a, global int* b) { " +
                "    int i = get_global_id(0); " +
                "    b[i] = a[i]; " +
                "}");
            program.Build(null, string.Empty, null, IntPtr.Zero);

            using (var kernel = program.CreateKernel("some_test"))
            using (var inBuff = new ComputeBuffer<int>(context, ComputeMemoryFlags.ReadOnly | ComputeMemoryFlags.UseHostPointer, input))
            using (var outBuff = new ComputeBuffer<int>(context, ComputeMemoryFlags.WriteOnly | ComputeMemoryFlags.UseHostPointer, output))
            {
                kernel.SetMemoryArgument(0, inBuff);
                kernel.SetMemoryArgument(1, outBuff);
                var events = new List<ComputeEventBase>();
                queue.Execute(kernel, null, new long[] { input.Length }, null, events);
                queue.Finish();
            }

            // If the kernel ran and the result reached host memory, output is 0..9.
            if (output.All(x => x == 0)) throw new Exception("Output buffer not written.");
        }
    }
}
On my desktop system, this program does not throw (output has the values you would expect from reading the kernel code). This system has an AMD GPU with OpenCL 2.1 support.
On my laptop, this program throws at the last line. This means that the kernel did not run the way I intended, or at all. This system has an Intel CPU with OpenCL 2.1 support and an Nvidia GPU with OpenCL 1.2 support, and they all exhibit the same behavior. OpenCL samples downloaded from Nvidia seem to self-test and pass normally.
Cloo checks for non-success return codes and throws if an error is returned anywhere. Just in case, I also inspected the codes that OpenCL returns to Cloo directly, and they all indicate "success".
It doesn't seem to matter which combinations of flags and address spaces I try: one system never throws, and the other always throws.
What am I missing here?