Measure custom kernel execution time in ViennaCL

I have a custom kernel that runs through ViennaCL with an OpenCL backend. I know how to benchmark ViennaCL in general (this is covered in the docs), and I know how to measure the execution time of an OpenCL kernel with events when it is launched directly (covered both in the OpenCL documentation and in plenty of examples online). What I can't figure out is how to combine the two.

Consider this example:

const char * kernel = ...; // some kernel text
viennacl::ocl::program &testProg = viennacl::ocl::current_context().add_program(kernel, "kernel");
testProg.add_kernel("TestKernel");

viennacl::ocl::kernel &TestKernel = testProg.get_kernel("TestKernel");

// provide kernel arguments, set local and global worker sizes

// START TIMING
viennacl::ocl::enqueue(TestKernel);
viennacl::ocl::get_queue().finish();
// END TIMING

So far, the best I have come up with is using Boost timers to measure the total time ViennaCL takes to send the data to the device over PCI Express, enqueue the kernel, and finish execution. This is acceptable, since the data is rather large and what I'm benchmarking depends heavily on transfer speed, but I'd also like to know what fraction of that total is spent actually executing the kernel.
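To make the wall-clock approach concrete, here is a rough sketch of how the total could be split into a transfer phase and a kernel phase. It uses std::chrono instead of Boost timers (a substitution on my part), and the names elapsed_ms, PhaseTimes, time_phases and kernel_fraction are hypothetical helpers, not part of ViennaCL or OpenCL. It only measures host-side time between synchronization points, so it is not a replacement for event-based profiling.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper (not a ViennaCL API): runs `work` once and returns
// the elapsed wall-clock time in milliseconds, using a monotonic clock.
inline double elapsed_ms(const std::function<void()> &work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Wall-clock time of the two phases, in milliseconds.
struct PhaseTimes {
    double transfer_ms;
    double kernel_ms;
};

// Times the two phases one after the other. Each callable is assumed to
// block until its work is really done (e.g. by ending with a queue finish),
// otherwise the second measurement absorbs work left over from the first.
inline PhaseTimes time_phases(const std::function<void()> &transfer,
                              const std::function<void()> &kernel) {
    PhaseTimes t;
    t.transfer_ms = elapsed_ms(transfer);
    t.kernel_ms = elapsed_ms(kernel);
    return t;
}

// Fraction of the combined time spent in the kernel phase (0 if nothing ran).
inline double kernel_fraction(const PhaseTimes &t) {
    const double total = t.transfer_ms + t.kernel_ms;
    return total > 0.0 ? t.kernel_ms / total : 0.0;
}
```

With the example above, the transfer callable would do the host-to-device copy followed by viennacl::ocl::get_queue().finish(), and the kernel callable would call viennacl::ocl::enqueue(TestKernel) followed by viennacl::ocl::get_queue().finish(). The kernel figure obtained this way still includes enqueue and launch overhead, so it is an upper bound on the pure device execution time; averaging over several runs should reduce the noise.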

This is an academic project, so accurate measurements could make or break my case.
