So I'm writing a neural network library using Aparapi (which generates OpenCL from Java code). Anyway, there are many situations where I need to do complex index arithmetic to find the source/destination node for a given weight when doing forward passes and backpropagation.
In many cases this is a very simple 1D-to-2D formula, but in some cases, such as for convolutional nets, I need to do a somewhat more complex operation to find the index (often something like 3D-to-1D-to-3D).
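To make that concrete, here is a minimal sketch of the kind of index arithmetic I mean. The method names and the row-major layout convention are my own, not part of any library:

```java
// Hypothetical helpers illustrating flat-index <-> coordinate math.
// Assumes row-major layout: the last coordinate varies fastest.
final class IndexMath {
    // 2D (row, col) -> flat 1D index
    static int flatten2D(int row, int col, int width) {
        return row * width + col;
    }

    // 3D (channel, row, col) -> flat 1D index
    static int flatten3D(int c, int r, int col, int height, int width) {
        return (c * height + r) * width + col;
    }

    // flat 1D index -> 3D (channel, row, col); inverse of flatten3D
    static int[] unflatten3D(int idx, int height, int width) {
        int col = idx % width;
        int r = (idx / width) % height;
        int c = idx / (width * height);
        return new int[] { c, r, col };
    }
}
```

The 3D-to-1D-to-3D case is then just `unflatten3D(flatten3D(...))` with different shapes on each side.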
I have been sticking with algorithms to compute these indices. The alternative would be to simply store the source and destination indices for each weight in a constant int array. I have avoided this because it would almost double the memory usage.
I was wondering what the speed differences would be for computing indices vs reading them from a constant array? Am I losing speed in exchange for memory? Is the difference significant?
Computation is almost always faster on the GPU than a global memory access that produces the same result (like a look-up table). In particular, because the GPU keeps so many work-items "in flight" at once, the arithmetic for one group happens while another group is waiting on its memory accesses, so the math is effectively hidden behind the memory latency. So if your math is not too complex, prefer to compute rather than burn a global memory access.
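The two strategies can be sketched side by side for a fully connected layer. This is a plain-Java illustration (all names are mine, and the `% inputs` mapping is just one example of the 1D-to-2D formula mentioned in the question); in an Aparapi kernel the trade-off is the same, but `sourceTable` would live in constant/global device memory:

```java
// Sketch of the trade-off: compute the source-neuron index per weight,
// or read it from a precomputed table (costing extra memory and, on a
// GPU, an extra global/constant memory access per weight).
final class WeightIndexDemo {
    final int inputs;          // neuron count in the source layer
    final int[] sourceTable;   // precomputed source index per weight

    WeightIndexDemo(int inputs, int outputs) {
        this.inputs = inputs;
        this.sourceTable = new int[inputs * outputs];
        for (int w = 0; w < sourceTable.length; w++) {
            sourceTable[w] = w % inputs;   // filled once on the host
        }
    }

    // Strategy 1: cheap arithmetic in the kernel body, no extra memory
    int computedSource(int weightIndex) {
        return weightIndex % inputs;
    }

    // Strategy 2: one extra memory read per weight
    int lookedUpSource(int weightIndex) {
        return sourceTable[weightIndex];
    }
}
```

Both return identical indices; the question is only whether a few integer ops or a memory fetch is cheaper, and on a GPU the integer ops usually win.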