I've been looking for a solution to this for days, but I'm at a loss.
I'm running OpenCL kernels on a Galaxy S6, and everything works fine except when I use the vload16() and vstore16() functions in my kernel code.
When I run the same kernel on my Mac's CPU using the Mali SDK, everything runs fine. I compared the header files in the Mali SDK with those in my Android project, and the only difference seems to be that the Mali SDK on my laptop has a header file named cl_d3d10.h, whereas the Android project doesn't.
Does anybody have any suggestions for getting vload16() and vstore16() to work on Android? Loading vectors manually is fine when they have 4 components, but loading a uchar16 vector component by component is clearly inefficient.
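
To be clear about what I expect from these functions, here is a minimal sketch in plain host-side C, not my actual kernel. The `uchar16_ref` type and the `ref_*` function names are hypothetical stand-ins I made up for illustration. As I understand the OpenCL spec, `vload16(offset, p)` reads 16 elements starting at `p + offset * 16`, and `vstore16(data, offset, p)` writes 16 elements to that same location, so the offset counts whole vectors rather than single elements.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host-side stand-in for the OpenCL uchar16 type. */
typedef struct { unsigned char s[16]; } uchar16_ref;

/* Mirrors vload16(offset, p): reads 16 elements starting at p + offset * 16. */
static uchar16_ref ref_vload16(size_t offset, const unsigned char *p) {
    uchar16_ref v;
    assert(p != NULL);
    memcpy(v.s, p + offset * 16, 16);
    return v;
}

/* Mirrors vstore16(data, offset, p): writes 16 elements starting at p + offset * 16. */
static void ref_vstore16(uchar16_ref data, size_t offset, unsigned char *p) {
    assert(p != NULL);
    memcpy(p + offset * 16, data.s, 16);
}

/* The component-by-component fallback: same result, one element at a time. */
static uchar16_ref ref_manual_load(size_t offset, const unsigned char *p) {
    uchar16_ref v;
    assert(p != NULL);
    for (int i = 0; i < 16; ++i) {
        v.s[i] = p[offset * 16 + i];
    }
    return v;
}
```

Both load paths should produce identical vectors; the manual version is what I'm currently forced to write in the kernel for uchar16.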