boost::compute, passing pointer to a closure


Good evening! I am writing a high-performance application and trying to use Boost.Compute to speed up complex computations.

The essence of my question: is there a way to pass an external pointer to an array (like float4_ *) to a BOOST_COMPUTE_CLOSURE? I'd like to write something like:

float4_ *normals = new float4_[NORMALS_NO];
BOOST_COMPUTE_CLOSURE(void, evalNormals, (int4_ indices), (normals), {
    ...
});

There are 2 best solutions below


The documentation of BOOST_COMPUTE_CLOSURE is slightly sparse, as reported here by the library author, but some test cases show how to capture vectors and arrays. Capturing them works transparently, the same way as with scalars.

For instance, capturing vec:

int data[] = {6, 7, 8, 9};
compute::vector<int> vec(data, data + 4, queue);

BOOST_COMPUTE_CLOSURE(int, get_vec, (int i), (vec), { return vec[i]; });

// run using a counting iterator to copy from vec to output
compute::vector<int> output(4, context);
compute::transform(
    compute::make_counting_iterator(0),
    compute::make_counting_iterator(4),
    output.begin(),
    get_vec,
    queue);
CHECK_RANGE_EQUAL(int, 4, output, (6, 7, 8, 9));
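If it helps to see what that transform computes, here is a host-only analogue in plain C++. It is just a sketch and does not use Boost.Compute or OpenCL at all: the lambda plays the role of the closure, and a std::iota-filled index vector plays the role of the counting iterators. The function name gather_by_index is made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Host-only analogue of the device example: output[k] = vec[k] for k in [0, n).
// gather_by_index is a hypothetical helper name, not part of any library.
std::vector<int> gather_by_index(const std::vector<int>& vec, std::size_t n)
{
    assert(n <= vec.size());

    // Stand-in for make_counting_iterator(0) .. make_counting_iterator(n).
    std::vector<std::size_t> indices(n);
    std::iota(indices.begin(), indices.end(), std::size_t(0));

    // Stand-in for the closure: captures vec and reads it by index.
    auto get_vec = [&vec](std::size_t i) { return vec[i]; };

    std::vector<int> output(n);
    std::transform(indices.begin(), indices.end(), output.begin(), get_vec);
    return output;
}
```

The difference on the device side is that the captured vec is a compute::vector living in device memory, and the body of the closure is compiled as OpenCL C rather than C++.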

Okay, I have finally found out how to do what I described. First, pass a boost::compute::detail::device_ptr<float4_> instance to the closure. Next, declare a type-name generator for the OpenCL backend and an operator<< that writes the pointer information into the meta_kernel instance, which the closure definition uses behind the scenes. So, the code:

1) Passing device_ptr instance

...
#include <boost/compute/detail/device_ptr.hpp>
...
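float4_ *normalsData = new float4_[NORMALS_NO];
// device_ptr wraps a device buffer (plus an offset), not a host pointer, so the
// host array is copied into a buffer first; context and queue are assumed to exist
boost::compute::buffer normalsBuf(context, NORMALS_NO * sizeof(float4_));
queue.enqueue_write_buffer(normalsBuf, 0, NORMALS_NO * sizeof(float4_), normalsData);
device_ptr<float4_> normalsDataDP(normalsBuf);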
...
BOOST_COMPUTE_CLOSURE(void, evalNormals, (int4_ indices), (normalsDataDP), {
    ...
});
...

2) Implement typename generator:

...
namespace boost {
    namespace compute {
        template<>
        inline const char *type_name<detail::device_ptr<float4_>>()
        {
            return "__global float4 *";
        }
    }
}
...
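Why step 2 is needed, as far as I understand it: the closure generates OpenCL source text, so every captured C++ type needs an OpenCL spelling. The sketch below models that lookup with stand-ins only. cl_type_name, device_ptr_stub, float4_host and make_parameter are all hypothetical names invented for this illustration and are not Boost.Compute's API.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins, NOT Boost.Compute types.
struct float4_host { float x, y, z, w; };
template<class T> struct device_ptr_stub {};

// Primary template is declared but not defined, so a type without a
// specialization fails at link time instead of silently getting a wrong name.
template<class T> const char *cl_type_name();

template<> inline const char *cl_type_name<int>()
{
    return "int";
}

// Same idea as the type_name specialization in step 2.
template<> inline const char *cl_type_name<device_ptr_stub<float4_host>>()
{
    return "__global float4 *";
}

// Builds one kernel parameter declaration, e.g. "int count".
template<class T>
std::string make_parameter(const std::string &name)
{
    assert(!name.empty());
    return std::string(cl_type_name<T>()) + " " + name;
}
```

With these stand-ins, make_parameter<device_ptr_stub<float4_host>>("normals") yields the text "__global float4 * normals", which is the kind of declaration the generated kernel needs for a captured pointer.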

3) Implement operator<<

...
namespace boost {
    namespace compute {
        namespace detail {
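            // Registers the pointer's buffer as a kernel argument and streams
            // the generated argument name into the kernel source.
            // inline avoids multiple-definition errors if this lives in a header.
            inline meta_kernel &operator<<(meta_kernel &kern,
                                           const device_ptr<float4_> &ptr)
            {
                std::string buffer_name = kern.get_buffer_identifier<float4_>(ptr.get_buffer());
                kern << kern.var<float4_ *>(buffer_name);
                return kern;
            }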
        }
    }
}
...
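To make step 3 less magical, here is a simplified model of what streaming a pointer into a kernel builder does: the buffer gets registered as a kernel argument, and the argument's generated name is written into the source text. Everything here is a stand-in I made up for illustration (kernel_source_stub, buffer_stub, device_ptr_stub, the "_buf" naming); the real meta_kernel may differ in details.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a device buffer handle, NOT boost::compute::buffer.
struct buffer_stub { int id; };

// Hypothetical stand-in for meta_kernel: accumulates source text and
// remembers which buffers must be passed as kernel arguments.
class kernel_source_stub {
public:
    // Returns the argument name for a buffer, registering it on first use.
    std::string get_buffer_identifier(const buffer_stub &b)
    {
        for (std::size_t i = 0; i < buffers_.size(); ++i) {
            if (buffers_[i] == b.id) {
                return "_buf" + std::to_string(i);
            }
        }
        buffers_.push_back(b.id);
        return "_buf" + std::to_string(buffers_.size() - 1);
    }

    // Appends raw text to the kernel source.
    kernel_source_stub &operator<<(const std::string &text)
    {
        source_ += text;
        return *this;
    }

    const std::string &source() const { return source_; }
    std::size_t argument_count() const { return buffers_.size(); }

private:
    std::string source_;
    std::vector<int> buffers_;
};

// Hypothetical stand-in for device_ptr<T>.
struct device_ptr_stub { buffer_stub buf; };

// Same shape as the operator<< in step 3: look up the buffer's
// argument name and stream that name into the source.
inline kernel_source_stub &operator<<(kernel_source_stub &k,
                                      const device_ptr_stub &p)
{
    assert(p.buf.id >= 0);
    k << k.get_buffer_identifier(p.buf);
    return k;
}
```

Streaming "float4 n = ", then a device_ptr_stub, then "[i];" leaves the text "float4 n = _buf0[i];" in the builder and one registered argument, which is roughly what the closure needs in order to refer to the captured pointer by name inside the generated kernel.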