OpenMP target offloading matrix multiplication compilation error


I am trying to implement a simple matrix multiplication of two n×n matrices using OpenMP target offloading. The code is taken from here:

template<typename T>
void multiplyJIK(T *A, T *B, T *C, uint64_t size) {

    #pragma omp target data device(0) map(to: A[0:size * size], B[0:size * size], size) map(tofrom: C[0:size * size])
    {
        #pragma omp target teams device(0) num_teams(32768) thread_limit(512) \
            map(to: A[0:size*size], B[0:size * size], size) map(tofrom: C[0:size * size]) \
            default(none) shared(A, B, C, size)

        #pragma omp distribute parallel for num_threads(512) dist_schedule(static, 512) \
            default(none) shared(A, B, C, size)
    
        for (uint64_t j = 0; j < size; ++j) {
            for (uint64_t i = 0; i < size; ++i) {
                for (uint64_t k = 0; k < size; ++k) {
                    C[i * size + j] += A[i * size + k] * B[k * size + j];
                }
            }
        }
    }
}

It should multiply the two matrices A and B and store the result in C. The matrices are represented as one-dimensional arrays of length size * size.

For my test, T is float. I compile the code with the nvhpc toolkit using nvc++ -std=c++17 -mp=gpu -target=gpu main.cpp -o matmul and get this error:

error: item must appear in a SHARED or PRIVATE clause:
                          C[i * size + j] += A[i * size + k] * B[k * size + j];
                          ^
       detected during instantiation of "void Target::multiplyJIK(T *, T *, T *, uint64_t) [with T=float]"

I don't understand this error, since the C array should be correctly mapped (map(tofrom: C...)) and is listed in the shared(...) clause. Am I missing something in the code, or is this a problem with the compile flags?
