Using OpenMP "for simd" in matrix-vector multiplication?


I'm currently trying to get my matrix-vector multiplication function to compare favorably with BLAS by combining #pragma omp for with #pragma omp simd, but I'm seeing no speedup over using the for construct alone. How do I properly vectorize the inner loop with OpenMP's SIMD construct?

vector dot(const matrix& A, const vector& x)
{
  assert(A.shape(1) == x.size());

  vector y = xt::zeros<double>({A.shape(0)});

  int i, j;
#pragma omp parallel shared(A, x, y) private(i, j)
  {
#pragma omp for // schedule(static)
    for (i = 0; i < y.size(); i++) { // row major
#pragma omp simd
      for (j = 0; j < x.size(); j++) {
        y(i) += A(i, j) * x(j);
      }
    }
  }

  return y;
}
1 Answer
Your directive is incorrect because it would introduce a race condition (on y(i)). You should use a reduction in this case. Here is an example:

vector dot(const matrix& A, const vector& x)
{
  assert(A.shape(1) == x.size());

  vector y = xt::zeros<double>({A.shape(0)});

  int i, j;

  #pragma omp parallel shared(A, x, y) private(i, j)
  {
    #pragma omp for // schedule(static)
    for (i = 0; i < y.size(); i++) { // row major
      double sum = 0; // note: decltype(y(0)) would be a reference type here, so it cannot bind to 0

      #pragma omp simd reduction(+:sum)
      for (j = 0; j < x.size(); j++) {
        sum += A(i, j) * x(j);
      }

      y(i) += sum;
    }
  }

  return y;
}

Note that it may not necessarily be faster, because some compilers (ICC, for example) are able to vectorize this code automatically. GCC and Clang often fail to perform (advanced) SIMD reductions automatically, so such a directive helps them a bit. You can inspect the generated assembly to see how the code was vectorized, or enable vectorization reports (e.g. -fopt-info-vec for GCC).