GEMV BLAS and cuBLAS missing the "conjugate-only" ordering option?

511 Views Asked by At

I have a question about the BLAS and cublas interface and how to do a particular matrix-vector multiplication.

I have a call that I currently do with cblas_zgemv and gives the correct result.

Basically I have a complex 3x3 matrix A and a complex vector v (3 components), in c-ordering all contiguous, and I want to multiply the hermitian-conjugate (conjugate and transpose) of the matrix to the vector. This call can be made with cblas by calling

std::complex<double> const alpha=1.0;
std::complex<double> const beta=0.0;
cblas_zgemv(CblasRowMajor, CblasConjTrans, 3 /*A size*/, 3/*A size*/, &alpha, A_pointer, 3 /*A stride*/, v_pointer, 1, &beta, result_pointer, 1 /*result_stride*/)

It looks as if this cannot be implemented by the BLAS interface. http://www.netlib.org/lapack/explore-html/dc/dc1/group__complex16__blas__level2_gafaeb2abd9fffa7442b938dc384aeaf47.html#gafaeb2abd9fffa7442b938dc384aeaf47

The reason seems to be that BLAS doesn't have the "conjugate-only" option in the "trans" argument.

The equivalent call in BLAS (not cblas) would be:

gemv( trans , 3, 3, alpha, A_pointer, 3, v_pointer, 1, beta, result_pointer, 1);

The trans options are only N (no-transpose, i.e. tranpose in C-ordering), T (tranpose, i.e no-transpose in C), C (conjugate-transpose, only-conjugate in C).

By symmetry, there another combination missing, something like CO ("conjugate-only" --without transpose--, which would correspond to conjugate-tranpose in C-ordering).

So, it seems that there is an inconvenient hole in BLAS for complex elements. What is worst, this limitation seems to propagate to CUDA BLAS (cuBLAS) which has the same interface as BLAS basically. Making a cuBLAS call would be my final goal. https://docs.nvidia.com/cuda/cublas/index.html#cublas-lt-t-gt-gemv

Like in BLAS, cuBLAS has only three options (via enum) for the transposition option: CUBLAS_OP_N, CUBLAS_OP_T, CUBLAS_OP_H.

Am I missing something obvious? Is this a known limitation and there is a well known workaround?


NOTE: I know that perhaps this particular case can be implemented with GEMM in a case with dimension 1xn, but then the same argument can be made about GEMM, it lacks the the "conjugate-only" option.

1

There are 1 best solutions below

1
On

There is no direct way to do conjugate only with standard BLAS API.

But there is a easy way to do Conjugate only. When you have a row major matrix, this can be done by setting CblasConjTrans and using CblasColMajor instead of CblasRowMajor and vice versa for col major matrix. To summarise

// Conj Transpose
cblas_zgemv(CblasRowMajor, CblasConjTrans, 3 /*A size*/, 3/*A size*/, &alpha, A_pointer, 3 /*A stride*/, v_pointer, 1, &beta, result_pointer, 1 /*result_stride*/)

// No transpose  and No Conj
cblas_zgemv(CblasRowMajor, CblasNoTrans, 3 /*A size*/, 3/*A size*/, &alpha, A_pointer, 3 /*A stride*/, v_pointer, 1, &beta, result_pointer, 1 /*result_stride*/)

// Only Trans
cblas_zgemv(CblasRowMajor, CblasTrans, 3 /*A size*/, 3/*A size*/, &alpha, A_pointer, 3 /*A stride*/, v_pointer, 1, &beta, result_pointer, 1 /*result_stride*/)

//Only conj()
cblas_zgemv(CblasColMajor, CblasConjTrans, 3 /*A size*/, 3/*A size*/, &alpha, A_pointer, 3 /*A stride*/, v_pointer, 1, &beta, result_pointer, 1 /*result_stride*/)

If you want a direct way, try using BLIS library.