I have a question about the BLAS and cublas interface and how to do a particular matrix-vector multiplication.
I have a call that I currently do with cblas_zgemv
and gives the correct result.
Basically I have a complex 3x3 matrix A
and a complex vector v
(3 components), in c-ordering all contiguous, and I want to multiply the hermitian-conjugate (conjugate and transpose) of the matrix to the vector.
This call can be made with cblas by calling
std::complex<double> const alpha=1.0;
std::complex<double> const beta=0.0;
cblas_zgemv(CblasRowMajor, CblasConjTrans, 3 /*A size*/, 3/*A size*/, &alpha, A_pointer, 3 /*A stride*/, v_pointer, 1, &beta, result_pointer, 1 /*result_stride*/)
It looks as if this cannot be implemented by the BLAS interface. http://www.netlib.org/lapack/explore-html/dc/dc1/group__complex16__blas__level2_gafaeb2abd9fffa7442b938dc384aeaf47.html#gafaeb2abd9fffa7442b938dc384aeaf47
The reason seems to be that BLAS doesn't have the "conjugate-only" option in the "trans" argument.
The equivalent call in BLAS (not cblas) would be:
gemv( trans , 3, 3, alpha, A_pointer, 3, v_pointer, 1, beta, result_pointer, 1);
The trans
options are only N
(no-transpose, i.e. tranpose in C-ordering), T
(tranpose, i.e no-transpose in C), C
(conjugate-transpose, only-conjugate in C).
By symmetry, there another combination missing, something like CO
("conjugate-only" --without transpose--, which would correspond to conjugate-tranpose in C-ordering).
So, it seems that there is an inconvenient hole in BLAS for complex elements. What is worst, this limitation seems to propagate to CUDA BLAS (cuBLAS) which has the same interface as BLAS basically. Making a cuBLAS call would be my final goal. https://docs.nvidia.com/cuda/cublas/index.html#cublas-lt-t-gt-gemv
Like in BLAS, cuBLAS has only three options (via enum) for the transposition option: CUBLAS_OP_N
, CUBLAS_OP_T
, CUBLAS_OP_H
.
Am I missing something obvious? Is this a known limitation and there is a well known workaround?
NOTE: I know that perhaps this particular case can be implemented with GEMM in a case with dimension 1xn
, but then the same argument can be made about GEMM, it lacks the the "conjugate-only" option.
There is no direct way to do conjugate only with standard BLAS API.
But there is a easy way to do Conjugate only. When you have a row major matrix, this can be done by setting
CblasConjTrans
and usingCblasColMajor
instead ofCblasRowMajor
and vice versa for col major matrix. To summariseIf you want a direct way, try using BLIS library.