I'm attempting to set up an interface to use cublas.lib in Fortran without any separate C code. I have seen a few examples of this and tried to duplicate them, but I am having trouble.
Both of these examples work for me (cudart and cusolver):
Find available graphics card memory using Fortran
https://forums.developer.nvidia.com/t/using-cusolverdn-in-fortran-code/39732/5
I have an additional include directory of C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2\lib\x64 and additional dependencies of cublas.lib cusolver.lib cudart.lib. Everything compiles fine (I was able to run the examples above).
When I run the code below, cublasCreate returns 7 (CUBLAS_STATUS_INVALID_VALUE):
!==================================================================
!Interface to cusolverDn and CUDA C functions
!==================================================================
! C binding
! https://gcc.gnu.org/onlinedocs/gfortran/ISO_005fC_005fBINDING.html
!
! Similar CUDA examples
! https://stackoverflow.com/questions/27507169/find-available-graphics-card-memory-using-fortran
! https://forums.developer.nvidia.com/t/using-cusolverdn-in-fortran-code/39732/5
! https://stackoverflow.com/questions/22390812/returning-a-pointer-to-a-device-allocated-matrix-from-c-to-fortran
! https://stackoverflow.com/questions/35150748/mixed-language-cuda-programming
module cudaThings
interface
! cudaMalloc
integer (c_int) function cudaMalloc ( buffer, size ) bind (C, name="cudaMalloc" )
use iso_c_binding
implicit none
type (c_ptr) :: buffer
integer (c_size_t), value :: size
end function cudaMalloc
! cudaMemcpy
! A_mem_stat = cudaMemcpy(gpuPtr,cpuPtr,sizeof(ptr),cudaMemcpyHostToDevice)
! note: cudaMemcpyHostToDevice = 1
! note: cudaMemcpyDeviceToHost = 2
integer (c_int) function cudaMemcpy ( dst, src, count, kind ) bind (C, name="cudaMemcpy" )
use iso_c_binding
type (C_PTR), value :: dst, src
integer (c_size_t), value :: count, kind
end function cudaMemcpy
! cudaFree
integer (c_int) function cudaFree(buffer) bind(C, name="cudaFree")
use iso_c_binding
implicit none
type (C_PTR), value :: buffer
end function cudaFree
! get memory info
integer (c_int) function cudaMemGetInfo(fre, tot) bind(C, name="cudaMemGetInfo")
use iso_c_binding
implicit none
type(c_ptr),value :: fre
type(c_ptr),value :: tot
end function cudaMemGetInfo
integer(c_int) function cusolverDnCreate(cusolver_Hndl) bind(C,name="cusolverDnCreate")
use iso_c_binding
implicit none
type(c_ptr)::cusolver_Hndl
end function
integer(c_int) function cusolverDnDestroy(cusolver_Hndl) bind(C,name="cusolverDnDestroy")
use iso_c_binding
implicit none
type(c_ptr),value::cusolver_Hndl
end function
integer(c_int) function cublasCreate(cublas_Hndl) bind(C,name="cublasCreate_v2")
use iso_c_binding
implicit none
type(c_ptr),value::cublas_Hndl
end function
integer(c_int) function cublasDestroy(cublas_Hndl) bind(C,name="cublasDestroy_v2")
use iso_c_binding
implicit none
type(c_ptr),value::cublas_Hndl
end function
end interface
end module
program cudaTest
use iso_c_binding
use cudaThings
implicit none
! GPU stuff
type(c_ptr) :: cublas_Hndl
integer*4 :: cublas_stat
! get handle
cublas_stat = cublasCreate(cublas_Hndl)
write(*,*) cublas_stat
if (cublas_stat .ne. 0 ) then
write (*, '(A, I2)') " cublasCreate error: ", cublas_stat
stop
end if
end program
I'm on Windows 10, Intel Fortran, CUDA 12.2, with a 930M graphics card.
To understand what is going on, it is worth analyzing how the underlying C code works before writing an interface to it.
In C, the canonical call declares a cublasHandle_t handle and then calls cublasCreate(&handle). This idiomatically passes the cublasHandle_t (itself a pointer to an opaque structure) by reference (even though C doesn't have explicit pass-by-reference semantics), so the routine can write the newly created handle into it. If you instead called cublasCreate(handle), you would be passing an uninitialized pointer to the routine, which should result in a failure.
I haven't done much work with F2003-style C interop, but to my eyes the declaration type(c_ptr),value::cublas_Hndl in your cublasCreate interface is the same as the non-working C version, whereas type(c_ptr)::cublas_Hndl (without the value attribute) would be like the working C version and more likely to work correctly.
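As a minimal sketch of that suggestion (not tested on the asker's setup), the interface below drops the value attribute from the cublasCreate argument so that Fortran passes the address of the c_ptr, which should correspond to the cublasHandle_t* parameter of cublasCreate_v2. cublasDestroy_v2 takes the handle itself, so value stays there. The module and program names (cublasThings, cublasHandleTest) are made up for illustration, and the code assumes the same cublas.lib linkage as in the question.

```fortran
! Sketch only: assumes the cuBLAS v2 entry points cublasCreate_v2 and
! cublasDestroy_v2 and linking against cublas.lib, as in the question.
module cublasThings
  implicit none
  interface
    ! C prototype: cublasStatus_t cublasCreate_v2(cublasHandle_t *handle);
    ! No VALUE attribute: Fortran passes the address of the c_ptr,
    ! matching the pointer-to-handle parameter on the C side.
    integer(c_int) function cublasCreate(handle) bind(C, name="cublasCreate_v2")
      use iso_c_binding, only: c_int, c_ptr
      implicit none
      type(c_ptr) :: handle
    end function cublasCreate

    ! C prototype: cublasStatus_t cublasDestroy_v2(cublasHandle_t handle);
    ! Here the handle is passed by value, so VALUE is correct.
    integer(c_int) function cublasDestroy(handle) bind(C, name="cublasDestroy_v2")
      use iso_c_binding, only: c_int, c_ptr
      implicit none
      type(c_ptr), value :: handle
    end function cublasDestroy
  end interface
end module cublasThings

program cublasHandleTest
  use iso_c_binding
  use cublasThings
  implicit none
  type(c_ptr) :: handle
  integer(c_int) :: stat

  ! Start from a null handle; cublasCreate fills it in on success.
  handle = c_null_ptr
  stat = cublasCreate(handle)
  if (stat /= 0) then
    write (*, '(A, I0)') " cublasCreate error: ", stat
    stop 1
  end if

  ! Release the handle once it is no longer needed.
  stat = cublasDestroy(handle)
  if (stat /= 0) then
    write (*, '(A, I0)') " cublasDestroy error: ", stat
    stop 1
  end if
  write (*, '(A)') " cublas handle created and destroyed"
end program cublasHandleTest
```

This follows the same pattern as the cusolverDnCreate and cusolverDnDestroy interfaces in the question, where the create routine takes the handle without value and the destroy routine takes it with value.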