CUDA_ERROR_INVALID_SOURCE: device kernel image is invalid

325 Views Asked by At

I'm trying using cupy in my docker container. I use to containers which one is for CUDA and cuDNN, and the other is for cupy.

I tried this code.

import cupy as cp

cupy_array = cp.array([1, 2, 3])
cupy_result = cupy_array + 5 
print("CuPy Result:", cupy_result)

The full error log is like

Traceback (most recent call last):

  File "<stdin>", line 1, in <module>
  File "cupy/_core/core.pyx", line 1191, in cupy._core.core.ndarray.__add__
  File "cupy/_core/core.pyx", line 1591, in cupy._core.core.ndarray.__array_ufunc__
  File "cupy/_core/_kernel.pyx", line 1292, in cupy._core._kernel.ufunc.__call__
  File "cupy/_core/_kernel.pyx", line 1319, in cupy._core._kernel.ufunc._get_ufunc_kernel
  File "cupy/_core/_kernel.pyx", line 1025, in cupy._core._kernel._get_ufunc_kernel
  File "cupy/_core/_kernel.pyx", line 72, in cupy._core._kernel._get_simple_elementwise_kernel
  File "cupy/_core/core.pyx", line 2141, in cupy._core.core.compile_with_cache
  File "/usr/local/lib/python3.8/dist-packages/cupy/cuda/compiler.py", line 492, in _compile_module_with_cache
    return _compile_with_cache_cuda(
  File "/usr/local/lib/python3.8/dist-packages/cupy/cuda/compiler.py", line 614, in _compile_with_cache_cuda
mod.load(cubin)
  File "cupy/cuda/function.pyx", line 264, in cupy.cuda.function.Module.load
  File "cupy/cuda/function.pyx", line 266, in cupy.cuda.function.Module.load
  File "cupy_backends/cuda/api/driver.pyx", line 210, in cupy_backends.cuda.api.driver.moduleLoadData
  File "cupy_backends/cuda/api/driver.pyx", line 60, in cupy_backends.cuda.api.driver.check_status
cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_INVALID_SOURCE: device kernel image is invalid

The result of nvidia-smi

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4080        Off | 00000000:01:00.0  On |                  N/A |
|  0%   32C    P8               6W / 320W |    483MiB / 16376MiB |      4%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                     
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

The result of nvcc -V

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0

The result of cat /usr/include/cudnn_version.h | grep CUDNN_MAJOR -A 2

#define CUDNN_MAJOR 8
#define CUDNN_MINOR 4
#define CUDNN_PATCHLEVEL 0
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#endif /* CUDNN_VERSION_H */

The result of pip3 freeze | grep cupy is cupy-cuda116==10.6.0

The results above are all shown in docker container for cupy.

I ran docker for CUDA and cuDNN with sudo docker run --name cuda11.6.1-cudnn8 --gpus all --runtime=nvidia -it \ --privileged --env="DISPLAY=:0:0" -v=/tmp/.X11-unix:/tmp/.X11-unix:ro \ -v=/home/youngjoo/Documents/Elevation_ws:/home/youngjoo/Documents/Elevation_ws \ -v=/dev:/dev -w=/home/youngjoo/Documents/Elevation_ws \ nvidia/cuda:11.6.1-cudnn8-devel-ubuntu20.04

My OS is Ubuntu 20.04.

Docker version is 24.0.7, build afdd53b.

How can I resolve this?

I deleted all docker containers and restarted but the result was the same.

0

There are 0 best solutions below