I am trying to create a GPU microservice using Nvidia cuda Base image, but during the docker build, I am facing Driver not found issue, can someone point out what is missing here?
DockerFile:
FROM nvidia/cuda:10.1-devel
# Install some basic utilities
RUN apt-get update && apt-get install -y \
curl \
ca-certificates \
sudo \
git \
bzip2 \
libx11-6 \
&& rm -rf /var/lib/apt/lists/*
ENV CONDA_AUTO_UPDATE_CONDA=false
ENV PATH=/home/user/miniconda/bin:$PATH
RUN curl -sLo ~/miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-py37_4.8.2-Linux-x86_64.sh \
&& chmod +x ~/miniconda.sh \
&& ~/miniconda.sh -b -p ~/miniconda \
&& rm ~/miniconda.sh \
&& conda install -y python==3.7 \
&& conda clean -ya
ENV PATH="/usr/local/cuda-10.1/bin:$PATH"
ENV LD_LIBRARY_PATH="/usr/local/cuda-10.1/lib64:$LD_LIBRARY_PATH"
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility
ENV NVIDIA_VISIBLE_DEVICES=all
ENV FORCE_CUDA="1"
RUN conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
RUN pip install -v -e .
Error:
"/home/user/miniconda/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1013, in _get_cuda_arch_flags
capability = torch.cuda.get_device_capability()
File "/home/user/miniconda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 320, in get_device_capability
prop = get_device_properties(device)
File "/home/user/miniconda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 325, in get_device_properties
_lazy_init() # will define _get_device_properties and _CudaDeviceProperties
File "/home/user/miniconda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 196, in _lazy_init
_check_driver()
File "/home/user/miniconda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 101, in _check_driver
http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
The issues happens during execution of last step in docker file.
I tried using multiple Nvidia base docker images, but didn't help much. (cuda:10.1-base-ubuntu18.04, cuda:10.1-runtime-ubuntu18.04)
Any pointers appreciated.
After lot of trial and errors and going through a lot of documentation, this is what worked fine.
Hope this helps!
Good luck!