NVIDIA Driver Not found during Nvidia + Cuda - Docker Image build

Question

NVIDIA Driver Not found during Nvidia + Cuda - Docker Image build

1.5k Views Asked by Vihar At 21 April 2021 at 14:28

I am trying to create a GPU microservice using Nvidia cuda Base image, but during the docker build, I am facing Driver not found issue, can someone point out what is missing here?

DockerFile:

FROM nvidia/cuda:10.1-devel
        
# Install some basic utilities
    RUN apt-get update && apt-get install -y \
        curl \
        ca-certificates \
        sudo \
        git \
        bzip2 \
        libx11-6 \
     && rm -rf /var/lib/apt/lists/*

ENV CONDA_AUTO_UPDATE_CONDA=false
ENV PATH=/home/user/miniconda/bin:$PATH
RUN curl -sLo ~/miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-py37_4.8.2-Linux-x86_64.sh \
 && chmod +x ~/miniconda.sh \
 && ~/miniconda.sh -b -p ~/miniconda \
 && rm ~/miniconda.sh \
 && conda install -y python==3.7 \
 && conda clean -ya

ENV PATH="/usr/local/cuda-10.1/bin:$PATH"
ENV LD_LIBRARY_PATH="/usr/local/cuda-10.1/lib64:$LD_LIBRARY_PATH"
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility
ENV NVIDIA_VISIBLE_DEVICES=all
ENV FORCE_CUDA="1"

RUN conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch

RUN pip install -v -e .

Error:

"/home/user/miniconda/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1013, in _get_cuda_arch_flags
capability = torch.cuda.get_device_capability()
File "/home/user/miniconda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 320, in get_device_capability
    prop = get_device_properties(device)

File "/home/user/miniconda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 325, in get_device_properties
  _lazy_init()  # will define _get_device_properties and _CudaDeviceProperties
File "/home/user/miniconda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 196, in _lazy_init
  _check_driver()

File "/home/user/miniconda/lib/python3.7/site-packages/torch/cuda/__init__.py", line 101, in _check_driver
  http://www.nvidia.com/Download/index.aspx""")
    AssertionError:
    Found no NVIDIA driver on your system. Please check that you
    have an NVIDIA GPU and installed a driver from
    http://www.nvidia.com/Download/index.aspx

The issues happens during execution of last step in docker file.

I tried using multiple Nvidia base docker images, but didn't help much. (cuda:10.1-base-ubuntu18.04, cuda:10.1-runtime-ubuntu18.04)

Any pointers appreciated.

Original Q&A

There are 1 best solutions below

**Vihar** · Answer 1 · 2022-09-09T14:30:03.983000

After lot of trial and errors and going through a lot of documentation, this is what worked fine.

ARG PYTORCH=1.3
ARG CUDA=10.1
ARG CUDNN=7

FROM pytorch/pytorch:1.3-cuda10.1-cudnn7-devel
RUN mkdir /app
WORKDIR /app
ENV TORCH_CUDA_ARCH_LIST="5.2 6.0 6.1 7.0+PTX"
ENV TORCH_NVCC_FLAGS="-Xfatbin -compress-all"
ENV CMAKE_PREFIX_PATH="$(dirname $(which conda))/../"

RUN apt-get update && apt-get install -y libglib2.0-0 libsm6 libxrender-dev libxext6 \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

# Install some basic utilities
RUN apt-get update && apt-get install -y \
    curl \
    ca-certificates \
    sudo \
    git \
    bzip2 \
    libx11-6 \
 && rm -rf /var/lib/apt/lists/*

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    build-essential g++ \
    libglib2.0-0 libsm6 libxrender-dev libxext6 wget

# Create a non-root user and switch to it
RUN adduser --disabled-password --gecos '' --shell /bin/bash user \
 && chown -R user:user /app
RUN echo "user ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/90-user
USER user

# All users can use /home/user as their home directory
ENV HOME=/home/user
RUN chmod 777 /home/user

# Install Miniconda and Python 3.7
ENV CONDA_AUTO_UPDATE_CONDA=false
ENV PATH=/home/user/miniconda/bin:$PATH
RUN curl -sLo ~/miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-py37_4.8.2-Linux-x86_64.sh \
 && chmod +x ~/miniconda.sh \
 && ~/miniconda.sh -b -p ~/miniconda \
 && rm ~/miniconda.sh \
 && conda install -y python==3.7 \
 && conda clean -ya

RUN conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
RUN pip install -v -e .

Hope this helps!

Good luck!

NVIDIA Driver Not found during Nvidia + Cuda - Docker Image build

There are 1 best solutions below

Related Questions in DOCKER

Related Questions in PYTORCH

Related Questions in NVIDIA-DOCKER

Trending Questions

Popular # Hahtags

Popular Questions