I'm using Container-Optimized OS (COS) to run an application that uses GPUs. A separate system creates VMs on demand to run this application (to minimize cost), and I've been trying to reduce the time it takes to get the application running.

To do this, I've started using a custom VM image, which at the moment is just the COS image with my application's Docker image pre-downloaded and saved to it. I would also like to pre-install the NVIDIA GPU drivers, but I can't get the installation to persist. I install the drivers, verify that they work, and create the image, yet when I create a new VM from that image it behaves as if the drivers were never installed, even though the files all appear to be present. I've tried running

sudo cos-extensions install gpu

in the startup script when creating the image, but instances created from my image return an error when I try to run nvidia-smi.

The mount and nvidia-smi commands I run on the new instance:

sudo mount --bind /var/lib/nvidia /var/lib/nvidia
sudo mount -o remount,exec /var/lib/nvidia
/var/lib/nvidia/bin/nvidia-smi

Error:

NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Display Driver is properly installed and present in your system.
Please also try adding directory that contains libnvidia-ml.so to your system PATH.

Despite this complaint, the libnvidia-ml.so file DOES exist at: /var/lib/nvidia/lib64

The contents of my /var/lib/nvidia directory are:

$ ls -lh /var/lib/nvidia/
total 354M
-rw-r--r-- 1 root root 354M Mar 10 23:12 NVIDIA-Linux-x86_64-470.141.03_101-17162-40-42.cos
drwxr-xr-x 2 root root 4.0K Mar 10 23:12 bin
drwxr-xr-x 3 root root 4.0K Mar 10 23:12 bin-workdir
drwxr-xr-x 2 root root 4.0K Mar 10 23:12 drivers
drwxr-xr-x 3 root root 4.0K Mar 10 23:12 drivers-workdir
drwxr-xr-x 3 root root 4.0K Mar 10 23:12 firmware
drwxr-xr-x 4 root root 4.0K Mar 10 23:12 lib64
drwxr-xr-x 3 root root 4.0K Mar 10 23:12 lib64-workdir
-rw-r--r-- 1 root root 2.2K Mar 10 23:12 nvidia-installer.log
-rw-r--r-- 1 root root 1.2K Mar 10 23:12 pubkey.der
drwxr-xr-x 3 root root 4.0K Mar 10 23:12 share
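
A minimal diagnostic sketch for narrowing this down, assuming the symptom comes from one of three things: the library is missing on disk, the dynamic linker cache does not list it, or the nvidia kernel module is not loaded on the new instance. The function names and the NVIDIA_DIR variable are made up for illustration, and this only reports state; it does not fix anything.

```shell
# Directory from the question; override NVIDIA_DIR to point elsewhere.
NVIDIA_DIR="${NVIDIA_DIR:-/var/lib/nvidia}"

# Succeeds if any libnvidia-ml.so* file exists in the given directory.
lib_on_disk() {
  ls "$1"/libnvidia-ml.so* >/dev/null 2>&1
}

# Succeeds if the dynamic linker cache lists libnvidia-ml.so.
lib_in_ld_cache() {
  ldconfig -p 2>/dev/null | grep -q 'libnvidia-ml\.so'
}

# Succeeds if a kernel module whose name starts with "nvidia" is loaded.
module_loaded() {
  grep -q '^nvidia' /proc/modules 2>/dev/null
}

# Runs the given check and prints "yes" or "no" instead of failing.
report() {
  if "$@"; then echo "yes"; else echo "no"; fi
}

echo "library on disk:      $(report lib_on_disk "$NVIDIA_DIR/lib64")"
echo "library in ld cache:  $(report lib_in_ld_cache)"
echo "kernel module loaded: $(report module_loaded)"
```

If the library is on disk but not in the linker cache, running LD_LIBRARY_PATH=/var/lib/nvidia/lib64 /var/lib/nvidia/bin/nvidia-smi may help tell whether the missing cache entry is the cause, although that alone would not help if the kernel module is also not loaded.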

Is there a way to create a custom image with the NVIDIA drivers pre-installed, so that they work on new VMs without reinstalling?
