CNTK on Azure Data Science VM


I have an N-Series Azure VM (the Data Science VM) with a Tesla K80 GPU. According to the NVIDIA scanner, my GPU driver is up to date. When I run my CNTK BrainScript, it reports "No GPUs found" and runs in CPU mode. What can I do to troubleshoot?

requestnodes [MPIWrapper]: using 1 out of 1 MPI nodes on a single host (1 requested); we (0) are in (participating)
-------------------------------------------------------------------
Build info:

            Built time: Dec 22 2016 01:43:24
            Last modified date: Thu Dec 22 01:35:04 2016
            Build type: Release
            Build target: GPU
            With 1bit-SGD: yes
            With ASGD: yes
            Math lib: mkl
            CUDA_PATH: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0
            CUB_PATH: c:\src\cub-1.4.1
            CUDNN_PATH: C:\local\cudnn-8.0-windows10-x64-v5.1
            Build Branch: HEAD
            Build SHA1: 8e8b5ff92eff4647be5d41a5a515956907567126
            Built by svcphil on DPHAIM-24
            Build Path: C:\jenkins\workspace\CNTK-Build-Windows\Source\CNTK\

-------------------------------------------------------------------
No GPUs found

Edit: here is the output from nvidia-smi.exe:

C:\Program Files\NVIDIA Corporation\NVSMI>.\nvidia-smi.exe
Fri Jan 13 19:00:43 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 369.30                 Driver Version: 369.30                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           TCC  | 0BD1:00:00.0     Off |                  Off |
| N/A   43C    P8    27W / 149W |      0MiB / 12189MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           TCC  | 5871:00:00.0     Off |                  Off |
| N/A   35C    P8    34W / 149W |      0MiB / 12189MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Best answer

By default, the Windows Data Science VM does not come with the GPU drivers, CUDA, and related components. We do have an extension called "Deep Learning toolkit for DSVM" that adds the drivers, CUDA, and the GPU editions of deep learning software such as CNTK, TensorFlow, and MXNet.

More Info: http://aka.ms/dsvm/deeplearning

We also recently released an Ubuntu version of the DSVM with CUDA, GPU drivers, and several more deep learning tools built in. It can be deployed on either GPU VMs or CPU-only VMs on Azure.

Answer

Could you run the Python notebooks and check whether they work with the device set to gpu(id)? Alternatively, from an activated CNTK Python environment, you could try setting the default device explicitly:

import cntk as C
# Ask CNTK to use GPU 0 as the default device
C.device.set_default_device(C.device.gpu(0))

This might give you some clues as to whether this is a BrainScript-specific issue.

Answer

Well, the Python script and the BrainScript both work now, after installing CUDA (I installed it to run nvidia-smi). I should not have assumed that the Azure Data Science image (which only works with an N-Series VM) has the necessary NVIDIA libraries pre-installed. :-)
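
For anyone hitting the same "No GPUs found" message, here is a minimal sketch of a pre-flight check for the situation described in this thread (a GPU VM without CUDA installed). It uses only the Python standard library. The function name `cuda_install_report` is hypothetical, and the check rests on two assumptions: that the CUDA installer sets the `CUDA_PATH` environment variable, and that `nvidia-smi` may or may not be on `PATH` (in the question it was run from its own NVSMI folder, so a negative result there is not conclusive).

```python
import os
import shutil


def cuda_install_report(env=None):
    """Collect hints about whether CUDA and the NVIDIA tools look installed.

    `env` is a mapping of environment variables; it defaults to os.environ.
    This is a heuristic, not a definitive test of what CNTK can load.
    """
    env = os.environ if env is None else env
    cuda_path = env.get("CUDA_PATH")  # assumption: set by the CUDA installer
    return {
        "cuda_path": cuda_path,
        # True only if CUDA_PATH is set and points at an existing directory
        "cuda_path_exists": bool(cuda_path) and os.path.isdir(cuda_path),
        # nvidia-smi is often not on PATH on Windows, so False is only a hint
        "nvidia_smi_on_path": shutil.which("nvidia-smi", path=env.get("PATH", "")) is not None,
    }


if __name__ == "__main__":
    for key, value in cuda_install_report().items():
        print(f"{key}: {value}")
```

If `cuda_path_exists` is False on a GPU VM, installing CUDA (or the Deep Learning toolkit extension mentioned in the best answer) is probably the first thing to try.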