How to check the version of NCCL


I access high-performance computing nodes remotely. I am not sure whether the NVIDIA Collective Communications Library (NCCL) is installed in my directory. Is there a way to check whether NCCL is installed?


There are 2 best solutions below


You can try

locate nccl | grep "libnccl.so" | tail -n1 | sed -r 's/^.*\.so\.//'

or if you use PyTorch:

python -c "import torch;print(torch.cuda.nccl.version())"

See also this link: "Command Cheatsheet: Checking Versions of Installed Software / Libraries / Tools for Deep Learning on Ubuntu".

In containers, where locate is sometimes unavailable, you can replace it with ldconfig -v:

ldconfig -v | grep "libnccl.so" | tail -n1 | sed -r 's/^.*\.so\.//'
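
The same filename-based check can be sketched in Python, which may be handy when neither locate nor sed is convenient. This is a minimal sketch, not a definitive tool: the helper names nccl_versions_in and installed_nccl_versions are hypothetical, and it assumes that ldconfig is on the PATH and that the library files are named libnccl.so.X.Y.Z, as the pipelines above also assume.

```python
import re
import subprocess

# Matches e.g. "libnccl.so.2.18.3" and captures the suffix "2.18.3".
_NCCL_RE = re.compile(r"libnccl\.so\.(\d+(?:\.\d+)*)")


def nccl_versions_in(text):
    """Return the de-duplicated version suffixes found in text, lowest first."""
    found = set(_NCCL_RE.findall(text))
    return sorted(found, key=lambda v: [int(part) for part in v.split(".")])


def installed_nccl_versions():
    """Scan the output of `ldconfig -v`; return [] if ldconfig is missing."""
    try:
        result = subprocess.run(
            ["ldconfig", "-v"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        # Assumption: ldconfig may not be on the PATH for unprivileged users.
        return []
    return nccl_versions_in(result.stdout)


if __name__ == "__main__":
    versions = installed_nccl_versions()
    print(versions if versions else "no libnccl.so found by ldconfig")
```

An empty result only means that the dynamic linker does not know about NCCL; a copy bundled inside a Python package or installed under a private prefix would not show up here.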

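You can check the CUDA toolkit from the command line. Note that this prints the version of the CUDA compiler, not the version of NCCL:

nvcc --version

If nvcc is not found, you might have to install the toolkit first:

sudo apt install nvidia-cuda-toolkit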


As the other answerer mentioned, you can do:

torch.cuda.nccl.version()

in PyTorch. Copy and paste this into your terminal:

python -c "import torch;print(torch.cuda.nccl.version())"
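
Depending on the PyTorch release, torch.cuda.nccl.version() may return a tuple such as (2, 18, 3) or a single integer such as 2708. The sketch below normalizes both forms to an "X.Y.Z" string. The helper names format_nccl_version and torch_nccl_version are hypothetical, and the integer decoding assumes the encoding used by the NCCL_VERSION macro in nccl.h (major*1000 + minor*100 + patch up to 2.8, major*10000 + minor*100 + patch afterwards).

```python
def format_nccl_version(v):
    """Turn (2, 18, 3) or an integer code such as 2708 into 'X.Y.Z'."""
    if isinstance(v, tuple):
        return ".".join(str(part) for part in v)
    # Assumption: integer codes follow the NCCL_VERSION macro from nccl.h.
    code = int(v)
    if code >= 10000:
        major, rest = divmod(code, 10000)
    else:
        major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return f"{major}.{minor}.{patch}"


def torch_nccl_version():
    """Return the NCCL version PyTorch reports, or None if it is unavailable."""
    try:
        import torch

        return format_nccl_version(torch.cuda.nccl.version())
    except Exception:
        # Covers a missing torch package and builds without NCCL support.
        return None


if __name__ == "__main__":
    print(torch_nccl_version() or "PyTorch did not report an NCCL version")
```

This reports the NCCL version that PyTorch was built with, which can differ from a system-wide libnccl.so.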

There is probably something similar in TensorFlow.