How to update ptxas (NVIDIA toolchain) on Google Compute Engine

I have a Debian-based GCE VM with an NVIDIA A100 40GB GPU, and the app I'm running on it complains:

external/org_tensorflow/tensorflow/compiler/xla/stream_executor/gpu/asm_compiler.cc:114]
*** WARNING *** You are using ptxas 11.0.221, which is older than 11.1.
ptxas before 11.1 is known to miscompile XLA code, leading to incorrect results or invalid-address errors.

I see the following NVIDIA-related packages installed:

$ apt list --installed | grep nvidia 
libnvidia-container-tools/buster,now 1.13.1-1 amd64 [installed,upgradable to: 1.13.5-1]
libnvidia-container1/buster,now 1.13.1-1 amd64 [installed,upgradable to: 1.13.5-1]
nvidia-container-toolkit-base/buster,now 1.13.1-1 amd64 [installed,upgradable to: 1.13.5-1]
nvidia-container-toolkit/buster,now 1.13.1-1 amd64 [installed,upgradable to: 1.13.5-1]
nvidia-docker2/buster,now 2.13.0-1 all [installed]

Is ptxas in one of the above packages?

How do I update ptxas, and how do I specify which version to update to?

There is 1 solution below

I'm not sure this is the "right" solution, but it appears that the newer version of ptxas requires a later version of the OS. I had been using the "default" Debian 10 image suggested when I originally created the Deep Learning GPU VM. I destroyed the original VM and recreated it using the following image:

Debian 11 based Deep Learning VM with M109 and CUDA 11.3

That at least got me the desired ptxas version.
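
For anyone who wants to check a VM before or after recreating it, here is a rough sketch that compares the ptxas found on PATH against the 11.1 minimum from the warning. The helper names `version_ge` and `parse_ptxas_version` are made up for this example, and the parsing assumes `ptxas --version` ends with a line like "Cuda compilation tools, release 11.0, V11.0.221". It only looks at PATH, so it may not see the copy the app actually uses.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: read `ptxas --version` output on stdin and print the
# full version, e.g. "11.0.221" from
# "Cuda compilation tools, release 11.0, V11.0.221" (assumed output format).
parse_ptxas_version() {
    sed -n 's/.*, V\([0-9][0-9.]*\).*/\1/p'
}

# Only query ptxas when it is actually on PATH.
if command -v ptxas >/dev/null 2>&1; then
    found=$(ptxas --version | parse_ptxas_version)
    if [ -z "$found" ]; then
        echo "could not parse the ptxas version at $(command -v ptxas)"
    elif version_ge "$found" 11.1; then
        echo "ptxas $found at $(command -v ptxas) is new enough"
    else
        echo "ptxas $found at $(command -v ptxas) is older than 11.1"
    fi
else
    echo "ptxas not found on PATH"
fi
```

If a path is printed, `dpkg -S` on that path should show which installed package (if any) owns the binary, which may help answer the "which package" part of the question.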