I have a Debian-based GCE VM with an NVIDIA A100 40GB GPU, and the app I'm running complains:
external/org_tensorflow/tensorflow/compiler/xla/stream_executor/gpu/asm_compiler.cc:114]
*** WARNING *** You are using ptxas 11.0.221, which is older than 11.1.
ptxas before 11.1 is known to miscompile XLA code, leading to incorrect results or invalid-address errors.
I see the following NVIDIA-related packages installed:
$ apt list --installed | grep nvidia
libnvidia-container-tools/buster,now 1.13.1-1 amd64 [installed,upgradable to: 1.13.5-1]
libnvidia-container1/buster,now 1.13.1-1 amd64 [installed,upgradable to: 1.13.5-1]
nvidia-container-toolkit-base/buster,now 1.13.1-1 amd64 [installed,upgradable to: 1.13.5-1]
nvidia-container-toolkit/buster,now 1.13.1-1 amd64 [installed,upgradable to: 1.13.5-1]
nvidia-docker2/buster,now 2.13.0-1 all [installed]
Is ptxas provided by one of the above packages?
How do I update ptxas, and how do I specify which version to update to?
I'm not sure this is the "right" solution, but it appears that the newer version of ptxas requires a later version of the OS. The OS I was using was the "default" Debian 10 image suggested when I originally created the deep-learning GPU VM. I destroyed the original VM and recreated it using
That at least got me the desired ptxas version.
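For anyone who wants to check their own VM before rebuilding it, here is a rough sketch of how the installed ptxas could be inspected. As far as I know, ptxas ships with the CUDA toolkit rather than with the nvidia-container packages listed in the question. The sketch assumes ptxas is on the PATH, that GNU sort (with -V) is available, and that the last line of the ptxas --version output looks like "release 11.0, V11.0.221". The version_ge helper is a made-up name for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 >= version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v ptxas >/dev/null 2>&1; then
    ptxas_path=$(command -v ptxas)
    # Assumption: the version line ends with ", V<major>.<minor>.<patch>".
    ptxas_version=$(ptxas --version | sed -n 's/.*, V\([0-9.]*\).*/\1/p')
    echo "ptxas at $ptxas_path reports version $ptxas_version"

    if version_ge "$ptxas_version" 11.1; then
        echo "ptxas is 11.1 or newer"
    else
        echo "ptxas is older than 11.1"
    fi

    # Ask dpkg which package (if any) owns the binary. A toolkit installed
    # outside of apt will not be owned by any package.
    if command -v dpkg >/dev/null 2>&1; then
        dpkg -S "$(readlink -f "$ptxas_path")" 2>/dev/null \
            || echo "ptxas is not owned by any installed package"
    fi
else
    echo "ptxas not found on PATH"
fi
```

If dpkg reports no owning package, the toolkit was probably installed outside of apt, in which case apt upgrades would not touch ptxas.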