Why compile to cubin and not just to PTX?


CUDA's compatibility model for binaries is "binaries compiled for architecture 7.lower run on 7.higher, but not on any other major version". PTX is the exception: "PTX generated for architecture 7.lower will run on any higher version (minor or major)". I understand that PTX is an intermediate language whose compilation is finished at application load time by whatever driver is installed; the resulting binary is cached, and regenerated when the driver is updated. My question is: why should I, as a developer, ever want to generate "proper" CUDA binaries (cubins) rather than just the PTX?
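To pin down the rule I'm describing, here is how I would write it as code. This is only a sketch of my reading of the documentation, not anything taken from NVIDIA: the names `ComputeCapability`, `CodeKind` and `can_run` are made up for illustration, and it ignores any special cases the real toolchain may have.

```cpp
// Hypothetical model of the compatibility rule as I understand it.
// None of these names come from the CUDA toolkit.

// Compute capability of a GPU or of a build target, e.g. {7, 5} for 7.5.
struct ComputeCapability {
    int cc_major;
    int cc_minor;
};

// The two kinds of device code an application can ship.
enum class CodeKind { Cubin, Ptx };

// Returns true if code built for `built_for` is expected to run on `device`.
//  - Cubin: same major version, and the device minor is not lower.
//  - PTX:   any device whose compute capability is the same or higher.
constexpr bool can_run(CodeKind kind, ComputeCapability built_for,
                       ComputeCapability device) {
    return kind == CodeKind::Cubin
        ? (device.cc_major == built_for.cc_major &&
           device.cc_minor >= built_for.cc_minor)
        : (device.cc_major > built_for.cc_major ||
           (device.cc_major == built_for.cc_major &&
            device.cc_minor >= built_for.cc_minor));
}

// Compile-time checks of the cases mentioned in the question.
static_assert(can_run(CodeKind::Cubin, {7, 0}, {7, 5}),
              "cubin for 7.0 should run on 7.5");
static_assert(!can_run(CodeKind::Cubin, {7, 0}, {8, 0}),
              "cubin for 7.0 should not run on another major version");
static_assert(can_run(CodeKind::Ptx, {7, 0}, {8, 6}),
              "PTX for 7.0 should run on a higher major version");
static_assert(!can_run(CodeKind::Ptx, {7, 5}, {7, 0}),
              "PTX for 7.5 should not run on an older device");
```

If this model is right, shipping only PTX for our oldest supported architecture would cover every case our per-architecture cubins cover today, plus future major versions, which is why the current setup puzzles me.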

(Context: my company compiles to CUDA binaries for several different architectures, and does not compile to PTX. The people who set this up have left, and those of us still here are trying to figure out what they were thinking.)

I haven't actually tried to generate PTX files to see what would go wrong; I'm hoping someone can enlighten me.

Based on NVIDIA's documentation, I expect that the first start-up of my application would take longer because the driver has to compile the PTX to a binary, and that subsequent start-ups would probably be fast because the binary is cached (until I update the driver, or the cache is cleared for some other reason). Once the application has loaded, I'd expect performance to be the same as if I had compiled to a binary in the first place, which would be great as far as I'm concerned.
