Can nvlink inline device functions from separate compilation units?

Asked by user1823664 on 25 July 2018 at 04:39

If the separate compilation units fed as input to nvlink contain CUDA kernels and device functions that call device functions marked as __forceinline__, will those functions be inlined? Assume they would be inlined if all the source code were put into a single file.

Answered by talonmies on 25 July 2018 at 07:01

If the separate compilation units that are fed as input to nvlink contain cuda kernels and device functions that invoke device functions marked as __forceinline__, will these functions be inlined?

To the best of my knowledge, the CUDA device code linker can't do this. The __forceinline__ directive is a compiler-level operation: once compilation has finished, there is no way of marking code as inlinable in either PTX or SASS. If you try this, the CUDA device code compiler should emit a warning that an external inline function was used but not defined.

If you want functions to be compiled inline, you have to (unsurprisingly) use a compiler, not a linker.
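One way to act on this is to make the body of the inline function visible to the compiler in every compilation unit that calls it, typically by defining it in a shared header. The following is a minimal sketch of that layout, collapsed into a single file so it can be built on its own. The names helpers.cuh, scale_and_shift, apply and run_apply are hypothetical and chosen only for illustration; they do not come from the question.

```cuda
#include <vector>
#include <cuda_runtime.h>

// --- would live in a shared header, e.g. helpers.cuh (hypothetical name) ---
// Because the definition (not just a declaration) is included by every .cu
// file that calls it, the compiler sees the body and can inline it.
__forceinline__ __device__ float scale_and_shift(float x, float a, float b)
{
    return a * x + b;
}

// --- would live in one of the .cu files that include the header ---
__global__ void apply(float* data, int n, float a, float b)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = scale_and_shift(data[i], a, b);  // inlining candidate
    }
}

// Host helper (illustrative): copies values to the device, runs the kernel,
// and returns the results. Returns an empty vector if a CUDA call fails.
std::vector<float> run_apply(std::vector<float> values, float a, float b)
{
    const int n = static_cast<int>(values.size());
    const size_t bytes = n * sizeof(float);
    float* dev = nullptr;
    if (n == 0 || cudaMalloc(&dev, bytes) != cudaSuccess) {
        return {};
    }
    cudaMemcpy(dev, values.data(), bytes, cudaMemcpyHostToDevice);

    const int threads = 128;
    apply<<<(n + threads - 1) / threads, threads>>>(dev, n, a, b);

    // The blocking copy back also waits for the kernel to finish.
    cudaError_t err =
        cudaMemcpy(values.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) {
        return {};
    }
    return values;
}
```

With this layout, each .cu file that includes the header gets its own copy of the function body at compile time, so inlining does not depend on the device linker at all.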
