How to access sparse tensor core functionality in CUDA?

727 Views Asked by Krupip At 10 October 2022 at 17:51

Tensor cores can be accessed programmatically through the WMMA interface in CUDA (see https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#wmma and https://developer.nvidia.com/blog/programming-tensor-cores-cuda-9/). With the Ampere generation of GPUs, NVIDIA announced the ability to perform sparse tensor operations on sparse matrices, as described here: https://developer.nvidia.com/blog/accelerating-inference-with-sparsity-using-ampere-and-tensorrt/

The format presented appears to take pairs of elements together with their positions within four-element segments (2-bit indices). However, looking at the WMMA documentation, I can't find any mention of this or of how to access those special tensor core operations. As far as I can tell, the announcement page for this functionality doesn't explain it either.
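To make the format described above concrete, here is a minimal host-side sketch of compressing one four-element segment into two kept values plus two 2-bit positions. It assumes a 2:4 pattern (at most two nonzeros per segment). The names CompressedGroup and compress_group are hypothetical, and both the bit packing and the way unused slots are padded are illustrative assumptions; the exact metadata layout the hardware expects is specified in NVIDIA's documentation and is not reproduced here.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical illustrative type, not an NVIDIA API.
// Two kept values plus their positions (0..3) inside the original segment.
struct CompressedGroup {
    std::array<float, 2> values;  // kept elements, in increasing position order
    std::uint8_t meta;            // bits [1:0]: position of values[0], bits [3:2]: position of values[1]
};

// Compress one four-element segment that has at most two nonzeros.
CompressedGroup compress_group(const std::array<float, 4>& g) {
    int idx[2] = {0, 0};
    int kept = 0;
    for (int i = 0; i < 4; ++i) {
        if (g[i] != 0.0f) {
            // Precondition (checked in debug builds): the segment obeys 2:4.
            assert(kept < 2 && "more than two nonzeros: not a 2:4 segment");
            idx[kept++] = i;
        }
    }
    // Assumption: if fewer than two nonzeros exist, pad with unused
    // positions (which hold zeros) so two positions are always stored.
    for (int i = 0; i < 4 && kept < 2; ++i) {
        bool used = (kept == 1 && idx[0] == i);
        if (!used) idx[kept++] = i;
    }
    if (idx[0] > idx[1]) std::swap(idx[0], idx[1]);

    CompressedGroup out;
    out.values = {g[idx[0]], g[idx[1]]};
    out.meta = static_cast<std::uint8_t>(idx[0] | (idx[1] << 2));
    return out;
}
```

For example, the segment {0, 5, 0, 7} keeps the values 5 and 7 at positions 1 and 3, so under this packing the metadata nibble is 0b1101.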

How do I access sparse tensor core functionality in CUDA?


There are 1 best solutions below

Abator Abetor On 10 October 2022 at 19:17 BEST ANSWER

The blog post in your question links to the following paper: Accelerating Sparse Deep Neural Networks https://arxiv.org/pdf/2104.08378.pdf

Section 3.2 of that paper says:

It is the application’s responsibility to ensure that the first operand is a matrix stored in the compressed 2:4 format. cuSPARSELt and other libraries provide APIs for compression and sparse math operations, while, starting in version 8.0, the TensorRT SDK performs these functions for 2:4 sparse weights automatically. NVIDIA libraries require that input dimensions of a sparse matrix multiplication be multiples of 16 and 32 for 16-bit (FP16/BF16) and 8b-integer formats, respectively.
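As a rough illustration of the two preconditions in that passage, the following host-side sketch checks whether a row obeys the 2:4 pattern and whether the problem dimensions are multiples of 16 (16-bit types) or 32 (8-bit integers). The function names are hypothetical, and applying the multiple to all of m, n, and k is an assumption, since the quoted sentence only says "input dimensions". Real libraries such as cuSPARSELt may impose additional or different constraints, so consult their documentation before relying on this.

```cpp
#include <cstddef>
#include <vector>

// Returns true if every aligned group of four elements has at most two nonzeros.
bool is_two_four_sparse(const std::vector<float>& row) {
    if (row.size() % 4 != 0) return false;
    for (std::size_t base = 0; base < row.size(); base += 4) {
        int nonzeros = 0;
        for (std::size_t i = 0; i < 4; ++i) {
            if (row[base + i] != 0.0f) ++nonzeros;
        }
        if (nonzeros > 2) return false;
    }
    return true;
}

// Checks the quoted size rule: multiples of 16 for 16-bit element types,
// multiples of 32 for 8-bit integers. Other widths are not covered by the
// quoted sentence, so this sketch rejects them.
// Assumption: the rule is applied to all three GEMM dimensions.
bool dims_meet_requirement(std::size_t m, std::size_t n, std::size_t k,
                           int element_bits) {
    std::size_t multiple = 0;
    if (element_bits == 16) multiple = 16;
    else if (element_bits == 8) multiple = 32;
    if (multiple == 0) return false;
    return m % multiple == 0 && n % multiple == 0 && k % multiple == 0;
}
```

A weight matrix that fails is_two_four_sparse would first need to be pruned to the 2:4 pattern before it could be compressed.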

Sparse tensor operations can be performed manually using the PTX mma.sp instruction, which is explained in Section 9.7.13.5 of the PTX documentation: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-for-sparse-mma
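The sketch below is not mma.sp itself; it is a host-side emulation of the arithmetic a 2:4 sparse multiply performs, intended only to show how the stored positions select elements of the dense operand. The function name sparse_dot and the one-byte-per-position storage are hypothetical choices for readability. The real instruction works on warp-level register fragments with packed metadata, and those layouts must be taken from the PTX documentation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Dot product of one compressed 2:4 row of A with one dense column of B.
//   values:   two kept elements per group of four original elements
//   indices:  position (0..3) of each kept element inside its group
//   b_column: dense column, twice as long as values
// Assumes indices.size() == values.size() and
// b_column.size() == 2 * values.size().
float sparse_dot(const std::vector<float>& values,
                 const std::vector<std::uint8_t>& indices,
                 const std::vector<float>& b_column) {
    float acc = 0.0f;
    for (std::size_t j = 0; j < values.size(); ++j) {
        std::size_t group = j / 2;  // two kept values per group of four
        // The stored position picks which of the four B elements to use.
        acc += values[j] * b_column[group * 4 + indices[j]];
    }
    return acc;
}
```

Only half of the multiply-adds of the dense product are executed, which is where the speedup of the sparse path comes from.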
