Are there still shared memory bank conflicts in NVIDIA CUDA compute capability 7.0 and above?

637 Views Asked by cctv At 05 February 2022 at 02:14

If all threads in the same block access the same address (e.g. array[0]), this causes a bank conflict on some older compute capabilities. Does this conflict still exist on the latest compute capabilities (e.g. 7.0 for the V100 GPU or 8.0 for the A100)?


huseyin tugrul buyukisik On 05 February 2022 at 08:32

According to this Nvidia blog, compute capability 2.0 has a multicast (and broadcast) feature that merges accesses to the same address into a single memory request. Bank conflicts are not caused by accesses to the same address. They are caused by different addresses that map to the same bank, that is, addresses that give the same result modulo the number of banks.
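The modulo calculation can be sketched on the host side in plain C++. This is only a model, not device code: it assumes 32 banks that are each one 32-bit word wide, and the names `kNumBanks`, `kBankWidthBytes` and `bank_of` are made up for illustration.

```cpp
#include <cstdint>

// Assumption: 32 banks, each 4 bytes (one 32-bit word) wide.
constexpr std::uint32_t kNumBanks = 32;
constexpr std::uint32_t kBankWidthBytes = 4;

// Hypothetical helper: which bank serves a given shared-memory byte offset.
// Successive 32-bit words go to successive banks, wrapping around after 32.
constexpr std::uint32_t bank_of(std::uint32_t byte_offset) {
    return (byte_offset / kBankWidthBytes) % kNumBanks;
}

// Different addresses, same bank: byte offsets 0 and 128 are 32 words apart.
static_assert(bank_of(0) == bank_of(128), "words 0 and 32 share a bank");
// Neighbouring words land in neighbouring banks.
static_assert(bank_of(4) == bank_of(0) + 1, "word 1 is in the next bank");
```

Under this model, two threads reading byte offsets 0 and 128 hit the same bank with different addresses, which is the case that cannot be merged into one request.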

In your example, all threads accessing the same address results in a broadcast, not a conflict. To generate a true bank conflict, the threads need to access different addresses such as 0, stride, 2 x stride, 3 x stride, etc., with a stride that maps them to the same bank, so that there is no multicast and the accesses are serialized on that one shared memory bank.

The Volta architecture (compute capability 7.0) still has shared memory bank conflicts.

If shared memory has 32 banks, then accessing the 32-bit words at indices n, n+32, n+64, ... at the same time causes bank conflicts, unless Nvidia invents a dual-pipelined shared memory bank.
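To make the difference between a broadcast and a conflict concrete, here is a rough host-side C++ sketch that counts how many distinct words a warp requests from each bank. It assumes 32 lanes per warp and 32 word-wide banks. `conflict_degree` and `strided` are hypothetical helper names, and the count is a simplified model of serialization, not a measurement of real hardware.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <set>

constexpr int kWarpSize = 32;  // assumption: 32 lanes per warp
constexpr int kBanks = 32;     // assumption: 32 banks, one 32-bit word wide

using WarpAccess = std::array<std::uint32_t, kWarpSize>;

// Largest number of DISTINCT words requested from any single bank.
// 1 means conflict-free (identical words are merged, as in a broadcast);
// k means roughly a k-way serialized access in this model.
int conflict_degree(const WarpAccess& word_index) {
    std::array<std::set<std::uint32_t>, kBanks> words_per_bank;
    for (std::uint32_t w : word_index) {
        words_per_bank[w % kBanks].insert(w);
    }
    std::size_t worst = 0;
    for (const auto& words : words_per_bank) {
        worst = std::max(worst, words.size());
    }
    return static_cast<int>(worst);
}

// Lane i reads word (base + i * stride).
WarpAccess strided(std::uint32_t base, std::uint32_t stride) {
    WarpAccess a{};
    for (int lane = 0; lane < kWarpSize; ++lane) {
        a[lane] = base + static_cast<std::uint32_t>(lane) * stride;
    }
    return a;
}
```

With these helpers, `conflict_degree(strided(0, 0))` models every lane reading array[0] and gives 1, while `conflict_degree(strided(5, 32))` models the n, n+32, n+64, ... pattern and gives 32, because every lane asks the same bank for a different word.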
