Bank-Conflict-Free Access in shared memory

Asked by Behzad Baghapour on 31 March 2012 at 14:28

I need to use a shared memory array of 64 elements, which is twice the number of banks and twice the number of threads in a warp. How should I index it to get bank-conflict-free access?

There are 2 best solutions below

geek On 31 March 2012 at 15:23

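For 32-bit accesses you can use the default access pattern:

__shared__ int shared[32];
// the index must stay within the bounds of the array
int data = shared[base + stride * tid];

where stride is odd. An odd stride shares no factor with the power-of-two bank count, so consecutive threads fall in distinct banks.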
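The odd-stride rule can be checked on the host with a small model. This is only a sketch under stated assumptions: 32 banks of 32-bit words, bank index equal to word index modulo 32, and one full warp of 32 threads, as in the question. The function name `max_conflict_degree` is made up for illustration, and the model ignores broadcast (all threads reading the same word), which cannot occur with a nonzero stride.

```cpp
#include <algorithm>
#include <array>

// Assumed model: 32 banks, 4 bytes wide; word i lives in bank i % 32.
constexpr int kNumBanks = 32;
constexpr int kWarpSize = 32;

// Largest number of threads in one warp that land in the same bank when
// thread tid reads 32-bit word base + stride * tid (base >= 0, stride > 0).
// 1 means conflict-free; n means an n-way bank conflict.
int max_conflict_degree(int base, int stride) {
    std::array<int, kNumBanks> hits{};  // zero-initialized hit counters
    for (int tid = 0; tid < kWarpSize; ++tid) {
        int word = base + stride * tid;
        ++hits[word % kNumBanks];
    }
    return *std::max_element(hits.begin(), hits.end());
}
```

Under this model the result equals gcd(stride, 32) for positive strides, which is why every odd stride comes out as 1 and a stride of 2 gives a 2-way conflict.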

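If you need 64-bit accesses you can use a trick like this:

struct type
{
   int x, y, z;
};
__shared__ struct type shared[32];
struct type data = shared[base + tid];

Each element occupies three 32-bit words, so consecutive threads are again an odd number of words apart.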

P O'Conbhui On 03 April 2012 at 08:57

Let's assume you're using compute capability 1.x, so your shared memory has 16 banks, and each thread has to access 2 elements in shared memory.

What you want is for a thread to access the same memory bank for both elements, so if you organize it such that the required elements are 16 away from each other, you should avoid bank conflicts.

__shared__ int shared[32];
// both elements of a thread sit 16 words apart, i.e. in the same bank
int data0 = shared[base + stride * tid];
int data1 = shared[base + stride * tid + 16];
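To see why keeping the two elements one bank-count apart helps, here is a rough host-side comparison of that split layout with an interleaved one (thread tid owning words 2*tid and 2*tid + 1). It assumes bank index = word index modulo the number of banks, one thread per bank, and that each of the two loads is issued as a separate instruction. All function names are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Largest number of threads mapped to one bank for a single load in which
// thread tid reads 32-bit word offset + stride * tid (nonzero stride).
int conflict_degree(int num_banks, int num_threads, int stride, int offset) {
    std::vector<int> hits(num_banks, 0);
    for (int tid = 0; tid < num_threads; ++tid) {
        ++hits[(offset + stride * tid) % num_banks];
    }
    return *std::max_element(hits.begin(), hits.end());
}

// Split layout: first elements in words [0, n), second elements in [n, 2n).
bool split_layout_is_conflict_free(int num_banks) {
    int n = num_banks;  // one thread per bank
    return conflict_degree(num_banks, n, 1, 0) == 1 &&
           conflict_degree(num_banks, n, 1, n) == 1;
}

// Interleaved layout: thread tid owns words 2*tid and 2*tid + 1.
int interleaved_layout_degree(int num_banks) {
    int n = num_banks;
    return std::max(conflict_degree(num_banks, n, 2, 0),
                    conflict_degree(num_banks, n, 2, 1));
}
```

With 16 banks this matches the compute capability 1.x case above. With 32 banks it corresponds to the 64-element array in the question, where the second element would sit 32 words after the first.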

I used this pattern for storing complex floats, but I had a 2D array of complex floats, so it looked like this:

#define TILE_WIDTH 16

__shared__ float shared[TILE_WIDTH][2*TILE_WIDTH + 1];
// row: tile row being read; each row holds TILE_WIDTH real parts
// followed by TILE_WIDTH imaginary parts
float real = shared[row][base + stride * tid];
float imag = shared[row][base + stride * tid + TILE_WIDTH];

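The +1 pads each row by one element to avoid serialized accesses (bank conflicts) in transposed access patterns, where consecutive threads read down a column instead of along a row.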
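The effect of the padding can be illustrated with another small host-side model. It assumes 16 banks of 32-bit words, a half-warp of 16 threads, and a row-major tile in which thread tid reads element [tid][col], which is the transposed (column-wise) case. The function name `column_access_degree` is invented for this sketch.

```cpp
#include <algorithm>
#include <vector>

// Conflict degree when thread tid reads element [tid][col] of a row-major
// float tile whose rows are row_pitch 32-bit words long.
// Assumes bank = word index % num_banks.
int column_access_degree(int row_pitch, int col, int num_banks, int num_threads) {
    std::vector<int> hits(num_banks, 0);
    for (int tid = 0; tid < num_threads; ++tid) {
        int word = tid * row_pitch + col;  // linear word offset of [tid][col]
        ++hits[word % num_banks];
    }
    return *std::max_element(hits.begin(), hits.end());
}
```

With TILE_WIDTH = 16, a row pitch of 2*TILE_WIDTH = 32 words puts every thread of the half-warp in the same bank, while a pitch of 2*TILE_WIDTH + 1 = 33 words shifts each row by one bank and spreads the column across all 16 banks.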
