Numbapro cuda python defining array in thread register in gpu

667 Views Asked by jalatif At 28 June 2025 at 02:19

I know how to create a global device array from the host: build it with np.array, np.zeros, or np.empty(shape, dtype), then copy it over with cuda.to_device.

Also, one can declare a shared array with cuda.shared.array(shape, dtype).

But how do I create an array of constant size in the registers of a particular thread inside a GPU function?

I tried cuda.device_array and np.array, but neither worked.

I simply want to do this inside a thread -

x = array(CONSTANT, int32) # should make x for each thread


There is 1 best solution below

talonmies On 29 May 2015 at 06:49 BEST ANSWER

NumbaPro supports numba.cuda.local.array(shape, dtype) for defining thread-local arrays.

As with CUDA C, whether that array ends up in local memory or in registers is a compiler decision based on how the array is used. If the indexing pattern of the local array is statically defined and there is sufficient register space, the compiler will use registers to store the array. Otherwise it will be stored in local memory. See this question and answer pair for more information.
