vectorize the assignment of 3d numpy arrays conditioned on the associate values at other dimensions

115 Views Asked by At

Is it possible to vectorize the following code in Python? It runs very slowly when the size of the array becomes large.

import numpy as np

# A, B, C are 3d arrays with shape (K, N, N). 
# Entries in A, B, and C are in [0, 1]. 
# In the following, I use random values in B and C as an example.

K = 5
N = 10000
A = np.zeros((K, N, N))
B = np.random.normal(0, 1, (K, N, N))
C = np.random.normal(0, 1, (K, N, N))

for k in range(K):
    for m in [x for x in range(K) if x != k]:
        for i in range(N):
            for j in range(N):
                if A[m, i, j] not in [0, 1]:
                    if A[k, i, j] == 1:
                        A[m, i, j] = B[m ,i ,j]
                    if A[k ,i, j] == 0:
                        A[m, i, j] = C[m, i, j]

2

There are 2 best solutions below

1
frederick-douglas-pearce On BEST ANSWER

I can help with a partial vectorization that should speed things up quite a bit, but I'm not sure on your logic for k vs. m, so didn't try to include that part. Essentially, you create a mask with the conditions you want checked across the 2nd and 3rd dimensions of A. Then map between A and either B or C using the appropriate mask:

# A, B, C are 3d arrays with shape (K, N, N). 
# Entries in A, B, and C are in [0, 1]. 
# In the following, I use random values in B and C as an example.

np.random.seed(10)

K = 5
N = 1000
A = np.zeros((K, N, N))
B = np.random.normal(0, 1, (K, N, N))
C = np.random.normal(0, 1, (K, N, N))

for k in range(K):
    for m in [x for x in range(K) if x != k]:
        #if A[m, i, j] not in [0, 1]:
        mask_1 = A[k, :, :] == 1
        mask_0 = A[k, :, :] == 0
        A[m, mask_1] = B[m, mask_1]
        A[m, mask_0] = C[m, mask_0]

I omitted the A[m, i, j] not in [0, 1] part because this made it difficult to debug since nothing happens (A is initialized as all zeros). If you need to include additional logic like this, just create another mask for it and include it in with an and in each mask's logic.

Update on 7/6/22 If you want to update the above code to remove the loop over m, then you can initialize an array with all the values of k, and use that to expand the mask to include all 3 dimensions, excluding each value of k that matches m as follows:

np.random.seed(10)

K = 5
N = 1000
A_2 = np.zeros((K, N, N))
B = np.random.normal(0, 1, (K, N, N))
C = np.random.normal(0, 1, (K, N, N))
K_vals = np.array(range(K))

for k in range(K):
    #for m in [x for x in range(K) if x != k]:
        #if A[m, i, j] not in [0, 1]:
    k_dim_2_skip = K_vals == k
    mask_1 = np.tile(A_2[k, :, :] == 1, (K, 1, 1))
    mask_1[k_dim_2_skip, :, :] = False
    mask_0 = np.tile(A_2[k, :, :] == 0, (K, 1, 1))
    mask_0[k_dim_2_skip, :, :] = False
    A_2[mask_1] = B[mask_1]
    A_2[mask_0] = C[mask_0]

Use these masks with the & np.logical_not... code you added in the comment below and that should do it. Note the more you vectorize, the larger the arrays you're manipulating for masks, etc. get, so there is a tradeoff with memory consumption. There is usually a sweet spot to balance run time vs memory usage for a given problem.

3
AudioBubble On

I cannot identify a way to vectorize this, but I can suggest using numba package to reduce the computation time. At here, you can import njit with the nogil=True parameter to speed up your code.

from numba import njit

@njit(nogil=True)
def function():
    for k in range(K):
        for m in [x for x in range(K) if x != k]:
            for i in range(N):
                for j in range(N):
                    if A[k, i, j] == 1 and A[m, i, j] not in [0, 1]:
                        A[m, i, j] = B[m ,i ,j]
                    if A[k ,i, j] == 0 and A[m, i, j] not in [0, 1]:
                        A[m, i, j] = C[m, i, j]

%timeit function()
7.35 s ± 252 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

With njit and nogil parameter, it took me 7 seconds to run the whole thing, but without the njit, my code is running for hours(and it still is now). Python has a global interpreter lock (GIL) to make sure it sticks to single-threading. By releasing the GIL, you can execute the code in multithreading. However, when using nogil=True, you’ll have to be wary of the usual pitfalls of multi-threaded programming (consistency, synchronization, race conditions, etc.).

You can look at the documentation about Numba here. https://numba.pydata.org/numba-doc/dev/user/jit.html?highlight=nogil