Is it possible to vectorize the following code in Python? It runs very slowly when the size of the array becomes large.
import numpy as np

# A, B, C are 3d arrays with shape (K, N, N).
# Entries in A, B, and C are in [0, 1].
# In the following, I use random values in B and C as an example.
K = 5
N = 10000
A = np.zeros((K, N, N))
B = np.random.normal(0, 1, (K, N, N))
C = np.random.normal(0, 1, (K, N, N))

for k in range(K):
    for m in [x for x in range(K) if x != k]:
        for i in range(N):
            for j in range(N):
                if A[m, i, j] not in [0, 1]:
                    if A[k, i, j] == 1:
                        A[m, i, j] = B[m, i, j]
                    if A[k, i, j] == 0:
                        A[m, i, j] = C[m, i, j]
I can help with a partial vectorization that should speed things up quite a bit, but I'm not sure about your logic for k vs. m, so I didn't try to include that part. Essentially, you create a mask with the conditions you want checked across the 2nd and 3rd dimensions of A. Then map between A and either B or C using the appropriate mask.

I omitted the A[m, i, j] not in [0, 1] part because it made the code difficult to debug: nothing happens, since A is initialized as all zeros. If you need to include additional logic like this, just create another mask for it and include it with an and (&) in each mask's logic.

Update on 7/6/22: If you want to update the above code to remove the loop over m, you can initialize an array with all the values of k and use that to expand the mask to include all 3 dimensions, excluding each value of k that matches m.

Use these masks with the & np.logical_not... code you added in the comment below and that should do it. Note that the more you vectorize, the larger the mask arrays you're manipulating get, so there is a tradeoff with memory consumption. There is usually a sweet spot that balances run time against memory usage for a given problem.
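As a minimal sketch of the two steps described above (not necessarily the code originally posted), the masks could look like the following. The function names update_loops, update_partial and update_full, the mask names free, take_b and take_c, and the small test arrays are all assumptions made for illustration. Unlike the description above, this sketch keeps the A[m, i, j] not in [0, 1] check as an extra mask, so it can be compared directly against the nested loops from the question.

```python
import numpy as np


def update_loops(A, B, C):
    """Reference: the nested loops from the question, applied to a copy."""
    A = A.copy()
    K, N, _ = A.shape
    for k in range(K):
        for m in (x for x in range(K) if x != k):
            for i in range(N):
                for j in range(N):
                    if A[m, i, j] not in [0, 1]:
                        if A[k, i, j] == 1:
                            A[m, i, j] = B[m, i, j]
                        if A[k, i, j] == 0:
                            A[m, i, j] = C[m, i, j]
    return A


def update_partial(A, B, C):
    """Vectorize over i and j only; k and m remain Python loops."""
    A = A.copy()
    K = A.shape[0]
    for k in range(K):
        is_one = A[k] == 1    # (N, N) mask; A[k] is not modified while m != k
        is_zero = A[k] == 0
        for m in range(K):
            if m == k:
                continue
            # Extra mask for the "A[m, i, j] not in [0, 1]" condition.
            free = (A[m] != 0) & (A[m] != 1)
            take_b = is_one & free
            take_c = is_zero & free
            A[m, take_b] = B[m, take_b]
            A[m, take_c] = C[m, take_c]
    return A


def update_full(A, B, C):
    """Also vectorize over m by broadcasting the masks to shape (K, N, N)."""
    A = A.copy()
    K = A.shape[0]
    ks = np.arange(K)
    for k in range(K):
        not_k = (ks != k)[:, None, None]      # (K, 1, 1): False only where m == k
        free = (A != 0) & (A != 1)            # (K, N, N)
        take_b = (A[k] == 1) & not_k & free   # (N, N) broadcasts to (K, N, N)
        take_c = (A[k] == 0) & not_k & free
        A[take_b] = B[take_b]
        A[take_c] = C[take_c]
    return A


# Small demo (assumed sizes and values): A holds 0, 1 and some other entries,
# so that the "not in [0, 1]" branch is actually exercised.
rng = np.random.default_rng(0)
K, N = 3, 4
A = rng.choice([0.0, 1.0, 0.5], size=(K, N, N))
B = rng.random((K, N, N))
C = rng.random((K, N, N))

expected = update_loops(A, B, C)
assert np.array_equal(update_partial(A, B, C), expected)
assert np.array_equal(update_full(A, B, C), expected)
```

The loop over k is kept in both variants because each pass reads values of A that earlier passes may have written. On the memory tradeoff: in update_full every mask has shape (K, N, N), so with K = 5 and N = 10000 each boolean mask alone takes about 0.5 GB, whereas update_partial only ever holds (N, N) masks.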