How to reduce the dimensions of a numpy array by using the sum over n elements?


I have the following array (numbers are placeholders for illustration):

arr = np.array([[1,  1,  1,    2,  2,  2,    3,  3,  3,    4  ,4,  4 ],
                [1,  1,  1,    2,  2,  2,    3,  3,  3,    4,  4,  4 ],
                [1,  1,  1,    2,  2,  2,    3,  3,  3,    4,  4,  4 ],

                [5,  5,  5,    6,  6,  6,    7,  7,  7,    8,  8,  8 ],
                [5,  5,  5,    6,  6,  6,    7,  7,  7,    8,  8,  8 ],
                [5,  5,  5,    6,  6,  6,    7,  7,  7,    8,  8,  8 ],

                [9,  9,  9,    10, 10, 10,   11, 11, 11,   12, 12, 12],
                [9,  9,  9,    10, 10, 10,   11, 11, 11,   12, 12, 12],
                [9,  9,  9,    10, 10, 10,   11, 11, 11,   12, 12, 12],

                [13, 13, 13,   14, 14, 14,   15, 15, 15,   16, 16, 16],
                [13, 13, 13,   14, 14, 14,   15, 15, 15,   16, 16, 16],
                [13, 13, 13,   14, 14, 14,   15, 15, 15,   16, 16, 16]])

I would like to reduce the dimensions so that every 9 elements (each 3x3 area, which holds the same number here) are summed up. So the 12x12 array should become a 4x4 array.

I looked through other answers here and found something for a 1D array, which I adapted. However, it is not working as expected:

result = np.sum(arr.reshape(-1,3), axis=1)
result = np.sum(result .reshape(3,-1), axis=0)

What is the correct way to achieve the desired result?

There are 4 best solutions below

1
Nils Werner On BEST ANSWER

If we look at the flattened array

arr.ravel()
# array([ 1,  1,  1,  2,  2,  2,  3,  3,  3,  4,  4,  4,  1,  1,  1,  2,  2,
#         2,  3,  3,  3,  4,  4,  4,  1,  1,  1,  2,  2,  2,  3,  3,  3,  4,
#         4,  4,  5,  5,  5,  6,  6,  6,  7,  7,  7,  8,  8,  8,  5,  5,  5,
#         6,  6,  6,  7,  7,  7,  8,  8,  8,  5,  5,  5,  6,  6,  6,  7,  7,
#         7,  8,  8,  8,  9,  9,  9, 10, 10, 10, 11, 11, 11, 12, 12, 12,  9,
#         9,  9, 10, 10, 10, 11, 11, 11, 12, 12, 12,  9,  9,  9, 10, 10, 10,
#        11, 11, 11, 12, 12, 12, 13, 13, 13, 14, 14, 14, 15, 15, 15, 16, 16,
#        16, 13, 13, 13, 14, 14, 14, 15, 15, 15, 16, 16, 16, 13, 13, 13, 14,
#        14, 14, 15, 15, 15, 16, 16, 16])

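We can see a pattern:

  1. Groups of 3 identical values (one row of a 3x3 block)
  2. in groups of 4 (the 4 blocks that sit side by side)
  3. in super-groups of 3 (the 3 rows that make up one band of blocks)

Use that to reshape your array (reading the pattern from back to front, so the groups of 3 values become the last axis), and take the sum over the two axes of length 3 that belong to each block: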

arr.reshape(-1, 3, 4, 3).sum((-1, -3))
# array([[  9,  18,  27,  36],
#        [ 45,  54,  63,  72],
#        [ 81,  90,  99, 108],
#        [117, 126, 135, 144]])
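
To make the role of each axis concrete, here is a minimal sketch that inspects the intermediate 4D shape. It rebuilds the example array with np.kron, which is an assumption made only to keep the snippet self-contained (the question types the array out by hand), and the names base, blocks and block_sums are illustrative.

```python
import numpy as np

# Rebuild the 12x12 example: each value 1..16 repeated over a 3x3 block.
# (np.kron is used here only for brevity; it is not part of the answer above.)
base = np.arange(1, 17).reshape(4, 4)
arr = np.kron(base, np.ones((3, 3), dtype=int))

# Axes after the reshape:
# (block row, row within block, block column, column within block)
blocks = arr.reshape(-1, 3, 4, 3)
print(blocks.shape)        # (4, 3, 4, 3)

# Fixing the block row and the block column isolates a single 3x3 block.
print(blocks[0, :, 1, :])  # the block that holds the value 2

# Summing over the two "within block" axes leaves one number per block.
block_sums = blocks.sum(axis=(-1, -3))
print(block_sums)
```

Summing over axes -1 and -3 is the same as summing over axes 3 and 1 of the reshaped array, so the block row and block column axes survive and form the 4x4 result.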
0
Felix On

I suggest following Nils' answer, as it is simpler and more efficient for this particular case. What I suggest below is more general, in case you want something other than a plain sum.


You are looking for convolution. A small kernel is run over the array, performing element-wise multiplications and summing the results, generating values at each step for a new array. In this case we want a simple sum, so we'll use a kernel of ones with the appropriate size (3x3). Because we want no overlap, our stride is also 3 in both directions.

2D convolution is not available in NumPy, so we'll have to import it from SciPy. That function doesn't have stride (skipping) functionality, so we'll implement the stride manually by slicing the result.

from scipy.signal import convolve2d

kernel = np.ones((3, 3))
convolved = convolve2d(arr, kernel, mode='valid')
strided = convolved[::3, ::3]

strided contains the result here. We can check it by dividing by nine, which recovers the original value of each cell.

>>> strided / 9
array([[ 1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12.],
       [13., 14., 15., 16.]])
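
As an illustration of the more general case, the sketch below swaps the kernel of ones for an averaging kernel, so each 3x3 block is reduced to its mean rather than its sum. The array is rebuilt with np.kron only to keep the snippet self-contained, and the names mean_kernel and block_means are illustrative. Note that convolve2d flips the kernel before applying it. That makes no difference for a symmetric kernel like this one, but for asymmetric weights scipy.signal.correlate2d applies them as written.

```python
import numpy as np
from scipy.signal import convolve2d

# Rebuild the 12x12 example: each value 1..16 repeated over a 3x3 block.
base = np.arange(1, 17).reshape(4, 4)
arr = np.kron(base, np.ones((3, 3), dtype=int))

# Averaging kernel: every cell of a 3x3 window contributes 1/9.
mean_kernel = np.full((3, 3), 1 / 9)

# 'valid' keeps only windows that fit entirely inside the array (10x10 here).
convolved = convolve2d(arr, mean_kernel, mode='valid')

# Keep every third window in each direction so the windows do not overlap.
block_means = convolved[::3, ::3]
print(block_means)
```

The convolution still evaluates all 100 overlapping windows and then discards most of them, which is why the reshape approach is cheaper when the blocks never overlap.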
3
yatu On

We can use skimage's view_as_blocks to take a strided view of the array and then take the sum of each block:

from skimage.util.shape import view_as_blocks
n = 3
view_as_blocks(arr, (n,n)).sum((-1,2))
array([[  9,  18,  27,  36],
       [ 45,  54,  63,  72],
       [ 81,  90,  99, 108],
       [117, 126, 135, 144]])
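
If scikit-image is not installed, a similar block view can be sketched with NumPy alone. This is a swapped-in technique, not a description of how view_as_blocks works internally, and it assumes NumPy 1.20 or later, where sliding_window_view is available. The array is rebuilt with np.kron only to keep the snippet self-contained; the names windows, blocks and block_sums are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Rebuild the 12x12 example: each value 1..16 repeated over a 3x3 block.
base = np.arange(1, 17).reshape(4, 4)
arr = np.kron(base, np.ones((3, 3), dtype=int))

n = 3
# All overlapping n x n windows, as a read-only view: shape (10, 10, 3, 3).
windows = sliding_window_view(arr, (n, n))

# Step by n in both directions to keep only the non-overlapping blocks.
blocks = windows[::n, ::n]  # shape (4, 4, 3, 3)

# Sum over the two trailing axes, which index positions inside a block.
block_sums = blocks.sum(axis=(-1, -2))
print(block_sums)
```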
0
MarcMush On

After a bit of fiddling with reshape, I came up with this. Note that the example array here has only 9 rows, so the result is 3x4:


arr = np.array([[ 1,  1,  1,  2,  2,  2,  3,  3,  3,  4,  4,  4],
        [ 1,  1,  1,  2,  2,  2,  3,  3,  3,  4,  4,  4],
        [ 1,  1,  1,  2,  2,  2,  3,  3,  3,  4,  4,  4],
        [ 5,  5,  5,  6,  6,  6,  7,  7,  7,  8,  8,  8],
        [ 5,  5,  5,  6,  6,  6,  7,  7,  7,  8,  8,  8],
        [ 5,  5,  5,  6,  6,  6,  7,  7,  7,  8,  8,  8],
        [ 9,  9,  9, 10, 10, 10, 11, 11, 11, 12, 12, 12],
        [ 9,  9,  9, 10, 10, 10, 11, 11, 11, 12, 12, 12],
        [ 9,  9,  9, 10, 10, 10, 11, 11, 11, 12, 12, 12]])

a = np.size(arr,0)//3
b = np.size(arr,1)//3

np.sum(arr.reshape(a, 3, b, 3), axis=(1,3))

# result

array([[  9,  18,  27,  36],
       [ 45,  54,  63,  72],
       [ 81,  90,  99, 108]])
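
The same idea can be wrapped in a small helper that takes the block size as a parameter. This is a minimal sketch: block_sum is a hypothetical name, and raising an error for shapes that are not a multiple of the block size is an assumed design choice (padding or trimming would be alternatives). The 9x12 test array is rebuilt with np.kron only to keep the snippet self-contained.

```python
import numpy as np

def block_sum(arr, block_rows, block_cols):
    """Sum non-overlapping (block_rows x block_cols) blocks of a 2D array.

    Hypothetical helper for illustration; not part of NumPy.
    """
    rows, cols = arr.shape
    if rows % block_rows or cols % block_cols:
        raise ValueError("array shape must be a multiple of the block shape")
    a = rows // block_rows   # number of blocks along the rows
    b = cols // block_cols   # number of blocks along the columns
    # Axes: (block row, row within block, block column, column within block)
    return arr.reshape(a, block_rows, b, block_cols).sum(axis=(1, 3))

# 9x12 array as in this answer: values 1..12, each repeated over a 3x3 block.
base = np.arange(1, 13).reshape(3, 4)
arr = np.kron(base, np.ones((3, 3), dtype=int))

result = block_sum(arr, 3, 3)
print(result)
```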