Product and sum without addtional memory allocation

66 Views Asked by At

Is there away to multiply two arrays and sum along an axis (or multiple axes) without allocating extra memory?

In this example:

import numpy as np

A = np.random.random((10, 10, 10))
B = np.random.random((10, 10, 10))

C = np.sum(A[:, None, :, :, None] * B[None, :, None, :, :], axis=(-1,-2))

When computing C, an intermediate matrix of size 10x10x10x10x10 is created only to be collapsed immediately. Is there a way to avoid this in numpy?

1

There are 1 best solutions below

0
mozway On BEST ANSWER

This looks like a dot product with the transpose of the second array:

C = A @ B.T

NB. The operation in the original question was: C = np.sum(A[:, None, :] * B[None, :, :], axis=-1).

Quick check:

C1 = np.sum(A[:, None, :] * B[None, :, :], axis=-1)
C2 = A @ B.T

assert np.allclose(C1, C2)

You can generalize with einsum:

np.einsum('ikl,jlm->ijk', A, B)

Quick check:

A = np.random.random((2, 3, 4))
B = np.random.random((2, 4, 5))

#             i    j   k  l    m         i   j    k   l  m
C1 = np.sum(A[:, None, :, :, None] * B[None, :, None, :, :], axis=(-1,-2))
C2 = np.einsum('ikl,jlm->ijk', A, B)

assert np.allclose(C1, C2)