Is there a way to multiply two arrays and sum along an axis (or multiple axes) without allocating extra memory?
In this example:
import numpy as np
A = np.random.random((10, 10, 10))
B = np.random.random((10, 10, 10))
C = np.sum(A[:, None, :, :, None] * B[None, :, None, :, :], axis=(-1,-2))
When computing C, an intermediate array of shape 10x10x10x10x10 is created, only to be collapsed immediately. Is there a way to avoid this in NumPy?
This looks like a dot product with the transpose of the second array:
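For 2D inputs, as in the original form of the question, this could be sketched as follows. The shapes and the seeded generator are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded only to make the sketch reproducible
A = rng.random((10, 7))          # assumed shape (n, k)
B = rng.random((12, 7))          # assumed shape (m, k), same last axis as A

# Dot product of A with the transpose of B: C[i, j] = sum_k A[i, k] * B[j, k]
C = A @ B.T                      # shape (10, 12), no (n, m, k) intermediate
```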
NB. The operation in the original question was:
C = np.sum(A[:, None, :] * B[None, :, :], axis=-1)
Quick check:
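A sketch of such a check, assuming small 2D inputs (the shapes and the names `C_ref` and `C_dot` are illustrative, not from the original post):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((10, 7))   # assumed 2D shapes
B = rng.random((12, 7))

# Broadcast-and-sum version from the original question (builds a 10x12x7 temporary)
C_ref = np.sum(A[:, None, :] * B[None, :, :], axis=-1)

# Matrix product with the transpose of the second array
C_dot = A @ B.T

# Both should agree up to floating-point rounding
assert np.allclose(C_ref, C_dot)
```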
You can generalize with einsum:
Quick check:
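A sketch of the `einsum` form together with a check, covering both the 2D case and the 3D example from the question. The subscript strings are my reading of the broadcasting pattern (`i`, `j`, `k` are kept, `l` and `m` are summed), and the names `A2`, `B2`, `C2` and `C_ref` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# 2D case: same result as A2 @ B2.T
A2 = rng.random((10, 7))
B2 = rng.random((12, 7))
C2 = np.einsum('ik,jk->ij', A2, B2)
assert np.allclose(C2, A2 @ B2.T)

# 3D case from the question:
# A[:, None, :, :, None] carries indices (i, -, k, l, -)
# B[None, :, None, :, :] carries indices (-, j, -, l, m)
# summing over the last two axes removes l and m, leaving (i, j, k)
A = rng.random((10, 10, 10))
B = rng.random((10, 10, 10))
C = np.einsum('ikl,jlm->ijk', A, B)

# Reference result using the explicit 10x10x10x10x10 intermediate
C_ref = np.sum(A[:, None, :, :, None] * B[None, :, None, :, :], axis=(-1, -2))
assert np.allclose(C, C_ref)
```

As far as I know, `np.einsum` with its default `optimize=False` accumulates the products directly into the output rather than building the full broadcast array first, but it is worth profiling on your own shapes before relying on that.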