We have two tensors:
import numpy as np

a = np.arange(8.).reshape(4, 2, 1)
b = np.arange(16.).reshape(2, 4, 2)
We are going to implement
np.einsum('ijk,jil->kl', a, b)
Although we can obtain its result, we want to understand in detail how the summation over the tensor elements works.
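For reference, running the call gives a (1, 2) result:

np.einsum('ijk,jil->kl', a, b)
# array([[252., 280.]])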
First, we know how
np.einsum('jil', b)
reorders the elements of the tensor b, but we cannot understand how
np.einsum('ijk,jil->kl', a, b)
combines (sums) the tensor elements.
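The single-operand case can be checked against a plain transpose; a minimal sketch, using the b defined above:

# With no explicit output subscripts, the output axes are the labels in
# alphabetical order (i, j, l), so 'jil' simply permutes the axes of b.
np.array_equal(np.einsum('jil', b), b.transpose(1, 0, 2))
# True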
To track the process, we used strings:
aa = [[['e'],
       ['r']],
      [['t'],
       ['y']],
      [['u'],
       ['o']],
      [['p'],
       ['q']]]
and
bb = [[['x', 'c'],
       ['v', 'n'],
       ['m', 'h'],
       ['f', 'd']],
      [['s', 'w'],
       ['a', 'z'],
       ['j', 'k'],
       ['l', 'b']]]
This was because we wanted to see how the different elements combine when computing np.einsum('ijk,jil->kl', aa, bb).
However, while np.einsum('jil', bb) works correctly, it does not show the details of the summation over the elements.
There are a few ways to understand this.
One is to use the example @Onyambu suggests.
By including i and j as indexes on the output, the output array no longer has shape (k, l), but (i, j, k, l). Also, none of the multiplied elements are being summed together. Each element of the output array is the product of one element from each of the original arrays.
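Based on that description, the intermediate array can be built by keeping i and j in the output subscripts; a sketch (the variable names expanded, summed_j, and result used here and below are just for illustration):

# Keep i and j in the output: nothing is summed yet, every entry is a
# single product a[i, j, k] * b[j, i, l].
expanded = np.einsum('ijk,jil->ijkl', a, b)
expanded.shape
# (4, 2, 1, 2)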
To get back to the original behavior, we can sum by axis 1:
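# Sum over axis 1 (the j axis) of the (i, j, k, l) array.
summed_j = expanded.sum(axis=1)
summed_j.shape
# (4, 1, 2)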
Then sum by axis 0:
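# Sum over axis 0 (the i axis), which leaves the (k, l) result.
result = summed_j.sum(axis=0)
result
# array([[252., 280.]])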
Another way to understand this is to convert it to an explicit loop.
The following code is equivalent to this einsum, but slower. (It also does not check that the shapes of A and B are compatible.)
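A sketch of such a loop, assuming A and B are the two input arrays (a and b above):

A, B = a, b
# ret accumulates the (k, l) output: k comes from A's last axis, l from B's.
ret = np.zeros((A.shape[2], B.shape[2]))
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        for k in range(A.shape[2]):
            for l in range(B.shape[2]):
                ret[k, l] += A[i, j, k] * B[j, i, l]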
This gives us the same result:
array([[252., 280.]])
Notice how the inner line of the loop,
ret[k, l] += A[i, j, k] * B[j, i, l]
is similar to the einsum subscript 'ijk,jil->kl', except that the kl has been moved to the beginning, ijk is being used to index A, and jil is being used to index B.

More information: Understanding NumPy's einsum