I have two 3D tensors, tensor `A`, which has shape `[B,N,S]`, and tensor `B`, which also has shape `[B,N,S]`. What I want to get is a third tensor `C`, which I expect to have shape `[B,B,N]`, where the element `C[i,j,k] = np.dot(A[i,k,:], B[j,k,:])`. I also want to achieve this in a vectorized way.
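For reference, a naive loop version of what I mean (sizes are illustrative; I use `Bsz` for the batch size so it doesn't clash with the tensor name `B`):

```python
import numpy as np

Bsz, N, S = 4, 5, 3                    # Batch_size, Num_vectors, Vector_size
A = np.random.rand(Bsz, N, S)
B = np.random.rand(Bsz, N, S)

# C[i,j,k] = np.dot(A[i,k,:], B[j,k,:]), spelled out with loops
C = np.empty((Bsz, Bsz, N))
for i in range(Bsz):
    for j in range(Bsz):
        for k in range(N):
            C[i, j, k] = np.dot(A[i, k, :], B[j, k, :])
```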
Some further info: the two tensors `A` and `B` have shape `[Batch_size, Num_vectors, Vector_size]`. The tensor `C` is supposed to represent the dot product between each element in the batch from `A` and each element in the batch from `B`, for all of the different vectors. Hope that is clear enough, and looking forward to your answers!
The suggested `einsum`, working directly from the `C[i,j,k] = np.dot(A[i,k,:], B[j,k,:])` expression:
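A minimal sketch (array sizes are illustrative; the `'ikm,jkm->ijk'` subscripts map straight onto the `i, j, k` indices above, with `m` as the summed axis):

```python
import numpy as np

Bsz, N, S = 4, 5, 3
A = np.random.rand(Bsz, N, S)
B = np.random.rand(Bsz, N, S)

# contract the shared last axis m, keep i, j, k
C = np.einsum('ikm,jkm->ijk', A, B)
print(C.shape)                         # (4, 4, 5), i.e. [B, B, N]
```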
`matmul` does a `dot` on the last 2 dimensions, and treats the leading one(s) as batch. In your case `k` is the batch dimension, and `m` is the one that should obey the *last of A and 2nd-to-last of B* rule. So rewriting the `ikm,jkm...` subscripts to fit, and transposing `A` and `B` accordingly:
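Roughly, assuming the `'kim,kmj->kij'` reading of "`k` in front as the batch axis":

```python
# k moved to the front as the batch axis; m is the contracted pair
C2 = np.einsum('kim,kmj->kij',
               A.transpose(1, 0, 2),   # [B,N,S] -> [N,B,S] = k,i,m
               B.transpose(1, 2, 0))   # [B,N,S] -> [N,S,B] = k,m,j
C2 = C2.transpose(1, 2, 0)             # k,i,j -> i,j,k
print(np.allclose(C, C2))              # True
```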
Not much difference in performance. But now use `matmul` and verify that the values match (though more often than not, if the shapes match, the values do too):
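The same transposes, with `@` doing the batched `dot`:

```python
C3 = (A.transpose(1, 0, 2) @ B.transpose(1, 2, 0)).transpose(1, 2, 0)
print(C3.shape)                        # (4, 4, 5)
print(np.allclose(C, C3))              # True
```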
I won't try to measure memory usage, but the time improvement suggests it too is better.
In some cases `einsum` is optimized to use `matmul`. Here that doesn't seem to be the case, though we could play with its parameters (for example the `optimize` flag; see the sketch below). I'm a little surprised the `matmul` is doing so much better.
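Whether this helps is machine- and version-dependent, but `optimize` is the obvious knob to try:

```python
# let einsum search for a BLAS-friendly contraction path
C4 = np.einsum('ikm,jkm->ijk', A, B, optimize=True)
print(np.allclose(C, C4))              # True
```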
===

I vaguely recall another SO about `matmul` taking a shortcut when the two arrays are the same thing, `A@A`. I used `B=A` in these tests. But that only made a modest difference.
My BLAS etc. is standard Linux, nothing special.