I am currently implementing the LoFTR model and came across the following code:
feature_c0.shape
-> torch.Size([1, 256, 60, 60])
rearrange(feature_c0, 'n c h w -> n (h w) c').shape
-> torch.Size([1, 3600, 256])
feature_c0.view(1, -1, 256).shape
-> torch.Size([1, 3600, 256])
I thought I understood the functionality of both tensor.view and rearrange. The problem: the outputs of these two are different, even though their shapes are the same. I don't really understand what is going on here.
Tensor.view does not move any data. It simply reinterprets the tensor's existing (contiguous) memory, and the -1 infers the size of that dimension. So feature_c0.view(1, -1, 256) walks the memory in its original c, h, w order and slices it into chunks of 256: the 256 * 60 * 60 values are grouped into [3600 x 256] in raw memory order, not the [(60*60) x 256] layout you wanted, so each row mixes channel and spatial values.

rearrange(feature_c0, 'n c h w -> n (h w) c'), by contrast, actually permutes the axes before flattening; it is equivalent to feature_c0.permute(0, 2, 3, 1).reshape(1, 3600, 256). That is why the two outputs have the same shape but different contents, and why rearrange is the correct function in your case.
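You can see the difference on a tiny tensor. This sketch uses plain PyTorch (no einops needed), where permute(0, 2, 3, 1).reshape(...) plays the role of the 'n c h w -> n (h w) c' pattern:

```python
import torch

# Tiny example: n=1, c=2, h=2, w=2, so every element is easy to track.
x = torch.arange(8).reshape(1, 2, 2, 2)  # shape (n, c, h, w)

# view() keeps the raw memory order (c, h, w flattened) and only re-slices it:
viewed = x.view(1, -1, 2)  # shape (1, 4, 2), rows mix channel and spatial values

# Equivalent of rearrange's 'n c h w -> n (h w) c': permute first, then flatten:
moved = x.permute(0, 2, 3, 1).reshape(1, -1, 2)  # shape (1, 4, 2)

print(viewed[0])
# tensor([[0, 1],
#         [2, 3],
#         [4, 5],
#         [6, 7]])
print(moved[0])
# tensor([[0, 4],
#         [1, 5],
#         [2, 6],
#         [3, 7]])
```

Both results have shape (1, 4, 2), but only moved puts the two channel values of each spatial location into one row, which is what the LoFTR code needs.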