Convert a one-hot encoded dimension into the index of the position of the 1

I have a three-dimensional tensor of shape [batch_size, sequence_length, number_of_tokens]. The last dimension is one-hot encoded. I want to get a two-dimensional tensor of shape [batch_size, sequence_length], where each entry is the index of the 1 along the number_of_tokens dimension.

For example, I want to turn a tensor of shape (2, 3, 4):

[[[0, 1, 0, 0],
  [1, 0, 0, 0],
  [0, 0, 0, 1]],
 [[1, 0, 0, 0],
  [1, 0, 0, 0],
  [0, 0, 1, 0]]]

into a tensor of shape (2, 3), where each one-hot vector along the number_of_tokens dimension is replaced by the position of its 1:

[[1, 0, 3],
 [0, 0, 2]]

I'm doing this so that the model output can be compared with the reference answer when computing the loss. I hope this is the correct approach.

There are 3 best solutions below

Best answer

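If your original tensor is as specified in your previous question, you can bypass the one-hot encoding and use argmax directly:

import torch
t = torch.rand(2, 3, 4)  # random tensor of shape (2, 3, 4)
t = t.argmax(dim=2)      # index of the maximum along the last dimension, shape (2, 3)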
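
For reference, here is a minimal sketch of the same call applied to the one-hot example from the question. It assumes PyTorch is installed; the names one_hot and indices are illustrative only and do not come from the question.

```python
import torch

# One-hot tensor from the question, shape (2, 3, 4)
one_hot = torch.tensor([
    [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]],
    [[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0]],
])

# argmax over the last dimension gives the position of the 1 in each row
indices = one_hot.argmax(dim=-1)

print(indices.shape)     # torch.Size([2, 3])
print(indices.tolist())  # [[1, 0, 3], [0, 0, 2]]
```

Using dim=-1 instead of dim=2 is only a convenience: it always refers to the last dimension, so the same line would also work if the tensor had no batch dimension.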

Answer 2

You can do what you want with nested list comprehensions:

x = [[[0, 1, 0, 0],
      [1, 0, 0, 0],
      [0, 0, 0, 1]],
     [[1, 0, 0, 0],
      [1, 0, 0, 0],
      [0, 0, 1, 0]]]

# for each sequence ell1 in the batch, take the index of the 1 in each row ell2
y = [[ell2.index(1) for ell2 in ell1] for ell1 in x]

print(y)  # prints [[1, 0, 3], [0, 0, 2]]

This iterates over the sequences in the batch and, for each one-hot row, returns the index of its 1. Note that it works on nested Python lists; a tensor would first have to be converted, for example with x.tolist().
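
One caveat with this approach: list.index(1) raises ValueError when a row contains no 1, and it quietly returns the first match when a row contains several. Below is a minimal sketch of a stricter variant, assuming the input is a nested Python list as above. The helper name one_hot_to_indices is made up for illustration.

```python
def one_hot_to_indices(batch):
    """Map each one-hot row of a nested list to the index of its 1."""
    result = []
    for sequence in batch:
        indices = []
        for row in sequence:
            # A valid one-hot row has exactly one 1 and zeros everywhere else
            if row.count(1) != 1 or row.count(0) != len(row) - 1:
                raise ValueError(f"not a one-hot row: {row}")
            indices.append(row.index(1))
        result.append(indices)
    return result


x = [[[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]],
     [[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0]]]

print(one_hot_to_indices(x))  # [[1, 0, 3], [0, 0, 2]]
```

The explicit check costs an extra pass over each row, so it is probably only worth it while debugging or when the data is not guaranteed to be one-hot.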

Answer 3

Simply take the argmax over the last axis (this assumes x is an array or tensor, not a nested Python list):

res = x.argmax(axis=2)
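
As a quick check, here is a small sketch of that one-liner on the data from the question. It assumes x is a NumPy array, which the answer does not state explicitly.

```python
import numpy as np

# One-hot data from the question as a NumPy array, shape (2, 3, 4)
x = np.array([
    [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]],
    [[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0]],
])

# axis=2 is the one-hot (number_of_tokens) axis
res = x.argmax(axis=2)

print(res.shape)     # (2, 3)
print(res.tolist())  # [[1, 0, 3], [0, 0, 2]]
```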