I am fairly new to seq2seq models and transformers.
Basically, I am working on a sequence generation problem and want to use a transformer. I am using Python and PyTorch.
I know how the transformer model works for sequence generation: given [1,2,3], it can generate [1,2,3,4,5].
But the problem I am facing is that in my dataset, each point/element of the sequence has 2 attributes. So a sequence looks like the following:
[(2,4), (1,8), (1,9)]
and the generated sequence will be:
[(2,4), (1,8), (1,9), (3,1), (2,9)]
The first element of each tuple can be between 1 and 5, and the second element can be between 1 and 10.
I want to follow the general transformer approach: create an embedding for each element, pass it through a decoder block of multi-head attention and a pointwise feed-forward network, and finally use softmax to sample the output.
My question is: since each data point contains 2 values, how do I change the transformer model to work with this kind of sequence? Any direction will be appreciated. Thanks.
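To make the question concrete, here is roughly what I have in mind so far (just a sketch of my guess, not a working solution; the idea of summing the two attribute embeddings and using two output heads is my own assumption, and I have omitted positional encoding for brevity):

```python
import torch
import torch.nn as nn

class TupleSeqModel(nn.Module):
    """Sketch: embed each attribute separately, sum the embeddings,
    run a causally-masked TransformerEncoder stack, and predict both
    attributes with two separate softmax heads."""
    def __init__(self, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        self.emb_a = nn.Embedding(5 + 1, d_model)   # first attribute: 1-5
        self.emb_b = nn.Embedding(10 + 1, d_model)  # second attribute: 1-10
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head_a = nn.Linear(d_model, 5 + 1)     # logits for attribute 1
        self.head_b = nn.Linear(d_model, 10 + 1)    # logits for attribute 2

    def forward(self, a, b):
        # a, b: (batch, seq_len) integer tensors with the two attributes
        x = self.emb_a(a) + self.emb_b(b)
        # causal mask so position t only attends to positions <= t
        mask = nn.Transformer.generate_square_subsequent_mask(a.size(1))
        h = self.encoder(x, mask=mask)
        return self.head_a(h), self.head_b(h)

# e.g. the sequence [(2,4), (1,8), (1,9)] split into two attribute tensors
a = torch.tensor([[2, 1, 1]])
b = torch.tensor([[4, 8, 9]])
logits_a, logits_b = TupleSeqModel()(a, b)
print(logits_a.shape, logits_b.shape)
```

Is this kind of factorization (independent embeddings summed, independent output distributions per attribute) a reasonable way to handle tuple-valued sequence elements, or is there a more standard approach?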