Seq2Seq Models for Chatbots

I am building a chatbot with a sequence-to-sequence encoder-decoder model, as in NMT. From the data given, I understand that during training the decoder's target outputs are fed in as decoder inputs, along with the encoder's cell state. What I cannot figure out is what to feed into the decoder when actually deploying the chatbot in real time, since at that point the decoder output is exactly what I have to predict. Can someone help me out with this, please?

1 Answer

The exact answer depends on which building blocks you take from the Neural Machine Translation (NMT) model and which ones you replace with your own. I assume the graph structure is exactly as in NMT.

If so, at inference time, you can feed just a vector of zeros to the decoder.
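
Why the value does not matter becomes clearer from the shape of the inference loop itself: at every step, the decoder consumes its own previous prediction rather than an externally supplied target. Here is a minimal, framework-free sketch of that loop; `decoder_step` is a hypothetical stand-in for one step of a trained decoder, and the vocabulary size and token ids are made up for illustration:

```python
import numpy as np

VOCAB_SIZE, SOS, EOS, MAX_LEN = 10, 1, 2, 20
rng = np.random.default_rng(0)

def decoder_step(token_id, state):
    """Hypothetical stand-in for one step of a trained decoder:
    maps (previous token id, state) -> (logits over vocab, new state)."""
    logits = rng.standard_normal(VOCAB_SIZE)  # dummy logits so the sketch runs
    return logits, state

token, state, output = SOS, None, []
for _ in range(MAX_LEN):
    logits, state = decoder_step(token, state)
    token = int(np.argmax(logits))  # greedy: the prediction becomes the next input
    if token == EOS:
        break
    output.append(token)
print(output)
```

The loop starts from a start-of-sequence token and stops at end-of-sequence, so no ground-truth decoder inputs are ever needed at inference.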


Internal details: NMT uses an entity called a Helper to determine the next input to the decoder (see the tf.contrib.seq2seq.Helper documentation).

In particular, tf.contrib.seq2seq.BasicDecoder relies solely on the helper when it performs a step: the next_inputs fed to the cell at the subsequent step are exactly the return value of Helper.next_inputs().

There are different implementations of the Helper interface (a construction sketch follows this list), e.g.:

- tf.contrib.seq2seq.TrainingHelper, used during training, which reads the (embedded) ground-truth target sequence;
- tf.contrib.seq2seq.GreedyEmbeddingHelper, used at inference, which embeds the argmax of the previous step's output;
- tf.contrib.seq2seq.SampleEmbeddingHelper, also for inference, which samples the previous token from the output distribution instead of taking the argmax.
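
As a rough sketch of the training/inference contrast (TF 1.x, where tf.contrib.seq2seq is available; the tensor sizes and the start/end token ids are assumptions for illustration):

```python
import tensorflow as tf  # TF 1.x; tf.contrib.seq2seq is not in TF 2.x

batch_size, max_time, vocab_size, embed_dim = 4, 10, 1000, 64
embedding = tf.get_variable("embedding", [vocab_size, embed_dim])

# Training: the helper reads the embedded ground-truth target sequence.
gold_ids = tf.placeholder(tf.int32, [batch_size, max_time])
gold_lengths = tf.placeholder(tf.int32, [batch_size])
train_helper = tf.contrib.seq2seq.TrainingHelper(
    tf.nn.embedding_lookup(embedding, gold_ids), gold_lengths)

# Inference: the helper ignores any fed decoder inputs and instead embeds
# its own previous prediction, starting from <sos> (id 1 assumed here).
infer_helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
    embedding, start_tokens=tf.fill([batch_size], 1), end_token=2)
```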

The relevant code is in the BaseModel._build_decoder method. Note that neither GreedyEmbeddingHelper nor SampleEmbeddingHelper cares what the decoder input is, so in fact you can feed anything; the zero tensor is simply the standard choice.
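
Putting it together, a minimal end-to-end inference sketch might look like the following (again TF 1.x; the encoder state is faked with zeros here, and all sizes and token ids are assumptions):

```python
import tensorflow as tf  # TF 1.x

batch_size, vocab_size, embed_dim, num_units = 4, 1000, 64, 128
embedding = tf.get_variable("embedding", [vocab_size, embed_dim])
cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)

# In a real model this comes from the encoder; zeros stand in here.
encoder_state = cell.zero_state(batch_size, tf.float32)

helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
    embedding, start_tokens=tf.fill([batch_size], 1), end_token=2)

decoder = tf.contrib.seq2seq.BasicDecoder(
    cell, helper, initial_state=encoder_state,
    output_layer=tf.layers.Dense(vocab_size))  # projects to vocab logits

outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(
    decoder, maximum_iterations=50)
predicted_ids = outputs.sample_id  # [batch_size, decoded_length]
```

Note that no decoder inputs are fed anywhere in this sketch: the helper generates them itself, which is exactly why, with NMT's graph, whatever you feed into the decoder-input placeholder at inference is ignored and zeros are just a safe convention.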