Why does NStepLSTM not have reset_state method?

464 Views Asked by At

I firstly use L.LSTM , then I found this NStepLSTM, which is uncovered part of offical tutorial document. https://docs.chainer.org/en/stable/reference/generated/chainer.links.NStepLSTM.html?highlight=Nstep

  1. Why does chainer.links.NStepLSTM or chainer.links.NStepBiLSTM not have reset_state? how to reset_state?

  2. is it pass a list of sequences(each is one sequence chainer.Variable, e.g. one article contains multiple words is one Variable)? Is this class purpose is to deal with vary length sequence?

  3. can we use truncate BPTT to save memory in chainer.links.NStepLSTM ? how

1

There are 1 best solutions below

3
On

1. NStepLSTM gets a batch of sequences and returns a batch of output sequences, though LSTM gets a batch of words. You don't need to use for-loop to use NStepLSTM. NStepLSTM uses cuDNN, that is a library NVIDIA provides, and is very fast. NStepLSTM does not have a state. If you want to chain NStepLSTMs, use outputs of NStepLSTM. See seq2seq example: https://github.com/chainer/chainer/blob/master/examples/seq2seq/seq2seq.py

2. Yes. It gots such as a batch of sequences of embed vectors created from sentences. You can use sequences with different lengths. See seq2seq example. Note that L.NStepLSTM can get a sequence of sentences, but F.NStepLSTM can get transposed sequences. I mean it can get a sequence of batches of words. Actually L.NStepLSTM calls F.transpose_sequences and F.NStepLSTM in its implementation.

3. Sorry it is difficult. As I said, NStepLSTM is a wrapper of cuDNN's RNN library.It does not support BPTT. Of course you can split sentences and call NStepLSTM twice.