Question About Stacked BiLSTM (Multi-layer BiLSTM)


I noticed that a single-layer BiLSTM consists of two independent (unidirectional) LSTMs, one running in the forward direction and the other in the reverse direction. The outputs of the two LSTMs are then concatenated along the hidden dimension (dim = -1) to form the output of the BiLSTM layer.
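To make this concrete, here is a minimal sketch (assuming PyTorch's torch.nn.LSTM; the sizes are arbitrary) showing that the last dimension of a single-layer BiLSTM's output is the forward and backward hidden states concatenated along dim = -1:

```python
import torch
import torch.nn as nn

batch, seq_len, input_size, hidden_size = 4, 10, 16, 8
x = torch.randn(batch, seq_len, input_size)

bilstm = nn.LSTM(input_size, hidden_size, num_layers=1,
                 bidirectional=True, batch_first=True)
out, (h_n, c_n) = bilstm(x)

print(out.shape)  # torch.Size([4, 10, 16]) -> (batch, seq_len, 2 * hidden_size)
# out[..., :hidden_size] is the forward (left-to-right) direction,
# out[..., hidden_size:] is the backward (right-to-left) direction.
```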

And for a multi-layer model, each inner layer takes the output of the previous layer as its input and passes its own output on to the next layer.
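For the unidirectional case this wiring is unambiguous; a sketch of two stacked layers wired by hand (again assuming PyTorch, arbitrary sizes) would look like this:

```python
import torch
import torch.nn as nn

batch, seq_len, input_size, hidden_size = 4, 10, 16, 8
x = torch.randn(batch, seq_len, input_size)

# Two stacked unidirectional LSTM layers, wired by hand.
layer1 = nn.LSTM(input_size, hidden_size, batch_first=True)
layer2 = nn.LSTM(hidden_size, hidden_size, batch_first=True)  # input_size = hidden_size of layer 1

out1, _ = layer1(x)     # (batch, seq_len, hidden_size)
out2, _ = layer2(out1)  # (batch, seq_len, hidden_size)
```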

So far, there is no ambiguity.

But for a multi-layer BiLSTM, there is some ambiguity. Since each BiLSTM layer contains two independent LSTMs, I'm not sure what input the inner layers should receive.

Is it the concatenated output from the previous layer? (If that's true, it means the input_size of each inner LSTM, whether left-to-right or right-to-left, is 2 * hidden_size of the previous layer.) (See this implementation, and this picture from: Illustrating the use of two BiLSTMs for Semantic Role Labelling. Source: He et al. 2017, fig. 1.)

Or should a multi-layer BiLSTM be treated as two unidirectional multi-layer LSTMs (one left-to-right and the other right-to-left), where each unidirectional LSTM only accepts the outputs from the previous layers of its own direction? Then, after both multi-layer unidirectional LSTM computations are finished, the left-to-right and right-to-left outputs of each layer are concatenated to form the output of that BiLSTM layer. (See this picture from: Arrhythmia Classification in Multi-Channel ECG Signals Using Deep Neural Networks. Kim, 2018, fig. 3.2.)
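To make the two readings concrete, here is a sketch of how the second layer could be wired under each interpretation (PyTorch again; the variable names like layer2_concat are mine, and the sizes are arbitrary), followed by one way to check which convention a given implementation uses by inspecting its parameter shapes:

```python
import torch
import torch.nn as nn

batch, seq_len, input_size, hidden_size = 4, 10, 16, 8
x = torch.randn(batch, seq_len, input_size)

# Layer 1 is the same under both readings: one bidirectional layer.
layer1 = nn.LSTM(input_size, hidden_size, bidirectional=True, batch_first=True)
out1, _ = layer1(x)  # (batch, seq_len, 2 * hidden_size)

# Interpretation 1: each direction of layer 2 consumes the *concatenated*
# output of layer 1, so its input_size is 2 * hidden_size.
layer2_concat = nn.LSTM(2 * hidden_size, hidden_size,
                        bidirectional=True, batch_first=True)
out2_concat, _ = layer2_concat(out1)  # (batch, seq_len, 2 * hidden_size)

# Interpretation 2: two independent stacked unidirectional LSTMs; each
# direction of layer 2 only sees its own direction's layer-1 output,
# so its input_size is hidden_size.
fwd1, bwd1 = out1[..., :hidden_size], out1[..., hidden_size:]
layer2_fwd = nn.LSTM(hidden_size, hidden_size, batch_first=True)
layer2_bwd = nn.LSTM(hidden_size, hidden_size, batch_first=True)
out2_fwd, _ = layer2_fwd(fwd1)
out2_bwd, _ = layer2_bwd(torch.flip(bwd1, dims=[1]))  # flip time axis to run right-to-left
out2_indep = torch.cat([out2_fwd, torch.flip(out2_bwd, dims=[1])], dim=-1)

# One way to check which convention nn.LSTM itself follows: inspect the
# input-hidden weight of the second layer (index 1) of a stacked
# bidirectional LSTM; its shape is (4 * hidden_size, input size of that layer).
stacked = nn.LSTM(input_size, hidden_size, num_layers=2,
                  bidirectional=True, batch_first=True)
print(stacked.weight_ih_l1.shape)  # torch.Size([32, 16]) = (4*hidden, 2*hidden),
                                   # which matches the concatenated-output reading.
```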
