Is there any other reason why we make all sequence lengths the same using padding, other than to enable matrix multiplication (and therefore parallel computation)?
It may depend on the specific situation you are dealing with, but in general the only reason I would zero-pad (or apply any kind of padding to) RNN inputs is to make batch-wise computation work. Padding should also be done in a way that does not affect the results: the padded positions should not contribute to the hidden states you use for downstream tasks. For example, you may pad a particular sequence at the end, over timesteps {t+1:T}, but then for any further task or processing you should use only h_{0:t}.
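As a rough sketch of that last point (the shapes, lengths, and the choice of nn.RNN here are purely illustrative), you can end-pad a batch and then gather only the last valid hidden state of each sequence:

```python
import torch
import torch.nn as nn

# hypothetical setup: a batch of 2 sequences with true lengths 5 and 3,
# end-padded to a common length of 5
batch, max_len, feat, hidden = 2, 5, 8, 16
lengths = torch.tensor([5, 3])
x = torch.randn(batch, max_len, feat)
x[1, 3:] = 0.0                               # zero out the padded timesteps

rnn = nn.RNN(feat, hidden, batch_first=True)
out, _ = rnn(x)                              # out: (batch, max_len, hidden)

# downstream, use only h_{0:t}: gather the last *valid* hidden state per sequence
idx = (lengths - 1).view(-1, 1, 1).expand(-1, 1, hidden)
last_valid = out.gather(1, idx).squeeze(1)   # (batch, hidden)
```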
However, if you are doing anything other than a simple unidirectional RNN (e.g. a bidirectional RNN), padding can get more complicated. For example, for the forward direction you would pad at the end of each sequence, while for the reverse direction you would want to pad at the front, as sketched below.
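A toy illustration of that idea, with made-up shapes:

```python
import torch

# one sequence of length 3, padded to length 5
seq = torch.randn(3, 8)                   # (time, features)
pad = seq.new_zeros(2, 8)

fwd_input = torch.cat([seq, pad], dim=0)  # forward direction: pad at the end
bwd_input = torch.cat([pad, seq], dim=0)  # reverse direction: pad at the front,
                                          # so reading right-to-left starts on
                                          # real timesteps, not on padding
```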
Even for batching and parallel computation, PyTorch has packed sequences (`pack_padded_sequence` / `pad_packed_sequence`), which should be faster than plain padding IMO.
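For instance, a minimal sketch (the shapes and the LSTM configuration are just placeholders):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# hypothetical batch: two sequences of lengths 5 and 3, end-padded to length 5
batch, max_len, feat, hidden = 2, 5, 8, 16
lengths = torch.tensor([5, 3])
padded = torch.randn(batch, max_len, feat)
padded[1, 3:] = 0.0

rnn = nn.LSTM(feat, hidden, batch_first=True, bidirectional=True)

# pack so the RNN skips the padded timesteps entirely
packed = pack_padded_sequence(padded, lengths, batch_first=True,
                              enforce_sorted=False)
packed_out, (h_n, c_n) = rnn(packed)

# unpack back to a padded tensor if a downstream layer expects one
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
```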