Massive drop in training error after the first epoch


I am training an LSTM autoencoder to reconstruct its input, which consists of eight features (floating-point numbers between 0 and 1). Currently, I am using a window size of two and training the model for 50 epochs. While training the network, I observed that the training error (mean squared error) drops dramatically after the first epoch: it was 17.25 during the first epoch, fell to 1.8 at the very next one, and stagnates after the seventh epoch. I wondered whether the random initialization of the weights might be causing this, so I retrained the network from scratch, and the same phenomenon repeated.
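
For context, here is a minimal sketch of what the windowing might look like; the placeholder data and the flattening step are my assumptions, chosen so that a window of two eight-feature rows matches the 16-unit input of the first LSTM layer shown in the model info below.

    import numpy as np

    # Hypothetical illustration (not from the post): with eight features and
    # a window size of two, each window flattens to a 16-dimensional vector,
    # which would match the LSTM(16, 64) input size in the model info.
    series = np.random.rand(1000, 8).astype("float32")  # placeholder data in [0, 1]
    window = 2
    windows = np.stack([series[i:i + window].reshape(-1)
                        for i in range(len(series) - window + 1)])
    print(windows.shape)  # (999, 16)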

I am not able to deduce the reason for this significant drop in training error after the first epoch and would appreciate any help. I have attached the training error graph and model information for reference.

Model info:

    LSTM_AutoencoderModel(
      (encoder): Encoder(
        (lstm1): LSTM(16, 64)
        (lstm2): LSTM(64, 16)
      )
      (decoder): Decoder(
        (lstm1): LSTM(16, 64)
        (lin1): Linear(in_features=64, out_features=16, bias=True)
      )
    )
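
For anyone who wants to reproduce this, here is a sketch of what the architecture behind that repr might look like in PyTorch; the layer sizes come from the model info above, but the forward methods are my assumption, since the printed repr only lists the submodules.

    import torch
    from torch import nn

    class Encoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.lstm1 = nn.LSTM(16, 64)
            self.lstm2 = nn.LSTM(64, 16)

        def forward(self, x):            # x: (seq_len, batch, 16)
            x, _ = self.lstm1(x)
            x, _ = self.lstm2(x)
            return x                     # latent sequence: (seq_len, batch, 16)

    class Decoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.lstm1 = nn.LSTM(16, 64)
            self.lin1 = nn.Linear(in_features=64, out_features=16, bias=True)

        def forward(self, x):
            x, _ = self.lstm1(x)
            return self.lin1(x)          # reconstruction: (seq_len, batch, 16)

    class LSTM_AutoencoderModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.encoder = Encoder()
            self.decoder = Decoder()

        def forward(self, x):
            return self.decoder(self.encoder(x))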

Training error graph
