LSTM Autoencoder for anomaly detection in time series: correct way to fit the model

I'm trying to find correct examples of using an LSTM autoencoder for anomaly detection in time series data on the internet. In many of the examples I see, the LSTM autoencoder model is fitted with labels that are the future time steps of the feature sequences (as in ordinary time series forecasting with an LSTM), but I believe this kind of model should be trained with labels that are the same sequences as the feature sequences themselves (the previous time steps).

Take, for example, the first Google result for this search: https://towardsdatascience.com/time-series-of-price-anomaly-detection-with-lstm-11a12ba4f6d9

1. This function defines how the labels (the y values) are built; the shape check after step 2 shows what it produces:

def create_sequences(X, y, time_steps=TIME_STEPS):
    Xs, ys = [], []
    for i in range(len(X) - time_steps):
        Xs.append(X.iloc[i:(i + time_steps)].values)  # features: window of past steps
        ys.append(y.iloc[i + time_steps])             # label: the step after the window
    return np.array(Xs), np.array(ys)

X_train, y_train = create_sequences(train[['Close']], train['Close'])
X_test, y_test = create_sequences(test[['Close']], test['Close'])

2. The model is fitted as follows:

history = model.fit(X_train, y_train, epochs=100, batch_size=32, validation_split=0.1,
                    callbacks=[keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, mode='min')],
                    shuffle=False)
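For reference, the shapes these two steps produce make the forecasting setup explicit (a sketch; TIME_STEPS = 30 is an assumption, the article may use a different value):

print(X_train.shape)  # (len(train) - 30, 30, 1) -- windows of past values
print(y_train.shape)  # (len(train) - 30,)       -- one future value per window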

Could you kindly comment on the way the autoencoder is implemented in the linked towardsdatascience.com article? Is that method correct, or should the model be fitted the following way?

model.fit(X_train,X_train)

Thanks in advance!

There are 2 best solutions below

Answer 1:

This is a time series autoencoder; if you want to predict the future, it goes this way. How an autoencoder, or any machine learning model, is fitted differs from problem to problem. You cannot train and fit one model or workflow for all problems. A time series task can mean reconstructing data already collected over a period, or it can mean predicting future values from collected data; the two are constructed differently. For example, time series data for the subsurface of the earth is modeled differently from data for weather forecasting; one model cannot work for both.

Answer 2:

By definition, an autoencoder is any model that attempts to reproduce its input, independent of the type of architecture (LSTM, CNN, ...).

Framed this way, it is an unsupervised task, so the training would be: model.fit(X_train, X_train)
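For concreteness, a reconstruction-style pipeline could look like this (a minimal sketch: create_sequences_ae and X_train_ae are hypothetical names, model is an already-defined autoencoder such as the one quoted below, and the loss/optimizer choices are assumptions, not taken from the article):

import numpy as np

def create_sequences_ae(X, time_steps=30):
    # Hypothetical helper: each window serves as its own reconstruction target.
    Xs = [X.iloc[i:(i + time_steps)].values for i in range(len(X) - time_steps)]
    return np.array(Xs)

X_train_ae = create_sequences_ae(train[['Close']])
model.compile(loss='mae', optimizer='adam')   # assumed loss/optimizer
history = model.fit(X_train_ae, X_train_ae,   # input == target: learn to reconstruct
                    epochs=100, batch_size=32, validation_split=0.1, shuffle=False)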

Now, what she does in the article you linked is to use a common LSTM autoencoder architecture, but apply it to time series forecasting:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

model = Sequential()
model.add(LSTM(128, input_shape=(X_train.shape[1], X_train.shape[2])))  # encoder
model.add(RepeatVector(X_train.shape[1]))  # repeat latent vector per time step
model.add(LSTM(128, return_sequences=True))  # decoder
model.add(TimeDistributed(Dense(X_train.shape[2])))  # one output per time step
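To see why this architecture is shaped for reconstruction, it helps to trace the tensor shapes layer by layer (a sketch, assuming windows of 30 steps and 1 feature, i.e. X_train.shape == (N, 30, 1)):

# LSTM(128)                        -> (N, 128)      one latent vector per window
# RepeatVector(30)                 -> (N, 30, 128)  latent vector repeated per step
# LSTM(128, return_sequences=True) -> (N, 30, 128)  decoded sequence
# TimeDistributed(Dense(1))        -> (N, 30, 1)    same shape as the input window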

She's pre-processing the data so as to get X_train = [x(t-seq), ..., x(t)] and y_train = x(t+1):

for i in range(len(X) - time_steps):
    Xs.append(X.iloc[i:(i + time_steps)].values)
    ys.append(y.iloc[i + time_steps])

So the model does not, per se, reproduce the input it's fed, but that doesn't mean it's not a valid implementation, since it produces valuable predictions.
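Either way, the anomaly detection step that follows is usually the same: score each window by its error and flag windows whose error exceeds a threshold. A minimal sketch for the pure reconstruction variant (X_test_ae is assumed to be built the same way as X_train_ae above, and the thresholding rule is an assumption, not from the article):

import numpy as np

# Score each window by its mean absolute reconstruction error.
train_err = np.mean(np.abs(model.predict(X_train_ae) - X_train_ae), axis=(1, 2))
test_err = np.mean(np.abs(model.predict(X_test_ae) - X_test_ae), axis=(1, 2))

# Assumed rule: flag test windows whose error exceeds the worst training error.
threshold = train_err.max()
anomalies = test_err > threshold  # boolean mask over the test windows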