LSTM Keras sequence to sequence prediction gives error (ValueError: Dimensions must be equal)

I am trying to predict wave heights using an LSTM in Keras with Python 3.9. Just for the ease of my example here, I only use two features: significant wave height and H1/3 (wave height is of course dependent on multiple other factors). For a reason I can't figure out, I get an error that y_pred and y_true do not match in shape:

ValueError: Dimensions must be equal, but are 126 and 24 for '{{node mean_absolute_error/sub}} = Sub[T=DT_FLOAT](sequential/dense/BiasAdd, IteratorGetNext:1)' with input shapes: [?,126,2], [?,24,2].

How I set up my model: I use data from multiple buoys, of which one buoy is my output and the other three are my input. The dataframes of the buoys look like this:

Datetime              Hm0   H1/3
2022-08-01 00:10:00   85.0  85.0
2022-08-01 00:20:00   90.0  90.0
2022-08-01 00:30:00   93.0  91.0
2022-08-01 00:40:00   92.0  91.0
2022-08-01 00:50:00   89.0  88.0
...                   ...   ...

I create sequences for the output buoy with a length of seq_length_future and for the input buoys with a length of seq_length_past (so the input and output sequences differ in length). All input buoys are put into input_data as a NumPy array and the output buoy into output_data. The sequence lengths are seq_length_past = 42 and seq_length_future = 24.
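Roughly, the windowing looks like this (a minimal sketch; create_sequences and the variable names are illustrative, not my exact code):

import numpy as np

def create_sequences(input_buoys, output_buoy, seq_length_past=42, seq_length_future=24):
    # input_buoys: 3 arrays of shape (n_timesteps, 2); output_buoy: (n_timesteps, 2)
    X, y = [], []
    n_windows = len(output_buoy) - seq_length_past - seq_length_future + 1
    for t in range(n_windows):
        # one past window per input buoy -> shape (3, 42, 2)
        X.append([buoy[t:t + seq_length_past] for buoy in input_buoys])
        # the future window of the output buoy -> shape (24, 2)
        y.append(output_buoy[t + seq_length_past:t + seq_length_past + seq_length_future])
    return np.array(X), np.array(y)

sequenced_input_data, sequenced_output_data = create_sequences(input_data, output_data)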

The shapes of the input and output data after creating the sequences are:
sequenced_input_data.shape = (61117, 3, 42, 2)
sequenced_output_data.shape = (61117, 24, 2)

For the input data this results in a 4D array. Using NumPy reshape, I then merge the subsequences of the different input buoys into one long sequence, because I read that an LSTM can't handle 4D arrays for time series prediction. This results in a 3D array.
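The reshape is just a concatenation of the three buoy windows along the time axis (sketch):

# (61117, 3, 42, 2) -> (61117, 3 * 42, 2) = (61117, 126, 2)
reshaped_input_data = sequenced_input_data.reshape(
    sequenced_input_data.shape[0], -1, sequenced_input_data.shape[-1])
reshaped_output_data = sequenced_output_data  # already 3D: (61117, 24, 2)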

I then normalize all the data using the MinMaxScaler(), with separate scalers for input and output data.
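Since MinMaxScaler expects 2D input, I flatten to (samples * timesteps, features), scale, and restore the 3D shape (a sketch; my exact code may differ slightly):

from sklearn.preprocessing import MinMaxScaler

input_scaler = MinMaxScaler()
output_scaler = MinMaxScaler()

flat_input = reshaped_input_data.reshape(-1, reshaped_input_data.shape[-1])
reshaped_input_data = input_scaler.fit_transform(flat_input).reshape(reshaped_input_data.shape)

flat_output = reshaped_output_data.reshape(-1, reshaped_output_data.shape[-1])
reshaped_output_data = output_scaler.fit_transform(flat_output).reshape(reshaped_output_data.shape)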

I then split the data into train and test data:

# Split the data into train and test data
input_train, input_test, output_train, output_test = train_test_split(reshaped_input_data, reshaped_output_data, test_size=0.2, shuffle=False, random_state=42)

The shapes are then:
input_train.shape = (48893, 126, 2)
input_test.shape = (12224, 126, 2)
output_train.shape = (48893, 24, 2)
output_test.shape = (12224, 24, 2)

Training of the model

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Masking, LSTM, Dropout, Dense
from tensorflow.keras import regularizers
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping

tf.keras.backend.clear_session()

# Masking skips zero time steps; the LSTM returns the full sequence,
# and the Dense layer maps each time step to the 2 output features.
model = Sequential()
model.add(Masking(mask_value=0, input_shape=(input_train.shape[1], input_train.shape[2])))
model.add(LSTM(units=64, return_sequences=True, kernel_regularizer=regularizers.l2(0.01)))
model.add(Dropout(0.1))
model.add(Dense(units=output_test.shape[2]))

model.compile(loss='mean_absolute_error', optimizer=Adam(learning_rate=0.0001))

early_stopping = EarlyStopping(monitor="val_loss", verbose=2, mode='min', patience=3)
model.fit(x=input_train, y=output_train, epochs=10, validation_split=0.2, batch_size=32, callbacks=[early_stopping])

loss = model.evaluate(input_test, output_test)
print(f'Test loss: {loss}')

predictions = model.predict(input_test)

I found this link, which explains making predictions with different lengths for the input and output sequences.

Can anybody explain to me what I'm doing wrong here? :) Hopefully I explained my code and problem well enough.

Hope to hear from you!

What I tried so far:

  • Creating a custom loss function; this works, but since the answer in the link I provided works with different sequence lengths, there must be something wrong with my code.
  • Using mean_absolute_error instead of mean_squared_error; that didn't work.
  • Using a RepeatVector, but then the predictions became flat (see the sketch after this list).
  • Using Flatten, but the same error happens.
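For reference, the RepeatVector variant I tried looked roughly like this (a sketch from memory; the layer sizes are illustrative):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, RepeatVector, Dense

# Encoder-decoder sketch: the encoder LSTM compresses the 126-step input
# into a single vector, RepeatVector stretches it to the 24 output steps,
# and the decoder LSTM emits one vector per step, so the model output
# has shape (batch, 24, 2) and matches output_train.
model = Sequential()
model.add(LSTM(units=64, input_shape=(input_train.shape[1], input_train.shape[2])))
model.add(RepeatVector(output_train.shape[1]))   # 24 future time steps
model.add(LSTM(units=64, return_sequences=True))
model.add(Dense(units=output_train.shape[2]))    # 2 features (Hm0, H1/3)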

PS: Would you recommend using stateful=True or stateful=False?
