How do I fix the following error in my LSTM in Keras? If a RNN is stateful, specify the `batch_input_shape`


I have the following error from running my LSTM in Keras:

ValueError: If a RNN is stateful, it needs to know its batch size. Specify the batch size of your input tensors: 
- If using a Sequential model, specify the batch size by passing a `batch_input_shape` argument to your first layer.
- If using the functional API, specify the batch size by passing a `batch_shape` argument to your Input layer.

I have a Pandas dataframe, a time series indexed by the minute, which I split into training and test sets with X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = .2, shuffle = False). X_train.shape is (27932, 7) and X_test.shape is (6984, 7). I then normalized the data and reshaped X_train and X_test into 3D as:

from sklearn.preprocessing import StandardScaler

# Normalizing the data (scaler fit on the training set only)
ss = StandardScaler()
X_train = ss.fit_transform(X_train)
X_test = ss.transform(X_test)

# Reshaping into 3D: (samples, timesteps, features)
X_train = X_train.reshape(6983, 4, X_train.shape[1])
X_test = X_test.reshape(1746, 4, X_test.shape[1])
y_train = y_train.values.reshape(6983, 4, 1)
y_test = y_test.values.reshape(1746, 4, 1)
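For context, since I pass shuffle = False, the split above is just a chronological cut of the data. A quick numpy sketch with toy data standing in for my real dataframe:

```python
import numpy as np

# Toy stand-in for my full feature matrix: 34916 minute-rows, 7 features
X = np.arange(34916 * 7).reshape(34916, 7)

# With shuffle = False, train_test_split(test_size=.2) is a chronological cut;
# sklearn rounds the test size up with ceil
split = len(X) - int(np.ceil(len(X) * 0.2))
X_train, X_test = X[:split], X[split:]

print(X_train.shape)  # (27932, 7)
print(X_test.shape)   # (6984, 7)
```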

The logic behind reshaping the X's is that I want my LSTM to learn from samples of 4 timesteps (i.e., 4 minutes) each, across 6983 samples. I'd also like to test this against a reshaping of (X_train.shape[0], 1, X_train.shape[1]).
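As a sanity check on that logic (a numpy sketch with toy data standing in for my scaled matrix), row-major reshaping does keep each group of 4 consecutive minute-rows together as one sample:

```python
import numpy as np

# Toy stand-in for the scaled training matrix: 27932 minute-rows, 7 features
X2d = np.arange(27932 * 7).reshape(27932, 7)

# Reshape into (samples, timesteps, features) = (6983, 4, 7)
X3d = X2d.reshape(6983, 4, 7)

# Each sample holds 4 consecutive minute-rows of the original matrix
assert np.array_equal(X3d[0], X2d[0:4])
assert np.array_equal(X3d[1], X2d[4:8])
```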

My LSTM is the following:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, LSTM, Dropout, Dense
from tensorflow.keras.callbacks import EarlyStopping

# Creating our model's structure
model = Sequential()
model.add(Bidirectional(LSTM(4, batch_input_shape = (X_train.shape[0], X_train.shape[1], X_train.shape[2]), return_sequences = True, stateful = True)))
model.add(Dropout(0.2))
model.add(Bidirectional(LSTM(4)))
model.add(Dense(1, activation = 'sigmoid'))
es = EarlyStopping(monitor = 'val_loss', patience = 10)

# Compiling the model
model.compile(loss = 'binary_crossentropy', optimizer = 'adam', metrics = ['Recall'])

# Fitting the model
history = model.fit(X_train, y_train, epochs = 50, verbose = 1, validation_data = (X_test, y_test), callbacks = [es])

The irony is that I get the above error even though I explicitly set batch_input_shape and stateful = True in my first LSTM layer. The model runs without any problem if I instead reshape X_train to (X_train.shape[0], 1, X_train.shape[1]) and X_test to (X_test.shape[0], 1, X_test.shape[1]).

What part of my code is triggering the error?

By the way, I'm not able to improve my LSTM's performance by using more hidden units than shown in the code above. Does that seem surprising?
