I have 11,000 datasets, each with 52 entries corresponding to 52 weeks of data. I want to train a single LSTM model on all 11,000 datasets, because training a separate model on each dataset and predicting per dataset is unlikely to produce a good model when each dataset has only 52 entries; it may also fail to cover all possible cases.
(Please note that all datasets cover the exact same 52 weeks.)
Any suggestions would be of great help. My current approach is:
for i in range(0, 11000):
    # fit and predict separately for every dataset
    model.fit(X_train[i], y_train[i])
    pred[i] = model.predict(X_test[i])
Without further knowledge of the data I would do the following: if the 52 entries within each dataset are related to one another, you could consider turning each dataset into a tokenized sequence and concatenating all of the datasets into one long sequence. LSTMs should be able to learn start-of-sentence and end-of-sentence (SOS and EOS) markers without being explicitly told that the end of a sentence is actually the end.
e.g.:
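Here is a minimal sketch of that concatenation idea, assuming each dataset is a 1-D array of 52 weekly values; the names datasets, SEP_TOKEN and WINDOW are illustrative placeholders, not taken from your post:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

SEP_TOKEN = -1.0   # stands in for an explicit SOS/EOS marker between series
WINDOW = 4         # number of past weeks used to predict the next week

# Placeholder data; you would load your 11,000 series of 52 weeks here.
datasets = [np.random.rand(52) for _ in range(1000)]

# Join all series into one long sequence, separated by the marker token.
long_seq = np.concatenate([np.append(d, SEP_TOKEN) for d in datasets])

# Build (samples, timesteps, features) windows and next-step targets.
X = np.array([long_seq[i:i + WINDOW] for i in range(len(long_seq) - WINDOW)])
y = long_seq[WINDOW:]
X = X.reshape(-1, WINDOW, 1)

model = Sequential([
    LSTM(32, input_shape=(WINDOW, 1)),
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=256)

The separator value is only one way to mark series boundaries; whether the model actually needs it depends on how related your series are.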
If the entries are unrelated to each other, I would need more information about why you wouldn't concatenate the datasets and treat each entry as an individual input. Best practice is to combine all the datasets into a single training set. If you absolutely don't want to do that, you will still have to load each dataset into a variable so that the model can process all of them.
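As a hedged sketch of combining everything into one training set: each of the 11,000 series becomes one sample of shape (52, 1), so a single model is fit on all of them in one call. The arrays X_all and y_all below are assumptions standing in for your stacked data and targets:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

n_series, n_weeks = 11000, 52
X_all = np.random.rand(n_series, n_weeks, 1)   # placeholder for your stacked datasets
y_all = np.random.rand(n_series)               # placeholder targets, one per series

model = Sequential([
    LSTM(64, input_shape=(n_weeks, 1)),
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# One fit call over the combined data replaces the per-dataset loop above.
model.fit(X_all, y_all, epochs=10, batch_size=128, validation_split=0.1)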