I want to forecast a Target using its history and the history of covariates (Cov1, Cov2, Cov3).
I have several samples (id), each with 601 observations (time) of (Target, Cov1, Cov2, Cov3), and I want to train my model (a TemporalFusionTransformer) on the first 60 observations to predict the remaining 541 Target values.
I plan to train/validate my model using a pytorch-forecasting TimeSeriesDataSet object and then test it on unseen samples.
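Roughly, the id-level split I have in mind looks like this (just a sketch; the 80/20 split and the names test_ids, df_train, df_test are placeholders of mine):

import numpy as np

# hold out whole samples (ids) for the final test, keep the rest for train/validation
ids = df["id"].unique()
rng = np.random.default_rng(42)
test_ids = rng.choice(ids, size=int(0.2 * len(ids)), replace=False)

df_test = df[df["id"].isin(test_ids)]    # unseen samples, only used at test time
df_train = df[~df["id"].isin(test_ids)]  # samples used to build the TimeSeriesDataSet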
I have read a lot of pytorch-forecasting TimeSeriesDataSet examples (pytorch-forecasting.readthedocs.io, Kaggle notebooks, data scientist posts like https://towardsdatascience.com/all-about-n-hits-the-latest-breakthrough-in-time-series-forecasting-a8ddcb27b0d5, ...), but most of them subset a single time series into consecutive train/validation/test sets.
I can't find many examples that train on several samples and test on others. So my questions are about data preprocessing prior to fitting my model. Here is the code I used:
from pytorch_forecasting import TimeSeriesDataSet

max_prediction_length = 540
max_encoder_length = 61
training_cutoff = df["time"].max() - max_prediction_length

# training dataset: fixed-length encoder/decoder windows, grouped by sample id
training = TimeSeriesDataSet(
    df[lambda x: x.time <= training_cutoff],
    time_idx="time", target="Target", group_ids=["id"],
    min_encoder_length=max_encoder_length, max_encoder_length=max_encoder_length,
    min_prediction_length=max_prediction_length, max_prediction_length=max_prediction_length,
    time_varying_unknown_reals=["Cov1", "Cov2", "Cov3", "Target"],
)
# creating the validation set (predict=True), i.e. predict the last max_prediction_length points in time for each series:
validation = TimeSeriesDataSet.from_dataset(training, df, predict=True, stop_randomization=True)
# create dataloaders for model:
batch_size = 4
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=0)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size, num_workers=0)
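For completeness, this is roughly how I then fit the TemporalFusionTransformer on these dataloaders (hyperparameters are placeholders, and the import is pytorch_lightning, which may be lightning.pytorch depending on the installed version):

import pytorch_lightning as pl
from pytorch_forecasting import TemporalFusionTransformer
from pytorch_forecasting.metrics import QuantileLoss

# build the TFT from the training dataset so it picks up the encoder/decoder configuration
tft = TemporalFusionTransformer.from_dataset(
    training,
    learning_rate=0.03,  # placeholder hyperparameters, not tuned
    hidden_size=16,
    attention_head_size=1,
    dropout=0.1,
    loss=QuantileLoss(),
)

trainer = pl.Trainer(max_epochs=30, gradient_clip_val=0.1)
trainer.fit(tft, train_dataloaders=train_dataloader, val_dataloaders=val_dataloader)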
I'm not sure whether I should pass df[lambda x: x.time <= training_cutoff], as I see in the code examples I found, instead of the full df, given that I already set min_encoder_length=max_encoder_length, max_encoder_length=max_encoder_length, min_prediction_length=max_prediction_length and max_prediction_length=max_prediction_length as length parameters. I also still don't clearly understand how predict=True, stop_randomization=True and train=True/False work together to differentiate the training and validation sets.
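The only check I have managed so far is comparing the number of samples each dataset exposes, which I assume reflects what predict=True does to the validation set:

# quick sanity check: how many (encoder, decoder) samples does each dataset contain?
# my assumption: predict=True keeps only the last max_prediction_length points of each id
print("training samples:  ", len(training))
print("validation samples:", len(validation))
print("number of ids:     ", df["id"].nunique())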
Any help would be appreciated!