Understanding TimeSeriesDataSet in Pytorch-Forecasting


I have a dataset with 913000 rows:

[data image]

First, let me explain the data.

It is sales data for 10 stores and 50 items from 2013-01-01 to 2017-12-31.

I understand why it has 913000 rows: 2016 is a leap year, so the period covers 1826 days, and 1826 * 10 * 50 = 913000.
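As a quick sanity check of that row count, the arithmetic can be reproduced with the standard library alone. This is only a sketch of the calculation, assuming one row per store, item, and day with no missing dates; the variable names are illustrative.

```python
from datetime import date

# Inclusive number of days between the first and last date in the data.
n_days = (date(2017, 12, 31) - date(2013, 1, 1)).days + 1

# One time series per (store, item) pair.
n_series = 10 * 50

# Assumes every series has exactly one row per day.
n_rows = n_days * n_series

print(n_days, n_series, n_rows)
```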

Anyway, I built my training set:

from pytorch_forecasting import TimeSeriesDataSet

training = TimeSeriesDataSet(
    # keep only rows up to the training cutoff (vectorized boolean mask)
    train_df[train_df["time_idx"] <= training_cutoff],
    time_idx="time_idx",
    target="sales",
    group_ids=["store", "item"],  # list of column names identifying a time series
    max_encoder_length=max_encoder_length,
    max_prediction_length=max_prediction_length,
    # categorical variables that do not change over time (e.g. product length)
    static_categoricals=["store", "item"],
    time_varying_unknown_reals=["sales"],
)

First question: as I understand it, the data passed to TimeSeriesDataSet through the data parameter is the full data minus the prediction horizon (removed by training_cutoff), and the dataset length is then reduced further by max_encoder_length, which is needed for prediction. Is that right? If not, please correct me.

Second question: similarly, this is the output of the code above: [output image]. Why is the length 863500?

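I calculated the length based on my understanding:

prediction horizon removed by training_cutoff: 20 * 50 * 10 = 10000

max_encoder_length needed for prediction: 60 * 50 * 10 = 30000

Thus 913000 - 40000 = 873000.

That is 9500 more than the reported 863500. Where do the missing 9500 go?

I have done my best searching online. Please tell me what I am missing.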
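One possible explanation for the missing 9500, offered as a sketch rather than a definitive answer: the length of a TimeSeriesDataSet counts training samples (sliding windows), not rows. Assuming min_encoder_length and min_prediction_length are left at their defaults (equal to the max values), each sample needs max_encoder_length + max_prediction_length consecutive time steps, so a series with n rows yields n - (encoder + prediction) + 1 samples. The values below (time_idx starting at 0, training_cutoff equal to the last time_idx minus max_prediction_length, encoder length 60, prediction length 20) are assumptions inferred from the numbers in the question, and count_windows is a hypothetical helper, not part of the library.

```python
def count_windows(rows_per_series, n_series, encoder_length, prediction_length):
    """Count fixed-length sliding windows over n_series equally long series."""
    window = encoder_length + prediction_length
    per_series = max(rows_per_series - window + 1, 0)
    return per_series * n_series


n_days = 1826                 # 2013-01-01 .. 2017-12-31, inclusive
n_series = 10 * 50            # one series per (store, item)
max_encoder_length = 60       # assumed from "60 * 50 * 10" in the question
max_prediction_length = 20    # assumed from "20 * 50 * 10" in the question

# Rows left per series after filtering time_idx <= training_cutoff
# (assumes the cutoff drops exactly the last max_prediction_length days).
rows_after_cutoff = n_days - max_prediction_length

n_samples = count_windows(
    rows_after_cutoff, n_series, max_encoder_length, max_prediction_length
)

# The estimate from the question subtracts only 20 + 60 rows per series.
naive = n_days * n_series - (max_prediction_length + max_encoder_length) * n_series

print(n_samples)          # 863500
print(naive - n_samples)  # 9500
```

Under these assumptions the gap is (max_prediction_length - 1) * 500 = 19 * 500 = 9500: after the cutoff, each series loses encoder + prediction - 1 = 79 window positions, not just the 60 encoder steps.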
