I'm working with the Darts timeseries package. I have done some work with timeseries data but mostly for anomaly detection. I have a few questions pertaining to data and validation.
Darts has a feature that lets you pass a validation timeseries (or a sequence of timeseries). I'm aware of training strategies that aim to avoid contaminating the validation process, such as rolling or walk-forward validation (see photo), which make the best use of the data by splitting it into sequential batches so that the validation data of each subsequent set has not yet been seen by the model. However, the model trains in epochs, with a default of 100. If I'm not mistaken, when the model is trained over multiple epochs, isn't this method of dividing the data effectively worthless? I'd assume the best I could do is provide a holdout set that is never trained on.
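To make the splitting scheme I have in mind concrete, here is a minimal plain-Python sketch of walk-forward (expanding-window) splits. It does not use Darts at all; `walk_forward_splits`, `n_folds` and `val_size` are names I made up for illustration, and the numbers are arbitrary.

```python
def walk_forward_splits(n_points, n_folds, val_size):
    """Yield (train_range, val_range) index pairs for expanding-window validation.

    Hypothetical helper for illustration only; not part of Darts.
    """
    first_val_start = n_points - n_folds * val_size
    if first_val_start <= 0:
        raise ValueError("not enough data for the requested folds")
    for fold in range(n_folds):
        val_start = first_val_start + fold * val_size
        # Training data is everything strictly before the validation block.
        yield range(0, val_start), range(val_start, val_start + val_size)


splits = list(walk_forward_splits(n_points=100, n_folds=3, val_size=10))
for train_idx, val_idx in splits:
    print(f"train 0..{train_idx[-1]}  validate {val_idx[0]}..{val_idx[-1]}")
```

In this sketch, however many epochs are run within a fold, they would only revisit that fold's training range; the validation block always lies after the last training index.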
My second question relates to how Darts works, in particular what roles input_chunk_length and output_chunk_length play. From what I can tell, if you have a timeseries of length 100 and provide input_chunk_length=10 and output_chunk_length=5, the data is split into subsequences of length 15 and the model is trained on them, with each subsequence acting as one sample. Assuming that's the case, would it be beneficial to set output_chunk_length equal to the length of the window we want to predict?
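Here is a small plain-Python sketch of the slicing as I understand it. It is not Darts code: `sliding_windows` and its `stride` parameter are hypothetical, and I am assuming overlapping windows that advance one step at a time, which is part of what I would like confirmed.

```python
def sliding_windows(series, input_chunk_length, output_chunk_length, stride=1):
    """Cut a series into (input, target) pairs of total length input + output.

    Hypothetical helper for illustration; stride=1 (overlapping windows) is an assumption.
    """
    window = input_chunk_length + output_chunk_length
    samples = []
    for start in range(0, len(series) - window + 1, stride):
        split = start + input_chunk_length
        # The first part is what the model sees, the second part is what it must predict.
        samples.append((series[start:split], series[split:start + window]))
    return samples


series = list(range(100))
samples = sliding_windows(series, input_chunk_length=10, output_chunk_length=5)
non_overlapping = sliding_windows(series, 10, 5, stride=15)
print(len(samples), len(non_overlapping))
```

Under these assumptions a series of length 100 gives 86 overlapping samples, or 6 if the chunks are not allowed to overlap, so the two readings of "subsequences of length 15" differ a lot in how much training data they produce.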
