Do we have to retrain a time series model every time we want to generate a new forecast?


I am using a SARIMAX model in two steps: (1) training the model and (2) forecasting.

1. Training Step

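import statsmodels.api as sm

# Build the SARIMAX specification from the training target and regressors
model = sm.tsa.statespace.SARIMAX(
    endog=train_y,
    exog=train_X,
    order=(1, 1, 1),
    seasonal_order=(1, 1, 0, 12),
    trend='c'
)
# Estimate the parameters (this is the "training" step)
model_fit = model.fit()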

2. Forecasting Step

predictions = model_fit.predict(
    start=train_size,
    end=(train_size + test_size - 1),
    exog=test_X
)

My Question

Do I need to repeat step 1 (training) every time I want to run step 2 (forecasting)? In other words, is retraining required before every prediction?

The article below says that retraining is needed every time we want to make a prediction.

https://towardsdatascience.com/3-facts-about-time-series-forecasting-that-surprise-experienced-machine-learning-practitioners-69c18ee89387

Oscar Lundberg

No, I think the article and your interpretation of it don't quite match. This is how I read it:

  • The article: When the article states that you have to retrain your model every time you make a prediction, the author is talking about the case where the relationships in the data are changing. In that case, to keep a good model, you need to update (retrain) it.

  • Your question: In step 1, you create the model (passing the endog and exog variables, etc.) and then fit it, which gives you the best-fitting parameters for the given data (endog and exog). In step 2, you take the trained (fitted) model, pass in the exog variables as input, and get predictions for the given test_X data. You are perfectly free to make another prediction (e.g. on a different input, test_X_prime) without retraining the model. Whether that is a good idea, however, is what the article is trying to emphasize.

  • Back to the article again: Consider the data in the article, where the model is first fit during an upward trend and does well at predicting the future. When the trend breaks (new data), the model still has parameters fit to the upward trend and therefore misses the break. The way to mitigate this problem is to fit a new model when new data becomes available.

Hope this clarifies things somewhat.