I am using a SARIMAX model in two steps: first I train the model, then I use it for forecasting.
1. Training Step
import statsmodels.api as sm

# Build the model on the training data, then estimate its parameters
model = sm.tsa.statespace.SARIMAX(
    endog=train_y,
    exog=train_X,
    order=(1, 1, 1),
    seasonal_order=(1, 1, 0, 12),
    trend='c'
)
model_fit = model.fit()
2. Forecasting Step
predictions = model_fit.predict(
    start=train_size,
    end=train_size + test_size - 1,
    exog=test_X
)
My Question
Do I need to repeat step 1 (training) every time I want to do step 2 (forecasting)? In other words, is retraining required for every prediction?
The reference article below says that training is needed every time we want to make a prediction.
No. I think the article and your interpretation of it do not quite match. This is how I read it:
The article: When the article states that you have to retrain your model every time you make a prediction, the author is talking about the case where the relationships in the data are changing. To keep the model good in that case, you need to update (retrain) it.
Your question: In step 1, you create the model (passing in the endog and exog variables, etc.). Then you call fit, which estimates the parameters that best fit the given data (endog and exog). In step 2, you have a trained (fitted) model, to which you supply the future exog variables (the input) and get predictions for a given test_X. You are perfectly free to make another prediction (e.g. for a different input test_X_prime) without retraining the model. Whether that is a good idea is what the article is trying to emphasize.
Back to the article again: Consider the data in the article, where the model is first fitted while there is an upward trend and does well at predicting the future. When the trend breaks (new data), the model still has parameters fitted to the upward trend and therefore misses the break. The way to mitigate this problem is to fit a new model whenever new data becomes available.
Hope this clarifies things somewhat.