How to predict unseen data with auto arima using exogenous variables

I have a conceptual question. I trained an auto_arima model with an exogenous variable, and now I would like to produce forecasts from an existing time series.

My training looked like this:

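from pmdarima import auto_arima

# Fit on the training split, passing the exogenous regressor(s) alongside the series
stepwise_model = auto_arima(train_data, exogenous=exo_train_data, start_p=1, start_q=1,
    max_p=7, max_q=7, seasonal=True, start_P=1, start_Q=1, max_P=7, max_D=7, max_Q=7, m=7,
    d=None, D=None, trace=True, error_action='ignore', suppress_warnings=True, stepwise=True)
# Forecast the test horizon, supplying the exogenous values observed for that period
forecast = stepwise_model.predict(n_periods=len(test_data), exogenous=exo_test_data)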

This works well and gives me the performance values I wanted. But now that I have retrained the model on the complete time series, how can I make predictions when I do not have future values of the exogenous variables?

# Full training:
stepwise_model_final = auto_arima(all_data, exogenous=exo_all_data, start_p=1, start_q=1,
    max_p=7, max_q=7, seasonal=True, start_P=1, start_Q=1, max_P=7, max_D=7, max_Q=7, m=7,
    d=None, D=None, trace=True, error_action='ignore', suppress_warnings=True, stepwise=True)

The .predict function in this case requires me to also specify the exogenous variable, which of course I don't have available now:

n=tbd
forecast_final = stepwise_model_final.predict(n_periods=n,exogenous= ??? )

Am I fundamentally misunderstanding something here?

Would be great if you could help me here. I have already searched the internet but found no answer to my question.

Thank you very much!

There is 1 best solution below

You need the exogenous variables to make the prediction. A model fitted with exogenous variables includes a regression on them in addition to the ARIMA terms, so every forecast step needs the value of each exogenous variable at that step. That is why predict asks for them.
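
Written out, this is the regression-with-ARIMA-errors form that statsmodels' SARIMAX (which pmdarima wraps) uses by default. The symbols below are notation chosen here for illustration, not taken from the question: y_t is the series, x_t the exogenous values at time t, T the last observed time step, and h the forecast step.

```latex
\begin{aligned}
y_t &= \beta^\top x_t + \eta_t, \qquad \eta_t \sim \mathrm{ARIMA}(p, d, q) \\
\hat{y}_{T+h} &= \hat{\beta}^\top x_{T+h} + \hat{\eta}_{T+h}
\end{aligned}
```

The forecast for step T+h contains the term with x_{T+h}, which is a future value of the exogenous variable; without it the forecast cannot be evaluated.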

If you do not have the exogenous variables, you have two options:

  1. Forecast the exogenous variables first (e.g. with a separate ARIMA model) and pass those forecasts to predict.
  2. Drop the exogenous variables and forecast the time series from its own history alone (a plain ARIMA without any exogenous variables).