Statsforecast - AutoARIMA: How to fit a different model for each unique id

I am using the statsforecast package to fit an AutoARIMA model. The code looks like this:

forecast = {}
models = {}
for item in Ids:
    sf = StatsForecast(
        models=[
            AutoARIMA()
        ],
        freq='M'
    )
    d = train_data[train_data['unique_id'] == item]
    sf.fit(d)
    f = sf.predict(h=3)
    forecast[item] = f
    models[item] = sf.fitted_[0][0].model_

I would expect different models with different parameters to be fitted for different ids, because the data varies from one id to another. Instead, the same model is fitted in every case.

When I run this in two separate Jupyter notebooks, I get two different models for two different data sources. But when I run the model for each id, I get only one model for all the ids. I looked into setting different seeds, but couldn't find a way to do it. When I run AutoARIMA, I would expect to see more than one model/set of parameters. How do I get this?

There is 1 best solution below

Most models in statsforecast are local models, meaning one model is trained per unique_id, so you don't need a loop to fit each series separately. You can access the parameters of each fitted model with sf.fitted_[<idx>, 0].model_

import pandas as pd
from statsforecast.models import AutoARIMA
from statsforecast import StatsForecast
from statsforecast.arima import arima_string

Y_df = pd.read_parquet('https://datasets-nixtla.s3.amazonaws.com/m4-hourly.parquet')

uids = Y_df['unique_id'].unique()[:5] # Select 5 ids to make the example faster

Y_df = Y_df.query('unique_id in @uids') 

Y_df = Y_df.groupby('unique_id').tail(7 * 24)  # Select last 7 days of data to make the example faster

# Create a list of models and instantiation parameters
models = [
    AutoARIMA(season_length=24),
]

# Instantiate StatsForecast class as sf
sf = StatsForecast(
    df=Y_df, 
    models=models,
    freq='H', 
    n_jobs=-1
)

forecasts_df = sf.fit().predict(h=48, level=[90])

for model, uid in zip(sf.fitted_, forecasts_df.index.unique()):
    print(uid, ' - ', arima_string(model[0].model_))

sample output
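
To get back the per-id dictionary that the loop in the question was building, the rows of sf.fitted_ can be paired with the ids, the same way the print loop above does. Below is a minimal sketch of that pairing. The helper name models_by_id is hypothetical (it is not part of statsforecast), and the stand-in objects with placeholder contents exist only so the snippet runs without the library installed. It assumes the ids are supplied in the same order as the rows of sf.fitted_.

```python
from types import SimpleNamespace


def models_by_id(fitted, uids, model_idx=0):
    """Map each unique_id to the `model_` of one fitted model column.

    `fitted` is assumed to be laid out like `sf.fitted_` in the answer:
    one row per series, one column per entry in the `models` list.
    `uids` must be in the same order as the rows of `fitted`.
    """
    return {uid: row[model_idx].model_ for uid, row in zip(uids, fitted)}


# Stand-in objects (hypothetical, placeholder contents only).
# Real code would pass sf.fitted_ and the ids used in the print loop.
fake_fitted = [
    [SimpleNamespace(model_={"note": "placeholder for series H1"})],
    [SimpleNamespace(model_={"note": "placeholder for series H10"})],
]

per_id = models_by_id(fake_fitted, ["H1", "H10"])
print(per_id["H10"]["note"])
```

With a real fitted object, the call would look like models_by_id(sf.fitted_, forecasts_df.index.unique()), and each value could then be passed to arima_string to compare the selected orders across ids.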