I'm forecasting highway traffic, and some of the forecasts are not what I expected. It's weird because the exact same code worked for other highways, but for this one I'm getting forecasts in the trillions (I expected 2 or 3 million vehicles).
Briefly, my model uses diff(x), so I difference the data first. Then I use ardlDlm (with an order I got from my boss, who had previously estimated the model in EViews) to estimate the effect of "PIB" (GDP) on "Leves" (light vehicles). All the other variables are dummies, which is why I remove their lags. The estimated model gives decent results that make sense, but when I run the forecast, the values end up in the trillions for some reason. That can't be right: for example, the last "Leves" observation is 2,674,496, so I'm clueless here. Thanks in advance for any help.
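To make the differencing step concrete, here is a toy sketch in base R with made-up numbers (not the Renovias data). It only illustrates the general point that a model fitted on diff(x) forecasts changes, which are turned back into levels by cumulating them onto the last observed value. The names leves_toy and d_forecast are hypothetical, and the forecast values are invented rather than produced by any model.

```r
# Toy level series (made-up numbers, not the Renovias data)
leves_toy <- c(100, 104, 103, 110, 115)

# First differences, analogous to d_Leves = diff(data$Leves);
# the result is one element shorter than the input
d_leves_toy <- diff(leves_toy)

# Hypothetical 3-step-ahead forecasts of the *differenced* series
d_forecast <- c(2, -1, 4)

# Back-transform: cumulate the forecast changes onto the last observed level
last_level <- tail(leves_toy, 1)
level_forecast <- last_level + cumsum(d_forecast)

# diffinv() performs the same integration; it returns the starting
# value as its first element, so that element is dropped here
level_forecast_check <- diffinv(d_forecast, xi = last_level)[-1]

print(level_forecast)
```

In this sketch, comparing d_forecast directly with a level such as 115 would be comparing changes with levels; level_forecast is the quantity on the same scale as the original series.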
The "Renovias" data looks like this:
(https://i.stack.imgur.com/fR9La.png)
First 12 rows:
Date        Leves      Pesados  Q1  Q2  PIB       JAN  FEV  MAR  ABR  MAI  JUN  JUL  AGO  SET  OUT  NOV  Crise  Greve  CES
01/04/1999  1106640.5  758932   1   0   60123.20  0    0    0    1    0    0    0    0    0    0    0    0      0      0
01/05/1999  1218219.5  827682   1   0   60957.24  0    0    0    0    1    0    0    0    0    0    0    0      0      0
01/06/1999  1109770.5  774122   1   0   62562.70  0    0    0    0    0    1    0    0    0    0    0    0      0      0
01/07/1999  1222242.5  757961   1   0   61539.71  0    0    0    0    0    0    1    0    0    0    0    0      0      0
01/08/1999  1078172.5  880780   1   0   61903.36  0    0    0    0    0    0    0    1    0    0    0    0      0      0
01/09/1999  1120372.5  863346   1   0   60594.21  0    0    0    0    0    0    0    0    1    0    0    0      0      0
01/10/1999  1149853.5  869919   1   0   63358.14  0    0    0    0    0    0    0    0    0    1    0    0      0      0
01/11/1999  1094217.0  818265   1   0   65177.98  0    0    0    0    0    0    0    0    0    0    1    0      0      0
01/12/1999  1227005.0  821046   1   0   64274.77  0    0    0    0    0    0    0    0    0    0    0    0      0      0
01/01/2000  1160712.5  780890   1   0   59871.67  1    0    0    0    0    0    0    0    0    0    0    0      0      0
01/02/2000  1015884.0  794188   1   0   59273.23  0    1    0    0    0    0    0    0    0    0    0    0      0      0
01/03/2000  1152619.0  843165   1   0   59664.81  0    0    1    0    0    0    0    0    0    0    0    0      0      0
The script:
library(openxlsx)
library(egcm)
library(aTSA)
library(nortest)
library(data.table)
library(dynamac)
library(AER)
library(dynlm)
library(readxl)
library(stargazer)
library(scales)
library(urca)
library(dLagM)
library(dplyr)
library(ggfortify)
library(nardl)
library(zoo)
library(vars)
library(neuralnet)
library(DMwR2)
library(TTR)
library(quantmod)
library(PerformanceAnalytics)
library(ggplot2)
library(writexl)
library(tseries)
library(ecm)
library(knitr)
library(plotly)
library(tictoc)
library(ARDL)
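# DATA ----
Renovias <- read_excel("Ecopistas/ABCR/data.xlsx", sheet = "Renovias", range = "A1:t298")
# Estimation sample: drop the last 48 rows (kept for forecasting) and the Date column
data <- Renovias[1:(nrow(Renovias) - 48), -1]
# First differences of the continuous series
d_Leves <- diff(data$Leves)
d_Pesados <- diff(data$Pesados)
d_PIB <- diff(data$PIB)
# Drop the first row so the remaining columns line up with the differenced series
d_data <- data[-1, ]
d_data$Leves <- d_Leves
d_data$Pesados <- d_Pesados
d_data$PIB <- d_PIB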
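# LEVES ------
# Regressors for the forecast period (columns 4 to 19), starting at the last
# estimation row so that diff() has a base value for PIB
newdata_leves <- Renovias[nrow(data):nrow(Renovias), 4:19]
PIBdn <- diff(newdata_leves$PIB)
newdata_leves <- newdata_leves[-1, ]
newdata_leves$PIB <- PIBdn
# Transpose so that each regressor is a row and each forecast period a column
transposed_newdata_leves <- t(newdata_leves)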
model_leves <- dLagM::ardlDlm(
  formula = Leves ~ PIB +
    JAN + FEV + MAR + ABR + MAI + JUN +
    JUL + AGO + SET + OUT + NOV + Crise + Greve + Q1 + Q2,
  data = d_data, p = 1, q = 2,
  # Remove lag 1 of every dummy, so only PIB keeps its lag
  remove = list(p = list(JAN = 1, FEV = 1, MAR = 1, ABR = 1,
                         MAI = 1, JUN = 1, JUL = 1, AGO = 1,
                         SET = 1, OUT = 1, NOV = 1, Crise = 1,
                         Greve = 1, Q1 = 1, Q2 = 1))
)
summary(model_leves)
MAPE(model_leves)
# Set any NA coefficients to 0 before forecasting
model_leves[["model"]][["coefficients"]][is.na(model_leves[["model"]][["coefficients"]])] <- 0
prediction_leves <- forecast(model = model_leves,
                             x = transposed_newdata_leves,
                             h = nrow(newdata_leves), interval = TRUE, level = 0.95, nSim = 1000)
I'm sorry I can't share the full data; this is a problem from work, so anything further is confidential.