I'm currently running a regression with several forecasting methods on the same dataset. For the decision tree (DT), the MAE is a little higher than for the AdaBoost (AB) model, while the MAPE is significantly higher for the AB model. How is this possible? I realize that a lower MAE does not necessarily imply a lower MAPE, but this difference is quite significant and got me wondering if I'm doing something wrong.
Decision Tree - MAE: 13.85
Decision Tree - MAPE: 59.77%
AdaBoost - MAE: 11.53
AdaBoost - MAPE: 76.23%
Here are the formulas that I use for the calculation:
MAE:
# sklearn's convention is (y_true, y_pred); MAE is symmetric, so swapping the arguments does not change the value
mae = sklearn.metrics.mean_absolute_error(y_test, y_pred)
MAPE:
import numpy as np

def percentage_error(actual, predicted):
    res = np.empty(actual.shape)
    for j in range(actual.shape[0]):
        if actual[j] != 0:
            res[j] = (actual[j] - predicted[j]) / actual[j]
        else:
            # Avoid dividing by zero: scale the error by the mean of all actuals instead
            res[j] = predicted[j] / np.mean(actual)
    return res

def mean_absolute_percentage_error(y_test, y_pred):
    return np.mean(np.abs(percentage_error(np.asarray(y_test), np.asarray(y_pred)))) * 100
Source for the MAPE formula: https://stackoverflow.com/a/59033147/10603410
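(As an aside: scikit-learn 0.24 and later ship a built-in sklearn.metrics.mean_absolute_percentage_error. Note that it returns a fraction rather than a percentage, and it handles zero actuals differently from the helper above, so the two only agree directly on data with no zeros.)

from sklearn.metrics import mean_absolute_percentage_error as sk_mape

# sklearn returns a fraction; multiply by 100 to compare with the percentage above
mape_pct = sk_mape(y_test, y_pred) * 100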
I hope someone is able to help with this! Thanks!
You can indeed expect that sort of result; it comes down to the shape of each model's error distribution.
Because MAPE divides each error by the corresponding actual value, a mistake at a small value of y_true is penalised far more heavily than the same-sized mistake at a large value. So if your AB regressor is making bigger mistakes at low values of y_true than your DT regressor, but smaller mistakes at higher values of y_true, you could easily see a larger MAPE and yet a lower MAE for the AB regressor.
As a motivating (and hugely contrived) example, let's say we have two models, A and B. Model A makes larger mistakes at low values of y_true but smaller mistakes at high values of y_true, while model B does the opposite: smaller mistakes at low values and larger mistakes at high values. The sketch below puts numbers to this.
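Here is a minimal numeric sketch (the targets and predictions are invented purely for illustration). All actual values are nonzero, so the inline MAPE calculation matches the helper from the question:

import numpy as np
from sklearn.metrics import mean_absolute_error

# A few small targets followed by a few large ones
y_true = np.array([1.0, 2.0, 3.0, 100.0, 200.0, 300.0])

# Model A: errors of 1.5 on the small targets, 0.5 on the large ones
pred_a = np.array([2.5, 3.5, 4.5, 100.5, 200.5, 300.5])

# Model B: errors of 0.1 on the small targets, 5-15 on the large ones
pred_b = np.array([1.1, 2.1, 3.1, 105.0, 210.0, 315.0])

for name, pred in [("A", pred_a), ("B", pred_b)]:
    mae = mean_absolute_error(y_true, pred)
    mape = np.mean(np.abs((y_true - pred) / y_true)) * 100
    print(f"Model {name}: MAE = {mae:.2f}, MAPE = {mape:.2f}%")

# Output:
# Model A: MAE = 1.00, MAPE = 45.99%
# Model B: MAE = 5.05, MAPE = 5.56%

Model A wins comfortably on MAE but loses badly on MAPE, which is exactly the DT-versus-AdaBoost pattern in your results.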