Custom loss function for the tweedie/regression_l1 objective in LGBMRegressor (Python)


I am using the LGBMRegressor model with objective="tweedie" / objective="regression_l1" and I need to create a custom loss function that takes the original objective and adjusts it slightly for business needs. Before I start changing anything, I am trying to implement a custom loss function that exactly matches tweedie/regression_l1, to make sure it behaves the same. I have tried a few standard implementations of tweedie and I got different results. Can someone help?


There are 2 answers below.

Answer 1:

Looking at the C++ implementation in LightGBM's RegressionTweedieLoss(), you could emulate the same implementation in Python and check whether the results still differ (before applying the changes for your business needs).

Something like this (a loose Python translation):

import lightgbm as lgb
import numpy as np

class CustomTweedieLoss:
    def __init__(self, rho=1.5):
        # rho corresponds to LightGBM's tweedie_variance_power (default 1.5).
        self.rho = rho

    def __call__(self, preds, train_data):
        # preds are raw scores (log link): the predicted mean is exp(preds).
        labels = train_data.get_label()
        exp_1_score = np.exp((1 - self.rho) * preds)
        exp_2_score = np.exp((2 - self.rho) * preds)
        # First and second derivatives of the Tweedie negative log-likelihood
        # with respect to the raw score, as in RegressionTweedieLoss.
        grad = -labels * exp_1_score + exp_2_score
        hess = -labels * (1 - self.rho) * exp_1_score + (2 - self.rho) * exp_2_score
        return grad, hess

# Usage:
objective = CustomTweedieLoss(rho=1.5)
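
For completeness, here is a minimal training sketch using the Dataset-style signature above. This is a sketch under assumptions: it assumes LightGBM >= 4.0 (where a callable objective is passed through the params dict rather than the old fobj argument), and X_train, y_train, X_test are placeholders for your data:

# Sketch only; X_train, y_train, X_test are hypothetical placeholders.
train_set = lgb.Dataset(X_train, label=y_train)
params = {"objective": CustomTweedieLoss(rho=1.5), "metric": "tweedie"}
booster = lgb.train(params, train_set, num_boost_round=100)

# With a custom objective, LightGBM returns raw scores, so apply the
# inverse of the log link yourself (the built-in "tweedie" objective
# does this for you):
y_pred = np.exp(booster.predict(X_test))

This raw-score detail is a common source of mismatches when comparing a custom objective against the built-in one.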

The same idea applies to LightGBM's RegressionL1loss():

import numpy as np
import lightgbm as lgb

class CustomL1Loss:
    def __init__(self, weights=None):
        self.weights = weights

    def __call__(self, preds, train_data):
        labels = train_data.get_label()
        diff = preds - labels
        # The gradient of |preds - labels| is the sign of the residual;
        # L1 has no curvature, so a constant hessian of 1 is used,
        # as in RegressionL1loss.
        grad = np.sign(diff)
        hess = np.ones_like(preds)
        if self.weights is not None:
            grad *= self.weights
            hess *= self.weights
        return grad, hess

# Usage:
# Get the weights data, if any, then:
# objective = CustomL1Loss(weights=weights_data)
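
If you are using the sklearn API (LGBMRegressor), note that the callable signature differs from the Dataset-style one above: it takes (y_true, y_pred) and returns (grad, hess). A minimal sketch, with X_train and y_train as placeholders:

# Sketch only; the sklearn API passes (y_true, y_pred), where y_pred
# are raw scores.
def l1_objective(y_true, y_pred):
    diff = y_pred - y_true
    return np.sign(diff), np.ones_like(y_pred)

model = lgb.LGBMRegressor(objective=l1_objective)
model.fit(X_train, y_train)

One caveat worth knowing: the built-in regression_l1 objective additionally renews leaf outputs using residual percentiles after each tree (RenewTreeOutput in the C++ code), which a Python custom objective cannot replicate, so an exact match with the built-in objective should not be expected.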
Answer 2:

You're working with LGBMRegressor using objective="tweedie" or objective="regression_l1" and trying to create a custom loss function that precisely matches the behavior of these built-in objectives, but you've encountered discrepancies in your implementation.

To create a custom loss function that mimics a built-in objective, you have to follow its exact mathematical formulation. For the Tweedie objective, LightGBM minimizes the Tweedie negative log-likelihood (up to an additive constant) with a log link:

obj = tweedie(y, y_hat) = -y * y_hat**(1 - rho) / (1 - rho) + y_hat**(2 - rho) / (2 - rho)

where y is the true target, y_hat = exp(score) is the predicted mean, and rho is the variance power (tweedie_variance_power in LGBMRegressor, default 1.5, with 1 < rho < 2). Sample weights w, if present, multiply each term.
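
As a quick sanity check (my own sketch, not part of the original answer), you can verify numerically that the gradient used in the first answer is the derivative of this loss with respect to the raw score:

import numpy as np

rho = 1.5
y, score = 3.0, 0.2  # arbitrary example values

def loss(s):
    return (-y * np.exp((1 - rho) * s) / (1 - rho)
            + np.exp((2 - rho) * s) / (2 - rho))

eps = 1e-6
numeric_grad = (loss(score + eps) - loss(score - eps)) / (2 * eps)
analytic_grad = -y * np.exp((1 - rho) * score) + np.exp((2 - rho) * score)
print(numeric_grad, analytic_grad)  # the two should agree closely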

For the Regression L1 objective, the objective is the mean absolute error:

obj = mean(|y - y_hat|)

If you're observing different results, the most likely cause is a deviation from these exact formulations, or from the link function and parameters LightGBM uses internally. Make sure your custom loss follows the exact formulation of the objective you're replicating.

Here's an example of a custom loss function for Tweedie:

import numpy as np

def custom_tweedie(y_true, y_pred, rho=1.5):
    # Per-sample Tweedie loss as minimized by LightGBM (negative
    # log-likelihood up to a constant), valid for 1 < rho < 2.
    # y_pred is on the original scale, i.e. exp() of the raw score.
    dev = -y_true * y_pred ** (1 - rho) / (1 - rho) + y_pred ** (2 - rho) / (2 - rho)
    return np.mean(dev)

# Usage:
# loss = custom_tweedie(y_true, y_pred, rho=1.5)
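
To compare this against a model trained with the built-in objective, you can do something like the following sketch (assumptions: X_train, y_train, X_val, y_val are placeholders, and the built-in tweedie objective already returns predictions on the original scale):

import lightgbm as lgb

model = lgb.LGBMRegressor(objective="tweedie", tweedie_variance_power=1.5)
model.fit(X_train, y_train)
y_pred = model.predict(X_val)  # built-in tweedie applies exp() internally
print(custom_tweedie(y_val, y_pred, rho=1.5))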

For the Regression L1, you can use the mean absolute error:

def custom_regression_l1(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

# Usage:
# loss = custom_regression_l1(y_true, y_pred)
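
Note that both functions above return a scalar loss, so they are suitable as evaluation metrics rather than training objectives (a training objective must return per-sample gradients and hessians, as in the first answer). Here is a sketch of wrapping one as an sklearn-API eval metric, with the name and data placeholders being my own:

def l1_eval(y_true, y_pred):
    # sklearn-API eval metric signature:
    # returns (name, value, is_higher_better)
    return "custom_l1", custom_regression_l1(y_true, y_pred), False

# model = lgb.LGBMRegressor(objective="regression_l1")
# model.fit(X_train, y_train, eval_set=[(X_val, y_val)], eval_metric=l1_eval)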

Make sure that the parameters (rho for Tweedie, i.e. tweedie_variance_power) are set to match the configuration you're using in LGBMRegressor.