I am using the following TensorFlow code to fit the breast cancer dataset for a binary classification problem. The dataset has 30 features used to predict whether a sample is cancerous or not. The model is as below:
import numpy as np
import tensorflow as tf
from tqdm import tqdm

def loss(y, y_pred):
    # binary cross-entropy
    return tf.reduce_mean(-y*tf.math.log(y_pred) - (1-y)*tf.math.log(1-y_pred))

class Model:
    def __init__(self, n_feature, lr=0.001):
        self.lr = lr
        # one weight per feature plus one for the bias term
        self.W = tf.Variable(tf.random.uniform((n_feature+1, 1), -1, 1))
        tf.print(self.W)

    def __call__(self, X):
        return tf.sigmoid(tf.matmul(X, self.W))

    def predict(self, X):
        return tf.sigmoid(tf.matmul(X, self.W))

    def predict_classes(self, X):
        a = tf.round(self.predict(X))
        return tf.where(a > 0.5, tf.ones_like(a), tf.zeros_like(a))

    def fit(self, X, y, epochs=1000):
        for epoch in tqdm(range(epochs)):
            with tf.GradientTape() as t:
                y_pred = self.predict(X)
                loss_i = loss(y, y_pred)
            grad_W = t.gradient(loss_i, self.W)
            self.W.assign_sub(grad_W*self.lr)
            print(f"Loss is {loss_i}")
and I am preprocessing the dataset as below (appending a column of ones for the bias as well):
from sklearn.datasets import load_breast_cancer
data = load_breast_cancer()
n_feature = len(data.feature_names)
print(f"No of features: {n_feature}")
# min-max scale each feature to [0, 1]
x = (data.data-data.data.min(axis=0))/(data.data.max(axis=0)-data.data.min(axis=0))
# append a column of ones so the last weight acts as the bias
x = np.concatenate((x, np.ones((len(data.data), 1))), axis=1)
X = tf.convert_to_tensor(x, dtype=tf.float32)
y = tf.convert_to_tensor(data.target, dtype=tf.float32)
I have tried tuning the learning rate between 0.1 and 0.001 and increasing the number of epochs, but it doesn't help. The model reaches a maximum accuracy of 60%, a minimum loss of 0.66, and predicts all samples as True. Is there a problem with how the input data is scaled, or is something else wrong?
A similar model built with the Keras Sequential API performs well (95% accuracy); a rough sketch of such a model is included after the caller code below. Following is the caller code:
model = Model(n_feature=n_feature)
model.fit(X, y)
y_hat = model.predict_classes(X)
from sklearn.metrics import accuracy_score
print(f"Accuracy: {accuracy_score(y.numpy(), y_hat.numpy())}")
The input y has shape (n,), but y_pred has shape (n, 1), which is 2-D. Inside the loss function the two are broadcast against each other, so every entry of y is multiplied with every entry of tf.math.log(y_pred), producing an (n, n) matrix and an incorrect loss. This is avoided by reshaping y into a 2-D tensor of shape (n, 1) so that it matches y_pred.
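For example, a minimal sketch of the shape mismatch and of the fix (toy values, everything else as posted):

import tensorflow as tf

y = tf.constant([0., 1., 1.])                # shape (3,)
y_pred = tf.constant([[0.2], [0.8], [0.6]])  # shape (3, 1)

# (3,) broadcast against (3, 1) gives a (3, 3) matrix, so reduce_mean
# averages n*n terms instead of n and the loss comes out wrong.
print((y * tf.math.log(y_pred)).shape)       # (3, 3)

# Fix: make y a column vector so the element-wise ops line up.
y = tf.reshape(y, (-1, 1))                   # shape (3, 1)
print((y * tf.math.log(y_pred)).shape)       # (3, 1)

In the preprocessing code this means building y as, for example, y = tf.convert_to_tensor(data.target.reshape(-1, 1), dtype=tf.float32), or reshaping the existing tensor afterwards with tf.reshape(y, (-1, 1)).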