I'm training a DeepLabV3+ for semantic segmentation of remote sensing images. I built the model following the Keras tutorial ( https://keras.io/examples/vision/deeplabv3_plus/ ), and it works fine when I use a focal loss (predefined in Keras/TF) for training. The masks are one-hot encoded, with shape (batch size, height, width, channels). As a next step I wanted to test the Dice loss (which I implemented following "Correct Implementation of Dice Loss in Tensorflow / Keras"); it also seemed to work fine but led to a lower IoU. This is how I implemented the loss and trained the model:


import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.optimizers import Adam

def dice_coef(y_true, y_pred, smooth=100):  # calculate the Dice coefficient for one class
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)

    intersection = K.sum(y_true_f * y_pred_f)
    dice = (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)
    return dice

def dice_coef_loss(y_true, y_pred, smooth=100):
    return -dice_coef(y_true, y_pred, smooth)  # turn the Dice coefficient into a loss

# Compute the Dice coefficient for multiple classes and take a simple average.
def dice_coef_multilabel(y_true, y_pred, M=10, smooth=100):  # M is the number of classes
    dice = 0
    for index in range(M):
        dice += dice_coef(y_true[:, :, :, index], y_pred[:, :, :, index], smooth)
    return dice / M

def dice_coef_multi_loss(y_true, y_pred, M=number_classes, smooth=100):  # number_classes (= 10 here) is defined earlier in the script
    return -dice_coef_multilabel(y_true, y_pred, M=M, smooth=smooth)
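
For reference, a quick check on random tensors shaped like my data (illustration only; the values are placeholders) shows that the multi-class Dice indeed returns a single scalar:

rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=(2, 256, 256))             # two dummy masks with 10 classes
y_true = tf.one_hot(labels, depth=10, dtype=tf.float32)      # (2, 256, 256, 10), one-hot like my data
y_pred = tf.nn.softmax(tf.random.normal((2, 256, 256, 10)))  # fake softmax predictions
print(float(dice_coef_multilabel(y_true, y_pred, M=10)))     # one scalar, roughly 0.1 for random predictions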
   

from Seg_Models_own import DeeplabV3Plus

# Instantiate under a new name so the class is not shadowed by its own instance.
model = DeeplabV3Plus(image_size=256, image_channels=4, num_classes=10)

model.compile(optimizer=Adam(learning_rate=1e-4),
              loss=dice_coef_multi_loss,
              metrics=['categorical_accuracy',
                       tf.keras.metrics.MeanIoU(num_classes=10,
                                                sparse_y_true=False,
                                                sparse_y_pred=False)])

# No batch_size argument here: the generators already yield whole batches.
history2 = model.fit(train_img_gen,                # generator yielding the training data
                     steps_per_epoch=steps_per_epoch,
                     epochs=35,
                     verbose=2,
                     validation_data=val_img_gen,  # generator yielding the validation data
                     validation_steps=val_steps_per_epoch)
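
For reference, I believe the per-class loop in dice_coef_multilabel is equivalent to the following vectorized sketch (assuming the channels-last, one-hot layout described above):

def dice_coef_multilabel_vec(y_true, y_pred, smooth=100):
    axes = [0, 1, 2]                                  # sum over batch, height and width
    intersection = K.sum(y_true * y_pred, axis=axes)  # one value per class
    denom = K.sum(y_true, axis=axes) + K.sum(y_pred, axis=axes)
    dice_per_class = (2. * intersection + smooth) / (denom + smooth)
    return K.mean(dice_per_class)                     # simple average over the classes

Both forms compute one Dice score per class over the whole batch and then take the simple average.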

I'd like to combine the focal loss with the Dice loss, since the combination could improve the model's performance. I tried to combine the losses like this (and in a few similar ways):

def dice_coef_multi_loss(y_true, y_pred, M=10, smooth=100):
    return (-dice_coef_multilabel(y_true, y_pred, M=10, smooth=100)
            + tf.keras.losses.CategoricalFocalCrossentropy(y_true, y_pred))

But unfortunately I always receive the following error:

"TypeError: Failed to convert elements of <keras.src.losses.CategoricalFocalCrossentropy object at 0x7f8ec02546d0> to Tensor. Consider casting elements to a supported type. See https://www.tensorflow.org/api_docs/python/tf/dtypes for supported TF dtypes."

I also tried casting the losses and y_true/y_pred to float32. What am I missing here? I thought each loss returns a single scalar, so adding them should be straightforward. Do I have to implement the focal loss myself in my script (similar to the Dice loss)? I'd be thrilled to read what you think!
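
In case I really do have to implement the focal loss myself, this is roughly the sketch I have in mind, following the focal-loss formula from Lin et al. (2017); alpha and gamma are placeholder values and I'm not sure it matches Keras' CategoricalFocalCrossentropy exactly:

# Sketch of a categorical focal loss (Lin et al., 2017); alpha and gamma are placeholders.
def categorical_focal_loss(y_true, y_pred, alpha=0.25, gamma=2.0):
    eps = K.epsilon()
    y_pred = K.clip(y_pred, eps, 1. - eps)      # avoid log(0)
    cross_entropy = -y_true * K.log(y_pred)     # per-pixel, per-class cross-entropy
    weight = alpha * K.pow(1. - y_pred, gamma)  # down-weight well-classified pixels
    return K.mean(K.sum(weight * cross_entropy, axis=-1))

def dice_plus_focal_loss(y_true, y_pred):
    return -dice_coef_multilabel(y_true, y_pred) + categorical_focal_loss(y_true, y_pred)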
