Deeplabv3 validation loss is nan


I had around 360 images, with 25% split off as validation data. I could train Deeplabv3 on those images without any issue. Later I added around 40 more images with their label images, but the model now always gives a validation loss of nan. Sometimes it reports a validation loss value on the very first epoch, but from the second epoch on the validation loss is always nan. The strange thing is that I can still train Unet or any other model on the same data without any problem. When I discarded those 40 images and trained Deeplabv3 again, it worked without any issue. I have checked the labels and everything else in those images, and there seems to be no problem with the new images. Any idea what could cause this issue?
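For reference, the kind of check I ran over the new label images was along these lines (a rough sketch; the directory, file pattern, and NUM_CLASSES are placeholders):

import numpy as np
from pathlib import Path
from PIL import Image

NUM_CLASSES = 2  # placeholder: set to the real number of classes

for mask_path in sorted(Path("new_labels").glob("*.png")):
    mask = np.asarray(Image.open(mask_path))
    # Look for non-finite pixel values and out-of-range class ids.
    if not np.isfinite(mask).all():
        print(mask_path, "contains non-finite values")
    if mask.min() < 0 or mask.max() >= NUM_CLASSES:
        print(mask_path, "has unexpected class ids:", np.unique(mask))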

1 Answer

Assuming you haven't solved this or moved on: if you're using a tf.keras implementation of Deeplabv3, check your DilatedSpatialPyramidPooling layer. In the convolution block of that layer, either comment out the BatchNormalization, or surround it with Flatten and Reshape like this:

from tensorflow.keras import layers

x = layers.Flatten()(x)             # collapse the (1, 1, C) tensor to a flat vector
x = layers.BatchNormalization()(x)  # normalize over the flattened features
x = layers.Reshape((1, 1, -1))(x)   # restore the (1, 1, C) spatial shape
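For context, here is a minimal sketch of the pooling branch inside a DilatedSpatialPyramidPooling layer with that workaround applied, assuming a tf.keras DeepLabV3+ implementation along the lines of the keras.io example; the function and variable names here are illustrative and may differ in your code.

from tensorflow.keras import layers

def pooling_branch(dspp_input, num_filters=256):
    # Global average pooling collapses the feature map to shape (1, 1, C).
    h, w = dspp_input.shape[-3], dspp_input.shape[-2]
    x = layers.AveragePooling2D(pool_size=(h, w))(dspp_input)
    x = layers.Conv2D(num_filters, kernel_size=1, use_bias=False)(x)
    # Workaround: flatten so BatchNormalization never sees the degenerate
    # (1, 1, C) spatial shape, then restore that shape afterwards.
    x = layers.Flatten()(x)
    x = layers.BatchNormalization()(x)
    x = layers.Reshape((1, 1, -1))(x)
    x = layers.ReLU()(x)
    # Upsample back to the original spatial size of the feature map.
    return layers.UpSampling2D(size=(h, w), interpolation="bilinear")(x)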

Seems like there might be some weird behaviour with batch norm and spatial dimensions such as (1, 1, num_channels), but I'm not entirely sure why. It solved the issue for me, however.