Mask-RCNN load_weights exclude layers for training


I want to do transfer learning with the pretrained COCO weights (coco.h5) in the Mask-RCNN model. I am training a single class, so I set NUM_CLASSES to 1 + 1 (one for the new class plus one for the background), whereas the COCO dataset contains 80 classes.
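For context, in the Matterport Mask R-CNN implementation this change is usually made by subclassing `Config` (a sketch; the class and `NAME` value here are placeholders for your own setup):

```python
from mrcnn.config import Config

class SingleClassConfig(Config):
    NAME = "single_class"   # hypothetical project name
    NUM_CLASSES = 1 + 1     # background + one foreground class
```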

When applying the load_weights(weights_path, by_name=True) function, I encountered the following error:

Layer #389 (named "mrcnn_bbox_fc"), weight <tf.Variable 'mrcnn_bbox_fc/kernel:0' shape=(1024, 8) dtype=float32> has shape (1024, 8), but the saved weight has shape (1024, 324).
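The mismatch is just arithmetic: `mrcnn_bbox_fc` predicts 4 bounding-box refinements (dy, dx, log(dh), log(dw)) per class, so its kernel width is `num_classes * 4`. A quick check:

```python
# mrcnn_bbox_fc outputs 4 box deltas per class,
# so its kernel has shape (1024, num_classes * 4).
def bbox_fc_width(num_classes):
    return num_classes * 4

print(bbox_fc_width(1 + 80))  # COCO: background + 80 classes -> 324
print(bbox_fc_width(1 + 1))   # new model: background + 1 class -> 8
```

The same class-dependent sizing is why `mrcnn_class_logits` and `mrcnn_mask` fail too.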

This issue also occurred for the layers mrcnn_class_logits and mrcnn_mask.

Through my research, I found a simple solution: pass exclude=["mrcnn_class_logits", "mrcnn_bbox_fc", "mrcnn_mask"] to load_weights(). However, I am uncertain whether this truly resolves the problem or merely suppresses the error. As I understand it, excluding these layers means they keep their random initialization and must be trained from scratch, so they may need significantly more training than the layers that receive pretrained weights.
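To make clear what I believe `exclude` does, here is a simplified plain-Python sketch of by-name weight loading (toy dictionaries stand in for real layers; this is not the library's actual implementation):

```python
import numpy as np

def load_by_name(model_weights, saved_weights, exclude=()):
    """Copy saved weights into the model where layer names match,
    skipping excluded layers -- a simplified sketch of what
    load_weights(by_name=True, exclude=...) does."""
    loaded, skipped = [], []
    for name, w in model_weights.items():
        if name in exclude or name not in saved_weights:
            skipped.append(name)  # stays randomly initialized -> trained from scratch
        elif saved_weights[name].shape != w.shape:
            raise ValueError(f"shape mismatch for {name}")
        else:
            model_weights[name] = saved_weights[name]
            loaded.append(name)
    return loaded, skipped

# toy example: the backbone layer matches, the class-dependent head does not
saved = {"conv1": np.zeros((3, 3)), "mrcnn_bbox_fc": np.zeros((1024, 324))}
model = {"conv1": np.ones((3, 3)), "mrcnn_bbox_fc": np.ones((1024, 8))}
loaded, skipped = load_by_name(model, saved, exclude=["mrcnn_bbox_fc"])
print(loaded, skipped)  # ['conv1'] ['mrcnn_bbox_fc']
```

Without the exclude list, the sketch hits the same shape mismatch as the real error above; with it, the mismatched head is simply left untrained.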

Upon visually inspecting my results, I see no discernible difference between epoch 1 and epoch 40, which I take as a hint that excluding these layers is the problem. After the first epoch, I resumed training from the epoch-1 .h5 file without excluding any layers, and this did not raise an error.

Therefore, I have two questions:

  1. Is my understanding correct?
  2. Is there a way to utilize these weights despite the difference in the number of classes?