Problem Description: I've retrained a ResNet101 model to incorporate new classes. The original model had classes labeled from 0 to 999. For the new classes, I adjusted their labels to start from 1000 onwards (i.e., 1000, 1001, 1002, etc.). However, when testing the retrained model, it predicts the new classes with old indices (like 3) instead of the expected new indices (like 1002). If you need any more details or information to better understand the problem, please let me know.
Here is the part of the code that initializes the model:
import os
import torch
from torch.nn import init
from torchvision.models import resnet101

# Load the saved model's weights into a state dict
saved_weights = torch.load(args.model_path)

# Determine the maximum label (number of old classes - 1) from the saved fc layer
max_label = saved_weights['fc.weight'].shape[0] - 1

model = resnet101()

# Adjust the final layer to match the number of classes in the saved model, then load the weights
num_features = model.fc.in_features
model.fc = torch.nn.Linear(num_features, max_label + 1)
model.load_state_dict(saved_weights)

# Count the new classes from the training folder and expand the final layer
num_new_classes = len(os.listdir(os.path.join(args.data_path, 'train')))
total_classes = max_label + 1 + num_new_classes

weights = model.fc.weight.data
biases = model.fc.bias.data
model.fc = torch.nn.Linear(in_features=model.fc.in_features, out_features=total_classes)

with torch.no_grad():
    # Copy the old classes' weights into the first rows of the expanded layer
    model.fc.weight[:total_classes - num_new_classes] = weights
    model.fc.bias[:total_classes - num_new_classes] = biases
    # Initialize the rows for the new classes
    init.kaiming_uniform_(model.fc.weight[total_classes - num_new_classes:], mode='fan_in', nonlinearity='relu')
    init.zeros_(model.fc.bias[total_classes - num_new_classes:])

model.train()
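For context, the retraining itself goes through Avalanche, but conceptually each update amounts to something like the following (a simplified sketch, not my actual strategy code; train_loader and the optimizer hyperparameters below are placeholders):

# Conceptual training step (sketch): the offset labels are used directly as
# targets against the expanded, total_classes-way output
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)  # placeholder hyperparameters

for images, labels in train_loader:      # labels are already offset to 1000+
    optimizer.zero_grad()
    logits = model(images)               # shape: [batch, total_classes]
    loss = criterion(logits, labels)
    loss.backward()
    optimizer.step()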
While testing the retrained model, I noticed that its predictions are highly accurate: images are assigned to their correct categories, the confidence scores are consistently high, and the model clearly recognizes the features of the new classes. The issue is the class indices it outputs. Instead of predicting the new classes with the adjusted indices starting from 1000, it predicts them with the original indices (like 3, 4, etc.). This mismatch between the high accuracy and the incorrect indices is what puzzles me.
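For reference, this is roughly how I read off the predicted indices at test time (a simplified sketch rather than my exact evaluation code; test_loader is assumed to yield batches of preprocessed images together with their offset labels):

model.eval()
with torch.no_grad():
    for images, labels in test_loader:
        logits = model(images)                     # shape: [batch, 1005]
        preds = logits.argmax(dim=1)               # predicted class indices 0..1004
        print("predicted:", preds[:10].tolist())   # comes out as e.g. 3, 4, ...
        print("expected: ", labels[:10].tolist())  # offset labels, e.g. 1000, 1002, ...
        break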
Environment Details:
- Python version: 3.8
- Avalanche for continual learning during model training
- ResNet101 from PyTorch (torchvision)
What I Tried:
- Previous Attempts: Before adjusting the labels, I tried fine-tuning the model with the original labels, but ran into class overlap between the old and new classes.
- Determined the max_label: Before adjusting the labels, I read max_label from the saved model's fc weights to calculate the offset.
# Load the saved model's weights into a dictionary
saved_weights = torch.load(args.model_path)
# Determine the maximum label from the saved model's weights
max_label = saved_weights['fc.weight'].shape[0] - 1
print("Max label from saved model:", max_label)
Output:
Max label from saved model: 999
- Verified Training Data Labels: I checked the labels in the training and test datasets after applying the offset.
offset = max_label + 1
train_dataset.targets = [label + offset for label in train_dataset.targets]
test_dataset.targets = [label + offset for label in test_dataset.targets]
print("Sample offset train labels after adjusting:", train_dataset.targets[:100])
print("Sample offset test labels after adjusting:", test_dataset.targets[:100])
Output:
Sample offset train labels after adjusting: [1000, 1000, ... , 1003, 1003]
Sample offset test labels after adjusting: [1000, 1000, ... , 1003, 1003]
- Checked Model's Architecture: I printed the shape of the model's final layer during testing to ensure it matches the expected number of classes.
print("Model's final layer shape:", model.fc.weight.shape)
Output:
Model's final layer shape: torch.Size([1005, 2048])
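One further check I intend to run is printing the top-scoring output indices for a single new-class image, to see exactly which of the 1005 rows receive the highest logits (a sketch; image here stands for one preprocessed input tensor of shape [1, 3, 224, 224]):

model.eval()
with torch.no_grad():
    logits = model(image)                    # shape: [1, 1005]
    top_vals, top_idx = logits.topk(5, dim=1)
    print("Top-5 indices:", top_idx[0].tolist())
    print("Top-5 logits:", top_vals[0].tolist())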