I am currently training a similarity-learning neural network with triplet loss, using semi-hard negative mining to learn the features. The trained model has high accuracy (recall = 88%) but low average precision.
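For reference, semi-hard mining picks negatives that are farther from the anchor than the positive but still within the margin. Below is a minimal NumPy sketch of that selection rule (not my exact mining code; the function and argument names are placeholders):

import numpy as np

def pick_semi_hard_negative(anchor_emb, positive_emb, negative_embs, margin=1.6):
    # Squared L2 distance from the anchor to its positive.
    d_ap = np.sum((anchor_emb - positive_emb) ** 2)
    # Squared L2 distances from the anchor to every candidate negative.
    d_an = np.sum((negative_embs - anchor_emb) ** 2, axis=1)
    # Semi-hard negatives: farther than the positive but still inside the
    # margin, i.e. d_ap < d_an < d_ap + margin.
    semi_hard = np.where((d_an > d_ap) & (d_an < d_ap + margin))[0]
    if len(semi_hard) > 0:
        # Hardest of the semi-hard candidates.
        return semi_hard[np.argmin(d_an[semi_hard])]
    # Fall back to the easiest negative if no semi-hard candidate exists.
    return np.argmax(d_an)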
Specifics:
- Margin used in the triplet loss = 1.6.
- Embeddings are L2-normalized.
- Image pairs with the lowest squared distance are identified as matches.
- Scores are negated and sorted in decreasing order (the lowest distance represents the highest confidence).
- The PR (precision-recall) curve is plotted over the predictions after sorting by decreasing confidence (high-confidence scores are plotted first, followed by lower-confidence ones); see the sketch below.
- Confidence (score) = -(squared distance between the two image embeddings)
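For reference, this is roughly how I compute the scores and the PR curve (a minimal sketch using scikit-learn; the distances and labels arguments are placeholders for the per-pair squared embedding distances and ground-truth match flags):

import numpy as np
from sklearn.metrics import precision_recall_curve, average_precision_score

def score_and_pr(distances, labels):
    # distances: squared L2 distance between the two embeddings of each pair
    # labels:    1 if the pair is a true match, 0 otherwise
    scores = -np.asarray(distances)   # lower distance -> higher confidence
    precision, recall, _ = precision_recall_curve(labels, scores)
    ap = average_precision_score(labels, scores)
    return precision, recall, ap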
Problem:
- The PR curve dips up until recall = 0.05, i.e. the highest-confidence predictions are surprisingly bad.
- Precision improves after recall = 0.05.
Question: How can I investigate this and improve the precision of the highest-confidence predictions?
Any thoughts or pointers?
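For concreteness, this is the kind of inspection I have in mind for the top of the ranking (a rough sketch; scores, labels and pair_ids are placeholder arrays):

import numpy as np

def top_confidence_false_matches(scores, labels, pair_ids, k=50):
    # scores:   confidence = -(squared distance), one value per image pair
    # labels:   1 if the pair is a true match, 0 otherwise
    # pair_ids: identifiers for each pair (e.g. filename tuples)
    order = np.argsort(scores)[::-1][:k]   # k highest-confidence pairs
    return [pair_ids[i] for i in order if labels[i] == 0]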
What I have tried:
- Tested for bugs; the code looks good and the accuracy (high recall) is correct.
- Checked validation accuracy (randomly visualized triplets; see the sketch after this list).
- Lowered ALPHA to 0.2 (the default from the triplet-loss paper), but that yields lower recall (accuracy) and lower average precision.
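For reference, the random triplet visualization looks roughly like this (a sketch; the image arrays are placeholders and assumed to be uint8-range images):

import numpy as np
import matplotlib.pyplot as plt

def show_random_triplets(anchors, positives, negatives, n=3):
    # anchors, positives, negatives: image arrays with matching indices
    idx = np.random.choice(len(anchors), size=n, replace=False)
    fig, axes = plt.subplots(n, 3, figsize=(9, 3 * n), squeeze=False)
    for row, i in enumerate(idx):
        for col, (img, title) in enumerate(
                zip((anchors[i], positives[i], negatives[i]),
                    ("anchor", "positive", "negative"))):
            axes[row][col].imshow(img.astype("uint8"))
            axes[row][col].set_title(title)
            axes[row][col].axis("off")
    plt.tight_layout()
    plt.show()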
Code:

from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Lambda
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

ALPHA = 1.6            # triplet-loss margin (see "Specifics" above)
IM_SIZE = (224, 224)   # placeholder input size
imsize = 224           # placeholder size used in complete_model
LR = 1e-4              # placeholder learning rate

def triplet_loss(x, alpha=ALPHA):
    # Triplet loss: the positive should be closer to the anchor than
    # the negative by at least the margin alpha.
    anchor, positive, negative = x
    # squared distance between the anchor and the positive
    pos_dist = K.sum(K.square(anchor - positive), axis=1)
    # squared distance between the anchor and the negative
    neg_dist = K.sum(K.square(anchor - negative), axis=1)
    # hinge on (pos_dist - neg_dist + margin)
    basic_loss = pos_dist - neg_dist + alpha
    loss = K.maximum(basic_loss, 0.0)
    return loss
def identity_loss(y_true, y_pred):
    # The Lambda layer already outputs the per-triplet loss, so this
    # "loss" just averages it; y_true is ignored.
    return K.mean(y_pred)
def my_norm(ip):
    # L2-normalize the embeddings so distances live on the unit hypersphere.
    return K.l2_normalize(ip, axis=-1)
def embedding_model():
    # Embedding model: ResNet50 backbone plus a small projection head,
    # ending in L2-normalized 256-d embeddings.
    base_cnn = keras.applications.ResNet50(
        weights="imagenet", input_shape=IM_SIZE + (3,), include_top=False)
    flatten = keras.layers.Flatten()(base_cnn.output)
    drop1 = keras.layers.Dropout(rate=0.25)(flatten)
    dense1 = keras.layers.Dense(256, activation="relu")(drop1)
    dense1 = keras.layers.BatchNormalization()(dense1)
    output = keras.layers.Dense(256)(dense1)
    output = Lambda(my_norm)(output)

    # Freeze the backbone up to conv5_block1_out; fine-tune the layers after it.
    trainable = False
    for layer in base_cnn.layers:
        if layer.name == "conv5_block1_out":
            trainable = True
        layer.trainable = trainable

    mdl = Model(inputs=base_cnn.input, outputs=output, name="Embedding")
    return mdl
def complete_model(base_model, alpha=0.2):
    # Build the triplet model: the same embedding network is applied to the
    # anchor, positive and negative inputs, and the Lambda layer computes the
    # triplet loss between their embeddings.
    # Note: the alpha argument here is unused; Lambda(triplet_loss) falls back
    # to the module-level ALPHA.
    input_1 = Input((imsize, imsize, 3))
    input_2 = Input((imsize, imsize, 3))
    input_3 = Input((imsize, imsize, 3))

    A = base_model(input_1)
    P = base_model(input_2)
    N = base_model(input_3)

    loss = Lambda(triplet_loss)([A, P, N])
    model = Model(inputs=[input_1, input_2, input_3], outputs=loss)
    model.compile(loss=identity_loss, optimizer=Adam(LR))
    return model
def get_model_name():
    return "resnet50Reg0.8"

def preprocess(x):
    # Standard ResNet50 ImageNet preprocessing.
    return keras.applications.resnet50.preprocess_input(x)
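For completeness, this is roughly how the pieces above are wired together and trained (a sketch with dummy data; the random batches only demonstrate the expected shapes, and IM_SIZE, imsize and LR are the placeholder values defined above):

import numpy as np

base = embedding_model()
model = complete_model(base)

# Dummy anchor / positive / negative batches, only to show the expected shapes.
a = np.random.randint(0, 255, size=(8, imsize, imsize, 3)).astype("float32")
p = np.random.randint(0, 255, size=(8, imsize, imsize, 3)).astype("float32")
n = np.random.randint(0, 255, size=(8, imsize, imsize, 3)).astype("float32")
dummy_y = np.zeros(8)  # ignored by identity_loss

model.fit([preprocess(a), preprocess(p), preprocess(n)], dummy_y,
          epochs=1, batch_size=8)

# At test time only the embedding model is used; image pairs are ranked by the
# squared distance between their L2-normalized embeddings.
emb_a = base.predict(preprocess(a))
emb_p = base.predict(preprocess(p))
squared_dist = np.sum((emb_a - emb_p) ** 2, axis=1)
scores = -squared_dist  # higher score = higher confidence of a match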