How to improve a similarity learning neural network with low precision but high recall?


I am currently training a similarity learning neural network with triplet loss, using semi-hard negative mining to select the triplets. The trained model has high accuracy (recall = 88%), but its average precision is low.
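For context, the semi-hard selection I use is roughly the rule below (a simplified NumPy sketch of the per-batch logic; dists, labels and the function name are placeholders, not my actual pipeline):

import numpy as np

def semi_hard_negative(dists, anchor_idx, pos_idx, labels, margin=1.6):
    # Pick a semi-hard negative for one (anchor, positive) pair:
    # farther from the anchor than the positive, but still inside the margin,
    # i.e. d(a, p) < d(a, n) < d(a, p) + margin.
    # dists is the matrix of squared distances between batch embeddings.
    d_ap = dists[anchor_idx, pos_idx]
    neg_mask = labels != labels[anchor_idx]          # different identity
    semi_hard = neg_mask & (dists[anchor_idx] > d_ap) & (dists[anchor_idx] < d_ap + margin)
    candidates = np.where(semi_hard)[0]
    if len(candidates) == 0:                         # fall back to the hardest negative
        candidates = np.where(neg_mask)[0]
    return candidates[np.argmin(dists[anchor_idx, candidates])]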

Specifics:

  • Margin used in the triplet loss = 1.6.
  • Embeddings are L2-normalized.
  • Image pairs with the lowest squared distance are identified as matches.
  • Scores are the negated squared distances, sorted in decreasing order (lower distance means higher confidence).
  • The PR (precision-recall) curve is plotted over the predictions after sorting by decreasing confidence (highest-confidence scores are plotted first, followed by lower-confidence ones).

Confidence (or score) = - (squared distance between the two image embeddings)
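Concretely, the scoring and PR computation look roughly like this (a simplified sketch using scikit-learn; emb1, emb2 and y_true are placeholders for my evaluation pairs):

import numpy as np
from sklearn.metrics import precision_recall_curve, average_precision_score

def pair_scores(emb1, emb2):
    # Confidence = negative squared L2 distance between each pair of embeddings.
    sq_dist = np.sum((emb1 - emb2) ** 2, axis=1)
    return -sq_dist  # lower distance -> higher confidence

# emb1, emb2: (n_pairs, 256) L2-normalized embeddings; y_true: 1 = match, 0 = non-match
scores = pair_scores(emb1, emb2)
precision, recall, _ = precision_recall_curve(y_true, scores)
ap = average_precision_score(y_true, scores)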

Problem:

  • The PR curve dips up to recall = 0.05, i.e., the highest-confidence predictions are pretty bad.
  • Precision improves after recall = 0.05.

Question: How can I investigate and improve the precision of the highest-confidence scores? Any thoughts or pointers? (see the attached precision-recall curve)
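One thing I am considering is dumping the top-confidence false positives for visual inspection, roughly like this (a sketch; scores, y_true and pair_paths are placeholders for my evaluation data):

import numpy as np

def top_false_positives(scores, y_true, pair_paths, k=20):
    # Return the k highest-confidence pairs that are actually non-matches.
    order = np.argsort(-scores)              # most confident first
    fp = [i for i in order if y_true[i] == 0][:k]
    return [(pair_paths[i], scores[i]) for i in fp]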

What I have tried:

  • Tested for bugs; the code looks good and the accuracy (high recall) figure is correct.
  • Checked validation accuracy (randomly visualized pairs of triplets).
  • Lowered ALPHA to 0.2 (the default from the triplet loss paper), but that yields lower recall (accuracy) and lower average precision.
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Lambda
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# ALPHA (margin), IM_SIZE, imsize and LR are defined elsewhere in my code.

def triplet_loss(x, alpha=ALPHA):
    # Triplet loss: pull anchor-positive together, push anchor-negative apart by the margin.
    anchor, positive, negative = x
    # squared distance between the anchor and the positive
    pos_dist = K.sum(K.square(anchor - positive), axis=1)
    # squared distance between the anchor and the negative
    neg_dist = K.sum(K.square(anchor - negative), axis=1)
    # hinge on the margin
    basic_loss = pos_dist - neg_dist + alpha
    loss = K.maximum(basic_loss, 0.0)
    return loss

def identity_loss(y_true, y_pred):
    # The Lambda layer already outputs the per-triplet loss, so just average it.
    return K.mean(y_pred)

def my_norm(ip):
    # L2-normalize embeddings along the feature axis.
    return K.l2_normalize(ip, axis=-1)


def embedding_model():
    # Embedding model: ResNet50 backbone with a 256-d L2-normalized head.
    base_cnn = keras.applications.ResNet50(weights="imagenet", input_shape=IM_SIZE + (3,), include_top=False)
    flatten = keras.layers.Flatten()(base_cnn.output)
    drop1 = keras.layers.Dropout(rate=0.25)(flatten)
    dense1 = keras.layers.Dense(256, activation="relu")(drop1)
    dense1 = keras.layers.BatchNormalization()(dense1)
    output = keras.layers.Dense(256)(dense1)
    output = Lambda(my_norm)(output)

    # freeze the backbone up to conv5_block1_out; fine-tune only the last ResNet block
    trainable = False
    for layer in base_cnn.layers:
        if layer.name == "conv5_block1_out":
            trainable = True
        layer.trainable = trainable

    mdl = Model(inputs=base_cnn.input, outputs=output, name="Embedding")
    return mdl


def complete_model(base_model, alpha=0.2):
    # Create the complete model with three
    # embedding models and minimize the loss
    # between their output embeddings
    input_1 = Input((imsize, imsize, 3))
    input_2 = Input((imsize, imsize, 3))
    input_3 = Input((imsize, imsize, 3))

    A = base_model(input_1)
    P = base_model(input_2)
    N = base_model(input_3)

    # NOTE: Lambda(triplet_loss) uses the global ALPHA margin; the alpha argument above is unused.
    loss = Lambda(triplet_loss)([A, P, N])
    model = Model(inputs=[input_1, input_2, input_3], outputs=loss)
    model.compile(loss=identity_loss, optimizer=Adam(LR))
    return model


def get_model_name():
    return "resnet50Reg0.8"

def preprocess(x):
    return keras.applications.resnet50.preprocess_input(x)
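
For completeness, this is roughly how the pieces are wired together during training (simplified; triplet_gen and EPOCHS are placeholders for my data generator and schedule):

# Build the embedding network, wrap it in the triplet model,
# and fit on (anchor, positive, negative) batches.
base = embedding_model()
siamese = complete_model(base)
# triplet_gen yields ([anchors, positives, negatives], dummy_targets) batches,
# with triplets chosen by the semi-hard mining step above (placeholder name).
siamese.fit(triplet_gen, epochs=EPOCHS)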
