Softmax output and probabilities not matching up?


I'm trying to test how well a GPT model can classify verbs according to the left-side context in a given input sentence with a masked term. For example,

Input sentence:

"The ballerinas' costumes that the thieves stole from the theatre last night [MASK] found at the abandoned condo."

Input answer choices: "are", "is" and "were".

Desired output: conditional probability of each of the three answers according to the model.

Ideally, if the model is performing well, the correct answer ("were") should have the highest softmax value, and therefore the highest probability percentage. But this isn't the case for me.

!pip install transformers


import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel


# Load tokenizer and model
model_name = "gpt2"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

# Set the device to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

def calculate_conditional_probabilities(context, answer_choices):
    context_tokens = tokenizer.encode(context, add_special_tokens=True, return_tensors="pt")
    context_tokens = context_tokens.to(device)

    conditional_probs = []
    for choice in answer_choices:
        # Encode the choice and convert it to a tensor
        choice_tokens = torch.tensor(tokenizer.encode(choice, add_special_tokens=True)).unsqueeze(0).to(device)

        # Combine context and choice into a single input
        input_ids = torch.cat((context_tokens, choice_tokens), dim=-1)

        # Generate predictions using the model
        with torch.no_grad():
            logits = model(input_ids).logits

        # Calculate the conditional probability of the choice
        choice_id = tokenizer.encode(choice, add_special_tokens=True)[0]
        choice_prob = torch.softmax(logits[0, -1, :], dim=-1)[choice_id].item()
        conditional_probs.append(choice_prob)

    return conditional_probs

# Test the function
input_sentence = "The ballerinas' costumes that the thieves stole from the theatre last night [MASK] found at the abandoned condo."
answer_choices = ["are", "is", "were"]
conditional_probs = calculate_conditional_probabilities(input_sentence, answer_choices)

# Print the softmax outputs directly
for choice, prob in zip(answer_choices, conditional_probs):
    print(f"Softmax output of '{choice}':")
    print(prob)
    prob_percentage = round(prob * 100, 2)
    print(f"Conditional probability of '{choice}': {prob_percentage:.2f}%")

The output:

Softmax output of 'are':
3.5287127957417397e-06
Conditional probability of 'are': 0.00%
Softmax output of 'is':
5.8688110584625974e-05
Conditional probability of 'is': 0.01%
Softmax output of 'were':
1.574901915546434e-07
Conditional probability of 'were': 0.00%

Shouldn't the softmax values add up to 1, or close to 1? Also, how come the output with the lowest softmax value has the highest probability % (the opposite of what should be the case)? I also find it hard to believe that GPT can't be more than 0.01% confident about any of the answer choices in such a simple grammar problem...
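On the "add up to 1" part: the softmax in the code above is taken over the model's entire vocabulary, so only the sum over all tokens is 1. The three values picked out for the answer choices are a small slice of that distribution and need not sum to anything in particular. Here is a minimal sketch in plain Python with a made-up six-token vocabulary (the logits and candidate indices are hypothetical, chosen only for illustration):

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a toy vocabulary of six tokens.
toy_logits = [4.0, 2.5, 1.0, 0.5, -1.0, -3.0]
probs = softmax(toy_logits)

# Pretend tokens 2, 3 and 5 are the three answer choices.
candidate_ids = [2, 3, 5]
picked = [probs[i] for i in candidate_ids]
picked_sum = sum(picked)

# Renormalizing over the candidates gives values that do sum to 1.
renormalized = [p / picked_sum for p in picked]

print("sum over whole vocabulary:", sum(probs))
print("sum over the three candidates:", picked_sum)
print("renormalized candidates:", renormalized)
```

If what you want is "which of these three does the model prefer", the renormalized values are probably the more readable number to report.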

Edit: I realize the softmax values and the percentages do match up (I completely glossed over the e-06 exponents). But I'm still wondering why none of the output probabilities is higher than 0.01% for such a simple grammar problem (simple verb conjugation: choosing the verb's number given a subject), especially with GPT.
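A few things in the code above may explain the tiny values. GPT-2's tokenizer has no [MASK] token, so the marker is split into ordinary text pieces. Each choice is appended after the full sentence, including the final period. `logits[0, -1, :]` is the distribution for the token that follows the last input token, so it predicts what comes after the appended choice, not the choice itself. And `tokenizer.encode("were")` without a leading space gives a different token from the mid-sentence " were". Below is a rough sketch of one way to score the choices from the left context only. The helper names `split_at_mask` and `score_choices` are hypothetical, it assumes each choice's first token (with a leading space) is what should be scored, and I'm not claiming it makes "were" come out on top:

```python
def split_at_mask(sentence, mask="[MASK]"):
    """Return the text to the left and to the right of the mask marker."""
    left, _, right = sentence.partition(mask)
    return left.rstrip(), right.lstrip()


def score_choices(sentence, choices, model_name="gpt2"):
    """Score each choice as the next token after the left context."""
    # Imported here so split_at_mask can be used without these packages.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained(model_name)
    model = GPT2LMHeadModel.from_pretrained(model_name).eval()

    # A left-to-right model only sees what precedes the gap.
    left, _ = split_at_mask(sentence)
    context_ids = tokenizer.encode(left, return_tensors="pt")

    with torch.no_grad():
        # The last position holds the prediction for the token after `left`.
        next_token_logits = model(context_ids).logits[0, -1, :]
    probs = torch.softmax(next_token_logits, dim=-1)

    # Assumption: a leading space, and only the first token of each choice.
    choice_ids = [tokenizer.encode(" " + choice)[0] for choice in choices]
    full = {c: probs[i].item() for c, i in zip(choices, choice_ids)}

    # Probabilities renormalized over the listed choices only.
    total = sum(full.values())
    relative = {c: p / total for c, p in full.items()}
    return full, relative
```

Calling `score_choices(input_sentence, answer_choices)` returns both the full-vocabulary probabilities and the values renormalized over the three choices. The right-hand side of the sentence is ignored here, which matches the "left-side context" setup described in the question.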
