Where is perplexity calculated in the Huggingface gpt2 language model code?


I see some GitHub comments saying that the loss returned by the model() call is a perplexity: https://github.com/huggingface/transformers/issues/473

But when I look at the relevant code... https://huggingface.co/transformers/_modules/transformers/modeling_openai.html#OpenAIGPTLMHeadModel.forward

    if labels is not None:
        # Shift so that tokens < n predict n
        shift_logits = lm_logits[..., :-1, :].contiguous()
        shift_labels = labels[..., 1:].contiguous()
        # Flatten the tokens
        loss_fct = CrossEntropyLoss()
        loss = loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
        outputs = (loss,) + outputs

    return outputs  # (loss), lm_logits, (all hidden states), (all attentions)

I see cross entropy being calculated, but no transformation into perplexity. Where does the loss finally get transformed? Or is there a transformation already there that I'm not understanding?
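
For reference, here is a minimal sketch (assuming a recent transformers version with GPT2LMHeadModel and GPT2Tokenizer; the input sentence is made up) that recomputes the loss the same way the forward() snippet above does, to check what the model actually returns:

    import torch
    from torch.nn import CrossEntropyLoss
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    input_ids = tokenizer("The cat sat on the mat.", return_tensors="pt").input_ids

    with torch.no_grad():
        # Passing labels makes the model return (loss, lm_logits, ...)
        loss, lm_logits = model(input_ids, labels=input_ids)[:2]

    # Redo the shift-and-flatten from the forward() code above.
    shift_logits = lm_logits[..., :-1, :].contiguous()
    shift_labels = input_ids[..., 1:].contiguous()
    manual_loss = CrossEntropyLoss()(
        shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1)
    )

    print(loss.item(), manual_loss.item())  # the two values agree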

There are 2 answers below.

BEST ANSWER

Ah ok, I found the answer. The code is actually returning cross entropy. In the GitHub comment where they say it is perplexity, they are saying that because the OP does

    return math.exp(loss)

which transforms the cross entropy into perplexity :)
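
So a minimal sketch of computing perplexity from the Hugging Face GPT-2 loss looks like this (assuming a recent transformers version; the sample text is made up):

    import math

    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    input_ids = tokenizer("The quick brown fox jumps over the lazy dog.",
                          return_tensors="pt").input_ids

    with torch.no_grad():
        # With labels passed in, the first output is the mean cross entropy (in nats).
        loss = model(input_ids, labels=input_ids)[0]

    perplexity = math.exp(loss.item())  # the exp() is what turns cross entropy into perplexity
    print(loss.item(), perplexity)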

ANSWER

No LaTeX, no problem. By definition, the perplexity (PP) is:

PP(p) = e^(H(p))

where H stands for the entropy. In the general case we have the cross entropy:

PP(p, q) = e^(H(p, q))

Here e is the base of the natural logarithm, which is also the base PyTorch uses when computing entropy and cross entropy, so exponentiating the returned loss with e gives the perplexity directly.
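
As a quick sanity check (a toy sketch with made-up logits): uniform predictions over a 10-word vocabulary give a cross entropy of ln(10) nats, and exponentiating that recovers a perplexity of 10:

    import torch
    from torch.nn import CrossEntropyLoss

    vocab_size = 10
    logits = torch.zeros(5, vocab_size)          # equal score for every word -> uniform predictions
    labels = torch.randint(0, vocab_size, (5,))  # any targets will do here

    h = CrossEntropyLoss()(logits, labels)       # H(p, q) = ln(10) ≈ 2.3026 nats
    pp = torch.exp(h)                            # PP = e^H ≈ 10.0
    print(h.item(), pp.item())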