In section 4.1 (Normalized Stupid Backoff) of "One Billion Word Benchmark ..." by Chelba, Mikolov, et al., it states:
... the Stupid Backoff model does not generate normalized probabilities. For the purpose of computing perplexity ... values output by the model were normalized over the entire LM vocabulary.
Assuming a bigram LM, the obvious way to interpret this is: score all single words using the unigram MLE, score pairs of words using the standard backoff formula, sum the unigram and bigram scores to obtain a total Σ, and then divide each (unigram or bigram) score by Σ. Is this the correct interpretation of the quote?
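To make my interpretation concrete, here is a small sketch of what I have in mind for a fixed history: unseen bigrams back off to the unigram MLE (weighted by the usual backoff factor, which I'm assuming to be 0.4), and the normalizer Σ is the sum of all scores over the vocabulary. The corpus, function names, and the choice of α here are my own illustrative assumptions, not taken from the paper.

```python
from collections import Counter

# Toy corpus; counts and vocabulary are purely illustrative.
tokens = "the cat sat on the mat the cat ran".split()
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
N = len(tokens)
vocab = set(tokens)
ALPHA = 0.4  # assumed backoff weight

def sb_score(prev, word):
    """Unnormalized Stupid Backoff score for a bigram model."""
    if (prev, word) in bigrams:
        # Seen bigram: relative frequency given the history.
        return bigrams[(prev, word)] / unigrams[prev]
    # Unseen bigram: back off to the weighted unigram MLE.
    return ALPHA * unigrams[word] / N

def normalized_score(prev, word):
    """Divide the raw score by Σ, the sum of scores over the vocabulary."""
    sigma = sum(sb_score(prev, w) for w in vocab)
    return sb_score(prev, word) / sigma
```

By construction, for a fixed history the normalized scores sum to 1 over the vocabulary, which is what I understand "normalized over the entire LM vocabulary" to require for computing perplexity.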