What is the meaning of the number in front of each word in an LDA model's topics?


When we train an LDA model, we get a set of topics as output. Each word in a topic has a number in front of it, for example:

topic - 0.004*great + 0.004*good + 0.004*like + 0.003*well + 0.003*best + 0.003*better 

What is the meaning of this number?


The numbers are probabilities. Each one is the probability of that word given the topic, so a higher number means the word is more likely to be drawn once that topic has been selected in the process of generating text. Within a single topic, the probabilities over the full vocabulary sum to 1; your output only shows the top few words.
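To make this concrete, here is a small sketch using the weights from your output (treated as hypothetical P(word | topic) values; the real topic covers the whole vocabulary, so these few weights do not sum to 1 on their own). Sampling with these weights draws higher-probability words proportionally more often:

```python
import random

# Hypothetical top-word weights for one topic, copied from the question.
# Each value approximates P(word | topic); the full distribution over the
# entire vocabulary sums to 1, but only the top words are shown here.
topic = {"great": 0.004, "good": 0.004, "like": 0.004,
         "well": 0.003, "best": 0.003, "better": 0.003}

words = list(topic)
weights = list(topic.values())

# Draw one word: random.choices normalizes the weights internally,
# so "great" is drawn more often than "better" in the long run.
word = random.choices(words, weights=weights, k=1)[0]
print(word)
```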

If you used your LDA model to create a text, it would first roll a die to select a topic distribution for the document (a set of numbers, similar to the ones in your post, which determine how likely each topic is to occur in that text). It would then roll a die to select one of the topics from that distribution, and roll another die to select a word from that topic. It repeats the last two steps for every word in the document.
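The generative process above can be sketched in a few lines. Everything here is illustrative (made-up topics, words, and mixture weights, and the document's topic distribution is fixed rather than drawn from a Dirichlet prior as full LDA does), but the two per-word dice rolls match the description:

```python
import random

# Two toy topics: each maps to (words, P(word | topic)) - invented numbers.
topics = {
    0: (["great", "good", "like"], [0.5, 0.3, 0.2]),
    1: (["code", "python", "bug"], [0.4, 0.4, 0.2]),
}

# The document's topic distribution (the "first die" roll, fixed here
# for simplicity; real LDA draws it from a Dirichlet prior).
doc_topic_dist = [0.7, 0.3]

def generate_word():
    # Second die: pick a topic for this word position.
    topic_id = random.choices([0, 1], weights=doc_topic_dist)[0]
    words, probs = topics[topic_id]
    # Third die: pick a word from that topic's distribution.
    return random.choices(words, weights=probs)[0]

# Repeat the two per-word rolls for every position in the document.
document = [generate_word() for _ in range(8)]
print(document)
```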

Most of the time the model is used in reverse: by looking at existing texts, you try to find the parameters that yield a model which is likely to have generated the texts you have.