Training an LDA model with gensim from an external tf-idf matrix and term list

608 Views Asked by Ziyuan At 27 November 2014 at 19:37

I already have a tf-idf matrix, with rows for terms and columns for documents. Now I want to train an LDA model on this term-document matrix. The first step seems to be using gensim.matutils.Dense2Corpus to convert the matrix into the corpus format. But how do I construct the id2word parameter? I have the list of terms (the number of terms equals the number of rows), but I don't know the format of the dictionary, so I cannot build it with functions like gensim.corpora.Dictionary.load_from_text. Any suggestions? Thank you.

There is 1 best solution below

Radim On 09 December 2014 at 12:15

id2word must map each id (integer) to its term (string).

In other words, it must support id2word[123] == 'koala'.

A plain Python dict is the easiest option.
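
As a rough sketch of how this could fit together with the matrix from the question: the term list is assumed to be in the same order as the matrix rows, so a term's row index serves as its id. The helper names build_id2word and train_lda, and the num_topics default, are made up for illustration and are not part of gensim.

```python
def build_id2word(terms):
    """Map each integer row index to its term string, e.g. {0: 'koala', ...}."""
    return dict(enumerate(terms))


def train_lda(tfidf, terms, num_topics=2):
    """Train LDA on a dense terms-by-documents matrix (hypothetical helper)."""
    # Imported here so build_id2word stays usable without gensim installed.
    from gensim.matutils import Dense2Corpus
    from gensim.models import LdaModel

    # Assumption: tfidf is a 2-D NumPy array with one row per term.
    if tfidf.shape[0] != len(terms):
        raise ValueError("expected one matrix row per term")

    # documents_columns=True: each column of the matrix is one document.
    corpus = Dense2Corpus(tfidf, documents_columns=True)
    return LdaModel(corpus, id2word=build_id2word(terms), num_topics=num_topics)
```

Calling train_lda(matrix, terms) on a NumPy array with one row per term should return a model whose print_topics() output shows the term strings rather than bare integer ids.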
