How to test the trained NMF topic model on new text


I have created an NMF topic model in Python. The code snippet is as follows:

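from sklearn import decomposition
from sklearn.feature_extraction.text import TfidfVectorizer

def select_vectorizer(req_ngram_range=(1, 2)):
    # ngram_range must be a (min_n, max_n) tuple; recent scikit-learn versions reject a list
    ngram_lengths = tuple(req_ngram_range)
    vectorizer = TfidfVectorizer(analyzer='word', ngram_range=ngram_lengths, stop_words='english', min_df=2)
    #print("User specified custom stopwords: {} ...".format(str(custom_stopwords)[1:-1]))
    return vectorizer

vectorizer = select_vectorizer([2,5])
# new_review_list is the list of review strings used for training
X = vectorizer.fit_transform(new_review_list)

# Note: in scikit-learn >= 1.2, alpha is replaced by alpha_W / alpha_H
clf = decomposition.NMF(n_components=20, random_state=3, alpha = .1).fit(X)
# Note: in scikit-learn >= 1.2, use get_feature_names_out() instead
vocab = vectorizer.get_feature_names()
print_top_words(clf, vocab, num_top_words)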

This produced 20 topics, for example:

Topic #0:
[u'blocks available', u'delivery blocks available', u'notifications blocks', u'notifications blocks available', u'new blocks', u'know blocks available', u'new blocks available', u'know blocks', u'open blocks available', u'available work', u'zero blocks', u'like blocks', u'notification blocks', u'day blocks', u'slow blocks', u'10 blocks', u'option set', u'logged 10', u'notification blocks available', u'day blocks available']
Topic #1:
[u'amazon flex', u'working amazon', u'amazon flex app', u'working amazon flex', u'hello amazon', u'hello amazon flex', u'flex delivery', u'amazon flex delivery', u'flex team', u'amazon flex team', u'work amazon', u'amazon flex support', u'flex support', u'work amazon flex', u'deliver amazon', u'hi amazon flex', u'hi amazon', u'deliver amazon flex', u'signed amazon', u'love amazon']

...and so on.

Now I want to apply this trained model to new texts, so that each new text is assigned to one of these topics. How do I do that?
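
One possible approach, sketched here under assumptions rather than as a definitive answer: reuse the already fitted vectorizer and NMF model, calling `transform` (not `fit_transform`) on the new texts so that the vocabulary and topics stay fixed, then take the highest-weighted topic for each text. The toy corpus, the vectorizer settings, the two-topic model, and the helper name `assign_topics` below are all hypothetical and chosen only so the example runs on its own; in the setup above, the fitted `vectorizer` and `clf` would be passed in instead.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy stand-in for new_review_list (hypothetical data, for illustration only).
train_docs = [
    "no delivery blocks available today",
    "notifications for new blocks available",
    "zero blocks available all day",
    "working for amazon flex is great",
    "hello amazon flex support team",
    "love the amazon flex app",
]

# Assumed settings for the toy corpus; the question uses ngram_range=(2, 5), min_df=2.
vectorizer = TfidfVectorizer(analyzer="word", ngram_range=(1, 2), stop_words="english")
X = vectorizer.fit_transform(train_docs)
clf = NMF(n_components=2, random_state=3, max_iter=500).fit(X)


def assign_topics(texts, vectorizer, model):
    """Return (dominant topic index per text, full document-topic weight matrix)."""
    # transform() reuses the vocabulary and IDF weights learned during fitting.
    X_new = vectorizer.transform(texts)
    # NMF.transform() keeps the learned topics fixed and solves only for the weights.
    W_new = model.transform(X_new)
    return W_new.argmax(axis=1), W_new


new_texts = ["are there any blocks available", "question for amazon flex support"]
topic_ids, weights = assign_topics(new_texts, vectorizer, clf)
for text, topic, row in zip(new_texts, topic_ids, weights):
    print(f"{text!r} -> topic #{topic}, weights={np.round(row, 3)}")
```

A text that shares no n-grams with the training vocabulary is likely to get near-zero weights for every topic, in which case the argmax is not meaningful; checking the row maximum against a threshold before assigning a topic may help.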
