SSVD for dimensionality reduction + clustering

89 Views Asked by Darsh Sofyan At 30 July 2025 at 06:54

I ran Mahout's ssvd to apply LSA (latent semantic analysis). My corpus consists of text documents, each with many features (from 100 to 2000 terms). I would like to use LSA on the documents to find the top terms or phrases that appear together, i.e. the "concepts". Does anyone have an idea how I can do that?

I applied preprocessing (tokenization, stopword removal, stemming, ...), created the TF-IDF vectors with Mahout, and then ran the ssvd command:

bin/mahout ssvd -i termVectors/tfidf-vectors/part-r-00000 -no Output Folder -c 200 -us true -U false -V false -t 1 -ow -pca true

I used Mahout's clusterdump to parse the results, but all the terms in the results start with the letter "a*" and do not represent any concept.

Does anyone have experience with using ssvd to reduce the features before clustering, or any idea how to use ssvd to show the concepts in a text corpus?

Thank you
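
To make the goal concrete, here is a minimal sketch of what "top terms per concept" means in LSA, written with NumPy rather than Mahout. The toy matrix, the vocabulary, and the function name top_terms_per_concept are made up for illustration and are not part of the Mahout pipeline above. In the factorization A ≈ U Σ Vᵀ of a document-by-term matrix, the term loadings of each concept sit in V, so the sketch ranks terms by the absolute value of their entries in each right singular vector. Note that the command above passes -V false, so it is an open question (not verified here) whether that run produces anything that can be mapped back to terms.

```python
import numpy as np

# Hypothetical toy corpus: rows are documents, columns are term weights.
VOCAB = ["cluster", "kmeans", "centroid", "svd", "matrix", "rank"]
DOC_TERM = [
    [2, 1, 1, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [1, 1, 2, 0, 0, 0],
    [0, 0, 0, 3, 2, 2],
    [0, 0, 0, 2, 3, 2],
    [0, 0, 0, 2, 2, 3],
]


def top_terms_per_concept(doc_term, vocab, n_concepts=2, n_terms=3):
    """Return, for each of the leading concepts, the terms with the largest loadings."""
    a = np.asarray(doc_term, dtype=float)
    # Rows of vt are the right singular vectors, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(a, full_matrices=False)
    concepts = []
    for loadings in vt[:n_concepts]:
        # The sign of a singular vector is arbitrary, so rank by absolute loading.
        order = np.argsort(-np.abs(loadings))[:n_terms]
        concepts.append([vocab[i] for i in order])
    return concepts


concepts = top_terms_per_concept(DOC_TERM, VOCAB)
for idx, terms in enumerate(concepts):
    print(f"concept {idx}: {terms}")
```

On this toy matrix the two leading concepts separate the clustering vocabulary from the SVD vocabulary. On a real corpus the same ranking would be applied to the V output of the decomposition, using the dictionary that maps column indices back to terms.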


