Using LDA in the Galago search engine

I have started using Galago for document retrieval. I want to cluster a set of documents (initially retrieved with any model) using LDA. I would prefer a Java-based implementation that I can integrate into my Galago code. I'd appreciate it if you could tell me which open-source implementation of LDA is most suitable for this purpose.

Thank you in advance for your help!

Best answer

There's a fast algorithm for LDA-style topic modeling described in this paper:

S. Arora, R. Ge, Y. Halpern, D. Mimno, A. Moitra, D. Sontag, Y. Wu, M. Zhu. A Practical Algorithm for Topic Modeling with Provable Guarantees. 30th International Conference on Machine Learning (ICML), 2013.

It has a Java implementation by one of the authors (D. Mimno) on GitHub: https://github.com/mimno/anchor

I've tried this implementation briefly, and it produced good results quickly. As with all LDA/topic modeling, choosing the right number of topics can be challenging.
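
Once a topic model has been fit, one simple way to turn it into clusters is to assign each document to its highest-weight topic. The sketch below only illustrates that step. It assumes you have already obtained a document-topic matrix (one row per document, one column per topic) from whichever library you use; the class and method names are hypothetical and are not part of Galago or the anchor project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: groups documents by their dominant topic.
// The input matrix is assumed to come from an already-trained topic model.
public class DominantTopicClustering {

    // Returns the index of the largest weight in one document's topic distribution.
    // Ties are resolved in favor of the lowest topic index.
    public static int dominantTopic(double[] topicWeights) {
        int best = 0;
        for (int k = 1; k < topicWeights.length; k++) {
            if (topicWeights[k] > topicWeights[best]) {
                best = k;
            }
        }
        return best;
    }

    // Maps each topic index to the list of document indices assigned to it.
    // Topics that no document selects do not appear in the result.
    public static Map<Integer, List<Integer>> clusterByDominantTopic(double[][] docTopic) {
        Map<Integer, List<Integer>> clusters = new TreeMap<>();
        for (int d = 0; d < docTopic.length; d++) {
            int topic = dominantTopic(docTopic[d]);
            clusters.computeIfAbsent(topic, t -> new ArrayList<>()).add(d);
        }
        return clusters;
    }

    public static void main(String[] args) {
        // Made-up weights for four documents and three topics, for illustration only.
        double[][] docTopic = {
            {0.70, 0.20, 0.10},
            {0.10, 0.80, 0.10},
            {0.60, 0.30, 0.10},
            {0.05, 0.15, 0.80}
        };
        System.out.println(clusterByDominantTopic(docTopic));
    }
}
```

In a Galago setting, the row indices would correspond to the positions of the retrieved documents in your result list, so you can map each cluster back to document identifiers. Hard assignment like this discards the mixed-membership information in the model, so it is worth checking how sensitive the clusters are to the number of topics.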