What is the difference between OnlineLDA and EMLDA in Spark?

We are working on a project where we need to run LDA for topic identification. We used OnlineLDA for this, but we get an OOM (out-of-memory) exception when we try to increase the number of iterations.

So we tried switching to EMLDA to see whether it scales better.

My question is: which one is better in terms of performance, memory management, etc.?

Corpus size: 86k+ documents

Number of topics: 2000+
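
For a rough sense of why such a large topic count can strain memory, here is a back-of-the-envelope sketch. It assumes (this is not stated above) that the optimizer holds a dense topics-by-vocabulary matrix of 8-byte doubles. The vocabulary size of 100,000 is a made-up placeholder, since I have not given the real one, and the helper name `dense_topic_matrix_bytes` is hypothetical.

```python
def dense_topic_matrix_bytes(num_topics: int, vocab_size: int,
                             bytes_per_value: int = 8) -> int:
    """Bytes needed for one dense num_topics x vocab_size matrix."""
    return num_topics * vocab_size * bytes_per_value


NUM_TOPICS = 2000      # lower bound from the figures above
VOCAB_SIZE = 100_000   # hypothetical: the real vocabulary size is not given

size_bytes = dense_topic_matrix_bytes(NUM_TOPICS, VOCAB_SIZE)
size_gib = size_bytes / 2**30  # convert bytes to GiB

print(f"{size_gib:.2f} GiB per copy of the matrix")
```

If the assumption holds, every extra copy of that matrix (for example, one per broadcast or per intermediate update) costs the same amount again, so the real vocabulary size matters a lot here.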

P.S. On a side note, we will need to apply LDA to streaming data in the future. I see a Spark ticket for it, but it seems to be still on hold, so if someone can suggest a workaround, that would be great. Thanks!
