LDA in Java

Part of the term project in distributed systems.

Sequential Version
Author: Kai Zhen
Date: 10/05/2016 11:45pm

Run it in Eclipse.

Arguments: toyinput.txt toyoutput.txt 2000 1.0 0.1 2 10
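The README does not say what each argument means. The sketch below shows one plausible reading, offered as an assumption rather than the project's actual parser: input file, output file, number of sampling iterations, the Dirichlet hyperparameters alpha and beta, number of topics, and number of top words to print per topic. The last two match the sample output (2 topics, 10 words each). The class name `LdaArgs` and all field names are hypothetical.

```java
// Hypothetical holder for the seven command-line arguments.
// The meaning assigned to each position is an assumption, not taken from the project source.
final class LdaArgs {
    final String inputPath;   // e.g. toyinput.txt
    final String outputPath;  // e.g. toyoutput.txt
    final int iterations;     // assumed meaning of "2000"
    final double alpha;       // assumed meaning of "1.0"
    final double beta;        // assumed meaning of "0.1"
    final int numTopics;      // "2", matches the two topics in the sample output
    final int topWords;       // "10", matches "Top 10 words" in the sample output

    private LdaArgs(String[] a) {
        inputPath = a[0];
        outputPath = a[1];
        iterations = Integer.parseInt(a[2]);
        alpha = Double.parseDouble(a[3]);
        beta = Double.parseDouble(a[4]);
        numTopics = Integer.parseInt(a[5]);
        topWords = Integer.parseInt(a[6]);
    }

    /** Parses exactly seven arguments and rejects any other count. */
    static LdaArgs parse(String[] args) {
        if (args.length != 7) {
            throw new IllegalArgumentException(
                "Expected 7 arguments: input output iterations alpha beta topics topWords");
        }
        return new LdaArgs(args);
    }

    public static void main(String[] args) {
        // Demonstrates parsing with the example argument line from this README.
        LdaArgs parsed = parse("toyinput.txt toyoutput.txt 2000 1.0 0.1 2 10".split(" "));
        System.out.println(parsed.numTopics + " topics, " + parsed.topWords + " words each");
    }
}
```

In Eclipse, the argument line goes under Run Configurations, on the Arguments tab, in the "Program arguments" box.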

Output file

=====================Top 10 words for topic 0=====================
protests
man
tiananmen
party
beijing
china
square
tanks
chinese
tank
=====================Top 10 words for topic 1=====================
learning
bayesian
human
data
models
cognitive
computational
concepts
language
people
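A listing like the one above can be produced from a topic-by-word count matrix by sorting each topic's row by count. This is a minimal sketch and not necessarily how this project does it: the class `TopWords`, its method names, and the toy counts in `main` are made up for illustration. Only the header format is copied from the sample output.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: picks the highest-count words per topic and formats them
// with the same header line as the sample output.
final class TopWords {
    private static final String BAR = "=====================";

    /** Returns up to n words with the highest counts for one topic, highest first. */
    static List<String> topWords(String[] vocab, int[] counts, int n) {
        List<Integer> ids = new ArrayList<>();
        for (int w = 0; w < vocab.length; w++) {
            ids.add(w);
        }
        // Sort word ids by descending count; the sort is stable, so ties keep vocabulary order.
        ids.sort(Comparator.comparingInt((Integer w) -> counts[w]).reversed());
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, ids.size()); i++) {
            top.add(vocab[ids.get(i)]);
        }
        return top;
    }

    /** Formats every topic as a header line followed by one word per line. */
    static String format(String[] vocab, int[][] topicWordCounts, int n) {
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < topicWordCounts.length; k++) {
            sb.append(BAR).append("Top ").append(n)
              .append(" words for topic ").append(k).append(BAR).append('\n');
            for (String word : topWords(vocab, topicWordCounts[k], n)) {
                sb.append(word).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Toy vocabulary and counts, invented for this example only.
        String[] vocab = {"china", "data", "tank"};
        int[][] counts = {{5, 0, 3}, {1, 7, 0}};
        System.out.print(format(vocab, counts, 2));
    }
}
```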

To Do...

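LDA works best when each topic carries roughly the same weight in the corpus. If 4 documents are mainly about topic A, 4 about topic B, and only 1 about topic C, the under-represented topic C is likely to be hard to recover as a separate topic. This imbalance still needs to be addressed.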