I am running a corpus of news articles through LDA (using Vowpal Wabbit). The results are nice, but some real-world topics, football for example, are spread over a number of LDA topics.
So I was thinking: if I run PCA on top of the LDA output, I could perhaps separate the topics even better. It could be a good unsupervised topic-discovery method.
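
To make the idea concrete, here is a minimal sketch of what I have in mind. It assumes the LDA output has already been loaded as a document-by-topic matrix of proportions. The toy matrix, the function name `pca_on_topics`, and the use of NumPy are all illustrative assumptions, not actual Vowpal Wabbit output.

```python
import numpy as np


def pca_on_topics(doc_topic, n_components=2):
    """PCA via SVD on a (n_docs, n_topics) matrix of LDA topic proportions.

    Returns the component loadings (one row per component, one column per
    LDA topic), the explained-variance ratios, and the per-document scores.
    """
    X = np.asarray(doc_topic, dtype=float)
    # Center each LDA topic column so the SVD captures covariance.
    X_centered = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    scores = X_centered @ components.T
    return components, explained[:n_components], scores


# Hypothetical toy data: 6 documents, 4 LDA topics.
# Topics 0 and 1 stand in for one real-world theme split in two,
# topics 2 and 3 for another.
doc_topic = np.array([
    [0.45, 0.45, 0.05, 0.05],
    [0.50, 0.40, 0.05, 0.05],
    [0.40, 0.50, 0.05, 0.05],
    [0.05, 0.05, 0.45, 0.45],
    [0.05, 0.05, 0.50, 0.40],
    [0.05, 0.05, 0.40, 0.50],
])

components, explained, scores = pca_on_topics(doc_topic)

# LDA topics whose loadings on a component share a sign tend to
# rise and fall together across documents.
print("loadings of first component:", np.round(components[0], 2))
print("explained variance ratio:", np.round(explained, 3))
```

In this toy case the first component groups topics 0 and 1 against topics 2 and 3, which is the kind of merging I am hoping for. On real data it would only work if the split topics actually co-occur in the same documents, and since topic proportions sum to 1 per document, the columns are not independent to begin with.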
What do you think?