Let's say I have to generate topics over a bunch of documents. First, I assume (correct me if I'm wrong) that all the documents have the same number of words, N. The part after this is what confuses me. Do we then choose the topics and label each of them with a certain probabilistic weight? Do we have to supply the topics to the LDA model ourselves?
The "LDA Model" section of this blog says:
Choose a topic mixture for the document (according to a Dirichlet distribution over a fixed set of K topics). For example, assuming that we have the two food and cute animal topics above, you might choose the document to consist of 1/3 food and 2/3 cute animals.
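To make the quoted step concrete, here is a minimal sketch of how I read it, using only the Python standard library. The two topic word lists, the Dirichlet parameters `alphas`, and the helper names `sample_topic_mixture` and `generate_document` are all made up for illustration; they are not from the blog. Note that this generative story treats the K topics as already fixed, which is exactly the part in question.

```python
import random

def sample_topic_mixture(alphas, rng):
    """Draw one topic mixture from Dirichlet(alphas) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(n_words, topic_word_probs, alphas, rng):
    """Generate one document: pick a topic mixture, then a topic and a word per position."""
    theta = sample_topic_mixture(alphas, rng)
    words = []
    for _ in range(n_words):
        # Pick a topic for this word position according to the mixture.
        topic = rng.choices(range(len(theta)), weights=theta)[0]
        # Pick a word from that topic's word distribution.
        vocab, probs = zip(*topic_word_probs[topic].items())
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words

rng = random.Random(0)
# Hypothetical word distributions for the two example topics.
topics = [
    {"broccoli": 0.5, "banana": 0.5},  # "food"
    {"kitten": 0.5, "puppy": 0.5},     # "cute animals"
]
theta, words = generate_document(6, topics, alphas=[1.0, 1.0], rng=rng)
print("topic mixture:", theta)
print("document:", words)
```

Here `theta` plays the role of the "1/3 food and 2/3 cute animals" mixture in the quote; each run with a different seed gives a different mixture and a different document.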
This does not make sense to me; I always thought LDA would generate the topics by itself.
Please correct me if I am wrong anywhere. I would like to be clear about this.
Thanks!