Is there an implementation of the Author-Topic model proposed by Rosen-Zvi et al. (http://www.datalab.uci.edu/author-topic/398.pdf) that scales to large data sets? I have about 20 million texts, each about 500 words long, and I'd like to estimate per-author topic distributions so I can build a distance metric between authors based on their interests (possibly the symmetric Kullback-Leibler divergence, as they use in the paper).
The distributed implementations I know of (Mr. LDA and Gensim) aren't extensible enough to support an Author-Topic model without non-trivial rewriting. I'd also like a dynamic implementation if at all possible. Maybe I'm asking too much?
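For concreteness, here is a rough sketch of the author distance I have in mind once the per-author topic distributions exist. It assumes each author is represented by a probability vector over topics; the function name `symmetric_kl` and the smoothing constant `eps` are my own choices for illustration, not anything from the paper or from a library. It sums the two directed KL terms (some definitions halve this).

```python
import numpy as np

def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL divergence between two topic distributions.

    p, q: 1-D sequences of non-negative topic weights for two authors.
    eps:  small smoothing constant (an assumption) to avoid log(0).
    """
    # Adding eps creates fresh arrays, so the inputs are not modified.
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    # Renormalise so each vector sums to 1 after smoothing.
    p /= p.sum()
    q /= q.sum()
    # KL(p || q) + KL(q || p)
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

# Hypothetical topic distributions for three authors over four topics.
author_a = [0.70, 0.20, 0.05, 0.05]
author_b = [0.65, 0.25, 0.05, 0.05]
author_c = [0.05, 0.05, 0.20, 0.70]

print(symmetric_kl(author_a, author_b))  # small: similar interests
print(symmetric_kl(author_a, author_c))  # larger: different interests
```

With 20 million texts the number of author pairs could be far too large to compare exhaustively, so in practice this would probably only be computed for candidate pairs.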