My friend and I are working on a recommendation project, and I've recently been reading a lot about recommendation engines. kNN and SVD (or some other matrix factorization) seem to be many authors' clear preferences, but the examples always compute similarity from user-generated ratings (i.e., numerical values).
Is there a blog post, Stack Exchange answer, or codebase you can point me to for computing similarity based on text instead? Perhaps some part of NLTK? Any help is enormously appreciated. Thank you in advance.