I'm one of the authors of RTextTools: a free, open-source machine learning package for semi-automated text classification. I've been working with social scientists and software engineers over the past year to create a simple yet functional R package that classifies text documents into discrete categories.
I'm sure there are statistical methods that haven't been implemented in RTextTools that could improve its functionality and accuracy. I'd really appreciate it if you could test the package (in R 2.14+, install.packages("RTextTools")) and provide some feedback. Thank you in advance for your help!
EDIT: If you'd like more details, RTextTools has a website.