Hey /r/machinelearning--
I don't see too many [question] posts here, so I hope I'm not in the wrong sub. If so, please point me to a better option.
Currently I am using scikit-learn to classify text documents. I badly need to lower the dimensionality of my data, and I have begun doing so by attempting to implement some feature selection classes. The only problem is that they're not working very well.
I found this streamhacker post, but I am less familiar with NLTK, so I was hoping to learn of other feature selection options (e.g., low-information feature elimination) before I get started with it.
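To make the question concrete, here is a minimal sketch of one such option: chi-squared feature selection with scikit-learn's `SelectKBest` on top of a `TfidfVectorizer`. The toy documents, the labels, and `k = 4` are assumptions made up purely for illustration; a real corpus would want a much larger `k`, tuned by cross-validation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2

# Toy corpus and class labels, invented for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stocks fell as markets closed",
    "markets rallied and stocks rose",
]
labels = [0, 0, 1, 1]

# Turn the raw text into a sparse document-term matrix (one column per word).
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)

# Keep only the k columns with the highest chi-squared score against the labels.
k = 4  # hypothetical value, chosen only to fit the toy data
selector = SelectKBest(chi2, k=k)
X_reduced = selector.fit_transform(X, labels)

# Column indices (into the original vocabulary) of the features that survived.
kept_columns = selector.get_support(indices=True)

print(X.shape, "->", X_reduced.shape)
```

Words that show up about equally in both classes (like "the" here) tend to get low chi-squared scores, which is roughly the idea behind eliminating low-information features.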
Can anyone here suggest anything? Has anyone here ever reduced dimensionality using scikit-learn before? Thank you in advance for any leads!
PS: Is there a sub-group or subreddit dedicated to scikit-learn questions and, if not, is there any interest in starting one?