I am running a dataset through a number of different algorithms in Weka, but some of them are very slow. The dataset is fairly large: 10,000 items spread across 5 classes, with each item represented as a BOW/TF-IDF vector.
Build times vary a lot. Some algorithms are fast (MNB, CNB), NB is slow (about 30 seconds to build a model), and some take 5+ minutes to build a model, so I end up stopping them before they finish (JRip, J48, SMO).
What are some good strategies for reducing the data so that the classifiers at least run in an acceptable amount of time?
I am running 10-fold cross-validation in Weka. Would switching to 5x2 CV in the Experimenter speed it up?
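To make the 5x2 question concrete, here is a rough back-of-the-envelope sketch of what I think the two schemes cost. It only counts model fits and training-set sizes for my 10,000 items; the helper name `cv_training_cost` is made up for illustration, and it says nothing about how each classifier's build time actually scales with training-set size.

```python
def cv_training_cost(n_items, folds, repeats=1):
    """Return (number of model fits, training items per fit) for repeated k-fold CV."""
    fits = folds * repeats
    # Each fit trains on every fold except the one held out for testing.
    train_size = n_items - n_items // folds
    return fits, train_size


ten_fold = cv_training_cost(10000, folds=10)               # 10-fold CV
five_by_two = cv_training_cost(10000, folds=2, repeats=5)  # 5x2 CV

print(ten_fold)     # (10, 9000)
print(five_by_two)  # (10, 5000)
```

Both schemes fit 10 models, but each 5x2 fit sees 5,000 items instead of 9,000, so any saving would have to come from the smaller training sets rather than from fewer fits.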
Thanks