I am using the Weka library for Java to train a classifier on one data set and test it against another. Here is my Java code:
    import java.io.BufferedReader;
    import java.io.FileReader;

    import weka.classifiers.meta.FilteredClassifier;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.filters.unsupervised.attribute.Remove;

    public class WekaStocks {

        public static void main(String[] args) throws Exception {

            // Instantiate Training Data
            Instances training_data = new Instances(
                    new BufferedReader(
                            new FileReader("res/training_data.arff")));
            training_data.setClassIndex(training_data.numAttributes() - 1);

            // Instantiate Testing Data
            Instances testing_data = new Instances(
                    new BufferedReader(
                            new FileReader("res/test_data.arff")));
            testing_data.setClassIndex(testing_data.numAttributes() - 1);

            // Print Initial Data Summary
            String summary = training_data.toSummaryString();
            int number_samples = training_data.numInstances();
            int number_attributes_per_sample = training_data.numAttributes();
            System.out.println("Number of attributes in model = "
                    + number_attributes_per_sample);
            System.out.println("Number of samples = " + number_samples);
            System.out.println("Summary: " + summary);
            System.out.println();

            // a classifier for decision trees:
            J48 j48 = new J48();

            // filter for removing samples:
            Remove rm = new Remove();

            // remove first attribute
            rm.setAttributeIndices("l");

            // filtered classifier
            FilteredClassifier fc = new FilteredClassifier();
            fc.setFilter(rm);
            fc.setClassifier(j48);

            // train using training data
            fc.buildClassifier(training_data);

            // test using test data
            for (int i = 0; i < testing_data.numInstances(); i++) {
                double pred = fc.classifyInstance(testing_data.instance(i));
                System.out.print("given value: "
                        + testing_data.classAttribute().value(
                                (int) testing_data.instance(i).classValue()));
                System.out.println(". predicted value: "
                        + testing_data.classAttribute().value((int) pred));
            }
        }
    }
And here are my two data files:
training_data.arff:
    @relation stock

    @attribute percent_change_since_open real
    @attribute percent_change_from_day_low real
    @attribute percent_change_from_day_high real
    @attribute action {buy, sell, hold}

    @data
    -0.3,0.2,-0.22,hold
    -2.2,0.0,-2.5,sell
    0.2,0.22,-0.01,buy
    -0.25,0.12,-0.25,hold
    -2.0,0.0,-2.1,sell
    0.26,0.26,-0.4,buy
    -0.12,0.18,-0.14,hold
    -2.6,0.12,-2.6,sell
    0.24,0.3,-0.035,buy
test_data.arff:
    @relation stock

    @attribute percent_change_since_open real
    @attribute percent_change_from_day_low real
    @attribute percent_change_from_day_high real
    @attribute action {buy, sell, hold}

    @data
    -0.2,0.1,-0.22,hold
    -2.2,0.0,-2.5,sell
    0.2,0.21,-0.01,buy
    -0.22,0.12,-0.25,hold
    -2.0,0.0,-2.1,sell
    0.28,0.26,-0.4,buy
    -0.12,0.08,-0.14,hold
    -2.6,0.1,-2.6,sell
    0.24,0.25,-0.03,buy
I'm getting an error in Eclipse at runtime that says:
    Exception in thread "main" java.lang.IllegalArgumentException: Invalid range list at l
        at weka.core.Range.setFlags(Range.java:316)
        at weka.core.Range.setUpper(Range.java:88)
        at weka.filters.unsupervised.attribute.Remove.setInputFormat(Remove.java:202)
        at weka.classifiers.meta.FilteredClassifier.buildClassifier(FilteredClassifier.java:389)
        at WekaStocks.main(WekaStocks.java:56)
I think there is an issue with the training_data.arff file, because the error occurs the first time the program actually uses that data. This is line 56:
    // train using training data
    fc.buildClassifier(training_data);
Can anyone shed light on what is going wrong here?
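One observation that may narrow this down: the message "Invalid range list at l" and the weka.core.Range frames in the trace suggest the failure is in parsing the string passed to setAttributeIndices, not in the ARFF file. The code passes "l" (a lowercase letter L) where the comment says it wants the first attribute, which would normally be written "1" or "first". The filter only parses that string once it receives the data format, which would explain why the exception surfaces at buildClassifier. Below is a minimal, stdlib-only sketch of that kind of validation. It is a simplified stand-in, not Weka's actual implementation: the class RangeSketch and the method isValidIndexList are hypothetical names, and the accepted grammar (1-based integers, "first", "last", "a-b" spans, comma-separated) is an assumption about the general shape of Weka's range strings.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for an attribute-range validator (NOT Weka's code).
class RangeSketch {

    // One index: a 1-based integer, or the keyword "first" or "last" (assumed grammar).
    private static final String INDEX = "(?:[1-9][0-9]*|first|last)";

    // One item: a single index, or a span written as "a-b".
    private static final Pattern ITEM =
            Pattern.compile(INDEX + "(?:-" + INDEX + ")?");

    // Returns true only if every comma-separated item is a valid index or span.
    // Simplification: an empty or null list is treated as invalid here.
    static boolean isValidIndexList(String list) {
        if (list == null) {
            return false;
        }
        for (String item : list.split(",", -1)) {
            if (!ITEM.matcher(item.trim()).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "l" is the letter from the question; "1" is the digit it resembles.
        for (String s : new String[] {"1", "l", "first", "1-3,last"}) {
            System.out.println("\"" + s + "\" -> " + isValidIndexList(s));
        }
    }
}
```

Under these assumptions the sketch accepts "1", "first", and "1-3,last" and rejects "l". If the same holds in Weka, changing the argument to rm.setAttributeIndices("1") would be the first thing to try.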