I'm using scikit-learn and tried out the decision tree classifier: http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html
Based on the default settings listed there, I would expect the classifier to always reach 100% accuracy on the data set it was trained on, unless the data set contains two data points with exactly the same features but different labels. I don't think my data set has many of these, but I could be wrong. Am I misunderstanding the classifier somehow, or is my assumption correct?
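One way to test the duplicate assumption directly is a rough sketch like the one below. It is plain Python, the function names are made up for illustration, and the toy data is invented. It groups training rows by their exact feature values, reports any row that appears with more than one label, and computes the best training accuracy any deterministic classifier could reach on that data.

```python
from collections import Counter, defaultdict


def label_counts_by_features(X, y):
    """Map each distinct feature row to a Counter of the labels seen with it."""
    groups = defaultdict(Counter)
    for row, label in zip(X, y):
        groups[tuple(row)][label] += 1
    return groups


def conflicting_rows(X, y):
    """Return the feature rows that occur with more than one label."""
    groups = label_counts_by_features(X, y)
    return {feats: dict(counts) for feats, counts in groups.items() if len(counts) > 1}


def max_training_accuracy(X, y):
    """Upper bound on training accuracy: predict the majority label per distinct row."""
    groups = label_counts_by_features(X, y)
    best_correct = sum(max(counts.values()) for counts in groups.values())
    return best_correct / len(y)


# Invented toy data: the row [0, 1] appears twice with different labels.
X = [[0, 1], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

conflicts = conflicting_rows(X, y)
ceiling = max_training_accuracy(X, y)
print(conflicts)
print(ceiling)
```

If `clf.score(X, y)` on the training data comes out below this ceiling, the gap would need some other explanation, such as non-default parameters or scoring on data other than the training set. Note that the grouping uses exact equality, so float features that differ only by rounding count as different rows.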
Thanks!