I am trying to tune a CART classifier and am overwhelmed by the hyperparameter optimization. There are just so many, and I don't know which ones to optimize. You have at least these parameters:
- maximum depth of tree
- minimum number of observations in a node to attempt another split
- minimum number of observations in any terminal leaf node
- minimum factor by which the lack of fit has to be decreased for the next split to be attempted
I don't think I can tune all of them; a full grid search is just too expensive in CPU time, even for fairly small grids (sketched below).
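For concreteness, here is roughly the kind of grid I mean, assuming scikit-learn's DecisionTreeClassifier (the parameter values are just placeholders, and load_iris stands in for my real data):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)  # stand-in for my actual dataset

param_grid = {
    "max_depth": [3, 5, 10, None],                     # maximum depth of the tree
    "min_samples_split": [2, 10, 20, 50],              # min observations to attempt a split
    "min_samples_leaf": [1, 5, 10, 20],                # min observations in a terminal leaf
    "min_impurity_decrease": [0.0, 1e-4, 1e-3, 1e-2],  # required decrease in lack of fit
}

# 4 * 4 * 4 * 4 = 256 combinations, times 5 CV folds = 1,280 tree fits.
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```

Even this modest grid is over a thousand fits, and the cost grows multiplicatively with every extra hyperparameter or candidate value.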
Is there a consensus on which hyperparameters to optimize in classification trees?