In decision-tree-based classifiers (decision trees, random forests, GBTs, etc.), the scaling of features should be irrelevant, since a split point only partitions the data into points above and below the threshold; only the ordering of the feature values matters, not their magnitudes. So, for example, in a text classification task, the results should be the same for tf (term frequency) features as for tf-idf (term frequency - inverse document frequency) features, since the idf weighting just rescales each feature by a constant factor. Is this correct?
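For concreteness, here is a rough sketch of how one could check this empirically with scikit-learn (the toy corpus and labels are made up; `norm=None` is used so the tf-idf transform is a pure per-feature rescaling rather than per-document normalization):

    # Fit the same decision tree on tf and tf-idf versions of a toy corpus
    # and check whether the predictions agree.
    from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
    from sklearn.tree import DecisionTreeClassifier

    corpus = [
        "the cat sat on the mat",
        "the dog ate my homework",
        "cats and dogs are pets",
        "homework is due tomorrow",
    ]
    labels = [0, 1, 0, 1]

    tf = CountVectorizer().fit_transform(corpus)              # raw term counts (tf)
    tfidf = TfidfTransformer(norm=None).fit_transform(tf)     # tf * idf, no document normalization

    # Same tree settings and random_state, two different feature scalings.
    tree_tf = DecisionTreeClassifier(random_state=0).fit(tf, labels)
    tree_tfidf = DecisionTreeClassifier(random_state=0).fit(tfidf, labels)

    print((tree_tf.predict(tf) == tree_tfidf.predict(tfidf)).all())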