Channel: Machine Learning

How can I make use of weakly predictive features when the predictive model appears to be better without them?


I have an interesting problem: if I remove half of the features in my feature set (i.e., go from 40 to 20), I get a significantly more accurate model (a random forest) than if I had kept those features in. Dropping them gives me a 40% reduction in log loss and close to a factor-of-3 improvement in AUC loss. This seems odd, however, because the removed features have some predictive value that is not accounted for by the other features (unless there are very strange correlations I am unaware of).

My question is: how can I make use of these features? Are random forests bad at making use of weakly predictive features? Here are my current ideas for how I can tackle the problem:

- I could ignore the weak attributes with a higher probability when making decision-tree splits.
- I could try building a separate model that includes the weak attributes and blend that with my current model.

Any ideas, comments, or criticisms will be greatly appreciated.
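As a hedged sketch of the comparison and of the second idea (blending a weak-feature model with the main one), here is what this might look like with scikit-learn on synthetic stand-in data. Everything here is an assumption for illustration: the dataset, the 20/20 strong-vs-weak column split, and the 0.9/0.1 blend weight are placeholders, not the poster's actual setup. (Note that scikit-learn's `RandomForestClassifier` has no built-in per-feature split-probability weighting, which is why the blending route is the easier one to prototype.)

```python
# Sketch, assuming scikit-learn is available: train a random forest on all
# 40 features vs. only the 20 "strong" ones, then blend in a weak-only model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 20 informative features followed by 20 noise features
# (shuffle=False keeps the informative columns first).
X, y = make_classification(n_samples=2000, n_features=40, n_informative=20,
                           n_redundant=0, shuffle=False, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

strong, weak = slice(0, 20), slice(20, 40)  # assumed split of the columns

rf_all = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
rf_strong = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr[:, strong], y_tr)
rf_weak = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr[:, weak], y_tr)

loss_all = log_loss(y_te, rf_all.predict_proba(X_te))
loss_strong = log_loss(y_te, rf_strong.predict_proba(X_te[:, strong]))

# Idea 2 from the post: blend the strong-only model with a weak-only model.
# The 0.9/0.1 weights are arbitrary; in practice tune them on a holdout set.
p_blend = (0.9 * rf_strong.predict_proba(X_te[:, strong])
           + 0.1 * rf_weak.predict_proba(X_te[:, weak]))
loss_blend = log_loss(y_te, p_blend)

print(f"all features: {loss_all:.4f}  strong only: {loss_strong:.4f}  "
      f"blend: {loss_blend:.4f}")
```

Comparing `loss_blend` against `loss_strong` on a holdout set is one quick way to check whether the weak features carry any usable signal once the strong ones are accounted for.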

submitted by AlexTHawk
