Consider a situation where we have 99 mediocre predictive models and 1 good one. The models are tasked with predicting the probability of a particular classification. We combine the models' predictions by averaging them to obtain the ensemble's prediction.
Wouldn't the 99 bad models (which will produce probabilities closer to the global probability of that classification) drag the good model's prediction back towards the global mean, making it a worse prediction?
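To make the worry concrete, here's a toy calculation. The 0.9 and 0.5 values are made up for illustration, assuming a ~0.5 base rate; they're not from my data:

```python
# The good model says 0.9 for some positive example; the 99 mediocre
# models all hover near the 0.5 base rate.
good = 0.9
mediocre = [0.5] * 99

# Uniform average over all 100 models.
ensemble = (good + sum(mediocre)) / 100
print(ensemble)  # ~0.504 -- the good model's signal is almost entirely washed out
```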
Or should I be using something other than averaging to combine these probabilities?
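For concreteness, here's a sketch of one alternative to plain averaging: a validation-loss-weighted mean, where better models get larger weights. Everything here (the loss values, the exponential weighting, the 10.0 sharpness constant) is made-up illustration, not something I've validated:

```python
import numpy as np

# Hypothetical per-model validation log-losses: 99 mediocre models, 1 good one.
val_log_losses = np.array([0.69] * 99 + [0.30])
# Hypothetical per-model predicted probabilities for a single example.
preds = np.array([0.5] * 99 + [0.9])

# Turn losses into weights: lower loss -> exponentially larger weight.
# The 10.0 is an arbitrary sharpness knob.
weights = np.exp(-10.0 * val_log_losses)
weights /= weights.sum()

weighted_pred = float(weights @ preds)
print(weighted_pred)  # ~0.63 with these numbers, vs ~0.504 for the uniform mean
```

Even a crude weighting like this keeps the mediocre models from completely drowning out the good one, though it obviously depends on having held-out data to estimate the weights from.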
edit: Lots of people are suggesting that I look at boosting. From what I've read, boosting is sensitive to noisy data, and my data is extremely noisy (I'm predicting the probability that people will click on something).
My real question is not so much whether there are better approaches (I'm sure there are), but whether my approach is seriously flawed.