Channel: Machine Learning
Viewing all articles
Browse latest Browse all 63407

How to use spottily labeled training data, knowing that the features correlate with whether an example is labeled?


I have a dataset with both labeled and unlabeled examples. From my knowledge of the domain, I know that some of an example's features strongly affect whether it is labeled, which makes the labeled data very biased and causes grave prediction errors. How can I use this knowledge of the correlation between the features and the probability that an example has a label to reduce bias and prediction errors?
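
To make the setup concrete, here is a minimal sketch of one common way to use such knowledge: inverse propensity weighting. It is only an illustration, not a recommendation for this specific dataset. It assumes that whether an example is labeled depends only on observed features (here a single hypothetical categorical feature called `group`) and that every group has at least some labeled examples. All names and the toy data are made up for the example.

```python
from collections import Counter

def propensity_by_group(groups, is_labeled):
    """Estimate P(labeled | group) as the labeled fraction within each group."""
    total = Counter(groups)
    labeled = Counter(g for g, l in zip(groups, is_labeled) if l)
    return {g: labeled[g] / total[g] for g in total}

def inverse_propensity_weights(groups, is_labeled, min_propensity=0.05):
    """Give each labeled example weight 1 / P(labeled | group); unlabeled get 0.

    min_propensity clips very small probabilities so that a handful of
    rarely-labeled examples cannot dominate (an assumed safeguard).
    """
    p = propensity_by_group(groups, is_labeled)
    return [1.0 / max(p[g], min_propensity) if l else 0.0
            for g, l in zip(groups, is_labeled)]

def weighted_mean(values, weights):
    """Weighted average over the examples with positive weight."""
    num = sum(v * w for v, w in zip(values, weights) if w > 0)
    return num / sum(weights)

# Toy data: group "a" is labeled 80% of the time and always has y = 1,
# group "b" is labeled 10% of the time and always has y = 0.
groups = ["a"] * 100 + ["b"] * 100
is_labeled = [True] * 80 + [False] * 20 + [True] * 10 + [False] * 90
y = [1 if (g == "a" and l) else 0 if l else None
     for g, l in zip(groups, is_labeled)]

weights = inverse_propensity_weights(groups, is_labeled)

labeled_y = [v for v in y if v is not None]
naive_mean = sum(labeled_y) / len(labeled_y)   # biased toward group "a"
corrected_mean = weighted_mean(y, weights)     # reweighted to the full population
```

In this toy case the naive mean over labeled examples is pulled toward the over-labeled group, while the reweighted mean matches the true population mix. With continuous features, the per-group fraction would typically be replaced by a classifier that predicts "labeled or not" from the features, and the resulting weights can usually be passed to a model's training routine (for example as `sample_weight` in many scikit-learn estimators' `fit` methods).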

submitted by solen-skiner
[link] [6 comments]
