Channel: Machine Learning

What do you do when your biggest source of error is the people hired to evaluate your classifier?

Hey all. I'm currently working on an ML project that involves record linkage optimized by ML. When I first started the project, I knew I would need a way to rapidly evaluate the accuracy of the classifier. That means a labeled data set, but unfortunately this data set did not exist yet.

I'm not afraid of getting dirty with the data, so I dug in and eventually found a way to consistently find the correct answer. Unfortunately, labeling this data takes quite a bit of time, since it involves a level of manual record linkage, so I was only able to label a small set on my own. I trained on this data and found that the classifier generalized fairly well, so a plan was put in place to have a set of outsourced interns do the labeling so I could improve performance and generalization.

Unfortunately, I found early on that no one else had the accuracy or efficiency to do this task well. Even when the task was reduced to just judging whether the classifier's output is correct, they struggled. They consistently cannot see the patterns in the data, and they mark correct things as incorrect. When I go through their work and make corrections, they agree that my corrections are accurate. However, they have never improved on the tasks given to them.

This is stressing me the hell out. They are the biggest source of error in my project, but I have no resources to hire anyone better. What can I do? The classifier is doing a hell of a job based on my own evaluations, but I simply do not have the time to pull a sample and evaluate it myself every time. I can't run the optimization system until I have at least a modestly sized set of data with the correct answers, and the team hired to evaluate has a 10% error rate.
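To put a rough number on how much that error rate distorts the measured accuracy, here is a back-of-the-envelope sketch. It assumes the evaluators' mistakes are independent of which records the classifier gets wrong, and the function names, the 95% true accuracy, and the split of the 10% into "correct marked incorrect" versus "incorrect marked correct" are all hypothetical values chosen for illustration, not measurements from the project.

```python
def measured_accuracy(true_accuracy: float,
                      p_reject_correct: float,
                      p_accept_incorrect: float = 0.0) -> float:
    """Accuracy the evaluators would report, given their own error rates.

    p_reject_correct:   chance an evaluator marks a correct output as incorrect
    p_accept_incorrect: chance an evaluator marks an incorrect output as correct
    Assumes evaluator errors are independent of the classifier's errors.
    """
    return (true_accuracy * (1.0 - p_reject_correct)
            + (1.0 - true_accuracy) * p_accept_incorrect)


def corrected_accuracy(measured: float,
                       p_reject_correct: float,
                       p_accept_incorrect: float = 0.0) -> float:
    """Invert measured_accuracy to estimate the classifier's true accuracy."""
    denom = 1.0 - p_reject_correct - p_accept_incorrect
    if denom <= 0.0:
        raise ValueError("evaluator error rates too high to correct for")
    return (measured - p_accept_incorrect) / denom


# Hypothetical numbers: a classifier that is truly 95% accurate, judged by
# evaluators who wrongly reject 10% of correct outputs and never wrongly accept.
reported = measured_accuracy(0.95, p_reject_correct=0.10)
print(f"reported accuracy:  {reported:.3f}")
print(f"corrected estimate: {corrected_accuracy(reported, 0.10):.3f}")
```

Under those assumptions the reported figure lands well below the true one, and the correction only works if the evaluators' error rates are themselves estimated from a sample with known answers.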

Have any of you dealt with this issue before? What do I do?

submitted by iwantedthisusername
