(I also just posted this in /r/computervision. I hope cross-posting isn't frowned upon; I didn't see anything about it in the rules. Sorry otherwise!)
I haven't done a CV task before where the available dataset has been this big or of such exceptional quality.
Every image in this dataset of seven million+ user-uploaded pictures has been painstakingly labeled manually by our "community support" team over the last 10 years, with help from volunteers on the social networks where the pictures were uploaded (historically, ten or so people have had to choose the same label for a picture before the label was assigned).
The dataset is near perfect, with extremely few mislabeled images (except for human bias, though that would have to be a collective bias, since multiple people need to misclassify the same image).
This dataset has six labels:
- pictures of animate objects that morally and legally we can't show to users that are <18 years old,
- pictures of animate objects that morally and legally we can't show to users that are <16 years old,
- pictures of animate objects that morally and legally we can't show to users that are <12 years old,
- pictures of INANIMATE objects (toys, cars, whatever) that morally and legally we can't show to users that are <18 years old,
- pictures of INANIMATE objects (toys, cars, whatever) that morally and legally we can't show to users that are <16 years old,
- pictures of INANIMATE objects (toys, cars, whatever) that morally and legally we can't show to users that are <12 years old.
("animate" means dick pics and titties, more or less; "inanimate" is non-human, e.g. dildos and forests. Genitals and pr0n are labeled 18+; anything showing a nipple or more sexual than that is labeled 16+ if it isn't enough to earn the 18+ stamp; anything else is just labeled 12+, since we don't allow users below this age to use our services.)
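Since the age threshold is encoded in each label, the six labels can be collapsed into a binary target for any single age cut-off, such as 16. Here is a minimal sketch of that mapping. The label identifiers (`animate_18` and so on) are hypothetical, since the dataset's real label names aren't given here.

```python
# Hypothetical label identifiers: "<kind>_<minimum age>".
# The real dataset's label names are an assumption.
LABELS = [
    "animate_18", "animate_16", "animate_12",
    "inanimate_18", "inanimate_16", "inanimate_12",
]


def min_age(label: str) -> int:
    """Extract the minimum viewer age encoded at the end of a label."""
    return int(label.rsplit("_", 1)[1])


def binary_target(label: str, threshold: int = 16) -> int:
    """Return 1 if the picture must be hidden from users younger than
    `threshold`, else 0. Animate vs. inanimate is ignored on purpose."""
    return int(min_age(label) >= threshold)


# Mapping from each of the six labels to the 16+ binary target.
targets_16 = {label: binary_target(label) for label in LABELS}
```

With a threshold of 16, both the 16+ and 18+ labels map to 1 and the 12+ labels map to 0, regardless of whether the subject is animate or inanimate.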
The task at hand is to automatically label pictures that we legally can't show to users that are <16 years old. The laws basically boil down to this (which the manual labeling has followed):
"if a nipple and/or anything more sexual can be seen in a picture, the user needs to be at least 16 years old to see it."
My initial idea is to use a pre-trained VGG19 or ResNet50 model, freeze a number of the early layers, and fine-tune however many of the later layers show promise. If the results are bad, I'd experiment with a combination of AWS Rekognition and a custom solution.
Any thoughts, tips, guidelines? Appreciate any feedback!
NB: CV is not my main focus at work (though I've studied and played around with it quite a lot). I'm usually involved in time-series and NLP work, and I have a much stronger comp-sci background than stats background, but I've been focusing on bridging that gap over the last few years.