Postage stamp collecting (/r/philately/) is a very popular hobby. The hobby depends on several large paper catalogs issued to help collectors identify what they have. But with over half a million different major stamps issued over the last 170 years, and thousands more issued every year, identifying stamps is a time-consuming task.
Some of the issues faced are:
1) Similar varieties that differ only in denomination (face value) and color, such as the Machins of the UK or the Washington-Franklin issues of the U.S.
2) Cancellations, usually black, but sometimes other colors, can hide portions of the designs.
3) Differences in the number of perforations (the holes between stamps used to separate them) can indicate different issues.
So, /r/ml, how would you apply machine learning to this problem? Assume you had access to clean, identified copies of all the stamps you needed as a training set.
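To make the question concrete, here is a rough baseline sketch rather than a proposed solution. It touches only issues 1 and 2: it compares coarse color histograms against the reference copies and simply ignores near-black pixels, on the assumption that they are cancellation ink. Everything in it is hypothetical: the function names, the idea of representing a stamp as a flat list of (r, g, b) tuples, the bin count, and the darkness threshold. It does nothing about perforations, and it would struggle with stamps whose design is itself dark.

```python
from collections import Counter


def color_histogram(pixels, bins=4, dark_threshold=60):
    """Return a normalized coarse RGB histogram, skipping near-black pixels."""
    step = 256 // bins
    counts = Counter()
    for r, g, b in pixels:
        # Assumption: near-black pixels are cancellation ink, so skip them.
        if max(r, g, b) < dark_threshold:
            continue
        counts[(r // step, g // step, b // step)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bucket: n / total for bucket, n in counts.items()}


def histogram_distance(h1, h2):
    """L1 distance between two sparse histograms stored as dicts."""
    buckets = set(h1) | set(h2)
    return sum(abs(h1.get(b, 0.0) - h2.get(b, 0.0)) for b in buckets)


def identify(query_pixels, catalog):
    """Return the label of the reference whose histogram is closest to the query."""
    query = color_histogram(query_pixels)
    return min(
        catalog,
        key=lambda label: histogram_distance(query, color_histogram(catalog[label])),
    )


# Toy data: two "varieties" of the same design that differ only in color.
red_ref = [(200, 30, 30)] * 100
blue_ref = [(30, 30, 200)] * 100
catalog = {"red variety": red_ref, "blue variety": blue_ref}

# A used red copy: slightly different shade, 20% covered by a black cancel.
query = [(205, 35, 35)] * 80 + [(10, 10, 10)] * 20

print(identify(query, catalog))
```

A learned model would presumably replace the hand-built histogram with features trained on the identified copies, but a baseline like this gives something to measure against.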