I'm new to ML and have a small project where I have about a million images containing handwritten text in some uncommon languages. I'd like to experiment with the code that has been written to process the MNIST digit dataset.
How do I get from an image containing digits to the MNIST digit format?
I have written some OpenCV code to pull out the characters into separate images and normalize their size, etc., but I'm guessing there is existing code that does this better than mine.
Can anyone point me in the right direction?
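For reference, here is a minimal sketch of what I understand the target format to be, based on the MNIST dataset description: each glyph is scaled to fit a 20x20 box with its aspect ratio preserved, then placed in a 28x28 image so that its center of mass sits at the center. The function name `to_mnist_format` and its parameters are my own, not from any library. It assumes the glyph has already been segmented, is grayscale, and has bright ink on a dark background (a scan with dark ink on light paper would need inverting first). It uses nearest-neighbour scaling for simplicity, which is cruder than the anti-aliased resizing MNIST appears to use.

```python
import numpy as np

def to_mnist_format(glyph, box=20, canvas=28):
    """Convert one segmented glyph to an MNIST-style 28x28 uint8 image.

    Assumes `glyph` is a 2-D array with bright ink (values > 0) on a
    zero-valued background.
    """
    glyph = np.asarray(glyph, dtype=np.float32)

    # Crop to the bounding box of the ink.
    rows = np.flatnonzero(glyph.max(axis=1) > 0)
    cols = np.flatnonzero(glyph.max(axis=0) > 0)
    if rows.size == 0:
        # No ink at all: return a blank canvas.
        return np.zeros((canvas, canvas), dtype=np.uint8)
    glyph = glyph[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

    # Scale so the longer side becomes `box` pixels, keeping the aspect
    # ratio (nearest-neighbour sampling).
    h, w = glyph.shape
    scale = box / max(h, w)
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    row_idx = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    col_idx = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    small = glyph[np.ix_(row_idx, col_idx)]

    # Center of mass of the scaled glyph; fall back to the geometric
    # center if sampling happened to miss every ink pixel.
    total = float(small.sum())
    if total > 0:
        cy = float((small.sum(axis=1) * np.arange(new_h)).sum()) / total
        cx = float((small.sum(axis=0) * np.arange(new_w)).sum()) / total
    else:
        cy, cx = (new_h - 1) / 2, (new_w - 1) / 2

    # Offset that puts the center of mass at the canvas center, clamped
    # so the glyph stays fully inside the canvas.
    top = int(np.rint((canvas - 1) / 2 - cy))
    left = int(np.rint((canvas - 1) / 2 - cx))
    top = min(max(top, 0), canvas - new_h)
    left = min(max(left, 0), canvas - new_w)

    out = np.zeros((canvas, canvas), dtype=np.float32)
    out[top:top + new_h, left:left + new_w] = small
    return out.astype(np.uint8)
```

With OpenCV, the scaling step could presumably be swapped for `cv2.resize` with `interpolation=cv2.INTER_AREA` to get smoother strokes. The output is a 28x28 `uint8` array with values in 0 to 255, which is the per-image shape and range that MNIST loaders typically expect.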