I have made a small character recognition program that uses a webcam to scan printed text and convert it into a string. I'm using the Encog library with Java.
I have very little experience with machine learning and neural networks, and I'm after some advice on what makes a good training set. Should I train it, for example, with distorted versions of each letter as well as a perfect one? How big should a training set be for recognising just one font, letters only? Also, are there any databases or training images available online? I have only managed to find ones for handwritten characters, which I'm not planning to implement yet.
At the moment I have just trained it with an image of a..z typed in Paint, and I get okay-ish results for nicely printed documents.
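To make the "distorted letters" idea concrete, here is a rough sketch of how distorted copies of a clean glyph could be generated. It is plain Java and does not touch Encog at all. The class name `GlyphAugmenter`, its methods, the 8x8 grid, the one-pixel shift range, and the 2% noise rate are all made up for illustration, not tuned or recommended values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper: turns one clean glyph into several distorted training samples.
// A glyph is a grid of pixel values, 1.0 for ink and 0.0 for background.
public class GlyphAugmenter {

    // Shift the glyph by (dx, dy) pixels. Ink pushed off the edge is dropped,
    // and vacated pixels are left as background (0.0).
    public static double[][] shift(double[][] glyph, int dx, int dy) {
        int h = glyph.length, w = glyph[0].length;
        double[][] out = new double[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int ny = y + dy, nx = x + dx;
                if (ny >= 0 && ny < h && nx >= 0 && nx < w) {
                    out[ny][nx] = glyph[y][x];
                }
            }
        }
        return out;
    }

    // Flip each pixel with probability p (salt-and-pepper noise).
    public static double[][] addNoise(double[][] glyph, double p, Random rng) {
        int h = glyph.length, w = glyph[0].length;
        double[][] out = new double[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                out[y][x] = rng.nextDouble() < p ? 1.0 - glyph[y][x] : glyph[y][x];
            }
        }
        return out;
    }

    // Return the clean glyph followed by `copies` randomly shifted, noisy variants.
    public static List<double[][]> augment(double[][] glyph, int copies, long seed) {
        Random rng = new Random(seed);
        List<double[][]> samples = new ArrayList<>();
        samples.add(glyph);
        for (int i = 0; i < copies; i++) {
            int dx = rng.nextInt(3) - 1; // -1, 0, or 1 (assumed range)
            int dy = rng.nextInt(3) - 1;
            samples.add(addNoise(shift(glyph, dx, dy), 0.02, rng)); // 2% noise is a guess
        }
        return samples;
    }

    public static void main(String[] args) {
        double[][] glyph = new double[8][8];
        for (int y = 1; y < 7; y++) {
            glyph[y][3] = 1.0; // a crude vertical stroke, roughly an "l"
        }
        List<double[][]> samples = augment(glyph, 10, 42L);
        System.out.println(samples.size() + " samples generated");
    }
}
```

Each sample would still need to be flattened into a single input array and paired with its letter label before going to the network. Whether small shifts and pixel noise are the right distortions for webcam input is exactly the part I'm unsure about.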
I would appreciate any advice on neural networks and training sets.
Thanks for your time.