I am trying to solve the Kaggle digit recognizer problem using a neural network.
I am planning to use a very simple 3-layer neural network (1 input, 1 hidden and 1 output layer). The input layer will have 784 neurons (one for each pixel) and the output layer will have 10 neurons (one for each digit).
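To make the architecture concrete, here is a rough sketch of the forward pass I have in mind, in plain Python. The hidden layer size (25), the sigmoid and softmax activations, and the weight initialization range are all my own placeholder assumptions, not anything fixed by the problem; only the 784 inputs and 10 outputs come from the data.

```python
import math
import random

# 784 inputs and 10 outputs come from the problem; N_HIDDEN is an arbitrary assumption.
N_INPUT, N_HIDDEN, N_OUTPUT = 784, 25, 10

def init_layer(n_in, n_out, rng):
    """Small random weights plus zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.05, 0.05) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def affine(weights, biases, x):
    """Compute W x + b, one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Turn raw scores into probabilities; subtracting the max avoids overflow."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def forward(params, pixels):
    """Input layer -> hidden layer (sigmoid) -> output layer (softmax)."""
    (w1, b1), (w2, b2) = params
    hidden = [sigmoid(z) for z in affine(w1, b1, pixels)]
    return softmax(affine(w2, b2, hidden))

rng = random.Random(0)
params = [init_layer(N_INPUT, N_HIDDEN, rng), init_layer(N_HIDDEN, N_OUTPUT, rng)]
probs = forward(params, [0.0] * N_INPUT)  # an all-black dummy image, not real data
```

Here `probs` holds one probability per digit; the predicted digit would be the index of the largest entry. Training (backpropagation) is left out of this sketch.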
The training dataset has 42,000 labeled digits. I am thinking of randomly splitting it into 3 datasets:
- training dataset (60%)
- cross validation set (20%)
- test dataset (20%)
My question is: how important is the label distribution in this case? Can I just randomly split the data into 60%/20%/20%, or do I have to make sure that every label (10 digits in this case) is evenly distributed across the splits?
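To make the two options concrete, here is a rough sketch comparing a plain random split with a per-label (stratified) split, and measuring how far each split's label shares drift from the overall shares. The labels are synthetic uniform random digits standing in for the real Kaggle label column, and every function name here is made up for illustration, so the numbers only hint at what the real data would show.

```python
import random
from collections import Counter, defaultdict

def split_indices(indices, fractions=(0.6, 0.2, 0.2)):
    """Cut an already shuffled index list into train / validation / test parts."""
    n = len(indices)
    n_train = round(n * fractions[0])
    n_val = round(n * fractions[1])
    return indices[:n_train], indices[n_train:n_train + n_val], indices[n_train + n_val:]

def random_split(labels, rng):
    """Option 1: shuffle everything, ignore the labels."""
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    return split_indices(indices)

def stratified_split(labels, rng):
    """Option 2: apply the 60/20/20 split separately within each digit."""
    by_label = defaultdict(list)
    for i, y in enumerate(labels):
        by_label[y].append(i)
    parts = ([], [], [])
    for group in by_label.values():
        rng.shuffle(group)
        for part, chunk in zip(parts, split_indices(group)):
            part.extend(chunk)
    return parts

def label_shares(labels, indices):
    """Fraction of the given indices that carries each digit."""
    counts = Counter(labels[i] for i in indices)
    return {digit: counts[digit] / len(indices) for digit in sorted(counts)}

def max_deviation(labels, indices):
    """Largest gap between a split's label share and the overall label share."""
    overall = label_shares(labels, range(len(labels)))
    part = label_shares(labels, indices)
    return max(abs(part.get(d, 0.0) - share) for d, share in overall.items())

rng = random.Random(0)
# Synthetic stand-in for the 42,000 Kaggle labels (assumption: roughly uniform digits).
labels = [rng.randrange(10) for _ in range(42000)]

random_parts = random_split(labels, rng)
stratified_parts = stratified_split(labels, rng)

for name, parts in (("random", random_parts), ("stratified", stratified_parts)):
    deviations = [max_deviation(labels, part) for part in parts]
    print(name, [round(d, 4) for d in deviations])
```

Running the same comparison on the real labels would show how much the plain random split actually drifts at this dataset size.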
Thanks.