Hello,
For the last few weeks I've been working on a backprop network and posting a few questions to this forum; I thank you for all the help so far. I've gone from concept, to buggy implementation, to something that works.
As a quick recap: my network takes input/feature vectors of length 43, has 25 nodes in the hidden layer (an arbitrary choice I can change), and a single output node. I want to train it to take the 43 features and output a single value between 0 and 100.
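For concreteness, here's roughly the shape of the thing in NumPy (a sketch, not my actual code - the sigmoid activations and the weight init are just placeholder assumptions):

    import numpy as np

    rng = np.random.default_rng(0)

    # fully connected 43 -> 25 -> 1, matching the recap above
    W1 = rng.normal(0.0, 0.1, (25, 43)); b1 = np.zeros(25)
    W2 = rng.normal(0.0, 0.1, (1, 25));  b2 = np.zeros(1)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def forward(x):
        h = sigmoid(W1 @ x + b1)   # hidden layer, 25 units
        y = sigmoid(W2 @ h + b2)   # output in (0, 1); x100 gives the 0-100 score
        return h, y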
Unfortunately, I currently only have a very small pool of training data - 162 feature vectors with corresponding scores out of 100 (I have to label these manually, lol - working on creating more data, obviously). So I train on this limited set, and here's a snapshot of how well my network fits it:
Output value:0.90406 | Test value:0.9 (multiply all values by 100 to get back to the 0-100 scale)
Output value:0.21558 | Test value:0.2
Output value:0.60394 | Test value:0.6
Output value:0.79604 | Test value:0.8
Output value:0.99846 | Test value:0.85
Output value:0.23444 | Test value:0.2
Output value:0.19609 | Test value:0.2
Output value:0.88889 | Test value:0.9
Output value:0.19178 | Test value:0.2
Output value:0.20549 | Test value:0.2
Output value:0.63248 | Test value:0.64
Output value:0.74367 | Test value:0.74
Output value:0.15477 | Test value:0.17
Output value:0.17084 | Test value:0.18
Output value:0.21143 | Test value:0.19
Output value:0.16179 | Test value:0.17
Output value:0.081413 | Test value:0.18
Output value:0.18287 | Test value:0.19
Output value:0.19118 | Test value:0.17
Output value:0.20018 | Test value:0.18
Output value:0.19222 | Test value:0.19
Output value:0.20719 | Test value:0.2
Output value:0.18718 | Test value:0.2
Output value:0.18064 | Test value:0.2
Output value:0.20925 | Test value:0.2
Output value:0.20731 | Test value:0.2
Output value:0.19914 | Test value:0.2
Output value:0.6033 | Test value:0.6
Output value:0.63723 | Test value:0.64
Output value:0.77831 | Test value:0.78
Output value:0.23468 | Test value:0.2
Output value:0.87713 | Test value:0.9
Output value:0.23822 | Test value:0.2
Output value:0.18954 | Test value:0.15
Output value:0.19912 | Test value:0.2
At first I'm like, "wow, this is sick!" The results are much, much better than when I originally tried gradient descent on its own. Like, this is too good to be true. Hmm, maybe it is. So I decide to test it: keep the same test/target values, but feed in 162 completely random feature vectors.
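Concretely, the sanity check looked something like this (sketch, continuing from the snippet above - train() is a hypothetical stand-in for my backprop training loop, and targets holds the same 162 scores scaled to [0, 1]):

    # same 162 targets as before, but pure-noise inputs: if the net
    # still fits, it's memorizing rather than learning from the features
    X_random = rng.random((162, 43))   # uniform noise in [0, 1)
    train(X_random, targets)           # hypothetical: my backprop training loop
    for x, t in zip(X_random, targets):
        _, y = forward(x)
        print(f"Output value:{y[0]:.5f} | Test value:{t}")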
Uh oh - my network was able to fit the random training data even better than my actual training data! In fact, it fit the random data perfectly. Shit:
Output value:0.92 | Test value:0.92
Output value:0.2 | Test value:0.2
Output value:0.2 | Test value:0.2
Output value:0.2 | Test value:0.2
Output value:0.2 | Test value:0.2
Output value:0.2 | Test value:0.2
Output value:0.2 | Test value:0.2
Output value:0.2 | Test value:0.2
Output value:0.2 | Test value:0.2
Output value:0.2 | Test value:0.2
Output value:0.62 | Test value:0.62
Output value:0.7 | Test value:0.7
Output value:0.77 | Test value:0.77
Now I'm thinking it's one of two possibilities:
1) Because I have so few training samples (only 162), my 3-layer 43->25->1 network can over-fit the data with all its weights. By my count that's 43*25 + 25*1 = 1100 weights (1126 parameters including biases) - roughly seven free parameters per training sample, so it has plenty of capacity to just memorize the set.
2) My original feature vectors are absolutely worthless - no better than plain garbage. I hand-crafted these feature vectors based on what my research suggested would be appropriate for my problem domain. (I sketch a quick way to try to tell these two apart below.)
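In the meantime, here's roughly how I figure I could start to distinguish them without new data: hold a chunk of samples out, train on the rest, and compare the held-out error for my real features against the random ones (another sketch - train(), X, and targets are the same hypothetical stand-ins as above):

    # hypothetical check: train on 130 samples, score the 32 held out
    idx = rng.permutation(162)
    train_idx, val_idx = idx[:130], idx[130:]

    train(X[train_idx], targets[train_idx])

    def mean_abs_error(X_eval, t_eval):
        return np.mean([abs(forward(x)[1][0] - t) for x, t in zip(X_eval, t_eval)])

    print("train MAE:", mean_abs_error(X[train_idx], targets[train_idx]))
    print("val   MAE:", mean_abs_error(X[val_idx], targets[val_idx]))
    # random features should do no better than chance on the held-out split;
    # if the real features beat that, they're carrying at least some signal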
What do you guys think is going on, and will I only know once I have more training data? Given the topology of my network, any idea how much data I'll actually need?
Cheers.