
Train time layer resizing


When designing a neural net, the depth and the size of each layer matter a lot. Every paper I've seen so far seems to choose these sizes manually.

I'm considering a system which (imagine all layers are fully connected for now) can dynamically add and remove nodes and layers at training time to try to put parameters only where they are valuable.

Every epoch, the "least useful" 1% of nodes in a fully connected neural net are removed, and an equal number of new nodes are added at random positions with random initialisation.
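
A minimal sketch of the resizing mechanics, assuming PyTorch and a hypothetical helper name (`resize_hidden_layer`, layer sizes, and drop indices are illustrative, not from the post): the surviving nodes keep their weights, and the freshly appended rows/columns simply keep their default random initialisation.

```python
import torch
import torch.nn as nn

def resize_hidden_layer(fc_in: nn.Linear, fc_out: nn.Linear, drop, n_new: int):
    """Drop the given hidden-node indices and append n_new randomly initialised ones."""
    keep = [i for i in range(fc_in.out_features) if i not in set(drop)]
    new_width = len(keep) + n_new

    new_in = nn.Linear(fc_in.in_features, new_width)      # new rows: random init
    new_out = nn.Linear(new_width, fc_out.out_features)   # new columns: random init
    with torch.no_grad():
        # Copy the weights of surviving nodes; everything else stays random.
        new_in.weight[:len(keep)] = fc_in.weight[keep]
        new_in.bias[:len(keep)] = fc_in.bias[keep]
        new_out.weight[:, :len(keep)] = fc_out.weight[:, keep]
        new_out.bias.copy_(fc_out.bias)
    return new_in, new_out

# Example: drop 2 nodes from a 200-wide hidden layer and add 2 new random ones.
fc1, fc2 = nn.Linear(10, 200), nn.Linear(200, 1)
fc1, fc2 = resize_hidden_layer(fc1, fc2, drop=[3, 17], n_new=2)
```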

"Least useful" nodes are determined by looking at d(Node output value)/d(Loss Function) averaged across a large number of training examples.

Adding layers can be done by inserting a new "unity" layer between two existing layers, with all parameters set to pass data through unmodified (and no nonlinear elements); it can then have new non-unity nodes added through the above mechanism.
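
A minimal sketch of such a "unity" layer, again assuming PyTorch (the helper name and sizes are illustrative): a linear layer initialised to the identity matrix with zero bias and no nonlinearity, so the network's function is unchanged at the moment of insertion.

```python
import torch
import torch.nn as nn

def make_unity_layer(width: int) -> nn.Linear:
    layer = nn.Linear(width, width)
    with torch.no_grad():
        layer.weight.copy_(torch.eye(width))  # pass inputs through unmodified
        layer.bias.zero_()
    return layer

# Example: grow a 2-layer net into a 3-layer net without changing its output.
old = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))
new = nn.Sequential(old[0], old[1], make_unity_layer(64), old[2])

x = torch.randn(4, 10)
print(torch.allclose(old(x), new(x)))  # True: identical function after insertion
```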

One can imagine (sometimes non-trivial) extensions of this idea to other layer types too.

Has this been done before, and is there any literature on the topic?

submitted by londons_explorer