I came across a paper that proposes a pretty interesting method of training ConvNets that I haven't really seen before: http://arxiv.org/pdf/1403.2802v1.pdf
This seems like a supervised version of the greedy layer-wise pre-training people used to do. I'm kind of surprised such a simple strategy works. It doesn't even seem like any fine-tuning is done afterwards.
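To make sure I'm reading it right, here's a minimal sketch of what I think supervised greedy layer-wise training looks like: train one conv block at a time with a throwaway classifier head on the labels, freeze it, stack the next block on top, and never do an end-to-end pass. This is my own PyTorch reconstruction, not the authors' code; the block structure, layer sizes, temporary head, and dummy data are all made up for illustration.

    import torch
    import torch.nn as nn

    def make_block(in_ch, out_ch):
        # One conv "level": conv -> ReLU -> 2x2 max-pool.
        return nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )

    def train_stage(frozen, new_block, head, data, labels, epochs=5):
        # Only the newest block and its temporary head get gradients;
        # everything trained in earlier stages stays frozen.
        params = list(new_block.parameters()) + list(head.parameters())
        opt = torch.optim.SGD(params, lr=0.01, momentum=0.9)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            with torch.no_grad():
                feats = frozen(data)          # fixed features from earlier stages
            logits = head(new_block(feats))   # supervised signal on the new layer
            loss = loss_fn(logits, labels)
            opt.zero_grad()
            loss.backward()
            opt.step()

    # Dummy inputs and labels, just to make the sketch runnable.
    x = torch.randn(32, 3, 64, 64)
    y = torch.randint(0, 10, (32,))

    channels = [3, 16, 32, 64]
    frozen = nn.Sequential()  # grows by one trained-and-frozen block per stage
    for in_ch, out_ch in zip(channels[:-1], channels[1:]):
        block = make_block(in_ch, out_ch)
        head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(out_ch, 10))
        train_stage(frozen, block, head, x, y)
        for p in block.parameters():  # freeze before moving to the next stage
            p.requires_grad_(False)
        frozen.append(block)          # head is discarded; only the block is kept
    # No end-to-end fine-tuning pass at the end, which is the part that surprised me.

If that reading is right, each stage only backprops through one block, which would explain the efficiency claim: you never pay for gradients through the whole stack.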
Has anyone looked into this Pyramid CNN architecture? It seems way more computationally efficient to train than regular ConvNets.