I remember from my Russell and Norvig AI textbook that any function a neural net with N hidden layers can represent can also be represented by a net with a single hidden layer, given enough hidden units.
Yet the trend lately seems to be toward more layers. Why is this? Is it a question of efficiency? Learning speed?
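
To make the single-hidden-layer claim concrete, here is a minimal sketch of what I mean. It is my own toy example with hand-picked weights, not something taken from the textbook: XOR is not linearly separable, yet one hidden layer of two ReLU units is enough to compute it. The function name `xor_one_hidden_layer` is just a label I made up for illustration.

```python
def relu(z):
    # Rectified linear unit: passes positive values, clamps negatives to 0.
    return max(0, z)


def xor_one_hidden_layer(x1, x2):
    # Hidden layer: two ReLU units with hand-picked (not learned) weights.
    h1 = relu(x1 + x2)        # counts how many inputs are on
    h2 = relu(x1 + x2 - 1)    # fires only when both inputs are on
    # Linear output unit: subtract twice h2 to cancel the "both on" case.
    return h1 - 2 * h2


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, xor_one_hidden_layer(a, b))
```

This only shows that one hidden layer can be sufficient for a small function. It says nothing about how many hidden units a harder function would need, which is part of what I am asking.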