I just came across Extreme Learning Machines (ELMs), and they seem to have some nice properties. Essentially, an ELM is a feedforward neural network with a single hidden layer of many nodes. However, the weights connecting the inputs to the hidden nodes are assigned randomly and never updated; only the weights between the hidden nodes and the outputs are learned, in a single step.
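For concreteness, here's a minimal sketch of what I mean (my own toy code, not any particular library's API): the hidden weights are drawn once at random, and only the output weights are solved for in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_fit(X, y, n_hidden=200):
    # Hidden layer: random weights and biases, fixed forever (never trained)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)            # hidden-layer activations
    # Output layer: one closed-form least-squares solve via the pseudoinverse
    beta = np.linalg.pinv(H) @ y
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```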
From what I've read, the method is not new. Another post claims it is similar to Eugene Wigner's random matrices from the 1950s, Gaussian processes from the 90s, or Random Kitchen Sinks.
I started playing with a Python implementation of ELM, and it is crazy fast and works quite well. It doesn't seem to overfit under cross-validation, and it is pretty insensitive to the number of hidden nodes beyond a certain point. It handles simple linear relationships and more complex nonlinear ones quite well, and it copes with noise. Also, it is just so darn simple; a quick check along these lines is sketched below.
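Roughly the kind of toy check I mean, continuing from the sketch above (the target function and sizes here are just illustrative, not the exact experiment I ran):

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy nonlinear toy problem; uses elm_fit / elm_predict from the sketch above.
X = rng.uniform(-3, 3, size=(1000, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=1000)

X_train, X_test = X[:800], X[800:]
y_train, y_test = y[:800], y[800:]

W, b, beta = elm_fit(X_train, y_train, n_hidden=200)
pred = elm_predict(X_test, W, b, beta)
print("test RMSE:", np.sqrt(np.mean((pred - y_test) ** 2)))
```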
So what is the catch? Are these widely used? If not, why?