Hi,
When using kernels in a machine learning problem, each training point is generally used as the center of a kernel basis function. For a very large training set, this means the number of parameters to train grows with the number of data points. What are some commonly used methods for selecting a subset of the data points to serve as centers for prediction? I know that some sparse kernel methods (like SVMs) do this for particular problems, but are there generic approaches that work for classification, regression, and any choice of loss function?
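To make the setup concrete, here is a minimal sketch of the problem, not a proposed solution. It assumes a Gaussian (RBF) kernel and ridge-penalized least squares on the kernel features, both chosen only for illustration; the helper names (`rbf_kernel`, `fit_kernel_model`, `predict`) and the toy data are hypothetical. With every training point as a center the model has one weight per data point, while a randomly drawn subset of centers (the naive baseline) has far fewer.

```python
import numpy as np

def rbf_kernel(X, C, gamma=1.0):
    # Squared distances between rows of X (n, d) and centers C (m, d).
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(C**2, axis=1)[None, :]
          - 2.0 * X @ C.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_model(X, y, centers, gamma=1.0, lam=1e-3):
    # One weight per center: ridge-penalized least squares on K(X, centers).
    K = rbf_kernel(X, centers, gamma)
    A = K.T @ K + lam * np.eye(centers.shape[0])
    return np.linalg.solve(A, K.T @ y)

def predict(X_new, centers, weights, gamma=1.0):
    return rbf_kernel(X_new, centers, gamma) @ weights

# Toy regression data (assumed for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(500)

# Every training point is a center: 500 parameters.
w_full = fit_kernel_model(X, y, centers=X)

# Naive alternative: 20 centers drawn uniformly at random from the data.
idx = rng.choice(len(X), size=20, replace=False)
w_sub = fit_kernel_model(X, y, centers=X[idx])

pred_sub = predict(X, X[idx], w_sub)
print(w_full.shape, w_sub.shape)
```

The question is then how to pick `idx` better than uniformly at random, in a way that does not depend on the squared loss used in this sketch.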
Thanks!