What algorithms do other people use for hyperparameter optimisation and how long do you spend on it?
I quite like random sampling of hyperparams as it is so simple to implement and will typically get a good spread of samples.
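For example, something like this (made-up ranges and a dummy objective, just to show the idea — note the learning rate is sampled log-uniformly):

```python
import math
import random

def sample_config():
    # Log-uniform sampling for the learning rate covers small and large
    # values evenly; the other ranges are just placeholders.
    return {
        "lr": 10 ** random.uniform(-5, -1),
        "dropout": random.uniform(0.0, 0.5),
        "batch_size": random.choice([32, 64, 128, 256]),
    }

def train_and_evaluate(config):
    # Stand-in objective: swap in an actual training run that returns
    # validation loss for the given config.
    return (math.log10(config["lr"]) + 3) ** 2 + config["dropout"]

best_loss, best_config = math.inf, None
for _ in range(50):
    config = sample_config()
    loss = train_and_evaluate(config)
    if loss < best_loss:
        best_loss, best_config = loss, config

print(best_loss, best_config)
```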
Gaussian Process optimisation will often beat random sampling in my experience, but it can be sensitive to parameters being specified in the wrong space (e.g. learning rate vs. log(learning rate)), among other difficulties.
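If you use a library like scikit-optimize, you can declare the learning rate with a log-uniform prior so the GP models it on a log scale, which sidesteps that issue. A rough sketch (made-up ranges, dummy objective):

```python
import math
from skopt import gp_minimize
from skopt.space import Real, Integer

# Giving the learning rate a log-uniform prior means the GP works in
# log space for that dimension instead of the raw linear scale.
space = [
    Real(1e-5, 1e-1, prior="log-uniform", name="lr"),
    Real(0.0, 0.5, name="dropout"),
    Integer(32, 256, name="batch_size"),
]

def objective(params):
    lr, dropout, batch_size = params
    # Stand-in for a real training run returning validation loss.
    return (math.log10(lr) + 3) ** 2 + dropout

result = gp_minimize(objective, space, n_calls=30, random_state=0)
print(result.x, result.fun)
```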
I haven't tried TPE (http://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf) but it looks respectable.
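For anyone curious, TPE is implemented in the hyperopt library; a minimal usage sketch (same dummy objective as above, ranges made up) would be:

```python
import math
from hyperopt import fmin, tpe, hp, Trials

# hp.loguniform samples in log space, so the learning rate bounds are
# given as log values.
space = {
    "lr": hp.loguniform("lr", math.log(1e-5), math.log(1e-1)),
    "dropout": hp.uniform("dropout", 0.0, 0.5),
}

def objective(config):
    # Stand-in for a real training run returning validation loss.
    return (math.log10(config["lr"]) + 3) ** 2 + config["dropout"]

trials = Trials()
best = fmin(objective, space, algo=tpe.suggest, max_evals=50, trials=trials)
print(best)
```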
I haven't experimented much with manual tuning.
I am currently working on a problem where my models just aren't training nicely and would be grateful to hear any advice or opinions.