I get the impression that pretraining deep neural networks is no longer common practice. Is this observation wrong?
Is it because dropout provides the needed regularization and rectified linear units are easy to train and initialize?
Or is it because everyone thinks their dataset is big enough?
Even with dropout + ReLUs + a big dataset, why wouldn't pretraining still improve results?
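To be concrete about what I'm comparing, here is a minimal PyTorch sketch (my own, with arbitrary layer sizes) of the two setups: the "modern" recipe of ReLU + dropout trained end-to-end from random init, versus greedy layer-wise autoencoder pretraining followed by fine-tuning.

```python
import torch
import torch.nn as nn

sizes = [784, 512, 256]          # hypothetical layer widths

# (a) ReLU + dropout network, trained directly from random initialization
modern_net = nn.Sequential(
    nn.Linear(sizes[0], sizes[1]), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(sizes[1], sizes[2]), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(sizes[2], 10),
)

# (b) greedy layer-wise pretraining: train each layer as an autoencoder on the
# previous layer's (frozen) representation, then fine-tune the whole stack.
def pretrain_layers(data, sizes, epochs=5, lr=1e-3):
    layers, inputs = [], data
    for d_in, d_out in zip(sizes[:-1], sizes[1:]):
        enc, dec = nn.Linear(d_in, d_out), nn.Linear(d_out, d_in)
        opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=lr)
        for _ in range(epochs):
            recon = dec(torch.relu(enc(inputs)))        # reconstruct this layer's input
            loss = nn.functional.mse_loss(recon, inputs)
            opt.zero_grad(); loss.backward(); opt.step()
        layers += [enc, nn.ReLU()]
        inputs = torch.relu(enc(inputs)).detach()       # representation fed to next layer
    return layers

x = torch.rand(128, sizes[0])                           # stand-in for unlabeled data
pretrained_net = nn.Sequential(*pretrain_layers(x, sizes), nn.Linear(sizes[-1], 10))
# ...then fine-tune pretrained_net on labels, same as modern_net.
```

My question is essentially why (a) alone seems to have won out, and whether initializing from (b) shouldn't still give at least a small boost.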