I have a theoretical question.
Does it make sense to combine a sparse autoencoder with dropout and maxout, given that dropout already adds sparsity itself? Is dropout only useful for large networks, or can I use it on a small test architecture (like 64-25-3)?
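To make the question concrete, here is a rough sketch of the kind of hidden layer I mean. It covers only the 64-25 part (the 3-unit output layer is left out) and is not a working autoencoder. The assumptions are mine: 2 pieces per maxout unit, a dropout rate of 0.5, and a plain L1 penalty on the activations standing in for the usual KL sparsity term, since maxout outputs are not confined to (0, 1). The function names are made up for illustration.

```python
import random

def maxout_layer(x, weights, biases):
    """Maxout: each hidden unit outputs the max over its k affine pieces."""
    # weights[j][p] is the weight vector of piece p of hidden unit j
    return [
        max(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
            for w, b in zip(unit_w, unit_b))
        for unit_w, unit_b in zip(weights, biases)
    ]

def dropout(h, p_drop, rng, train=True):
    """Inverted dropout: zero each unit with prob p_drop, rescale survivors."""
    if not train or p_drop == 0.0:
        return list(h)
    keep = 1.0 - p_drop
    return [v / keep if rng.random() < keep else 0.0 for v in h]

def l1_sparsity(h):
    """Mean absolute activation (assumed stand-in for a KL sparsity term)."""
    return sum(abs(v) for v in h) / len(h)

rng = random.Random(0)
n_in, n_hidden, k = 64, 25, 2  # k pieces per maxout unit is an assumption

# Small random weights, zero biases
weights = [[[rng.gauss(0, 0.1) for _ in range(n_in)] for _ in range(k)]
           for _ in range(n_hidden)]
biases = [[0.0] * k for _ in range(n_hidden)]
x = [rng.random() for _ in range(n_in)]  # one fake 64-dimensional input

h = maxout_layer(x, weights, biases)          # 25 hidden activations
h_train = dropout(h, p_drop=0.5, rng=rng)     # training-time activations
penalty = l1_sparsity(h_train)                # would be added to the loss
```

So the question is whether the dropout mask and the explicit sparsity penalty are redundant here, or whether they do different jobs.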