I was going through this popular tutorial on deep learning.
From what I understand, a linear decoder is used so the autoencoder's outputs can take values outside the (0, 1) range that a sigmoid activation would confine them to. The tutorial says that instead of applying the sigmoid at the output layer of the autoencoder, we simply use a linear (identity) activation there.
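For concreteness, here's roughly how I picture the two variants. This is just my own NumPy sketch, not the tutorial's code; the layer sizes and the names W1, W2, b1, b2 are made up:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    n_in, n_hidden = 8, 3
    x = rng.normal(size=n_in)            # inputs need not lie in [0, 1]

    # encoder (same in both variants): sigmoid hidden layer
    W1 = rng.normal(size=(n_hidden, n_in))
    b1 = np.zeros(n_hidden)
    a2 = sigmoid(W1 @ x + b1)

    # decoder weights
    W2 = rng.normal(size=(n_in, n_hidden))
    b2 = np.zeros(n_in)

    out_sigmoid = sigmoid(W2 @ a2 + b2)  # sigmoid decoder: stuck in (0, 1)
    out_linear  = W2 @ a2 + b2           # linear decoder: any real value

    print(out_sigmoid.min(), out_sigmoid.max())  # always within (0, 1)
    print(out_linear.min(), out_linear.max())    # can fall outside [0, 1]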
Here's what I'm thinking is happening: the output layer activations are a linear transformation of the hidden layer activations (by definition of the linear decoder). That should also mean the hidden layer activations are a linear transformation of the output layer activations, at least when the decoder weights have full column rank: a3 = W.a2, so a2 = pinv(W).a3.
Further, the output activations are (approximately) equal to the input values, since that's what the autoencoder is trained to enforce. Hence the input is a linear transformation of the hidden layer activations, and conversely, the hidden layer activations are a linear transformation of the input.
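Here's a quick numerical check of the inversion step, again just a sketch with made-up shapes; it assumes the decoder weight matrix W2 has full column rank (more output units than hidden units), so the pseudoinverse is a true left inverse:

    import numpy as np

    rng = np.random.default_rng(1)
    n_out, n_hidden = 8, 3               # more outputs than hidden units

    a2 = rng.random(n_hidden)            # some hidden activations in (0, 1)
    W2 = rng.normal(size=(n_out, n_hidden))
    b2 = rng.normal(size=n_out)

    a3 = W2 @ a2 + b2                    # linear decoder output

    # invert the linear map with the pseudoinverse
    a2_recovered = np.linalg.pinv(W2) @ (a3 - b2)
    print(np.allclose(a2, a2_recovered))  # True when W2 has full column rank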
Perhaps I'm looking at this wrong?