Why don't sigmoid and tanh neural nets behave equivalently?

A sigmoid net can emulate a tanh net of the same architecture, and vice versa, since tanh(x) = 2σ(2x) - 1: double each unit's incoming weights and bias, scale its outgoing weights by 2, and fold the -1 offset into the downstream biases. I calculated the gradient for a tanh net, then used the chain rule to find the corresponding gradient for the sigmoid net emulating it, and got exactly the same gradient as computing it directly in the sigmoid net. What am I missing?
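
Here is a minimal numerical sketch of that emulation (a one-hidden-layer net in numpy; the layer sizes and variable names are mine, for illustration):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    x = rng.normal(size=5)                              # input vector
    W, b = rng.normal(size=(3, 5)), rng.normal(size=3)  # tanh hidden layer
    V, c = rng.normal(size=(1, 3)), rng.normal(size=1)  # linear output layer

    # tanh net: y = V tanh(W x + b) + c
    y_tanh = V @ np.tanh(W @ x + b) + c

    # emulating sigmoid net, using tanh(z) = 2*sigmoid(2z) - 1:
    # double the incoming weights/biases, scale the outgoing weights by 2,
    # and fold the -1 offset into the output bias
    W2, b2 = 2 * W, 2 * b
    V2, c2 = 2 * V, c - V.sum(axis=1)
    y_sig = V2 @ sigmoid(W2 @ x + b2) + c2

    print(np.allclose(y_tanh, y_sig))  # True: identical input-output behavior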

Edit: It turns out that if learning follows the gradient in the tanh net, and one observes what happens to the parameters of the corresponding sigmoid net, the sigmoid net's own gradient is not what gets followed. I guess I could calculate the tanh gradient and transform it into updates for a sigmoid net, so as to simulate a tanh net with a sigmoid net. I couldn't find any literature on this, so I'm still suspicious that I'm overlooking something.
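
The reason appears to be that plain gradient descent is not invariant under reparameterization: mapping v = 2w rescales steps by 2, while the chain rule rescales gradients by 1/2, so the two steps disagree by a factor of 4. Here is a scalar sketch (my own toy example, with hand-derived gradients) showing that one gradient step on the tanh weight w, mapped into sigmoid coordinates via v = 2w, is not the step that gradient descent on the sigmoid weight v actually takes:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x, t, lr = 1.5, 0.2, 0.1  # input, target, learning rate
    w = 0.7                   # tanh-net weight
    v = 2 * w                 # emulating sigmoid weight: tanh(w x) = 2*sigmoid(v x) - 1

    # tanh net:    L = (tanh(w x) - t)^2
    y = np.tanh(w * x)
    dL_dw = 2 * (y - t) * (1 - y**2) * x

    # sigmoid net: L = (2*sigmoid(v x) - 1 - t)^2
    s = sigmoid(v * x)
    dL_dv = 2 * (2 * s - 1 - t) * 2 * s * (1 - s) * x  # = (1/2) * dL_dw, by the chain rule

    # one gradient step in each parameterization
    w_new = w - lr * dL_dw
    v_new = v - lr * dL_dv

    print(2 * w_new)  # where the emulating sigmoid weight ends up (tanh step, mapped)
    print(v_new)      # where sigmoid gradient descent actually goes
    # the mapped step is 4x the direct step: dv = 2 dw, but dL/dv = (1/2) dL/dw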

Edit: By sigmoid function, I am referring to σ(x) = 1/(1 + exp(-x)).

submitted by justonium
