r/ML, I've hit a wall trying to code a simple RNN, please help!

Hey guys,

I've run into a really strange bug while coding backpropagation for an RNN in pure Python. I'm comparing my implementation's results against a Theano implementation (which is much easier to write, since Theano figures out the backpropagation for you). The weird part is that I'm getting the last row of weights correct for the input -> hidden connections, but none of the rest. That is, I'm correctly computing the derivatives of the weights from the 5th unit of the input layer to each unit in the hidden layer, but none of the others. All the details are here:

http://stackoverflow.com/questions/27544698/pure-python-rnn-and-theano-rnn-computing-different-gradients-code-and-results

(note that in the code, df is the derivative of the hidden activation function f)
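
For context, here's roughly the kind of BPTT accumulation I mean for the input -> hidden weights. This is just a minimal sketch, not my actual code; the names (W_in, W_hh, dLdh) and the convention that df takes the stored hidden state are only for illustration:

    import numpy as np

    def grad_W_in(x, h, dLdh, W_hh, df):
        # Assumed forward pass: h[t] = f(x[t] @ W_in + h[t-1] @ W_hh)
        # x:    (T, n_in)  input sequence
        # h:    (T, n_hid) hidden states saved from the forward pass
        # dLdh: (T, n_hid) dL/dh[t] coming from the layers above
        # df:   derivative of f in terms of the stored activation
        #       (e.g. for tanh, df = lambda h: 1 - h**2)
        T, n_in = x.shape
        n_hid = h.shape[1]
        dW_in = np.zeros((n_in, n_hid))
        dh = np.zeros(n_hid)                  # error flowing back through time
        for t in reversed(range(T)):
            delta = (dLdh[t] + dh) * df(h[t])  # error at the pre-activation
            dW_in += np.outer(x[t], delta)     # row i gets x[t, i] * delta
            dh = W_hh @ delta                  # propagate error to h[t-1]
        return dW_in

With this layout, row i of dW_in holds the derivatives of the weights from input unit i to every hidden unit, which is exactly the "last row correct, rest wrong" pattern I'm describing.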

I'm totally blocked here and can't figure it out for the life of me. I really thought my backpropagation code was correct, but there must be an issue in it somewhere. I'm doing gradient checks too, and Theano's implementation is computing the correct gradient. I really can't explain why only the last row of the input -> hidden weights comes out right; that makes no sense to me.
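
By gradient checks I mean the usual centred finite differences, entry by entry. Here's a generic sketch (again, not my actual code); loss would be a closure that runs the forward pass with the current weights, and the per-entry relative error makes it obvious which rows disagree:

    import numpy as np

    def gradient_check(loss, W, dW_backprop, eps=1e-5):
        # loss:        zero-arg callable returning the scalar loss for current W
        # W:           weight matrix, perturbed in place and then restored
        # dW_backprop: gradient computed by the backprop code
        dW_num = np.zeros_like(W)
        it = np.nditer(W, flags=['multi_index'])
        while not it.finished:
            i = it.multi_index
            orig = W[i]
            W[i] = orig + eps
            loss_plus = loss()
            W[i] = orig - eps
            loss_minus = loss()
            W[i] = orig                    # restore the original weight
            dW_num[i] = (loss_plus - loss_minus) / (2 * eps)
            it.iternext()
        # entry-wise relative error; large values flag disagreeing entries
        rel_err = np.abs(dW_num - dW_backprop) / (
            np.abs(dW_num) + np.abs(dW_backprop) + 1e-12)
        return dW_num, rel_err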

You're my last hope, r/ML!

submitted by blkorcut