Channel: Machine Learning

LSTM derivs for backprop through time


Hi guys, I'm struggling to implement a long short-term memory (LSTM) network. I have the forward pass done, but I'm having trouble differentiating the activation functions to get the error terms because I suck at math. The original LSTM paper uses a combination of truncated BPTT and RTRL, but the paper I'm trying to follow claims to use BPTT only (side note: does calculating the full gradient imply not updating the weights at every timestep?). If someone could walk me through how to calculate the derivatives of the cell, I'd greatly appreciate it.

TLDR: How do I calculate an LSTM cell's derivatives?
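
For reference, here is a minimal sketch of what the derivatives of a single LSTM step could look like under plain BPTT. It is not taken from either paper mentioned above. It assumes the common formulation with input, forget, and output gates and no peephole connections (the original LSTM had no forget gate), and every name in it (lstm_step_forward, lstm_step_backward, the i/f/o/g row ordering of W) is made up for illustration. With c_t = f * c_{t-1} + i * g and h_t = o * tanh(c_t), the chain rule gives dc = dc_next + dh * o * (1 - tanh(c_t)^2), and each gate's pre-activation gradient is its upstream gradient times the sigmoid derivative s * (1 - s), or 1 - g^2 for the tanh candidate.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step_forward(x, h_prev, c_prev, W, b):
    """One LSTM step (assumed variant: forget gate, no peepholes).

    W has shape (4*H, D+H); its row blocks are ordered i, f, o, g.
    """
    H = h_prev.shape[0]
    z = np.concatenate([x, h_prev])      # stacked input [x_t, h_{t-1}]
    a = W @ z + b                        # all four pre-activations at once
    i = sigmoid(a[0:H])                  # input gate
    f = sigmoid(a[H:2 * H])              # forget gate
    o = sigmoid(a[2 * H:3 * H])          # output gate
    g = np.tanh(a[3 * H:4 * H])          # candidate cell input
    c = f * c_prev + i * g               # new cell state
    tanh_c = np.tanh(c)
    h = o * tanh_c                       # new hidden state
    cache = (z, i, f, o, g, c_prev, tanh_c, W)
    return h, c, cache


def lstm_step_backward(dh, dc_next, cache):
    """Backward pass for one step.

    dh      : dL/dh_t (from the loss at step t plus from step t+1)
    dc_next : dL/dc_t flowing back from step t+1 (zeros at the last step)
    """
    z, i, f, o, g, c_prev, tanh_c, W = cache
    H = i.shape[0]
    D = z.shape[0] - H

    do = dh * tanh_c                                # through h = o * tanh(c)
    dc = dc_next + dh * o * (1.0 - tanh_c ** 2)     # total gradient on c_t
    di = dc * g                                     # through c = f*c_prev + i*g
    dg = dc * i
    df = dc * c_prev
    dc_prev = dc * f                                # passed on to step t-1

    # Gradients on the pre-activations: sigmoid' = s*(1-s), tanh' = 1-g^2.
    da = np.concatenate([
        di * i * (1.0 - i),
        df * f * (1.0 - f),
        do * o * (1.0 - o),
        dg * (1.0 - g ** 2),
    ])

    dW = np.outer(da, z)     # this step's contribution to dL/dW
    db = da
    dz = W.T @ da
    dx, dh_prev = dz[:D], dz[D:]
    return dx, dh_prev, dc_prev, dW, db
```

In full BPTT you would call lstm_step_backward from the last timestep to the first, feeding dh_prev and dc_prev into the previous step and summing dW and db over all timesteps, so the weights are updated once per sequence rather than at every timestep. Comparing dW against a finite-difference estimate is a cheap way to catch mistakes in a derivation like this.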

submitted by purpleladydragons
