Hi guys, I wrote a recurrent network using long short-term memory (LSTM) cells and a mixture density output layer. There are 3 input units, 900 LSTM cells, and 121 output nodes, so there are a lot of calculations to do. I'm trying to train the network on handwriting data, which consists of paragraphs of handwritten text. Unfortunately, my code takes about 8 hours just to do backpropagation on a single stroke/letter. I don't know whether my code is just that poorly written or whether this is the speed I should expect. Any tips on how I can speed it up?
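
For a rough sense of scale, the layer sizes above can be turned into a back-of-the-envelope operation count. This is only a sketch: it assumes a single layer of standard LSTM cells (four gate blocks, no peephole connections) feeding a fully connected output layer, none of which is confirmed above, and the function names are made up for illustration.

```python
# Rough cost estimate for the network sizes given above.
# Assumptions (not stated in the post): one LSTM layer, standard cells with
# four gate blocks and no peephole connections, a fully connected output layer.

N_IN, N_HIDDEN, N_OUT = 3, 900, 121


def lstm_weight_count(n_in, n_hidden):
    """Weights plus biases for the three gates and the cell candidate."""
    per_gate = (n_in + n_hidden) * n_hidden + n_hidden
    return 4 * per_gate


def dense_weight_count(n_in, n_out):
    """Weights plus biases for a fully connected layer."""
    return n_in * n_out + n_out


def macs_per_step(n_in, n_hidden, n_out):
    """Multiply-accumulates for one forward time step (biases excluded)."""
    return 4 * (n_in + n_hidden) * n_hidden + n_hidden * n_out


total_params = lstm_weight_count(N_IN, N_HIDDEN) + dense_weight_count(N_HIDDEN, N_OUT)
forward_macs = macs_per_step(N_IN, N_HIDDEN, N_OUT)

print("parameters:", total_params)                  # 3363421
print("forward MACs per time step:", forward_macs)  # 3359700
```

Backpropagation through time usually costs a small constant multiple of the forward pass, so under these assumptions the work per time step is on the order of ten million multiply-adds. That gives a baseline to compare the measured run time against.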