Trouble with RNN predicting previous input for time series


EDIT: So I figured it out. It had to do with skipping the 0th time step: instead of comparing to the target at t+1, I had to compare to the target at t+2 because of the way I was skipping.

Hi guys, I'm currently trying to get an LSTM network to predict a text sequence. My version doesn't have a forget gate or a scaling function for the cell output. I use tanh as the activation function for all nodes except the output layer, which uses a log-softmax. I have one hidden layer of 250 LSTM cells. The input and output layers are both of size 28 (26 for letters, 1 for space, and 1 for the apostrophe). The training data is currently only about 250 characters.
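For concreteness, here's a minimal sketch of the cell variant described above: no forget gate, no squashing of the cell output, tanh on the gates, and a log-softmax output layer. The weight names (Wi, Wg, Wo, ...) and the use of NumPy are my own assumptions, not taken from the gist:

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W):
    """One forward step of the LSTM variant described above:
    no forget gate, no squashing of the cell output, tanh gates.
    W is a dict of (hypothetical) weight matrices and bias vectors."""
    z = np.concatenate([x, h_prev])        # input and previous hidden state
    i = np.tanh(W['Wi'] @ z + W['bi'])     # input gate
    g = np.tanh(W['Wg'] @ z + W['bg'])     # candidate cell input
    c = c_prev + i * g                     # no forget gate: cell state accumulates
    o = np.tanh(W['Wo'] @ z + W['bo'])     # output gate
    h = o * c                              # cell output used directly, no scaling
    return h, c

def log_softmax(a):
    """Numerically stable log-softmax for the output layer."""
    a = a - a.max()
    return a - np.log(np.exp(a).sum())
```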

My problem is that once the network gets past outputting gibberish, it only predicts the previous target. The current code is at https://gist.github.com/anonymous/10404161

The thing I'm confused about that I think could be causing this problem is that I don't really understand what the inputs should be at each time step. Currently, I'm skipping the 0th prediction because the papers I read didn't specify how to evaluate y(t-1) when t = 0, so I'm starting at t = 1 and assuming that every output at t = 0 is 0.
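One way to sidestep the t = 0 problem is to initialize h and c to zero vectors and not skip anything; then the prediction made at time t is simply scored against the input character at t+1. A sketch only, where onehot, Wy, by, and hidden_size are hypothetical placeholders rather than names from the gist:

```python
# Next-character training pass with zero initial state, so t = 0
# needs no special case.
h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
loss = 0.0
for t in range(len(inputs) - 1):           # last character has no next-char target
    h, c = lstm_step(onehot(inputs[t]), h, c, W)
    logp = log_softmax(W['Wy'] @ h + W['by'])
    loss -= logp[inputs[t + 1]]            # NLL of the *next* character
```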

I'm also not entirely sure that my gradient descent is correct, but since the network does seem to learn some coherent function, I don't think it's the direct cause of the improper predictions.
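A standard sanity check for hand-rolled backprop is to compare a few analytic gradient entries against centered finite differences. A sketch, assuming loss_fn recomputes the full loss with the current parameter values:

```python
def grad_check(param, analytic_grad, loss_fn, eps=1e-5, n=10):
    """Compare n randomly chosen entries of analytic_grad against
    centered finite differences of loss_fn; relative errors around
    1e-7 to 1e-4 are typically acceptable for float64."""
    flat = param.ravel()                   # view into param, so edits apply in place
    for idx in np.random.choice(flat.size, size=n, replace=False):
        old = flat[idx]
        flat[idx] = old + eps
        loss_plus = loss_fn()
        flat[idx] = old - eps
        loss_minus = loss_fn()
        flat[idx] = old                    # restore the parameter
        numeric = (loss_plus - loss_minus) / (2 * eps)
        analytic = analytic_grad.ravel()[idx]
        rel_err = abs(numeric - analytic) / max(abs(numeric) + abs(analytic), 1e-12)
        print(f"idx {idx}: numeric={numeric:.6e} analytic={analytic:.6e} rel_err={rel_err:.2e}")
```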

I'm really sorry if my question is badly worded; my understanding of the material is shaky enough that trying to explain it is very hard for me.

The papers I've been reading are: http://arxiv.org/pdf/1308.0850v3.pdf, http://www.bioinf.jku.at/publications/older/2604.pdf, http://machinelearning.wustl.edu/mlpapers/paper_files/GersSS02.pdf

EDIT: Changing the code only so that it starts at the 0th time step instead of the 1st results in the program alternately predicting spaces and one of a few different letters. I'm not sure if this means there is a problem with my gradient descent, since the network seems to get stuck in local optima.

submitted by purpleladydragons
