
Question About Recurrent Neural Network Training Procedure


Hello,

I am experimenting with recurrent neural networks for time-series prediction. In the simplest case, each time step of the RNN predicts x[t + 1] using x[t] as its only input feature.

In the part of the time series used for validation/testing, the values of x[t] are not known, so the RNN's prediction from the previous time step is used instead (or a sample from the conditional distribution, if the network models p(x[t] | x[1:t-1])).
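Concretely, the free-running generation over the test window can be sketched as below. The tiny tanh RNN cell and its random (untrained) weights are illustrative assumptions, not the model I'm actually using:

```python
import numpy as np

# Toy one-step RNN cell with random, untrained weights -- purely
# illustrative, not an actual trained model.
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(1, 8))
W_hh = rng.normal(size=(8, 8))
W_hy = rng.normal(size=(8, 1))

def step(h, x):
    """One recurrent step: consume x[t], return the new hidden state
    and a prediction for x[t + 1]."""
    h_new = np.tanh(x @ W_xh + h @ W_hh)
    return h_new, h_new @ W_hy

def rollout(h, x_last, n_steps):
    """Free-running generation over the test window: each prediction
    is fed back as the next step's input."""
    preds, x = [], x_last
    for _ in range(n_steps):
        h, y = step(h, x)
        preds.append(float(y[0, 0]))
        x = y
    return preds

# Start from the last observed value and predict 10 steps ahead.
preds = rollout(np.zeros((1, 8)), np.array([[0.5]]), n_steps=10)
```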

In the training window, however, I have a choice between feeding in the value of x[t] predicted at the previous time step and feeding in the known value of x[t]. One option is to always use the known value of x[t] in the training window (i.e., teacher forcing). Another option, which Alex Graves uses in his paper on wind power forecasting, is to split the part of the time series available for training: use the observed values of x[t] for the first part and the network's own predicted values of x[t] for the second part.
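A minimal sketch of that split, assuming the same kind of toy one-step RNN cell as above (the cell, the random weights, and the cutoff `k` are illustrative assumptions, not Graves's actual setup): inputs before step k come from the observed series, and inputs afterwards come from the network's own predictions.

```python
import numpy as np

# Toy one-step RNN cell with random, untrained weights -- illustrative only.
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(1, 8))
W_hh = rng.normal(size=(8, 8))
W_hy = rng.normal(size=(8, 1))

def step(h, x):
    h_new = np.tanh(x @ W_xh + h @ W_hh)
    return h_new, h_new @ W_hy  # predict x[t + 1]

def unroll_inputs(series, k):
    """Inputs fed to the network over one training pass: the observed
    x[t] for the first k steps (teacher forcing), then the network's
    own predictions, mirroring the Graves-style split."""
    h = np.zeros((1, 8))
    inputs, x = [], series[0]
    for t in range(len(series) - 1):
        inputs.append(x)
        h, pred = step(h, np.array([[x]]))
        # Choose the next input: ground truth inside the forced prefix,
        # the network's prediction beyond it.
        x = series[t + 1] if t + 1 < k else pred[0, 0]
    return inputs

series = np.sin(np.linspace(0, 3, 12)).tolist()
inputs = unroll_inputs(series, k=6)
# inputs[:6] are observed values; inputs[6:] are model predictions
```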

I suspect that the first approach might make it harder for the network to learn long-range dependencies, because the network is never forced to make long-range predictions during training. However, I'm curious whether there is any more formal/theoretical analysis of this problem.

submitted by alexmlamb
