Channel: Machine Learning

Time Series Question- Please Help

So this is something with which I've been struggling lately, and I think I have a solution, but I'm curious as to how others would handle this.

I have a dataset that describes a discrete (temporal) sequence of ~15 events observed in specific individuals. The target here is a rolling cost measurement, and I also have temporal inputs that describe items that contributed to the cost in the given period. FWIW, the cost variable alone for a specific individual isn't very informative (i.e. an AR(I)MA on just the cost for just one person wouldn't be insightful). However, there are patterns in cost shared across the entire population, and there are also similar trends in the target-input relationship shared across individuals.

In addition to the sequential information, for each individual I have another set of static features, similar to demographics. This set includes ~100 features. The static and sequential data for each individual are linked.

So I know there are stats-y ways to deal with this (hierarchical models, ignoring the sequential nature of the data, etc.), but my data violates most of the necessary assumptions. Similarly, I could ignore the sequential nature of the data and feed it to any of a number of ML models, but through personal experience and domain research, I've found that this results in significant information loss and poor model performance. Likewise, I suppose I could mash the static data together with the sequential data, essentially repeating each demographic characteristic 15 times (once per time step). That doesn't seem like a good idea from a math point of view.
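For concreteness, the "mash together" option mentioned above can be sketched as follows. This is only an illustration on random data: the array shapes (4 individuals, 15 steps, 3 sequential features, 100 static features) and the helper name `tile_static` are assumptions, not part of the actual dataset.

```python
import numpy as np

def tile_static(seq, static):
    """Repeat each individual's static features at every time step.

    seq:    (n_individuals, n_steps, n_seq_features)
    static: (n_individuals, n_static_features)
    returns (n_individuals, n_steps, n_seq_features + n_static_features)
    """
    n_steps = seq.shape[1]
    # Add a time axis to the static block, then copy it n_steps times.
    repeated = np.repeat(static[:, None, :], n_steps, axis=1)
    return np.concatenate([seq, repeated], axis=2)

rng = np.random.default_rng(0)
seq = rng.normal(size=(4, 15, 3))      # hypothetical sequential inputs
static = rng.normal(size=(4, 100))     # hypothetical static features
combined = tile_static(seq, static)
print(combined.shape)  # (4, 15, 103)
```

Every time step of a given individual then carries an identical copy of the 100 static columns, so those columns have no within-individual variation.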

Thus, here's what I'm currently considering: a hybrid HMM-RNN model. The HMM will model the relationship between the static data and the rolling cost, and will have access to the cost at t-1 (a relatively standard HMM). The RNN (recurrent neural net) will model the relationship between the sequential inputs and the rolling target. As the individual sequences would be too short to train on independently, and it would be illogical to concatenate all of the sequences into one long one, I'll use a slightly different training method for the RNN: I'll train on each sequence (i.e. each individual) separately, use the final weights from one individual as the starting weights for the next, and repeat until convergence.
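A minimal sketch of the RNN training scheme described above, assuming a small Elman-style network trained with plain gradient descent on synthetic data. The network size, learning rate, synthetic target, and all function names are hypothetical choices for illustration, and the HMM half of the hybrid is not shown.

```python
import numpy as np

def init_params(n_in, n_hidden, rng):
    s = 0.1
    return {
        "Wx": rng.normal(scale=s, size=(n_hidden, n_in)),
        "Wh": rng.normal(scale=s, size=(n_hidden, n_hidden)),
        "b": np.zeros(n_hidden),
        "wo": rng.normal(scale=s, size=n_hidden),
        "c": np.zeros(1),
    }

def forward(params, x):
    T = x.shape[0]
    h = np.zeros((T + 1, params["b"].shape[0]))  # h[0] is the initial state
    y = np.zeros(T)
    for t in range(T):
        h[t + 1] = np.tanh(params["Wx"] @ x[t] + params["Wh"] @ h[t] + params["b"])
        y[t] = params["wo"] @ h[t + 1] + params["c"][0]
    return h, y

def loss_and_grads(params, x, target):
    """Half mean squared error over one sequence, with gradients via BPTT."""
    T = x.shape[0]
    h, y = forward(params, x)
    err = y - target
    loss = 0.5 * np.mean(err ** 2)
    grads = {k: np.zeros_like(v) for k, v in params.items()}
    dh_next = np.zeros_like(params["b"])
    for t in reversed(range(T)):
        dy = err[t] / T
        grads["wo"] += dy * h[t + 1]
        grads["c"] += dy
        dh = dy * params["wo"] + dh_next       # from the output and from step t+1
        dz = dh * (1.0 - h[t + 1] ** 2)        # back through tanh
        grads["Wx"] += np.outer(dz, x[t])
        grads["Wh"] += np.outer(dz, h[t])
        grads["b"] += dz
        dh_next = params["Wh"].T @ dz
    return loss, grads

def train_sequentially(sequences, targets, n_hidden=8, lr=0.05,
                       n_passes=100, steps_per_individual=1, seed=0):
    """One shared set of weights; each individual starts from the weights
    left by the previous individual, and the whole cycle is repeated."""
    rng = np.random.default_rng(seed)
    params = init_params(sequences[0].shape[1], n_hidden, rng)
    history = []
    for _ in range(n_passes):
        total = 0.0
        for x, target in zip(sequences, targets):
            for _ in range(steps_per_individual):
                loss, grads = loss_and_grads(params, x, target)
                for k in params:
                    params[k] -= lr * grads[k]
            total += loss
        history.append(total / len(sequences))
    return params, history

# Synthetic stand-in: 20 individuals, 15 steps, 2 sequential inputs.
rng = np.random.default_rng(0)
sequences = [rng.normal(size=(15, 2)) for _ in range(20)]
# Target depends on the current first input and the previous second input.
targets = [0.6 * x[:, 0] + 0.4 * np.concatenate([[0.0], x[:-1, 1]])
           for x in sequences]
params, history = train_sequentially(sequences, targets)
print(history[0], history[-1])
```

With `steps_per_individual=1`, this scheme amounts to ordinary stochastic gradient descent with one sequence per update, so the short length of each sequence is not by itself an obstacle to sharing one RNN across the population.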

I'd greatly appreciate criticism of my idea, recommendations of different techniques, or paper suggestions. Honestly, I'd be exceedingly grateful for just a hint on search keywords, as I'm at a loss here.

submitted by fhadley
