I would like to estimate the exact position of certain "landmarks" in a low-dimensional time series. Each landmark is characterized by a specific type of pattern in the data. I already know the type of each landmark and its approximate location; a region of 40 frames around each estimate should contain the landmark along with sufficient contextual information.
Being new to machine learning, I'm not sure what a sensible approach to a problem like this would be. For instance, if this can be done with a neural net, what would the inputs, outputs, and training cases look like? Would you predict the likelihood of the landmark being at each of the 40 input frames (i.e. have 40 outputs)? If so, would you present each training sample multiple times with different 40-frame "windows"? Or would you instead predict the likelihood of the landmark being at, say, the center of the region presented to the network (i.e. a single output), and use a sliding window to search for the most likely position?
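To make the two options concrete, here is a rough sketch of how I imagine the training cases would be built in each case. It is only an illustration under my own assumptions: the function names, the `search_radius` parameter, and the `score_fn` placeholder (standing in for whatever trained model returns a likelihood) are all made up, and the series is assumed to be a plain list with one entry per frame.

```python
import random

WINDOW = 40  # frames per region, as described above


def multi_output_samples(series, landmark, n_samples, seed=0):
    """Option 1: 40 outputs. Each sample is a randomly shifted 40-frame
    window whose target is one-hot at the landmark's index in the window."""
    rng = random.Random(seed)
    # Valid starts keep the landmark inside the window and the window
    # inside the series (assumes len(series) >= WINDOW).
    lo = max(0, landmark - WINDOW + 1)
    hi = min(landmark, len(series) - WINDOW)
    samples = []
    for _ in range(n_samples):
        start = rng.randint(lo, hi)  # inclusive on both ends
        target = [0.0] * WINDOW
        target[landmark - start] = 1.0
        samples.append((series[start:start + WINDOW], target))
    return samples


def _centered_windows(series, approx, search_radius):
    """Yield (center, window) for every candidate center near approx,
    skipping windows that would run off either end of the series."""
    half = WINDOW // 2
    for center in range(approx - search_radius, approx + search_radius + 1):
        start = center - half
        if start >= 0 and start + WINDOW <= len(series):
            yield center, series[start:start + WINDOW]


def center_output_samples(series, landmark, search_radius):
    """Option 2: single output. Target is 1.0 only for the window whose
    center frame is the landmark, 0.0 for the shifted windows."""
    return [(window, 1.0 if center == landmark else 0.0)
            for center, window in _centered_windows(series, landmark, search_radius)]


def locate_by_sliding(series, score_fn, approx, search_radius):
    """Option 2 at prediction time: slide the window around the approximate
    location and return the center frame with the highest score."""
    best_center, best_score = None, float("-inf")
    for center, window in _centered_windows(series, approx, search_radius):
        score = score_fn(window)
        if score > best_score:
            best_center, best_score = center, score
    return best_center
```

With option 1 the landmark position would be read off as the index of the largest of the 40 outputs plus the window start; with option 2 it is whatever `locate_by_sliding` returns. Note that option 2 as sketched gives one positive case per landmark and many negatives, so the classes would be heavily imbalanced.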