I was wondering how people organize their machine learning projects. Specifically, it is very common to have a pipeline that starts with data in some sort of database and feeds it through several sequential algorithms (an example taken from Andrew Ng's course: we start with raw images, extract the locations of the digits that appear in them, then feed these to a digit recognizer).
- Where do you store intermediary results?
- How do you store your trained classifiers?
- How do you store the results of different parameterizations or hyper-parameterizations of your algorithms? (for example, substituting ground-truth output for one stage, as if that stage were perfect, so we can see how that affects the final output)
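To make the question concrete, here is the kind of naive approach I'd like to improve on: a hypothetical sketch (all names are made up) that pickles each stage's output to disk, keyed by a hash of that stage's hyper-parameters, so re-runs with the same settings can be loaded instead of recomputed.

```python
import hashlib
import json
import os
import pickle
import tempfile

def params_key(params):
    """Stable, filename-safe key derived from a hyper-parameter dict."""
    blob = json.dumps(params, sort_keys=True).encode()
    return hashlib.sha1(blob).hexdigest()[:12]

def save_stage(outdir, stage, params, result):
    """Pickle one pipeline stage's result, tagged with its parameters."""
    path = os.path.join(outdir, f"{stage}-{params_key(params)}.pkl")
    with open(path, "wb") as f:
        pickle.dump({"params": params, "result": result}, f)
    return path

def load_stage(outdir, stage, params):
    """Load the cached result for this stage/parameter combination."""
    path = os.path.join(outdir, f"{stage}-{params_key(params)}.pkl")
    with open(path, "rb") as f:
        return pickle.load(f)["result"]

# Example: cache the (made-up) digit-location stage's output.
outdir = tempfile.mkdtemp()
params = {"threshold": 0.5, "window": 32}
save_stage(outdir, "digit-locations", params, [(10, 4), (31, 7)])
print(load_stage(outdir, "digit-locations", params))  # → [(10, 4), (31, 7)]
```

This works for small projects, but it's ad hoc: the same scheme would presumably hold trained classifiers too (pickled next to their hyper-parameters), which is exactly what I'm asking whether people do better.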
Thanks!