I am looking into using an ML regression technique to approximate a very computationally expensive, but known, function in order to improve throughput (is this a common approach?).
This function maps a 40-dimensional (floating-point) parameter space to a (floating-point) scalar. The 40 dimensions can be reduced (using symmetry and other conserved quantities) to roughly 21 parameters (some of which are periodic), and the internal correlations are known very well and are non-linear.
Over the course of the last few months, we (with the help of a nice global compute network) have spent nearly 350 CPU-years evaluating this function on the order of 2.2 billion times, which works out to an average run time of just over 5 seconds per evaluation.
In the coming years, we will need to scale up that 2.2 billion number by orders of magnitude, and I don't believe our compute network will scale at that rate, so we need to come up with a solution.
My idea is to use the 2.2 billion samples we already have to train some ML regressor (most likely a neural network, leaning on the universal approximation theorem) to approximate this function. Evaluating the trained network should take orders of magnitude less time per sample than the 5-second exact function, and the information lost to the approximation should be recoverable using a separate technique I recently developed (unique to this problem domain).
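To make the "orders of magnitude" claim concrete: evaluating a small fitted MLP is just a handful of dense matrix-vector products. A rough C++ sketch of the inference side (layer sizes are placeholders, and I have not actually trained this):

```cpp
#include <vector>
#include <algorithm>
#include <cstddef>

// One fully-connected layer of an already-trained MLP (weights assumed to come
// from offline training on the existing samples).
struct DenseLayer {
    std::size_t in, out;
    std::vector<double> W;   // row-major, out x in
    std::vector<double> b;   // out

    std::vector<double> forward(const std::vector<double>& x, bool relu) const {
        std::vector<double> y(out);
        for (std::size_t i = 0; i < out; ++i) {
            double acc = b[i];
            for (std::size_t j = 0; j < in; ++j)
                acc += W[i * in + j] * x[j];
            y[i] = relu ? std::max(0.0, acc) : acc;
        }
        return y;
    }
};

// Placeholder architecture: 21 reduced inputs -> two hidden layers of 64 -> 1 scalar.
// Per-call cost is roughly 21*64 + 64*64 + 64*1 multiply-adds, i.e. microseconds,
// versus ~5 s for the exact function.
double evaluate(const std::vector<DenseLayer>& net, std::vector<double> x) {
    for (std::size_t l = 0; l < net.size(); ++l)
        x = net[l].forward(x, /*relu=*/l + 1 < net.size());  // linear output layer
    return x[0];
}
```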
Initially, I applied my own implementation of a kNN regression technique, which got modest results and definitely showed that there was something to be gained, but the approximation was too rough for "prime time" ... so I need to step it up. I am leaning toward an SGD Regressor, but I don't have a handy C++ implementation in the package I am currently using (suggestions welcome); a minimal sketch of what I mean is below.
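For reference, this is roughly the kind of SGD regressor I have in mind: a minimal, self-contained C++ sketch (plain linear model, squared loss). I realize a purely linear model almost certainly won't capture the non-linear correlations without a basis expansion or a network on top, so treat it only as an illustration of the technique:

```cpp
#include <vector>
#include <random>
#include <cstddef>

// Minimal linear regressor trained by stochastic gradient descent on squared loss.
struct SGDRegressor {
    std::vector<double> w;  // one weight per input feature
    double b = 0.0;         // bias term

    void fit(const std::vector<std::vector<double>>& X,
             const std::vector<double>& y,
             std::size_t epochs = 5, double lr = 1e-3) {
        w.assign(X[0].size(), 0.0);
        std::mt19937 rng(42);
        std::uniform_int_distribution<std::size_t> pick(0, X.size() - 1);
        for (std::size_t step = 0; step < epochs * X.size(); ++step) {
            const std::size_t i = pick(rng);           // draw one random sample
            double r = b;                              // residual = prediction - target
            for (std::size_t j = 0; j < w.size(); ++j) r += w[j] * X[i][j];
            r -= y[i];
            for (std::size_t j = 0; j < w.size(); ++j) // gradient step on squared loss
                w[j] -= lr * r * X[i][j];
            b -= lr * r;
        }
    }

    double predict(const std::vector<double>& x) const {
        double p = b;
        for (std::size_t j = 0; j < w.size(); ++j) p += w[j] * x[j];
        return p;
    }
};
```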
So, any suggestions for where to start? Or, more importantly, avenues to avoid? I am probably most interested in a discussion of ML-based performance optimization, as I am familiar with some ML techniques but am certainly not a dedicated data scientist.
(Thanks in advance for your time and patience.)