Machine learning is the task of finding a concise set of rules that generalizes some behavior in data. I've long wondered whether one could build a learning algorithm on top of regular expressions, which are meant to do something similar (describe a whole bunch of complicated strings with a few highly general constructs). As simple contrived examples, a regular expression would make a nice classifier for discriminating numbers from words, or for telling whether a string is an email address.
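To make the contrived examples concrete, here is a minimal sketch of what I mean by "regexp as classifier". The patterns, the labels, and the `classify` helper are all made up for illustration, and they are hand-written rather than learned. The email pattern in particular is deliberately loose and nowhere near a real address validator.

```python
import re

# Hypothetical hand-written patterns; each acts as a binary classifier
# that accepts or rejects a whole token.
NUMBER = re.compile(r"[+-]?\d+(\.\d+)?")
WORD = re.compile(r"[A-Za-z]+")
# Deliberately loose: something@something.something, no whitespace.
EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

# Order matters: the first pattern that matches the full token wins.
PATTERNS = (("number", NUMBER), ("email", EMAIL), ("word", WORD))


def classify(token: str) -> str:
    """Return the label of the first pattern matching the entire token."""
    for label, pattern in PATTERNS:
        if pattern.fullmatch(token):
            return label
    return "unknown"


if __name__ == "__main__":
    for token in ["42", "-3.14", "hello", "someone@example.com", "foo bar"]:
        print(repr(token), "->", classify(token))
```

The learning problem would then be to find patterns like these automatically from labeled strings instead of writing them by hand.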
I'm sure there's some deep connection between ML and the way regexps work, via encoding, compression, manifolds, etc. Has anyone else thought about this or seen work along these lines? I imagine there is also some way to extend regexps to numerical data?
(For clarity, I am talking about using regexps to do ML, not using ML to find regexps.)