I've seen in literature and practice that most machine learning systems in the text domain use Bag-of-Words representation of data. Is this really the best option?
I really think we can come up with something better, BOW is extremely naive and throws away a ton of useful information.
[link][3 comments]