Could we create a standard for text resources (blogs, tutorials, articles...) that could make machine learning as easy as requesting the machine resource from a server rather than getting the human readable form? Generating the resource once and having humans review it before publishing would improve both speed and accuracy of NLP. Selfish companies content, who choose not to create machine resources for thier content, could be stored on third-party servers. Thoughts?
[link][1 comment]