This may be a question for NLP, but what are some approaches for analyzing text content from a user forum like reddit, filtering out the noise, and converting what remains into useful information?
Say I am on /askscience. Is there a way to filter out the noise content and then convert the remaining information into a searchable database? For example, if I scan /askscience and then perform a query "microbiology", I would like it to return results related to microbiology.
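
To make the question concrete, here is a minimal sketch of the filter-then-search idea in plain Python. Everything in it is an assumption for illustration: the sample posts are made up, `is_noise` is a crude length heuristic standing in for a real noise filter, and the "database" is just an in-memory TF-IDF index rather than an actual database.

```python
import math
import re
from collections import Counter, defaultdict

TOKEN_RE = re.compile(r"[a-z]+")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return TOKEN_RE.findall(text.lower())


def is_noise(text, min_tokens=5):
    """Hypothetical noise heuristic: treat very short posts as noise."""
    return len(tokenize(text)) < min_tokens


def build_index(posts):
    """Drop noisy posts, then map each term to {post_id: TF-IDF weight}."""
    kept = [p for p in posts if not is_noise(p)]
    term_counts = [Counter(tokenize(p)) for p in kept]

    # Document frequency: in how many kept posts each term appears.
    doc_freq = Counter()
    for counts in term_counts:
        doc_freq.update(counts.keys())

    index = defaultdict(dict)
    n = len(kept)
    for doc_id, counts in enumerate(term_counts):
        for term, tf in counts.items():
            # Smoothed inverse document frequency.
            idf = math.log((1 + n) / (1 + doc_freq[term])) + 1
            index[term][doc_id] = tf * idf
    return kept, index


def search(query, kept, index, top_k=5):
    """Return up to top_k kept posts, ranked by summed TF-IDF weight."""
    scores = defaultdict(float)
    for term in tokenize(query):
        for doc_id, weight in index.get(term, {}).items():
            scores[doc_id] += weight
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [kept[doc_id] for doc_id, _ in ranked[:top_k]]


# Made-up sample posts, not real /askscience content.
posts = [
    "lol",
    "This!",
    "In microbiology, how do bacteria develop resistance to antibiotics over time?",
    "Why is the sky blue during the day but red at sunset?",
    "Does microbiology explain why gut bacteria differ so much between people?",
]

kept, index = build_index(posts)
results = search("microbiology", kept, index)
for post in results:
    print(post)
```

This only matches exact words, so a query for "microbiology" would miss a post that only says "microbes". Stemming, topic models, or embedding-based similarity would be the usual next steps, and a real noise filter would probably use more signals than post length.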