Hi, all. (As a quick caveat, I'm brand new to ML, so explain like I'm 5.)
I help organize a pretty large conference every summer at my university. Part of that is taking in more than 200 abstracts of about 1,000 words each and assigning them to referees for blind review. This process takes 20-25 man-hours or more. It dawned on me that there may be some way of teaching a computer which referees would be a good fit for which abstracts.
I have access to close to 1,000 past abstracts, along with the referees that humans assigned to them. Each abstract is about 1,000 words long and was assigned either two or three referees. I think this is a pretty strong dataset to start training with.
Is there some way of using ML to analyze these patterns so that a program could recognize that, for example, abstracts that contain phrases A, B, and C are usually assigned to referee X? If so, where would I start? I have a decent amount of experience with programming (C++, Java, PHP), and some basic knowledge of R.
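To make the question concrete, here is a minimal sketch of the kind of thing I have in mind, written in plain Python with no ML library. It weights words by TF-IDF (rare words count more than common ones), builds a "profile" for each referee by adding up the word weights of the abstracts they reviewed, and ranks referees for a new abstract by cosine similarity. Everything in it is an assumption for illustration: the four toy abstracts, the referee names, and the function names `build_profiles` and `suggest_referees` are made up, and I don't know whether this is the right approach or just a baseline.

```python
import math
import re
from collections import Counter, defaultdict

# Toy stand-ins for past abstracts and their human-assigned referees.
# All texts and referee names are invented for illustration.
PAST = [
    ("neural network training for image recognition", ["referee_x", "referee_y"]),
    ("deep neural network models for image labeling", ["referee_x"]),
    ("medieval trade routes and port cities", ["referee_z"]),
    ("port cities and merchant trade in medieval europe", ["referee_z", "referee_y"]),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_idf(docs):
    """Smoothed inverse document frequency for every word seen in docs."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each word once per document
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in df.items()}


def tfidf(tokens, idf):
    """Word -> weight for one document; words never seen before are dropped."""
    counts = Counter(tokens)
    return {w: c * idf[w] for w, c in counts.items() if w in idf}


def cosine(a, b):
    """Cosine similarity between two sparse word-weight dictionaries."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_profiles(past):
    """Sum the TF-IDF vectors of each referee's past abstracts."""
    docs = [tokenize(text) for text, _ in past]
    idf = build_idf(docs)
    profiles = defaultdict(lambda: defaultdict(float))
    for tokens, (_, referees) in zip(docs, past):
        vec = tfidf(tokens, idf)
        for referee in referees:
            for word, weight in vec.items():
                profiles[referee][word] += weight
    return idf, profiles


def suggest_referees(text, idf, profiles, k=2):
    """Return the k referees whose profiles are most similar to the text."""
    vec = tfidf(tokenize(text), idf)
    scored = sorted(
        ((cosine(vec, profile), referee) for referee, profile in profiles.items()),
        reverse=True,
    )
    return [referee for _, referee in scored[:k]]


idf, profiles = build_profiles(PAST)
print(suggest_referees("a neural network for image data", idf, profiles))
```

With real data, `PAST` would hold the roughly 1,000 past abstracts, and the ranked list would only be a suggestion for a human to check, since this sketch ignores things like referee workload and conflicts of interest.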
Thanks for your time.