Hello, first of all sorry if this is not the right subreddit for this kind of question.
Here it goes: at my job we have a set of text documents coming in every day, and so far a colleague has had the task of tagging each document based on its topic. We use a finite set of tags, numbering in the thousands.
We would like to implement a system that can determine the topic of a document and tag it automatically, using the most relevant words from the text.
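To make "most relevant words" a bit more concrete, here is a naive sketch of the kind of thing we imagine, scoring words with TF-IDF (a word counts more if it is frequent in the document but rare across the other documents). The function names `tokenize` and `top_words` and the toy corpus are made up for illustration; this is not something we actually run, and it may well not be the right approach.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase and keep runs of letters (accented Italian letters included).
    return re.findall(r"[^\W\d_]+", text.lower())

def top_words(doc, corpus, k=3):
    """Return the k words of `doc` with the highest TF-IDF score against `corpus`."""
    n_docs = len(corpus)
    # Document frequency: in how many corpus documents each word appears.
    df = Counter()
    for other in corpus:
        df.update(set(tokenize(other)))
    # Term frequency within the document being scored.
    tf = Counter(tokenize(doc))
    # Smoothed IDF so that unseen words do not cause a division by zero.
    scores = {
        word: count * math.log((1 + n_docs) / (1 + df[word]))
        for word, count in tf.items()
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]

# Toy corpus, invented for illustration only.
corpus = [
    "il governo approva la legge di bilancio",
    "la squadra vince la partita di calcio",
    "il calcio e la partita di domenica",
]
print(top_words(corpus[0], corpus, k=3))
```

The extracted words would then still have to be mapped onto our fixed tag set, which this sketch does not attempt.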
In our DB we already have documents from past years that were tagged by users, so if an ML algorithm can make use of them, we would use those too.
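As a rough idea of how the past tagged documents could be used, here is a minimal sketch, assuming each stored document comes with its list of tags: build one word-count profile per tag, then rank tags for a new document by cosine similarity. The names `train`, `cosine`, `suggest_tags` and the tiny `history` list are hypothetical, and a real solution would probably look quite different.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    # Lowercase and keep runs of letters (accented Italian letters included).
    return re.findall(r"[^\W\d_]+", text.lower())

def train(tagged_docs):
    """Build one word-count profile per tag from (text, tags) pairs."""
    profiles = defaultdict(Counter)
    for text, tags in tagged_docs:
        words = tokenize(text)
        for tag in tags:
            profiles[tag].update(words)
    return profiles

def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def suggest_tags(text, profiles, n=1):
    """Return the n tags whose profiles are most similar to `text`."""
    doc = Counter(tokenize(text))
    ranked = sorted(profiles, key=lambda tag: (-cosine(doc, profiles[tag]), tag))
    return ranked[:n]

# Toy history, invented for illustration only.
history = [
    ("il governo approva la legge di bilancio", ["politica", "economia"]),
    ("il parlamento vota la nuova legge", ["politica"]),
    ("la squadra vince la partita di calcio", ["sport"]),
]
profiles = train(history)
print(suggest_tags("la partita di calcio di domenica", profiles, n=1))
```

Function words such as "la" and "di" inflate every score here, so some form of stop-word removal or term weighting would presumably be needed, and we have no idea how well this scales to thousands of tags.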
We are a web-based IT company and don't know much about ML, so we would like to know what the best technique / method would be to achieve this.
Thanks
*Edit: the documents and the tags are all in Italian.