Channel: Machine Learning

What algorithm/features to use for classifying resumes


I finally finished the Coursera course on machine learning, and I want to apply what I learned to an actual problem. An interesting problem I stumbled upon is classifying candidate resumes for job positions. Let's say we want to classify each resume into one of 4 categories ("Should definitely invite for interview", "Should probably invite for interview", "Should probably decline to interview", "Should definitely decline to interview"). I plan to use either logistic regression or an SVM as the classification algorithm.
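To make the target concrete, here is a rough sketch of how the four categories could be turned into integer class ids. The names `Decision`, `LABEL_TEXT`, and `encode_labels` are made up for illustration. One thing to keep in mind: the categories are ordered, but a plain multi-class logistic regression or SVM would treat the ids as unordered classes.

```python
from enum import IntEnum


class Decision(IntEnum):
    """The four target categories, ordered from least to most favourable."""
    DEFINITELY_DECLINE = 0
    PROBABLY_DECLINE = 1
    PROBABLY_INVITE = 2
    DEFINITELY_INVITE = 3


# Hypothetical mapping from the label text stored with the historical
# resumes to a class id; the strings are the four categories above.
LABEL_TEXT = {
    "Should definitely invite for interview": Decision.DEFINITELY_INVITE,
    "Should probably invite for interview": Decision.PROBABLY_INVITE,
    "Should probably decline to interview": Decision.PROBABLY_DECLINE,
    "Should definitely decline to interview": Decision.DEFINITELY_DECLINE,
}


def encode_labels(labels):
    """Turn a list of label strings into integer class ids."""
    return [int(LABEL_TEXT[text]) for text in labels]
```
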

Setting aside the problem of parsing resumes into a structured form, let's assume that for each resume I have information such as name, address, education (university name, degree, major, year of graduation, GPA), work experience (employer name, title, dates worked, description of role), and maybe descriptions of personal projects the candidate has worked on and technologies they have used.
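As a sketch of what that structured form might look like, assuming the parsing is already done: the class and field names below are my own guesses based on the list above, not the output of any real resume parser, and the date format is an assumption too.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Education:
    university: str
    degree: str
    major: str
    graduation_year: int
    gpa: Optional[float] = None  # not every resume lists a GPA


@dataclass
class Job:
    employer: str
    title: str
    start: str            # e.g. "2010-06"; the date format is an assumption
    end: Optional[str]    # None for a position the candidate still holds
    description: str = ""


@dataclass
class Resume:
    name: str
    address: str
    education: List[Education] = field(default_factory=list)
    experience: List[Job] = field(default_factory=list)
    projects: List[str] = field(default_factory=list)      # free-text descriptions
    technologies: List[str] = field(default_factory=list)
```
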

Let's assume that we are training this algorithm for a particular employer and a particular position (because each employer probably looks for different things in a resume). Let's also assume that we have good historical training data (a mapping from resumes to one of the categories above). Some of the features are obvious, such as university names, employer names, job titles, GPA, major, etc. I can also think of keyword-based features. For example, I can include a feature based on the keyword "JAVA" (whether a job description contains the string "JAVA"). I'm not sure whether adding these keyword-based features makes sense, and beyond that I can't think of anything else to add. I don't know much about NLP, so could I use some NLP techniques to add features?
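Here is a minimal sketch of the features I have in mind, assuming each resume has been flattened into a plain dict. The keys, the `KEYWORDS` list, and the function names are all hypothetical. Categorical fields become one-hot style entries, GPA stays numeric, and each keyword becomes a 0/1 feature. I match on whole tokens rather than substrings, since a substring test for "JAVA" would also fire on "JavaScript".

```python
import re

# Hypothetical keyword list; in practice it would be chosen per position.
KEYWORDS = ["java", "python", "sql"]


def tokenize(text):
    """Lowercase the text and return its set of word-like tokens."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower()))


def extract_features(resume):
    """Map one flattened resume dict to a {feature_name: value} dict."""
    features = {}

    # Categorical fields: one indicator feature per observed value.
    features["university=" + resume["university"].lower()] = 1.0
    features["major=" + resume["major"].lower()] = 1.0
    for employer in resume["employers"]:
        features["employer=" + employer.lower()] = 1.0
    for title in resume["titles"]:
        features["title=" + title.lower()] = 1.0

    # Numeric field: kept as-is when present.
    if resume.get("gpa") is not None:
        features["gpa"] = float(resume["gpa"])

    # Keyword features: 1.0 if the keyword appears as a whole token
    # in any role description, else 0.0.
    tokens = tokenize(" ".join(resume["descriptions"]))
    for keyword in KEYWORDS:
        features["kw=" + keyword] = 1.0 if keyword in tokens else 0.0

    return features
```

Dicts like this can be turned into a feature matrix with something like scikit-learn's `DictVectorizer` before fitting logistic regression or an SVM. The keyword features are really a hand-picked subset of a bag-of-words over the description text.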

submitted by ipoman
