My data has three columns:
unique id | text string | label (1 or 0)
Imagine the text strings are jokes and the label is 1 for funny, 0 for not funny. The jokes vary in length. I want to see whether any words in the strings are more strongly associated with a joke being funny or not funny.
What would be the best way to begin this analysis?
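
One possible first pass, as a rough sketch rather than a definitive method: for each word, compare how often it appears in funny jokes versus unfunny ones using a smoothed log-odds ratio. Everything named here is an assumption for illustration: the function `word_label_log_odds`, the simple regex tokenizer, the add-one smoothing, and the toy rows, which are made up and stand in for the real (text string, label) pairs.

```python
import math
import re
from collections import Counter


def word_label_log_odds(rows, smoothing=1.0):
    """Score each word by how much more likely it is to appear in label-1 texts.

    rows: iterable of (text, label) pairs, label being 1 or 0.
    Returns {word: smoothed log-odds ratio}; positive leans toward label 1.
    """
    doc_counts = {0: Counter(), 1: Counter()}  # number of texts containing each word
    n_docs = {0: 0, 1: 0}                      # number of texts per label
    for text, label in rows:
        n_docs[label] += 1
        # Count a word once per text (presence, not frequency), so long
        # jokes do not dominate the counts.
        words = set(re.findall(r"[a-z']+", text.lower()))
        doc_counts[label].update(words)

    scores = {}
    for word in set(doc_counts[0]) | set(doc_counts[1]):
        # Smoothed share of label-1 and label-0 texts containing the word.
        p1 = (doc_counts[1][word] + smoothing) / (n_docs[1] + 2 * smoothing)
        p0 = (doc_counts[0][word] + smoothing) / (n_docs[0] + 2 * smoothing)
        scores[word] = math.log(p1 / (1 - p1)) - math.log(p0 / (1 - p0))
    return scores


# Hypothetical toy data standing in for the real (text string, label) columns.
rows = [
    ("the chicken crossed the road", 1),
    ("chicken walks into a bar", 1),
    ("the tax form is due", 0),
    ("tax season again", 0),
]
scores = word_label_log_odds(rows)

# Words sorted from most "funny-leaning" to most "unfunny-leaning".
for word, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word:10s} {score:+.2f}")
```

With real data, rare words will produce extreme scores from very few jokes, so it is probably worth filtering to words that appear in some minimum number of texts before reading anything into the ranking.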