Let me try to explain this simply. I have a bunch of data (taken from Reddit, actually), and the attributes are things like title, author, permalink, and a text-based class attribute (e.g. 'popular_post').
The problem I'm having is that Weka automatically assigns all the text-based attributes (e.g. title) to the nominal type. What I want is to:
1) Convert attributes like the title to string without changing the types of the other attributes. So far, selecting an attribute in the Explorer, choosing the "Nominal to String" filter, and applying it doesn't work unless I remove all the other attributes.
2) After turning the title into a string, turn that string into a word vector, so that every post has an entry for each word in the dictionary.
Example data I have:
post_id || title || author || subreddit || score || model_type
001 || cats rule! || joe1 || /r/awww || 9001 || popular
002 || guns rule! || bob1 || /r/guns || 0 || doa_post
What I'd like to end up with:
post_id || cats || guns || rule || author || subreddit || score || model_type
001 || 1 || 0 || 1 || joe1 || /r/awww || 9001 || popular
002 || 0 || 1 || 1 || bob1 || /r/guns || 0 || doa_post
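To make the target concrete, here is a rough sketch of the transformation I mean, written in plain Python rather than Weka. The tokenizer (lowercase, letters and digits only) and the names `tokenize`, `vocabulary`, and `to_word_vector` are my own assumptions for illustration, not anything Weka provides.

```python
import re

# Example rows copied from the post; the dict layout is only for illustration.
posts = [
    {"post_id": "001", "title": "cats rule!", "author": "joe1",
     "subreddit": "/r/awww", "score": 9001, "model_type": "popular"},
    {"post_id": "002", "title": "guns rule!", "author": "bob1",
     "subreddit": "/r/guns", "score": 0, "model_type": "doa_post"},
]


def tokenize(text):
    # Assumed tokenization: lowercase, keep runs of letters/digits only.
    return re.findall(r"[a-z0-9]+", text.lower())


# Dictionary = every distinct word seen in any title, sorted for stable columns.
vocabulary = sorted({word for post in posts for word in tokenize(post["title"])})


def to_word_vector(post):
    words = set(tokenize(post["title"]))
    row = {"post_id": post["post_id"]}
    # One 0/1 column per dictionary word (presence, not count).
    for word in vocabulary:
        row[word] = 1 if word in words else 0
    # Carry the remaining attributes through unchanged; the title is dropped.
    for key, value in post.items():
        if key not in ("post_id", "title"):
            row[key] = value
    return row


rows = [to_word_vector(post) for post in posts]
for row in rows:
    print(row)
```

Only the title is expanded into word columns; author, subreddit, score, and model_type stay exactly as they were, which is the behaviour I'm trying to get out of Weka.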
Any help would be appreciated!