Channel: Machine Learning

Is using PCA a good way to reduce dimensionality of text features?

I'm working on a classifier for Reddit posts, and I have the impression that non-text features such as subreddit, author, domain, or votes are being drowned out by the sheer number of features from the text (link title, and optionally comments and the linked page).

So I'm thinking of using some sort of dimensionality reduction on the text features before handing them to the classifier. Am I on the right path?
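
A minimal sketch of the idea, assuming NumPy and a made-up toy dataset: the titles, `votes`, and `is_self_post` values are hypothetical, and `bag_of_words` and `reduce_text_features` are illustrative helper names, not part of any library. It uses truncated SVD on raw term counts rather than PCA proper, because PCA mean-centers the data, which would turn a sparse term matrix into a dense one.

```python
import numpy as np

# Hypothetical toy data: post titles plus two numeric non-text features.
titles = [
    "new deep learning paper on image classification",
    "deep learning beats image classification benchmark",
    "cute cat picture of the day",
    "my cat sleeping in a funny picture",
]
votes = np.array([120.0, 95.0, 3000.0, 2500.0])
is_self_post = np.array([0.0, 0.0, 1.0, 1.0])


def bag_of_words(docs):
    """Return (count matrix, vocabulary) for whitespace-tokenised documents."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    index = {word: j for j, word in enumerate(vocab)}
    counts = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(docs):
        for word in doc.split():
            counts[i, index[word]] += 1.0
    return counts, vocab


def reduce_text_features(counts, n_components):
    """Project term counts onto their top singular directions (truncated SVD)."""
    # The matrix is deliberately not mean-centered, unlike in PCA.
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


counts, vocab = bag_of_words(titles)
text_reduced = reduce_text_features(counts, n_components=2)

# Put the non-text features on a comparable scale before concatenating.
non_text = np.column_stack([np.log1p(votes), is_self_post])
non_text = (non_text - non_text.mean(axis=0)) / non_text.std(axis=0)

# The classifier would then see a few text columns next to the non-text ones.
features = np.hstack([text_reduced, non_text])
print(counts.shape, "->", text_reduced.shape, "; combined:", features.shape)
```

With the text block cut down to a handful of columns, the non-text columns make up a much larger share of the feature vector. Whether that actually helps the classifier is something to check with cross-validation, and the number of components is a parameter to tune.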

EDIT: thanks everyone for the answers!

submitted by joelthelion
