Channel: Machine Learning

Homework advice - Clustering w/o K-Means?


Hello all, I'm working on a data mining project that ends with me clustering bag-of-words type data.

The majority of the project so far has been pre-processing (the data is an awesome web-crawled data set of tweets from Middle Eastern countries during the Arab Spring!). I have a dictionary of word counts, so I can assign some sort of weight to each word.
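One common way to turn raw word counts into weights is TF-IDF, which down-weights words that show up in most tweets. Here is a minimal sketch, assuming each tweet is already a list of tokens; the function name `tfidf_vectors` and the toy tweets are made up for illustration and are not from the actual data set:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of token lists. Returns one sparse {word: weight} dict per doc."""
    n_docs = len(docs)
    # Document frequency: in how many tweets each word appears at least once.
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        vec = {}
        for word, count in counts.items():
            # Smoothed IDF: rarer words get a larger multiplier.
            idf = math.log((1 + n_docs) / (1 + doc_freq[word])) + 1.0
            vec[word] = count * idf
        vectors.append(vec)
    return vectors

# Hypothetical toy tweets, already tokenized.
tweets = [
    ["tahrir", "square", "protest"],
    ["protest", "cairo", "tonight"],
    ["football", "match", "tonight"],
]
weights = tfidf_vectors(tweets)
```

Each tweet only stores the words it actually contains, so the representation stays sparse no matter how large the vocabulary gets.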

I'm getting to the point now where I need to actually cluster the data. The vectors are very sparse, since each feature is a word. (Maybe I should try something else for this? A kernel method to map it onto some subspace?) After all the work I've done preprocessing rough, incomplete Arabic/French/English mixtures of tweets, I feel like I've got to find SOME algorithm that's more complicated than the k-means that the professor spoon-fed us.

Any thoughts? If anyone knows of an algorithm that's particularly good on sparse data, I will upvote you and your family.
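For reference, one simple non-k-means baseline on sparse vectors is threshold-based single-linkage clustering with cosine similarity: link any two tweets that are similar enough, then take the connected components as clusters. This is only a sketch under assumptions: the vectors are `{word: weight}` dicts, and the `threshold` value, function names, and toy vectors are hypothetical and would need tuning on real data:

```python
import math

def cosine(a, b):
    """Cosine similarity of two sparse {word: weight} dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def single_link_clusters(vectors, threshold=0.3):
    """Merge any two documents whose cosine similarity is >= threshold."""
    parent = list(range(len(vectors)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(i)] = find(j)
    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    labels, seen = [], {}
    for i in range(len(vectors)):
        labels.append(seen.setdefault(find(i), len(seen)))
    return labels

# Hypothetical weighted vectors for three tweets.
docs = [
    {"protest": 2.0, "square": 1.0},
    {"protest": 1.0, "cairo": 1.0},
    {"football": 1.0, "match": 1.0},
]
labels = single_link_clusters(docs, threshold=0.3)
```

The pairwise loop is quadratic in the number of tweets, so this is only practical for a sample or after some form of candidate pruning. Unlike k-means, the number of clusters is not fixed in advance; it falls out of the threshold.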

submitted by groundshop
