Channel: Machine Learning

[OC] I made a basic Recommendation Engine for Reddit. Would love to hear your feedback about the implementation.


I made a recommendation engine for Reddit. It allows redditors to discover new subreddits based on which subreddits they are subscribed to.

You can check it out here

Steps I followed:

1. Find the top 1,400 subreddits by number of users (those subs that have more than 10k subscribers).
2. For each of those subs, collect comments until there is a userbase of 250k redditors.
3. For each redditor, parse all the comments and posts they have made from their profile page.
4. For each of the 1,400 subs, compute its similarity with every other sub using the [Jaccard Similarity coefficient](http://en.wikipedia.org/wiki/Jaccard_index).
5. Create slick website :)
6. When a user searches for a redditor, the app checks whether that redditor exists in the database; if not, it pulls the redditor's subs following the logic explained in step 3.
7. Once the list of all subs that redditor is subscribed to is retrieved, sum the similarities of those subs with every other sub in the database.
8. Return the top subs based on the total sum of similarities calculated in step 7.
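To make steps 4, 7 and 8 concrete, here is a minimal Python sketch of the idea, not the actual code behind the site. The function names (`jaccard`, `build_similarities`, `recommend`), the toy data, and the choice to skip subs the redditor is already active in are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; 0.0 when both sets are empty."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def build_similarities(sub_users: dict) -> dict:
    """Step 4: Jaccard similarity for every pair of subs.

    sub_users maps a subreddit name to the set of redditors seen in it.
    """
    sims = defaultdict(dict)
    for s1, s2 in combinations(sub_users, 2):
        score = jaccard(sub_users[s1], sub_users[s2])
        sims[s1][s2] = score
        sims[s2][s1] = score
    return sims


def recommend(user_subs: set, sims: dict, top_n: int = 3) -> list:
    """Steps 7-8: sum similarities over the redditor's subs, return the top ones."""
    totals = defaultdict(float)
    for sub in user_subs:
        for other, score in sims.get(sub, {}).items():
            # Assumption: do not recommend subs the redditor already uses.
            if other not in user_subs:
                totals[other] += score
    return sorted(totals, key=totals.get, reverse=True)[:top_n]


# Toy data, invented purely for illustration.
sub_users = {
    "python": {"ann", "bob", "cat"},
    "learnprogramming": {"ann", "bob", "dan"},
    "cooking": {"eve", "dan"},
    "baking": {"eve", "fay"},
}
sims = build_similarities(sub_users)
print(recommend({"python"}, sims, top_n=1))
```

With 1,400 subs this is roughly a million pairs, so precomputing the similarities once and storing them, as in step 4, keeps the per-search work down to a few lookups and sums.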

Some people have suggested that I could get better information about a user by using the Reddit API. That approach would give me every sub a user is subscribed to (instead of only those the user has commented or posted in), but it would also return the default subreddits the user has not unsubscribed from, which would add noise to the recommendation algorithm.

What do you guys think of the implementation?

submitted by manueslapera
