Want to test my recommendation engine, need help finding new datasets!

Hey everyone, I've written a recommendation engine that does some neat stuff (ILP relational OTF subgraph building) and some boring traditional stuff (w00t Matthews correlation, Bayesian statistics). Without getting too technical, I was wondering if anyone knows a good generic source of datasets to test it on. I already found the MovieLens one, which worked, but I'd like something that people disagree about a lot more.

The problem with the MovieLens dataset was that a very small number of movies had a very large percentage of the likes, and almost everyone agreed that the good movies were really, really good. So when I optimized for peak Matthews correlation (after intelligently breaking the graph apart into a training and a testing set), the recommendations that came out, while correct, made little sense to humans. For example, when I ignored metadata like genre, Shrek was very high up for people who liked The Matrix. That recommendation would surprise people because those movies are impossibly dissimilar, but a very high share of the people who liked The Matrix also liked Shrek (though not the other way around, obviously).
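
To make the one-way overlap concrete, here's a rough sketch on made-up data (the users and likes below are invented for illustration, not pulled from MovieLens, and the helper names are hypothetical, not my engine's actual code). It compares the conditional like rate in both directions and the Matthews correlation between the two "liked it" columns:

```python
from math import sqrt

# Hypothetical toy data: user -> set of liked movies (not real MovieLens data).
likes = {
    "u1": {"The Matrix", "Shrek"},
    "u2": {"The Matrix", "Shrek"},
    "u3": {"Shrek"},
    "u4": {"Shrek"},
    "u5": {"Shrek"},
    "u6": set(),
}

def conditional_like_rate(likes, a, b):
    """Fraction of users who liked `a` that also liked `b`."""
    fans_of_a = [items for items in likes.values() if a in items]
    if not fans_of_a:
        return 0.0
    return sum(1 for items in fans_of_a if b in items) / len(fans_of_a)

def matthews_corr(likes, a, b):
    """Matthews correlation (phi) between 'liked a' and 'liked b' over all users."""
    both = only_a = only_b = neither = 0
    for items in likes.values():
        in_a, in_b = a in items, b in items
        if in_a and in_b:
            both += 1
        elif in_a:
            only_a += 1
        elif in_b:
            only_b += 1
        else:
            neither += 1
    denom = sqrt(
        (both + only_a) * (only_b + neither) * (both + only_b) * (only_a + neither)
    )
    # Return 0.0 when a row or column of the 2x2 table is empty.
    return (both * neither - only_a * only_b) / denom if denom else 0.0

matrix_to_shrek = conditional_like_rate(likes, "The Matrix", "Shrek")
shrek_to_matrix = conditional_like_rate(likes, "Shrek", "The Matrix")
mcc = matthews_corr(likes, "The Matrix", "Shrek")
print(matrix_to_shrek, shrek_to_matrix, round(mcc, 3))
```

In this toy set every Matrix fan likes Shrek, but most Shrek fans don't like The Matrix, simply because Shrek is liked by almost everybody. That's the popularity skew I'd like a more divisive dataset to avoid.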

Anyways, I'm pretty amped about what I've built and I'm looking for more validation. Ideally the dataset would have both positive and negative signals, but that's not strictly required. I've got a couple of NLP modules in here too, so don't shy away from text-heavy content either.

Thanks in advance :D

submitted by arh_the_drones_come