If there's a better place to post something like this, please let me know.
I have a data set of high-dimensional categorical variables. It's for a card game where winning requires strategy, and instead of a strict ordering of hands there's a sort of "rock paper scissors" dynamic. Hands are semi-random: there's both strategic and preferential card selection going on. I want to cluster the hands to find different types (bear with me on that, I know it sounds a little silly).
I'm trying to figure out how to represent the data. My current idea is a vector of 52 0s and 1s per hand, where a 1 means that card is in the hand. If I run that through something like k-means, how can I display the clusters I come up with?
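To make the encoding concrete, here is a minimal sketch of what I have in mind, in base R. The card labels, the `encode_hand` helper, and the three five-card hands are made up for illustration; my real data has its own labels and hand sizes.

```r
# Hypothetical card labels: 13 ranks x 4 suits = 52 columns.
ranks <- c("A", 2:10, "J", "Q", "K")
suits <- c("S", "H", "D", "C")
deck  <- as.vector(outer(ranks, suits, paste0))  # "AS", "2S", ..., "KC"

# Turn one hand (a character vector of card labels) into a 0/1 vector
# with one position per card in the deck.
encode_hand <- function(hand, deck) {
  as.integer(deck %in% hand)
}

# Toy data: three made-up hands of five cards each.
hands <- list(
  c("AS", "KS", "QS", "JS", "10S"),
  c("2H", "2D", "7C", "9S", "KC"),
  c("AS", "AH", "3D", "4D", "5D")
)

# One row per hand, one column per card.
X <- t(sapply(hands, encode_hand, deck = deck))
colnames(X) <- deck
```

Each row of `X` is one hand, and `rowSums(X)` gives the hand size, which is a quick sanity check on the encoding.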
My two biggest problems are the size of the data (millions of rows, though I can sample) and not knowing how to show clusters visually at such high dimensionality. If I can get the visualization right, I expect to see some cards show up in several clusters.
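One display I can think of, sketched below in base R, is to skip plotting the hands themselves and plot the cluster centres instead. With 0/1 data, each k-means centre is just the share of hands in that cluster holding each card, so a clusters-by-cards heatmap would show a card that is common in several clusters. The data here is simulated (2000 random five-card hands), and the sample size of 500 and `centers = 4` are arbitrary placeholders, not values from my real data.

```r
set.seed(1)  # reproducible toy data and k-means starts

# Simulated stand-in for the real table: 2000 hands of 5 cards out of 52.
n_hands <- 2000
X <- matrix(0, nrow = n_hands, ncol = 52,
            dimnames = list(NULL, paste0("card", 1:52)))
for (i in seq_len(n_hands)) X[i, sample(52, 5)] <- 1

# Cluster a random sample of rows rather than the full table.
idx <- sample(nrow(X), 500)
fit <- kmeans(X[idx, ], centers = 4, nstart = 5)

# For 0/1 data, each centre is the fraction of hands in that cluster
# holding each card, so this 4 x 52 matrix is already a summary.
centers <- fit$centers

# The five most common cards in each cluster (one column per cluster).
top_cards <- apply(centers, 1,
                   function(p) names(sort(p, decreasing = TRUE))[1:5])

# Heatmap with cards on the x axis and clusters on the y axis.
# Only drawn in an interactive session.
if (interactive()) {
  image(x = 1:52, y = 1:4, z = t(centers),
        xlab = "card", ylab = "cluster")
}
```

On purely random hands like these the centres should look flat; the hope is that real hands give rows with clear peaks.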
I'm working with this in R, if it matters or if there's a good package to help. Also, if anything I posted was unclear, please ask. Thanks!