Has this been tried, does it have a name? (random projections and database hash)

I'm trying to find out whether this has been done before or has a name. You take data vectors and compute the dot product of each one with a set of n random vectors (the same random vectors are used for every data vector), yielding a new vector of size n. (I believe this amounts to multiplying by a random matrix, or to a random two-layer net.)
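To make the projection step concrete, here is a minimal sketch in plain Python. The post does not say how the random vectors are drawn, so the Gaussian entries, the fixed seed, and the names `make_random_vectors` and `project` are all assumptions for illustration.

```python
import random

def make_random_vectors(dim, n, seed=0):
    """Draw n random vectors of length dim once, so they can be reused.

    Gaussian entries and a fixed seed are assumptions, not from the post.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]

def project(x, random_vectors):
    """Dot x with each random vector; the result has one entry per vector."""
    return [sum(xi * ri for xi, ri in zip(x, r)) for r in random_vectors]

# The same random vectors are applied to every data vector.
random_vectors = make_random_vectors(dim=4, n=3)
z = project([1.0, 2.0, 3.0, 4.0], random_vectors)
```

Stacking the random vectors as rows gives an n-by-dim matrix, so `project` is just a matrix-vector product written out by hand.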

In the next step, you digitize the resulting vectors m times, once for each of several bin counts (e.g. 2 bins, 10 bins, 20 bins). Then you hash each binned vector and save it to a key-value store, with the hash as the key and the data label as the value (the hashing step is simply there to reduce database size).
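A rough sketch of the binning and hashing step follows. Several details are assumptions rather than anything stated in the post: projected values are clipped to a fixed range (`LO`, `HI`), bins are equal-width, each binned vector is hashed as a whole with SHA-1, and an in-memory dict stands in for the key-value store.

```python
import hashlib
from collections import defaultdict

BIN_COUNTS = (2, 10, 20)  # the m binnings; these counts are the post's examples
LO, HI = -10.0, 10.0      # assumed fixed range for projected values

def digitize(z, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    width = (HI - LO) / bins
    out = []
    for v in z:
        idx = int((v - LO) / width)
        out.append(min(max(idx, 0), bins - 1))  # clip out-of-range values
    return out

def keys_for(z):
    """Return one hash key per bin count for the projected vector z."""
    keys = []
    for bins in BIN_COUNTS:
        # Prefix with the bin count so different binnings never collide.
        payload = f"{bins}:" + ",".join(map(str, digitize(z, bins)))
        keys.append(hashlib.sha1(payload.encode()).hexdigest())
    return keys

store = defaultdict(list)  # stand-in for a real key-value store

def save(z, label):
    """Record the label under every key derived from z."""
    for key in keys_for(z):
        store[key].append(label)

save([0.5, -3.2, 7.1], "cat")
```

Coarse binnings (2 bins) make distinct vectors share a key more often, while fine binnings (20 bins) only match near-duplicates, which is presumably why several bin counts are used together.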

Then, for the prediction task, you do the same random projection and binnings, but instead of saving to disk you query the store with the resulting hashes. You should get a number of labels back; you take the most common label as your prediction.
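The lookup-and-vote step might look something like the sketch below. It assumes the store maps each hash key to a list of labels and that the query's keys were computed exactly as they were at save time; `predict`, `query_keys`, and the toy keys `"h1"`, `"h2"`, `"h3"` are hypothetical names used only for illustration.

```python
from collections import Counter

def predict(query_keys, store):
    """Pool the labels stored under each query key and return the most common.

    Returns None when none of the keys are present in the store.
    """
    votes = Counter()
    for key in query_keys:
        votes.update(store.get(key, []))
    if not votes:
        return None
    return votes.most_common(1)[0][0]

# Toy store: hash key -> labels saved under that key.
store = {"h1": ["cat", "cat"], "h2": ["dog"], "h3": ["cat"]}
prediction = predict(["h1", "h2", "missing"], store)
```

A query whose keys all miss has no votes at all, so a real version would need some fallback for that case, for example a default label.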

submitted by marshallp
