
Trying to implement LSA in MATLAB. How can I build my incidence matrix faster?

I'm trying to use latent semantic analysis (LSA) to build a semantic space so that I can use an SVM to classify text based on semantic relatedness. I've tried a few open-source implementations of LSA and have met with failure after failure.

So now I'm trying to write my own function that produces an incidence matrix in MATLAB, so that I can then run singular value decomposition on it. For the life of me, I cannot figure out an efficient way to do this in MATLAB.

Basically, my program searches a cell array of terms that have already been encountered in the documents to find each term's index in the matrix. If a term is not found, it is added to the index and the matrix. However, searching the cell array for each term and finding its index is taking forever. I'm using:

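wherestring = strcmp(A{1}{j}, term);   % logical mask: which entries of term match the word
i = find(wherestring, 1, 'first');     % index of the first match (empty if not found)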

to look up the string (word) A{1}{j} in the term cell array.

Is there a faster way to do this? All I can think of is to implement some data structure that makes searching more efficient (a binary search tree or a hash table), but that doesn't seem easy to implement in MATLAB. Sigh.
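For what it's worth, MATLAB does ship a built-in hash map, containers.Map, so the hash table idea can be tried without writing one by hand. Below is a minimal sketch of that approach, not a tested drop-in for the code above. The docs cell array is a made-up toy corpus, and the names termIndex, terms, rows, and cols are all hypothetical. It assumes each document is already tokenized into a cell array of words.

```matlab
% Toy corpus (assumption): one cell per document, each holding its tokens.
docs = { {'cat','sat','mat'}, {'cat','ate','cat'}, {'dog','sat'} };

% Hash map from term string to its row number in the matrix.
termIndex = containers.Map('KeyType', 'char', 'ValueType', 'double');
terms = {};   % row number -> term string
rows  = [];   % row (term) index, one entry per token seen
cols  = [];   % column (document) index, one entry per token seen

for d = 1:numel(docs)
    for j = 1:numel(docs{d})
        word = docs{d}{j};
        if ~isKey(termIndex, word)
            % First time this term is seen: give it the next free row.
            terms{end+1} = word;              %#ok<AGROW>
            termIndex(word) = numel(terms);
        end
        rows(end+1) = termIndex(word);        %#ok<AGROW>
        cols(end+1) = d;                      %#ok<AGROW>
    end
end

% sparse() adds up repeated (row, col) pairs, so X holds term counts.
X = sparse(rows, cols, ones(size(rows)), numel(terms), numel(docs));

% Binary incidence matrix: 1 wherever a term occurs in a document.
B = double(X > 0);
```

Each lookup here is a hash lookup instead of a strcmp scan over every term seen so far, which is where the time was going. Collecting the index pairs first and calling sparse once at the end also avoids resizing the matrix whenever a new term turns up. From there, something like svds(X, k) should give the truncated SVD for a chosen k.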

submitted by mayonaise55
