I wrote my bachelor's thesis on distributional semantic models, more specifically a common weighted count-based model. I have heard a lot of hype about Mikolov's word2vec model and started reading up on it, but despite putting much effort into deciphering his papers on the topic, I still feel like I don't know how the model actually works. I believe it could be because I haven't studied deep learning models or neural word embeddings before, but I wanted to verify this with you first, and to ask for help picking out some topics as prerequisites to word2vec.
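For context, here is my current (possibly wrong) understanding of the core idea, as a toy sketch rather than Mikolov's actual C implementation: in the skip-gram model with negative sampling, each word gets an "input" vector and an "output" (context) vector, and training nudges the dot product of true (center, context) pairs up and of randomly sampled negative pairs down via a logistic loss. All names and hyperparameters below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
corpus = "the quick brown fox jumps over the lazy dog".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
dim, lr, window, negatives, epochs = 8, 0.05, 2, 3, 200

W_in = rng.normal(scale=0.1, size=(len(vocab), dim))   # input (word) vectors
W_out = rng.normal(scale=0.1, size=(len(vocab), dim))  # output (context) vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for _ in range(epochs):
    for pos, word in enumerate(corpus):
        center = idx[word]
        for off in range(-window, window + 1):
            ctx_pos = pos + off
            if off == 0 or ctx_pos < 0 or ctx_pos >= len(corpus):
                continue
            # one true context word (label 1) plus a few random negatives (label 0)
            targets = [idx[corpus[ctx_pos]]] + list(rng.integers(0, len(vocab), negatives))
            labels = [1.0] + [0.0] * negatives
            v = W_in[center]
            grad_v = np.zeros(dim)
            for t, label in zip(targets, labels):
                score = sigmoid(v @ W_out[t])
                g = (score - label) * lr          # logistic-loss gradient scale
                grad_v += g * W_out[t]
                W_out[t] -= g * v
            W_in[center] -= grad_v

def most_similar(word):
    # rank the vocabulary by cosine similarity to the given word's input vector
    v = W_in[idx[word]]
    sims = W_in @ v / (np.linalg.norm(W_in, axis=1) * np.linalg.norm(v))
    return [vocab[i] for i in np.argsort(-sims)]
```

Is this roughly the right picture, or am I missing something fundamental about how the papers set up the objective?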
Thanks