
Data Science at Home

Episode 64: Get the best shot at NLP sentiment analysis

The rapid diffusion of social media such as Facebook and Twitter, together with the massive use of forums such as Reddit and Quora, produces an impressive amount of text data every day.

One specific activity that many business owners have been contemplating over the last five years is identifying the social sentiment of their brand by analysing the conversations of their users.

In this episode I explain how to get the best shot at classifying sentences with deep learning and word embeddings.

 

 

Additional material

Schematic representation of how to learn a word embedding matrix E by training a neural network that, given the previous M words, predicts the next word in a sentence. 
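The idea in the caption above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the code from the episode or the linked gist: the toy corpus, the function names (build_vocab, make_examples, train_embeddings), the hyperparameters, and the choice to skip a hidden layer and feed the concatenated embeddings straight into a softmax are all assumptions made for brevity. Each row of E is the embedding of one vocabulary word, and E is learned as a side effect of predicting the next word from the previous M words.

```python
import math
import random

def build_vocab(sentences):
    """Map every distinct word to an integer id (hypothetical helper)."""
    words = sorted({w for s in sentences for w in s.split()})
    return {w: i for i, w in enumerate(words)}

def make_examples(sentences, vocab, M):
    """Build (previous M word ids, next word id) training pairs."""
    examples = []
    for s in sentences:
        ids = [vocab[w] for w in s.split()]
        for t in range(M, len(ids)):
            examples.append((ids[t - M:t], ids[t]))
    return examples

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_embeddings(sentences, M=2, dim=4, epochs=200, lr=0.1, seed=0):
    """Learn an embedding matrix E with plain SGD on next-word prediction."""
    rng = random.Random(seed)
    vocab = build_vocab(sentences)
    V = len(vocab)
    # E: one dim-sized row per word; W, b: softmax layer over the vocabulary.
    E = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(V)]
    W = [[rng.uniform(-0.1, 0.1) for _ in range(M * dim)] for _ in range(V)]
    b = [0.0] * V
    examples = make_examples(sentences, vocab, M)
    losses = []
    for _ in range(epochs):
        total = 0.0
        for context, target in examples:
            # Look up and concatenate the embeddings of the previous M words.
            x = [v for w in context for v in E[w]]
            logits = [sum(W[k][j] * x[j] for j in range(M * dim)) + b[k]
                      for k in range(V)]
            p = softmax(logits)
            total += -math.log(p[target])  # cross-entropy loss
            # Gradient of the loss w.r.t. the logits: p - one_hot(target).
            dlogits = p[:]
            dlogits[target] -= 1.0
            # Gradient w.r.t. the concatenated input, computed before W changes.
            dx = [sum(W[k][j] * dlogits[k] for k in range(V))
                  for j in range(M * dim)]
            for k in range(V):
                for j in range(M * dim):
                    W[k][j] -= lr * dlogits[k] * x[j]
                b[k] -= lr * dlogits[k]
            # Push the input gradient back into the rows of E that were used.
            for i, w in enumerate(context):
                for d in range(dim):
                    E[w][d] -= lr * dx[i * dim + d]
        losses.append(total / len(examples))
    return vocab, E, losses

# Toy corpus, invented for this sketch.
SENTENCES = [
    "i love this movie",
    "i love this film",
    "i hate this movie",
    "i hate this film",
]

vocab, E, losses = train_embeddings(SENTENCES)
print("first epoch loss:", round(losses[0], 3))
print("last epoch loss:", round(losses[-1], 3))
print("embedding for 'movie':", [round(v, 3) for v in E[vocab["movie"]]])
```

On a corpus this small the learned vectors carry little meaning; the point is only to show where E sits in the network and how its rows are updated by the prediction error. A practical setup would use a much larger corpus and a library implementation such as Word2Vec [1].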

 

 

 

Word2Vec example source code

https://gist.github.com/rlangone/ded90673f65e932fd14ae53a26e89eee#file-word2vec_example-py

 

 

References

[1] Mikolov, T. et al., "Distributed Representations of Words and Phrases and their Compositionality", Advances in Neural Information Processing Systems 26, pages 3111-3119, 2013.

[2] The Best Embedding Method for Sentiment Classification, https://medium.com/@bramblexu/blog-md-34c5d082a8c5

[3] The state of sentiment analysis: word, sub-word and character embedding, https://amethix.com/state-of-sentiment-analysis-embedding/

 
