DEV Community

Matt Hamilton

Posted on Jun 17, 2020 • Edited on Oct 22, 2020

Analysing the Tone of Tweets with Keras

#python #machinelearning #keras #sentiment

This is a write-up of a live coding session from my show "ML for Everyone", broadcast every Tuesday on the IBM Developer live streaming Twitch channel.

This session was an attempt to train a neural network to detect the sentiment of tweets. Specifically, I wanted it to be able to detect joyful tweets for my #gftwhackathon entry:

Joyful Tweets (Matt Hamilton, Jun 3 '20)

This is a follow-on from the previous session, in which I used an existing sentiment analysis service, IBM Watson Tone Analyzer, to detect sentiment. That service was quick to get going with, but it only allowed me to send one tweet at a time, which made the process slow and meant I kept hitting the service's rate limits. So this is the beginning of creating my own, simpler version of that service.

Live coding my DEV/GftW Hackathon entry (Matt Hamilton, Jun 2 '20)

Session recap

In this session I used IBM Watson Studio to analyse the content of around 800,000 tweets I downloaded from Twitter. Each tweet contained one of the words: joy, anger, angry, happy, sad.
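Keywords like these can be turned into training labels with a simple regular expression match. The sketch below is only an illustration of that idea: the patterns, the `label_tweet` name, and the choice to leave "sad" and ambiguous tweets unlabelled are assumptions, not the expressions used in the session.

```python
import re

# Hypothetical keyword patterns; the expressions used in the session may differ.
JOY_RE = re.compile(r"\b(joy|happy)\b", re.IGNORECASE)
ANGER_RE = re.compile(r"\b(anger|angry)\b", re.IGNORECASE)


def label_tweet(text):
    """Return 'joyful', 'angry', or None when the match is absent or ambiguous."""
    has_joy = bool(JOY_RE.search(text))
    has_anger = bool(ANGER_RE.search(text))
    if has_joy == has_anger:  # neither keyword group matched, or both did
        return None
    return "joyful" if has_joy else "angry"


tweets = [
    "So happy to see the sun today",
    "Angry about the train delays again",
    "Feeling sad this evening",
]
labels = [label_tweet(t) for t in tweets]
print(labels)
```

A match like this is crude: "I'm not happy" would still be labelled joyful, which is one reason to treat the resulting labels as noisy.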

The goal was to create and train a neural network using Keras, a high level Python API, to learn what a 'joyful' tweet might look like.

The basic steps of the process were:

Download a selection of tweets, about 800,000 in total, from Twitter's API.
Categorise those tweets as either 'joyful' or 'angry'. I used a fairly naive regular expression match for this.
Tokenise the tweets, using a tokeniser from the Keras preprocessing package that splits the text into words and lowercases them.
Download a set of pre-trained "word vectors" that represent each word found in tweets as a 100-dimensional vector.
Create a neural network consisting of two LSTM layers (well suited to learning word sequences) with dropout layers to reduce overfitting.
Load the word vectors from above into the embedding layer of the network.
Train the network on the processed tweets.
Evaluate the network's performance with a few real-world examples.
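The tokenising, word vector, and network steps above can be sketched roughly as follows. This is not the notebook's code. The helper names (`fit_word_index`, `encode`, `parse_vectors`, `build_embedding_matrix`, `build_model`) are hypothetical, the tokeniser is a small standard-library stand-in for the Keras one, the toy vectors are 3-dimensional rather than 100-dimensional, and the layer sizes, dropout rate, two-way softmax output, and optimiser are all assumptions.

```python
import re
from collections import Counter

WORD_RE = re.compile(r"[a-z0-9']+")


def fit_word_index(texts):
    """Lowercase and split texts, then number words by descending frequency (1-based)."""
    counts = Counter(w for t in texts for w in WORD_RE.findall(t.lower()))
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}


def encode(text, word_index, max_len):
    """Map known words to ids, keep the last max_len ids, and left-pad with 0."""
    ids = [word_index[w] for w in WORD_RE.findall(text.lower()) if w in word_index]
    ids = ids[-max_len:]
    return [0] * (max_len - len(ids)) + ids


def parse_vectors(lines):
    """Parse 'word v1 v2 ...' lines, the plain-text layout of GloVe-style files."""
    vectors = {}
    for line in lines:
        word, *values = line.split()
        vectors[word] = [float(v) for v in values]
    return vectors


def build_embedding_matrix(word_index, vectors, dim):
    """Row i holds the vector for word id i; padding and unknown words stay zero."""
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for word, i in word_index.items():
        if word in vectors:
            matrix[i] = list(vectors[word])
    return matrix


def build_model(embedding_matrix, max_len):
    """Two LSTM layers with dropout on top of a frozen, pre-trained embedding."""
    # Imported here so the helpers above work without TensorFlow installed.
    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    weights = np.asarray(embedding_matrix, dtype="float32")
    vocab_size, dim = weights.shape
    embedding = layers.Embedding(vocab_size, dim, trainable=False)
    model = keras.Sequential([
        keras.Input(shape=(max_len,), dtype="int32"),
        embedding,
        layers.LSTM(64, return_sequences=True),  # assumed size
        layers.Dropout(0.2),                     # assumed rate
        layers.LSTM(64),
        layers.Dropout(0.2),
        layers.Dense(2, activation="softmax"),   # assumed output: [joy, anger]
    ])
    embedding.set_weights([weights])  # load the pre-trained vectors
    model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model


texts = ["I love the world", "I hate the world"]
word_index = fit_word_index(texts)
vectors = parse_vectors(["love 0.1 0.2 0.3", "hate -0.1 -0.2 -0.3"])  # toy vectors
matrix = build_embedding_matrix(word_index, vectors, dim=3)
print(encode("I love ice cream", word_index, max_len=5))
```

Keeping the embedding layer frozen (`trainable=False`) means training only adjusts the LSTM and output layers, so the pre-trained vectors are left untouched. With real data, `build_model(matrix, max_len)` would then be trained with `model.fit` on the encoded tweets and one-hot labels.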

Python notebook

The full Python notebook for this session is in the show's GitHub repository:

IBMDeveloperUK / ML-For-Everyone

Resources, notebooks, assets for ML for Everyone Twitch stream

Conclusion

Well, it seemed to work. For the examples we tested, we got:

"I love the world": 53% joy; 47% anger
"I hate the world": 22% joy; 78% anger
"I'm not happy about riots": 45% joy; 55% anger
"I like ice cream": 63% joy; 37% anger

The next steps will be to take this trained model and deploy it as a service such that we can then query it from the Joyful Tweets application.
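The joy and anger percentages in the examples above come in pairs that sum to 100%, which is what a two-way softmax output would produce. As a small illustration of how raw scores could become such a line, here is a sketch; the `softmax` and `format_prediction` helpers and the label order are assumptions, not part of the notebook.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def format_prediction(text, probs, labels=("joy", "anger")):
    """Render probabilities in the same style as the results listed above."""
    parts = "; ".join(f"{p:.0%} {label}" for p, label in zip(probs, labels))
    return f'"{text}": {parts}'


print(format_prediction("I hate the world", [0.22, 0.78]))
print(softmax([0.0, 0.0]))
```

A deployed service could return the two probabilities directly and leave this kind of formatting to the client.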

I hope you enjoyed the video. If you want to catch the sessions live, I stream each week at 2pm UK time on the IBM Developer Twitch channel:

https://developer.ibm.com/livestream

Top comments (1)

merus2 • Aug 27 '20

Excellent, thank you.
I have been trying to understand how to bring my own embedding into a model for a while now.
This line of code made my day :-)!!
model.add(Embedding(vocab_size, 100, weights=[emdedding_matrix], input_length=max_len, trainable=False))
The "trainable = False" parameter was the cherry on the cake
