Introduction
Bitcoin is one of the hottest topics in finance and tech circles right now. It has everyone talking, from big-shot investors to regular folks. Keeping an eye on what people say about Bitcoin matters because its value can swing wildly. And guess where a lot of that chatter happens these days? Yep, X (Twitter). By using tools that can analyse tweets, we can get a sense of how people feel about Bitcoin and how that mood might shake up the market.
Thankfully, Python, the go-to language for loads of developers, has what we need for text sentiment analysis. With Tweepy, we can query X (Twitter) and pull tweets about Bitcoin. TextBlob then estimates the sentiment of each tweet's text. When it's time to clean up and organise the data, pandas and NumPy help out, and matplotlib turns the results into visualisations that make sentiment trends easy to see. Armed with these tools, developers can dig into social media data and get a read on what the public thinks about Bitcoin.
Our project starts with using Python and Tweepy to collect tweets about Bitcoin from X (Twitter). Once we have the data, we analyse the sentiment of each tweet using TextBlob. To present our findings, we use matplotlib to create visualisations that show how people feel about Bitcoin. Following this step-by-step approach gives us a picture of the sentiment around Bitcoin, and shows how Python can be used to study social media data and track the emotions tied to cryptocurrencies, especially Bitcoin.
Project Setup.
So, in this project, we'll be utilising a tool called Colaboratory or Colab. It's essentially like a user-friendly version of Jupyter Notebook provided as a service by Google. With Colab, you can easily write and run Python code directly in your browser. The cool thing is you don't need to set up anything, you get free access to GPUs, and sharing your work is super simple.
Step 1
Start by opening Google Colab.
Step 2
Open a new notebook
This is what the new notebook will look like
Step 3
First things first, let's start by importing the necessary libraries:
import tweepy
from textblob import TextBlob
import pandas as pd
import numpy as np
import re
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
Libraries used:
• Tweepy: Tweepy is a Python library that allows us to interact with the Twitter API. We will use Tweepy to gather tweets about Bitcoin.
• TextBlob: TextBlob is a Python library that provides a simple API for performing natural language processing tasks. We will use TextBlob to analyze the sentiment of tweets.
• Pandas: Pandas is a Python library for data analysis and manipulation. We will use Pandas to store and analyze the results of our sentiment analysis.
• Matplotlib: Matplotlib is a Python library for creating visualizations. We will use Matplotlib to visualize the results of our sentiment analysis.
• NumPy: NumPy is a library that helps you work with numbers and arrays in Python. We will use NumPy to perform numerical calculations and operations on the data we collect from Twitter.
• re: The re library helps you find and replace patterns in text. We will use re to clean the data by removing unnecessary characters and formatting.
• plt.style.use('fivethirtyeight'): This line gives the matplotlib plots a sleek look with muted colors and nice fonts. It's all about making your data visualizations neat and easy on the eyes.
Loading the Login File.
In this step, we load the login file from our local computer with the files.upload() function. Once you select the file, it is uploaded to the Colab runtime, and files.upload() returns a dictionary that maps each uploaded file name to its contents. We keep that dictionary in the uploaded variable.
Step 4
To do this, we use the files.upload() function from the google.colab library:
from google.colab import files
# Opens a file picker in the notebook and uploads the selected file
uploaded = files.upload()
Storing the Data.
Once the login file is uploaded, its contents need to be stored somewhere we can work with. Here we read the file into a pandas data frame.
Step 5
In this step, we create a variable named log that holds the contents of login.csv:
log = pd.read_csv('login.csv')
Getting the Twitter API Credentials.
To talk to the Twitter API, we need to give it some secret information. This information is stored in a file called login.csv. The login.csv file contains your consumer key, consumer secret, access token, and access token secret. These are all long, random-looking strings of characters that you can get from Twitter when you create a developer account.
Step 6
Let's get the Twitter API credentials from the login.csv file:
consumer_key = log["key"][0]
consumer_secret = log["key"][1]
access_token = log["key"][2]
access_token_secret = log["key"][3]
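Indexing rows by position only works if login.csv lists the four values in exactly this order. Here is a minimal sketch of a stricter loader using only the standard library. The helper name load_credentials and the sample values are made up for illustration, and it assumes a single key column as in the step above:

```python
import csv
import io

# Order in which the tutorial's login.csv is assumed to list its values.
FIELDS = ("consumer_key", "consumer_secret", "access_token", "access_token_secret")


def load_credentials(handle):
    """Read the `key` column and map its first four rows to named credentials."""
    rows = [row["key"].strip() for row in csv.DictReader(handle)]
    if len(rows) < len(FIELDS) or not all(rows[:len(FIELDS)]):
        raise ValueError("login.csv must contain four non-empty rows under 'key'")
    return dict(zip(FIELDS, rows))


# Stand-in for open('login.csv'); these values are placeholders, not real keys.
sample = io.StringIO("key\nAAA\nBBB\nCCC\nDDD\n")
credentials = load_credentials(sample)
print(credentials["access_token"])  # CCC
```

Naming the values this way makes a short or reordered file fail early instead of producing a confusing authentication error later.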
Creating the Authentication object.
The OAuthHandler object is a Python object that helps us talk to the Twitter API using our secret information from the previous step.
Step 7
To create an OAuthHandler object, we use the following code:
authenticate = tweepy.OAuthHandler(consumer_key, consumer_secret)
Setting the access token and access token secret.
The access token and access token secret are two more pieces of secret information from the login.csv file. We need to set them on the OAuthHandler object so that it can properly talk to the Twitter API.
Step 8
To do this, we use the following code:
authenticate.set_access_token(access_token, access_token_secret)
Creating the API object.
The API object is the object that we'll use to actually interact with the Twitter API.
Step 9
To create an API object, we use the following code:
api = tweepy.API(authenticate, wait_on_rate_limit=True)
Gathering tweets about Bitcoin and filtering out retweets.
Now we can finally start talking to the Twitter API. We'll use the API object to search for tweets that contain the hashtag #Bitcoin. We'll also make sure to exclude retweets, and we'll only search for tweets that are in English and that were posted after a certain date.
Step 10
In this step we'll use the following code:
search_term = '#Bitcoin -filter:retweets'
# Note: in Tweepy 4.x, api.search was renamed to api.search_tweets
tweets = tweepy.Cursor(api.search, q=search_term, lang='en', since='2018-11-01', tweet_mode='extended').items(2000)
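The search string is just the hashtag followed by search operators, so it can be assembled programmatically. Here is a small sketch. build_query is a hypothetical helper, and it assumes the date filter is expressed with the since: operator inside the query string rather than as a separate keyword argument:

```python
from datetime import date


def build_query(term, since=None, exclude_retweets=True):
    """Assemble a search query string from a term and optional operators."""
    parts = [term]
    if exclude_retweets:
        parts.append("-filter:retweets")
    if since is not None:
        # ISO format gives YYYY-MM-DD, the same form used in the step above.
        parts.append(f"since:{since.isoformat()}")
    return " ".join(parts)


search_term = build_query("#Bitcoin", since=date(2018, 11, 1))
print(search_term)  # #Bitcoin -filter:retweets since:2018-11-01
```

Building the string in one place makes it easier to try other hashtags or dates without retyping the operators.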
Storing the tweets in a variable and getting the full text.
We'll loop through the tweets that we found in the previous step and extract the full text of each tweet.
Step 11
Let's store all of the full tweets in a list. To do this, we use the following code:
all_tweets = [tweet.full_text for tweet in tweets]
Creating a Data Frame to store the tweets.
We'll create a data frame to store the full tweets in a structured format. This will make it easier to analyse the tweets later on.
Step 12
To create a data frame, we use the following code:
df = pd.DataFrame(all_tweets, columns=['Tweets'])
Cleaning the tweets.
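Before we can dive into analyzing the tweets, we need to do a little bit of cleanup. We're going to strip the # from Bitcoin hashtags so the word itself is kept, and remove all other hashtags, newlines, and website links. Tweets can carry other noise too, such as mentions, emojis, punctuation, and duplicates; the function in this step covers only hashtags, newlines, and links. Tweets that aren't in English were already filtered out by lang='en' in the search.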
Why is this step so important?
Well, cleaning the tweets helps us make sure that our sentiment analysis is accurate and reliable. If we don't clean the tweets, the analysis could be thrown off by all that extra noise.
Once the tweets have been cleaned, we can finally start analysing them for sentiment.
Step 13
In this step, let's clean the tweets:
def clean_tweet(tweet):
    # Keep the word but drop the '#' from '#bitcoin'
    tweet = re.sub(r'#bitcoin', 'bitcoin', tweet)
    # Same for the capitalised '#Bitcoin'
    tweet = re.sub(r'#Bitcoin', 'Bitcoin', tweet)
    # Remove any remaining hashtags made of letters or numbers
    tweet = re.sub(r'#[A-Za-z0-9]+', ' ', tweet)
    # Replace newlines with spaces
    tweet = re.sub(r'\n', ' ', tweet)
    # Remove hyperlinks
    tweet = re.sub(r'https?://\S+', ' ', tweet)
    return tweet

df['Cleaned_tweets'] = df['Tweets'].apply(clean_tweet)
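The cleaning function above handles hashtags, newlines, and links. Here is a sketch of how mentions and duplicates could be handled as well, using only the standard library. The helper names clean_extra and drop_duplicates and the sample tweets are made up for illustration:

```python
import re


def clean_extra(tweet):
    """Strip @mentions and links, then collapse repeated whitespace."""
    tweet = re.sub(r"@\w+", " ", tweet)          # @mentions
    tweet = re.sub(r"https?://\S+", " ", tweet)  # hyperlinks
    return re.sub(r"\s+", " ", tweet).strip()


def drop_duplicates(tweets):
    """Keep the first occurrence of each tweet, preserving order."""
    return list(dict.fromkeys(tweets))


raw = [
    "@alice Bitcoin is up again https://example.com/a",
    "@bob Bitcoin is up again https://example.com/b",
    "Not sure about Bitcoin\ntoday",
]
cleaned = drop_duplicates([clean_extra(t) for t in raw])
print(cleaned)  # ['Bitcoin is up again', 'Not sure about Bitcoin today']
```

On the data frame itself, df.drop_duplicates(subset='Cleaned_tweets') would be the pandas way to drop repeated rows.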
Step 14
Let's save and run the code we've written so far.
Getting the subjectivity and polarity.
Subjectivity indicates the degree of personal opinion in the text. High subjectivity means more opinionated content, while low subjectivity suggests factual information.
Polarity determines if sentiment is positive, negative, or neutral towards Bitcoin. Positive polarity shows favorability, negative indicates negativity, and neutral signifies lack of strong sentiment.
By analyzing subjectivity and polarity in Bitcoin sentiment analysis, we can gain a better understanding of public sentiment towards Bitcoin. This information can be used to track trends, assess opinions, and make informed decisions about Bitcoin.
Step 15
In this step, we want to get the subjectivity and polarity:
def get_subjectivity(tweet):
    # 0.0 is fully objective, 1.0 is fully subjective
    return TextBlob(tweet).sentiment.subjectivity

def get_polarity(tweet):
    # -1.0 is most negative, 1.0 is most positive
    return TextBlob(tweet).sentiment.polarity

df['Subjectivity'] = df['Cleaned_tweets'].apply(get_subjectivity)
df['Polarity'] = df['Cleaned_tweets'].apply(get_polarity)
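To build some intuition for what a polarity score is, here is a toy lexicon-based scorer. It is only an illustration of the general idea. The word scores are invented, the function name toy_polarity is hypothetical, and TextBlob's own analyzer is more involved (among other things, it uses a much larger lexicon):

```python
import re

# Made-up scores for illustration only; this is not TextBlob's lexicon.
TOY_LEXICON = {"great": 0.8, "good": 0.7, "bad": -0.7, "terrible": -1.0}


def toy_polarity(text):
    """Average the scores of known words; 0.0 when no word is in the lexicon."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


for text in ["Bitcoin is great", "Bitcoin had a terrible week", "Bitcoin closed at 40k"]:
    print(f"{toy_polarity(text):+.2f}  {text}")
```

A tweet with no scored words ends up at 0.0, which is the value the tutorial later labels as neutral.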
Step 16
Let's save and run our code.
Getting the Sentiment text.
Imagine you're at a party and you want to get a sense of the overall mood of the guests. You could ask each guest how they're feeling, and then you could categorise their responses as positive, negative, or neutral. This would give you a good idea of the overall sentiment of the party.
The same principle applies to sentiment text data. By analysing the sentiment of a large amount of text data, researchers can get a good understanding of the overall sentiment towards Bitcoin.
Sentiment text data is a powerful tool that can be used to gain insights into public opinion and market trends. It's a valuable resource for researchers who want to understand how people feel about Bitcoin and other topics.
Step 17
To get the sentiment text for this project, we use the following code:
def get_sentiment(score):
    if score < 0:
        return 'Negative'
    elif score == 0:
        return 'Neutral'
    else:
        return 'Positive'

df['Sentiment'] = df['Polarity'].apply(get_sentiment)
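With the rule above, only a polarity of exactly 0 counts as neutral, so a score like 0.01 is already labelled positive. One possible variation, sketched here, treats a small band around zero as neutral. The function name get_sentiment_banded and the default band of 0.05 are assumptions for illustration, not part of the tutorial's method:

```python
def get_sentiment_banded(score, neutral_band=0.05):
    """Label a polarity score, treating |score| <= neutral_band as Neutral."""
    if score < -neutral_band:
        return "Negative"
    if score > neutral_band:
        return "Positive"
    return "Neutral"


scores = [-0.6, -0.02, 0.0, 0.03, 0.4]
labels = [get_sentiment_banded(s) for s in scores]
print(labels)  # ['Negative', 'Neutral', 'Neutral', 'Neutral', 'Positive']
```

Setting neutral_band=0 reproduces the original three-way rule.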
Step 18
Let's save and run the code we've written so far
Creating a scatter plot to show the subjectivity and polarity.
Imagine you have a bunch of text data, like reviews of a movie or tweets about a certain topic. You're interested in understanding the overall sentiment and subjectivity of this data, so you decide to create a scatter plot to visualise the relationship between two aspects of the data:
• Subjectivity: How subjective or biased the text is, ranging from 0 (completely objective) to 1 (completely subjective).
• Polarity: The emotional sentiment of the text, ranging from -1 (negative) to 1 (positive).
By plotting these two metrics on a scatter plot, you can see how they relate to each other. For example, you might find that highly subjective texts tend to have more negative polarity, or that texts with neutral polarity tend to be more objective.
Step 19
Here, we'll create a scatter plot to show the subjectivity and polarity. To do this, we use the following code:
plt.figure(figsize=(8, 6))
# Plot all tweets in one call; each point is one tweet
plt.scatter(df['Polarity'], df['Subjectivity'], color='purple')
plt.title('Sentiment Analysis Scatter Plot')
plt.xlabel('Polarity')
plt.ylabel('Subjectivity (objective -> subjective)')
plt.show()
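A scatter plot shows the relationship by eye. To put a rough number on it, one option is the Pearson correlation coefficient. The sketch below uses plain Python, and the sample values are hypothetical stand-ins for the Polarity and Subjectivity columns:

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one series is constant
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values standing in for df['Polarity'] and df['Subjectivity'].
polarity = [-0.5, 0.0, 0.2, 0.5, 0.8]
subjectivity = [0.6, 0.0, 0.3, 0.5, 0.9]
print(round(pearson(polarity, subjectivity), 2))
```

With the data frame, df['Polarity'].corr(df['Subjectivity']) computes the same coefficient. Keep in mind that it only captures linear relationships, so a V-shaped cloud can still score near zero.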
Step 20
Let's save and run the code we've written so far.
Creating a bar chart to show the count of positive, neutral, and negative sentiments.
Finally, we're going to create a bar chart that shows us how many positive, neutral, and negative sentiments we have.
Each bar in the chart will be like a little tower, representing the count of each sentiment type. This way, we can easily see how many of each type we have and compare them to each other.
This visual representation will make it super easy to understand the different types of sentiments in our data and how they compare to each other. It's like having a snapshot of our sentiment analysis results, all in one place.
Step 21
In this final step we will create a bar chart to show the count of positive, neutral, and negative sentiments:
df['Sentiment'].value_counts().plot(kind='bar')
plt.title('Sentiment Analysis Bar Plot')
plt.xlabel('Sentiments')
plt.ylabel('Number of Tweets')
plt.show()
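value_counts() orders the bars by frequency, so their order can change from run to run. Here is a sketch of tallying the labels in a fixed order and adding percentage shares, using only the standard library. The helper name sentiment_shares and the sample labels are made up for illustration:

```python
from collections import Counter

ORDER = ("Positive", "Neutral", "Negative")


def sentiment_shares(labels):
    """Return (label, count, percent) tuples in a fixed order."""
    counts = Counter(labels)
    total = sum(counts[name] for name in ORDER)
    return [
        (name, counts[name], 100 * counts[name] / total if total else 0.0)
        for name in ORDER
    ]


# Hypothetical labels standing in for df['Sentiment'].
labels = ["Positive", "Neutral", "Positive", "Negative", "Positive"]
for name, count, percent in sentiment_shares(labels):
    print(f"{name:<8} {count:>3} {percent:5.1f}%")
```

For the chart itself, the same fixed order could be applied with df['Sentiment'].value_counts().reindex(['Positive', 'Neutral', 'Negative'], fill_value=0) before calling plot.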
Step 22
Let's save and run our written code so far.
Conclusion
Our Bitcoin sentiment analysis project was a lot of fun, and we learned a lot about how people feel about Bitcoin. By analysing a sample of tweets, we got a snapshot of people's feelings about Bitcoin, and repeating the analysis would show how those feelings change over time. This information could be really useful for investors, traders, or anyone else who is interested in the cryptocurrency market.
The tools and techniques we used in this project can be used to analyse other social media data. So, if you're interested in understanding how people feel about other topics, like politics or sports, you can use the same techniques that we used in this project.
As the cryptocurrency market continues to grow and evolve, it will be more important than ever to understand how people feel about different cryptocurrencies. By using sentiment analysis, we can gain a better understanding of the market and make more informed decisions about our investments.