Welcome to this beginner-friendly tutorial on sentiment analysis using Hugging Face's transformers library! Sentiment analysis is a Natural Language Processing (NLP) technique used to determine the emotional tone or attitude expressed in a piece of text.
In this tutorial, you'll learn how to use pre-trained machine learning models from Hugging Face to perform sentiment analysis on several text examples. We'll walk through the entire process, from installing the required packages to running the model and interpreting its output, all within a SingleStore Notebook environment, which works much like a Jupyter Notebook.
By the end of this tutorial, you'll know how to use the Hugging Face Transformers library to analyze the sentiment of text data.
What is Hugging Face🤗?
Hugging Face🤗 is a company and community specializing in Natural Language Processing (NLP) and artificial intelligence (AI). Founded in 2016, the company has made significant contributions to the field of NLP by democratizing access to state-of-the-art machine learning models and tools.
Hugging Face has a strong community focus. They provide a platform where researchers and developers can share their trained models, thereby fostering collaboration and accelerating progress in the field.
NLP stands for Natural Language Processing, which is a field of artificial intelligence that focuses on the interaction between computers and human language. Hugging Face is known for its contributions to NLP through its open-source libraries, pre-trained models, and community platforms.
Hugging Face🤗 Transformers as a Library:
Hugging Face's Transformers library is an open-source library for NLP and machine learning. It provides a wide variety of pre-trained models and architectures like BERT, GPT-2, T5, and many others. The library is designed to be highly modular and easy to use, allowing for the quick development of both research and production projects. It supports multiple languages and tasks like text classification, question-answering, text generation, translation, and more.
Prerequisites
Before you start with this tutorial, make sure you have the following prerequisites in place:
- SingleStore Notebook. This is the only prerequisite, and the tutorial is designed to be followed in it. If you don't have access to SingleStore Notebook yet, you can get it by signing up at SingleStore and then selecting the Notebook feature.
You will land on a SingleStore Notebook dashboard.
From here, we will use it as our Python playground to run our code.
Step 1: Install Required Packages
First, you'll need to install the transformers library from Hugging Face. You can do this using pip:
!pip install transformers
The code in this tutorial uses the PyTorch backend of the transformers library, so PyTorch needs to be installed as well.
You can install PyTorch by running the following command in your SingleStore Notebook:
!pip install torch
Restart the Kernel: After installing, you may need to restart the SingleStore Notebook kernel to ensure that the newly installed packages are recognized. You can usually do this by clicking on "Kernel" in the menu and then selecting "Restart Kernel".
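If you want to confirm that the kernel can see both packages after the restart, a quick check like the one below may help. It is only a sketch: the helper name installed is made up for this example, and it relies on nothing beyond Python's standard importlib module.

```python
import importlib.util

def installed(packages):
    """Map each top-level package name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}

# The two packages installed in Step 1.
status = installed(["transformers", "torch"])
for name, found in status.items():
    message = "found" if found else "missing, try reinstalling or restarting the kernel"
    print(f"{name}: {message}")
```

The check only looks the packages up without importing them, so it runs quickly even though both libraries are large.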
Step 2: Import Libraries
Import the necessary Python libraries.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
Step 3: Load Pre-trained Model and Tokenizer
Load a pre-trained model and its corresponding tokenizer. For this example, let's use the distilbert-base-uncased-finetuned-sst-2-english model for sentiment analysis.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
Step 4: Preprocess Text
Tokenize the text you want to analyze.
text = "I love programming!"
tokens = tokenizer(text, padding=True, truncation=True, return_tensors="pt")
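To see what the tokenizer actually produced, you can inspect the returned object. The sketch below reloads the same tokenizer so that it runs on its own; the variable names input_ids and pieces are just illustrative choices.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
tokens = tokenizer("I love programming!", padding=True, truncation=True, return_tensors="pt")

# input_ids holds one row of integer token IDs per input text.
input_ids = tokens["input_ids"]
print(input_ids.shape)

# Map the IDs back to readable word pieces, including the special tokens
# the tokenizer adds at the start and end of the sequence.
pieces = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
print(pieces)

# attention_mask marks real tokens with 1 and padding with 0.
print(tokens["attention_mask"])
```

With a single sentence there is nothing to pad, so the attention mask should contain only ones. Padding starts to matter once you pass several sentences of different lengths in one call.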
Step 5: Model Inference
Pass the tokenized text through the model.
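# Gradients are only needed for training, so turn them off for inference.
with torch.no_grad():
    outputs = model(**tokens)
    logits = outputs.logits
    # Softmax turns the raw scores into probabilities that sum to 1.
    probabilities = torch.softmax(logits, dim=1)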
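If the softmax step feels opaque, the plain-Python version below shows roughly what it does to a pair of logits. It is an illustration only: the logit values are made up rather than taken from the model, and torch.softmax does the same arithmetic on tensors.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum first keeps exp() from overflowing;
    # it does not change the resulting probabilities.
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits in the order [negative, positive]; not real model output.
example_logits = [-4.0, 4.0]
probs = softmax(example_logits)
print(probs)
```

The larger a logit is relative to the others, the closer its probability gets to 1, which is why a confident prediction shows one value near 1 and the other near 0.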
Step 6: Interpret Results
Interpret the model's output to get the sentiment.
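# argmax returns a one-element tensor; .item() converts it to a plain Python int.
label_id = torch.argmax(probabilities, dim=1).item()
# For this model, index 0 is negative and index 1 is positive.
labels = ['Negative', 'Positive']
label = labels[label_id]
print(f"The sentiment is: {label}")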
For the example text, this should print "The sentiment is: Positive". In general, the output is either "Positive" or "Negative", depending on the sentiment of the text.
Make sure you are executing your code in the SingleStore Notebook playground.
Now let's change the text we want to analyze from "I love programming!" to "I hate programming!" and rerun Steps 4 through 6. You should see a Negative result.
Let's analyze one more sentence, "SingleStore's Notebook feature is just mind blowing!", and see the response. It should come back Positive, as expected.
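Instead of editing the text and rerunning the cells for each sentence, you could wrap Steps 3 through 6 in a small function and score several sentences in one batch. The following is a minimal sketch under a few assumptions: the function name analyze and the constant MODEL_NAME are invented for this example, and the labels come from the model's own id2label configuration, so they appear in upper case (POSITIVE or NEGATIVE) rather than as the hand-written list from Step 6.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)

def analyze(texts):
    """Return a (text, label, confidence) tuple for each input string."""
    # padding=True pads shorter sentences so the batch forms one rectangular tensor.
    tokens = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**tokens).logits
    probabilities = torch.softmax(logits, dim=1)
    # torch.max along dim=1 gives the top probability and its class index per row.
    confidences, label_ids = torch.max(probabilities, dim=1)
    return [
        (text, model.config.id2label[label_id.item()], confidence.item())
        for text, label_id, confidence in zip(texts, label_ids, confidences)
    ]

results = analyze([
    "I love programming!",
    "I hate programming!",
    "SingleStore's Notebook feature is just mind blowing!",
])
for text, label, confidence in results:
    print(f"{label} ({confidence:.3f}): {text}")
```

Returning the confidence alongside the label makes it easier to spot borderline cases, where the top probability is only a little above 0.5.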
Congratulations on completing this beginner-friendly tutorial on sentiment analysis using Hugging Face's transformers library! By now, you should have a solid understanding of how to use pre-trained models to analyze the sentiment of text. You've learned how to tokenize text, run it through a model, and interpret the output—all within a SingleStore Notebook environment.
Top comments (1)
This information is very useful for anyone learning about Hugging Face! There are many amazing things you can do with their Datasets, Models, and Spaces.
In my opinion, the Hugging Face community is the bedrock of the platform. The community has attracted developers, researchers, and AI enthusiasts who share their knowledge, experiences, and resources to support each other in their ML journeys.
Within the Hugging Face community, users can participate in forums and discussions to exchange ideas, seek advice, and showcase their projects. You can also join their Discord server, which has tens of thousands of members who share knowledge and help each other. Whether you’re a beginner seeking guidance or an expert looking to contribute, the community welcomes people of all skill levels and backgrounds.
Along with Discord, Hugging Face also provides comprehensive support through its documentation, tutorials, and courses. Users can access guides, code examples, and step-by-step tutorials to help them get started with Hugging Face and master advanced NLP techniques. For instance, when exploring a specific task like Text Generation, you’ll find plenty of relevant information to guide you in its proper usage. This includes concise explanations of the task itself, accompanied by videos, demos, and use cases. It also lists the models and datasets available for that task.
If you are interested in learning more about this amazing library, I recommend reading this article from my partner Nicolas Azevedo, which provides some good examples of Hugging Face: scalablepath.com/machine-learning/...