<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mahmoud Sehsah</title>
    <description>The latest articles on DEV Community by Mahmoud Sehsah (@mahmoudrasmyfathy1).</description>
    <link>https://dev.to/mahmoudrasmyfathy1</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1261043%2F74f9a4d5-99aa-4791-86b5-977a49bcaaa9.jpeg</url>
      <title>DEV Community: Mahmoud Sehsah</title>
      <link>https://dev.to/mahmoudrasmyfathy1</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mahmoudrasmyfathy1"/>
    <language>en</language>
    <item>
      <title>Getting Started with Natural Language Toolkit (NLTK)</title>
      <dc:creator>Mahmoud Sehsah</dc:creator>
      <pubDate>Sat, 27 Jan 2024 22:22:19 +0000</pubDate>
      <link>https://dev.to/mahmoudrasmyfathy1/getting-started-with-natural-language-toolkit-nltk-3eok</link>
      <guid>https://dev.to/mahmoudrasmyfathy1/getting-started-with-natural-language-toolkit-nltk-3eok</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;NLTK (Natural Language Toolkit) is one of the most popular Python libraries for working with human language data (i.e., text). This tutorial will guide you through the installation process, basic concepts, and some key functionalities of NLTK.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://github.com/mahmoudrasmyfathy1/NLP-Tutorial/blob/main/getting-started-nltk.ipynb"&gt;Link for the Notebook&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  1. Installation
&lt;/h1&gt;

&lt;p&gt;First, you need to install NLTK, which you can do easily with pip. In a notebook cell, run the command below; in a terminal (Command Prompt, etc.), drop the leading exclamation mark:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;!pip install nltk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  2. Understanding the Role of nltk.download() in NLTK Setup
&lt;/h1&gt;

&lt;p&gt;NLTK ships its datasets and pretrained models separately from the library itself. Use nltk.download() to fetch these resources; called with no arguments, it opens an interactive downloader from which you can pick what you need.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import nltk
nltk.download()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  3. Tokenization
&lt;/h1&gt;

&lt;p&gt;Tokenization is the process of splitting a text into meaningful units, such as words or sentences.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from nltk.tokenize import word_tokenize, sent_tokenize

text = "Hello there! How are you? I hope you're learning a lot from this tutorial."

# Sentence Tokenization
sentences = sent_tokenize(text)
print(sentences)

# Word Tokenization
words = word_tokenize(text)
print(words)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3n2fcdultieimked6ls3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3n2fcdultieimked6ls3.png" alt="Image description" width="800" height="42"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  4. Part-of-Speech (POS) Tagging
&lt;/h1&gt;

&lt;p&gt;POS tagging means labeling words with their part of speech (noun, verb, adjective, etc.).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from nltk import pos_tag
from nltk.tokenize import word_tokenize

words = word_tokenize("I am learning NLP with NLTK")
pos_tags = pos_tag(words)
print(pos_tags)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fskpvotlg539wkwp2jw5h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fskpvotlg539wkwp2jw5h.png" alt="Image description" width="755" height="36"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  5. Stopwords
&lt;/h1&gt;

&lt;p&gt;Stopwords are common words that are usually removed from text because they carry little meaningful information.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize, sent_tokenize

words = word_tokenize("Hello there! How are you? I hope you're learning a lot from this tutorial.")
stop_words = set(stopwords.words('english'))
filtered_words = [word for word in words if word not in stop_words]
print(filtered_words)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5mlwzq8au79misi94qm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5mlwzq8au79misi94qm.png" alt="Image description" width="638" height="47"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  6. Stemming
&lt;/h1&gt;

&lt;p&gt;Stemming is a process of stripping suffixes from words to extract the base or root form, known as the 'stem'. For example, the stem of the words 'waiting', 'waited', and 'waits' is 'wait'.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize
ps = PorterStemmer()
sentence = "It's important to be waiting patiently when you're learning to code."
words = word_tokenize(sentence)
stemmed_words = [ps.stem(word) for word in words]
print(stemmed_words)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
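The example in the text ('waiting', 'waited', and 'waits' all stemming to 'wait') is easy to check directly with the same PorterStemmer:

```python
from nltk.stem import PorterStemmer

ps = PorterStemmer()
# Porter stemming strips the suffixes, leaving the common stem.
stems = [ps.stem(word) for word in ["waiting", "waited", "waits"]]
print(stems)  # ['wait', 'wait', 'wait']
```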



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fosu9aeem6e52yxtr823g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fosu9aeem6e52yxtr823g.png" alt="Image description" width="773" height="39"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  7. Lemmatization
&lt;/h1&gt;

&lt;p&gt;Lemmatization is the process of reducing a word to its base or dictionary form, known as the 'lemma'. Unlike stemming, lemmatization considers the context and converts the word to its meaningful base form. For instance, 'is', 'are', and 'am' would all be lemmatized to 'be'.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import nltk
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

nltk.download('punkt')
nltk.download('wordnet')  # pass download_dir='...' only if your NLTK data lives outside the default search path

lemmatizer = WordNetLemmatizer()
sentence = "The leaves on the ground were raked by the gardener, who was also planting bulbs for the coming spring."
words = word_tokenize(sentence)
lemmatized_words = [lemmatizer.lemmatize(word) for word in words]
print(lemmatized_words)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbty4dryzocb7me3nq3eu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbty4dryzocb7me3nq3eu.png" alt="Image description" width="800" height="71"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  8. Frequency Distribution
&lt;/h1&gt;

&lt;p&gt;This is used to find the frequency of each vocabulary item in the text.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from nltk.probability import FreqDist
from nltk.tokenize import word_tokenize

words = word_tokenize("I need to write a very, very simple sentence")
fdist = FreqDist(words)
print(fdist.most_common(1))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
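FreqDist is a subclass of Python's Counter, so besides most_common() you can index it like a dictionary to get the count of a single token. A short sketch on a pre-split token list (no downloads required):

```python
from nltk.probability import FreqDist

# Splitting on whitespace here keeps the example free of tokenizer downloads.
tokens = "I need to write a very , very simple sentence".split()
fdist = FreqDist(tokens)

print(fdist["very"])         # 2
print(fdist.most_common(2))  # [('very', 2), ('I', 1)]
```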

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr0peeuozxgjtahmy84km.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr0peeuozxgjtahmy84km.png" alt="Image description" width="139" height="40"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h1&gt;
  
  
  9. Named Entity Recognition (NER)
&lt;/h1&gt;

&lt;p&gt;NER is used to identify entities like names, locations, dates, etc., in the text.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import nltk
from nltk.tokenize import word_tokenize
from nltk.tag import pos_tag
from nltk.chunk import ne_chunk

nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('maxent_ne_chunker')
nltk.download('words')

sentence = "I will travel to Spain"
# Tokenize the sentence
words = word_tokenize(sentence)
# Part-of-speech tagging
pos_tags = pos_tag(words)
# Named entity recognition
named_entities = ne_chunk(pos_tags)
# Print named entities
print(named_entities)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrflv3veizzjbiqv6sjj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrflv3veizzjbiqv6sjj.png" alt="Image description" width="487" height="206"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>nlp</category>
      <category>llm</category>
    </item>
    <item>
      <title>Deploying HuggingFace Chat UI with the Hugging Face Text Generation Inference Server</title>
      <dc:creator>Mahmoud Sehsah</dc:creator>
      <pubDate>Tue, 23 Jan 2024 02:01:34 +0000</pubDate>
      <link>https://dev.to/mahmoudrasmyfathy1/deploying-huggingface-chat-ui-with-the-hugging-face-text-generation-inference-server-n3h</link>
      <guid>https://dev.to/mahmoudrasmyfathy1/deploying-huggingface-chat-ui-with-the-hugging-face-text-generation-inference-server-n3h</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Before we dive into deploying the Hugging Chat UI, let's first explore the capabilities of the Hugging Face Text Generation Inference Server. We'll start with a practical walkthrough, demonstrating how to access and utilize its API endpoints effectively. This initial exploration is key to understanding the various configurations available for text generation and how they can enhance your AI interactions. &lt;/p&gt;

&lt;h2&gt;
  
  
  Start The Hugging Face Inference Server
&lt;/h2&gt;

&lt;p&gt;In this section, we focus on launching the Hugging Face Text Generation Inference Server, specifically configured with 8-bit quantization. This setting is pivotal for optimizing GPU memory utilization and ensuring efficient resource management. For the detailed setup instructions, please refer to &lt;a href="https://dev.to/mahmoudrasmyfathy1/deploy-mistral-llm-on-google-compute-engine-with-docker-gpu-support-and-hugging-face-inference-server-dbb"&gt;this link&lt;/a&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

export model=mistralai/Mistral-7B-v0.1
export volume=$PWD/data 


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:1.3 --quantize=bitsandbytes --model-id $model


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxrczdn6xl22v5w24gb8g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxrczdn6xl22v5w24gb8g.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Discover Hugging Face Inference Server endpoints
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Call the default generate Endpoint
&lt;/h3&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

curl --location 'http://127.0.0.1:8080/generate' \
--header 'Content-Type: application/json' \
--data '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}'


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
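The same request can be issued from Python. A minimal sketch using the requests library; the URL and payload mirror the curl call above and assume the inference server from the earlier section is running locally:

```python
import requests

TGI_URL = "http://127.0.0.1:8080/generate"  # the server started earlier in this post

def build_payload(prompt, **parameters):
    """Build the same JSON body the curl examples send."""
    return {"inputs": prompt, "parameters": parameters}

def generate(prompt, **parameters):
    """POST a prompt to the /generate endpoint and return the generated text."""
    response = requests.post(TGI_URL, json=build_payload(prompt, **parameters), timeout=60)
    response.raise_for_status()
    return response.json()["generated_text"]

# Usage (with the server running):
#   print(generate("What is Deep Learning?", max_new_tokens=20))
```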
&lt;h3&gt;
  
  
  Call the streaming endpoint
&lt;/h3&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

curl --location 'http://127.0.0.1:8080/generate_stream' \
--header 'Content-Type: application/json' \
--data '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}'


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Call the generate endpoint while activating sampling
&lt;/h3&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

curl --location 'http://127.0.0.1:8080/generate' \
--header 'Content-Type: application/json' \
--data '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":100, "do_sample":true, "top_k":50 }}'


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Call the generate endpoint while changing temperature
&lt;/h3&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

curl --location 'http://127.0.0.1:8080/generate' \
--header 'Content-Type: application/json' \
--data '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":50, "do_sample":true, "top_k":50, "temperature":0.2 }}'


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;For more Generation strategies please refer to this link : &lt;a href="https://huggingface.co/docs/transformers/generation_strategies" rel="noopener noreferrer"&gt;https://huggingface.co/docs/transformers/generation_strategies&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Monitoring with Health, Info, and Metrics API Endpoints
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Ensuring System Health
&lt;/h3&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

curl --location 'http://127.0.0.1:8080/health'


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Retrieving Server Information
&lt;/h3&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

curl --location 'http://127.0.0.1:8080/info'


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5dnpy20e434qqsc086vk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5dnpy20e434qqsc086vk.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Accessing Performance Metrics Endpoint
&lt;/h3&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

curl --location 'http://127.0.0.1:8080/metrics'


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frnqz471wb1o1g9x8ip6x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frnqz471wb1o1g9x8ip6x.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Install Hugging Face Chat UI
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Clone the Repository
&lt;/h3&gt;

&lt;p&gt;Initiate your project by cloning the Hugging Face Chat UI repository:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

git clone https://github.com/huggingface/chat-ui.git


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Configure the Environment
&lt;/h3&gt;

&lt;p&gt;After cloning the repository, you'll need to set up your environment by editing the .env file. This involves specifying the correct IP addresses for your MongoDB instance and the Hugging Face Text Generation Inference Server.&lt;/p&gt;
&lt;h4&gt;
  
  
  Editing MongoDB Configuration:
&lt;/h4&gt;

&lt;p&gt;Locate and edit the MONGODB_URL in the .env file to point to your MongoDB instance. Replace ${MONGO_DB_IP} with the actual IP address of your MongoDB server.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

MONGODB_URL=mongodb://${MONGO_DB_IP}:27017


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Setting Up Text Generation Inference Server Connection:&lt;/p&gt;

&lt;p&gt;In the same .env file, ensure that the Hugging Face Text Generation Inference Server is correctly configured. Below is a JSON configuration snippet that you'll need to adjust based on your setup; note that the MODELS object encapsulates your models' configurations:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

{
      "name": "mistralai/Mistral-7B-Instruct-v0.1-local",
      "displayName": "mistralai/Mistral-7B-Instruct-v0.1-name",
      "description": "Mistral 7B is a new Apache 2.0 model, released by Mistral AI that outperforms Llama2 13B in benchmarks.",
      "websiteUrl": "https://mistral.ai/news/announcing-mistral-7b/",
      "preprompt": "",
      "chatPromptTemplate" : "&amp;lt;s&amp;gt;{{#each messages}}{{#ifUser}}[INST] {{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}}{{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}&amp;lt;/s&amp;gt;{{/ifAssistant}}{{/each}}",
      "parameters": {
        "temperature": 0.1,
        "top_p": 0.95,
        "repetition_penalty": 1.2,
        "top_k": 50,
        "max_new_tokens": 1024,
        "stop": ["&amp;lt;/s&amp;gt;"]
      },
      "endpoints": [{
        "type" : "tgi",
        "url": "http://${TEXT_GENERATION_INFERENCE_SERVER}:80/"
        }],
      "promptExamples": [
      {
          "title": "Assist in a task",
          "prompt": "How do I make a delicious lemon cheesecake?"
        }
      ]
    }


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Build the Chat UI Docker image
&lt;/h3&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

DOCKER_BUILDKIT=1 docker build -t hugging-face-ui .


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Run MongoDB
&lt;/h3&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

docker run -d -p 27017:27017 --name mongo-chatui mongo:latest


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Run the Hugging-Face Chat UI
&lt;/h3&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

docker run -p 3000:3000 hugging-face-ui


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>llm</category>
      <category>mlops</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Deploy Mistral LLM on Google Compute Engine with Docker, GPU Support, and Hugging Face Inference Server</title>
      <dc:creator>Mahmoud Sehsah</dc:creator>
      <pubDate>Sun, 21 Jan 2024 20:21:46 +0000</pubDate>
      <link>https://dev.to/mahmoudrasmyfathy1/deploy-mistral-llm-on-google-compute-engine-with-docker-gpu-support-and-hugging-face-inference-server-dbb</link>
      <guid>https://dev.to/mahmoudrasmyfathy1/deploy-mistral-llm-on-google-compute-engine-with-docker-gpu-support-and-hugging-face-inference-server-dbb</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;This is a practical guide to setting up Large Language Models (LLMs) on Google Compute Engine using GPUs. It is designed to walk you through the process step by step, making it easy for you to take advantage of the powerful combination of Google's cloud infrastructure and NVIDIA's GPU technology.&lt;/p&gt;

&lt;h2&gt;
  
  
  Machine Specs for the tutorial
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Hardware Specifications:
&lt;/h3&gt;

&lt;h4&gt;
  
  
  GPU Information:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;GPU Type: Nvidia T4&lt;/li&gt;
&lt;li&gt;Number of GPUs: 2&lt;/li&gt;
&lt;li&gt;GPU Memory: 16 GB GDDR6 (per GPU)&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Google Compute Engine Machine Type:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Type: n1-highmem-4&lt;/li&gt;
&lt;li&gt;vCPUs: 4&lt;/li&gt;
&lt;li&gt;Cores: 2&lt;/li&gt;
&lt;li&gt;Memory: 26 GB&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Disk Information:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Disk Type: Balanced Persistent Disk&lt;/li&gt;
&lt;li&gt;Disk Size: 150 GB&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Software Specifications:
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Operating System:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Ubuntu Version: 20.04 LTS&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  CUDA version:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;CUDA version: 12.3&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  1. Setting Up Docker
&lt;/h2&gt;

&lt;p&gt;Follow these simple steps to get Docker up and running on your system:&lt;/p&gt;

&lt;h3&gt;
  
  
  1.1 Adding Docker's Official GPG Key
&lt;/h3&gt;

&lt;p&gt;add Docker’s official GPG key to your system. This step is crucial for validating the authenticity of the Docker packages you'll be installing&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt-get update
sudo apt-get install ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  1.2 Adding Docker Repository to Apt Sources
&lt;/h3&gt;

&lt;p&gt;Add Docker's repository to your system's Apt sources. This allows you to fetch Docker packages from their official repository:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;echo \
  "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  "$(. /etc/os-release &amp;amp;&amp;amp; echo "$VERSION_CODENAME")" stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list &amp;gt; /dev/null
sudo apt-get update
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  1.3 Installing Docker
&lt;/h3&gt;

&lt;p&gt;Install Docker using the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt-get --reinstall install docker-ce
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  1.4 Create Docker Group
&lt;/h3&gt;

&lt;p&gt;If not already present, add the 'docker' group to your system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo groupadd docker
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  1.5 Add Default User to Docker Group
&lt;/h3&gt;

&lt;p&gt;Add your default user to the 'docker' group so you can manage Docker as a non-root user; log out and back in (or run newgrp docker) for the group change to take effect:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo usermod -aG docker $USER
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  1.6 Check Docker Status
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;systemctl status docker
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  2. Install NVIDIA Container Toolkit
&lt;/h2&gt;

&lt;h3&gt;
  
  
  2.1 Add NVIDIA GPG Key and NVIDIA Container Toolkit Repository
&lt;/h3&gt;

&lt;p&gt;Start by adding the NVIDIA GPG key to ensure the authenticity of the software packages and add the NVIDIA Container Toolkit repository to your system's software sources:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg &amp;amp;&amp;amp; curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2.2 Enable Experimental Features (Optional)
&lt;/h3&gt;

&lt;p&gt;If you wish to use experimental features, uncomment the respective lines in the sources list:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2.3 Update Package Index and Install NVIDIA Toolkit
&lt;/h3&gt;

&lt;p&gt;Update your package index and install the NVIDIA Container Toolkit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  3. Configure Container Toolkit
&lt;/h2&gt;

&lt;h3&gt;
  
  
  3.1 Configure NVIDIA Container Toolkit
&lt;/h3&gt;

&lt;p&gt;Configure the NVIDIA Container Toolkit to work with Docker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo nvidia-ctk runtime configure --runtime=docker
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fijs3b8cbudlgzbr8nkym.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fijs3b8cbudlgzbr8nkym.png" alt="Image description" width="705" height="72"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  3.2 Restart Docker Service
&lt;/h3&gt;

&lt;p&gt;Apply the changes by restarting the Docker service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo systemctl restart docker
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  4. Prerequisites Before Installing CUDA Drivers
&lt;/h2&gt;

&lt;p&gt;Ensure your system meets the following prerequisites before proceeding with the CUDA driver installation. For detailed guidance, refer to the official NVIDIA CUDA installation guide (&lt;a href="https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#pre-installation-actions"&gt;https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#pre-installation-actions&lt;/a&gt;).&lt;/p&gt;

&lt;h3&gt;
  
  
  4.1 Verify CUDA-Capable GPU
&lt;/h3&gt;

&lt;p&gt;First, confirm that your system has an NVIDIA GPU installed; this command should return information about the NVIDIA graphics card if one is present.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;lspci | grep -i nvidia
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvfesb9cnkln1dauq7dy0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvfesb9cnkln1dauq7dy0.png" alt="Image description" width="508" height="55"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  4.2 Confirm Supported Linux Version
&lt;/h3&gt;

&lt;p&gt;Ensure your Linux distribution is supported by checking its version; this command will display the architecture of your system and details about your Linux distribution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;uname -m &amp;amp;&amp;amp; cat /etc/*release
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4.3 Check Kernel Headers and Development Packages
&lt;/h3&gt;

&lt;p&gt;Verify that your system has the appropriate kernel headers and development packages, which are essential for building the NVIDIA kernel module:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;uname -r
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  5. Installing NVIDIA Drivers
&lt;/h2&gt;

&lt;p&gt;Follow these steps to install NVIDIA drivers on your system. For detailed instructions, you can refer to the NVIDIA Tesla Installation Notes (&lt;a href="https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html"&gt;https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html&lt;/a&gt;).&lt;/p&gt;

&lt;h3&gt;
  
  
  5.1 Install Required Kernel Headers
&lt;/h3&gt;

&lt;p&gt;Start by installing the Linux kernel headers corresponding to your current kernel version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt-get install linux-headers-$(uname -r)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5.2 Add the NVIDIA CUDA Repository
&lt;/h3&gt;

&lt;p&gt;Identify your distribution's version and add the NVIDIA CUDA repository to your system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;distribution=$(. /etc/os-release;echo $ID$VERSION_ID | sed -e 's/\.//g')
wget https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5.3 Update and Install NVIDIA Drivers
&lt;/h3&gt;

&lt;p&gt;Finally, update your package lists and install the CUDA drivers. A system restart is required after the installation completes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt-get update
sudo apt-get -y install cuda-drivers
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  6. Post-Installation Steps for NVIDIA Driver
&lt;/h2&gt;

&lt;p&gt;After successfully installing the NVIDIA drivers, perform the following post-installation steps to ensure everything is set up correctly. For a comprehensive guide, consult the NVIDIA CUDA Installation Guide for Linux (&lt;a href="https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#post-installation-actions"&gt;https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#post-installation-actions&lt;/a&gt;).&lt;/p&gt;

&lt;h3&gt;
  
  
  6.1 Verify NVIDIA Persistence Daemon
&lt;/h3&gt;

&lt;p&gt;Check the status of the NVIDIA Persistence Daemon to ensure it's running correctly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;systemctl status nvidia-persistenced
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F82adao7n5ih3asgcok4q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F82adao7n5ih3asgcok4q.png" alt="Image description" width="736" height="211"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  6.2 Monitor GPU Utilization
&lt;/h3&gt;

&lt;p&gt;To confirm that your GPU is recognized and monitor its utilization, use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nvidia-smi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1svq0ged70yt0pb0qtn7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1svq0ged70yt0pb0qtn7.png" alt="Image description" width="650" height="426"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Define model configuration
&lt;/h2&gt;

&lt;p&gt;To set up the model configuration, define the following environment variables. The variable model is set to mistralai/Mistral-7B-v0.1, the model used in this tutorial. The variable volume is set to the present working directory ($PWD) followed by /data, the directory where the model weights will be stored:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export model=mistralai/Mistral-7B-v0.1
export volume=$PWD/data 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Run text-generation-inference using Docker
&lt;/h2&gt;

&lt;p&gt;To perform text generation inference, we will use the Hugging Face text generation inference server (for more details, see &lt;a href="https://huggingface.co/docs/text-generation-inference/index"&gt;https://huggingface.co/docs/text-generation-inference/index&lt;/a&gt;). Execute the Docker command below; its parameters are explained first:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;--gpus all: Enables GPU support for Docker containers.&lt;/li&gt;
&lt;li&gt;--shm-size 1g: Sets the shared memory size to 1 gigabyte.&lt;/li&gt;
&lt;li&gt;-p 8080:80: Maps port 8080 on the host to port 80 in the Docker container.&lt;/li&gt;
&lt;li&gt;-v $volume:/data: Mounts the local data volume specified by $volume inside the Docker container at the /data path.&lt;/li&gt;
&lt;li&gt;ghcr.io/huggingface/text-generation-inference:1.3: Specifies the Docker image for text-generation-inference with the version tag 1.3.&lt;/li&gt;
&lt;li&gt;--model-id $model: Passes the specified model identifier ($model) to the text-generation-inference application.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:1.3 --model-id $model
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Check GPU utilization
&lt;/h2&gt;

&lt;p&gt;Run the GPU monitoring command again to check memory utilization after the model weights have been loaded into GPU memory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nvidia-smi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6fh2mfw3kazms45meqli.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6fh2mfw3kazms45meqli.png" alt="Image description" width="646" height="492"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Test API endpoint
&lt;/h2&gt;

&lt;p&gt;To test the API endpoint, use the following curl command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl 127.0.0.1:8080/generate -X POST -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' -H 'Content-Type: application/json'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Figwgqbv5kp5b4tvobzmn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Figwgqbv5kp5b4tvobzmn.png" alt="Image description" width="800" height="35"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>llm</category>
      <category>cloud</category>
      <category>ai</category>
      <category>mlops</category>
    </item>
  </channel>
</rss>
