Word Frequency Counter using NLTK

#nltk #python #nlp #tutorial

NLTK (Natural Language Toolkit) is an open-source Python library for natural language processing (NLP).

We want to count how often each word occurs in the following text using NLTK.

text= "Morocco, officially the Kingdom of Morocco, is the westernmost country in the Maghreb region of North Africa. It overlooks the Mediterranean Sea to the north and the Atlantic Ocean to the west, and has land borders with Algeria to the east, and the disputed territory of Western Sahara to the south. "

To install NLTK:

pip install nltk

If you don't have Jupyter installed, type the following commands in your terminal.

pip install jupyterlab

pip install notebook

pip install voila

Run Jupyter with:

jupyter notebook

Import the following libraries.
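
A minimal set of imports for the steps in this post might look like the sketch below. It assumes the tokenizer comes from `nltk.tokenize` and the chart from `matplotlib.pyplot`. `wordpunct_tokenize` is picked here only because it runs without downloading extra NLTK data; another common choice, `word_tokenize`, needs tokenizer data fetched with `nltk.download` first.

```python
# Assumed imports for this tutorial (a sketch, not necessarily the original cell).
import nltk
from nltk.tokenize import wordpunct_tokenize  # regex-based tokenizer, no data download needed
import matplotlib.pyplot as plt               # used for the bar chart at the end

print(nltk.__version__)
```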

Assign the text to a variable.

The following function divides the text into words and punctuation marks, which you can see in the output.
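
As a rough sketch of the tokenizing step, the snippet below uses `wordpunct_tokenize`, which is an assumption on my part: it splits on word characters and punctuation with a regular expression and needs no extra data. If you prefer `word_tokenize`, it should slot in the same way once its tokenizer data has been downloaded.

```python
from nltk.tokenize import wordpunct_tokenize

# The sample text from the top of the post.
text = (
    "Morocco, officially the Kingdom of Morocco, is the westernmost country "
    "in the Maghreb region of North Africa. It overlooks the Mediterranean Sea "
    "to the north and the Atlantic Ocean to the west, and has land borders with "
    "Algeria to the east, and the disputed territory of Western Sahara to the south. "
)

# Split the text into word tokens and punctuation tokens.
tokens = wordpunct_tokenize(text)

# Show the first few tokens; commas and periods come out as separate items.
print(tokens[:8])
```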

The following code loops over the tokens and counts how many times each one occurs.
We convert every token to lowercase with lower() so that the same word written in different cases (for example, "North" and "north") is not counted as two different words.
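
One way to write that loop is with a plain dictionary, as sketched below. The short `tokens` list is hand-written so the snippet runs on its own; in the notebook it would be the tokenizer's output. The name `counts` is just an illustrative choice.

```python
# A short hand-written token list standing in for the tokenizer's output.
tokens = ["Morocco", ",", "officially", "the", "Kingdom", "of", "Morocco", ",", "is", "the"]

counts = {}
for token in tokens:
    word = token.lower()                     # "Morocco" and "morocco" become the same key
    counts[word] = counts.get(word, 0) + 1   # start at 0 the first time a word is seen

print(counts)
```

Note that punctuation tokens such as "," are counted too. If you only want words, one option is to skip tokens where `token.isalpha()` is false.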

Top 10 most frequent words:
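
A minimal sketch of picking the top 10 is to sort the dictionary items by count. The `counts` dictionary below is a hand-written sample standing in for the result of the counting step, so treat its contents as illustrative.

```python
# Hand-written sample of word counts (stand-in for the counting step's result).
counts = {
    "the": 10, ",": 4, "to": 4, "and": 3, "of": 3, "morocco": 2,
    ".": 2, "north": 2, "kingdom": 1, "is": 1, "sea": 1, "africa": 1,
}

# Sort (word, count) pairs by count, highest first, and keep the first 10.
top10 = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)[:10]

for word, freq in top10:
    print(word, freq)
```

If you would rather let NLTK do the counting, `nltk.FreqDist(words).most_common(10)` returns the same kind of list of (word, count) pairs.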

Now let's visualize it using Matplotlib.
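
A simple bar chart is one way to do this. The sketch below assumes you already have a list of (word, count) pairs; the `top_words` list here is a small hand-written sample, and the labels and figure size are arbitrary choices.

```python
import matplotlib.pyplot as plt

# Hand-written sample of (word, count) pairs standing in for the top-words list.
top_words = [("the", 10), ("to", 4), ("and", 3), ("of", 3), ("morocco", 2)]

words = [word for word, _ in top_words]
freqs = [freq for _, freq in top_words]

# One bar per word, with its height equal to the word's frequency.
fig, ax = plt.subplots(figsize=(8, 4))
bars = ax.bar(words, freqs)
ax.set_xlabel("Word")
ax.set_ylabel("Frequency")
ax.set_title("Most frequent words")
fig.tight_layout()
```

In a Jupyter notebook the figure is displayed when the cell finishes; in a plain script you would call `plt.show()` at the end.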
