Durga Pokharel

Day 91 Of 100DaysOfCode: Word Tokenization with NLTK

This is my 91st day of the #100daysofcode and #python learning journey. As for today's progress, I wrote one blog post and pushed it to GitHub. I also wrote some code on random topics.

As usual, I also kept learning from the DataCamp chapter on Natural Language Processing, this time on the topic of Word Tokenization with NLTK.

code

# Import necessary modules
from nltk.tokenize import sent_tokenize
from nltk.tokenize import word_tokenize

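# Split scene_one into sentences: sentences
# (scene_one is not defined in this snippet; it is assumed to be a text string supplied by the DataCamp exercise)
sentences = sent_tokenize(scene_one)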

# Use word_tokenize to tokenize the fourth sentence: tokenized_sent
tokenized_sent = word_tokenize(sentences[3])

# Make a set of unique tokens in the entire scene: unique_tokens
unique_tokens = set(word_tokenize(scene_one))

# Print the unique tokens result
print(unique_tokens)
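
The snippet above depends on NLTK and on a `scene_one` string that is not shown in this post. To try the same three steps (split into sentences, tokenize the fourth sentence, collect unique tokens) without either, here is a rough sketch using only Python's `re` module. It is a simplified stand-in, not how NLTK's `sent_tokenize` and `word_tokenize` actually work, and `sample_text`, `simple_sent_tokenize`, and `simple_word_tokenize` are hypothetical names made up for illustration.

```python
import re

# Hypothetical sample text standing in for scene_one (not from the original exercise).
sample_text = "Hello there! Is this a test? Yes, it is a test. Tokens are fun."


def simple_sent_tokenize(text):
    """Naive sentence split: break on whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def simple_word_tokenize(text):
    """Naive word split: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


# Split the text into sentences.
sentences = simple_sent_tokenize(sample_text)

# Tokenize the fourth sentence (index 3).
tokenized_sent = simple_word_tokenize(sentences[3])

# Build the set of unique tokens across the whole text.
unique_tokens = set(simple_word_tokenize(sample_text))

print(sentences)
print(tokenized_sent)
print(sorted(unique_tokens))
```

Because the result is a set, repeated tokens such as "test" and "." are counted once, while "Is" and "is" stay separate since the comparison is case-sensitive. Real tokenizers handle abbreviations, contractions, and quotes far better than these two regular expressions do.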

Day 91 Of #100daysofcode and #Python
Word Tokenization with NLTK From https://t.co/b2X089pkqc DataCamp #WomenWhoCode #CodeNewbie #100DaysOfCode #DEVCommunity pic.twitter.com/xBLjPrnUT6

— Durga Pokharel (@durgacodes) March 30, 2021
