Code_Jedi

Create a Wordcloud of News Headlines in Python!

Today, I'll be showing you a simple way to make a wordcloud of news headlines in Python!


If you haven't read this tutorial explaining how to scrape news headlines in Python, make sure you do.
In summary, here's the code for scraping news headlines in Python:

import requests
from bs4 import BeautifulSoup

url='https://www.bbc.com/news'
response = requests.get(url)

soup = BeautifulSoup(response.text, 'html.parser')
headlines = soup.find('body').find_all('h3')
for x in headlines:
    print(x.text.strip())


To create a wordcloud out of these news headlines, first import these two libraries alongside the ones needed to scrape our news source:

import requests
from bs4 import BeautifulSoup
from wordcloud import WordCloud  # add wordcloud
import matplotlib.pyplot as plt  # add pyplot from matplotlib

Next, replace

for x in headlines:
    print(x.text.strip())

with

h3text = ''
for x in headlines:
    h3text = h3text + ' ' + x.text.strip()
  • This first defines the "h3text" string, then appends every news headline to it, separated by spaces.

Before making the wordcloud, you can check the collected headlines with print(h3text).
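The same string can also be built with str.join, which avoids the leading space that the loop above leaves at the start of the string. Here's a minimal sketch, assuming the headline texts have already been pulled into a list of strings. Note that `combine_headlines` is a hypothetical helper name, not part of the original script, and the sample list is stand-in data:

```python
def combine_headlines(texts):
    """Join headline strings into one space-separated string.

    Hypothetical helper: strips each headline and skips empty ones.
    """
    cleaned = (t.strip() for t in texts)
    return ' '.join(t for t in cleaned if t)


# Stand-in data; in the script this would be [x.text for x in headlines]
sample = ['  Markets rally  ', '', 'Storm hits coast']
h3text = combine_headlines(sample)
print(h3text)
```

Skipping empty strings is a design choice here: it keeps blank h3 tags, if the page has any, from adding stray spaces.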


To make the wordcloud, add these lines of code to the end of your script:

wordcloud = WordCloud(width=500, height=500, margin=0).generate(h3text)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.margins(x=0, y=0)
plt.show()

Let me explain...

  • First, create a wordcloud (well, more like a box in this case) sized 500 by 500 pixels from the headline text.
  • Next, the wordcloud is drawn using "plt.imshow()" (interpolation='bilinear' smooths the image so the words are easier to read).
  • plt.axis("off") hides the axes and plt.margins(x=0, y=0) removes the padding, so our wordcloud isn't displayed as a graph.
  • Finally, the wordcloud is displayed using "plt.show()".

If you run your code, your wordcloud should look something like this:

[Image: wordcloud]
Of course, your wordcloud will probably be quite different since news headlines change all the time.
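
A wordcloud draws more frequent words larger, so a quick word count gives a rough preview of which words will dominate yours. This is only an approximation: WordCloud does its own tokenizing and stopword filtering, which this sketch doesn't try to reproduce. `top_words` is a hypothetical helper and the sample text is stand-in data:

```python
import re
from collections import Counter


def top_words(text, n=5):
    """Return the n most common lowercase words in text as (word, count) pairs."""
    # Treat runs of letters and apostrophes as words; everything else is a separator
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


# Stand-in text; in the script this would be h3text
sample = 'Storm warning issued as storm nears coast'
print(top_words(sample, 2))
```

If common words like "the" or "to" top the list, that's expected here; the wordcloud itself filters many of those out by default.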


That's it for this tutorial/mini-project!



If you're a beginner who likes discovering new things about Python, try my weekly Python newsletter.


Byeeeee👋
