DEV Community

Dmitriy Zub ☀️

Scrape Google Spell Check with Python

Contents: intro, imports, what will be scraped, process, code, links, outro.

Intro

This blog post continues the Google web scraping series. Here you'll see how to scrape Google Spell Check results with Python, along with an alternative API solution.

Imports

from bs4 import BeautifulSoup
import requests, lxml
from serpapi import GoogleSearch

What will be scraped

image

Process

Selecting a CSS selector that supports autocompletion in all languages

The process of using SerpApi, from the playground search query to the final output

Code

from bs4 import BeautifulSoup
import requests, lxml  # lxml is the parser backend passed to BeautifulSoup below

# Browser-like User-Agent so the request looks like it comes from a regular browser
headers = {
    'User-agent':
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}

params = {
  'q': 'fush ro dah',  # intentionally misspelled search query
  'hl': 'en',          # interface language
  'gl': 'us',          # country to search from
}

# requests builds the query string from params, so the base URL carries no '?q='
html = requests.get('https://www.google.com/search', headers=headers, params=params).text
soup = BeautifulSoup(html, 'lxml')

# Link to the corrected query and link to the original ("search instead for") query
corrected = soup.select_one('a.gL9Hy')
original = soup.select_one('a.spell_orig')

corrected_word = corrected.text
corrected_word_link = f"https://www.google.com{corrected['href']}"
search_instead_for = original.text
search_instead_for_link = f"https://www.google.com{original['href']}"
print(f'{corrected_word}\n{corrected_word_link}\nSearch instead: {search_instead_for}\n{search_instead_for_link}')

# Output:
'''
fus ro dah
https://www.google.com/search?hl=en&gl=us&q=fus+ro+dah&spell=1&sa=X&ved=2ahUKEwiIwb3ykMzxAhVWSzABHQtlDeMQkeECKAB6BAgBEDA
Search instead: fush ro dah
https://www.google.com/search?hl=en&gl=us&q=fush+ro+dah&nfpr=1&sa=X&ved=2ahUKEwiIwb3ykMzxAhVWSzABHQtlDeMQvgUoAXoECAEQMQ
'''
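
The selectors above only match when Google actually shows a spelling correction. When nothing matches, `select_one()` returns `None`, and reading `.text` from it raises an `AttributeError`. Here is a minimal sketch of a guard for that case. The helper name `link_info` and the stand-in `FakeTag` class are hypothetical and exist only to make the example runnable without a network request; in the scraper you would pass the result of `soup.select_one(...)` instead.

```python
from urllib.parse import urljoin

GOOGLE_BASE = "https://www.google.com"  # assumption: same base URL as in the scraper above


def link_info(tag, base=GOOGLE_BASE):
    """Return (text, absolute_url) for a link tag, or None if the tag is missing.

    `tag` is expected to behave like a BeautifulSoup Tag: it exposes `.text`
    and supports `tag["href"]`. `select_one()` returns None when nothing matches.
    """
    if tag is None:
        return None
    # urljoin turns a relative href such as "/search?q=..." into an absolute URL
    return tag.text, urljoin(base, tag["href"])


class FakeTag:
    """Hypothetical stand-in for a BeautifulSoup Tag, used only for this demo."""

    def __init__(self, text, href):
        self.text = text
        self._attrs = {"href": href}

    def __getitem__(self, key):
        return self._attrs[key]


if __name__ == "__main__":
    # No correction on the page: the helper returns None instead of raising
    print(link_info(None))
    # Correction present: text plus an absolute link
    print(link_info(FakeTag("fus ro dah", "/search?hl=en&gl=us&q=fus+ro+dah&spell=1")))
```

With this in place, the caller can check for `None` before printing, so a correctly spelled query does not crash the script.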

Using Google Spell Check API

SerpApi is a paid API with a free trial of 5,000 searches.

from serpapi import GoogleSearch
import os

params = {
  "api_key": os.environ["API_KEY"],
  "engine": "google",
  "q": "fus ro dish",
  "gl": "us",
  "hl": "en"
}

search = GoogleSearch(params)
results = search.get_dict()

print(results['search_information']['organic_results_state'])
print(results['search_information']['spelling_fix'])

# Output:
'''
Some results for exact spelling but showing fixed spelling
fus ro dah
'''
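
Indexing the response with square brackets raises a `KeyError` if a key is absent. Assuming the spelling keys can be missing when a query needs no correction, a defensive read with `dict.get()` might look like the sketch below. The function name `extract_spelling_fix` and the sample dictionary are hypothetical; only the `search_information`, `organic_results_state`, and `spelling_fix` keys come from the example above.

```python
def extract_spelling_fix(results):
    """Return (state, fix) from a results dict; either value may be None."""
    info = results.get("search_information", {})
    return info.get("organic_results_state"), info.get("spelling_fix")


# Hypothetical sample shaped like the output shown above
sample = {
    "search_information": {
        "organic_results_state": "Some results for exact spelling but showing fixed spelling",
        "spelling_fix": "fus ro dah",
    }
}

state, fix = extract_spelling_fix(sample)
print(state)
print(fix)

# A dict without these keys yields (None, None) instead of raising KeyError
print(extract_spelling_fix({}))
```

In the real script, `results` would be the dictionary returned by `search.get_dict()`.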

Links

Code in the online IDE • Google Spell Check API

Outro

If you have any questions, something isn't working correctly, or you'd like to share something else, feel free to drop a comment in the comment section or reach out via Twitter at @serp_api.

Yours,
Dimitry, and the rest of the SerpApi Team
