Web Scraping with Python
For my first web scraping project, I followed a tutorial on YouTube by a creator named Tinkernut.
Importing the libraries
from bs4 import BeautifulSoup
import requests
import csv
Here, we import the basic libraries for scraping the data and writing it into a CSV file.
url_to_scrape = requests.get('https://quotes.toscrape.com/')
soup = BeautifulSoup(url_to_scrape.text, 'html.parser')
quotes = soup.find_all("span", attrs={"class": "text"})
authors = soup.find_all("small", attrs={"class": "author"})
Here, we specify the URL we will be scraping the data from, as well as the tags and classes where the data we want is located. On this page, each quote sits in a span element with the class "text", and each author name sits in a small element with the class "author", so we select exactly those.
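To see how this selection works without hitting the live site, here is a small self-contained sketch. The HTML snippet below is a hypothetical sample I wrote to mimic the structure of quotes.toscrape.com, not content fetched from the page:

```python
from bs4 import BeautifulSoup

# Hypothetical snippet mimicking the page structure (not fetched live)
sample_html = """
<div class="quote">
  <span class="text">“Be yourself; everyone else is already taken.”</span>
  <small class="author">Oscar Wilde</small>
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")
quotes = soup.find_all("span", attrs={"class": "text"})
authors = soup.find_all("small", attrs={"class": "author"})

print(quotes[0].text)   # the quote text
print(authors[0].text)  # the author name
```

find_all returns a list of all matching elements, and .text strips away the tags, leaving only the inner text.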
file = open("quotes.csv", "w", newline="", encoding="utf-8")
writer = csv.writer(file)
The file is opened in write mode, and csv.writer returns a writer object that converts our data into delimited rows in the CSV file.
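To see what the writer object actually produces, here is a quick sketch that writes to an in-memory buffer instead of a file (the example row is made up for illustration):

```python
import csv
import io

# Write to an in-memory buffer so we can inspect the raw CSV output
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Quotes", "Author"])
writer.writerow(["Simplicity is the ultimate sophistication.", "Jane Doe"])

print(buffer.getvalue())
```

Each call to writerow takes a list and emits one comma-separated line, quoting fields automatically when they contain commas or quote characters.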
writer.writerow(["Quotes", "Author"])
for quote, author in zip(quotes, authors):
    print(quote.text + "." + author.text)
    writer.writerow([quote.text, author.text])
file.close()
This writes the headers "Quotes" and "Author" to the CSV file. It then iterates through the paired quote and author elements, printing each pair to the console and writing each quote and author to a new row in the CSV file, before finally closing the file.
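Putting the steps together, the whole script could also be structured like the sketch below, using a `with` block so the file is closed automatically even if an error occurs. The helper name `save_quotes` is my own choice, not something from the tutorial:

```python
import csv
from bs4 import BeautifulSoup

def save_quotes(html, out_path="quotes.csv"):
    """Parse quote/author pairs out of page HTML and write them to a CSV file."""
    soup = BeautifulSoup(html, "html.parser")
    quotes = soup.find_all("span", attrs={"class": "text"})
    authors = soup.find_all("small", attrs={"class": "author"})
    # newline="" stops the csv module from adding blank lines between rows on Windows
    with open(out_path, "w", newline="", encoding="utf-8") as file:
        writer = csv.writer(file)
        writer.writerow(["Quotes", "Author"])
        for quote, author in zip(quotes, authors):
            writer.writerow([quote.text, author.text])

# Usage (needs network access):
# import requests
# save_quotes(requests.get("https://quotes.toscrape.com/").text)
```

Separating the fetching from the parsing also makes the parsing logic easy to test against a small HTML snippet.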