
csdj92


Web crawling with Python and Selenium

Intro

For a recent job interview I was tasked with scraping a web table and converting the data into a CSV file. To get started, I first searched around for the proper tools for the job. I knew I would need to install Selenium with pip install -U selenium. Reading the docs, I found that 'Selenium requires a driver to interface with the chosen browser.' For this I chose the Chrome WebDriver (ChromeDriver). Once it is downloaded, make sure it's in your PATH, e.g., place it in /usr/bin or /usr/local/bin. If you do not do this, Chrome will fail to open.
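If you are not sure whether the driver ended up on your PATH, a quick check from Python can save some head-scratching. The sketch below is only an illustration: the helper name find_driver is my own, and it assumes the executable is called chromedriver. Recent Selenium releases (4.6 and later) also ship Selenium Manager, which can fetch a driver for you, so this check mostly matters on older setups.

```python
import shutil


def find_driver(name="chromedriver"):
    """Return the full path of the driver executable if it is on PATH, else None.

    `find_driver` is a hypothetical helper for this post; `name` is assumed
    to be the file name of the downloaded driver.
    """
    return shutil.which(name)


if __name__ == "__main__":
    path = find_driver()
    if path is None:
        print("chromedriver was not found on PATH")
    else:
        print(f"chromedriver found at {path}")
```

If the function returns None, move the driver into one of the directories listed in your PATH and try again.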

Learning to crawl

To get started, import webdriver from selenium, set a variable equal to webdriver.Chrome(), and then call get('url here') on that variable.

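from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# Start a Chrome session (uses the ChromeDriver set up in the intro)
browser = webdriver.Chrome()

# Load the page and check that we landed on the right site
browser.get('http://www.yahoo.com')
assert 'Yahoo' in browser.title

# find_element_by_name was removed in Selenium 4.3; use find_element with By.NAME
elem = browser.find_element(By.NAME, 'p')  # Find the search box
elem.send_keys('seleniumhq' + Keys.RETURN)

# Close the browser and end the session
browser.quit()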
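One thing to watch for in the script above: if the assert fails, browser.quit() never runs and the browser window stays open. A possible way around that is a small wrapper that always quits. This is just a sketch under my own assumptions: open_browser is a hypothetical helper name, and factory is any callable that returns an object with a quit() method, such as webdriver.Chrome.

```python
from contextlib import contextmanager


@contextmanager
def open_browser(factory):
    """Create a browser with factory() and always call quit() on exit.

    `open_browser` is a hypothetical helper for this post, not part of Selenium.
    """
    browser = factory()
    try:
        yield browser
    finally:
        # Runs even if the body of the with-block raises (e.g. a failed assert)
        browser.quit()


# Usage with Selenium (assumes selenium and a Chrome driver are installed):
#   from selenium import webdriver
#   with open_browser(webdriver.Chrome) as browser:
#       browser.get('http://www.yahoo.com')
#       assert 'Yahoo' in browser.title
```

The same effect can be had with a plain try/finally around the script; the wrapper only keeps the cleanup in one place.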

Congrats, you just learned to crawl! In my next blog post I will go over using pandas to turn a web table into a DataFrame and then a CSV file.

Make sure to check out the selenium docs for more information!


Top comments (1)

Crawlbase

Thanks! Nice blog post! Love how you break down the basics of web crawling with Python and Selenium. It's awesome to see practical tips like setting up Selenium and navigating through a webpage.
By the way, you can check out Crawlbase; it could be your next go-to tool.
