If you want to learn browser automation and scraping with Selenium, this simple project can be the starting point of your journey. In this tutorial I will explain how to scrape images from Google using Selenium.
Background
In my case, I want to scrape images from Google for a keyword, let's say "cat", and then store the image links in a CSV file.
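To make the goal concrete, here is a minimal sketch of the final "store links as CSV" step, using only the standard library. The helper name `save_links`, the file name `cat_links.csv`, and the sample URLs are assumptions made for illustration. They are placeholders, not real scraped results.

```python
import csv


def save_links(links, path):
    """Write one image URL per row under a single "url" header."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["url"])
        writer.writerows([link] for link in links)


# Placeholder URLs for illustration only, not real scraped results.
sample_links = [
    "https://example.com/cat1.jpg",
    "https://example.com/cat2.jpg",
]
save_links(sample_links, "cat_links.csv")
```

Once the scraper collects real links, the same function can be called with that list instead of `sample_links`.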
Getting Started
What is Selenium
Before we go any further, we need to know what Selenium is. Selenium is a tool for controlling web browsers from a program and automating browser tasks. It is mainly used as a testing framework across browsers and platforms. However, Selenium is also a very capable tool for general web automation, because we can program it to do what a human user can do in a browser (in this case, to programmatically collect images from Google).
Scraping with Selenium
So how exactly does Selenium work? Selenium provides mechanisms to locate elements on a web page and then mimics user behaviour on them, such as typing and clicking. Elements are usually located by an attribute such as id, name, or class, by tag name, by link text, or by a CSS selector or XPath expression.
You can inspect these attributes with the Developer Tools of your web browser.
And now let's start coding!
- Install the library required for the script
pip install selenium
- Import the libraries. For this tutorial I will be using Google Chrome, so:
from selenium.webdriver import Chrome
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
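- Open google.com, then search for the keyword "cat"
driver = Chrome()
driver.implicitly_wait(10)  # wait up to 10 seconds for elements to appear
driver.get("https://www.google.com/")
# The search box is located by name="q"; Google's markup may change over time
search_el = driver.find_element(By.NAME, "q")
search_el.send_keys("cat")
search_el.send_keys(Keys.ENTER)  # Keys.ENTER is a constant, not the string "Keys.ENTER"
# Open the Images tab; this link text assumes the English interface
image_el = driver.find_element(By.LINK_TEXT, "Images")
image_el.click()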
A Chrome window will open and show the cat images.
Congratulations! You have successfully opened a browser and navigated to the cat images automatically. Next, we will scroll the page and extract the image URLs. I will cover that in the next part.
GitHub: https://github.com/muchamadfaiz
Email: muchamadfaiz@gmail.com