Magesh Narayanan

Python Selenium Architecture

The Python Selenium architecture is the structure that lets Python code interact with a web browser through the Selenium WebDriver API.
It enables automation of web-based tasks such as testing, form submission, and scraping.

  1. Overview of Selenium

i. Selenium is a web automation framework that allows us to programmatically control a web browser.

ii. Python bindings for Selenium let us write scripts in Python to automate browser interactions such as:

• Clicking buttons
• Filling forms
• Extracting information
• Navigating to pages

  2. Architecture Components

a. Python Script:

i) This is where we write our test cases or automation logic in Python.
ii) It uses the Selenium WebDriver API (Python bindings) to send commands.

b. Selenium WebDriver (Client API)

• This is the core library that talks to the browser-specific driver (e.g. selenium.webdriver).
• It translates Python commands into W3C WebDriver protocol commands (older Selenium versions used the legacy JSON Wire Protocol).

c. Browser Driver (e.g. ChromeDriver, GeckoDriver)
• Acts as a bridge between Selenium and the web browser.
• Receives commands from WebDriver as JSON over HTTP.
• Executes those commands on the browser using the browser's native automation support.

Examples:
• ChromeDriver (Chrome)
• msedgedriver (Edge)
• geckodriver (Firefox)

d. Web Browser
• The browser in which the test case or automation runs.
• Performs the actual actions, such as clicking, scrolling, and rendering the content we extract.

  3. Flow Diagram: How it works

Python Script
|
Selenium WebDriver (Python Bindings)
|
JSON over HTTP
|
Browser Driver (e.g. ChromeDriver)
|
Web Browser (e.g. Chrome, Firefox)

Example:

• We write a Python command: driver.get("https://examples.com")
• WebDriver sends an HTTP request with a JSON body to the browser driver.
• The browser driver translates it into native browser commands.
• The browser opens the URL.
• The browser driver sends the response back to WebDriver, which returns it to our script.
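
To make the "JSON over HTTP" step concrete, here is a minimal sketch of the request a client would send for driver.get(). It assumes the W3C WebDriver "Navigate To" endpoint (POST /session/{session id}/url). The helper name build_navigate_request, the local address on port 9515, and the session id are illustrative assumptions, and nothing is actually sent over the network.

```python
import json


def build_navigate_request(base_url: str, session_id: str, target_url: str) -> dict:
    """Describe the HTTP request behind driver.get(target_url).

    Hypothetical helper for illustration only; the real Selenium
    bindings build and send this request internally.
    """
    return {
        "method": "POST",
        # W3C WebDriver "Navigate To" endpoint
        "url": f"{base_url}/session/{session_id}/url",
        "headers": {"Content-Type": "application/json; charset=utf-8"},
        "body": json.dumps({"url": target_url}),
    }


if __name__ == "__main__":
    request = build_navigate_request(
        "http://localhost:9515",  # assumed address of a local ChromeDriver
        "example-session-id",     # placeholder; the driver issues real ids
        "https://example.com",
    )
    print(request["method"], request["url"])
    print(request["body"])
```

The browser driver answers with a JSON response as well, which the bindings turn back into a Python return value or exception.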

  4. Python Selenium Key Classes and Methods
    o webdriver.Chrome() or webdriver.Firefox(): launch the browser
    o driver.get(url): navigate to a page
    o driver.find_element(By.ID, "element_id"): find an element
    o element.click(), element.send_keys("text"): interact with an element
    o driver.quit(): close the browser
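
Because driver.quit() is easy to forget when a script fails halfway, one possible pattern is to wrap the browser in a context manager. The sketch below is only an illustration: browser_session and print_heading are hypothetical names, and the example assumes the target page contains an h1 element.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(make_driver):
    """Start a browser via make_driver() and always quit it afterwards."""
    driver = make_driver()
    try:
        yield driver
    finally:
        driver.quit()  # runs even if the automation code raises


def print_heading(url="https://example.com"):
    """Open url in Chrome and print the text of its first <h1> element."""
    # Imported here so browser_session stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    with browser_session(webdriver.Chrome) as driver:
        driver.get(url)
        heading = driver.find_element(By.TAG_NAME, "h1")
        print(heading.text)
```

Calling print_heading() needs Chrome and Selenium to be installed; browser_session itself works with any object that has a quit() method.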

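  5. Supported Browsers and Drivers

| Browser | Driver                    | Notes                           |
|---------|---------------------------|---------------------------------|
| Chrome  | ChromeDriver              | Maintained by the Chromium team |
| Firefox | GeckoDriver               | Supports modern Firefox         |
| Edge    | EdgeDriver (msedgedriver) | Supports Chromium-based Edge    |
| Safari  | SafariDriver              | Built in on macOS               |
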
  6. Key Features

  • Supports headless execution
  • Can handle JavaScript-heavy websites
  • Can take screenshots and handle frames, alerts, and cookies
  • Works with test frameworks such as pytest and unittest
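
As a rough illustration of headless execution and screenshots, the sketch below starts Chrome without a visible window and saves an image of a page. The names HEADLESS_ARGS and capture_screenshot are made up for this example, and the window size is an arbitrary assumption. The --headless=new flag is understood by recent Chrome versions; older ones may need plain --headless.

```python
# Command-line switches passed to Chrome; the window size is an arbitrary choice.
HEADLESS_ARGS = ["--headless=new", "--window-size=1280,800"]


def capture_screenshot(url: str, path: str) -> bool:
    """Open url in headless Chrome and save a PNG screenshot to path."""
    # Imported here so the module can be loaded without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in HEADLESS_ARGS:
        options.add_argument(arg)

    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # save_screenshot returns True when the file was written
        return driver.save_screenshot(path)
    finally:
        driver.quit()
```

For example, capture_screenshot("https://example.com", "page.png") should leave a page.png file in the current directory, provided Chrome is available.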

  7. Disadvantages or Limitations

  • Requires compatible browser and driver versions.
  • Less efficient for data extraction than plain HTTP libraries such as requests, because it has to start and drive a full browser.
  • Automates only web applications running in a browser; it cannot drive desktop applications.

  8. Installation

pip install selenium

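Download the required browser driver and add its location to the PATH. From Selenium 4.6 onwards, the bundled Selenium Manager can download a matching driver automatically.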
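
A quick way to confirm the installation worked is to ask Python which version of the package it can see. This is a small sketch using only the standard library; installed_version is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("selenium")
    if version is None:
        print("selenium is not installed; run: pip install selenium")
    else:
        print(f"selenium {version} is installed")
```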

Summary
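
| Component          | Description                                 |
|--------------------|---------------------------------------------|
| Python Code        | Where the automation logic is written       |
| Selenium WebDriver | Python library that sends the commands      |
| Browser Driver     | Translator between WebDriver and browser    |
| Browser            | Executes the commands visually              |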

╔════════════════════════╗
║ Python Test Script     ║ ← We write code (e.g. a login test)
╚════════════════════════╝

╔════════════════════════╗
║ Selenium Python        ║ ← Converts Python commands into
║ WebDriver Bindings     ║   WebDriver API calls
╚════════════════════════╝

╔════════════════════════╗
║ Browser Driver         ║ ← Translates WebDriver calls into
║                        ║   native browser commands
╚════════════════════════╝

╔════════════════════════╗
║ Real Browser (Chrome)  ║ ← Performs actions like clicking, typing
╚════════════════════════╝

╔════════════════════════╗
║ Web Application (Site) ║ ← Target site to test/automate/scrape
╚════════════════════════╝

Let us walk through an example: opening a website and entering login credentials.

Step-by-step Python script:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import time

# Step 1: Launch the browser using ChromeDriver
# (Selenium 4.6+ finds a matching driver itself; older versions need chromedriver on the PATH)
driver = webdriver.Chrome()

# Step 2: Open a website
driver.get("https://example.com/login")

# Step 3: Find the username and password fields, then enter values
driver.find_element(By.ID, "username").send_keys("myuser")
driver.find_element(By.ID, "password").send_keys("mypassword")

# Step 4: Submit the form by pressing Enter in the password field
driver.find_element(By.ID, "password").send_keys(Keys.ENTER)

# Step 5: Wait and close the browser
time.sleep(5)
driver.quit()
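
The fixed time.sleep(5) in step 5 always waits the full five seconds, even if the page is ready sooner, and may still be too short on a slow connection. A common alternative is to poll for a condition with a timeout. The sketch below shows that idea in plain Python; wait_until is a hypothetical helper, not part of Selenium. Selenium ships its own version of this pattern as WebDriverWait in selenium.webdriver.support.ui.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once timeout seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

In the login script this could replace the sleep with something like wait_until(lambda: "login" not in driver.current_url), assuming the site redirects away from the login page after a successful sign-in.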
