DEV Community

theiosif

Automated Browsing - the fast way

NOTE: This post is not a Selenium tutorial; it's a workflow optimization I often find myself using. If you want to cover Selenium basics, you're better off watching [this] or [reading their docs]. For a Selenium IDE primer, go [here], but frankly the interface is pretty self-explanatory.

The goal

An easy way to automate web browsing interactively, one step at a time.

I'm demoing this with the login flow for a German flatshare app. This is for educational purposes; don't ban me, WG-Gesucht, I'm not DoS-ing anything.

What the manual browsing looks like

Step 1. Accept cookies
Step 2. Click "Mein Konto" ("my account") in the top right

Step 3. Input credentials in the pop-up and hit Enter.

The initial setup

  • Step 1:
    • Initialize a Selenium IDE project, start a new test-case, hit Record
  • Step 2:
    • Do the actions you want to automate on the target webpage, then stop the recording
  • Step 3:
    • Export the script as Pytest

Selenium output

Below is what Selenium IDE has churned out based on the recorded actions:

# Generated by Selenium IDE
import pytest
import time
import json
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

class TestSimpleLogin():
  def setup_method(self, method):
    self.driver = webdriver.Chrome()
    self.vars = {}

  def teardown_method(self, method):
    self.driver.quit()

  def test_simpleLogin(self):
    # Test name: simpleLogin
    # Step # | name | target | value
    # 1 | open | / | 
    self.driver.get("https://www.wg-gesucht.de/")
    # 2 | setWindowSize | 1536x824 | 
    self.driver.set_window_size(1536, 824)
    # 3 | click | id=cmpbntsavetxt | 
    self.driver.find_element(By.ID, "cmpbntsavetxt").click()
    # 4 | click | linkText=Mein Konto | 
    self.driver.find_element(By.LINK_TEXT, "Mein Konto").click()
    # 5 | click | id=login_email_username | 
    self.driver.find_element(By.ID, "login_email_username").click()
    # 6 | type | id=login_email_username | nice.try@fbi.com
    self.driver.find_element(By.ID, "login_email_username").send_keys("nice.try@fbi.com")
    # 7 | click | id=login_password | 
    self.driver.find_element(By.ID, "login_password").click()
    # 8 | type | id=login_password | dontLeaveYourPassInCode
    self.driver.find_element(By.ID, "login_password").send_keys("dontLeaveYourPassInCode")
    # 9 | sendKeys | id=login_password | ${KEY_ENTER}
    self.driver.find_element(By.ID, "login_password").send_keys(Keys.ENTER) 

Let's look at what's happening:

First of all, typical for Pytest, there's a class with a setup_method and a teardown_method.

We don't need all that for browser automation.

The slim-down

What I'm after is portability. That is, this login thingy should be importable and callable from other scripts.

Let's slim down that code from Selenium IDE a little, and keep credentials in a config.json file instead.

Here's what the config.json would look like on a Windows system:

{
    "driverpath": "C:\\Windows\\chromedriver.exe",
    "chromeprofilepath": "C:\\Users\\<REPLACEWITHYOURUSER>\\AppData\\Local\\Google\\Chrome\\User Data",
    "chromeprofiledir": "<REPLACEWITHYOURPROFILE>",
    "user": "nice.try@fbi.com",
    "pass": "rockyou.txt"
  }
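
A minimal sketch of loading this file, assuming it sits next to the script as config.json. The names load_config and REQUIRED_KEYS are hypothetical, and the key check is an addition for illustration rather than part of the original script:

```python
import json
from pathlib import Path

# Keys used by the config.json shown above.
REQUIRED_KEYS = ("driverpath", "chromeprofilepath", "chromeprofiledir", "user", "pass")


def load_config(path="config.json"):
    """Read the JSON config and fail early if a key is missing."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"config is missing: {', '.join(missing)}")
    return config
```

Failing at load time means a typo in the config surfaces before a browser window is ever opened.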

High-level structure

(py-pseudocode) The intended structure for our script would be:

import whatever_there_is_to_import

loadConfig()

def setup(AUTO=True):
    # ... initialize the webdriver object
    return driver

def login(driver):
    # ... perform login actions

if __name__ == "__main__":
    try:
        driver = setup()
        login(driver)
    except Exception as e:
        print(f"{'*'*80}\nOOPS:\n{e}\n{'*'*80}\n")
        IPython.embed()

The try ... except thing will be explained shortly. It's a very neat trick.

The setup() method

My setup method is written as follows:

def setup(AUTO=True):

  options = Options()
  # options.add_argument('ignore-certificate-errors') # enable if you use self-signed certs (NOT RECOMMENDED EXC. FOR DEVVING)

  if AUTO:
    # automagically keep the webdriver in sync with local chrome version
    from webdriver_manager.chrome import ChromeDriverManager
    driver  = webdriver.Chrome(service=ChromeService(ChromeDriverManager().install()), options=options)
  else:
    service = Service(executable_path=config['driverpath']) # if chromedriver not in PATH, then do: service = Service(executable_path='/path/to/chromedriver')
    options.add_argument(f'user-data-dir={config["chromeprofilepath"]}')
    options.add_argument(f'--profile-directory={config["chromeprofiledir"]}')
    driver = webdriver.Chrome(service=service, options=options)

  driver.set_window_size(1920, 1080)
  return driver

There are a few goodies in there worth sharing:

  • AUTO=True installs the webdriver for you (useful if you update Chrome but forget about the webdriver 😉).
  • With AUTO=False, the user-data-dir and --profile-directory arguments reuse the extensions/settings/cookies/history from your normal browsing profile, instead of the fresh profile Selenium spawns by default.
  • (commented out for security) -- if you're testing out stuff you're developing and have dummy certificates in place, the 'ignore-certificate-errors' option saves you some hassle.
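
To make the profile trick concrete: user-data-dir points at Chrome's "User Data" folder, and --profile-directory names one profile subfolder inside it (commonly "Default" or "Profile 1"). Here is a small sketch, assuming the config keys from config.json; profile_arguments is a hypothetical helper that checks the folder exists before the arguments are handed to Chrome:

```python
from pathlib import Path


def profile_arguments(config):
    """Return the two Chrome arguments that select an existing profile."""
    user_data = Path(config["chromeprofilepath"])
    profile = config["chromeprofiledir"]
    # A wrong folder name would otherwise go unnoticed until Chrome starts.
    if not (user_data / profile).is_dir():
        raise FileNotFoundError(f"no Chrome profile at {user_data / profile}")
    return [
        f"--user-data-dir={user_data}",
        f"--profile-directory={profile}",
    ]
```

Inside setup(), each returned string could then be passed to options.add_argument().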

The login logic, adapted from the test mockup

Now for the login function:

def login(driver):
    # 1 | open | / | 
    driver.get("https://www.wg-gesucht.de/")
    # 2 | setWindowSize | 1536x824 | 
    driver.set_window_size(1536, 824)
    # 3 | click | id=cmpbntsavetxt | 
    driver.find_element(By.ID, "cmpbntsavetxt").click()
    # 4 | click | linkText=Mein Konto | 
    driver.find_element(By.LINK_TEXT, "Mein Konto").click()
    # 5 | click | id=login_email_username | 
    driver.find_element(By.ID, "login_email_username").click()
    # 6 | type | id=login_email_username | nice.try@fbi.com
    driver.find_element(By.ID, "login_email_username").send_keys(config['user'])
    # 7 | click | id=login_password | 
    driver.find_element(By.ID, "login_password").click()
    # 8+9 | type | id=login_password | dontLeaveYourPassInCode + [ENTER]
    driver.find_element(By.ID, "login_password").send_keys(config['pass']+"\n")

What changed?

Aside from not using a class anymore, all that changes is that steps 6 and 8 now read their parameters from the config file. I've also eliminated step 9 (hitting ENTER) by appending "\n" to the password (much hax, such wow).

Cool, let's run it and see what happens. In the Chrome window, stuff looks "as expected" up until step 5: we get the pop-up asking for our credentials, but the console output says the following:

OOPS:
Message: element not interactable

The IPython.embed() method got called upon encountering the exception, which means we get to keep our global and local variables and essentially have an "interactive breakpoint" on our hands.
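
The same "interactive breakpoint" can be wrapped in a reusable function. This is only a sketch: run_or_debug and its shell parameter are hypothetical names, and the parameter exists so the behaviour can be exercised without opening a real shell.

```python
def run_or_debug(step, driver, shell=None):
    """Run step(driver); on any exception, print it and open a shell.

    Hypothetical helper. `shell` is an optional callable taking
    (driver, error); when omitted, IPython.embed() is used.
    """
    try:
        return step(driver)
    except Exception as error:
        print(f"{'*' * 80}\nOOPS:\n{error}\n{'*' * 80}\n")
        if shell is None:
            import IPython  # imported only when a shell is actually needed

            # embed() exposes this frame's locals, so `driver` and `error`
            # can be inspected and the failing calls retried by hand.
            IPython.embed()
        else:
            shell(driver, error)
        return None
```

With the functions from this post, the call would look like run_or_debug(login, setup()).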

Cool, let's paste in the code from step 5 and onwards, line by line, into the IPython interactive session.

This approach yields working results, so what's the problem?

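Disclaimer: I'm not a webdev, so this might be obvious to many already. "element not interactable" means the element was found in the DOM but couldn't be interacted with yet; here, the login pop-up presumably hadn't finished rendering when the script tried to click into it.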

The fix: Adding driver.implicitly_wait(1) to the setup() method before returning the driver object means that Selenium waits (at most) 1 second before throwing these kinds of exceptions.
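
A more targeted alternative is an explicit wait on the one element that isn't ready yet, using Selenium's WebDriverWait together with expected_conditions. The sketch below is not the fix used in this post; type_when_ready, login_fields and the 5-second default are hypothetical choices for illustration. The Selenium imports are done lazily, like the webdriver_manager import in setup().

```python
def type_when_ready(driver, element_id, text, timeout=5):
    """Wait until the element is visible and enabled, then click it and type.

    Hypothetical helper: an explicit-wait alternative to implicitly_wait().
    """
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # until() polls the condition and raises TimeoutException
    # if it is still not met after `timeout` seconds.
    field = WebDriverWait(driver, timeout).until(
        EC.element_to_be_clickable((By.ID, element_id))
    )
    field.click()
    field.send_keys(text)
    return field


def login_fields(driver, config):
    """Steps 5 to 9 of login(), rewritten on top of type_when_ready()."""
    type_when_ready(driver, "login_email_username", config["user"])
    type_when_ready(driver, "login_password", config["pass"] + "\n")
```

The wait ends as soon as the field becomes clickable, so a generous timeout does not slow down the happy path.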

Full code

#!/usr/bin/env python3
import IPython
import json
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.service import Service as ChromeService
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

with open('config.json') as config_file:
    config = json.load(config_file)

def setup(AUTO=True):

  options = Options()
  # options.add_argument('ignore-certificate-errors') # enable if you use self-signed certs (NOT RECOMMENDED EXC. FOR DEVVING)

  if AUTO:
    # automagically keep the webdriver in sync with local chrome version
    from webdriver_manager.chrome import ChromeDriverManager
    driver  = webdriver.Chrome(service=ChromeService(ChromeDriverManager().install()), options=options)
  else:
    service = Service(executable_path=config['driverpath']) # if chromedriver not in PATH, then do: service = Service(executable_path='/path/to/chromedriver')
    options.add_argument(f'user-data-dir={config["chromeprofilepath"]}')
    options.add_argument(f'--profile-directory={config["chromeprofiledir"]}')
    driver = webdriver.Chrome(service=service, options=options)

  driver.set_window_size(1920, 1080)
  driver.implicitly_wait(1) # give 1s for stuff to load
  return driver

def login(driver):
    # 1 | open | / | 
    driver.get("https://www.wg-gesucht.de/")
    # 2 | setWindowSize | 1536x824 | 
    driver.set_window_size(1536, 824)
    # 3 | click | id=cmpbntsavetxt | 
    driver.find_element(By.ID, "cmpbntsavetxt").click()
    # 4 | click | linkText=Mein Konto | 
    driver.find_element(By.LINK_TEXT, "Mein Konto").click()
    # 5 | click | id=login_email_username | 
    driver.find_element(By.ID, "login_email_username").click()
    # 6 | type | id=login_email_username | nice.try@fbi.com
    driver.find_element(By.ID, "login_email_username").send_keys(config['user'])
    # 7 | click | id=login_password | 
    driver.find_element(By.ID, "login_password").click()
    # 8+9 | type | id=login_password | dontLeaveYourPassInCode + [ENTER]
    driver.find_element(By.ID, "login_password").send_keys(config['pass']+"\n")

if __name__ == '__main__':
  driver=setup()
  try:
    login(driver)
  except Exception as e:
    print(f"{'*'*80}\nOOPS:\n{e}\n{'*'*80}\n")
    IPython.embed()

Takeaways

Combining Selenium with IPython.embed() is really powerful because:

  1. Selenium IDE sucks at giving good selectors. We'll cover how to do this better than sIDE in another post (TODO).
  2. Instead of having to run the whole script, you get to try out stuff at the spot where the script got b0rked.
