NOTE: This post is not a Selenium tutorial; it's merely a workflow optimization that I often find myself using. If you want to cover Selenium basics, you're better off watching [this] or [reading their docs]. For a Selenium IDE primer, go [here], but frankly the interface is pretty self-explanatory.
The goal
Getting an easy way to automate web browsing, interactively, one interaction step at a time.
I'm demoing this with the login flow for a German flatshare app. This is for educational purposes; don't ban me, WG-Gesucht, I'm not DoS-ing anything.
What the manual browsing looks like
Step 1. Accept cookies
Step 2. Click "Mein Konto" ("my account") in the top right
Step 3. Input credentials in the pop-up and hit Enter.
The initial setup
Step 1. Initialize a Selenium IDE project, start a new test case, and hit Record.
Step 2. Do the actions you want to automate on the target webpage, then stop the recording.
Step 3. Export the script as Pytest.
Selenium output
Below is what Selenium has churned out based on the recorded actions:
# Generated by Selenium IDE
import pytest
import time
import json
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

class TestSimpleLogin():
    def setup_method(self, method):
        self.driver = webdriver.Chrome()
        self.vars = {}

    def teardown_method(self, method):
        self.driver.quit()

    def test_simpleLogin(self):
        # Test name: simpleLogin
        # Step # | name | target | value
        # 1 | open | / |
        self.driver.get("https://www.wg-gesucht.de/")
        # 2 | setWindowSize | 1536x824 |
        self.driver.set_window_size(1536, 824)
        # 3 | click | id=cmpbntsavetxt |
        self.driver.find_element(By.ID, "cmpbntsavetxt").click()
        # 4 | click | linkText=Mein Konto |
        self.driver.find_element(By.LINK_TEXT, "Mein Konto").click()
        # 5 | click | id=login_email_username |
        self.driver.find_element(By.ID, "login_email_username").click()
        # 6 | type | id=login_email_username | nice.try@fbi.com
        self.driver.find_element(By.ID, "login_email_username").send_keys("nice.try@fbi.com")
        # 7 | click | id=login_password |
        self.driver.find_element(By.ID, "login_password").click()
        # 8 | type | id=login_password | dontLeaveYourPassInCode
        self.driver.find_element(By.ID, "login_password").send_keys("dontLeaveYourPassInCode")
        # 9 | sendKeys | id=login_password | ${KEY_ENTER}
        self.driver.find_element(By.ID, "login_password").send_keys(Keys.ENTER)
Let's look at what's happening:
First of all, typical for Pytest, there's a class with a setup_method and a teardown_method. We don't need all that for browser automation.
The slim-down
What I'm pursuing is to write stuff portably. That is, this login thingy should be importable and callable from other scripts.
Let's slim down that code from Selenium IDE a little, and keep credentials in a config.json file instead.
Here's what the config.json would look like on a Windows system:
{
    "driverpath": "C:\\Windows\\chromedriver.exe",
    "chromeprofilepath": "C:\\Users\\<REPLACEWITHYOURUSER>\\AppData\\Local\\Google\\Chrome\\User Data",
    "chromeprofiledir": "<REPLACEWITHYOURPROFILE>",
    "user": "nice.try@fbi.com",
    "pass": "rockyou.txt"
}
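Since every later step reads from this file, it can help to fail early when a key is missing. Below is a minimal sketch of a loader, assuming the key names from the config.json above; `load_config` and `REQUIRED_KEYS` are hypothetical names introduced for illustration, not part of the original script.

```python
import json

# Keys the rest of the script expects (taken from the config.json shown above).
REQUIRED_KEYS = ("driverpath", "chromeprofilepath", "chromeprofiledir", "user", "pass")


def load_config(path="config.json"):
    """Read the JSON config and fail early if an expected key is missing."""
    with open(path, encoding="utf-8") as config_file:
        config = json.load(config_file)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config is missing keys: {', '.join(missing)}")
    return config
```

With something like this in place, a typo in the config file surfaces as a clear error before the browser even starts.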
High-level structure
(py-pseudocode) The intended structure for our script would be:
import whatever_there_is_to_import

loadConfig()

def setup(AUTO=True):
    # ... initialize the webdriver object
    return driver

def login(driver):
    # ... perform login actions

if __name__ == "__main__":
    try:
        driver=setup()
        login(driver)
    except Exception as e:
        print(f"{'*'*80}\nOOPS:\n{e}\n{'*'*80}\n")
        IPython.embed()
The try ... except thing will be explained shortly. It's a very neat trick.
The setup() method
My setup method is written as follows:
def setup(AUTO=True):
    options = Options()
    # options.add_argument('ignore-certificate-errors') # enable if you use self-signed certs (NOT RECOMMENDED EXC. FOR DEVVING)
    if AUTO:
        # automagically keep the webdriver in sync with local chrome version
        from webdriver_manager.chrome import ChromeDriverManager
        driver = webdriver.Chrome(service=ChromeService(ChromeDriverManager().install()), options=options)
    else:
        service = Service(executable_path=config['driverpath']) # if chromedriver not in PATH, then do: service = Service(executable_path='/path/to/chromedriver')
        options.add_argument(f'user-data-dir={config["chromeprofilepath"]}')
        options.add_argument(f'--profile-directory={config["chromeprofiledir"]}')
        driver = webdriver.Chrome(service=service, options=options)
    driver.set_window_size(1920, 1080)
    return driver
There are some goodies in there which I am sharing:
- AUTO=True installs the webdriver for you (useful if you update Chrome but forget about the webdriver 😉).
- The --profile-directory trick (used when AUTO=False) keeps extensions/settings/cookies/history from your normal browsing session, instead of the blank profile Selenium spawns by default.
- (commented out for security) If you're testing out stuff you're developing and have dummy certificates in place, the 'ignore-certificate-errors' option saves you some hassle.
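To make the profile trick easier to inspect, the two flags can be built by a small pure function before any browser is launched. This is only a sketch: `chrome_profile_args` is a hypothetical helper name, the example values are made up, and it assumes the config keys from the config.json above.

```python
def chrome_profile_args(config):
    """Build the Chrome command-line flags that select an existing profile."""
    return [
        f"--user-data-dir={config['chromeprofilepath']}",
        f"--profile-directory={config['chromeprofiledir']}",
    ]


# Made-up example values, just to show the resulting flags.
example = {"chromeprofilepath": "/tmp/chrome-user-data", "chromeprofiledir": "Profile 1"}
for arg in chrome_profile_args(example):
    print(arg)
```

Each returned string could then be handed to options.add_argument(...) in the else branch of setup().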
The login logic, adapted from the test mockup
Now for the login function:
def login(driver):
    # 1 | open | / |
    driver.get("https://www.wg-gesucht.de/")
    # 2 | setWindowSize | 1536x824 |
    driver.set_window_size(1536, 824)
    # 3 | click | id=cmpbntsavetxt |
    driver.find_element(By.ID, "cmpbntsavetxt").click()
    # 4 | click | linkText=Mein Konto |
    driver.find_element(By.LINK_TEXT, "Mein Konto").click()
    # 5 | click | id=login_email_username |
    driver.find_element(By.ID, "login_email_username").click()
    # 6 | type | id=login_email_username | nice.try@fbi.com
    driver.find_element(By.ID, "login_email_username").send_keys(config['user'])
    # 7 | click | id=login_password |
    driver.find_element(By.ID, "login_password").click()
    # 8+9 | type | id=login_password | dontLeaveYourPassInCode + [ENTER]
    driver.find_element(By.ID, "login_password").send_keys(config['pass']+"\n")
What changed?
Aside from not using a class anymore, the only changes are that steps 6 and 8 now read their values from the config file, and that step 9 (hitting ENTER) is folded into step 8 by appending "\n" to the password (much hax, such wow).
Cool, let's run it and see what happens. In the Chrome window, stuff looks "as expected" up until step 5: we get the pop-up asking for our credentials, but the console output says the following:
OOPS:
Message: element not interactable
The IPython.embed() method got called upon encountering the exception, which means we get to keep our global and local variables and essentially have an "interactive breakpoint" on our hands.
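The "interactive breakpoint" pattern can also be wrapped in a reusable helper. The sketch below is an illustration rather than the script's actual code: `run_with_breakpoint` is a hypothetical name, and it falls back to the standard library's `code.interact` as a stand-in for IPython.embed() so that it runs without IPython installed.

```python
import code


def _stdlib_shell(namespace):
    """Stdlib stand-in for IPython.embed(): a plain interactive console."""
    code.interact(local=namespace)


def run_with_breakpoint(step, namespace, shell=None):
    """Call step(); if it raises, open an interactive shell over `namespace`.

    Returns True when step() finished, False when the shell was opened.
    """
    try:
        step()
        return True
    except Exception as exc:
        print(f"{'*' * 80}\nOOPS:\n{exc}\n{'*' * 80}\n")
        if shell is None:
            shell = _stdlib_shell
        # Expose the exception next to the caller's variables for inspection.
        shell({**namespace, "exc": exc})
        return False
```

A call along the lines of run_with_breakpoint(lambda: login(driver), globals()) would then drop you into a console with driver still alive whenever a step fails.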
Cool, let's paste in the code from step 5 and onwards, line by line, into the IPython interactive session.
This approach yields working results, so what's the problem?
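Disclaimer: I'm not a webdev, so this might be obvious to many already. element not interactable means Selenium found the element but couldn't interact with it yet; here, presumably because the login pop-up hadn't finished loading when the script reached for it.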
The fix: adding driver.implicitly_wait(1) to the setup() method before returning the driver object means that Selenium waits (at most) 1 second before throwing these kinds of exceptions.
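If a global wait feels too blunt, the same idea can be applied to a single step by retrying it until it succeeds or a deadline passes. The sketch below is dependency-free and duck-typed: `send_keys_when_ready` is a hypothetical helper, it only assumes an object with a find_element(by, value) method, and it catches Exception broadly for brevity. Selenium also ships WebDriverWait and expected_conditions for this purpose, which is likely the more idiomatic route in real code.

```python
import time


def send_keys_when_ready(driver, by, value, text, timeout=5.0, poll=0.25):
    """Retry find_element(...).send_keys(text) until it works or timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            driver.find_element(by, value).send_keys(text)
            return
        except Exception:
            # Broad catch keeps the sketch dependency-free; narrow it to
            # Selenium's exception classes in real code.
            if time.monotonic() >= deadline:
                raise
            time.sleep(poll)
```

Step 6 could then become send_keys_when_ready(driver, By.ID, "login_email_username", config['user']), which keeps retrying while the pop-up is still appearing.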
Full code
#!/usr/bin/env python3
import IPython
import json
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.service import Service as ChromeService
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

with open('config.json') as config_file:
    config = json.load(config_file)

def setup(AUTO=True):
    options = Options()
    # options.add_argument('ignore-certificate-errors') # enable only for dev work with self-signed certs
    if AUTO:
        # automagically keep the webdriver in sync with local chrome version
        from webdriver_manager.chrome import ChromeDriverManager
        driver = webdriver.Chrome(service=ChromeService(ChromeDriverManager().install()), options=options)
    else:
        service = Service(executable_path=config['driverpath']) # if chromedriver not in PATH, then do: service = Service(executable_path='/path/to/chromedriver')
        options.add_argument(f'user-data-dir={config["chromeprofilepath"]}')
        options.add_argument(f'--profile-directory={config["chromeprofiledir"]}')
        driver = webdriver.Chrome(service=service, options=options)
    driver.set_window_size(1920, 1080)
    driver.implicitly_wait(1) # give 1s for stuff to load
    return driver

def login(driver):
    # 1 | open | / |
    driver.get("https://www.wg-gesucht.de/")
    # 2 | setWindowSize | 1536x824 |
    driver.set_window_size(1536, 824)
    # 3 | click | id=cmpbntsavetxt |
    driver.find_element(By.ID, "cmpbntsavetxt").click()
    # 4 | click | linkText=Mein Konto |
    driver.find_element(By.LINK_TEXT, "Mein Konto").click()
    # 5 | click | id=login_email_username |
    driver.find_element(By.ID, "login_email_username").click()
    # 6 | type | id=login_email_username | nice.try@fbi.com
    driver.find_element(By.ID, "login_email_username").send_keys(config['user'])
    # 7 | click | id=login_password |
    driver.find_element(By.ID, "login_password").click()
    # 8+9 | type | id=login_password | dontLeaveYourPassInCode + [ENTER]
    driver.find_element(By.ID, "login_password").send_keys(config['pass']+"\n")

if __name__ == '__main__':
    driver=setup()
    try:
        login(driver)
    except Exception as e:
        print(f"{'*'*80}\nOOPS:\n{e}\n{'*'*80}\n")
        IPython.embed()
Takeaways
Combining Selenium with IPython.embed() is really powerful because:
- Selenium IDE sucks at giving good selectors. We'll cover how to do this better than sIDE in another post (TODO).
- Instead of having to run the whole script, you get to try out stuff at the spot where the script got b0rked.