I recently started working on my Master's thesis. So far I haven't written a single line of text, only code. Since I wanted to share what I've learned along the way, I decided to turn part of that work into a small library.
There is no concrete title for the paper yet, but the main task is to model and automate Twitter user behavior. To interact with Twitter easily, I went for browser automation using Selenium. Because I wanted to embed this into an existing Spring application and needed to run several Selenium tasks concurrently, I wrote a module that provides a pool of Selenium Docker containers.
The main concept
Under the hood, I use the awesome Testcontainers library to create the containers. As with any pool, the idea is to take a container from the pool and return it once the work is done. Everything else (environment cleanup, browser restart, ...) is handled by the pool itself.
When acquiring a container from the pool, you can also specify a profile for the underlying Chrome browser to run with. This gives you access to cookies and other data stored by previous runs.
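To make the take-and-return idea concrete, here is a minimal sketch of such a pool. It is not the library's implementation: SimplePool, Lease, and the cleanup callback are hypothetical names made up for illustration, and plain strings stand in for containers.

```kotlin
import java.util.concurrent.ConcurrentLinkedQueue
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical handle for a leased resource; closing it returns the resource to the pool.
class Lease<T : Any>(val resource: T, private val onClose: (T) -> Unit) : AutoCloseable {
    private val closed = AtomicBoolean(false)

    override fun close() {
        // Guard against returning the same resource twice.
        if (closed.compareAndSet(false, true)) onClose(resource)
    }
}

// Hypothetical pool: hands out resources and runs a cleanup step before reuse.
class SimplePool<T : Any>(resources: List<T>, private val cleanup: (T) -> Unit = {}) {
    private val free = ConcurrentLinkedQueue(resources)

    // Returns null when every resource is currently leased.
    fun acquire(): Lease<T>? {
        val resource = free.poll() ?: return null
        return Lease(resource) { released ->
            cleanup(released)   // e.g. reset state before the next user gets it
            free.add(released)
        }
    }

    fun available(): Int = free.size
}

fun main() {
    val pool = SimplePool(listOf("container-1", "container-2"))
    val first = pool.acquire()
    val second = pool.acquire()
    println(pool.acquire())     // null: the pool is exhausted
    first?.close()              // returning a lease frees a slot
    println(pool.available())   // 1
    second?.close()
    println(pool.available())   // 2
}
```

The caller only ever acquires and closes; cleanup stays inside the pool, which mirrors the division of work described above.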
Dive into the pool 🤿
Depending on your build tool, add the dependency to your project. For Maven it would be:
<dependency>
    <groupId>de.alxgrk</groupId>
    <artifactId>spring-selenium-pool-core</artifactId>
    <version>1.0.0</version>
</dependency>
Thanks to Spring auto-configuration, you can now get an @Autowired instance of WebDriverPool.
@Component
class SomeSeleniumTask(@Autowired webDriverPool: WebDriverPool) {
    init {
        val container: WebDriverForContainer? = webDriverPool.getWebDriverForContainer()
    }
}
Since all containers might be in use, this method may return null. To work around this, you can instead wait until a container becomes free. (If you expected a suspending function at this point, have a look at the CompletionStage.await() extension function, which lets you await a CompletableFuture inside a coroutine.)
@Component
class SomeSeleniumTask(@Autowired webDriverPool: WebDriverPool) {
    init {
        val container: CompletableFuture<WebDriverForContainer> = webDriverPool.getWebDriverForContainerAsync()
        // blocking get
        container.get()
        // ... or with timeout
        container.get(1, TimeUnit.SECONDS)
        // ... or coroutine style
        GlobalScope.launch {
            val webDriverForContainer = container.await()
        }
    }
}
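One detail worth knowing about the timeout variant: CompletableFuture.get(timeout, unit) throws a TimeoutException when the future is not completed in time. The following sketch shows one way to turn that into a null result. The helper awaitOrNull is a hypothetical name, not part of the library, and plain futures of strings stand in for the future returned by getWebDriverForContainerAsync().

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.TimeUnit
import java.util.concurrent.TimeoutException

// Hypothetical helper: waits up to `timeoutMillis` and returns null
// instead of throwing when the future does not complete in time.
fun <T> awaitOrNull(future: CompletableFuture<T>, timeoutMillis: Long): T? =
    try {
        future.get(timeoutMillis, TimeUnit.MILLISECONDS)
    } catch (e: TimeoutException) {
        null
    }

fun main() {
    // Stand-ins for the pool's future: one never completes, one is already done.
    val neverFree = CompletableFuture<String>()
    val alreadyFree = CompletableFuture.completedFuture("container-1")

    println(awaitOrNull(neverFree, 50))    // null
    println(awaitOrNull(alreadyFree, 50))  // container-1
}
```

Note that a timeout does not cancel the future itself; it can still complete later.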
Doing the actual work 💪
Now you can do whatever you set out to do. So that you don't forget to return the container to the pool (unless you are sure you'll need the exact same instance later), WebDriverForContainer implements AutoCloseable. This means you can simply use try-with-resources in Java or the .use() extension function in Kotlin.
@Component
class SomeSeleniumTask(@Autowired webDriverPool: WebDriverPool) {
    init {
        // !! throws a NullPointerException if no container is free
        webDriverPool.getWebDriverForContainer()!!.use {
            it.webDriver.get("https://foo.bar/")
        }
    }
}
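Because getWebDriverForContainer() may return null, the !! operator fails with a NullPointerException when the pool is exhausted. Here is a small sketch of a null-safe alternative. FakeLease and runIfAvailable are hypothetical stand-ins invented for this example so that it runs without Docker; only the use() call is the real Kotlin standard library function.

```kotlin
// Hypothetical stand-in for WebDriverForContainer, only for illustration.
class FakeLease(val name: String) : AutoCloseable {
    var closed = false
        private set

    override fun close() {
        closed = true
    }
}

// Runs `block` with the lease and closes it afterwards.
// Returns false if no lease was available, instead of throwing.
fun runIfAvailable(lease: FakeLease?, block: (FakeLease) -> Unit): Boolean {
    if (lease == null) return false   // pool exhausted: skip the work
    lease.use(block)                  // close() is called even if block throws
    return true
}

fun main() {
    val lease = FakeLease("container-1")
    val ran = runIfAvailable(lease) { println("working with ${it.name}") }
    println(ran)            // true
    println(lease.closed)   // true: use() closed the lease

    println(runIfAvailable(null) { })   // false
}
```

Whether to skip, retry, or switch to the asynchronous variant when no container is free depends on the task at hand.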
Configuration 🛠️
Last but not least, you can configure the pool via Spring Boot properties, with auto-completion in IntelliJ included. This example configuration shows the default values:
# default values
selenium.pool.size=3
# if empty, no recording will happen
selenium.pool.recording-directory=
# if empty, a temporary directory will be created and deleted on exit
selenium.pool.profiles-directory=
# if empty, no extension files will be loaded
selenium.pool.extension-files-in-classpath=
✨ Bonus: Custom Chrome extensions
If you read the configuration properties carefully, you may have noticed the property for extension files.
You can install a custom Chrome extension by specifying a manifest.json and a corresponding index.js file.
This can be useful for several reasons, for example capturing network traffic.
For an example of how to use this, see application.properties, manifest.json and index.js in spring-selenium-pool-example.
Feedback
I hope you like this library and find it useful. So please use it and tell me about it. I would be happy to hear what you like and what could be improved.
Check out the library on GitHub (a working example can be found in spring-selenium-pool-example) and leave a star if you like it:
alxgrk / spring-selenium-pool
A library that provides a pool of dockerized Selenium instances.
Selenium Pool
This small library helps to create a pool of Selenium Docker containers with a configurable size.
🩲 Prerequisites
Make sure to have the following installed:
- Docker
Installation
For Maven, add this dependency:
<dependency>
    <groupId>de.alxgrk</groupId>
    <artifactId>spring-selenium-pool-core</artifactId>
    <version>1.0.0</version>
</dependency>
For Gradle use:
implementation("de.alxgrk:spring-selenium-pool-core:1.0.0")
🥽 Utilities
To connect to the running Selenium container, you need vncviewer and tigervnc-common installed. If correctly configured, simply…