Using BiDirectional Protocol support in Selenium 4 to stream console logs and network requests

BiDi Protocol support in Selenium 4

One of the new features in the recently released Selenium 4 is support for event-driven listeners, which will eventually be powered by the BiDirectional (BiDi) protocol, currently in draft. (The current Selenium implementation has some limitations, which we'll discuss later.) In this article we'll look at some of these new capabilities and demonstrate how to use them from Scala to inspect console logs and network requests made by the browser.

Limitations of Console/Network Log Support in Selenium 3

In previous versions of Selenium, console and network log information was accessible by pulling via methods such as WebDriver.manage.logs.get(...). While that model does provide log access, it has a few shortcomings:

Logs need to be actively requested. No built-in interface is available for having logs pushed to your code as they occur.
There is no control over the volume of logs retained, which raises memory concerns in a long-lived session unless the logs are pulled periodically.
While network requests and responses can be recorded after they have been made, no built-in mechanism is available to modify or block the requests themselves.
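
For comparison, the pull-based model can be sketched roughly as follows. This assumes Scala 2.13 (for scala.jdk.CollectionConverters); the helper names formatEntry and drainBrowserLogs are our own, not part of Selenium. Depending on the driver's logging preferences, lower-severity entries such as console.log output may not show up without extra configuration.

```scala
import org.openqa.selenium.WebDriver
import org.openqa.selenium.logging.{LogEntry, LogType}
import scala.jdk.CollectionConverters._

// Hypothetical helper: renders one log entry as "<LEVEL> <message>".
def formatEntry(entry: LogEntry): String =
  s"${entry.getLevel} ${entry.getMessage}"

// Hypothetical helper: pulls the browser log entries that have accumulated
// since the previous call. Nothing arrives unless we explicitly ask for it.
def drainBrowserLogs(driver: WebDriver): Seq[String] =
  driver.manage().logs().get(LogType.BROWSER).getAll.asScala.toSeq.map(formatEntry)

// Usage (needs a running browser session):
//   val driver = new org.openqa.selenium.chrome.ChromeDriver()
//   driver.get("https://www.selenium.dev")
//   drainBrowserLogs(driver).foreach(println)
//   driver.quit()
```

The caller has to decide when and how often to call drainBrowserLogs, which is exactly the polling burden described above.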

Selenium 4 BiDi support

While the existing pull-based log methods remain available, Selenium 4 adds a new set of APIs that let users subscribe to console logs and intercept network requests. We'll now demonstrate these APIs using Scala and Selenium's Java library.

BiDi support in action

We'll start by instantiating our WebDriver and creating buffers to hold the console log and network request information:

import org.openqa.selenium.chrome.ChromeDriver
import org.openqa.selenium.devtools.events.ConsoleEvent
import scala.collection.mutable.ListBuffer

// A case class to hold some useful data on a given network request 
// executed in the browser.
case class RequestData(method: String, url: String, responseCode: Int)

val consoleMessages = ListBuffer.empty[ConsoleEvent]

val networkRequests = ListBuffer.empty[RequestData]

val chromeDriver = new ChromeDriver()

val devTools = chromeDriver.getDevTools

devTools.createSessionIfThereIsNotOne()

To begin, we're launching Chrome and opening a connection to it using the Chrome DevTools Protocol (CDP); a future release is expected to use the BiDi protocol instead. Tracking all console messages that are logged during the browser session is as simple as calling:

devTools.getDomains.events().addConsoleListener(consoleMessages.append(_))

addConsoleListener registers a function that is invoked whenever a console message is logged by the browser - in our case we simply throw it onto our consoleMessages buffer.

Note: To avoid the same memory concerns as the pull-based API, have your application consume these messages rather than hold them in memory forever - see the Potential Applications section for a discussion of how that can be done.
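
One way to keep memory bounded is to place a fixed-capacity queue between the listener and whatever consumes the messages. The sketch below is only an illustration: BoundedLog is a hypothetical class of our own, it assumes Scala 2.13, and it drops the oldest entry when full (you may prefer to drop the newest, or to block instead). It uses a thread-safe queue on the assumption that listener callbacks may run on a different thread from your test code.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.jdk.CollectionConverters._

// Hypothetical bounded buffer: holds at most `capacity` pending items.
final class BoundedLog[A](capacity: Int) {
  private val queue = new LinkedBlockingQueue[A](capacity)

  // Called from the listener. When the buffer is full, the oldest entry
  // is discarded to make room, so memory use stays bounded.
  def offer(item: A): Unit =
    while (!queue.offer(item)) queue.poll()

  // Called from the consumer: removes and returns everything pending.
  def drain(): List[A] = {
    val out = new java.util.ArrayList[A]()
    queue.drainTo(out)
    out.asScala.toList
  }
}

// Usage with the listener shown earlier (needs a DevTools session):
//   val pending = new BoundedLog[ConsoleEvent](1000)
//   devTools.getDomains.events().addConsoleListener(e => pending.offer(e))
//   pending.drain().foreach(println) // call periodically from your own code
```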

To record network request and response information we can use the following code:

import java.util.concurrent.CountDownLatch

// A latch to track when a network request has been completed.
val networkRequestLatch = new CountDownLatch(1)

devTools.getDomains.network().interceptTrafficWith(next => {
   request => {
      val response = next.execute(request)

      networkRequests.append(RequestData(
         method = request.getMethod.toString,
         url = request.getUri,
         responseCode = response.getStatus
      ))

      networkRequestLatch.countDown()

      response
   }
})

Let's break this down: interceptTrafficWith allows us to register a Filter that is executed for every network request made by the browser. In our example, we execute each request and add a RequestData entry to our networkRequests buffer containing the HTTP method and URL of the request, as well as the status code of the response.

Note: This is only recording a subset of the data available from the request - a full list of available request information can be found in the Javadocs for HttpRequest.

networkRequestLatch is probably not something you would use in normal code; we add it here as a hook to ensure that a request has been executed before the browser is closed.
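
A plain await() on a latch blocks indefinitely if the page never issues a request. If that is a concern, a bounded wait is one option. The helper below is a small sketch with a hypothetical name (awaitRequest); the timeout value is up to you.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}

// Hypothetical helper: returns true if the latch reached zero within the
// timeout, and false if we gave up waiting.
def awaitRequest(latch: CountDownLatch, timeoutSeconds: Long): Boolean =
  latch.await(timeoutSeconds, TimeUnit.SECONDS)

// Usage:
//   if (!awaitRequest(networkRequestLatch, 10))
//     println("No network request was observed within 10 seconds")
```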

To demonstrate this functionality against a web page, we can use the following example.html file:

<!DOCTYPE html>
<html lang="en">
    <head><title>Selenium 4 Example</title></head>
    <body>
        <script>
            console.log('Hello, Selenium 4!')

            fetch('https://www.google.com')
        </script>
    </body>
</html>

Putting together all the earlier code snippets, you can test against the example.html file with the following code:

import java.util.concurrent.CountDownLatch
import org.openqa.selenium.chrome.ChromeDriver
import org.openqa.selenium.devtools.events.ConsoleEvent
import scala.collection.mutable.ListBuffer

// A case class to hold some useful data on a given network request 
// executed in the browser.
case class RequestData(method: String, url: String, responseCode: Int)

val consoleMessages = ListBuffer.empty[ConsoleEvent]

val networkRequests = ListBuffer.empty[RequestData]

val chromeDriver = new ChromeDriver()

val devTools = chromeDriver.getDevTools

devTools.createSessionIfThereIsNotOne()

devTools.getDomains.events().addConsoleListener(consoleMessages.append(_))

// A latch to track when a network request has been completed.
val networkRequestLatch = new CountDownLatch(1)

devTools.getDomains.network().interceptTrafficWith(next => {
   request => {
      val response = next.execute(request)

      networkRequests.append(RequestData(
         method = request.getMethod.toString,
         url = request.getUri,
         responseCode = response.getStatus
      ))

      networkRequestLatch.countDown()

      response
   }
})

// Replace this string with the full URL of the site or file you'd like to test.
// A local file needs a file:// URL rather than a bare file name.
chromeDriver.get("file:///path/to/example.html")

// Wait for the network request, as it might not complete until after 
// the page has loaded.
networkRequestLatch.await()

// Print out the collected console and network information
consoleMessages.foreach(println)
networkRequests.foreach(println)

// Shut down the browser
chromeDriver.quit()

If you run this code against the example.html file, you should see output similar to the following printed to the console:

2021-11-01T02:07:57.309Z [log] [["Hello, Selenium 4!"]]
RequestData(GET,https://www.google.com/,200)

Potential Applications

While the example code simply records and prints the console and network request information it collects, there is plenty of potential for putting this new functionality to more practical use.

Streaming into storage/ingestion

Rather than relying on pulling logs periodically, network and console logs can now be streamed directly into a file or a remote storage system (such as Amazon S3). Alternatively, the logs could be forwarded directly into a data ingestion pipeline such as Amazon Kinesis for additional processing or filtering before storage.
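
As a minimal sketch of the file-based variant, the listener can hand each message to a small sink that appends it to a local file. FileSink is a hypothetical class of our own, and the file path is a placeholder; a remote store or ingestion pipeline would replace the body of write with the relevant client call.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, StandardOpenOption}

// Hypothetical sink: appends one line per message to a local file.
final class FileSink(path: Path) extends AutoCloseable {
  private val writer = Files.newBufferedWriter(
    path,
    StandardCharsets.UTF_8,
    StandardOpenOption.CREATE,
    StandardOpenOption.APPEND
  )

  // Synchronized on the assumption that listener callbacks may arrive on
  // a thread other than the one that created the sink.
  def write(line: String): Unit = synchronized {
    writer.write(line)
    writer.newLine()
    writer.flush() // flush eagerly so nothing is lost if the session dies
  }

  override def close(): Unit = synchronized(writer.close())
}

// Usage with the console listener (needs a DevTools session):
//   val sink = new FileSink(java.nio.file.Paths.get("console.log"))
//   devTools.getDomains.events().addConsoleListener(e => sink.write(e.toString))
//   ... run the session, then call sink.close()
```

Flushing after every line trades throughput for durability; a busier session might batch writes instead.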

Network Request Modification

By having access to every outbound network request before it is actually executed, one could begin conditionally injecting new data - such as supplemental headers - into outgoing network requests based on the attributes (headers, path, etc.) of the request. You could even block requests if desired.
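
To make this concrete, here is a sketch of a Filter that blocks some requests and tags the rest with an extra header. The "/ads/" path rule, the X-Test-Run header, and the 403 status are all made-up choices for illustration, and how the browser treats the short-circuited response may vary by Selenium and browser version.

```scala
import org.openqa.selenium.remote.http.{Filter, HttpResponse}

// Hypothetical policy: block anything under /ads/, tag everything else.
val tagOrBlock: Filter = next => request => {
  if (request.getUri.contains("/ads/")) {
    // Short-circuit: answer directly without ever calling `next`,
    // so the request is not sent on.
    new HttpResponse().setStatus(403)
  } else {
    // Add a supplemental header, then let the request proceed as usual.
    request.addHeader("X-Test-Run", "demo")
    next.execute(request)
  }
}

// Usage (needs a DevTools session):
//   devTools.getDomains.network().interceptTrafficWith(tagOrBlock)
```

Because a Filter is just a function from one handler to another, it can be exercised without a browser by passing in a stub handler, which makes rules like this straightforward to unit test.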

Limitations

While the new BiDi APIs offer new and interesting patterns for interacting with the browser, they suffer from the same limitation as the older pull-based log APIs: they depend on each individual browser implementing the necessary protocols and APIs for use by the WebDriver. Because the BiDi Protocol is still in a draft state, support across browsers is quite limited - in fact, these new APIs currently rely on the Chrome DevTools Protocol rather than BiDi.

As the BiDi Protocol is finalized, browser support can be expected to improve, but until then you may not be able to use these new APIs with every browser you'd like to.

Conclusion

Selenium 4 provides a new mechanism for interacting with and recording log and network request information in the browser. While it has limited support today, the potential applications for using it make it at minimum a feature to monitor as the BiDi Protocol is finalized.