Das

Posted on • Originally published at Medium

Real-Time Test Results in Robot Framework

No more waiting for the suite to finish to find out what failed.


Introduction

In a previous organisation, a proof of concept was built to stream Robot Framework test results into a live monitoring dashboard using the ELK stack (Elasticsearch, Logstash, and Kibana). The idea was simple: instead of waiting for the full run to complete and then reading the output report, results would appear in a dashboard the moment each test finished.

The problem with ELK is that it is no longer fully open source. Elastic changed its licensing in 2021. OpenSearch is the community fork that stayed open. This project is a rebuild of that original setup using OpenSearch, with the same goal: live test visibility while the suite is still running.

The approach works with any test framework that exposes hooks into the test lifecycle. Robot Framework is used here because that is where the original work was done, but the same pattern applies to pytest, Selenium, and others.


The Problem

When a test suite runs, the report and log are only generated from output.xml once the entire run completes. For short suites this is fine. For longer ones it creates a gap.

During the run, the only signal is whatever the terminal prints. A failure shows up as FAIL with a short message, but by the time the run finishes and the report is available, the context around that failure is gone. Was it a transient error? A timeout that only happens under load? Something that affected multiple tests in the same suite? There is no way to know without digging through logs after the fact.

What would actually help is seeing each result as it happens, with the full message, the test name, the suite, and the duration, in a searchable dashboard that stays available after the run.


Tech Stack

  • Robot Framework - test framework with a listener API for hooking into test execution
  • OpenSearch - open-source search and analytics engine for storing results
  • OpenSearch Dashboards - visualisation layer for building live dashboards
  • Docker - runs OpenSearch and the dashboard in containers
  • Python - the listener that ships results to OpenSearch

Architecture

+-------------------------------+
|  Robot Framework (local)      |
|  listener fires after each    |
|  test and ships the result    |
+---------------+---------------+
                |
                v
+---------------+---------------+        +---------------------------+
|  OpenSearch (Docker)          | -----> |  OpenSearch Dashboards    |
|  port 9200                    |        |  port 5601                |
+-------------------------------+        +---------------------------+

Robot Framework and tests run locally as usual. Docker handles OpenSearch and the dashboard without any manual installation needed.
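For orientation, a minimal single-node docker-compose.yml for this setup might look like the sketch below. This is illustrative only, not the repo's actual file, and the security-disabling flags are strictly for local development:

```yaml
services:
  opensearch:
    image: opensearchproject/opensearch:latest
    environment:
      - discovery.type=single-node
      - DISABLE_SECURITY_PLUGIN=true   # local dev only
    ports:
      - "9200:9200"
  dashboards:
    image: opensearchproject/opensearch-dashboards:latest
    environment:
      - OPENSEARCH_HOSTS=["http://opensearch:9200"]
      - DISABLE_SECURITY_DASHBOARDS_PLUGIN=true   # local dev only
    ports:
      - "5601:5601"
    depends_on:
      - opensearch
```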


Step by Step

Prerequisites

  • Python 3 with pip
  • Docker with the Compose plugin
  • Git, to clone the repository

1. How the Listener Works

Robot Framework exposes a listener API, a Python class it calls at specific points during execution. The hook used here is end_test, which fires immediately after each test completes, before the next one starts.

from datetime import datetime

class OpenSearchListener:
    ROBOT_LISTENER_API_VERSION = 2

    def end_test(self, name, attrs):
        # Called by Robot Framework immediately after each test finishes.
        doc = {
            "run_id": self.run_id,
            "suite": self.suite_name,
            "test": name,
            "status": attrs["status"],
            "message": attrs["message"],
            "tags": list(attrs["tags"]),
            "start_time": ...,  # derived from attrs["starttime"]
            "elapsed_seconds": attrs["elapsedtime"] / 1000,  # ms -> s
            "indexed_at": datetime.utcnow().isoformat() + "Z",
        }
        self.client.index(index=self.index, body=doc)

Each test result is indexed into OpenSearch straight away. The document is in the index before the next test has started.

Each document looks like this:

{
  "run_id": "a3f2c1...",
  "suite": "Tests.Login",
  "test": "Valid credentials should log in",
  "status": "FAIL",
  "message": "Element not found: #submit-btn",
  "tags": ["smoke", "auth"],
  "start_time": "2026-03-21T12:20:35.701Z",
  "elapsed_seconds": 3.001
}

The run_id field is generated once per test run, which makes it straightforward to filter the dashboard to a specific run or compare runs over time.
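As a minimal sketch of this idea, a fresh id can be generated once at listener start-up and then reused both in every indexed document and in queries. The names new_run_id and run_filter below are illustrative, not from the project's code:

```python
import uuid

def new_run_id() -> str:
    # One id per execution; every document indexed during the run carries it.
    return uuid.uuid4().hex

run_id = new_run_id()

# A Query-DSL filter that isolates a single run in OpenSearch:
run_filter = {"query": {"term": {"run_id": run_id}}}
```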

2. Setup

Clone the repo and install dependencies:

git clone https://github.com/007bsd/results-execution-monitoring
cd results-execution-monitoring
pip install -r requirements.txt

Start OpenSearch and the dashboard:

docker compose up -d

Note: The first run downloads around 1 GB of images. Once containers are up, confirm OpenSearch is ready with curl http://localhost:9200. A JSON response means it is running.
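The same readiness check can be scripted if you want to gate a CI step on it. Below is a small polling helper, a sketch rather than part of the repo; the fetch parameter exists only so the function can be exercised without a live server:

```python
import json
import time
import urllib.request

def wait_for_opensearch(url="http://localhost:9200", timeout=60,
                        fetch=urllib.request.urlopen):
    # Poll until the node answers with its JSON info document, or give up.
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with fetch(url) as resp:
                return json.load(resp)
        except OSError:
            time.sleep(2)
    raise RuntimeError(f"OpenSearch not reachable at {url} within {timeout}s")
```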

Run the setup script once to create the index pattern and a starter dashboard:

python setup_dashboard.py

This is meant to get things running quickly. The dashboard it creates is a starting point, not the final word. See the section below on building dashboards manually.

3. Running Tests with the Listener

python -m robot --listener opensearch_listener.OpenSearchListener tests/

The listener confirms it has started and prints the run_id and index name:

Screenshot: Terminal showing the listener starting, creating the index and printing run_id

As the run progresses, each result is confirmed in the terminal:

Screenshot: Terminal showing PASS and FAIL lines streaming from the OpenSearchListener as tests complete

Open http://localhost:5601 before or during the run to watch results appear in real time.

4. Verifying Data in OpenSearch

To confirm results are being indexed correctly, query the index directly from the Dev Tools console at http://localhost:5601/app/dev_tools#/console:

GET robot-results/_mapping

Screenshot: Dev Tools console showing the robot-results index mapping with all fields correctly typed

GET robot-results/_search
{
  "size": 1
}
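Queries can be narrowed further, for example to the most recent failures. The .keyword suffix and the date sort assume OpenSearch's default dynamic mapping; check GET robot-results/_mapping and adjust field names if the actual mapping differs:

```
GET robot-results/_search
{
  "size": 10,
  "query": { "term": { "status.keyword": "FAIL" } },
  "sort": [ { "indexed_at": "desc" } ]
}
```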

5. Exploring Results in Discover

OpenSearch Dashboards has a Discover view that lets you search and filter all indexed documents. It is useful for digging into specific failures, filtering by tag, run, or suite, and understanding patterns across multiple runs.

Screenshot: Discover view showing filtered failed tests with full details including test name, suite, message, and elapsed time

6. Building Dashboards Manually

The setup script creates a starter dashboard to get things going, but most people will want to build their own visualisations on top of the data. OpenSearch Dashboards has a full visual editor for this, no configuration files or scripts required.

To create a visualisation:

  1. Go to http://localhost:5601 and open Visualize from the menu
  2. Click Create visualization and choose a chart type
  3. Select robot-results as the index
  4. Configure the metric (e.g. Count) and bucket (e.g. Terms on status) to get a pass/fail breakdown
  5. Save the visualisation with a name

To build a dashboard from those visualisations:

  1. Open Dashboard from the menu
  2. Click Create new dashboard
  3. Click Add from library and select the saved visualisations
  4. Arrange the panels, set a title, and save

Some useful combinations to start with:

  • Pie or donut chart of status for an overall pass/fail ratio
  • Data table of recent failures showing test, suite, and message
  • Bar chart of elapsed_seconds by test name to spot slow tests
  • A metric panel showing total fail count for the current run
  • Filter by run_id to isolate and compare specific runs
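The pass/fail breakdown that the pie chart visualises can also be fetched directly with a terms aggregation, which is handy for scripting. Again, status.keyword assumes the default dynamic mapping:

```
GET robot-results/_search
{
  "size": 0,
  "aggs": {
    "by_status": { "terms": { "field": "status.keyword" } }
  }
}
```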

Result

Once the stack is running, every test result appears in the dashboard the moment it is indexed. Failures are immediately visible with the full message, test name, suite, and duration, without waiting for the suite to complete.

Screenshot: Robot Framework Results dashboard showing the failed tests table, Pass vs Fail donut chart, and Total Fail count

GIF: Dashboard with both panels side by side updating as test results come in


Conclusion

The standard approach to test reporting is batch-oriented: everything is available when the run finishes, and not before. For short suites this is acceptable. For longer ones it is a genuine visibility problem.

Streaming results to OpenSearch as each test completes inverts that. Results are available immediately, failures are visible in context, and the history of every run is retained and searchable without any extra work.

The listener pattern used here is Robot Framework-specific, but the underlying idea applies to any test framework with a comparable hook system.
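As a sketch of what the pattern looks like elsewhere, pytest's pytest_runtest_logreport hook can play the same role as end_test. The document shape below mirrors this article; build_doc, the index name, and the commented-out client call are illustrative, not project code:

```python
# conftest.py sketch: the same streaming idea via pytest's hook system.
import uuid
from datetime import datetime, timezone

RUN_ID = uuid.uuid4().hex  # one id per run, as in the Robot Framework listener

def build_doc(report):
    # Map a pytest TestReport onto the document shape used in this article.
    if report.passed:
        status = "PASS"
    elif report.skipped:
        status = "SKIP"
    else:
        status = "FAIL"
    return {
        "run_id": RUN_ID,
        "test": report.nodeid,
        "status": status,
        "message": report.longreprtext if report.failed else "",
        "elapsed_seconds": report.duration,
        "indexed_at": datetime.now(timezone.utc).isoformat(),
    }

def pytest_runtest_logreport(report):
    # "call" is the phase in which the test body actually executed.
    if report.when == "call":
        doc = build_doc(report)
        # client.index(index="pytest-results", body=doc)  # ship as each test ends
        print(doc["test"], doc["status"])
```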


Resources

If you try setting this up and run into issues, please leave a comment. The complete code for this project is available on GitHub.
