Das

Posted on • Originally published at Medium

Real-Time Test Results in Robot Framework

No more waiting for the suite to finish to find out what failed.


Introduction

In a previous organisation, a proof of concept was built to stream Robot Framework test results into a live monitoring dashboard using the ELK stack (Elasticsearch, Logstash, and Kibana). The idea was simple: instead of waiting for the full run to complete and then reading the output report, results would appear in a dashboard the moment each test finished.

The problem with ELK is that it is no longer fully open source. Elastic changed its licensing in 2021. OpenSearch is the community fork that stayed open. This project is a rebuild of that original setup using OpenSearch, with the same goal: live test visibility while the suite is still running.

The approach works with any test framework that exposes hooks into the test lifecycle. Robot Framework is used here because that is where the original work was done, but the same pattern applies to pytest, Selenium, and others.


The Problem

When a test suite runs, the report and log are only generated from output.xml once the entire run completes. For short suites this is fine. For longer ones it creates a gap.

During the run, the only signal is whatever the terminal prints. A failure shows up as FAIL with a short message, but by the time the run finishes and the report is available, the context around that failure is gone. Was it a transient error? A timeout that only happens under load? Something that affected multiple tests in the same suite? There is no way to know without digging through logs after the fact.

What would actually help is seeing each result as it happens, with the full message, the test name, the suite, and the duration, in a searchable dashboard that stays available after the run.


Tech Stack

  • Robot Framework - test framework with a listener API for hooking into test execution
  • OpenSearch - open-source search and analytics engine for storing results
  • OpenSearch Dashboards - visualisation layer for building live dashboards
  • Docker - runs OpenSearch and the dashboard in containers
  • Python - the listener that ships results to OpenSearch

Architecture

+-------------------------------+
|  Robot Framework (local)      |
|  listener fires after each    |
|  test and ships the result    |
+---------------+---------------+
                |
                v
+---------------+---------------+        +---------------------------+
|  OpenSearch (Docker)          | -----> |  OpenSearch Dashboards    |
|  port 9200                    |        |  port 5601                |
+-------------------------------+        +---------------------------+

Robot Framework and tests run locally as usual. Docker handles OpenSearch and the dashboard without any manual installation needed.
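For orientation, a minimal single-node docker-compose.yml for this setup might look like the sketch below. This is illustrative only, not the repo's actual file, and the security-disabling flags are strictly for local development:

```yaml
services:
  opensearch:
    image: opensearchproject/opensearch:latest
    environment:
      - discovery.type=single-node
      - DISABLE_SECURITY_PLUGIN=true   # local dev only
    ports:
      - "9200:9200"
  dashboards:
    image: opensearchproject/opensearch-dashboards:latest
    environment:
      - OPENSEARCH_HOSTS=["http://opensearch:9200"]
      - DISABLE_SECURITY_DASHBOARDS_PLUGIN=true   # local dev only
    ports:
      - "5601:5601"
    depends_on:
      - opensearch
```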


Step by Step

Prerequisites

  • Python 3 with pip
  • Docker with the Compose plugin
  • Git, to clone the repository

1. How the Listener Works

Robot Framework exposes a listener API, a Python class it calls at specific points during execution. The hook used here is end_test, which fires immediately after each test completes, before the next one starts.

from datetime import datetime

class OpenSearchListener:
    ROBOT_LISTENER_API_VERSION = 2

    def end_test(self, name, attrs):
        # Called by Robot Framework immediately after each test finishes.
        doc = {
            "run_id": self.run_id,
            "suite": self.suite_name,
            "test": name,
            "status": attrs["status"],
            "message": attrs["message"],
            "tags": list(attrs["tags"]),
            "start_time": ...,  # derived from attrs["starttime"]
            "elapsed_seconds": attrs["elapsedtime"] / 1000,  # ms -> s
            "indexed_at": datetime.utcnow().isoformat() + "Z",
        }
        self.client.index(index=self.index, body=doc)

Each test result is indexed into OpenSearch straight away. The document is in the index before the next test has started.

Each document looks like this:

{
  "run_id": "a3f2c1...",
  "suite": "Tests.Login",
  "test": "Valid credentials should log in",
  "status": "FAIL",
  "message": "Element not found: #submit-btn",
  "tags": ["smoke", "auth"],
  "start_time": "2026-03-21T12:20:35.701Z",
  "elapsed_seconds": 3.001
}

The run_id field is generated once per test run, which makes it straightforward to filter the dashboard to a specific run or compare runs over time.
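As a minimal sketch of this idea, a fresh id can be generated once at listener start-up and then reused both in every indexed document and in queries. The names new_run_id and run_filter below are illustrative, not from the project's code:

```python
import uuid

def new_run_id() -> str:
    # One id per execution; every document indexed during the run carries it.
    return uuid.uuid4().hex

run_id = new_run_id()

# A Query-DSL filter that isolates a single run in OpenSearch:
run_filter = {"query": {"term": {"run_id": run_id}}}
```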

2. Setup

Clone the repo and install dependencies:

git clone https://github.com/007bsd/results-execution-monitoring
cd results-execution-monitoring
pip install -r requirements.txt

Start OpenSearch and the dashboard:

docker compose up -d

Note: The first run downloads around 1 GB of images. Once containers are up, confirm OpenSearch is ready with curl http://localhost:9200. A JSON response means it is running.
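The same readiness check can be scripted if you want to gate a CI step on it. Below is a small polling helper, a sketch rather than part of the repo; the fetch parameter exists only so the function can be exercised without a live server:

```python
import json
import time
import urllib.request

def wait_for_opensearch(url="http://localhost:9200", timeout=60,
                        fetch=urllib.request.urlopen):
    # Poll until the node answers with its JSON info document, or give up.
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with fetch(url) as resp:
                return json.load(resp)
        except OSError:
            time.sleep(2)
    raise RuntimeError(f"OpenSearch not reachable at {url} within {timeout}s")
```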

Run the setup script once to create the index pattern and a starter dashboard:

python setup_dashboard.py

This is meant to get things running quickly. The dashboard it creates is a starting point, not the final word. See the section below on building dashboards manually.

3. Running Tests with the Listener

python -m robot --listener opensearch_listener.OpenSearchListener tests/

The listener confirms it has started and prints the run_id and index name:

Screenshot: Terminal showing the listener starting, creating the index and printing run_id

As the run progresses, each result is confirmed in the terminal:

Screenshot: Terminal showing PASS and FAIL lines streaming from the OpenSearchListener as tests complete

Open http://localhost:5601 before or during the run to watch results appear in real time.

4. Verifying Data in OpenSearch

To confirm results are being indexed correctly, query the index directly from the Dev Tools console at http://localhost:5601/app/dev_tools#/console:

GET robot-results/_mapping

Screenshot: Dev Tools console showing the robot-results index mapping with all fields correctly typed

GET robot-results/_search
{
  "size": 1
}
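Queries can be narrowed further, for example to the most recent failures. The .keyword suffix and the date sort assume OpenSearch's default dynamic mapping; check GET robot-results/_mapping and adjust field names if the actual mapping differs:

```
GET robot-results/_search
{
  "size": 10,
  "query": { "term": { "status.keyword": "FAIL" } },
  "sort": [ { "indexed_at": "desc" } ]
}
```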

5. Exploring Results in Discover

OpenSearch Dashboards has a Discover view that lets you search and filter all indexed documents. It is useful for digging into specific failures, filtering by tag, run, or suite, and understanding patterns across multiple runs.

Screenshot: Discover view showing filtered failed tests with full details including test name, suite, message, and elapsed time

6. Building Dashboards Manually

The setup script creates a starter dashboard to get things going, but most people will want to build their own visualisations on top of the data. OpenSearch Dashboards has a full visual editor for this, no configuration files or scripts required.

To create a visualisation:

  1. Go to http://localhost:5601 and open Visualize from the menu
  2. Click Create visualization and choose a chart type
  3. Select robot-results as the index
  4. Configure the metric (e.g. Count) and bucket (e.g. Terms on status) to get a pass/fail breakdown
  5. Save the visualisation with a name

To build a dashboard from those visualisations:

  1. Open Dashboard from the menu
  2. Click Create new dashboard
  3. Click Add from library and select the saved visualisations
  4. Arrange the panels, set a title, and save

Some useful combinations to start with:

  • Pie or donut chart of status for an overall pass/fail ratio
  • Data table of recent failures showing test, suite, and message
  • Bar chart of elapsed_seconds by test name to spot slow tests
  • A metric panel showing total fail count for the current run
  • Filter by run_id to isolate and compare specific runs
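The pass/fail breakdown that the pie chart visualises can also be fetched directly with a terms aggregation, which is handy for scripting. Again, status.keyword assumes the default dynamic mapping:

```
GET robot-results/_search
{
  "size": 0,
  "aggs": {
    "by_status": { "terms": { "field": "status.keyword" } }
  }
}
```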

Result

Once the stack is running, every test result appears in the dashboard the moment it is indexed. Failures are immediately visible with the full message, test name, suite, and duration, without waiting for the suite to complete.

Screenshot: Robot Framework Results dashboard showing the failed tests table, Pass vs Fail donut chart, and Total Fail count

GIF: Dashboard with both panels side by side updating as test results come in


Conclusion

The standard approach to test reporting is batch-oriented: everything is available when the run finishes, and not before. For short suites this is acceptable. For longer ones it is a genuine visibility problem.

Streaming results to OpenSearch as each test completes inverts that. Results are available immediately, failures are visible in context, and the history of every run is retained and searchable without any extra work.

The listener pattern used here is Robot Framework-specific, but the underlying idea applies to any test framework with a comparable hook system.
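As a sketch of what the pattern looks like elsewhere, pytest's pytest_runtest_logreport hook can play the same role as end_test. The document shape below mirrors this article; build_doc, the index name, and the commented-out client call are illustrative, not project code:

```python
# conftest.py sketch: the same streaming idea via pytest's hook system.
import uuid
from datetime import datetime, timezone

RUN_ID = uuid.uuid4().hex  # one id per run, as in the Robot Framework listener

def build_doc(report):
    # Map a pytest TestReport onto the document shape used in this article.
    if report.passed:
        status = "PASS"
    elif report.skipped:
        status = "SKIP"
    else:
        status = "FAIL"
    return {
        "run_id": RUN_ID,
        "test": report.nodeid,
        "status": status,
        "message": report.longreprtext if report.failed else "",
        "elapsed_seconds": report.duration,
        "indexed_at": datetime.now(timezone.utc).isoformat(),
    }

def pytest_runtest_logreport(report):
    # "call" is the phase in which the test body actually executed.
    if report.when == "call":
        doc = build_doc(report)
        # client.index(index="pytest-results", body=doc)  # ship as each test ends
        print(doc["test"], doc["status"])
```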


Resources

If you try setting this up and run into issues, please leave a comment. The complete code for this project is available on GitHub.
