No more waiting for the suite to finish to find out what failed.
Introduction
In a previous organisation, a proof of concept was built to stream Robot Framework test results into a live monitoring dashboard using the ELK stack (Elasticsearch, Logstash, and Kibana). The idea was simple: instead of waiting for the full run to complete and then reading the output report, results would appear in a dashboard the moment each test finished.
The problem with ELK is that it is no longer fully open source. Elastic changed its licensing in 2021. OpenSearch is the community fork that stayed open. This project is a rebuild of that original setup using OpenSearch, with the same goal: live test visibility while the suite is still running.
The approach works with any test framework that exposes hooks into the test lifecycle. Robot Framework is used here because that is where the original work was done, but the same pattern applies to pytest, Selenium, and others.
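As a taste of what that looks like elsewhere, pytest exposes a comparable hook (`pytest_runtest_logreport`), which receives a report object per test phase. A hypothetical adapter, with illustrative function and field names not taken from the repo, could build the same kind of result document:

```python
# Hypothetical sketch: map a pytest test report to the same document shape
# the Robot Framework listener produces. Names here are illustrative.

def report_to_doc(report, run_id):
    """Build a result document from a pytest TestReport-like object.

    Only the 'call' phase represents the test body; setup/teardown
    reports are skipped by returning None.
    """
    if report.when != "call":
        return None
    return {
        "run_id": run_id,
        "test": report.nodeid,
        "status": "PASS" if report.passed else "FAIL",
        "elapsed_seconds": report.duration,
    }
```

A real pytest plugin would call this from `pytest_runtest_logreport` and ship the dict to the indexer, exactly as the Robot Framework listener does below.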
The Problem
When a test suite runs, results are only written to output.xml when the entire run completes. For short suites this is fine. For longer ones it creates a gap.
During the run, the only signal is whatever the terminal prints. A failure shows up as FAIL with a short message, but by the time the run finishes and the report is available, the context around that failure is gone. Was it a transient error? A timeout that only happens under load? Something that affected multiple tests in the same suite? There is no way to know without digging through logs after the fact.
What would actually help is seeing each result as it happens, with the full message, the test name, the suite, and the duration, in a searchable dashboard that stays available after the run.
Tech Stack
- Robot Framework - test framework with a listener API for hooking into test execution
- OpenSearch - open-source search and analytics engine for storing results
- OpenSearch Dashboards - visualisation layer for building live dashboards
- Docker - runs OpenSearch and the dashboard in containers
- Python - the listener that ships results to OpenSearch
Architecture
+-------------------------------+
| Robot Framework (local)       |
| listener fires after each     |
| test and ships the result     |
+---------------+---------------+
                |
                v
+---------------+---------------+        +---------------------------+
| OpenSearch (Docker)           | -----> | OpenSearch Dashboards     |
| port 9200                     |        | port 5601                 |
+-------------------------------+        +---------------------------+
Robot Framework and the tests run locally as usual. Docker runs OpenSearch and the dashboard, so neither needs to be installed manually.
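The repository ships its own compose file; for orientation, a single-node development setup with security disabled (suitable only for local use; image tags and settings here are assumptions, check the repo's actual file) typically looks something like:

```yaml
services:
  opensearch:
    image: opensearchproject/opensearch:2.11.0
    environment:
      - discovery.type=single-node       # no clustering for local dev
      - plugins.security.disabled=true   # plain HTTP, localhost only
    ports:
      - "9200:9200"
  dashboards:
    image: opensearchproject/opensearch-dashboards:2.11.0
    environment:
      - OPENSEARCH_HOSTS=["http://opensearch:9200"]
      - DISABLE_SECURITY_DASHBOARDS_PLUGIN=true
    ports:
      - "5601:5601"
```

Disabling the security plugin keeps the local workflow friction-free (no TLS or credentials), which is fine for a laptop but not for anything shared.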
Step by Step
Prerequisites: Docker (with Compose) and Python installed locally; everything else from the tech stack above comes from the steps below.
1. How the Listener Works
Robot Framework exposes a listener API: a Python class whose methods it calls at specific points during execution. The hook used here is end_test, which fires immediately after each test completes, before the next one starts.
from datetime import datetime

class OpenSearchListener:
    ROBOT_LISTENER_API_VERSION = 2

    def end_test(self, name, attrs):
        # self.run_id, self.suite_name, self.client and self.index
        # are initialised elsewhere in the listener.
        doc = {
            "run_id": self.run_id,
            "suite": self.suite_name,
            "test": name,
            "status": attrs["status"],
            "message": attrs["message"],
            "tags": list(attrs["tags"]),
            "start_time": ...,  # derived from attrs["starttime"]
            "elapsed_seconds": attrs["elapsedtime"] / 1000,  # ms -> s
            "indexed_at": datetime.utcnow().isoformat() + "Z",
        }
        self.client.index(index=self.index, body=doc)
Each test result is indexed into OpenSearch straight away. The document is in the index before the next test has started.
Each document looks like this:
{
  "run_id": "a3f2c1...",
  "suite": "Tests.Login",
  "test": "Valid credentials should log in",
  "status": "FAIL",
  "message": "Element not found: #submit-btn",
  "tags": ["smoke", "auth"],
  "start_time": "2026-03-21T12:20:35.701Z",
  "elapsed_seconds": 3.001
}
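For reference, Robot Framework's listener attributes report timestamps as strings like 20260321 12:20:35.701. A small helper, shown as an illustrative sketch rather than the repo's exact code, can turn that into the ISO 8601 form stored in start_time:

```python
from datetime import datetime

def robot_ts_to_iso(ts: str) -> str:
    """Convert Robot Framework's 'YYYYMMDD HH:MM:SS.mmm' timestamp
    into the ISO 8601 form stored in the document.

    Appending 'Z' assumes the machine clock is set to UTC; adjust
    if your runner uses a local timezone.
    """
    parsed = datetime.strptime(ts, "%Y%m%d %H:%M:%S.%f")
    return parsed.isoformat(timespec="milliseconds") + "Z"
```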
The run_id field is generated once per test run, which makes it straightforward to filter the dashboard to a specific run or compare runs over time.
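The repo's exact mechanism isn't shown here, but a UUID generated once in the listener's __init__ is the obvious sketch:

```python
import uuid

def new_run_id() -> str:
    """One opaque identifier per test run; every document indexed
    during that run shares it, so a dashboard filter on run_id
    isolates a single run."""
    return uuid.uuid4().hex
```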
2. Setup
Clone the repo and install dependencies:
git clone https://github.com/007bsd/results-execution-monitoring
cd results-execution-monitoring
pip install -r requirements.txt
Start OpenSearch and the dashboard:
docker compose up -d
Note: The first run downloads around 1 GB of images. Once the containers are up, confirm OpenSearch is ready:

curl http://localhost:9200

A JSON response means it is running.
Run the setup script once to create the index pattern and a starter dashboard:
python setup_dashboard.py
This is meant to get things running quickly. The dashboard it creates is a starting point, not the final word. See the section below on building dashboards manually.
3. Running Tests with the Listener
python -m robot --listener opensearch_listener.OpenSearchListener tests/
The listener confirms on startup that it has connected, printing the run_id and the index name. As the run progresses, each result is acknowledged in the terminal.
Open http://localhost:5601 before or during the run to watch results appear in real time.
4. Verifying Data in OpenSearch
To confirm results are being indexed correctly, query the index directly from the Dev Tools console at http://localhost:5601/app/dev_tools#/console:
GET robot-results/_mapping
GET robot-results/_search
{
"size": 1
}
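Building on those, a query that pulls only failures, newest first, might look like the following. This assumes OpenSearch's dynamic mapping, under which the status text field gets a .keyword sub-field and indexed_at is detected as a date; if the repo defines an explicit mapping, the field names may differ:

```
GET robot-results/_search
{
  "query": { "term": { "status.keyword": "FAIL" } },
  "sort": [{ "indexed_at": { "order": "desc" } }]
}
```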
5. Exploring Results in Discover
OpenSearch Dashboards has a Discover view that lets you search and filter all indexed documents. It is useful for digging into specific failures, filtering by tag, run, or suite, and understanding patterns across multiple runs.
6. Building Dashboards Manually
The setup script creates a starter dashboard to get things going, but most people will want to build their own visualisations on top of the data. OpenSearch Dashboards has a full visual editor for this, no configuration files or scripts required.
To create a visualisation:
- Go to http://localhost:5601 and open Visualize from the menu
- Click Create visualization and choose a chart type
- Select robot-results as the index
- Configure the metric (e.g. Count) and bucket (e.g. Terms on status) to get a pass/fail breakdown
- Save the visualisation with a name
To build a dashboard from those visualisations:
- Open Dashboard from the menu
- Click Create new dashboard
- Click Add from library and select the saved visualisations
- Arrange the panels, set a title, and save
Some useful combinations to start with:
- Pie or donut chart of status for an overall pass/fail ratio
- Data table of recent failures showing test, suite, and message
- Bar chart of elapsed_seconds by test name to spot slow tests
- A metric panel showing total fail count for the current run
- Filter by run_id to isolate and compare specific runs
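The pass/fail ratio behind that pie chart can also be fetched directly as an aggregation from Dev Tools, again assuming dynamic mapping gives status a .keyword sub-field:

```
GET robot-results/_search
{
  "size": 0,
  "aggs": {
    "by_status": { "terms": { "field": "status.keyword" } }
  }
}
```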
Result
Once the stack is running, every test result appears in the dashboard the moment it is indexed. Failures are immediately visible with the full message, test name, suite, and duration, without waiting for the suite to complete.
Conclusion
The standard approach to test reporting is batch: everything is available when the run finishes, and not before. For short suites this is acceptable. For longer ones it is a genuine visibility problem.
Streaming results to OpenSearch as each test completes inverts that. Results are available immediately, failures are visible in context, and the history of every run is retained and searchable without any extra work.
The listener pattern used here is Robot Framework-specific, but the underlying idea applies to any test framework with a comparable hook system.
Resources
- GitHub - github.com/007bsd/results-execution-monitoring
- Robot Framework Listener API - robotframework.org/robotframework/latest/RobotFrameworkUserGuide.html#listener-interface
- OpenSearch Documentation - opensearch.org/docs/latest
- OpenSearch Dashboards - opensearch.org/docs/latest/dashboards
- Docker - docs.docker.com
If you try setting this up and encounter any issues, please leave a comment. The complete code for this project is available on GitHub.