Simran Kumari

How to Convert Logs to Metrics: A Practical Guide with OpenObserve Pipelines

Most engineering teams start their observability journey with logs. They're easy to implement, they capture exactly what happened, and when something breaks, logs are usually the first place you look.

But here's the thing: your logs already contain metrics—timestamps, status codes, error flags, latency values. The problem isn't a lack of data. It's that you're asking metric questions while relying entirely on logs.

In this guide, I'll show you how to extract metrics from logs using scheduled pipelines, step by step.

Why Convert Logs to Metrics?

Before diving into the how, let's understand the why.

Aspect              | Logs                           | Metrics
--------------------|--------------------------------|----------------------------------
What they represent | Individual events              | Aggregated summaries
Detail level        | Per request/event              | High-level trends
Cardinality         | High                           | Low
Best for            | Debugging, root cause analysis | Monitoring, alerting, dashboards
Query cost          | Expensive at scale             | Cheap and fast

When you try to use logs as a substitute for metrics, you pay the cost of high cardinality for questions that don't need that detail.

Example: Building a dashboard showing "error rates over time" by scanning millions of log entries repeatedly? That's slow and expensive. Deriving a metric once per minute? That's fast and cheap.
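
To make that cost difference concrete, here is a rough sketch using the streams from later in this guide. It assumes a raw log stream called kubernetes_logs, a derived metric stream called k8s_http_errors_total, and OpenObserve's histogram() time-bucketing function (check the exact syntax for your version):

-- Expensive: rescan every raw log line each time the dashboard refreshes
SELECT histogram(_timestamp, '1 minute') AS ts, COUNT(*) AS errors
FROM kubernetes_logs
WHERE code >= 500
GROUP BY ts
ORDER BY ts

-- Cheap: read pre-aggregated points from the derived metric stream
SELECT _timestamp, value
FROM k8s_http_errors_total
ORDER BY _timestamp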

Understanding Pipeline Types

In OpenObserve, pipelines fall into two categories:

Real-time Pipelines

Operate on individual events as they arrive. Great for:

  • Normalizing fields
  • Enriching records
  • Dropping noisy data
  • Routing events to different streams

Scheduled Pipelines

Run at fixed intervals over defined time windows. Perfect for:

  • Aggregating logs into metrics
  • Computing summaries
  • Generating time-series data

For logs-to-metrics conversion, scheduled pipelines are what you need.

The Logs-to-Metrics Flow

Here's how the data flows:

App Logs → Log Stream → Scheduled Pipeline (every 1min) → Metric Stream → Dashboards/Alerts

The pipeline:

  1. Reads logs from the previous time window
  2. Filters and aggregates them
  3. Writes results to a metric stream

Step-by-Step: Converting Kubernetes Logs to Metrics

Let's build this with real Kubernetes logs.

Prerequisites

  • An OpenObserve instance (Cloud or Self-hosted)
  • Sample log data (or your own logs)

Step 1: Ingest Your Logs

For this demo, grab some sample Kubernetes logs:

curl -L https://zinc-public-data.s3.us-west-2.amazonaws.com/zinc-enl/sample-k8s-logs/k8slog_json.json.zip -o k8slog_json.json.zip
unzip k8slog_json.json.zip

These logs contain typical K8s fields like _timestamp, code (HTTP status), kubernetes_container_name, kubernetes_labels_app, etc.
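
Then push the file into a log stream. Here's a minimal sketch using OpenObserve's JSON ingestion endpoint, assuming a local self-hosted instance, the default organization, a stream named kubernetes_logs, and placeholder credentials (adjust the URL, auth, and stream name for your setup):

curl -u root@example.com:Complexpass#123 -k \
  "https://localhost:5080/api/default/kubernetes_logs/_json" \
  -d "@k8slog_json.json"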

Step 2: Define Your Metrics

Before writing pipeline code, decide what you want to measure:

  • k8s_http_requests_total: Total requests per app per minute
  • k8s_http_errors_total: Total 5xx errors per app per minute

Step 3: Create the Scheduled Pipeline

Source Query for Request Count

SELECT
  'k8s_http_requests_total' AS "__name__",
  'counter' AS "__type__",
  COUNT(*) AS "value",
  kubernetes_labels_app AS app,
  kubernetes_namespace_name AS namespace, 
  MAX(_timestamp) AS _timestamp
FROM kubernetes_logs
GROUP BY
  kubernetes_labels_app,
  kubernetes_namespace_name

Key fields explained:

  • __name__ → Metric name (required)
  • __type__ → Metric type: counter or gauge (required)
  • value → The actual metric value (required)
  • app, namespace → Labels for filtering/grouping

Source Query for Error Count

SELECT
  'k8s_http_errors_total' AS __name__,
  'counter' AS __type__,
  COUNT(*) AS value,
  kubernetes_labels_app AS app,
  kubernetes_namespace_name AS namespace,
  MAX(_timestamp) AS _timestamp
FROM kubernetes_logs
WHERE code >= 500
GROUP BY
  kubernetes_labels_app,
  kubernetes_namespace_name

The WHERE code >= 500 filter ensures we only count server errors.
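
The same pattern extends to gauges. As a sketch, here is an average-latency gauge; the duration_ms field is purely illustrative (the sample K8s logs may not carry a latency field, so substitute whatever numeric field your logs actually have):

SELECT
  'k8s_http_request_duration_avg' AS "__name__",
  'gauge' AS "__type__",
  AVG(duration_ms) AS "value",
  kubernetes_labels_app AS app,
  kubernetes_namespace_name AS namespace,
  MAX(_timestamp) AS _timestamp
FROM kubernetes_logs
GROUP BY
  kubernetes_labels_app,
  kubernetes_namespace_name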

Step 4: Configure Pipeline Settings

  1. Test the query - Run it manually to verify output includes __name__, __type__, and value
  2. Set the interval - Typically 1 minute for real-time metrics
  3. Define the destination - Point to your metric stream
  4. Connect the nodes and save

Step 5: Verify Your Metrics

After the pipeline runs:

  1. Check the destination metric stream
  2. Verify records contain the expected metric names, types, values, and labels (see the sample query below)
  3. Build dashboards using the new metrics
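
For step 2, a quick sanity check might look like the sketch below. It assumes the derived metrics can be queried by stream name (here k8s_http_requests_total); the exact name depends on how you configured the destination:

SELECT *
FROM k8s_http_requests_total
ORDER BY _timestamp DESC
LIMIT 10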

Troubleshooting Common Errors

error in ingesting metrics missing __name__

Your query output doesn't include the metric name.

Fix: Ensure your SQL includes:

SELECT 'metric_name' AS "__name__", ...

error in ingesting metrics missing __type__

The metric type isn't being set.

Fix: Add the type field:

SELECT 'counter' AS "__type__", ...

error in ingesting metrics missing value

The numeric value is missing.

Fix: Include an aggregation:

COUNT(*) AS "value"

DerivedStream has reached max retries of 3

The pipeline failed multiple times due to validation errors.

Fix:

  1. Open the pipeline config
  2. Run the source SQL manually
  3. Verify all required fields are present
  4. Save and wait for next scheduled run

No Metrics Produced (But No Errors)

The pipeline runs but produces nothing.

Cause: No logs exist in the source stream for the time window.

Fix:

  1. Verify logs are being ingested to the source stream
  2. Run the SQL query manually against recent data (see the sketch below)
  3. Check the time window matches when logs exist
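
For step 2, a quick check could be as simple as this sketch (run it over the same time range the pipeline would have covered, using the time picker):

-- Does the source stream contain any matching logs at all?
SELECT COUNT(*) AS matching_logs
FROM kubernetes_logs
WHERE code >= 500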

Debugging Pipeline Failures

Enable usage reporting to track pipeline execution:

ZO_USAGE_REPORTING_ENABLED=true

This surfaces:

  • Error stream: Detailed failure messages
  • Triggers stream: Pipeline execution history
  • UI indicators: Visual failure signals with error messages

What's Next?

Once your logs-to-metrics pipeline is running:

  1. Build dashboards from your derived metrics
  2. Set up alerts on metric thresholds (see the sketch below)
  3. Create SLOs using your new metrics
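
For example, an alert on the derived error counter could be backed by a SQL condition roughly like this sketch. It assumes the metric stream is queryable as k8s_http_errors_total; the 50-error threshold is arbitrary:

SELECT app, SUM(value) AS errors
FROM k8s_http_errors_total
GROUP BY app
HAVING SUM(value) > 50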

Key Takeaways

  • Logs contain metric data—you just need to extract it
  • Scheduled pipelines bridge the gap between raw logs and aggregated metrics
  • Three required fields: __name__, __type__, value
  • No new instrumentation needed—work with data you already have
  • Result: Faster dashboards, cheaper queries, better observability

Have you implemented logs-to-metrics in your stack? What challenges did you face? Let me know in the comments! 👇

Originally published on the OpenObserve blog.
