Most engineering teams start their observability journey with logs. They're easy to implement, they capture exactly what happened, and when something breaks, logs are usually the first place you look.
But here's the thing: your logs already contain metrics—timestamps, status codes, error flags, latency values. The problem isn't a lack of data. It's that you're asking metric questions while relying entirely on logs.
In this guide, I'll show you how to extract metrics from logs using scheduled pipelines, step by step.
Why Convert Logs to Metrics?
Before diving into the how, let's understand the why.
| Aspect | Logs | Metrics |
|---|---|---|
| What they represent | Individual events | Aggregated summaries |
| Detail level | Per request/event | High-level trends |
| Cardinality | High | Low |
| Best for | Debugging, root cause analysis | Monitoring, alerting, dashboards |
| Query cost | Expensive at scale | Cheap and fast |
When you try to use logs as a substitute for metrics, you pay the cost of high cardinality for questions that don't need that detail.
Example: Building a dashboard showing "error rates over time" by scanning millions of log entries repeatedly? That's slow and expensive. Deriving a metric once per minute? That's fast and cheap.
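To make the difference concrete, here is a rough sketch of the two approaches (the stream names and the `histogram()` time-bucketing function are assumptions based on the example built later in this guide):

```sql
-- Expensive: every dashboard refresh re-scans the raw log stream
SELECT histogram(_timestamp, '1 minute') AS ts, COUNT(*) AS errors
FROM kubernetes_logs
WHERE code >= 500
GROUP BY ts
ORDER BY ts;

-- Cheap: read the pre-aggregated metric a scheduled pipeline already wrote
SELECT _timestamp, value
FROM k8s_http_errors_total
ORDER BY _timestamp;
```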
Understanding Pipeline Types
In OpenObserve, pipelines fall into two categories:
Real-time Pipelines
Operate on individual events as they arrive. Great for:
- Normalizing fields
- Enriching records
- Dropping noisy data
- Routing events to different streams
Scheduled Pipelines
Run at fixed intervals over defined time windows. Perfect for:
- Aggregating logs into metrics
- Computing summaries
- Generating time-series data
For logs-to-metrics conversion, scheduled pipelines are what you need.
The Logs-to-Metrics Flow
Here's how the data flows:
App Logs → Log Stream → Scheduled Pipeline (every 1min) → Metric Stream → Dashboards/Alerts
The pipeline:
- Reads logs from the previous time window
- Filters and aggregates them
- Writes results to a metric stream
Step-by-Step: Converting Kubernetes Logs to Metrics
Let's build this with real Kubernetes logs.
Prerequisites
- An OpenObserve instance (Cloud or Self-hosted)
- Sample log data (or your own logs)
Step 1: Ingest Your Logs
For this demo, grab some sample Kubernetes logs:
```bash
curl -L https://zinc-public-data.s3.us-west-2.amazonaws.com/zinc-enl/sample-k8s-logs/k8slog_json.json.zip -o k8slog_json.json.zip
unzip k8slog_json.json.zip
```
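To load the unzipped file into OpenObserve, you can POST it to the JSON ingestion endpoint. A minimal sketch, assuming a local instance on port 5080, the default organization, a target stream named `kubernetes_logs`, placeholder credentials, and that the archive extracts to `k8slog_json.json`:

```bash
# Ingest the sample logs into the "kubernetes_logs" stream of the "default" org
curl -u root@example.com:your-password \
  -X POST "http://localhost:5080/api/default/kubernetes_logs/_json" \
  -H "Content-Type: application/json" \
  --data-binary @k8slog_json.json
```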
These logs contain typical K8s fields like `_timestamp`, `code` (HTTP status), `kubernetes_container_name`, `kubernetes_labels_app`, etc.
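A single record looks roughly like this, trimmed to the fields used in this guide (values are illustrative):

```json
{
  "_timestamp": 1680246906650420,
  "code": 200,
  "kubernetes_container_name": "nginx",
  "kubernetes_labels_app": "checkout",
  "kubernetes_namespace_name": "prod"
}
```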
Step 2: Define Your Metrics
Before writing pipeline code, decide what you want to measure:
- `k8s_http_requests_total`: Total requests per app per minute
- `k8s_http_errors_total`: Total 5xx errors per app per minute
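Each row the pipeline emits should land in the metric stream looking roughly like this (values are illustrative):

```json
{
  "__name__": "k8s_http_requests_total",
  "__type__": "counter",
  "value": 1423,
  "app": "checkout",
  "namespace": "prod",
  "_timestamp": 1680246960000000
}
```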
Step 3: Create the Scheduled Pipeline
Source Query for Request Count
```sql
SELECT
  'k8s_http_requests_total' AS "__name__",
  'counter' AS "__type__",
  COUNT(*) AS "value",
  kubernetes_labels_app AS app,
  kubernetes_namespace_name AS namespace,
  MAX(_timestamp) AS _timestamp
FROM kubernetes_logs
GROUP BY
  kubernetes_labels_app,
  kubernetes_namespace_name
```
Key fields explained:

- `__name__` → Metric name (required)
- `__type__` → Metric type: `counter` or `gauge` (required)
- `value` → The actual metric value (required)
- `app`, `namespace` → Labels for filtering/grouping
Source Query for Error Count
```sql
SELECT
  'k8s_http_errors_total' AS __name__,
  'counter' AS __type__,
  COUNT(*) AS value,
  kubernetes_labels_app AS app,
  kubernetes_namespace_name AS namespace,
  MAX(_timestamp) AS _timestamp
FROM kubernetes_logs
WHERE code >= 500
GROUP BY
  kubernetes_labels_app,
  kubernetes_namespace_name
```
The `WHERE code >= 500` filter ensures we only count server errors.
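The same pattern extends to other slices of the data. For example, a client-error counter could use a different filter (a sketch only; the metric name `k8s_http_client_errors_total` is an illustrative addition, not part of the original setup):

```sql
SELECT
  'k8s_http_client_errors_total' AS __name__,
  'counter' AS __type__,
  COUNT(*) AS value,
  kubernetes_labels_app AS app,
  kubernetes_namespace_name AS namespace,
  MAX(_timestamp) AS _timestamp
FROM kubernetes_logs
WHERE code >= 400 AND code < 500
GROUP BY
  kubernetes_labels_app,
  kubernetes_namespace_name
```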
Step 4: Configure Pipeline Settings
- Test the query - Run it manually to verify the output includes `__name__`, `__type__`, and `value`
- Set the interval - Typically 1 minute for real-time metrics
- Define the destination - Point to your metric stream
- Connect the nodes and save
Step 5: Verify Your Metrics
After the pipeline runs:
- Check the destination metric stream
- Verify records contain expected metric names, types, values, and labels
- Build dashboards using the new metrics
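Once samples are flowing, dashboard panels over the derived stream stay cheap no matter how many raw logs sit behind them. A sketch, assuming the metric stream can be queried with SQL and using OpenObserve's `histogram()` time-bucketing function:

```sql
-- Per-app 5xx counts over time, read from the derived metric stream
SELECT histogram(_timestamp, '5 minutes') AS ts,
       app,
       SUM(value) AS errors
FROM k8s_http_errors_total
GROUP BY ts, app
ORDER BY ts
```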
Troubleshooting Common Errors
error in ingesting metrics missing __name__
Your query output doesn't include the metric name.
Fix: Ensure your SQL includes:
```sql
SELECT 'metric_name' AS "__name__", ...
```
error in ingesting metrics missing __type__
The metric type isn't being set.
Fix: Add the type field:
```sql
SELECT 'counter' AS "__type__", ...
```
error in ingesting metrics missing value
The numeric value is missing.
Fix: Include an aggregation:
```sql
COUNT(*) AS "value"
```
DerivedStream has reached max retries of 3
The pipeline failed multiple times due to validation errors.
Fix:
- Open the pipeline config
- Run the source SQL manually
- Verify all required fields are present
- Save and wait for next scheduled run
No Metrics Produced (But No Errors)
The pipeline runs but produces nothing.
Cause: No logs exist in the source stream for the time window.
Fix:
- Verify logs are being ingested to the source stream
- Run the SQL query manually against recent data
- Check the time window matches when logs exist
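A quick sanity check is to run a plain count against the source stream over the same time range the pipeline would have covered (set the UI time-range picker to match the window):

```sql
-- A zero count means the pipeline window simply had nothing to aggregate
SELECT kubernetes_labels_app AS app, COUNT(*) AS log_count
FROM kubernetes_logs
GROUP BY kubernetes_labels_app
```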
Debugging Pipeline Failures
Enable usage reporting to track pipeline execution:
```bash
ZO_USAGE_REPORTING_ENABLED=true
```
This surfaces:
- Error stream: Detailed failure messages
- Triggers stream: Pipeline execution history
- UI indicators: Visual failure signals with error messages
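For a self-hosted deployment, `ZO_USAGE_REPORTING_ENABLED` is just an environment variable on the OpenObserve process. A sketch using Docker (image name, tag, and credentials are assumptions; adapt to your setup):

```bash
# Start OpenObserve with usage reporting enabled so pipeline runs are recorded
docker run -d \
  -e ZO_USAGE_REPORTING_ENABLED=true \
  -e ZO_ROOT_USER_EMAIL=root@example.com \
  -e ZO_ROOT_USER_PASSWORD=your-password \
  -p 5080:5080 \
  public.ecr.aws/zinclabs/openobserve:latest
```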
What's Next?
Once your logs-to-metrics pipeline is running:
- Build dashboards from your derived metrics
- Set up alerts on metric thresholds
- Create SLOs using your new metrics
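Because the metrics are already aggregated, alert evaluation queries stay small. A sketch of a threshold-style condition (the 50-error threshold is a placeholder; the evaluation window and trigger settings live in the alert configuration itself):

```sql
-- Fire when any app records more than 50 server errors in the evaluation window
SELECT app, SUM(value) AS errors
FROM k8s_http_errors_total
GROUP BY app
HAVING SUM(value) > 50
```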
Key Takeaways
- Logs contain metric data—you just need to extract it
- Scheduled pipelines bridge the gap between raw logs and aggregated metrics
- Three required fields: `__name__`, `__type__`, `value`
- No new instrumentation needed—work with data you already have
- Result: Faster dashboards, cheaper queries, better observability
Have you implemented logs-to-metrics in your stack? What challenges did you face? Let me know in the comments! 👇
Originally published on the OpenObserve blog.