Milestone 2: Standardizing Telemetry Output with JSON, Prometheus, and OpenMetrics
In this milestone, we are focusing on one thing only: data format standardization.
The Heka Insights Agent already collects CPU, memory, and disk telemetry.
Now the goal is to emit the same logical metrics in three standard output formats:
- JSON
- Prometheus text exposition
- OpenMetrics text format
Why This Milestone Matters
If an agent has no clear format strategy, every downstream integration becomes custom work.
That slows down adoption and increases maintenance cost.
By standardizing format early, we get:
- stable contracts for integrations
- easier validation and testing
- portability across observability stacks
- clearer boundaries between collection and export
Milestone Scope (Only Data Format)
This milestone does not include transports, retry logic, or backend adapters.
It only covers how telemetry is represented and serialized.
Included:
- canonical internal metric model
- naming/type/unit rules
- serializers for json, prometheus, openmetrics
- deterministic output behavior
- contract tests with golden files
Out of scope:
- Datadog/New Relic senders
- batching/compression/persistence
- new collector domains
Canonical Metric Contract
Every metric will be representable through one shared contract:
- name (string)
- description (string)
- type (gauge or counter)
- unit (e.g. bytes, seconds, percent, count)
- value (number)
- labels (map of string to string; empty allowed)
- timestamp_unix_ms (optional integer)
This contract is the core design decision in Milestone 2.
Serializers consume this model and render format-specific output without changing metric meaning.
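To make the contract concrete, here is a minimal sketch of how it could be modeled in Python. The class name Metric, the frozen dataclass, and the type check are illustrative assumptions; only the field names and their meanings come from the contract above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# The contract allows exactly these two metric types.
ALLOWED_TYPES = {"gauge", "counter"}


@dataclass(frozen=True)
class Metric:
    """Hypothetical in-memory form of the canonical metric contract."""
    name: str
    description: str
    type: str
    unit: str
    value: float
    labels: Dict[str, str] = field(default_factory=dict)  # empty allowed
    timestamp_unix_ms: Optional[int] = None                # optional

    def __post_init__(self):
        # Reject anything outside the contract before a serializer sees it.
        if self.type not in ALLOWED_TYPES:
            raise ValueError(f"unsupported metric type: {self.type!r}")
```

Keeping the model immutable is one way to ensure a serializer cannot change metric meaning while rendering it.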
Naming and Semantics Rules
To keep the output stable and machine-friendly:
- metric names are lowercase snake_case
- all names are prefixed with heka_
- counters end in _total
- unit suffixes are explicit (_bytes, _seconds, _percent)
- label keys are lowercase snake_case
- metric identity must stay consistent across formats
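These rules are mechanical enough to check in code. The following is a rough sketch of such a check; the function name naming_violations and the exact regular expressions are assumptions made for illustration, not part of the agent.

```python
import re

# Assumed patterns for "lowercase snake_case"; the heka_ prefix is required for names.
NAME_RE = re.compile(r"^heka_[a-z0-9]+(_[a-z0-9]+)*$")
LABEL_KEY_RE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")
# Units that must appear as an explicit suffix in the metric name.
UNIT_SUFFIXES = {"bytes": "_bytes", "seconds": "_seconds", "percent": "_percent"}


def naming_violations(name, metric_type, unit, label_keys=()):
    """Return a list of human-readable rule violations (empty if the name is fine)."""
    problems = []
    if not NAME_RE.match(name):
        problems.append("name must be lowercase snake_case with a heka_ prefix")
    if metric_type == "counter" and not name.endswith("_total"):
        problems.append("counter names must end in _total")
    # For counters the unit suffix sits just before _total.
    base = name
    if metric_type == "counter" and name.endswith("_total"):
        base = name[: -len("_total")]
    suffix = UNIT_SUFFIXES.get(unit)
    if suffix and not base.endswith(suffix):
        problems.append(f"name must carry the unit suffix {suffix}")
    for key in label_keys:
        if not LABEL_KEY_RE.match(key):
            problems.append(f"label key {key!r} must be lowercase snake_case")
    return problems
```

For example, naming_violations("heka_disk_read_bytes_total", "counter", "bytes") returns an empty list, while a name without the heka_ prefix is reported.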
Current Metric Mapping
Initial canonical mapping includes:
- heka_cpu_usage_percent (gauge)
- heka_cpu_time_percent (gauge with mode=<field>)
- heka_memory_virtual_used_bytes (gauge)
- heka_memory_virtual_available_bytes (gauge)
- heka_memory_virtual_total_bytes (gauge)
- heka_memory_swap_used_bytes (gauge)
- heka_memory_swap_total_bytes (gauge)
- heka_disk_read_bytes_total (counter)
- heka_disk_write_bytes_total (counter)
- heka_disk_reads_total (counter)
- heka_disk_writes_total (counter)
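As an illustration of the mode=<field> mapping, the sketch below turns per-mode CPU percentages into one labelled gauge per mode. The input shape (a dict of mode name to percentage), the function name cpu_time_metrics, and the description text are assumptions; the collector's real output may look different.

```python
def cpu_time_metrics(cpu_times_percent):
    """Map a hypothetical {mode: percent} dict to canonical metric dicts."""
    return [
        {
            "name": "heka_cpu_time_percent",
            "description": "CPU time share by mode",  # placeholder wording
            "type": "gauge",
            "unit": "percent",
            "value": float(value),
            "labels": {"mode": mode},
        }
        # Sorting by mode keeps the mapping output deterministic.
        for mode, value in sorted(cpu_times_percent.items())
    ]
```

Each collector field becomes a label value rather than a separate metric name, so the metric identity stays the same across modes.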
Format-Specific Requirements
JSON
- UTF-8 JSON object
- includes schema_version (starting at v1)
- includes generated_at (RFC3339 UTC)
- includes top-level metrics array
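A JSON serializer meeting these requirements could look roughly like the sketch below. It assumes metrics arrive as plain dicts keyed by the contract's field names; the function name to_json and the injectable now parameter (useful for reproducible tests) are illustrative choices.

```python
import json
from datetime import datetime, timezone


def to_json(metrics, now=None):
    """Render metric dicts as one JSON document with schema metadata."""
    now = now or datetime.now(timezone.utc)
    payload = {
        "schema_version": "v1",
        # RFC3339 timestamp in UTC, e.g. 2024-01-02T03:04:05Z
        "generated_at": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        # Sort by name, then labels, so the array order never depends on input order.
        "metrics": sorted(
            metrics,
            key=lambda m: (m["name"], sorted(m.get("labels", {}).items())),
        ),
    }
    # sort_keys makes the key order deterministic; callers encode the str as UTF-8.
    return json.dumps(payload, sort_keys=True, ensure_ascii=False)
```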
Prometheus
- Prometheus text exposition format (0.0.4)
- include # HELP and # TYPE lines
- deterministic label ordering
- no OpenMetrics-only directives
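The following sketch shows one way to render the Prometheus text format under these rules. It assumes metrics are plain dicts keyed by the contract's field names and omits timestamps for brevity; to_prometheus and the helper names are hypothetical.

```python
from itertools import groupby


def _escape_label_value(value):
    # Label values escape backslash, double quote, and newline.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def _escape_help(text):
    # HELP text escapes backslash and newline only.
    return text.replace("\\", "\\\\").replace("\n", "\\n")


def to_prometheus(metrics):
    """Render metric dicts as Prometheus text exposition (no UNIT, no EOF)."""
    ordered = sorted(
        metrics, key=lambda m: (m["name"], sorted(m.get("labels", {}).items()))
    )
    lines = []
    for name, group in groupby(ordered, key=lambda m: m["name"]):
        group = list(group)
        lines.append(f"# HELP {name} {_escape_help(group[0]['description'])}")
        lines.append(f"# TYPE {name} {group[0]['type']}")
        for m in group:
            # Labels are sorted by key so the same metric always renders identically.
            labels = ",".join(
                f'{k}="{_escape_label_value(v)}"'
                for k, v in sorted(m.get("labels", {}).items())
            )
            sample = f"{name}{{{labels}}}" if labels else name
            lines.append(f"{sample} {m['value']}")
    return "".join(line + "\n" for line in lines)
```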
OpenMetrics
- OpenMetrics text format
- include # HELP, # TYPE, and # UNIT when known
- terminate payload with # EOF
- metric names and labels remain aligned with Prometheus mode
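A sketch of an OpenMetrics renderer is shown below, again assuming plain metric dicts and omitting timestamps. Two details follow the OpenMetrics specification rather than this post: metadata lines for a counter use the family name without the _total suffix while the sample line keeps it, and a # UNIT line is only valid when the family name ends in that unit. The sketch therefore skips # UNIT for units such as count that are not part of the name. The function names are hypothetical.

```python
def _escape(text):
    # Escape backslash, double quote, and newline.
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def _family(m):
    # Counter families drop _total; the sample name keeps it.
    name = m["name"]
    if m["type"] == "counter" and name.endswith("_total"):
        return name[: -len("_total")]
    return name


def to_openmetrics(metrics):
    """Render metric dicts as OpenMetrics text, terminated by # EOF."""
    ordered = sorted(
        metrics, key=lambda m: (_family(m), sorted(m.get("labels", {}).items()))
    )
    lines, seen = [], set()
    for m in ordered:
        family = _family(m)
        if family not in seen:
            seen.add(family)
            lines.append(f"# TYPE {family} {m['type']}")
            unit = m.get("unit", "")
            # Emit UNIT only when the family name actually ends in the unit.
            if unit and family.endswith("_" + unit):
                lines.append(f"# UNIT {family} {unit}")
            lines.append(f"# HELP {family} {_escape(m['description'])}")
        labels = ",".join(
            f'{k}="{_escape(v)}"' for k, v in sorted(m.get("labels", {}).items())
        )
        sample = f"{m['name']}{{{labels}}}" if labels else m["name"]
        lines.append(f"{sample} {m['value']}")
    lines.append("# EOF")
    return "".join(line + "\n" for line in lines)
```

Because sample names and labels are unchanged, the sample lines match the Prometheus output exactly; only the metadata lines and the terminator differ.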
Configuration Contract
One selector controls serialization:
OUTPUT_FORMAT=json|prometheus|openmetrics
- default: json
- invalid values: fail fast with a clear startup error
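The selector logic can be sketched in a few lines. The function name resolve_output_format and the strict, case-sensitive matching are assumptions; the post only specifies the variable name, the default, and the fail-fast behavior.

```python
import os

SUPPORTED_FORMATS = ("json", "prometheus", "openmetrics")


def resolve_output_format(env=None):
    """Read OUTPUT_FORMAT, defaulting to json and rejecting unknown values."""
    env = os.environ if env is None else env
    raw = env.get("OUTPUT_FORMAT", "json")
    if raw not in SUPPORTED_FORMATS:
        # Fail at startup with a message that names the bad value and the valid ones.
        raise ValueError(
            f"invalid OUTPUT_FORMAT {raw!r}; expected one of: "
            + ", ".join(SUPPORTED_FORMATS)
        )
    return raw
```

Calling this once during startup, before any collector runs, keeps a misconfigured agent from emitting output in an unexpected format.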
Acceptance Criteria
Milestone 2 is done when:
- same logical metric set is emitted in all three formats
- names/types/units are consistent
- Prometheus and OpenMetrics outputs validate
- JSON includes schema metadata and metrics array
- output order is deterministic
- golden-file tests exist for each format
GitHub Milestone Breakdown
Work is tracked through:
- M2-1 canonical metric model
- M2-2 collector-to-canonical mapping
- M2-3 JSON serializer
- M2-4 Prometheus serializer
- M2-5 OpenMetrics serializer
- M2-6 output format config + validation
- M2-7 fixture/contract tests
- M2-8 docs update