Pascal Clément

How I Built Kafka Cost Tracking with Prometheus JMX

When you run Kafka at scale, you quickly realize that not all topics are created equal. Some topics consume gigabytes of storage. Others generate massive traffic. But without tooling, you have no idea which ones — or which team owns them.

This is the problem I ran into while building PartitionPilot, a Kafka cost management platform. In this post I'll explain how we track storage and traffic costs per topic using Prometheus JMX metrics.

The Problem: Kafka Has No Native Cost View

Out of the box, Kafka's tooling shows you offsets, consumer lag, and partition counts. What it doesn't give you in one place is:

  • How much storage each topic is using
  • How much traffic (bytes in/out) each topic generates
  • Which team or service owns a topic
  • What that costs you per month

For small clusters this doesn't matter much. But once you have dozens of teams and hundreds of topics, the question "who is responsible for this 500GB topic?" becomes very real.

The Solution: Prometheus JMX Metrics

Kafka exposes JMX metrics that can be scraped by Prometheus. The two most useful metrics for cost tracking are:

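kafka.log:type=Log,name=Size: the size in bytes of each topic-partition log on disk. Each broker reports it for every partition replica it hosts, so a per-topic total summed across brokers includes all replicas.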

kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec and BytesOutPerSec — the traffic rate per topic.
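As a rough illustration of reading these metrics back out of Prometheus, here is a minimal Python sketch. It uses the standard Prometheus HTTP API path /api/v1/query, but the metric name kafka_log_log_size is an assumption: the actual name depends on your JMX exporter rules, so check what your setup emits. The helper names are hypothetical and this is not PartitionPilot's code.

```python
import json
from urllib.parse import urlencode

# Assumed metric name; it depends on your JMX exporter rule configuration.
LOG_SIZE_QUERY = "sum by (topic) (kafka_log_log_size)"


def instant_query_url(base_url: str, promql: str) -> str:
    """Build a Prometheus instant-query URL for the given PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def bytes_per_topic(response_body: str) -> dict[str, float]:
    """Turn a Prometheus instant-query JSON response into {topic: value}."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"Prometheus query failed: {payload.get('error')}")
    result: dict[str, float] = {}
    for series in payload["data"]["result"]:
        topic = series["metric"].get("topic", "")
        _timestamp, value = series["value"]  # value arrives as a string
        result[topic] = float(value)
    return result
```

You would fetch instant_query_url(...) with any HTTP client and pass the response body to bytes_per_topic. Because the query sums over every broker, the totals include replicated copies of each partition.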

With these two metrics you can calculate:

  • Storage cost: GB stored × your storage price per GB per month
  • Traffic cost: GB transferred (byte rate × elapsed time) × your network price per GB
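The two formulas above can be sketched in a few lines of Python. The function names and the prices in the usage lines are made up for illustration, and the sketch assumes decimal gigabytes; swap in 2**30 if your provider bills per GiB.

```python
GB = 1_000_000_000  # decimal gigabyte (assumption; use 2**30 for GiB billing)


def storage_cost_per_month(bytes_stored: float, price_per_gb_month: float) -> float:
    """Monthly storage cost for a topic currently holding bytes_stored on disk."""
    return bytes_stored / GB * price_per_gb_month


def traffic_cost(bytes_per_sec: float, seconds: float, price_per_gb: float) -> float:
    """Cost of traffic flowing at a steady bytes_per_sec rate for `seconds`."""
    return bytes_per_sec * seconds / GB * price_per_gb


# Hypothetical prices: 0.10 per GB-month of storage, 0.02 per GB transferred.
monthly_storage = storage_cost_per_month(500 * GB, 0.10)  # a 500 GB topic
one_day_traffic = traffic_cost(1_000_000, 86_400, 0.02)   # 1 MB/s for a day
```

Note that the traffic metric is a rate, so it has to be multiplied by a time window before a per-GB price applies.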

How PartitionPilot Uses These Metrics

PartitionPilot connects to your Prometheus endpoint (the one scraping Kafka JMX) and takes periodic snapshots. Each snapshot captures the current storage size and traffic rates per topic.

Kafka JMX → Prometheus → PartitionPilot → Cost snapshot (PostgreSQL)

From these snapshots we calculate a rolling cost per topic, per day, per month. The result is a dashboard where you can see at a glance which topics are your "top talkers" — the ones driving most of your Kafka bill.
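One way to turn rate snapshots into daily totals is to assume the rate stays constant until the next snapshot and sum rate × interval. The sketch below does that in Python. The Snapshot shape and function name are hypothetical, not PartitionPilot's actual schema, and each interval is attributed to the UTC day of its earlier snapshot.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Snapshot:
    taken_at: float       # Unix timestamp in seconds
    topic: str
    bytes_per_sec: float  # combined in + out rate at snapshot time


def daily_traffic_bytes(snapshots: list[Snapshot]) -> dict[tuple[str, str], float]:
    """Estimate bytes transferred per (topic, UTC day) from rate snapshots."""
    by_topic: dict[str, list[Snapshot]] = defaultdict(list)
    for snap in snapshots:
        by_topic[snap.topic].append(snap)

    totals: dict[tuple[str, str], float] = defaultdict(float)
    for topic, series in by_topic.items():
        series.sort(key=lambda s: s.taken_at)
        # Hold each rate constant until the next snapshot (left Riemann sum).
        for earlier, later in zip(series, series[1:]):
            day = datetime.fromtimestamp(earlier.taken_at, tz=timezone.utc).date().isoformat()
            totals[(topic, day)] += earlier.bytes_per_sec * (later.taken_at - earlier.taken_at)
    return dict(totals)
```

Shorter snapshot intervals make the estimate closer to the true byte count, at the price of more rows to store.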

Assigning Ownership

Cost numbers alone aren't enough. You also need to know who owns each topic.

PartitionPilot lets you assign an owner (a person or team) to each topic and consumer group. Once ownership is assigned, you can generate a chargeback report: a CSV that breaks down Kafka costs by owner, ready to share with engineering managers or finance teams.
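To make the chargeback idea concrete, here is a small Python sketch that rolls per-topic costs up to owners and writes a CSV. The column names, the "unassigned" bucket, and the function name are assumptions for illustration; PartitionPilot's real report format may differ.

```python
import csv
import io
from collections import defaultdict


def chargeback_csv(topic_costs: dict[str, float], owners: dict[str, str]) -> str:
    """Aggregate per-topic monthly costs by owner and render them as CSV."""
    totals: dict[str, float] = defaultdict(float)
    for topic, cost in topic_costs.items():
        # Topics without an assigned owner land in a catch-all bucket.
        totals[owners.get(topic, "unassigned")] += cost

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["owner", "monthly_cost"])
    # Most expensive owners first.
    for owner in sorted(totals, key=lambda name: totals[name], reverse=True):
        writer.writerow([owner, f"{totals[owner]:.2f}"])
    return buffer.getvalue()
```

Keeping an explicit "unassigned" row is a deliberate choice here: it shows how much spend still has no owner.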

Exporting Metrics via /metrics

PartitionPilot also exposes a /metrics endpoint in Prometheus format. This means you can scrape PartitionPilot itself from your existing Prometheus setup and build Grafana dashboards on top of the cost and ownership data.
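For readers unfamiliar with the Prometheus text format, the sketch below renders cost data as gauge samples in Python. The metric name kafka_topic_monthly_cost and its labels are invented for this example and are not necessarily what PartitionPilot exports.

```python
def escape_label(value: str) -> str:
    """Escape a label value per the Prometheus text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(costs: dict[tuple[str, str], float]) -> str:
    """Render {(topic, owner): monthly_cost} as Prometheus gauge samples."""
    lines = [
        # Hypothetical metric name, chosen for illustration only.
        "# HELP kafka_topic_monthly_cost Estimated monthly cost per topic.",
        "# TYPE kafka_topic_monthly_cost gauge",
    ]
    for (topic, owner), cost in sorted(costs.items()):
        labels = f'topic="{escape_label(topic)}",owner="{escape_label(owner)}"'
        lines.append(f"kafka_topic_monthly_cost{{{labels}}} {cost}")
    return "\n".join(lines) + "\n"
```

Serving this string with content type text/plain from a /metrics route is enough for Prometheus to scrape it, and Grafana can then group the gauge by the owner label.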

Works with Apache Kafka, AWS MSK, and Confluent

Any Kafka distribution whose JMX metrics are scraped into Prometheus is compatible. This includes:

  • Apache Kafka (self-hosted)
  • AWS MSK (Managed Streaming for Apache Kafka)
  • Confluent Platform

No agents or Kafka plugins are required — just a Prometheus scrape URL.

Getting Started

PartitionPilot is self-hosted via Docker Compose. You can start a free 30-day trial at partitionpilot.com — no credit card required.

If you're running Kafka and want to understand what it actually costs, give it a try. I'd love to hear your feedback.


Pascal Clément — founder of PartitionPilot