Alex Spinov

OpenCost Has a Free API: Track Your Kubernetes Spending in Real-Time

What is OpenCost?

OpenCost is an open-source CNCF project that provides real-time cost monitoring for Kubernetes clusters. It breaks down your cloud bill by namespace, deployment, pod, and even container — showing exactly where your money goes.

Why OpenCost?

  • 100% free and open-source — CNCF sandbox project
  • Real-time cost allocation — not monthly bills, but live spending
  • Multi-cloud — AWS, GCP, Azure, on-prem pricing
  • Namespace-level — chargeback per team/service
  • Prometheus integration — export cost metrics alongside performance data
  • No vendor lock-in — unlike Kubecost Pro or CloudHealth

Quick Start

# Add the OpenCost chart repository
helm repo add opencost https://opencost.github.io/opencost-helm-chart
helm repo update

# Install via Helm
helm install opencost opencost/opencost \
  --namespace opencost --create-namespace \
  --set opencost.prometheus.internal.enabled=true

# Port-forward the API (9003) and the UI (9090)
kubectl port-forward -n opencost svc/opencost 9003 9090
# Open http://localhost:9090 for the UI

Query Costs via API

# Get cost allocation by namespace (last 24h)
curl -s 'http://localhost:9003/allocation/compute?window=24h&aggregate=namespace' | jq '.data[0]'

# Get cost by deployment (last 7 days)
curl -s 'http://localhost:9003/allocation/compute?window=7d&aggregate=deployment' | jq '.data'

# Get cost by label (e.g., team) over the last 30 days
curl -s 'http://localhost:9003/allocation/compute?window=30d&aggregate=label:team' | jq '.data'
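The same queries can be scripted. The following is a minimal Python sketch using only the standard library. The helper names `allocation_url` and `fetch_allocations` are hypothetical, and the base URL is an assumption: point it at wherever your OpenCost API is reachable.

```python
import json
import urllib.parse
import urllib.request


def allocation_url(base_url: str, window: str, aggregate: str) -> str:
    """Build an /allocation/compute URL with URL-encoded query parameters."""
    query = urllib.parse.urlencode({"window": window, "aggregate": aggregate})
    return f"{base_url.rstrip('/')}/allocation/compute?{query}"


def fetch_allocations(base_url: str, window: str, aggregate: str) -> list:
    """Call the API and return the list stored under the response's "data" key."""
    url = allocation_url(base_url, window, aggregate)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["data"]


if __name__ == "__main__":
    # Assumed address; adjust to match your port-forward.
    print(allocation_url("http://localhost:9003", "24h", "namespace"))
```

Building the query string with `urlencode` keeps values such as `label:team` safely escaped, which is easy to get wrong when concatenating strings by hand.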

API Response Example

{
  "production": {
    "name": "production",
    "cpuCost": 45.23,
    "gpuCost": 0,
    "ramCost": 28.67,
    "pvCost": 12.40,
    "networkCost": 5.30,
    "totalCost": 91.60,
    "cpuEfficiency": 0.34,
    "ramEfficiency": 0.52
  }
}
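A response shaped like the one above is straightforward to post-process. The sketch below ranks allocations by a rough idle-cost estimate, computed as cost × (1 − efficiency) for CPU and RAM. That formula is an illustrative heuristic, not a figure OpenCost reports, and `rank_by_idle_cost` is a hypothetical helper name.

```python
def rank_by_idle_cost(allocations: dict) -> list:
    """Return (name, total_cost, estimated_idle_cost) tuples, largest idle cost first.

    Idle cost is approximated as cost * (1 - efficiency) for CPU and RAM.
    A missing efficiency value is treated as 1.0 (no waste assumed).
    """
    rows = []
    for name, alloc in allocations.items():
        cpu_idle = alloc.get("cpuCost", 0.0) * (1 - alloc.get("cpuEfficiency", 1.0))
        ram_idle = alloc.get("ramCost", 0.0) * (1 - alloc.get("ramEfficiency", 1.0))
        rows.append((name, alloc.get("totalCost", 0.0), round(cpu_idle + ram_idle, 2)))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Sample data copied from the response example in this article.
sample = {
    "production": {
        "name": "production",
        "cpuCost": 45.23, "gpuCost": 0, "ramCost": 28.67,
        "pvCost": 12.40, "networkCost": 5.30, "totalCost": 91.60,
        "cpuEfficiency": 0.34, "ramEfficiency": 0.52,
    }
}

for name, total, idle in rank_by_idle_cost(sample):
    print(f"{name}: total ${total:.2f}, estimated idle ${idle:.2f}")
```

For the sample above, the estimate comes to roughly $43.61 of the $91.60 total, which suggests that CPU and RAM requests in that namespace are worth reviewing.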

Prometheus Metrics

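# Grafana dashboard queries
# Metric names can differ between OpenCost versions; confirm them against your deployment's /metrics output.

# Total cluster cost, grouped by cluster
sum(opencost_cluster_cost_total) by (cluster)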

# Cost by namespace
sum(opencost_allocation_cost_total) by (namespace)

# CPU efficiency (actual vs requested)
opencost_allocation_cpu_usage / opencost_allocation_cpu_request

# Idle resources cost (wasted money)
sum(opencost_allocation_cpu_idle_cost + opencost_allocation_ram_idle_cost)

Set Up Alerts for Cost Spikes

# PrometheusRule for cost alerts
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cost-alerts
spec:
  groups:
    - name: cost-alerts
      rules:
        - alert: NamespaceCostSpike
          # rate() returns a per-second value, so scale by 86400 to get cost per day
          expr: |
            sum by (namespace) (rate(opencost_allocation_cost_total[1h])) * 86400
            > 100
          for: 30m
          labels:
            severity: warning
          annotations:
            summary: "Namespace {{ $labels.namespace }} spending >$100/day"
        - alert: LowCPUEfficiency
          expr: |
            opencost_allocation_cpu_usage / opencost_allocation_cpu_request < 0.1
          for: 2h
          annotations:
            summary: "{{ $labels.namespace }}/{{ $labels.pod }} using <10% of requested CPU"

OpenCost vs Alternatives

Feature     | OpenCost   | Kubecost | CloudHealth | Spot.io
Cost        | Free       | Free/Pro | Enterprise  | Enterprise
Open source | Yes (CNCF) | Partial  | No          | No
Real-time   | Yes        | Yes      | Hourly      | Hourly
Multi-cloud | Yes        | Yes      | Yes         | Yes
API         | REST       | REST     | REST        | REST
GPU costs   | Yes        | Pro only | Limited     | No

Real-World Impact

A SaaS company discovered through OpenCost that their staging namespace cost $2,100/month, almost as much as production. Investigation revealed 15 forgotten load tests still running and 8 dev deployments with production-sized resource requests. After cleanup, staging costs dropped to $340/month, saving about $21K/year.
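The annual figure follows directly from the monthly numbers. As a quick check, here it is as a small Python sketch (`annual_savings` is a hypothetical helper name):

```python
def annual_savings(monthly_before: float, monthly_after: float) -> float:
    """Yearly savings from a reduction in monthly spend."""
    return (monthly_before - monthly_after) * 12


print(annual_savings(2100, 340))  # 21120
```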


Overspending on Kubernetes? I help teams implement cost monitoring and right-sizing. Contact spinov001@gmail.com or explore my automation tools on Apify.

Top comments (3)

Void Stitch

Alex, this is useful. One edge we keep seeing in tenant chargeback audits is that labels are present but retry hops silently rewrite originator identity, so totals look right while ownership is wrong.

Concrete pass/fail criterion we now use before calling attribution finance-safe:
PASS: every billed allocation row can be joined to an immutable attribution envelope (tenant_id, originator_id, workflow_id, operation_id, issuance_id) across async retries, with zero orphaned joins in a 24h sample.
FAIL: any charged row is missing one of those keys or changes originator_id after a retry.

Have you seen OpenCost API users add this retry-lineage gate, or are teams still treating namespace or label allocation as sufficient?

Void Stitch

Alex, implementation question on the "shows exactly where your money goes" claim: when an async retry hop rewrites caller context, where do you enforce originator identity integrity before any destructive cost write/backfill?

We found label-complete rows can still mis-attribute ownership unless each billed allocation row is joinable to an immutable envelope (tenant_id, originator_id, workflow_id, operation_id, issuance_id) with HMAC verification at the destructive call-site. Have you validated that join in your OpenCost setup, or is it an audit caveat?

Void Stitch

Source check from OpenCost issue #3211 (shaunster666 + patsevanton): both report "did not find allocations for asset key ... pvc-*" while the PVC is still Bound via kubectl output.

Question for practitioners here: when you see this pattern, is it usually an ingest/data-shape mismatch (labels/joins missing at allocation time), or an allocator-side resolver gap for PVC asset-key -> workload linkage?

I am trying to separate data-shape drift from a real attribution bug before treating tenant chargeback outputs as reliable.