EKS Metrics: Amazon Managed Prometheus vs Self-Managed Prometheus
Once your cluster is running workloads, you need a metrics backend: something that scrapes (or receives) time series, stores them, and powers dashboards and alerts. On AWS the fork is usually Amazon Managed Service for Prometheus (AMP)—a managed, Prometheus-compatible store—or self-managed Prometheus in the cluster (Helm chart, operator, or agent + remote storage).
This article is a practical decision guide for that choice on EKS, especially when you are not migrating a decade of PromQL dashboards and recording rules. It covers what each path optimizes for, how cost shows up on the invoice versus in engineer time, how alerting differs, and a short rubric for defaulting to AMP, self-managed, or a hybrid (remote_write).
1. Overview
This guide helps you decide, for a new or early EKS observability stack:
- What AMP and self-managed Prometheus each own (ingest, storage, query, alerting)
- How to compare cost at a high level (ingestion, storage, queries, and cluster resources)
- How to measure or estimate active series and ingestion before you pick a tier
- What you still run in-cluster either way (exporters, scrape config, ADOT)
- When a hybrid (Prometheus in-cluster → AMP via remote_write) is the least painful path
2. Prerequisites
- Familiarity with Prometheus concepts: scrape targets, labels, TSDB retention, PromQL
- An EKS cluster (Standard or Auto Mode) with a plan for who operates platform add-ons
- The current Amazon Managed Service for Prometheus pricing page and AMP documentation open for reference, since limits and pricing change over time
3. Name the two starting paths
Both paths speak Prometheus (PromQL, exposition format, alert rule semantics). The split is who runs the TSDB and query API.
Side-by-side
| | Amazon Managed Prometheus (AMP) | Self-managed Prometheus (in-cluster) |
|---|---|---|
| AWS runs | Ingestion pipeline, durable TSDB, query plane (serverless-style ops) | — |
| You run | Collectors/agents, scrape or receive config, IAM, workspace wiring | Prometheus server (often StatefulSet), PVCs, upgrades, HA, backups |
| Typical ingest | AWS Distro for OpenTelemetry (ADOT) collector, Prometheus agent mode, or remote_write from an in-cluster server | In-cluster scrape of ServiceMonitors, Pod annotations, static targets |
| Alerting | AMP managed Alertmanager and/or route to SNS; Grafana alerting is a separate choice | Alertmanager you deploy (often same Helm release or a sibling chart) |
| Dashboards | Often Amazon Managed Grafana or self-hosted Grafana with AMP as datasource | Grafana (or similar) pointing at in-cluster Prometheus Service |
| Multi-cluster | Natural fit: one workspace per env/region or federation patterns with less TSDB ops | Per-cluster Prometheus + optional Thanos/Mimir if you outgrow one server |
AMP — responsibility flow
+-------------------------------+
| EKS: workloads + exporters |
| (node-exporter, kube-state, |
| app /metrics, ADOT collector)|
+-------------------------------+
|
| remote_write / ADOT pipeline
v
+-------------------------------+
| AWS: AMP workspace (TSDB + |
| PromQL query API) |
+-------------------------------+
|
v
+-------------------------------+
| You: Grafana / AMP Alertmgr / |
| SNS routes, dashboards, SLOs |
+-------------------------------+
Self-managed — responsibility flow
+-------------------------------+
| EKS: Prometheus server |
| (scrape + TSDB on PVC/emptyDir)|
+-------------------------------+
|
v
+-------------------------------+
| EKS: Alertmanager (optional) |
| receivers → Slack/PagerDuty |
+-------------------------------+
|
v
+-------------------------------+
| You: Grafana, rule lifecycle, |
| upgrades, capacity, backups |
+-------------------------------+
Same for both: You still own what gets scraped, label cardinality, RBAC for the UI, and runbooks when alerts fire. Picking AMP does not remove the need for good metric and alert design.
4. Cost at a glance
Pricing moves; verify against AMP pricing and your EC2/EBS bill for self-managed. Think in three buckets: ingestion, storage/retention, queries (AMP) versus compute + disk + people (self-managed).
AMP (managed)
| Cost driver | What drives it up |
|---|---|
| Ingested samples | Short scrape intervals, high-cardinality labels, many targets |
| Storage | Long retention, high churn, many series |
| Queries | Heavy Grafana dashboards, ad-hoc PromQL, recording rules evaluated in AMP |
AMP can be cheap at small scale and expensive when cardinality explodes (unbounded pod labels, high-cardinality HTTP paths, per-UUID labels). Cost guardrails (sampling, relabel drops, allowed metric lists) matter more than on a single dev cluster where you only notice disk growth.
Self-managed (in-cluster)
| Cost driver | What drives it up |
|---|---|
| EC2 / Karpenter nodes | CPU/memory for Prometheus replicas and rule evaluation |
| EBS (or equivalent) | TSDB size × retention; compactions and WAL |
| Engineer time | Chart upgrades, PVC expansion, HA drills, backup/restore testing |
A minimal HA self-managed stack (two Prometheus replicas, anti-affinity, 20–50 Gi PVCs each, small Alertmanager) is often tens of dollars per month in AWS resources for a small cluster—before counting on-call and upgrade work. The invoice line is predictable; the hidden cost is operational.
Bottom line: AMP trades variable, usage-based spend for less TSDB operations. Self-managed trades fixed-ish infra cost for more control and more chores. Neither removes the need to design metrics carefully.
5. How to check and estimate your usage
Cost conversations are easier when you separate active series (how many distinct time series the TSDB holds) from ingestion rate (how many samples per second you push—what AMP bills most directly).
What counts as one series
A time series is one metric name plus one unique label set. Example: http_requests_total{method="GET",status="200",pod="abc"} is one series. More pods, paths, or IDs in labels → more series.
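To make the definition concrete, here is a minimal sketch that counts series in a /metrics payload. The helper names (sample_metrics, count_series, series_per_metric) and the sample values are invented for illustration; only the http_requests_total label set comes from the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for counting series in Prometheus text exposition output.

# Emit a small, made-up /metrics payload for demonstration.
sample_metrics() {
  cat <<'EOF'
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="GET",status="200",pod="abc"} 10
http_requests_total{method="GET",status="500",pod="abc"} 1
http_requests_total{method="GET",status="200",pod="def"} 7
# TYPE process_open_fds gauge
process_open_fds 12
EOF
}

# Each non-comment, non-blank line is one sample, so one series per scrape.
count_series() {
  grep -v -e '^#' -e '^[[:space:]]*$' | wc -l | tr -d ' '
}

# Series per metric name, largest first: strip labels and values, then tally.
series_per_metric() {
  grep -v -e '^#' -e '^[[:space:]]*$' \
    | sed -E 's/[{ ].*$//' \
    | sort | uniq -c | sort -rn \
    | awk '{print $2, $1}'
}

sample_metrics | count_series        # prints 4
sample_metrics | series_per_metric   # http_requests_total 3, then process_open_fds 1
```

Against a live target you could pipe the exporter output through the same helper, for example `curl -s http://localhost:8080/metrics | count_series`, where the URL is a placeholder for your own pod or port-forward.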
Rough scale tiers (planning only)
These are order-of-magnitude guides, not limits:
| Tier | Active series (ballpark) | Typical EKS picture |
|---|---|---|
| Small | ~1k–5k | Minimal scrape (little kube-state-metrics, no full cAdvisor), or aggressive metric_relabel_configs drops |
| Medium | ~10k–30k | node-exporter + kube-state-metrics + kubelet/cAdvisor + apiserver on a small cluster (handful of nodes, tens of pods) |
| Large | 50k+ | Full default chart scrape, many microservices, per-path HTTP metrics, or duplicate exporters |
A standard platform scrape (kube-state-metrics, kubernetes-nodes-cadvisor, apiserver, node-exporter) on a small cluster often lands above 10k even when it still feels “small” operationally—measure rather than assuming the 1k tier.
Turn series into ingestion (AMP-style math)
At a steady scrape interval, each active series produces about one sample per scrape:
samples_per_second ≈ active_series ÷ scrape_interval_seconds
Examples at 30s scrape:
| Active series | ~samples/s | ~samples/month (30 days) |
|---|---|---|
| 1,000 | ~33 | ~86M |
| 10,000 | ~333 | ~864M |
| 25,000 | ~833 | ~2.2B |
| 50,000 | ~1,667 | ~4.3B |
Use 86,400 × 30 seconds per month for back-of-envelope planning. If jobs use different intervals, sum per job or use a weighted average.
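The arithmetic behind the table can be sketched in a few lines of shell. The function names are illustrative, and the "series:interval" argument format for mixed intervals is an assumption made for this example. It uses integer math only, which is accurate enough for planning.

```shell
#!/usr/bin/env bash
# Back-of-envelope ingestion math; function names are illustrative.

# samples_per_second ACTIVE_SERIES SCRAPE_INTERVAL_SECONDS
# Rounds to the nearest whole sample per second.
samples_per_second() {
  local series=$1 interval=$2
  echo $(( (series + interval / 2) / interval ))
}

# samples_per_month ACTIVE_SERIES SCRAPE_INTERVAL_SECONDS
# Uses 86,400 seconds per day x 30 days.
samples_per_month() {
  local series=$1 interval=$2
  echo $(( series * 86400 * 30 / interval ))
}

# Mixed intervals: pass "series:interval" pairs and sum per job.
samples_per_month_mixed() {
  local total=0 pair
  for pair in "$@"; do
    total=$(( total + $(samples_per_month "${pair%%:*}" "${pair##*:}") ))
  done
  echo "$total"
}

samples_per_second 10000 30               # prints 333
samples_per_month 10000 30                # prints 864000000
samples_per_month_mixed 8000:30 2000:15   # prints 1036800000
```

The last call shows why summing per job matters: moving 2,000 of the 10,000 series to a 15s interval raises the monthly total from about 864M to about 1.04B samples.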
Back-of-envelope before deploy
List scrape jobs and estimate series per target:
| Source | How to estimate |
|---|---|
| node-exporter | ~400–1,000 series × node count (disks and NICs add labels) |
| kube-state-metrics | ~20–40 series × pod count, plus Deployments, PVCs, HPA, PDB, … |
| kubelet / cAdvisor (kubernetes-nodes-cadvisor) | Often the largest bucket; scales with containers, not just nodes |
| apiserver | Often thousands of series (histogram buckets and verb/resource labels) |
| Application /metrics | Highly variable; histograms and high-cardinality labels dominate |
active_series ≈ Σ (targets × series_per_target) # plus a little for Prometheus self-metrics
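A minimal sketch of that sum, assuming a hypothetical small cluster. The function name and the "name:targets:series_per_target" argument format are invented for this example. The per-target figures are picked from inside the ranges in the table, except the cAdvisor figure (40 series per container), which is purely an assumption because the table gives no number for it.

```shell
#!/usr/bin/env bash
# Hypothetical estimator: sums targets x series_per_target across scrape jobs.
# Arguments are "name:targets:series_per_target" triples.
estimate_active_series() {
  local total=0 item name rest targets per_target subtotal
  for item in "$@"; do
    name=${item%%:*}          # text before the first colon
    rest=${item#*:}           # "targets:series_per_target"
    targets=${rest%%:*}
    per_target=${rest##*:}
    subtotal=$(( targets * per_target ))
    total=$(( total + subtotal ))
    # Per-job breakdown goes to stderr so stdout carries only the total.
    printf '%-22s %6d\n' "$name" "$subtotal" >&2
  done
  echo "$total"
}

# 5 nodes, 60 pods, 120 containers (assumed), one apiserver target.
estimate_active_series \
  node-exporter:5:700 \
  kube-state-metrics:60:30 \
  cadvisor:120:40 \
  apiserver:1:3000
# total on stdout: 13100
```

Even with modest inputs the total lands above 10k, which is why measuring beats assuming the smallest tier.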
Churn (pods created and destroyed) adds new series each time a pod is replaced, so it inflates AMP storage beyond what the steady active-series count suggests; dynamic fleets need headroom.
How to check (self-managed Prometheus already running)
Port-forward to the Prometheus server (adjust namespace and service name to match your install):
kubectl -n prometheus port-forward svc/prometheus-server 9090:80
Headline count — TSDB stats API:
curl -s http://127.0.0.1:9090/api/v1/status/tsdb | jq '.data.headStats.numSeries'
Same value in the UI: Status → TSDB Stats. The metric prometheus_tsdb_head_series tracks it over time.
Top metrics by cardinality (can be expensive on very large TSDBs—use in non-prod or off-peak):
curl -sG 'http://127.0.0.1:9090/api/v1/query' \
--data-urlencode 'query=topk(20, count by (__name__) ({__name__=~".+"}))' \
| jq '.data.result[] | {metric: .metric.__name__, series: .value[1]}'
Top scrape jobs (find what to trim):
curl -sG 'http://127.0.0.1:9090/api/v1/query' \
--data-urlencode 'query=topk(10, count by (job) ({__name__=~".+"}))' \
| jq '.data.result[] | {job: .metric.job, series: .value[1]}'
In the UI: Status → Targets shows per-target health, last scrape time, and scrape duration. For samples per target, query the scrape_samples_scraped metric, which is useful when a single job spikes.
How to check (AMP)
- CloudWatch metrics on the AMP workspace (ingestion and active series—see Monitor AMP).
- AWS console → AMP workspace usage views for growth over time.
- If you use ADOT or remote_write, the sender (collector or in-cluster Prometheus) still exposes scrape stats—debug cardinality at the source before it hits the workspace.
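If you want the same per-job series count straight from the workspace, one option is a SigV4-signed PromQL query. The sketch below assumes the third-party awscurl tool and jq are installed, that your credentials allow aps:QueryMetrics, and that ws-EXAMPLE stands in for a real workspace ID; the function names are hypothetical. The count over every series can be expensive and may hit workspace query limits, so treat it as an occasional check.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for querying an AMP workspace with awscurl (SigV4).

# Build the PromQL instant-query URL for a workspace.
amp_query_url() {
  local region=$1 workspace_id=$2
  echo "https://aps-workspaces.${region}.amazonaws.com/workspaces/${workspace_id}/api/v1/query"
}

# Top 10 scrape jobs by series count, mirroring the in-cluster query.
amp_series_by_job() {
  local region=$1 workspace_id=$2 promql encoded
  promql='topk(10, count by (job) ({__name__=~".+"}))'
  # Percent-encode the query so "+" and quotes survive the form body.
  encoded=$(jq -rn --arg q "$promql" '$q | @uri')
  awscurl --service aps --region "$region" -X POST \
    -H 'Content-Type: application/x-www-form-urlencoded' \
    -d "query=${encoded}" \
    "$(amp_query_url "$region" "$workspace_id")" \
    | jq '.data.result[] | {job: .metric.job, series: .value[1]}'
}

# Example (needs AWS credentials and a real workspace ID):
#   amp_series_by_job us-east-1 ws-EXAMPLE
amp_query_url us-east-1 ws-EXAMPLE
```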
What usually blows up series count
- High-cardinality labels: pod, url, trace_id, unbounded user_id on high-volume metrics.
- Duplicate scrape: the EKS add-on node-exporter alongside a Helm-installed node-exporter, or two Prometheus replicas that each scrape and forward everything, even though AMP does not need two full copies of the data.
- Histograms: each bucket, plus _sum and _count, is its own series, so one logical metric becomes many.
- Full cAdvisor / apiserver defaults with no relabel drops.
After you have numSeries and the top two job values, you can map yourself to the small / medium / large table above and plug numbers into the samples per month formula for AMP pricing.
6. Who runs what—and what the first year feels like
Where the work lands
| Area | AMP | Self-managed |
|---|---|---|
| TSDB durability & scaling | AWS | You (PVC size, retention flags, compaction behavior) |
| Prometheus version upgrades | Managed compatibility window | You (Helm chart / operator upgrades, CRD drift) |
| Scrape discovery | You (collector config, EKS receivers, ServiceMonitor CRDs) | You (same; often more familiar with prometheus.yml in-cluster) |
| Recording / alerting rules | AMP rule groups or in-cluster evaluation + remote_write | Native serverFiles / PrometheusRule CRDs in Git |
| Long-term retention / global view | AMP + optional export; or Mimir/Thanos later | Add Thanos/Mimir/Cortex when one server is not enough |
| UI for debugging | Grafana → AMP; limited “SSH into Prometheus” | kubectl port-forward to :9090; fast for platform engineers |
AMP in practice
- Less toil on TSDB HA, backups, and “disk full” pages for the metrics store
- Strong fit for org-wide standards and IAM-bound workspaces
- Watch ingestion billing and cardinality; use ADOT/Prometheus relabeling deliberately
- Alertmanager behavior is AWS-shaped—read AMP Alertmanager docs before assuming OSS Alertmanager feature parity
Self-managed in practice
- Full control of scrape config, rule files, federation, and “break glass” PromQL on localhost
- Familiar path: prometheus-community/prometheus Helm chart, GitOps overlays per cluster
- Recurring work: chart bumps with Kubernetes upgrades, PVC growth, proving Alertmanager HA, securing the Prometheus UI (it has no built-in auth—use network policy, private ingress, or an auth proxy)
- EKS add-ons can cover node-exporter (managed addon) while you keep a narrow Prometheus release for the server and rules—avoid duplicating the same metrics twice
7. EKS-specific wiring (both paths)
Neither option removes in-cluster collection:
| Component | Typical role |
|---|---|
| Metrics server | HPA CPU/memory—not a Prometheus replacement |
| prometheus-node-exporter | Node/host metrics (DaemonSet or EKS managed addon) |
| kube-state-metrics | Kubernetes object metrics (Deployment, Pod, PVC, …) |
| CoreDNS / kubelet / apiserver | Cluster health; scrape config or ADOT receivers |
| Application /metrics | Your SLOs and business metrics |
AMP path: deploy ADOT (or Prometheus in agent mode) with remote_write to the AMP workspace endpoint; use EKS Pod Identity or IRSA for SigV4 authentication.
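Whichever sender you pick, it needs the workspace's remote_write URL. A minimal sketch for deriving it, assuming the AWS CLI is installed with permission for aps:DescribeWorkspace; the function names and the ws-EXAMPLE workspace ID are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for finding the remote_write URL of an AMP workspace.

# Look up the workspace endpoint via the AWS CLI.
amp_endpoint() {
  aws amp describe-workspace --workspace-id "$1" \
    --query 'workspace.prometheusEndpoint' --output text
}

# Append the remote_write path, tolerating a trailing slash on the endpoint.
remote_write_url() {
  local endpoint=${1%/}
  echo "${endpoint}/api/v1/remote_write"
}

# Example (needs AWS credentials and a real workspace ID):
#   remote_write_url "$(amp_endpoint ws-EXAMPLE)"
remote_write_url 'https://aps-workspaces.us-west-2.amazonaws.com/workspaces/ws-EXAMPLE/'
```

The resulting URL is what goes into the sender's remote_write configuration; the SigV4 credentials still come from EKS Pod Identity or IRSA as described above.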
Self-managed path: enable targets in Helm values or the Prometheus Operator; pin kube-state-metrics and exporters intentionally so you do not scrape the same series twice.
Hybrid (common): run a small in-cluster Prometheus for fast local debugging and federation-style rules, and remote_write only aggregates or critical series to AMP for org dashboards and long retention. You pay for the extra complexity once, but you do not pay double storage for every raw sample.
8. Alerting and Grafana
| Topic | AMP | Self-managed |
|---|---|---|
| Rule evaluation | AMP rule groups and/or in-cluster Prometheus firing → AMP | Prometheus alerting_rules.yml / PrometheusRule CRDs |
| Alert routing | AMP Alertmanager, SNS, EventBridge; fewer “random webhook” examples in AWS docs | Alertmanager receivers (Slack, PagerDuty, Opsgenie) in Git—with secrets via External Secrets |
| Grafana | Managed Grafana with AMP datasource is the path of least resistance | Platform Grafana in-cluster; datasource = Prometheus Service DNS |
| Double paging | Risk if both AMP rules and Grafana unified alerting fire on the same metric | Risk if both Prometheus and Grafana own the same alerts—pick one owner |
9. Decision rubric (greenfield)
Lean toward AMP when:
- You want no StatefulSet TSDB to babysit and are fine with usage-based metrics cost
- Centralized observability across many accounts/clusters matters soon
- You will standardize on ADOT / AWS observability patterns and Managed Grafana
- The team is small and should not own Prometheus compaction, PVC resize, and version skew
Lean toward self-managed Prometheus when:
- You need maximum control (custom scrape hooks, exotic service discovery, air-gapped patterns)
- Most engineers live in port-forward PromQL and Git-managed prometheus.yml / rules
- Predictable infra cost matters more than eliminating ops (small cluster, disciplined cardinality)
- You already run kube-prometheus-stack patterns and want one chart to own rules + Alertmanager
Lean toward hybrid when:
- You need in-cluster debugging and org-wide long retention or dashboards in AMP
- You are migrating: start self-managed, remote_write to AMP, cut over Grafana datasources, then shrink in-cluster retention
Default suggestion for a greenfield EKS platform team: if nobody wants to own TSDB operations, start with AMP + ADOT and Managed Grafana; if the team is building GitOps muscle on platform charts and wants the fastest “see /targets and /rules locally” loop, self-managed with a narrow chart (server + optional Alertmanager, exporters wired deliberately) is a solid teaching path. Revisit when cardinality, retention, or multi-cluster pain appears.
10. Troubleshooting: common misconceptions
- “AMP means we don’t run anything in the cluster.” You still run collectors/exporters and own cardinality.
- “Self-managed is always cheaper.” EBS + replicas can be modest; engineer time and incident cost often dominate.
- “We installed the prometheus-node-exporter EKS addon, so we have Prometheus.” The addon is node metrics, not the TSDB server.
- “Grafana alerting replaces Prometheus/Alertmanager.” It can—but two owners for the same alert is how you get double pages.
- “remote_write is free duplication.” It is not; you pay network, ingest, and often double evaluation unless you design what gets forwarded.
11. References
- What is Amazon Managed Service for Prometheus?
- Amazon Managed Service for Prometheus pricing
- Set up ingestion from EKS clusters
- AMP Alertmanager
- AWS Distro for OpenTelemetry
- Prometheus documentation
- prometheus-community/prometheus Helm chart
- Amazon Managed Grafana
- EKS managed add-ons (including prometheus-node-exporter)
- Monitor Amazon Managed Service for Prometheus