David Wenham
Top APM Tools in 2026: A Practical Guide for Engineering Teams

Choosing an APM tool in 2026 is harder than it looks. The category has matured, the number of credible options has grown, and the cost differences between vendors have become stark enough to matter at budget review time - often by a factor of 6–10x at scale. This guide covers tools across the spectrum - from lean startups to enterprise incumbents - with honest assessments of where each one earns its place and where it struggles.

If your team cares about infrastructure costs, data sovereignty, or moving to open standards like OpenTelemetry, the numbers in this guide will make the decision straightforward - self-hosted alternatives have made the cost gap concrete enough to act on. If you're evaluating purely on integration depth or enterprise relationships, the calculus is different.

Pricing in this space is notoriously opaque. Where possible, we've included approximate costs at 30TB/month ingestion as a common reference point. Actual bills vary significantly based on retention settings, user counts, feature add-ons, and negotiated contracts.

Pricing Methodology - 30TB/Month Scenario

Volume: 30TB/month - ~20TB logs, 7TB traces, 3TB metrics

Retention: 30 days across all signal types

Indexing: 30% of logs indexed (70% ingested to archive)

Hosts: 100 hosts (used where vendors charge per-host)

Users: 20 full-platform users (used where vendors charge per-seat)

Metrics: 500,000 active series

Add-ons: Core observability only - no security, profiling, or synthetics

Note: orgs at 30TB/month typically run 200–500 hosts; per-host vendor costs scale linearly.

Estimates are directional, based on public rate cards as of early 2026.

Vendor discounts and EDP commitments can significantly reduce SaaS costs.
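For teams that want to sanity-check the estimates below against their own volumes, the scenario can be written down in a few lines of Python. The figures are this guide's methodology assumptions, nothing vendor-specific:

```python
# The 30TB/month reference scenario from the methodology above.
# All values are this guide's stated assumptions, not vendor data.
SCENARIO = {
    "logs_gb": 20_000,              # ~20TB logs
    "traces_gb": 7_000,             # 7TB traces
    "metrics_gb": 3_000,            # 3TB metrics
    "retention_days": 30,           # across all signal types
    "indexed_log_fraction": 0.30,   # 30% of logs indexed
    "hosts": 100,
    "users": 20,
    "active_series": 500_000,
}

total_gb = SCENARIO["logs_gb"] + SCENARIO["traces_gb"] + SCENARIO["metrics_gb"]
print(total_gb)  # 30000 -> 30TB/month
```

Swap in your own volumes to re-run the per-vendor arithmetic that follows.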

What to Look for in an APM Tool in 2026

  • Predictable pricing at scale - surprise bills remain one of the top reasons teams switch tools. Multi-dimensional billing models compound unpredictably as you grow.

  • Full-stack observability - metrics, logs, and traces unified, not siloed

  • OpenTelemetry support - the industry is converging on OTel; proprietary agents create lock-in and unexpected cost when OTel metrics are reclassified as custom metrics

  • Data residency - increasingly non-negotiable in regulated industries (BFSI, healthcare, government). For most SaaS vendors this is a paid add-on or not available at all; for self-hosted platforms like CubeAPM it's guaranteed by architecture

  • Support quality - when production is down, how fast does your vendor respond?

1. CubeAPM

Best for: Cost-sensitive engineering teams, data-sovereign organizations, and teams actively migrating to open standards

Overview

CubeAPM is a self-hosted, OpenTelemetry-native observability platform covering APM, logs, infrastructure, Kubernetes, RUM, synthetic monitoring, Kafka monitoring, and error tracking - all in one system. It runs inside your own cloud or on-premises environment, so telemetry data never leaves your infrastructure.

Recognized as a High Performer in G2's Spring 2026 APM Grid Report and used by redBus (part of NASDAQ-listed MakeMyTrip, operating in 8+ countries across Asia and Latin America), Delhivery (\$3.5B valuation), Mamaearth (\$1.2B valuation), Policybazaar, Practo, and others - a mix of industries and scales that reflects broad applicability rather than a niche fit.

Key Features

  • Full-stack unified monitoring - APM, logs, infrastructure, Kubernetes, Kafka, RUM, synthetic monitoring, error tracking

  • OpenTelemetry-native from day one - no proprietary agents; compatible with existing OpenTelemetry, Elastic, New Relic, Datadog, and Prometheus agents, making migration incremental rather than a hard cutover

  • Self-hosted and BYOC deployment - data sovereignty by design

  • Unlimited data retention with no egress surprises

  • AI-based trace sampling - intelligently retains traces that matter while reducing storage overhead

  • Direct engineering support via shared channels - not a ticket queue
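To make the trace-sampling idea concrete, here is a toy tail-sampling policy in Python: error traces and slow traces are always kept, and a small random fraction of healthy traffic is retained for baselining. This is an illustrative sketch only, not CubeAPM's actual AI-based sampling algorithm; the field names and thresholds are made up for the example.

```python
import random

def keep_trace(trace, slow_ms=500, sample_rate=0.05):
    """Toy tail-based sampling decision (illustrative, not CubeAPM's logic).

    Keep every trace that errored or was slow; keep a small random
    sample of the rest so healthy-path baselines still exist.
    """
    if trace.get("error"):
        return True
    if trace.get("duration_ms", 0) >= slow_ms:
        return True
    return random.random() < sample_rate

traces = [
    {"error": True,  "duration_ms": 40},   # always kept: errored
    {"error": False, "duration_ms": 900},  # always kept: slow
    {"error": False, "duration_ms": 30},   # kept ~5% of the time
]
kept = [t for t in traces if keep_trace(t)]
```

The point of tail-based (rather than head-based) sampling is that the keep/drop decision happens after the trace completes, when its outcome is known.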

Pricing

Ingestion-based at \$0.15/GB with no per-host or per-seat fees.

At 30TB/month: ~\$5,100/month all-in

\$4,500/month license (\$0.15/GB × 30,000 GB) + ~\$600/month cloud infrastructure (\$0.02/GB covering compute + storage). Unlike SaaS vendors where infrastructure is bundled invisibly into the price, the total cost here is transparent and independently auditable.

Delhivery documented 75% savings after replacing three separate monitoring tools with CubeAPM. Mamaearth documented nearly 70% savings and completed migration in under an hour with zero downtime. redBus reported 4× faster dashboards and 50% faster MTTR. Multiple customers at petabyte-scale monthly ingestion have reported similar results.

The savings are structural: no per-host fee, no custom metrics overage, no log indexing surcharge, no retention cliff - all the billing dimensions that drive enterprise APM costs at scale are absent.
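The flat-rate arithmetic behind the \$5,100 figure is simple enough to check in a couple of lines (rates from the breakdown above):

```python
def cubeapm_monthly_cost(ingest_gb, license_per_gb=0.15, infra_per_gb=0.02):
    """Flat ingestion pricing: license fee plus self-hosted cloud
    infrastructure (compute + storage). No per-host or per-seat fees."""
    return round(ingest_gb * (license_per_gb + infra_per_gb), 2)

print(cubeapm_monthly_cost(30_000))  # 5100.0 -> ~$5,100/month all-in
```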

Pros

  • Consistently 70–75% lower cost than enterprise APM at scale

  • Complete data ownership - no telemetry leaves your infrastructure

  • Multi-agent compatible - works alongside Datadog, New Relic, Elastic, Prometheus, and OpenTelemetry agents; incremental migration, no re-instrumentation

  • Unlimited retention, predictable pricing, no egress charges

  • Engineering-level support that responds in minutes during incidents

  • AI-based trace sampling included

  • Fast onboarding - zero-downtime migration documented by multiple customers

Cons

  • Requires BYOC or on-premise deployment comfort

  • No autonomous anomaly detection (AI sampling ≠ full AIOps)

  • SSO/RBAC less mature than enterprise SaaS incumbents

2. Datadog

Best for: Cloud-native organizations that need the broadest possible integration ecosystem and have the budget to match

Overview

Datadog is the category leader by market capitalization (~\$40B) and integration depth. With 700+ integrations, a polished UI, and tight correlation between metrics, logs, traces, and security data, it's the default choice for many well-funded engineering teams. The trade-off is pricing complexity that requires careful architecture planning to predict accurately.

Key Features

  • Unified observability: metrics, logs, APM, RUM, synthetics, security, database monitoring

  • 700+ integrations

  • Watchdog AI for anomaly detection and root cause surfacing

  • Service maps and dependency tracking

  • Strong CI/CD and deployment tracking integration

Pricing

Datadog's billing is multi-dimensional. Charges span: hosts (per-host/month), custom metrics (per unique metric timeseries), log ingestion (\$0.10/GB) + log indexing (\$1.70/million log events for 15-day retention, ~\$2.50/million for 30-day retention), APM span volume, RUM sessions, and container overages.

A key pricing consideration: custom metrics. Metrics sent via OpenTelemetry or application code are often billed as custom metrics at up to \$5 per 100 per month beyond host allotment. At scale, custom metrics can constitute 30–52% of the total bill - a dimension that's easy to underestimate during evaluation.

Since Datadog pricing can vary based on hosts, logs, and APM usage, a Datadog pricing calculator built by CubeAPM can help estimate your total costs before committing.

At 30TB/month: ~\$30,000–\$45,000+/month

Breakdown (30% logs indexed): 100 hosts ~\$2,400 + log ingest 20TB ~\$2,000 + log indexing at 30% of 20TB ≈12B events at \$2.50/million ~\$30,000 + APM spans ~\$3,000–5,000 + custom metrics ~\$5,000+. Log indexing is the dominant cost driver.

Several third-party calculators exist for modeling Datadog bills at scale - worth using before committing to an annual contract.
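A rough model of the breakdown above, in Python. The per-host rate and the events-per-GB conversion are assumptions chosen to match the estimates in this section (roughly 0.5 KB per log event); real Datadog bills depend on tier and contract:

```python
def datadog_estimate(hosts=100, logs_gb=20_000, indexed_fraction=0.30,
                     events_per_gb=2_000_000,   # assumes ~0.5 KB/log event
                     host_rate=24.0,            # assumed blended $/host/month
                     ingest_per_gb=0.10,
                     index_per_million=2.50):   # 30-day retention tier
    host_cost = hosts * host_rate
    ingest_cost = logs_gb * ingest_per_gb
    indexed_events = logs_gb * indexed_fraction * events_per_gb
    index_cost = indexed_events / 1_000_000 * index_per_million
    return round(host_cost), round(ingest_cost), round(index_cost)

hosts, ingest, indexing = datadog_estimate()
print(hosts, ingest, indexing)  # 2400 2000 30000 -> indexing dominates
```

Note how the indexing line dwarfs raw ingestion - the 30% indexed fraction, not the 20TB itself, drives the bill.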

Pros

  • Best-in-class integration ecosystem

  • Tight metric/log/trace correlation out of the box

  • Watchdog AI proactively surfaces anomalies

  • Strong CI/CD, deployment, and security visibility

Cons

  • Billing complexity: host fees + custom metric overages + log indexing + per-feature charges combine unpredictably

  • OTel metrics are often billed as custom metrics - adds cost for teams adopting open standards instrumentation

  • No self-hosted option; data leaves your infrastructure (for teams where this is a hard requirement, self-hosted platforms like CubeAPM are worth evaluating before committing)

  • Retention is limited on standard tiers; longer retention adds cost

3. Dynatrace

Best for: Large enterprises needing automated root cause analysis and willing to commit to an annual contract

Overview

Dynatrace differentiates primarily through Davis AI, its causal AI engine that performs automated root cause analysis by correlating topology, dependencies, and performance data. It's available both as SaaS and as Dynatrace Managed - a full on-premises or BYOC deployment - making it one of the few enterprise APM vendors that supports true data residency.

Key Features

  • Davis AI: causal root cause analysis, not just anomaly detection

  • Automatic service discovery and full dependency mapping (Smartscape)

  • Full-stack monitoring: applications, infrastructure, Kubernetes, cloud services

  • Dynatrace Managed: self-hosted deployment for data-residency requirements

  • OneAgent for automated instrumentation; OTel support for traces, logs, and metrics

Pricing

Consumption-based via Dynatrace Platform Subscription (DPS), with an annual minimum commitment (~\$2,000/month minimum reported in practice). Rate cards: full-stack monitoring at \$0.08/hour per 8 GiB host, log ingest at \$0.20/GiB, log retention at \$0.0007/GiB-day. Hosts under 4 GiB RAM are billed at a 4 GiB minimum.

At 30TB/month: ~\$20,000–\$35,000+/month

Breakdown: 100 hosts (8 GiB each) × \$0.08/host-hour × 730 hrs ~\$5,800 + log ingest 20 TiB × \$0.20/GiB ~\$4,100 + log retention ~\$430 + traces/metrics/APM + annual commitment overhead.
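Running the listed rates directly in Python, treating each TB of logs as a TiB for simplicity and assuming every host fits the 8 GiB full-stack unit (the host line comes out slightly above the rounded breakdown when computed at exactly \$0.08/hour for a full month):

```python
GIB_PER_TB = 1024  # simplification: treat 1 TB as 1 TiB

def dynatrace_estimate(hosts=100, log_tb=20, hours=730, retained_days=30):
    fullstack = hosts * 0.08 * hours              # $0.08/hour per 8 GiB host
    log_gib = log_tb * GIB_PER_TB
    ingest = log_gib * 0.20                       # $0.20/GiB ingested
    retention = log_gib * retained_days * 0.0007  # $0.0007/GiB-day held
    return round(fullstack), round(ingest), round(retention)

print(dynatrace_estimate())  # (5840, 4096, 430) before traces/metrics/APM
```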

Pros

  • Best automated root cause analysis in the market

  • Davis AI reduces time-to-root-cause without manual correlation

  • Dynatrace Managed supports genuine data residency (unlike Datadog)

  • Automatic full-topology discovery - minimal manual configuration

  • Strong compliance and enterprise security features

Cons

  • High cost of ownership with mandatory annual commitment

  • Davis AI works best after a baselining period - new deployments don't get full value immediately

  • Heavy reliance on proprietary OneAgent; OTel support exists but is not the primary instrumentation path

  • 4 GiB minimum billing for small hosts - adds cost for lightweight container architectures

4. New Relic

Best for: Smaller to mid-size teams that want a broad platform with a free tier, or teams with predictable user count and data volumes

Overview

New Relic rebuilt its platform around NRDB (New Relic Database), a unified telemetry store that ingests metrics, events, logs, and traces. NRQL, its SQL-like query language, makes ad-hoc analysis accessible. The free tier (100GB/month + 1 full platform user) makes it the easiest entry point in this list.

Key Features

  • NRDB: unified telemetry database for metrics, events, logs, and traces

  • NRQL: SQL-like query language for custom analysis

  • Distributed tracing, service maps, browser and mobile monitoring

  • Free tier: 100 GB/month + 1 full platform user

  • User-based and compute-based pricing models available

Pricing

Two dimensions: data ingest (\$0.40/GB standard, \$0.60/GB for Data Plus with extended retention) + user fees (Core \$49/user/month; Full Platform \$99–\$349/user/month). A team of 20 engineers on Full Platform Pro adds \$1,980–\$6,980/month in user fees on top of data ingest costs.

At 30TB/month: ~\$20,000–\$25,000+/month

Breakdown: 30TB at \$0.40/GB ~\$12,000 + Data Plus for 90-day retention ~\$6,000 + 20 full-platform users ~\$2,000–\$7,000.
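The two cost axes from the breakdown above, sketched in Python. The \$99 Full Platform rate is the low end of the published per-user range:

```python
def newrelic_estimate(ingest_gb=30_000, users=20,
                      data_plus=True, user_rate=99.0):
    """Ingest (standard $0.40/GB, Data Plus $0.60/GB) plus per-user fees."""
    ingest = ingest_gb * (0.60 if data_plus else 0.40)
    seats = users * user_rate
    return round(ingest), round(seats)

print(newrelic_estimate())  # (18000, 1980) at the low end of user pricing
```

At \$349/user the seat line alone rises to nearly \$7,000/month, which is why the compute-based pricing option matters for larger teams.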

Pros

  • NRDB gives genuinely unified telemetry storage and flexible querying

  • 100 GB/month free tier - best in class for getting started

  • Compute-based pricing option available for large teams to avoid per-user costs

  • Strong developer experience; NRQL lowers barrier for custom analysis

Cons

  • User-based pricing adds a second cost axis that scales with team growth

  • Data retention only 8 days on standard; 90 days requires Data Plus at \$0.60/GB

  • Cost surprises from enabling new telemetry types without understanding ingest scope

  • No self-hosted option; all data in New Relic's cloud

5. Grafana Cloud (LGTM Stack)

Best for: Teams already running open-source observability, OTel-native shops, and engineers comfortable managing or funding a managed stack

Overview

Grafana Labs assembled the LGTM stack - Loki (logs), Grafana (dashboards), Tempo (traces), Mimir (metrics) - into a coherent observability platform. Grafana Cloud is the managed version. Self-hosted is free but operationally demanding. It's one of the most OTel-native options available.

Key Features

  • LGTM stack: Loki, Grafana, Tempo, Mimir

  • Full OpenTelemetry native support - no custom metrics penalty

  • Adaptive Metrics and Adaptive Logs to actively reduce unnecessary ingestion costs

  • Self-hosted (free) or Grafana Cloud (managed, usage-based)

  • 13-month metric retention; 30-day log/trace retention on Pro

Pricing

Grafana Cloud Pro: \$19/month base + usage. Logs: \$0.40/GB ingested + \$0.05/GB processing + \$0.10/GB/month retention = ~\$0.55/GB effective cost for 30-day retention. Traces and profiles: \$0.50/GB. Metrics: \$8 per 1,000 active series. Enterprise: minimum \$25,000/year commitment.

At 30TB/month (managed cloud): ~\$15,000–\$20,000+/month

Breakdown: 20TB logs at \$0.55/GB effective ~\$11,000 + 7TB traces at \$0.50/GB ~\$3,500 + 500K metric series ~\$4,000 + base. Adaptive Metrics/Logs features can reduce this materially.
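The Pro rate card above, applied to the 30TB scenario in Python. The effective \$0.55/GB log rate folds together ingest, processing, and 30-day retention:

```python
def grafana_cloud_estimate(logs_gb=20_000, traces_gb=7_000,
                           active_series=500_000):
    logs = logs_gb * 0.55                  # effective $/GB incl. 30-day retention
    traces = traces_gb * 0.50              # traces and profiles
    metrics = active_series / 1_000 * 8.0  # $8 per 1,000 active series
    return round(logs + traces + metrics + 19)  # + $19 Pro base

print(grafana_cloud_estimate())  # 18519 -> before Adaptive Metrics/Logs savings
```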

Pros

  • Fully OTel-native, no proprietary agents, no custom metrics penalty

  • Adaptive Metrics/Logs actively help reduce billing

  • Strong open-source community; highly customizable

  • Self-hosted path available for cost-driven teams with operational capacity

Cons

  • Self-hosting at 30TB scale requires dedicated SRE expertise to maintain and scale

  • Managed cloud costs approach Datadog/New Relic territory at high log volumes

  • No built-in AI/ML for anomaly detection; relies on community plugins

  • Grafana's APM story is less mature than purpose-built APM tools

6. Elastic APM

Best for: Teams already running the Elastic (ELK) stack for log management who want to add traces and APM without introducing another vendor

Overview

Elastic APM is the distributed tracing and application monitoring component of the Elastic Stack. For teams already indexing logs in Elasticsearch and visualizing in Kibana, adding APM is natural - the data lives in the same store and queries across logs and traces work natively.

Key Features

  • Native Elasticsearch integration: APM data correlates directly with existing log indices

  • OpenTelemetry compatible (OTel collector to Elasticsearch)

  • Machine learning-based anomaly detection via Elastic ML

  • Available self-hosted (free, open-source) or Elastic Cloud

  • Service maps and distributed tracing

Pricing

Self-hosted Elastic is open-source and free; you cover infrastructure. Elastic Cloud pricing is based on deployment configuration - compute resources, storage tiers (hot/warm/cold), replica count, and region - rather than a simple \$/GB ingest meter.

At 30TB/month (Elastic Cloud): ~\$8,000–\$15,000/month

Reference architecture: hot tier, 30-day retention, 1 replica. Elastic's calculator is the most reliable source for specific configurations.

Pros

  • Zero incremental cost if already running Elastic for logs

  • Strong log + trace correlation - same query interface for both

  • Self-hosted option keeps data on your infrastructure

  • ML-based anomaly detection included

Cons

  • Running Elasticsearch at 30TB/month scale requires significant infrastructure investment and operational expertise

  • APM experience is less polished than Datadog, Dynatrace, or purpose-built tools

  • Elastic's 2021 licensing change (SSPL) affects self-hosted deployments; teams with open-source compliance requirements should review the current license terms

  • Support for self-hosted is limited to paid Elastic subscriptions

7. Splunk Observability Cloud

Best for: Large enterprises with existing Splunk investments and substantial compliance requirements

Overview

Splunk Observability Cloud (built on the former SignalFx platform) offers full-fidelity distributed tracing with no sampling by default, plus infrastructure monitoring, AI-powered alerting, and deep integration with Splunk's security and log analytics portfolio. Note: Splunk Observability Cloud (APM/infrastructure) and Splunk Enterprise/SIEM (log analytics) are separate products with separate pricing, though they integrate.

Key Features

  • Full-fidelity distributed tracing (no default sampling)

  • AI-based alerting and anomaly detection

  • Deep Splunk SIEM and log analytics integration

  • Real-time stream processing for telemetry

  • Strong enterprise compliance story

Pricing

Splunk Observability Cloud starts at \$15/host/month billed annually for infrastructure monitoring, with APM and log analytics priced separately via enterprise contract.

At 30TB/month: ~\$35,000–\$60,000+/month

Estimated based on 100 hosts infrastructure monitoring ~\$1,500, APM per-host fees, log analytics volume, and enterprise contract minimums. Treat this as a floor, not a ceiling.

Pros

  • Full-fidelity traces - no sampling means no blind spots in high-cardinality environments

  • Best-in-class integration with Splunk Security for unified IT and security observability

  • AI-powered alerting with built-in noise reduction

Cons

  • Among the most expensive options at scale

  • Meaningful time investment to deploy and configure

  • Best value only for teams with existing Splunk investments; overkill otherwise

  • Significant vendor lock-in given the proprietary ecosystem

Cost Comparison at 30TB/Month Ingestion

| Tool | Est. Cost @ 30TB/mo | Pricing Model | OTel Native | Data Residency | Self-Hosted |
|------|---------------------|---------------|-------------|----------------|-------------|
| CubeAPM | ~\$5,100/mo all-in (\$4,500 license + \$600 infra) | \$0.15/GB flat | ✓ Native | ✓ Always | ✓ Yes |
| Elastic APM | ~\$8K–\$15K (cloud) | Deployment-based | ✓ Partial | ✓ If self-hosted | ✓ Yes |
| Grafana Cloud | ~\$15K–\$20K+ | Usage-based | ✓ Native | ✓ If self-hosted | ✓ Yes |
| New Relic | ~\$20K–\$25K+ | Ingest + per-user | Partial | ✗ SaaS only | ✗ No |
| Dynatrace | ~\$20K–\$35K+ | GiB-hour + commit | Partial | ✓ Managed option | ✓ Managed |
| Datadog | ~\$30K–\$45K+ | Host + feature-based | Partial* | ✗ SaaS only | ✗ No |
| Splunk | ~\$35K–\$60K+ | Host + enterprise | Partial | Limited | Limited |

* OTel metrics in Datadog are often billed as custom metrics. All estimates use the methodology assumptions above. Vendor discounts and EDP commitments can significantly reduce SaaS costs.

How to Choose the Best APM Tool in 2026

Choose CubeAPM if cost at scale, data residency, or open standards migration is a priority. At 30TB/month, the all-in cost of ~\$5,100 vs \$30K+ for enterprise SaaS makes the math concrete. Self-hosted architecture eliminates data sovereignty risk for regulated teams.

Choose Datadog if you need the widest integration coverage and your team has the budget and willingness to manage billing complexity. It earns its market leadership - but model your custom metrics costs before committing.

Choose Dynatrace if automated root cause analysis is the primary need and you're in a large enterprise environment. Davis AI is genuinely differentiated. Be prepared for the annual commitment and the baselining period before it pays off.

Choose New Relic if you're a smaller team that wants a broad platform and values the free tier to get started without upfront commitment.

Choose Grafana Cloud if you're OTel-first, want zero proprietary lock-in, and are comfortable either managing self-hosted infra or paying the managed cloud rate.

Choose Elastic APM if your team already runs ELK and wants to add distributed tracing without introducing a new vendor. Incremental cost can be near zero.

Choose Splunk if your organization already has a Splunk investment and needs unified IT and security observability under one contract.

Final Thoughts

The APM market in 2026 looks very different from five years ago. The incumbents still earn their position through ecosystem depth, AI maturity, and enterprise support infrastructure - and for the right teams, those advantages are real.

But the pricing gap between category leaders and newer platforms has grown large enough that it's no longer rational to default to Datadog or Dynatrace without explicitly running the numbers. For teams where observability costs have become a line item that finance asks about, the math has changed - newer self-hosted platforms show that full-stack observability doesn't have to cost 6–10x more. For teams with data residency requirements, the SaaS-only constraint of Datadog and New Relic is a genuine architectural limitation, not a preference. And for teams building OTel-first infrastructure, paying a custom metrics premium for adopting an open standard is a perverse incentive worth avoiding.

None of that makes the incumbents wrong for the right teams. It just means the decision deserves more deliberate analysis than it used to.

Keywords: best APM tools 2026, affordable APM, OpenTelemetry APM, self-hosted observability, Datadog alternative, New Relic alternative, application performance monitoring, observability platform, CubeAPM
