Breaking Logging's Flywheel of Compromises

#logging #devops #observability #architecture

Authored by Mike Neville-O'Neill

Let's face it — logging is broken. Not just a little broken, but fundamentally misaligned with the needs of modern engineering teams. At a recent AWS Summit talk in London, Benoit Gaudin (our Head of Infrastructure) and I shared Bronto's vision for fixing this mess once and for all.

The Problem We're All Living In

If you're running any significant infrastructure today, you're probably stuck in what we call the "3C flywheel of compromises":

Cost — Logging at scale has become ridiculously expensive
Coverage — So you cut corners, dropping those infra logs and long-tail workflows
Complexity — And end up with a Frankenstein's monster of 5–8 different systems duct-taped together

This isn't just inefficient — it's actively harmful. Engineers end up building parallel solutions just to get basic visibility because the main tool is too limited, too slow, or too expensive.

Logs Matter More Than Ever

Logs aren't just a compliance checkbox anymore. They're your operational ground truth in the AI era.

They feed your LLMs. They power your agents. They're your audit trail, your RAG source, your behavioral training set. And one log message from an LLM-based system might contain 50–100 nested events in a single payload.

Try scaling that with a solution built before the separation of compute and storage was even a thing.

How We're Breaking the Cycle

Bronto was built to tackle this head-on with three non-negotiable capabilities:

Subsecond search on all logs — whether they're two seconds or two years old
Petabyte-scale retention — no infrastructure for you to manage
Completely different pricing — think cents per GB, not dollars

The platform is built natively on AWS (S3, Lambda, DynamoDB), but engineered so you don't have to deal with pipelines, pre-processing, or glue code.

Bronto's Architectural Advantage

The ingestion layer accepts data from standard sources (OpenTelemetry Collector, Fluentd, Fluent Bit) through HTTP endpoints, with AWS load balancers doing the heavy lifting. Data is buffered through Kafka (Amazon MSK), but then things diverge from the standard playbook.

Rather than following the traditional approach, data is consumed from Kafka and written to S3 in a proprietary format that borrows techniques from data analytics: data partitioning, Bloom filters, predicate pushdown, compression, and columnar storage. Metadata lives in DynamoDB for speed.
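To make two of those techniques concrete, here is a minimal Python sketch of how time partitioning and Bloom filters let a search skip data it never needs to read. This is not Bronto's proprietary format: the `BloomFilter`, `Partition`, and `search` names, the hourly partitioning, and the whitespace tokenization are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, token: str):
        # Derive several bit positions per token by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{token}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, token: str) -> None:
        for pos in self._positions(token):
            self.bits |= 1 << pos

    def might_contain(self, token: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(token))


@dataclass
class Partition:
    """Hypothetical stand-in for one stored block of log lines."""
    hour: int
    lines: list = field(default_factory=list)
    bloom: BloomFilter = field(default_factory=BloomFilter)

    def append(self, line: str) -> None:
        self.lines.append(line)
        for token in line.split():
            self.bloom.add(token)


def search(partitions, token, start_hour, end_hour):
    """Return (matching lines, number of partitions actually scanned)."""
    hits, scanned = [], 0
    for part in partitions:
        if not (start_hour <= part.hour <= end_hour):
            continue  # pruned by time partitioning
        if not part.bloom.might_contain(token):
            continue  # pruned by the Bloom filter
        scanned += 1
        hits.extend(line for line in part.lines if token in line.split())
    return hits, scanned
```

The point of the design is that both checks are cheap and run against small metadata, so most of the stored data is ruled out before anything is read. A Bloom filter can wrongly say "maybe" but never wrongly says "no", which is why the final exact match inside the partition is still needed.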

The real magic happens at search time. When you query through the UI or API, Lambda functions launch in parallel and process data directly from S3. No overprovisioning for big queries — horizontal scaling on demand, paying only while functions run.
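The fan-out pattern described here can be sketched in a few lines of Python. This is an illustration only, not Bronto's query engine: a local thread pool stands in for parallel Lambda invocations, in-memory lists stand in for objects in S3, and `scan_chunk` and `parallel_search` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_chunk(chunk, needle):
    """Worker: scan one chunk of log lines and return the matches.

    Stands in for a single short-lived function invocation reading
    one object from storage.
    """
    return [line for line in chunk if needle in line]


def parallel_search(chunks, needle, max_workers=8):
    """Fan out one task per chunk, then merge results in chunk order."""
    if not chunks:
        return []
    workers = min(max_workers, len(chunks))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the same order as the input chunks.
        per_chunk = pool.map(scan_chunk, chunks, [needle] * len(chunks))
        return [line for matches in per_chunk for line in matches]
```

Because each chunk is scanned independently, a bigger query simply means more workers running at once rather than a bigger cluster sitting idle between queries.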

This architecture is what enables both the performance (subsecond on terabytes, seconds on petabytes) and the pricing model. No expensive clusters running 24/7 — just cloud resources used exactly when and where they're needed.

Real Teams, Real Results

API-First Content Platform

This team runs a massive content delivery platform, serving APIs behind a global CDN for websites, mobile apps, and e-commerce systems. Every request hits their API with a unique key, so they need to trace errors, group by status codes, and export logs to their own customers.

Before Bronto

40TB monthly ingestion cap
30+ minute query times (when they worked at all)
Dashboards that routinely failed
Constant budget pressure

After Bronto

Boosted ingestion to 60TB monthly
Cut their logging bill in half
Complex multi-day queries now return in under a second
Built reliable log exports for their own customers

Their exact words? "Bronto changed our lives." A logging tool. Actually improving engineers' lives.

Global SaaS Project Management Platform

This company runs a suite of SaaS tools across distributed cloud services and product lines.

Before Bronto

Graylog for live logs
S3 for long-term storage
HAProxy logs dumped into S3 with gnarly Athena queries
A mix of Athena, Superset, and QuickSight for analytics
Just 1–2 days of retention across most systems

After Bronto

Everything centralized — HAProxy, Kubernetes, application logs, audit trails
Extended to 90-day hot retention
Real dashboards tracking error spikes, traffic anomalies, and app version drift
Engineers focused on product, not maintaining logging infrastructure

They went from managing logs to actually using them.

Logs as Your Secret Weapon

Your log data is massively undervalued — not because it lacks signal, but because current tooling hides that signal behind cost barriers, friction, and compromises.

Logs used to be a liability. With the right approach, they can be your secret weapon.

We're building Bronto to be for logging what Dyson was for vacuum cleaners, what the iPhone was for smartphones, and what Tesla was for electric cars: a complete reimagining of what's possible when you refuse to accept the status quo.

After all, when was the last time your logging tool made your life better instead of worse?
