For a long time, batch pipelines were “good enough.”
Nightly jobs ran. Dashboards updated the next morning. Everyone learned to live with the lag.
But as data volumes grew — and expectations for freshness grew even faster — those tradeoffs stopped being acceptable.
I originally developed this material while preparing a talk for AWS Summit Los Angeles, and later refined it through conversations and feedback at the Portland AWS User Group. This post is the expanded, written version of that work — focused on what actually breaks in real systems, and how event-driven architectures help fix it.
Why Batch Pipelines Start to Break Down
Most teams don’t choose slow pipelines — they inherit them.
Over time, the same failure modes show up again and again:
- Slow feedback loops – Nightly batch jobs mean yesterday’s data drives today’s decisions.
- Manual orchestration – Scripts and human coordination introduce fragility.
- Duplicate or failed runs – No idempotency leads to wasted compute and inconsistent results.
- Missed or late events – Downstream teams lose trust when data silently disappears.
- Over-provisioned infrastructure – Jobs sized “just in case” drive unnecessary cost.
- Limited observability – It’s difficult to answer a basic question: Where is my data right now?
These were the exact pain points that kept coming up in conversations after both talks — and they’re strong signals that schedule-driven pipelines are being pushed past what they were designed to do.
What “Event-Driven” Really Means
At a high level, an event-driven pipeline reacts to something happening:
- A file lands in object storage
- An API request is received
- A message is published to a queue or stream
- A record arrives from an upstream system
Instead of polling on a fixed schedule, the pipeline starts the moment the event occurs.
This framing resonated strongly at both AWS Summit LA and the Portland AWS User Group:
> Stop asking "when should this run?"
>
> Start asking "what should trigger this?"
That shift alone simplifies architecture decisions and reduces wasted compute.
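To make the trigger side concrete, here is a minimal sketch assuming an S3 bucket configured to send object-created notifications directly to a Lambda function. The bucket wiring and the commented-out `process_object` call are illustrative assumptions, not a prescribed implementation.

```python
import json
import urllib.parse

def handler(event, context):
    """Lambda handler invoked by S3 ObjectCreated notifications.

    Assumes the bucket is configured to notify this function directly;
    names and the downstream call are illustrative only.
    """
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 delivers object keys URL-encoded (spaces arrive as '+'), so decode.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        # Processing starts the moment the object lands: no schedule, no polling.
        print(json.dumps({"msg": "new object landed", "bucket": bucket, "key": key}))
        # process_object(bucket, key)  # hypothetical downstream step
```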
Event Triggers & Routing: The Backbone
Modern AWS architectures give you multiple ways to capture and route events:
- Object storage events
- API-driven ingestion
- Message queues
- Streaming platforms
What matters most is decoupling producers from consumers.
This is where event routing becomes more than just plumbing. A centralized event bus allows you to:
- Filter noisy events
- Transform payloads
- Fan out to multiple consumers
- Make data flow explicit and observable
One point I emphasized heavily in the Portland AWS User Group talk is that routing is an architectural boundary. When done well, teams can evolve independently without coordinating deployments or breaking downstream consumers.
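As a rough sketch of what that boundary can look like, the snippet below uses boto3 to create an EventBridge rule that filters S3 "Object Created" events down to a single prefix and fans the matches out to two consumers. It assumes EventBridge notifications are enabled on the bucket; the rule name, ARNs, account, and region are placeholders, and the IAM and resource policies for the targets are configured separately.

```python
import json
import boto3

events = boto3.client("events")

# Filter noisy events: only object-created events under the raw/ prefix match,
# so downstream consumers never see anything else.
pattern = {
    "source": ["aws.s3"],
    "detail-type": ["Object Created"],
    "detail": {"object": {"key": [{"prefix": "raw/"}]}},
}

events.put_rule(
    Name="raw-objects-created",
    EventPattern=json.dumps(pattern),
    State="ENABLED",
)

# Fan out: the same matched event is delivered to more than one consumer.
# Target ARNs are placeholders for this sketch.
events.put_targets(
    Rule="raw-objects-created",
    Targets=[
        {"Id": "transform-queue", "Arn": "arn:aws:sqs:us-west-2:123456789012:transform-jobs"},
        {"Id": "audit-stream", "Arn": "arn:aws:firehose:us-west-2:123456789012:deliverystream/audit"},
    ],
)
```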
Workflow Orchestration Without Schedule Glue
Once an event is routed, something still needs to coordinate the work.
Depending on complexity, orchestration might involve:
- Lightweight coordination for simple pipelines
- Stateful workflows for multi-step transformations
- Long-running or dependency-heavy DAGs
- Request-driven data products
Airflow still plays an important role here — not as a time-based scheduler, but as a state coordinator.
This distinction landed particularly well at AWS Summit Los Angeles, where many teams were already using Airflow but struggling to move beyond cron-driven DAGs.
Transforming Data with Serverless ETL
Once data is flowing, transformation is where value is created.
A serverless ETL approach works especially well in event-driven systems because it:
- Scales automatically with demand
- Eliminates idle infrastructure
- Aligns cost with actual work performed
- Integrates cleanly with cataloged datasets
Common patterns include:
- Micro-batch processing as data lands
- Small-file compaction and partition optimization
- Deduplication and data quality enforcement
- Normalizing raw inputs into analytics-ready formats
These patterns consistently came up in follow-up discussions after both talks, especially from teams trying to reduce operational overhead without sacrificing data freshness.
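As one illustration of the micro-batch pattern, here is a sketch using awswrangler (the AWS SDK for pandas): read the object that just landed, enforce a couple of quality rules, deduplicate, and append partitioned Parquet registered in the Glue catalog. The paths, column names, and the database and table names are assumptions made for the example.

```python
import awswrangler as wr
import pandas as pd

def transform_object(bucket: str, key: str) -> int:
    """Micro-batch transform for a single landed object."""
    df = wr.s3.read_csv(f"s3://{bucket}/{key}")

    # Basic data-quality enforcement: required fields must be present.
    df = df.dropna(subset=["event_id", "event_time"])

    # Deduplicate on the natural key so replays don't double-count.
    df = df.drop_duplicates(subset=["event_id"])

    # Partition by event date so downstream queries can prune efficiently.
    df["dt"] = pd.to_datetime(df["event_time"]).dt.date.astype(str)

    wr.s3.to_parquet(
        df=df,
        path=f"s3://{bucket}/curated/events/",
        dataset=True,
        mode="append",
        partition_cols=["dt"],
        database="analytics",      # assumed Glue database
        table="events",            # assumed Glue table
    )
    return len(df)
```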
Resiliency Is Not Optional
In event-driven systems, failures don’t disappear — they become more visible.
That’s a good thing.
Resilient pipelines are built with:
- Retries at every execution boundary
- Idempotent processing to avoid duplicates
- Dead-letter queues for poison messages
- Buffering to absorb traffic spikes
- Clear failure paths instead of silent drops
This section generated some of the best questions at the Portland AWS User Group, particularly around how to design for failure without over-engineering.
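To show how small the idempotency piece can be, here is a sketch that records each event id in DynamoDB with a conditional write so redelivered messages become no-ops. The table name and key schema are assumptions; retries and dead-lettering would be handled by the queue configuration around this handler.

```python
import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.resource("dynamodb")
# Assumed table with a string partition key named "event_id".
processed = dynamodb.Table("processed-events")

def process_once(event_id: str, payload: dict) -> bool:
    """Returns True if the event was processed, False if it was a duplicate."""
    try:
        # The conditional write succeeds only the first time this event_id is
        # seen, so replays and redeliveries are safely skipped.
        processed.put_item(
            Item={"event_id": event_id},
            ConditionExpression="attribute_not_exists(event_id)",
        )
    except ClientError as err:
        if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return False  # duplicate delivery, skip
        raise  # any other failure should surface and trigger a retry

    # A fuller implementation would also track in-progress/completed status
    # (with a TTL) so a crash right after the marker write isn't stranded.
    do_work(payload)  # hypothetical side-effecting step
    return True
```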
Observability: Knowing Where Your Data Is
If you can’t answer “what’s happening right now?”, the pipeline isn’t finished.
Strong observability means:
- End-to-end visibility into pipeline state
- Metrics that surface lag and backlog
- Clear lineage from source to output
- The ability to trace a single event across services
Event-driven architectures make this easier — but only if observability is designed in from the start.
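One small example of designing it in: publish an end-to-end lag metric per event and emit a structured log line carrying a correlation id that travels with the event across services. The namespace, dimension, and field names here are assumptions for the sketch.

```python
import json
import time
import boto3

cloudwatch = boto3.client("cloudwatch")

def record_lag(pipeline: str, event_time_epoch: float, correlation_id: str) -> None:
    """Publish end-to-end lag for one event and log a traceable record."""
    lag_seconds = max(0.0, time.time() - event_time_epoch)

    # Surfaces lag and backlog as a queryable, alarmable metric.
    cloudwatch.put_metric_data(
        Namespace="DataPlatform/Pipelines",
        MetricData=[{
            "MetricName": "EventLagSeconds",
            "Dimensions": [{"Name": "Pipeline", "Value": pipeline}],
            "Value": lag_seconds,
            "Unit": "Seconds",
        }],
    )

    # Structured log line: the correlation_id lets you trace a single event
    # end to end across every service that logs it.
    print(json.dumps({
        "pipeline": pipeline,
        "correlation_id": correlation_id,
        "lag_seconds": round(lag_seconds, 3),
    }))
```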
Final Thoughts
This post reflects lessons learned not just from slides, but from real conversations — at AWS Summit Los Angeles, at the Portland AWS User Group, and with teams actively modernizing their data platforms.
Event-driven pipelines aren’t about chasing trends.
They’re about aligning your data systems with how the business actually operates — in real time, not yesterday.
When done well, they are:
- Faster
- More cost-efficient
- More resilient
- Easier to reason about at scale
And most importantly: they restore trust in the data.
If you attended either talk — or you’re tackling similar challenges — feel free to connect with me. I’m always happy to dig deeper into specific patterns, tradeoffs, or failure modes.