For a long time, batch pipelines were “good enough.”
Nightly jobs ran. Dashboards updated the next morning. Everyone learned to live with the lag.
But as data volumes grew — and expectations for freshness grew even faster — those tradeoffs stopped being acceptable.
I originally developed this material while preparing a talk for AWS Summit Los Angeles, and later refined it through conversations and feedback at the Portland AWS User Group. This post is the expanded, written version of that work — focused on what actually breaks in real systems, and how event-driven architectures help fix it.
Why Batch Pipelines Start to Break Down
Most teams don’t choose slow pipelines — they inherit them.
Over time, the same failure modes show up again and again:
- Slow feedback loops – Nightly batch jobs mean yesterday’s data drives today’s decisions.
- Manual orchestration – Scripts and human coordination introduce fragility.
- Duplicate or failed runs – No idempotency leads to wasted compute and inconsistent results.
- Missed or late events – Downstream teams lose trust when data silently disappears.
- Over-provisioned infrastructure – Jobs sized “just in case” drive unnecessary cost.
- Limited observability – It’s difficult to answer a basic question: Where is my data right now?
These were the exact pain points that kept coming up in conversations after both talks — and they’re strong signals that schedule-driven pipelines are being pushed past what they were designed to do.
What “Event-Driven” Really Means
At a high level, an event-driven pipeline reacts to something happening:
- A file lands in object storage
- An API request is received
- A message is published to a queue or stream
- A record arrives from an upstream system
Instead of polling on a fixed schedule, the pipeline starts the moment the event occurs.
This framing resonated strongly at both AWS Summit LA and the Portland AWS User Group:
> Stop asking "when should this run?"
>
> Start asking "what should trigger this?"
That shift alone simplifies architecture decisions and reduces wasted compute.
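To make the trigger side concrete, here is a minimal sketch assuming an S3 bucket configured to send object-created notifications directly to a Lambda function. The bucket wiring and the commented-out `process_object` call are illustrative assumptions, not a prescribed implementation.

```python
import json
import urllib.parse

def handler(event, context):
    """Lambda handler invoked by S3 ObjectCreated notifications.

    Assumes the bucket is configured to notify this function directly;
    names and the downstream call are illustrative only.
    """
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 delivers object keys URL-encoded (spaces arrive as '+'), so decode.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        # Processing starts the moment the object lands: no schedule, no polling.
        print(json.dumps({"msg": "new object landed", "bucket": bucket, "key": key}))
        # process_object(bucket, key)  # hypothetical downstream step
```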
Event Triggers & Routing: The Backbone
Modern AWS architectures give you multiple ways to capture and route events:
- Object storage events
- API-driven ingestion
- Message queues
- Streaming platforms
What matters most is decoupling producers from consumers.
This is where event routing becomes more than just plumbing. A centralized event bus allows you to:
- Filter noisy events
- Transform payloads
- Fan out to multiple consumers
- Make data flow explicit and observable
One point I emphasized heavily in the Portland AWS User Group talk is that routing is an architectural boundary. When done well, teams can evolve independently without coordinating deployments or breaking downstream consumers.
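As a rough sketch of what that boundary can look like, the snippet below uses boto3 to create an EventBridge rule that filters S3 "Object Created" events down to a single prefix and fans the matches out to two consumers. It assumes EventBridge notifications are enabled on the bucket; the rule name, ARNs, account, and region are placeholders, and the IAM and resource policies for the targets are configured separately.

```python
import json
import boto3

events = boto3.client("events")

# Filter noisy events: only object-created events under the raw/ prefix match,
# so downstream consumers never see anything else.
pattern = {
    "source": ["aws.s3"],
    "detail-type": ["Object Created"],
    "detail": {"object": {"key": [{"prefix": "raw/"}]}},
}

events.put_rule(
    Name="raw-objects-created",
    EventPattern=json.dumps(pattern),
    State="ENABLED",
)

# Fan out: the same matched event is delivered to more than one consumer.
# Target ARNs are placeholders for this sketch.
events.put_targets(
    Rule="raw-objects-created",
    Targets=[
        {"Id": "transform-queue", "Arn": "arn:aws:sqs:us-west-2:123456789012:transform-jobs"},
        {"Id": "audit-stream", "Arn": "arn:aws:firehose:us-west-2:123456789012:deliverystream/audit"},
    ],
)
```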
Workflow Orchestration Without Schedule Glue
Once an event is routed, something still needs to coordinate the work.
Depending on complexity, orchestration might involve:
- Lightweight coordination for simple pipelines
- Stateful workflows for multi-step transformations
- Long-running or dependency-heavy DAGs
- Request-driven data products
Airflow still plays an important role here — not as a time-based scheduler, but as a state coordinator.
This distinction landed particularly well at AWS Summit Los Angeles, where many teams were already using Airflow but struggling to move beyond cron-driven DAGs.
Transforming Data with Serverless ETL
Once data is flowing, transformation is where value is created.
A serverless ETL approach works especially well in event-driven systems because it:
- Scales automatically with demand
- Eliminates idle infrastructure
- Aligns cost with actual work performed
- Integrates cleanly with cataloged datasets
Common patterns include:
- Micro-batch processing as data lands
- Small-file compaction and partition optimization
- Deduplication and data quality enforcement
- Normalizing raw inputs into analytics-ready formats
These patterns consistently came up in follow-up discussions after both talks, especially from teams trying to reduce operational overhead without sacrificing data freshness.
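As one illustration of the micro-batch pattern, here is a sketch using awswrangler (the AWS SDK for pandas): read the object that just landed, enforce a couple of quality rules, deduplicate, and append partitioned Parquet registered in the Glue catalog. The paths, column names, and the database and table names are assumptions made for the example.

```python
import awswrangler as wr
import pandas as pd

def transform_object(bucket: str, key: str) -> int:
    """Micro-batch transform for a single landed object."""
    df = wr.s3.read_csv(f"s3://{bucket}/{key}")

    # Basic data-quality enforcement: required fields must be present.
    df = df.dropna(subset=["event_id", "event_time"])

    # Deduplicate on the natural key so replays don't double-count.
    df = df.drop_duplicates(subset=["event_id"])

    # Partition by event date so downstream queries can prune efficiently.
    df["dt"] = pd.to_datetime(df["event_time"]).dt.date.astype(str)

    wr.s3.to_parquet(
        df=df,
        path=f"s3://{bucket}/curated/events/",
        dataset=True,
        mode="append",
        partition_cols=["dt"],
        database="analytics",      # assumed Glue database
        table="events",            # assumed Glue table
    )
    return len(df)
```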
Resiliency Is Not Optional
In event-driven systems, failures don’t disappear — they become more visible.
That’s a good thing.
Resilient pipelines are built with:
- Retries at every execution boundary
- Idempotent processing to avoid duplicates
- Dead-letter queues for poison messages
- Buffering to absorb traffic spikes
- Clear failure paths instead of silent drops
This section generated some of the best questions at the Portland AWS User Group, particularly around how to design for failure without over-engineering.
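To show how small the idempotency piece can be, here is a sketch that records each event id in DynamoDB with a conditional write so redelivered messages become no-ops. The table name and key schema are assumptions; retries and dead-lettering would be handled by the queue configuration around this handler.

```python
import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.resource("dynamodb")
# Assumed table with a string partition key named "event_id".
processed = dynamodb.Table("processed-events")

def process_once(event_id: str, payload: dict) -> bool:
    """Returns True if the event was processed, False if it was a duplicate."""
    try:
        # The conditional write succeeds only the first time this event_id is
        # seen, so replays and redeliveries are safely skipped.
        processed.put_item(
            Item={"event_id": event_id},
            ConditionExpression="attribute_not_exists(event_id)",
        )
    except ClientError as err:
        if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return False  # duplicate delivery, skip
        raise  # any other failure should surface and trigger a retry

    # A fuller implementation would also track in-progress/completed status
    # (with a TTL) so a crash right after the marker write isn't stranded.
    do_work(payload)  # hypothetical side-effecting step
    return True
```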
Observability: Knowing Where Your Data Is
If you can’t answer “what’s happening right now?”, the pipeline isn’t finished.
Strong observability means:
- End-to-end visibility into pipeline state
- Metrics that surface lag and backlog
- Clear lineage from source to output
- The ability to trace a single event across services
Event-driven architectures make this easier — but only if observability is designed in from the start.
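One small example of designing it in: publish an end-to-end lag metric per event and emit a structured log line carrying a correlation id that travels with the event across services. The namespace, dimension, and field names here are assumptions for the sketch.

```python
import json
import time
import boto3

cloudwatch = boto3.client("cloudwatch")

def record_lag(pipeline: str, event_time_epoch: float, correlation_id: str) -> None:
    """Publish end-to-end lag for one event and log a traceable record."""
    lag_seconds = max(0.0, time.time() - event_time_epoch)

    # Surfaces lag and backlog as a queryable, alarmable metric.
    cloudwatch.put_metric_data(
        Namespace="DataPlatform/Pipelines",
        MetricData=[{
            "MetricName": "EventLagSeconds",
            "Dimensions": [{"Name": "Pipeline", "Value": pipeline}],
            "Value": lag_seconds,
            "Unit": "Seconds",
        }],
    )

    # Structured log line: the correlation_id lets you trace a single event
    # end to end across every service that logs it.
    print(json.dumps({
        "pipeline": pipeline,
        "correlation_id": correlation_id,
        "lag_seconds": round(lag_seconds, 3),
    }))
```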
Final Thoughts
This post reflects lessons learned not just from slides, but from real conversations — at AWS Summit Los Angeles, at the Portland AWS User Group, and with teams actively modernizing their data platforms.
Event-driven pipelines aren’t about chasing trends.
They’re about aligning your data systems with how the business actually operates — in real time, not yesterday.
When done well, they are:
- Faster
- More cost-efficient
- More resilient
- Easier to reason about at scale
And most importantly: they restore trust in the data.
If you attended either talk — or you’re tackling similar challenges — feel free to connect with me. I’m always happy to dig deeper into specific patterns, tradeoffs, or failure modes.