Andrew Kew
AI's energy problem has a software fix. Most teams aren't using it.

Data centers will drive 40% of electricity demand growth through the end of the decade, according to Goldman Sachs. Most conversations about fixing that focus on chips, cooling, and renewable power contracts. There's a faster, cheaper intervention most teams haven't touched: how they process data.

Shifting AI workloads from batch processing to real-time streaming can meaningfully cut energy consumption — with no new hardware required.

"Systems no longer need to be sized for the worst-case burst capacity; they can scale dynamically in response to actual throughput."

What's actually wrong with batch

Batch processing is still the dominant model for data analysis — data accumulated, staged, then processed in scheduled bursts. That creates sharp compute spikes. Infrastructure has to be provisioned for peak load, which means capacity sitting idle between runs, cooling systems taxed during bursts, then quiet again until the next cycle.

It's the difference between flooring the accelerator from a standstill and cruising steadily on the highway. Same destination, very different fuel bill. Electricity prices jumped 6.9% last year. That math is getting worse.
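To make that math concrete, here's a back-of-the-envelope sketch. Every number in it is an illustrative assumption (core counts, per-core wattage, idle draw), not a measurement; the point is the shape of the gap, not the exact figures.

```python
# Back-of-the-envelope comparison. All numbers are illustrative assumptions,
# not measurements: real idle draw, autoscaling, and cooling overhead all
# shift the result.
peak_cores = 400          # cluster sized for the nightly batch burst
busy_hours = 1            # hours per day the batch actually runs
idle_fraction = 0.4       # idle servers still draw a big share of peak power
watts_per_core = 10       # rough per-core power figure, assumption

# Batch: full power during the burst, substantial idle draw the other 23 hours.
batch_kwh = (peak_cores * watts_per_core * busy_hours
             + peak_cores * watts_per_core * idle_fraction * (24 - busy_hours)) / 1000

# Streaming: the same daily volume spread over 24 hours needs far fewer cores,
# and they stay busy instead of idling.
stream_cores = peak_cores * busy_hours / 24
stream_kwh = stream_cores * watts_per_core * 24 / 1000

print(f"batch: {batch_kwh:.1f} kWh/day  streaming: {stream_kwh:.1f} kWh/day")
# batch: 40.8 kWh/day  streaming: 4.0 kWh/day
```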

Why streaming changes the energy profile

Streaming architectures — Apache Kafka, Apache Flink — process data continuously as it arrives. The compute load flattens. Instead of provisioning for worst-case bursts, you scale dynamically against actual throughput.

There are downstream gains too. Streaming typically cleans and deduplicates data in transit before it hits storage. Leaner data means lighter queries and less disk I/O — both energy-intensive operations. And because individual systems process independently in a decoupled event-driven setup, you avoid cascading compute loads that tightly integrated batch pipelines tend to generate.
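As a minimal sketch of that in-transit cleanup, here's the idea in plain Python, assuming simple dict events with an `id` field. In production this would be keyed state in Flink or Kafka Streams rather than an in-memory set, but the shape is the same.

```python
from typing import Iterable, Iterator

def clean_in_transit(events: Iterable[dict], seen_ids: set) -> Iterator[dict]:
    """Drop duplicates and malformed events before they ever hit storage."""
    for event in events:
        event_id = event.get("id")
        if event_id is None:          # malformed: no identifier, skip it
            continue
        if event_id in seen_ids:      # duplicate: already passed downstream
            continue
        seen_ids.add(event_id)
        yield event                   # only clean, unique events reach storage

# Every event filtered here is one less row to store, index, and query later.
raw = [{"id": 1, "v": 10}, {"id": 1, "v": 10}, {"v": 3}, {"id": 2, "v": 7}]
print(list(clean_in_transit(raw, set())))  # [{'id': 1, 'v': 10}, {'id': 2, 'v': 7}]
```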

Why AI workloads specifically benefit

AI agents need current data. Static datasets refreshed on batch cycles lead to stale context or force reprocessing. In many setups, the batch pipeline is the bottleneck — not the models themselves.

Streaming addresses both at once: your models get fresher inputs, and you shed the energy cost of reprocessing static datasets.

Where to start

Not everything needs to migrate at once. The strongest first candidate is preprocessing for AI workloads — drop a stream processor in front of your AI pipeline to filter, aggregate, and normalize data before it reaches the model. You get leaner inputs, lower GPU/CPU load, and a measurable energy reduction.
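Here's one way that first step could look using Kafka's Python client. The topic names, the `relevance` field, and the normalization rules are all assumptions for illustration; the pattern (consume raw events, filter and normalize, publish model-ready events to a separate topic) is the point.

```python
import json
from confluent_kafka import Consumer, Producer  # pip install confluent-kafka

# Hypothetical topics and schema: filter, normalize, and slim events down
# *before* they reach the model.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "ai-preprocessor",
    "auto.offset.reset": "earliest",
})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe(["raw-events"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        event = json.loads(msg.value())

        # Filter: drop events the model doesn't need (assumed relevance field).
        if event.get("relevance", 0.0) < 0.5:
            continue

        # Normalize: keep only the fields the model consumes.
        clean = {"text": event["text"].strip().lower(), "ts": event["ts"]}

        # Leaner, model-ready events go to the topic the AI pipeline reads.
        producer.produce("model-ready-events", value=json.dumps(clean))
        producer.poll(0)  # serve delivery callbacks
finally:
    producer.flush()
    consumer.close()
```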

From there, identify which batch jobs create the sharpest demand spikes and evaluate whether real-time processing is practical. The migration happens at the software layer — no new hardware, no waiting for power contracts.
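A quick way to find those candidates, sketched here with hypothetical monitoring data: compute each job's peak-to-average compute ratio and start with the spikiest.

```python
# Sketch: rank batch jobs by how spiky their compute demand is.
# job -> sampled CPU-core usage over a day (hypothetical monitoring export).
job_cpu_samples = {
    "nightly-etl":     [0, 0, 0, 380, 400, 0, 0, 0],
    "hourly-rollup":   [40, 45, 42, 50, 44, 41, 46, 43],
    "weekly-training": [0, 0, 0, 0, 0, 0, 600, 0],
}

def peak_to_average(samples: list[float]) -> float:
    avg = sum(samples) / len(samples)
    return max(samples) / avg if avg else float("inf")

# Highest ratio first: these jobs force the most over-provisioning and are
# the strongest candidates for a streaming rewrite.
ranked = sorted(
    ((job, peak_to_average(s)) for job, s in job_cpu_samples.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for job, ratio in ranked:
    print(f"{job}: peak/avg = {ratio:.1f}x")
```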

The hardware improvements are already underway. The software conversation is overdue.


Source: The New Stack — Warren Vella

✏️ Drafted with KewBot (AI), edited and approved by Drew.
