Real-Time vs Batch: Why Live Sports Highlights Need a Different Architecture

#ai #systemdesign #architecture #performance

Most video processing is a batch job. You upload a file, a pipeline chews through it, and minutes or hours later you get an output. That model breaks completely when the goal is to publish a highlight while the match is still being played. Live sports highlight generation is one of the clearest examples of an AI workload where the architecture, not just the model, is the hard part.

The constraint that changes everything

In a batch pipeline, low latency is a convenience. In a live pipeline, latency is the product. If a goal goes in and the clip is not on social within a minute or two, the moment is gone. That single constraint forces a different design at every layer.

Streaming ingestion, not file uploads

A live system taps the broadcast over RTMP or HLS and processes it as a continuous stream, frame by frame (or segment by segment, in the case of HLS), rather than waiting for a finished file. You are running inference on an open-ended input with no end-of-file to wait for.
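To make the "no end-of-file" point concrete, here is a minimal sketch of a stream-shaped processing loop. It is not a real RTMP or HLS client: `live_frames` is a hypothetical stand-in for a decoder, frames are reduced to an index and a timestamp, and `score_frame` is an assumed per-frame scoring callback. The point is only the shape: an unbounded iterator in, results emitted as frames arrive, and bounded memory.

```python
from collections import deque
from itertools import count, islice
from typing import Callable, Iterable, Iterator, Tuple

Frame = Tuple[int, float]  # (frame index, timestamp in seconds); pixels omitted


def live_frames(fps: float = 25.0) -> Iterator[Frame]:
    """Hypothetical stand-in for a live decoder. It never ends on its own."""
    for i in count():
        yield i, i / fps


def process_stream(
    frames: Iterable[Frame],
    score_frame: Callable[[int], float],
    window_size: int = 50,
) -> Iterator[Tuple[float, float]]:
    """Score frames as they arrive and yield (timestamp, rolling mean score).

    Only the last `window_size` scores are kept, so memory stays bounded
    no matter how long the broadcast runs.
    """
    window = deque(maxlen=window_size)
    for index, ts in frames:
        window.append(score_frame(index))
        yield ts, sum(window) / len(window)


# The caller decides when to stop; the stream itself has no end.
first_results = list(
    islice(process_stream(live_frames(), lambda i: float(i % 2)), 4)
)
```

A batch job would read every frame and then compute. Here each result is available the moment its frame has been seen, which is the property the rest of the pipeline depends on.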

Detection has to be incremental

Batch detection can look at the whole game and pick the best moments in hindsight. A real-time detector has to decide, in the moment, whether what just happened is worth clipping, with no knowledge of what comes next. That is why the best systems fuse several signals (vision, audio, and live data) to raise confidence fast.
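One simple way to picture that kind of fusion is a weighted sum of per-signal scores with a firing threshold. The sketch below is an illustration under assumptions, not how any particular product does it: the `Signals` fields, the weights, the threshold, and the cooldown are all made-up values. What it does show is the causal constraint: `update` sees only the current signals and its own past state, never the future.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Signals:
    vision: float  # assumed: model confidence that a key event is on screen
    audio: float   # assumed: crowd or commentator excitement score in [0, 1]
    data: float    # assumed: 1.0 if a live data feed reported an event, else 0.0


# Illustrative weights and threshold; a real system would tune these.
WEIGHTS = {"vision": 0.5, "audio": 0.2, "data": 0.3}
THRESHOLD = 0.6


def fused_confidence(s: Signals) -> float:
    """Weighted sum of the three signal scores."""
    return (
        WEIGHTS["vision"] * s.vision
        + WEIGHTS["audio"] * s.audio
        + WEIGHTS["data"] * s.data
    )


class IncrementalDetector:
    """Decides per update whether to fire, using only past and present input."""

    def __init__(self, threshold: float = THRESHOLD, cooldown_s: float = 10.0):
        self.threshold = threshold
        self.cooldown_s = cooldown_s
        self._last_fire: Optional[float] = None

    def update(self, ts: float, signals: Signals) -> bool:
        # Suppress repeat fires for the same moment.
        if self._last_fire is not None and ts - self._last_fire < self.cooldown_s:
            return False
        if fused_confidence(signals) >= self.threshold:
            self._last_fire = ts
            return True
        return False


detector = IncrementalDetector()
vision_only = detector.update(0.0, Signals(vision=0.9, audio=0.1, data=0.0))
all_agree = detector.update(1.0, Signals(vision=0.9, audio=0.8, data=1.0))
```

With these assumed weights, a strong vision score alone stays under the threshold, while the same vision score backed by audio and a data event clears it immediately. That is the sense in which fusion raises confidence fast: no single signal has to be certain on its own.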

Assembly under a deadline

Once a moment fires, the clip has to be cut, padded, reframed to vertical, and delivered, all within the latency budget. There is no overnight render queue. This is where many architectures fall over: the model is fine, but the surrounding pipeline cannot keep up at scale when dozens of matches run at once.
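The cut, pad, and reframe steps reduce to a small amount of arithmetic, sketched below. Everything numeric here is an assumption for illustration: the pre-roll and post-roll padding, the 1920x1080 source, the 9:16 target, and the 60-second budget. `focus_x` is a hypothetical horizontal point of interest, for example where a tracker thinks the ball is. No actual encoding happens; the function only plans the clip.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipPlan:
    start_s: float  # clip start, seconds into the stream
    end_s: float    # clip end, seconds into the stream
    crop_x: int     # left edge of the vertical crop, in source pixels
    crop_w: int     # crop width, in source pixels
    crop_h: int     # crop height, in source pixels


def plan_clip(
    event_s: float,
    focus_x: int,
    src_w: int = 1920,
    src_h: int = 1080,
    pre_roll_s: float = 8.0,
    post_roll_s: float = 4.0,
) -> ClipPlan:
    """Pad the event in time and pick a 9:16 crop centered near focus_x."""
    start_s = max(0.0, event_s - pre_roll_s)
    end_s = event_s + post_roll_s
    # Full source height, 9:16 width, rounded down to an even pixel count
    # because many video encoders expect even dimensions.
    crop_h = src_h
    crop_w = (src_h * 9 // 16) // 2 * 2
    # Center on the focus point, then clamp so the crop stays inside the frame.
    crop_x = min(max(focus_x - crop_w // 2, 0), src_w - crop_w)
    return ClipPlan(start_s, end_s, crop_x, crop_w, crop_h)


def within_budget(event_s: float, published_s: float, budget_s: float = 60.0) -> bool:
    """True if the clip went out within the latency budget after the event."""
    return published_s - event_s <= budget_s


plan = plan_clip(event_s=125.0, focus_x=960)
on_time = within_budget(event_s=125.0, published_s=170.0)
```

Note that the post-roll is not free in a live setting: the clip cannot be cut until `end_s` has actually happened, so those seconds come straight out of the latency budget before any encoding or upload starts.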

Who runs this in production

Platforms like Zentag AI are built around exactly this real-time constraint: ingest a live RTMP or HLS stream, detect key moments as they happen, and generate reframed reels on the fly across 50+ sports. Adjacent tooling, from capture and production systems to data providers, sits around that core, but the real-time generation step, under a live latency budget, is the hard center.

Takeaway

If you are building anything that reacts to live video, the lesson from sports highlights generalizes: the model gets the headlines, but the latency budget and the streaming architecture determine whether it works. Batch thinking will quietly sink a real-time product.

More on real-time sports highlight automation at zentag.ai.