DEV Community

Machine coding Master

Reactive is Dead: Build Low-Latency Voice Agents with OpenAI Realtime and JDK WebSockets


Building voice agents with OpenAI's Realtime API shouldn't require dragging the complexity of Spring WebFlux into your codebase. Thanks to virtual threads (final since JDK 21), we can drop reactive streams and build low-latency, bi-directional audio pipelines on the JDK's own java.net.http.WebSocket. That API is asynchronous under the hood, but on a virtual thread you can simply block on its futures and on plain queues, so the code reads as synchronous while a blocked thread costs very little, which lets one JVM hold a large number of concurrent sessions.

Why Most Developers Get This Wrong

  • Over-engineering with WebFlux: Devs default to Project Reactor (Flux/Mono) for audio streaming, resulting in unmaintainable stack traces and brutal debugging sessions when tracking frame drops.
  • Ignoring Thread-per-Connection Reality: They forget that the OpenAI Realtime API requires persistent, stateful, bi-directional WebSockets where audio chunks must be processed in strict chronological order.
  • Ignoring Backpressure on Audio: Reactive libraries often mask underlying buffer bloat, causing massive latency spikes in voice turn-taking.
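
To put a number on buffer bloat, here is a small sketch that converts queued bytes into milliseconds of lag. It assumes 16-bit mono PCM at 24 kHz, and the AudioLatency class and its method are made up for this illustration, so adjust both to whatever audio format your session actually uses.

```java
/** Hypothetical helper: how much audio (in time) is sitting in a buffer. */
final class AudioLatency {
    /**
     * Milliseconds of audio represented by `bytes` of 16-bit mono PCM.
     * Two bytes per sample, so one second is sampleRateHz * 2 bytes.
     */
    static long millis(long bytes, int sampleRateHz) {
        return bytes * 1000 / (sampleRateHz * 2L);
    }
}
```

Under those assumptions a 4,800-byte chunk is 100 ms of audio, so 50 chunks waiting in a buffer means the user hears the reply five seconds late. Capping the buffer at a handful of chunks caps the lag.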

The Right Way

The modern way to handle OpenAI's gpt-4o-realtime-preview is to pair native JDK WebSockets with virtual threads (final since JDK 21) and structured concurrency (still a preview API in JDK 21 through 25, so it needs --enable-preview).

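  • One Virtual Thread per Stream: Use java.net.http.HttpClient to open the WebSocket connection, have the WebSocket.Listener hand incoming audio frames to a queue, and let a virtual thread block on that queue. (java.net.http.WebSocket delivers frames through listener callbacks, so there is no blocking read call on the socket itself.)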
  • Structured Concurrency (StructuredTaskScope): Run the audio input (mic to OpenAI) and audio output (OpenAI to speaker) in a clean, scope-bound parent-child relationship.
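  • Plain Old Blocking Queues: Use a standard LinkedBlockingQueue for thread-safe audio chunk handoffs; no complex reactive operators needed. Give it a capacity, because the default constructor is unbounded and would hide exactly the buffer bloat described above.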
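
As a rough sketch of the listener-to-queue handoff, the class below reassembles fragmented text frames and puts each complete message on a bounded queue. QueueingListener, accept and messages are names invented for this example. It queues raw message strings only, so JSON parsing and audio decoding are left out.

```java
import java.net.http.WebSocket;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical listener: complete text messages go onto a bounded queue. */
final class QueueingListener implements WebSocket.Listener {
    private final BlockingQueue<String> messages;
    private final StringBuilder partial = new StringBuilder();

    QueueingListener(int capacity) {
        this.messages = new LinkedBlockingQueue<>(capacity);
    }

    /** Consumers block on this queue with take(). */
    BlockingQueue<String> messages() {
        return messages;
    }

    /** Appends one fragment; returns true once a complete message was queued. */
    boolean accept(CharSequence data, boolean last) {
        partial.append(data);
        if (!last) {
            return false; // more fragments of this message are still coming
        }
        String message = partial.toString();
        partial.setLength(0);
        try {
            messages.put(message); // blocks while the queue is full
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    @Override
    public CompletionStage<?> onText(WebSocket webSocket, CharSequence data, boolean last) {
        accept(data, last);
        webSocket.request(1); // ask for the next frame only after this one is stored
        return null;          // null: the CharSequence may be reclaimed right away
    }
}
```

Because request(1) is only called after put returns, a full queue stops the client from delivering further frames instead of letting them pile up. One caveat: put blocks whichever thread runs the callback, so this sketch assumes you are fine with that (for example by giving the HttpClient an executor backed by virtual threads).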

Show Me The Code

Here is how you orchestrate the bi-directional audio loops cleanly using Java's structured concurrency:

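import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.util.concurrent.StructuredTaskScope;

// ShutdownOnFailure is the JDK 21-24 preview API (run with --enable-preview); JDK 25 uses StructuredTaskScope.open() instead.
// API_KEY, listener, streamMicToWebSocket and streamWebSocketToSpeaker are defined elsewhere.
WebSocket ws = HttpClient.newHttpClient().newWebSocketBuilder()
    .header("Authorization", "Bearer " + API_KEY)
    .buildAsync(URI.create("wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"), listener)
    .join(); // block until the handshake completes

try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
    // Each fork runs on its own virtual thread, so both audio loops run concurrently
    scope.fork(() -> { streamMicToWebSocket(ws); return null; });
    scope.fork(() -> { streamWebSocketToSpeaker(listener.audioQueue()); return null; });

    // Wait for both loops; if either fails, the other is cancelled and the failure is rethrown
    scope.join().throwIfFailed();
} finally {
    ws.abort(); // release the connection whether the loops finished or failed
}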
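
The snippet above leaves streamMicToWebSocket undefined, so here is one way its core loop might look. MicPump, the END sentinel and the idea of feeding mic chunks through a BlockingQueue are all assumptions of this sketch. It also assumes the client sends audio as base64 inside an input_audio_buffer.append JSON event. The send step is passed in as a Consumer so the loop can be exercised without a live connection.

```java
import java.util.Base64;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

/** Hypothetical mic-to-socket loop: one chunk is fully sent before the next is taken. */
final class MicPump {
    /** Sentinel chunk telling the loop to stop (a convention invented for this sketch). */
    static final byte[] END = new byte[0];

    /** Wraps raw audio bytes in an input_audio_buffer.append event (assumed wire format). */
    static String appendEvent(byte[] pcm) {
        String b64 = Base64.getEncoder().encodeToString(pcm);
        return "{\"type\":\"input_audio_buffer.append\",\"audio\":\"" + b64 + "\"}";
    }

    /** Takes chunks in FIFO order, sends each one, and returns how many were sent. */
    static int pump(BlockingQueue<byte[]> mic, Consumer<String> send) {
        int sent = 0;
        try {
            for (byte[] chunk = mic.take(); chunk != END; chunk = mic.take()) {
                send.accept(appendEvent(chunk)); // returns only after the send step is done
                sent++;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the cancellation visible to the caller
        }
        return sent;
    }
}
```

Wired to the real socket, the call would be something like MicPump.pump(micQueue, json -> ws.sendText(json, true).join()). Blocking on join() for every chunk is what keeps frames in order, and it also avoids starting a new send while the previous one is still pending, which java.net.http.WebSocket does not allow.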

Key Takeaways

  • For I/O-bound streaming like this, reactive pipelines are no longer a prerequisite for scalability: virtual threads let plain blocking code handle many concurrent connections while staying far easier to read and debug.
  • OpenAI's Realtime API demands strict sequential audio frame delivery, which is trivial to guarantee with standard Java blocking queues and loops.
  • Keep your dependencies clean by relying on java.net.http.WebSocket instead of pulling in heavy, third-party Netty wrappers.

Shameless plug: javalld.com has full LLD implementations with step-by-step execution traces — free to use while prepping.
