DEV Community

Diya

I opened my first PR to LiveKit's agents repo — here's the bug I found

I've been growing my open source portfolio one contribution at a time, and this week I landed on something genuinely interesting in livekit/agents (11k+ stars, the framework behind a ton of real-time voice AI agents).

The bug

If you're building a voice agent on a realtime model (OpenAI Realtime, xAI, Gemini Live), the model streams your transcription back in chunks. A single utterance can fire many user_input_transcribed events before it's final — token by token for OpenAI/xAI, or as one big interim blob for Gemini.

If you want to react exactly once per utterance (say, show a "user is typing" indicator on your frontend via RPC), you need a stable key to correlate all those interim events together.

That key already existed internally — InputTranscriptionCompleted carries an item_id. But when the framework re-emitted it upward as the public UserInputTranscribedEvent, the item_id was silently dropped — leaving consumers with no reliable way to dedupe across providers.
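To make the dedupe problem concrete, here is a minimal sketch of what a consumer could do once an item_id is available on every event. TranscriptEvent and UtteranceTracker are hypothetical stand-ins written for this example, not classes from livekit-agents; only the three fields named in this post are modeled, and the item_id values are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TranscriptEvent:
    """Stand-in for the public event (hypothetical, not the real class)."""
    transcript: str
    is_final: bool
    item_id: Optional[str] = None


class UtteranceTracker:
    """Hypothetical helper: flags the first event seen for each item_id."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def is_first_event(self, ev: TranscriptEvent) -> bool:
        # With no item_id there is nothing to correlate on, so the event is
        # treated as new. This is an assumption made for the sketch.
        if ev.item_id is None:
            return True
        if ev.item_id in self._seen:
            return False
        self._seen.add(ev.item_id)
        return True


tracker = UtteranceTracker()
events = [
    TranscriptEvent("hel", False, "item_1"),
    TranscriptEvent("hello", False, "item_1"),
    TranscriptEvent("hello there", True, "item_1"),
    TranscriptEvent("next", False, "item_2"),
]
# One flag per event: True only for the first event of each utterance.
first_flags = [tracker.is_first_event(ev) for ev in events]
```

The tracker keeps every id it has seen, so a long-running session would probably want to evict ids some time after the final event arrives.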

The fix

Small once you see it: add the field, forward it.

# 1. Add the field to the public event.
class UserInputTranscribedEvent(BaseModel):
    transcript: str
    is_final: bool
    item_id: str | None = None  # new
    ...

# 2. Forward it where the internal event is re-emitted.
def _on_input_audio_transcription_completed(self, ev: llm.InputTranscriptionCompleted) -> None:
    self._session._user_input_transcribed(
        UserInputTranscribedEvent(
            transcript=ev.transcript, is_final=ev.is_final, item_id=ev.item_id
        )
    )

Two files, about 10 lines of real change. The actual work was tracing the event from the realtime model layer, through AgentActivity, up to AgentSession, to find exactly where the field got swallowed.
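A field getting swallowed at a layer boundary is the kind of thing a tiny forwarding check can pin down. The sketch below uses plain dataclasses as stand-ins. InternalTranscription, PublicTranscription, and forward are hypothetical names for illustration, not the framework's real classes or the test in the PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InternalTranscription:
    """Stand-in for the model-layer event, which carries item_id."""
    item_id: str
    transcript: str
    is_final: bool


@dataclass
class PublicTranscription:
    """Stand-in for the public event, with the optional item_id field."""
    transcript: str
    is_final: bool
    item_id: Optional[str] = None


def forward(ev: InternalTranscription) -> PublicTranscription:
    # Copy every field the consumer needs, including the correlation key.
    return PublicTranscription(
        transcript=ev.transcript, is_final=ev.is_final, item_id=ev.item_id
    )


def test_forward_keeps_item_id() -> None:
    internal = InternalTranscription("item_42", "hello", True)
    public = forward(internal)
    assert public.item_id == internal.item_id
    assert public.transcript == "hello"
    assert public.is_final is True


test_forward_keeps_item_id()
```

If forward left out the item_id argument, the public event would fall back to None and the first assertion would fail, which is the symptom described above.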

The takeaway

I didn't need to understand all of livekit-agents to land this — just one event's lifecycle, end to end. Small, well-scoped issues are the most achievable way into a big codebase, especially when someone's already mapped the territory in the issue itself.

PR is up, CI green, waiting on review: github.com/livekit/agents/pull/6172

Top comments (1)

Armorer Labs

Nice catch. Stable event identity is one of those tiny fields that becomes huge once an agent is part of a larger workflow.

Without an id like this, every downstream system has to guess: is this a new user utterance, a partial update, a retry, or a duplicate? That gets painful fast when you start logging runs, replaying sessions, or correlating UI state with agent state.

Small patch, but it improves the operational surface of the framework a lot.