hiyoyo

Posted on Jun 7

Gemini Streaming in Rust + Tauri — Real-Time AI Responses Without the Jank

#tauri #rust #ai #programming

All tests run on an 8-year-old MacBook Air. All results from shipping 7 Mac apps as a solo developer. No sponsored opinion.

Waiting for a full AI response before showing anything feels broken in 2026. Streaming makes the UX feel instant. Here's how I implemented Gemini streaming in Rust with Tauri events.

The Approach

Gemini supports server-sent events (SSE) for streaming responses. The flow:

Rust backend opens an SSE connection to the Gemini API
As chunks arrive, emit Tauri events to the frontend
Frontend appends chunks to the UI in real time

No waiting. No spinner for 3 seconds. Text appears as it generates.

The Rust Side

use futures_util::StreamExt;
use tauri::{AppHandle, Emitter}; // Tauri v2: `emit` on AppHandle comes from the Emitter trait

#[tauri::command]
async fn stream_gemini(
    handle: AppHandle,
    prompt: String,
    api_key: String,
) -> Result<(), AppError> {
    let client = reqwest::Client::new();

    // Explicitly serialize body + set Content-Type header
    // Note: .json() can silently drop the body in Universal Binary release builds
    let body = serde_json::to_string(&serde_json::json!({
        "contents": [{"parts": [{"text": prompt}]}]
    }))
    .map_err(|e| AppError::Serialize(e.to_string()))?;

    let mut stream = client
        .post(format!(
            "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse&key={}",
            api_key
        ))
        .header("Content-Type", "application/json")
        .body(body)
        .send()
        .await?
        .bytes_stream();

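    // Network chunks do not line up with SSE lines, so buffer the bytes
    // and parse only complete lines.
    let mut buf: Vec<u8> = Vec::new();

    'stream: while let Some(chunk) = stream.next().await {
        buf.extend_from_slice(&chunk?);

        while let Some(pos) = buf.iter().position(|&b| b == b'\n') {
            let raw: Vec<u8> = buf.drain(..=pos).collect();
            let line = String::from_utf8_lossy(&raw);

            if let Some(data) = line.trim_end().strip_prefix("data: ") {
                if data == "[DONE]" {
                    // Leave the outer stream loop, not just the line loop
                    break 'stream;
                }
                if let Ok(json) = serde_json::from_str::<serde_json::Value>(data) {
                    if let Some(content) = json
                        .pointer("/candidates/0/content/parts/0/text")
                        .and_then(|v| v.as_str())
                    {
                        handle.emit("gemini-chunk", content).ok();
                    }
                }
            }
        }
    }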

    handle.emit("gemini-done", ()).ok();
    Ok(())
}

The Frontend Side

import { listen } from '@tauri-apps/api/event'
import { invoke } from '@tauri-apps/api/core' // Tauri v2 path; v1 uses '@tauri-apps/api/tauri'

async function startStreaming(prompt: string, apiKey: string) {
    let response = ''

    const unlisten = await listen<string>('gemini-chunk', (event) => {
        response += event.payload
        setDisplayText(response)
    })

    const unlistenDone = await listen('gemini-done', () => {
        unlisten()
        unlistenDone()
        setIsStreaming(false)
    })

    try {
        await invoke('stream_gemini', { prompt, apiKey })
    } catch (e) {
        // Clean up listeners on error too
        unlisten()
        unlistenDone()
        setIsStreaming(false)
        setError(String(e))
    }
}

The Gotchas

SSE parsing is fragile. The data: prefix, blank lines between events, and [DONE] markers all need handling, and a network chunk can end in the middle of a line, so only complete lines should be parsed. The code above handles the common case; add logging for malformed chunks in production.
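
One way to make the line handling less fragile is to buffer bytes until a newline shows up and only then parse. Here is a minimal sketch using just the standard library. SseLineBuffer and push are hypothetical names, not part of the code above or of any crate, and the sketch only extracts data: fields (no event: or id: fields, no joining of multi-line data).

```rust
/// Accumulates raw bytes and yields the payload of each complete `data:` line.
/// Hypothetical helper for illustration only.
#[derive(Default)]
struct SseLineBuffer {
    pending: Vec<u8>,
}

impl SseLineBuffer {
    /// Feed one network chunk; returns the payloads of the lines it completed.
    fn push(&mut self, chunk: &[u8]) -> Vec<String> {
        self.pending.extend_from_slice(chunk);
        let mut payloads = Vec::new();

        // Split only at b'\n'. That byte never occurs inside a multi-byte
        // UTF-8 sequence, so a complete line is always safe to decode.
        while let Some(pos) = self.pending.iter().position(|&b| b == b'\n') {
            let raw: Vec<u8> = self.pending.drain(..=pos).collect();
            let line = String::from_utf8_lossy(&raw);
            let line = line.trim_end_matches(|c: char| c == '\n' || c == '\r');

            // Blank separator lines, comments, and other fields are skipped.
            if let Some(data) = line.strip_prefix("data:") {
                // The space after the colon is optional in SSE.
                payloads.push(data.strip_prefix(' ').unwrap_or(data).to_string());
            }
        }
        payloads
    }
}
```

In the streaming loop, each chunk from bytes_stream() would go through push, and every returned payload would then be checked for [DONE] or parsed as JSON. Anything after the last newline stays in the buffer until the next chunk arrives.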

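Always clean up listeners. If the user navigates away mid-stream, orphaned listeners keep firing against dead state. The gemini-done handler and the catch block above cover the success and error paths; navigating away needs its own cleanup, for example calling both unlisten functions when the component unmounts.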

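Rate limits. Gemini can answer a streaming request with a 429. The status arrives with the response headers, so check it in Rust before reading the body and emit a dedicated error event. The frontend needs a matching gemini-error listener that also cleans up, because this path returns Ok and never emits gemini-done: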

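// Check the HTTP status before calling .bytes_stream()
// (`response` is the value returned by .send().await?)
if response.status() == reqwest::StatusCode::TOO_MANY_REQUESTS {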
    handle.emit("gemini-error", "Rate limit hit — please retry").ok();
    return Ok(());
}
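
If more than one status needs handling, the check could be pulled into a small function that decides which event to emit. This is only a sketch: ErrorEvent and classify_status are hypothetical names, and treating 5xx as retryable and everything else as final is an assumption, not documented Gemini behavior.

```rust
/// Event the Rust side would emit for a failed request (illustrative type).
#[derive(Debug, PartialEq)]
struct ErrorEvent {
    name: &'static str,
    message: String,
    retryable: bool,
}

/// Map an HTTP status code to an error event, or None when streaming can proceed.
fn classify_status(status: u16) -> Option<ErrorEvent> {
    match status {
        200..=299 => None,
        429 => Some(ErrorEvent {
            name: "gemini-error",
            message: "Rate limit hit, please retry".to_string(),
            retryable: true,
        }),
        // Assumption: server-side failures are worth one more attempt.
        500..=599 => Some(ErrorEvent {
            name: "gemini-error",
            message: format!("Server error ({}), please retry", status),
            retryable: true,
        }),
        // Assumption: anything else (bad key, bad request) will not fix itself.
        other => Some(ErrorEvent {
            name: "gemini-error",
            message: format!("Request failed with status {}", other),
            retryable: false,
        }),
    }
}
```

In the command, the result of classify_status(response.status().as_u16()) would be emitted and the function would return early whenever it is Some.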

Don't use .json() in release builds. On Universal Binary (Intel + Apple Silicon) DMG builds, reqwest's .json() can silently drop the request body. Always use explicit .header("Content-Type", "application/json").body(serde_json::to_string(&payload)?) instead.

The Result

Streaming makes AI features feel responsive even on slow networks. The implementation is ~50 lines of Rust and ~20 lines of TypeScript.

Worth it for every AI feature in every app.

Top comments (3)

hiyoyo • Jun 7

TL;DR:

Use reqwest + futures_util::StreamExt to consume Gemini's SSE stream in Rust
Emit each chunk as a Tauri event → frontend appends in real time
Never use .json() in release builds — use explicit .body() + .header()
Always clean up event listeners on both success and error paths
Handle [DONE] and 429 mid-stream explicitly

mote • Jun 10

That reqwest .json() silently dropping request bodies in Universal Binary builds is the kind of bug that makes you question your entire toolchain. I've hit similar cross-compilation quirks in Rust — the linker silently picks the wrong symbol and you get zero warnings. Maddening.

On the SSE parsing: have you tried the eventsource-stream crate instead of manual line-by-line parsing? Handles data: prefix, empty lines, and [DONE] correctly. Way less brittle.

One thing I'd add: backpressure. If the frontend can't keep up with chunks, Tauri events queue up in memory. A bounded channel with try_send on the Rust side would prevent OOM on slow UIs.

Question: any reason you chose Tauri events over a WebSocket from Rust to the webview? Events feel cleaner for this use case, but curious about your reasoning.

hiyoyo • Jun 10

Great points all around!

On Tauri events vs WebSocket — for a one-directional stream (Rust → frontend), Tauri events felt like the natural fit with zero setup overhead. WebSocket makes more sense for bidirectional communication, but for this use case it felt unnecessarily complex.

On eventsource-stream — I've actually used it before and you're right, it's far less brittle. I went with manual parsing first and it just stuck. Probably worth swapping in.

The backpressure point is something I hadn't considered — good catch. A bounded channel with try_send is a clean solution I'll look into.

Thanks for the thoughtful feedback!
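
The bounded channel with try_send raised in the comments above can be sketched with the standard library alone. Offer, offer_chunk, and drain_coalesced are hypothetical names. The sketch uses std::sync::mpsc::sync_channel, while an async Tauri command would more likely use an async channel, so treat it as an illustration of the idea rather than drop-in code.

```rust
use std::sync::mpsc::{Receiver, SyncSender, TrySendError};

/// Outcome of offering one chunk to the bounded queue.
#[derive(Debug, PartialEq)]
enum Offer {
    Queued,
    /// Queue was full; the chunk is handed back so no text is lost.
    Full(String),
    /// Receiver is gone, for example because the window closed.
    Closed,
}

/// Try to enqueue a chunk without blocking the network read loop.
fn offer_chunk(tx: &SyncSender<String>, chunk: String) -> Offer {
    match tx.try_send(chunk) {
        Ok(()) => Offer::Queued,
        Err(TrySendError::Full(chunk)) => Offer::Full(chunk),
        Err(TrySendError::Disconnected(_)) => Offer::Closed,
    }
}

/// Drain everything currently queued into one string, so the consumer can
/// forward a single coalesced update instead of many small events.
fn drain_coalesced(rx: &Receiver<String>) -> String {
    rx.try_iter().collect()
}
```

Because model output cannot simply be dropped, a Full result should hold the chunk and retry (or append it to the next one) rather than discard it. A Closed result is a reasonable signal to stop reading from Gemini altogether.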