hiyoyo

Gemini Streaming in Rust + Tauri — Real-Time AI Responses Without the Jank

All tests run on an 8-year-old MacBook Air. All results from shipping 7 Mac apps as a solo developer. No sponsored opinion.

Waiting for a full AI response before showing anything feels broken in 2026. Streaming makes the UX feel instant. Here's how I implemented Gemini streaming in Rust with Tauri events.


The Approach

Gemini's streamGenerateContent endpoint returns server-sent events (SSE) when called with alt=sse. The flow:

  1. Rust backend opens an SSE connection to the Gemini API
  2. As chunks arrive, emit Tauri events to the frontend
  3. Frontend appends chunks to the UI in real time

No waiting. No spinner for 3 seconds. Text appears as it generates.


The Rust Side

use futures_util::StreamExt;
use tauri::{AppHandle, Emitter}; // Emitter provides handle.emit() in Tauri v2

#[tauri::command]
async fn stream_gemini(
    handle: AppHandle,
    prompt: String,
    api_key: String,
) -> Result<(), AppError> {
    let client = reqwest::Client::new();

    // Explicitly serialize body + set Content-Type header
    // Note: .json() can silently drop the body in Universal Binary release builds
    let body = serde_json::to_string(&serde_json::json!({
        "contents": [{"parts": [{"text": prompt}]}]
    }))
    .map_err(|e| AppError::Serialize(e.to_string()))?;

    let mut stream = client
        .post(format!(
            "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse&key={}",
            api_key
        ))
        .header("Content-Type", "application/json")
        .body(body)
        .send()
        .await?
        .bytes_stream();

    // Network chunks do not line up with SSE lines, so buffer the raw bytes
    // and only parse a line once its trailing newline has arrived
    let mut buf: Vec<u8> = Vec::new();

    'stream: while let Some(chunk) = stream.next().await {
        buf.extend_from_slice(&chunk?);

        while let Some(pos) = buf.iter().position(|&b| b == b'\n') {
            // Take one complete line (including the newline) out of the buffer
            let raw: Vec<u8> = buf.drain(..=pos).collect();
            let line = String::from_utf8_lossy(&raw);

            if let Some(data) = line.trim_end().strip_prefix("data: ") {
                // Defensive check for an end marker; the loop also exits
                // when the connection closes
                if data == "[DONE]" {
                    break 'stream; // leave the outer loop, not just this one
                }
                if let Ok(json) = serde_json::from_str::<serde_json::Value>(data) {
                    if let Some(content) = json
                        .pointer("/candidates/0/content/parts/0/text")
                        .and_then(|v| v.as_str())
                    {
                        handle.emit("gemini-chunk", content).ok();
                    }
                }
            }
        }
    }

    handle.emit("gemini-done", ()).ok();
    Ok(())
}

The Frontend Side

// Tauri v2 import path; on Tauri v1, invoke lives in '@tauri-apps/api/tauri'
import { invoke } from '@tauri-apps/api/core'
import { listen } from '@tauri-apps/api/event'

async function startStreaming(prompt: string, apiKey: string) {
    let response = ''

    // Append each chunk to the text shown so far
    const unlisten = await listen<string>('gemini-chunk', (event) => {
        response += event.payload
        setDisplayText(response)
    })

    // Success path: the backend emits gemini-done after the last chunk
    const unlistenDone = await listen('gemini-done', () => {
        unlisten()
        unlistenDone()
        setIsStreaming(false)
    })

    try {
        await invoke('stream_gemini', { prompt, apiKey })
    } catch (e) {
        // Clean up listeners on error too
        unlisten()
        unlistenDone()
        setIsStreaming(false)
        setError(String(e))
    }
}

The Gotchas

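SSE parsing is fragile. The data: prefix, blank lines between events, and [DONE] markers all need handling. A network chunk can also end in the middle of a line, so only parse a line once its newline has arrived. The code above covers the common case; add logging for malformed chunks in production.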
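
To make the chunk-boundary problem concrete, here is a minimal sketch of a line buffer that uses only the standard library. The name SseLineBuffer and its push method are hypothetical and not part of reqwest or Tauri. It also assumes each event carries a single data: line, which matches what the code in this post expects, and it skips SSE features such as multi-line data and event: fields.

```rust
/// Hypothetical helper: collects raw bytes from a streaming body and returns
/// the payload of every complete `data:` line, even if a line was split
/// across two network chunks.
#[derive(Default)]
pub struct SseLineBuffer {
    buf: Vec<u8>,
}

impl SseLineBuffer {
    pub fn new() -> Self {
        Self::default()
    }

    /// Append one chunk and return the payloads of all `data:` lines that
    /// are now complete. An unfinished trailing line stays in the buffer.
    pub fn push(&mut self, chunk: &[u8]) -> Vec<String> {
        self.buf.extend_from_slice(chunk);
        let mut payloads = Vec::new();

        while let Some(pos) = self.buf.iter().position(|&b| b == b'\n') {
            // Take one full line (including '\n') out of the buffer.
            let raw: Vec<u8> = self.buf.drain(..=pos).collect();
            // Decoding whole lines means a multi-byte UTF-8 character that
            // straddles two chunks is never decoded in halves.
            let text = String::from_utf8_lossy(&raw);
            let line = text.trim_end_matches(|c: char| c == '\n' || c == '\r');

            // Blank separator lines and any non-data lines are skipped.
            if let Some(data) = line.strip_prefix("data:") {
                // SSE allows one optional space after the colon.
                let data = data.strip_prefix(' ').unwrap_or(data);
                payloads.push(data.to_string());
            }
        }
        payloads
    }
}
```

In the streaming loop you would call push(&chunk) on each chunk and hand every returned payload to serde_json::from_str. Keeping the buffer as bytes rather than a String is a deliberate choice: the lossy UTF-8 decode then only ever sees complete lines.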

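Always clean up listeners. If the user navigates away mid-stream, orphaned listeners keep firing against dead state. The gemini-done handler and the catch block above cover the success and error paths. If the view can unmount while a stream is running, call both unlisten functions from its teardown hook as well.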

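Rate limits. Gemini can answer a streaming request with HTTP 429. The status code arrives with the response headers, before any chunk, so check it before you start reading the body and emit a dedicated error event:

// Bind the result of .send().await? to `response` and check it
// before calling .bytes_stream()
if response.status() == reqwest::StatusCode::TOO_MANY_REQUESTS {
    handle.emit("gemini-error", "Rate limit hit, please retry").ok();
    // Also signal completion so the frontend removes its listeners
    handle.emit("gemini-done", ()).ok();
    return Ok(());
}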
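
If you want to report more than the 429 case, one option is to turn the numeric status into a small enum before deciding what to emit. The sketch below is only an illustration: StatusAction, classify_status, and error_message are hypothetical names, and the message strings are placeholders. With reqwest, the numeric code is available from response.status().as_u16().

```rust
/// Hypothetical: what the command should do after seeing the HTTP status.
#[derive(Debug, PartialEq)]
pub enum StatusAction {
    /// 2xx: go ahead and read the SSE body.
    Stream,
    /// 429: ask the user to retry later.
    RateLimited,
    /// Any other status: report a generic failure with the code.
    Failed(u16),
}

/// Map a numeric HTTP status to the action above.
pub fn classify_status(status: u16) -> StatusAction {
    match status {
        200..=299 => StatusAction::Stream,
        429 => StatusAction::RateLimited,
        other => StatusAction::Failed(other),
    }
}

/// Text to send as the `gemini-error` payload, or None when streaming
/// can proceed.
pub fn error_message(action: &StatusAction) -> Option<String> {
    match action {
        StatusAction::Stream => None,
        StatusAction::RateLimited => Some("Rate limit hit, please retry".to_string()),
        StatusAction::Failed(code) => Some(format!("Request failed with HTTP {}", code)),
    }
}
```

Inside the command, you would emit the message as gemini-error whenever error_message returns Some, and start the streaming loop only when it returns None.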

Don't use .json() in Universal Binary release builds. On Universal Binary (Intel + Apple Silicon) DMG builds, I've seen reqwest's .json() silently drop the request body. Serialize the payload yourself and set the header explicitly instead: .header("Content-Type", "application/json").body(serde_json::to_string(&payload)?).


The Result

Streaming makes AI features feel responsive even on slow networks. The implementation is ~50 lines of Rust and ~20 lines of TypeScript.

Worth it for every AI feature in every app.


If this was useful, a ❤️ helps more than you'd think!

👉 HiyokoBar → https://hiyokomtp.lemonsqueezy.com/checkout/buy/f9b85321-6878-40aa-a472-ff748d6de2d5

X → @hiyoyok

TL;DR:

  • Use reqwest + futures_util::StreamExt to consume Gemini's SSE stream in Rust
  • Emit each chunk as a Tauri event → frontend appends in real time
  • Never use .json() in release builds — use explicit .body() + .header()
  • Always clean up event listeners on both success and error paths
  • Handle [DONE] and HTTP 429 explicitly