Handling Gemini API Errors Gracefully — 429, 503, and the Free Tier Reality

All tests run on an 8-year-old MacBook Air.

The Gemini free tier is generous. It's also shared with everyone else using the free tier.

At peak times, you'll hit 503s. Heavy usage will trigger 429s. If your app just shows "error" and dies, users blame your app — not Google's infrastructure.

Here's how I handle it gracefully in HiyokoLogcat.

The error codes you'll actually see

429 Too Many Requests: you've hit your rate limit. On the free tier I use, that means 15 requests per minute (RPM) and 1 million tokens per day. Quotas differ by model and change over time, so check the current limits for yours.

503 Service Unavailable: the API is overloaded. Usually transient. Retry after a short wait.

400 Bad Request: your prompt is malformed, or you've exceeded the per-request token limit. Don't retry; fix the request.
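
The three codes above call for three different reactions, and that decision can be sketched as a small pure function. This is a minimal sketch, not HiyokoLogcat's actual code: the `Action` enum and `action_for_status` are hypothetical names introduced here only for illustration.

```rust
/// What the caller should do after a given HTTP status.
/// Hypothetical type, introduced for illustration only.
#[derive(Debug, PartialEq)]
pub enum Action {
    /// 200: use the response.
    Proceed,
    /// 503: transient overload, a short wait and a retry is reasonable.
    RetrySoon,
    /// 429: rate limit reached, wait for the quota window to reset.
    WaitForQuota,
    /// 400: the request itself is wrong, retrying will not help.
    FixRequest,
    /// Anything else: hand the raw code back to the caller.
    Surface(u16),
}

/// Maps a status code to the reaction described in the list above.
pub fn action_for_status(status: u16) -> Action {
    match status {
        200 => Action::Proceed,
        503 => Action::RetrySoon,
        429 => Action::WaitForQuota,
        400 => Action::FixRequest,
        other => Action::Surface(other),
    }
}
```

Keeping the decision in a function with no network access makes it easy to unit-test without calling the API.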

The Rust handler

#[derive(Debug)]
pub enum GeminiError {
    RateLimit,           // 429
    ServiceUnavailable,  // 503
    BadRequest(String),  // 400
    Unknown(u16),
}

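// `send_request` and `parse_response` are the app's own helpers (not shown here).
// The success type is assumed to be the model's reply as a String.
pub async fn call_gemini(prompt: &str) -> Result<String, GeminiError> {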
    let response = send_request(prompt).await;

    match response.status().as_u16() {
        200 => Ok(parse_response(response).await),
        429 => Err(GeminiError::RateLimit),
        503 => Err(GeminiError::ServiceUnavailable),
        400 => Err(GeminiError::BadRequest(response.text().await.unwrap_or_default())),
        code => Err(GeminiError::Unknown(code)),
    }
}

User-facing messages in the UI

Technical error codes mean nothing to users. The frontend shows human messages:

const getErrorMessage = (error: string): string => {
  switch (error) {
    case 'RateLimit':
      return '🐥 Please wait a moment and try again (API rate limit reached)';
    case 'ServiceUnavailable':
      return '🐥 The Gemini servers are busy. Please wait a bit and retry';
    case 'BadRequest':
      return '🐥 Failed to analyze the log. Try selecting a different error line';
    default:
      return '🐥 Diagnosis failed. Please try again';
  }
};

Friendly, specific, tells the user what to do next. No raw error codes.
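
The frontend `switch` keys on strings such as 'RateLimit', so something on the Rust side has to turn a `GeminiError` into one of those keys. The post does not show that step. The following is one possible sketch, assuming a plain string key is what crosses the boundary: `error_key` is a hypothetical helper, and the enum is repeated only so the block compiles on its own.

```rust
#[derive(Debug)]
pub enum GeminiError {
    RateLimit,           // 429
    ServiceUnavailable,  // 503
    BadRequest(String),  // 400
    Unknown(u16),
}

/// Hypothetical helper: maps an error to the key the frontend switches on.
/// The BadRequest payload is deliberately dropped here, so the raw API
/// message never ends up in the user-facing string.
pub fn error_key(err: &GeminiError) -> &'static str {
    match err {
        GeminiError::RateLimit => "RateLimit",
        GeminiError::ServiceUnavailable => "ServiceUnavailable",
        GeminiError::BadRequest(_) => "BadRequest",
        // Not a case in the frontend switch, so it lands in `default`.
        GeminiError::Unknown(_) => "Unknown",
    }
}
```

The dropped `BadRequest` body is still worth writing to a log, since it is the quickest way to see why a request was rejected.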

Automatic retry for 503

Service unavailable is usually transient. Auto-retry once after 2 seconds:

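use std::time::Duration;

// Retries exactly once, and only for 503. Every other result passes through unchanged.
pub async fn call_gemini_with_retry(prompt: &str) -> Result<String, GeminiError> {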
    match call_gemini(prompt).await {
        Err(GeminiError::ServiceUnavailable) => {
            tokio::time::sleep(Duration::from_secs(2)).await;
            call_gemini(prompt).await
        }
        other => other,
    }
}

One retry only. If it fails twice, surface the error — don't loop forever.
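
The same "retry a fixed number of times, then give up" rule can also be written as a small generic helper, which makes the bound explicit and easy to test. This is a simplified, synchronous sketch that uses `std::thread::sleep` so it runs without an async runtime. `retry_transient` and its parameters are hypothetical names, and in async code `tokio::time::sleep` would take the place of the blocking sleep.

```rust
use std::thread::sleep;
use std::time::Duration;

#[derive(Debug, PartialEq)]
pub enum GeminiError {
    RateLimit,           // 429
    ServiceUnavailable,  // 503
    BadRequest(String),  // 400
    Unknown(u16),
}

/// Calls `op`, retrying up to `max_retries` more times, but only when the
/// failure is `ServiceUnavailable`. Any other result is returned at once.
pub fn retry_transient<T, F>(
    mut op: F,
    max_retries: u32,
    delay: Duration,
) -> Result<T, GeminiError>
where
    F: FnMut() -> Result<T, GeminiError>,
{
    let mut retries_left = max_retries;
    loop {
        match op() {
            // Transient failure with budget left: wait, then try again.
            Err(GeminiError::ServiceUnavailable) if retries_left > 0 => {
                retries_left -= 1;
                sleep(delay);
            }
            // Success, a non-retryable error, or the budget is spent.
            other => return other,
        }
    }
}
```

Calling it with `max_retries` set to 1 and a 2-second delay mirrors the behavior of `call_gemini_with_retry`: at most two attempts in total, and a 429 or 400 is never retried.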

The free tier is actually fine

For a developer tool used intermittently (click diagnose on an error, read, move on), 15 RPM is plenty. I've never hit the daily token limit in normal use.

The 503s happen occasionally during peak hours. With graceful handling, they're a minor annoyance rather than a broken feature.

HiyokoLogcat is free and open source → github.com/hiyoyok/HiyokoLogcat
X → @hiyoyok