
Alex Neamtu

Posted on • Originally published at sendrec.eu

How We Fixed AI Summarization Timeouts for Self-Hosted Ollama

SendRec can generate automatic summaries and chapter markers for transcribed videos using any OpenAI-compatible API — Mistral AI, OpenAI, or a local Ollama instance. A user running Ollama on a basic NVIDIA GPU reported that the AI summarization was timing out before Ollama could finish generating a response.

The problem was a hardcoded 60-second HTTP timeout in the AI client.

The bug

The AIClient created its HTTP client with a fixed timeout:

func NewAIClient(baseURL, apiKey, model string) *AIClient {
    return &AIClient{
        baseURL: baseURL,
        apiKey:  apiKey,
        model:   model,
        httpClient: &http.Client{
            Timeout: 60 * time.Second,
        },
    }
}

A 60-second timeout is fine for cloud APIs like Mistral or OpenAI, where inference runs on fast hardware. But local inference with Ollama on a consumer GPU can take several minutes per request, especially with larger models or longer transcripts.

The fix

We added an AI_TIMEOUT environment variable that accepts Go's duration format (60s, 5m, 10m):

func NewAIClient(baseURL, apiKey, model string, timeout time.Duration) *AIClient {
    if timeout <= 0 {
        timeout = 60 * time.Second
    }
    return &AIClient{
        baseURL: baseURL,
        apiKey:  apiKey,
        model:   model,
        httpClient: &http.Client{
            Timeout: timeout,
        },
    }
}

On startup, the app parses the env var:

aiTimeout := 60 * time.Second
if v := os.Getenv("AI_TIMEOUT"); v != "" {
    if d, err := time.ParseDuration(v); err == nil {
        aiTimeout = d
    }
}
aiClient = video.NewAIClient(
    os.Getenv("AI_BASE_URL"),
    os.Getenv("AI_API_KEY"),
    getEnv("AI_MODEL", "mistral-small-latest"),
    aiTimeout,
)

The default stays at 60 seconds — existing deployments aren't affected. Self-hosters running Ollama can set AI_TIMEOUT=5m in their Docker Compose environment.

Configuration

The timeout applies to all AI providers. Here's the updated configuration for local Ollama:

environment:
  - AI_ENABLED=true
  - AI_BASE_URL=http://ollama:11434
  - AI_MODEL=llama3.2
  - AI_TIMEOUT=5m

The full list of AI environment variables is in the self-hosting guide.

First community bug report

This was SendRec's first bug report from a community user. From issue to deployed fix in under an hour — open source at its best.

Try it

The fix is live in v1.53.0. Self-hosters can pull the latest image and add AI_TIMEOUT to their environment. SendRec is open source (AGPL-3.0) — the AI client code is in ai_client.go.
