How I fixed silent Ollama failures in my local AI Assistant

#python #opensource #machinelearning #ollama

Neo-AI is an offline assistant with episodic memory. It runs entirely on-device using Ollama, SQLite, and LanceDB, with no cloud services and no data leaving the machine.

The Failure

While testing before a release, the CLI showed a generic "An error occurred." I opened the log file and found the real error:
ERROR: Failed to connect to Ollama. Please check that Ollama is downloaded, running and accessible.
Ollama was installed but not running. Neo had no way to start it automatically.
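
The log message bundles three different conditions: not downloaded, not running, and not accessible. Here is a minimal sketch of telling them apart before deciding to spawn anything, using only the standard library. The names port_open and diagnose_ollama are hypothetical and not part of Neo-AI. An open port only shows that something is listening on 11434, not that it is Ollama.

```python
import shutil
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers connection refused and timeouts
        return False


def diagnose_ollama(binary: str = "ollama",
                    host: str = "127.0.0.1",
                    port: int = 11434) -> str:
    """Classify the local setup as 'running', 'not_installed', or 'not_running'."""
    # Check the port first: a server may be reachable even if the
    # binary is not on PATH (for example, when it runs in a container).
    if port_open(host, port):
        return "running"
    if shutil.which(binary) is None:
        return "not_installed"
    return "not_running"
```

In the situation described above, diagnose_ollama() would presumably have returned "not_running", which is the only case where spawning ollama serve makes sense.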

The Naive Fix

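import subprocess
import time

import httpx
from rich.console import Console
from rich.text import Text

console = Console()  # assumed: Neo's console is a rich Console

def start_ollama():
    try:
        httpx.get("http://localhost:11434")
        return
    except Exception:
        console.print(Text("Starting Ollama server...", style="dim"))
        subprocess.Popen(["ollama", "serve"], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        time.sleep(2)  # fixed wait: a guess, not a readiness check
        console.print(Text("Ollama server started.", style="dim"))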

The health check is a GET request to a hardcoded address, localhost:11434. If Ollama responds, the function returns immediately. If not, it spawns ollama serve as a background process with subprocess.Popen, which returns right away instead of blocking Neo. stdout and stderr are discarded via DEVNULL so Ollama's internal logs don't pollute Neo's terminal.
The flaw: time.sleep(2) is a guess. On a cold WSL start, Ollama can take longer than two seconds, so Neo prints "Ollama server started." and proceeds before Ollama is actually ready.

The Better Fix

import subprocess
import time

import httpx
from rich.console import Console
from rich.text import Text

console = Console()  # assumed: Neo's console is a rich Console

def start_ollama():
    def is_running():
        # Any HTTP response within 2 seconds counts as "alive".
        try:
            httpx.get("http://localhost:11434", timeout=2)
            return True
        except Exception:
            return False
    if is_running():
        return
    console.print(Text("Starting Ollama server...", style="dim"))
    subprocess.Popen(["ollama", "serve"], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    for _ in range(10):  # poll up to 10 times, about once per second
        time.sleep(1)
        if is_running():
            console.print(Text("Ollama server started.", style="dim"))
            return
    raise RuntimeError("Ollama failed to start after 10 seconds.")

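Instead of sleeping blindly, the retry loop polls about once per second, up to 10 times. Neo only proceeds once Ollama answers the health check. Each probe has its own 2-second timeout, so the worst-case wait can run somewhat past 10 seconds. If Ollama never starts, a RuntimeError is raised and logged.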
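
The loop above counts attempts rather than elapsed time. One way to bound the total wait more tightly is a deadline based on time.monotonic. The sketch below is a generic helper under that assumption. The name wait_until and its parameters are hypothetical, not part of Neo-AI or httpx, and the clock and sleep arguments exist only to make it easy to test.

```python
import time
from typing import Callable


def wait_until(
    predicate: Callable[[], bool],
    timeout: float = 10.0,
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll predicate until it returns True or timeout seconds have elapsed.

    Returns True as soon as predicate() succeeds, False once the deadline
    has passed. The predicate is always tried at least once.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(interval, remaining))


# Possible use inside start_ollama(), with is_running as defined in the post:
#     if not wait_until(is_running, timeout=10):
#         raise RuntimeError("Ollama failed to start after 10 seconds.")
```

A monotonic clock is used because it is unaffected by system clock adjustments, which matters for timeouts. Note that a single slow predicate() call can still overrun the deadline by its own duration.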

Takeaway

Local LLM reliability is an engineering problem, not a hardware problem. Don't sleep — poll.
Neo-AI on GitHub