# Tian AI LLM Manager: Process Lifecycle on Mobile Devices
Running a 1.5B parameter LLM on a phone is hard. Keeping it alive is harder.
## The Problem
Termux (a terminal emulator for Android) kills background processes when the device goes to sleep or memory runs low. The LLM process (llama-server) is especially vulnerable because it holds 2-4 GB of RAM.
## The Solution: LLMManager
Tian AI includes a dedicated process manager that monitors and maintains the LLM process:
```python
class LLMManager:
    def __init__(self):
        self.health_check_interval = 30  # seconds
        self.max_restart_attempts = 5
        self.backoff_strategy = [1, 2, 4, 8, 15]  # seconds
```
## Key Features
### Health Checks
Every 30 seconds, the manager pings the llama-server `/health` endpoint. If there is no response within 5 seconds, it attempts recovery.
### Auto-Restart
If the process dies, LLMManager respawns it with the same parameters, making up to 5 restart attempts with increasing backoff delays between tries.
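The restart loop can be sketched as follows. The `spawn` and `is_healthy` callables are hypothetical stand-ins for the real manager's process-launch and health-check methods; the schedule matches the `backoff_strategy` shown above.

```python
import time

MAX_RESTART_ATTEMPTS = 5
BACKOFF_SECONDS = [1, 2, 4, 8, 15]  # the manager's backoff schedule

def restart_with_backoff(spawn, is_healthy):
    """Respawn the LLM process, waiting longer after each failed attempt.

    `spawn()` starts llama-server; `is_healthy()` probes its /health
    endpoint. Both are injected hooks, not real LLMManager method names.
    """
    for attempt in range(MAX_RESTART_ATTEMPTS):
        spawn()
        if is_healthy():
            return True
        time.sleep(BACKOFF_SECONDS[attempt])
    return False
```

Returning `False` after five failed attempts lets the caller surface an error to the user instead of retrying forever.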
### Resource Monitoring
Tracks memory usage and CPU load. If memory usage exceeds 80%, it triggers a soft restart with a reduced context window.
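On Linux (and therefore Termux on Android), system-wide memory pressure can be read from `/proc/meminfo` with no third-party dependencies. A minimal sketch, assuming the 80% threshold applies to overall RAM use (the real manager may instead track the llama-server process itself):

```python
SOFT_RESTART_THRESHOLD = 0.80  # the 80% trigger from the post

def memory_used_fraction() -> float:
    """Fraction of RAM in use, derived from /proc/meminfo (Linux/Android)."""
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":", 1)
            info[key] = int(value.split()[0])  # values are reported in kB
    return 1.0 - info["MemAvailable"] / info["MemTotal"]

def needs_soft_restart() -> bool:
    """True when memory pressure crosses the soft-restart threshold."""
    return memory_used_fraction() > SOFT_RESTART_THRESHOLD
```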
### Graceful Shutdown
When the user exits Tian AI, LLMManager sends SIGTERM to the LLM process, waits up to 10 seconds, then sends SIGKILL if needed.
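With Python's `subprocess` module, that SIGTERM-then-SIGKILL escalation is a few lines. A sketch, assuming the manager holds a `Popen` handle to llama-server:

```python
import subprocess

def shutdown(proc: subprocess.Popen, grace_period: float = 10.0) -> None:
    """Ask the process to exit cleanly; force-kill it if it hangs."""
    proc.terminate()  # SIGTERM: let llama-server close sockets and exit
    try:
        proc.wait(timeout=grace_period)
    except subprocess.TimeoutExpired:
        proc.kill()   # SIGKILL: the process ignored SIGTERM; force-stop it
        proc.wait()   # reap the process so no zombie is left behind
```

The final `wait()` matters on POSIX systems: without it, a killed child lingers as a zombie until the parent exits.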
## Implementation
```python
import requests

def check_health(port: int) -> bool:
    """Ping llama-server's /health endpoint; treat any error as unhealthy."""
    health_url = f"http://localhost:{port}/health"
    try:
        r = requests.get(health_url, timeout=5)
        return r.status_code == 200
    except requests.RequestException:
        return False
```
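The health check runs on a 30-second cadence in the background. One way to sketch that loop (the `check_health` and `recover` callables are hypothetical hooks standing in for the real LLMManager methods; a `threading.Event` doubles as both the timer and the stop signal):

```python
import threading

HEALTH_CHECK_INTERVAL = 30  # seconds, per the manager's config

def monitor_loop(check_health, recover, stop_event: threading.Event) -> None:
    """Ping the server every interval; attempt recovery on failure.

    `stop_event.wait()` sleeps for the interval but wakes immediately
    when the event is set, so shutdown is not delayed by up to 30 s.
    """
    while not stop_event.wait(HEALTH_CHECK_INTERVAL):
        if not check_health():
            recover()
```

Run it as `threading.Thread(target=monitor_loop, args=(...), daemon=True).start()` and set the event during graceful shutdown.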
## Why This Matters
Without LLMManager, the LLM process would die silently after a few minutes of inactivity. Users would have to manually restart it. With LLMManager, Tian AI "just works" — even on resource-constrained mobile devices.
This is the engineering that makes local AI practical.