# Tian AI LLM Manager: Process Lifecycle on Mobile Devices
Running a 1.5B parameter LLM on a phone is hard. Keeping it alive is harder.
## The Problem
Termux (a terminal emulator for Android) kills background processes when the device goes to sleep or memory runs low. The LLM process (llama-server) is especially vulnerable because it holds 2-4 GB of RAM.
## The Solution: LLMManager
Tian AI includes a dedicated process manager that monitors and maintains the LLM process:
```python
class LLMManager:
    def __init__(self):
        self.health_check_interval = 30  # seconds
        self.max_restart_attempts = 5
        self.backoff_strategy = [1, 2, 4, 8, 15]  # seconds
```
## Key Features
### Health Checks
Every 30 seconds, the manager pings the llama-server `/health` endpoint. If there is no response within 5 seconds, it attempts recovery.
### Auto-Restart
If the process dies, LLMManager respawns it with the same parameters, making up to 5 restart attempts with increasing backoff delays between tries.
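The restart loop can be sketched as follows. The `spawn` and `is_healthy` callables are hypothetical stand-ins for the real manager's process-launch and health-check methods; the schedule matches the `backoff_strategy` shown above.

```python
import time

MAX_RESTART_ATTEMPTS = 5
BACKOFF_SECONDS = [1, 2, 4, 8, 15]  # the manager's backoff schedule

def restart_with_backoff(spawn, is_healthy):
    """Respawn the LLM process, waiting longer after each failed attempt.

    `spawn()` starts llama-server; `is_healthy()` probes its /health
    endpoint. Both are injected hooks, not real LLMManager method names.
    """
    for attempt in range(MAX_RESTART_ATTEMPTS):
        spawn()
        if is_healthy():
            return True
        time.sleep(BACKOFF_SECONDS[attempt])
    return False
```

Returning `False` after five failed attempts lets the caller surface an error to the user instead of retrying forever.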
### Resource Monitoring
Tracks memory usage and CPU load. If memory usage exceeds 80%, it triggers a soft restart with a reduced context window.
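On Linux (and therefore Termux on Android), system-wide memory pressure can be read from `/proc/meminfo` with no third-party dependencies. A minimal sketch, assuming the 80% threshold applies to overall RAM use (the real manager may instead track the llama-server process itself):

```python
SOFT_RESTART_THRESHOLD = 0.80  # the 80% trigger from the post

def memory_used_fraction() -> float:
    """Fraction of RAM in use, derived from /proc/meminfo (Linux/Android)."""
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":", 1)
            info[key] = int(value.split()[0])  # values are reported in kB
    return 1.0 - info["MemAvailable"] / info["MemTotal"]

def needs_soft_restart() -> bool:
    """True when memory pressure crosses the soft-restart threshold."""
    return memory_used_fraction() > SOFT_RESTART_THRESHOLD
```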
### Graceful Shutdown
When the user exits Tian AI, LLMManager sends SIGTERM to the LLM process, waits up to 10 seconds, then sends SIGKILL if needed.
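With Python's `subprocess` module, that SIGTERM-then-SIGKILL escalation is a few lines. A sketch, assuming the manager holds a `Popen` handle to llama-server:

```python
import subprocess

def shutdown(proc: subprocess.Popen, grace_period: float = 10.0) -> None:
    """Ask the process to exit cleanly; force-kill it if it hangs."""
    proc.terminate()  # SIGTERM: let llama-server close sockets and exit
    try:
        proc.wait(timeout=grace_period)
    except subprocess.TimeoutExpired:
        proc.kill()   # SIGKILL: the process ignored SIGTERM; force-stop it
        proc.wait()   # reap the process so no zombie is left behind
```

The final `wait()` matters on POSIX systems: without it, a killed child lingers as a zombie until the parent exits.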
## Implementation
```python
import requests

def check_health(port: int) -> bool:
    """Ping llama-server's /health endpoint; treat any error as unhealthy."""
    health_url = f"http://localhost:{port}/health"
    try:
        r = requests.get(health_url, timeout=5)
        return r.status_code == 200
    except requests.RequestException:
        return False
```
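The health check runs on a 30-second cadence in the background. One way to sketch that loop (the `check_health` and `recover` callables are hypothetical hooks standing in for the real LLMManager methods; a `threading.Event` doubles as both the timer and the stop signal):

```python
import threading

HEALTH_CHECK_INTERVAL = 30  # seconds, per the manager's config

def monitor_loop(check_health, recover, stop_event: threading.Event) -> None:
    """Ping the server every interval; attempt recovery on failure.

    `stop_event.wait()` sleeps for the interval but wakes immediately
    when the event is set, so shutdown is not delayed by up to 30 s.
    """
    while not stop_event.wait(HEALTH_CHECK_INTERVAL):
        if not check_health():
            recover()
```

Run it as `threading.Thread(target=monitor_loop, args=(...), daemon=True).start()` and set the event during graceful shutdown.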
## Why This Matters
Without LLMManager, the LLM process would die silently after a few minutes of inactivity. Users would have to manually restart it. With LLMManager, Tian AI "just works" — even on resource-constrained mobile devices.
This is the engineering that makes local AI practical.