DEV Community

Ai developer

I Self-Hosted an AI Assistant: Lessons from 48 Hours of Debugging

I wanted a local AI assistant. Expected: 2 hours. Reality: 2 days of edge cases, broken dependencies, and discovering that "local" doesn't mean "free."

The Stack

  • OpenClaw (open-source AI assistant framework)
  • VPS with limited console access (had to file tickets to enable)
  • OpenRouter for model access
  • Local Qwen as fallback

What Broke

1. Dependency Hell

The pre-installed OpenClaw shipped with an outdated library. I updated it manually, then had to update it again: the OpenRouter integration only worked after the second update.

2. Certificate Issues

Self-hosted means self-managed certificates. Let's Encrypt, reverse proxy, CORS headers. Each layer adds a new failure mode.
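
Certificate expiry is one of these failure modes that is easy to watch for. A minimal sketch, assuming you already have the certificate's `notAfter` string (the format Python's `ssl.SSLSocket.getpeercert()` returns). The function names and the 30-day threshold are illustrative assumptions, not part of OpenClaw:

```python
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Return the days left before a certificate's notAfter timestamp.

    `not_after` uses the getpeercert() format, e.g. "Jan  5 09:34:43 2018 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)  # seconds since the epoch
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def needs_renewal(not_after: str, threshold_days: float = 30,
                  now: Optional[float] = None) -> bool:
    # threshold_days is an assumed policy value; tune it to your renewal job.
    return days_until_expiry(not_after, now) < threshold_days
```

Running a check like this from cron gives a warning before the reverse proxy starts serving an expired certificate.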

3. "Free" API Credits Aren't

OpenRouter's "free" models have limits. Hit them within hours. The API key died silently — no error message, just empty responses.
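
Silent empty responses can at least be turned into loud failures on the client side. A minimal sketch, assuming the OpenAI-style chat completion shape (`choices[0].message.content`) that OpenRouter's API follows. `extract_reply` and `EmptyCompletionError` are hypothetical names, and the sketch ignores tool-call responses, which may legitimately carry no text:

```python
class EmptyCompletionError(RuntimeError):
    """Raised when the API answers without usable text."""


def extract_reply(payload: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion payload.

    Raises instead of returning an empty string, so an exhausted quota or a
    dead key is not mistaken for a valid (empty) answer.
    """
    if "error" in payload:
        err = payload["error"]
        message = err.get("message", err) if isinstance(err, dict) else err
        raise EmptyCompletionError(f"API returned an error: {message}")

    choices = payload.get("choices") or []
    if not choices:
        raise EmptyCompletionError("response contains no choices")

    content = (choices[0].get("message") or {}).get("content")
    if not content or not content.strip():
        raise EmptyCompletionError("response contains an empty message")
    return content
```

Calling this on every parsed response means the assistant stops with a clear error instead of continuing with blank output.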

4. Local Model Reality Check

Qwen promised tool-use support. Reality:

  • Absolute paths broke tool calling (only relative paths worked)
  • The model had "amnesia": it couldn't open .md files it had created itself
  • Larger models need more RAM and run slower
  • 200K context window sounds great until you hit memory limits
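
One possible workaround for the path problem is to normalize every path to a workspace-relative form before it reaches the model or a tool. A minimal sketch, assuming all tools operate inside a single workspace directory. `to_relative` is a hypothetical helper, not OpenClaw's API:

```python
from pathlib import Path


def to_relative(path: str, workspace: str) -> str:
    """Rewrite `path` as relative to `workspace`.

    Raises ValueError if the path points outside the workspace.
    """
    root = Path(workspace).resolve()
    candidate = Path(path)
    if not candidate.is_absolute():
        candidate = root / candidate
    resolved = candidate.resolve()
    # relative_to() raises ValueError when resolved is not under root.
    return resolved.relative_to(root).as_posix()
```

Rejecting paths outside the workspace also keeps a confused model from wandering into the rest of the filesystem.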

5. The Debugging Cascade

Fix one thing → break another. I added skills for email and search, and DuckDuckGo API rate limits killed the search skill. I switched to an alternative and ran into new limits.
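
Rate limits are usually easier to live with when calls are retried with exponential backoff instead of failing on the first rejection. A minimal sketch, assuming the search skill raises a `RateLimited` exception when the provider rejects a request. Both that exception and `with_backoff` are hypothetical names, not part of any search API:

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error a search skill raises when it is rate-limited."""


def with_backoff(call: Callable[[], T], retries: int = 4, base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry `call` on RateLimited, doubling the wait each time plus jitter."""
    if retries < 0:
        raise ValueError("retries must be >= 0")
    for attempt in range(retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == retries:
                raise  # out of retries: surface the error instead of hiding it
            delay = base_delay * (2 ** attempt)
            # Jitter spreads retries out so parallel skills do not sync up.
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter exists only so the waiting can be replaced in tests. In normal use, `with_backoff(lambda: search(query))` is enough, where `search` stands for whatever function the skill calls.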

What Worked

Despite everything, the assistant is now running. Key insight:

Out-of-the-box solutions (the native Kimi and GLM APIs) are more reliable. But self-hosting teaches you how the pieces actually connect: tool calling, memory management, model routing, context windows.

The Real Cost

Item         | Expected | Actual
Setup time   | 2 hours  | 2 days
API costs    | $0       | $20+ before limits
Compute      | Minimal  | 16GB+ RAM for usable local models
Maintenance  | Zero     | Ongoing dependency updates
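
The compute row is easier to reason about with a back-of-the-envelope estimate of weights plus KV cache. A rough sketch using the standard formulas. The model dimensions in the usage note are illustrative placeholders, not Qwen's actual configuration, and real runtimes add overhead on top:

```python
def weights_bytes(n_params: float, bits_per_param: int) -> float:
    """Memory for the model weights at a given quantization."""
    return n_params * bits_per_param / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_value: int = 2) -> int:
    """Memory for the attention KV cache at full context length.

    The factor 2 is there because keys and values are both cached
    for every layer and every token.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_value
```

For a hypothetical 7B-parameter model at 4-bit quantization, `weights_bytes(7e9, 4)` gives 3.5 GB. With assumed dimensions of 32 layers, 8 KV heads, and a head size of 128 in 16-bit precision, `kv_cache_bytes(32, 8, 128, 200_000)` comes to about 26 GB for a full 200K-token context, which is well past a 16GB machine before the weights are even counted.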

Should You Self-Host?

Yes if:

  • You want to understand LLM infrastructure deeply
  • Data privacy is non-negotiable
  • You enjoy debugging more than using

No if:

  • You need reliability today
  • Your time has a cost
  • You're not ready to file support tickets for console access

What's Next

I'm keeping the local setup as a learning environment but routing production tasks to managed APIs. The hybrid approach: local for experimentation, cloud for reliability.
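
The hybrid split can be sketched as a small router. This is an illustration under assumptions, not the actual setup: `make_router`, the `production` flag, and the fall-back-to-local behavior are hypothetical choices, and each backend is just a function that takes a prompt and returns text:

```python
from typing import Callable

Backend = Callable[[str], str]


def make_router(local: Backend, managed: Backend) -> Callable[[str, bool], str]:
    """Build a router: experiments go local, production goes to the managed API."""

    def route(prompt: str, production: bool) -> str:
        if not production:
            return local(prompt)
        try:
            return managed(prompt)
        except Exception:
            # Assumed policy: degrade to the local model rather than fail.
            # A real setup would catch narrower errors and log the failure.
            return local(prompt)

    return route
```

Keeping the routing decision in one function makes it easy to change the policy later without touching the skills that call it.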


More self-hosting experiments and production AI infrastructure notes — follow my Telegram channel:

https://t.me/ai_tablet (Russian, technical)


