DEV Community

RagLeap
RagLeap

Posted on

How I Built a Multilingual AI Call Center on a 4GB VPS Using Django, Neo4j, and Twilio

I spent 18 months building RagLeap. Here's the full technical breakdown of how I got a production RAG system with voice, WhatsApp, and Telegram running on a $8/month VPS.
The stack
Backend: Django 4.2 + DRF
Vector DB: pgvector (PostgreSQL)
Graph DB: Neo4j (included on all plans, even free)
Queue: Celery + Redis
Voice: Twilio + ElevenLabs TTS
Messaging: Twilio WhatsApp, Telegram Bot API, Discord
AI: Any provider (OpenAI, Gemini, Anthropic, Mistral, private)
Server: 4GB RAM VPS (Ubuntu 24)
The RAG architecture
Standard vector search gets you ~78% retrieval accuracy. That's not good enough for business-critical answers.
I combined pgvector with Neo4j knowledge graphs for hybrid retrieval:
python# Hybrid retrieval: vector (75%) + graph (25%)
result = hybrid_retrieval.search(
query=user_question,
workspace_id=workspace_id,
vector_weight=0.75,
graph_weight=0.25
)
The graph stores entity relationships extracted from documents. When a customer asks "What's included in the Pro plan?", the graph knows that Pro → includes → Feature X → requires → Setup Y. Vector search alone misses these hops.
Result: 94.3% retrieval accuracy.
The multilingual pipeline
No hardcoded translations. Every response is generated by the LLM in the target language:
pythonrag_result = orchestrator.execute_rag(
query=user_message,
language=detected_language, # auto-detected from message
response_language=workspace_language, # forced by workspace settings
custom_api_key=owner_api_key, # owner's own key
)
The workspace owner sets their language. All customer responses come in that language regardless of what language the customer writes in.
The voice routing
One Twilio number serves two completely different AI experiences:
pythondef twilio_voice_incoming(request):
caller = request.POST.get('From')

# Check if caller is verified owner
owner_plan = UserPlan.objects.filter(
    user=workspace.owner,
    mobile_verified=True
).first()

if owner_plan and normalize(caller) == normalize(owner_plan.mobile_number):
    # Route to Manager AI (private mode)
    return redirect_to_manager_ai(workspace)
else:
    # Route to customer RAG bot (public mode)
    return customer_rag_response(workspace)
Enter fullscreen mode Exit fullscreen mode

The Manager AI
The owner's AI Manager has 50+ executive actions registered:
pythonACTION_HANDLERS = {
'check_owner_emails_now': check_owner_emails_now_action,
'create_report': create_report,
'query_external_database': query_external_database,
'setup_ai_call_center': setup_ai_call_center,
'analyse_database_for_automation': analyse_database_for_automation,
# ... 45 more actions
}
When the owner sends "How many orders today?" on Telegram, the system parses the intent, executes the database query, and returns a formatted answer — all in the owner's language.
Performance on 4GB RAM
Neo4j: ~800MB
PostgreSQL: ~400MB

Redis: ~200MB
Gunicorn: ~600MB
Celery workers: ~400MB
Total: ~2.4GB (leaves headroom)
The whole system runs on a $8/month Contabo VPS.
Full docs: docs.ragleap.com
Try it: ragleap.com (7-day free trial, bring your own API key)

Top comments (0)