Franck Hlb
JARVIS OS — Building a Distributed AI Operating System with 928 Autonomous Agents

I've been building JARVIS OS for the past year — a fully distributed AI operating system running 928 autonomous agents across 6 GPUs. Here's what I learned.

What is JARVIS OS?

JARVIS OS is a custom Linux-based operating system designed from the ground up to orchestrate hundreds of AI agents simultaneously. It combines:

  • Multi-GPU distributed computing (6x NVIDIA GPUs)
  • 928 autonomous agents with specialized roles
  • Voice AI with sub-300ms response time
  • 99.7% uptime on bare metal
  • Real-time coordination via MCP (Model Context Protocol)

Architecture Overview

The system is built on Ubuntu 22.04 LTS with heavy Docker containerization. Each agent runs in an isolated container with resource limits, communicating through a central message broker.
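As a rough sketch of what "isolated container with resource limits" means in practice, here is how one agent launch could be assembled. The image name, limits, and broker URL are illustrative placeholders, not the actual JARVIS configuration:

```python
# Sketch: launching one agent in an isolated container with hard resource caps.
# Image name, limits, and broker URL are illustrative, not the real config.

def agent_run_command(agent_id: str, cpus: float = 0.5, memory: str = "512m") -> list[str]:
    """Build a `docker run` invocation that caps CPU and RAM for one agent."""
    return [
        "docker", "run", "-d",
        "--name", f"agent-{agent_id}",
        "--cpus", str(cpus),          # hard CPU quota
        "--memory", memory,           # hard RAM cap; container is OOM-killed past this
        "--restart", "on-failure:3",  # bounded auto-restart, not infinite retry
        "-e", "BROKER_URL=redis://broker:6379",
        "jarvis/agent:latest",
    ]

print(" ".join(agent_run_command("worker-042")))
```

The `--memory` cap is what keeps a single misbehaving agent from starving its neighbors.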

Key Components

Core Orchestrator — The brain of JARVIS. Routes tasks to specialized agents based on capability matching and current load.

GPU Cluster Manager — Dynamically allocates GPU memory across agents. Uses custom VRAM partitioning to run multiple models simultaneously without conflicts.
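To give an idea of the partitioning problem, here is a first-fit allocator sketch: each model lands on the first GPU with enough free VRAM. GPU sizes and model footprints are made up for illustration; the real allocator also handles eviction and fragmentation:

```python
# First-fit VRAM allocator sketch. Sizes are illustrative only.

class VramAllocator:
    def __init__(self, gpus_mb: list[int]):
        self.free = list(gpus_mb)  # free VRAM per GPU, in MB

    def allocate(self, model: str, need_mb: int) -> int:
        """Place a model on the first GPU with room; return the GPU index."""
        for i, free_mb in enumerate(self.free):
            if free_mb >= need_mb:
                self.free[i] -= need_mb
                return i
        raise MemoryError(f"no GPU has {need_mb} MB free for {model}")

alloc = VramAllocator([24_000] * 6)                # 6 GPUs, 24 GB each
print(alloc.allocate("whisper-large-v3", 10_000))  # GPU 0
print(alloc.allocate("llm-7b-q4", 20_000))         # GPU 1 (GPU 0 has 14 GB left)
```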

Voice Pipeline — Built with Whisper large-v3 + Porcupine for wake word detection + EasySpeak TTS. End-to-end latency under 300ms.
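Hitting sub-300ms means every stage gets a budget. The per-stage numbers below are illustrative targets, not measured values from JARVIS:

```python
# Rough end-to-end latency budget for the voice pipeline.
# Per-stage numbers are illustrative targets, not measurements.

budget_ms = {
    "wake word (Porcupine)": 20,
    "ASR (Whisper large-v3, local)": 150,
    "routing + agent response": 80,
    "TTS first audio chunk": 40,
}

total = sum(budget_ms.values())
for stage, ms in budget_ms.items():
    print(f"{stage:32s} {ms:4d} ms")
print(f"{'total':32s} {total:4d} ms")
assert total < 300  # the budget must stay under the 300 ms target
```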

Agent Registry — Tracks all 928 agents, their status, capabilities, and performance metrics in real time via PostgreSQL.
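A minimal sketch of what such a registry table can look like (using `sqlite3` in-memory as a stand-in for PostgreSQL; column names are illustrative, not the actual JARVIS schema):

```python
import sqlite3

# Registry schema sketch: sqlite3 stands in for PostgreSQL here.
# Columns are illustrative, not the real JARVIS schema.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE agents (
        id         TEXT PRIMARY KEY,
        capability TEXT NOT NULL,
        status     TEXT NOT NULL DEFAULT 'idle',  -- idle | busy | failed
        load       REAL NOT NULL DEFAULT 0.0,     -- 0.0 .. 1.0
        last_seen  TEXT                           -- heartbeat timestamp
    )
""")
db.execute("INSERT INTO agents (id, capability) VALUES ('a1', 'summarize')")
row = db.execute("SELECT status, load FROM agents WHERE id = 'a1'").fetchone()
print(row)  # ('idle', 0.0)
```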

My GitHub Projects

All open source. Here are the main repositories:

  • JARVIS-OS — Core orchestration system, agent registry, GPU manager
  • TradingBot — Crypto trading automation with CCXT, technical analysis, multi-exchange support
  • BrowserOS — Multi-browser orchestration with MCP connectors for AI agent web automation
  • VoiceAI — Whisper + Porcupine + EasySpeak pipeline for real-time voice interaction
  • N8N-Workflows — Automation workflows connecting JARVIS to external APIs (Piste, Chorus Pro, webhooks)

Find all repos at: https://github.com/Turbo31150

Freelance Services

Based in Toulouse, France. Available for:

  • AI agent architecture & deployment
  • Distributed system design
  • Linux optimization & GPU clustering
  • Automation & workflow engineering
  • Voice AI integration

Rate: 55€/h | Contact via Codeur.com or LinkedIn

Technical Stack

OS: Ubuntu 22.04 LTS (custom kernel)
GPUs: 6x NVIDIA (multi-VRAM partitioning)
Orchestration: Docker + custom MCP layer
AI Models: Claude, Gemini CLI, LM Studio (local LLMs)
Voice: Whisper large-v3, Porcupine, EasySpeak
DB: PostgreSQL + Redis
Automation: N8N, Bash, Python (Flask)
Trading: CCXT, custom TA engine

Lessons Learned

1. Agent isolation is critical. One rogue agent can cascade failures. Container limits + circuit breakers have saved me countless times.
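A minimal circuit breaker sketch, to show the idea: after `threshold` consecutive failures the breaker opens and quarantines the agent for `cooldown` seconds. This is a generic pattern, not the actual JARVIS implementation:

```python
import time

# Minimal circuit breaker sketch (generic pattern, illustrative only).
class CircuitBreaker:
    def __init__(self, threshold: int = 3, cooldown: float = 30.0):
        self.threshold, self.cooldown = threshold, cooldown
        self.failures, self.opened_at = 0, None

    def call(self, fn, *args):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open: agent quarantined")
            self.opened_at, self.failures = None, 0  # half-open: allow one retry
        try:
            result = fn(*args)
            self.failures = 0  # success closes the circuit
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
            raise
```

After the threshold is hit, further calls fail fast instead of hammering a broken agent, which is what stops the cascade.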

2. VRAM fragmentation is the enemy. Dynamic allocation beats static partitioning at scale. I built a custom VRAM allocator.

3. Voice latency is mostly network. Moving inference local (Whisper on-device) cut latency from 800ms to under 300ms.

4. MCP changes everything. The Model Context Protocol gives agents a common language. Before MCP, inter-agent communication was a mess of custom APIs.
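Concretely, MCP messages are JSON-RPC 2.0, so every agent speaks the same envelope. Here is what a tool-call request of that kind looks like (the tool name and arguments are hypothetical):

```python
import json

# MCP messages are JSON-RPC 2.0. This sketches a tools/call request;
# the tool name and arguments are hypothetical.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "summarize_page",                     # hypothetical tool
        "arguments": {"url": "https://example.com"},  # illustrative argument
    },
}
print(json.dumps(request, indent=2))
```

Because every agent parses the same envelope, adding a new agent no longer means writing a new custom API.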

5. 99.7% uptime is achievable on bare metal. With proper watchdogs, health checks, and auto-restart policies — no cloud required.
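The watchdog side of this can be sketched as a staleness check over agent heartbeats; the heartbeat table and timeout are illustrative stand-ins for the real system:

```python
import time

# Watchdog sketch: flag agents whose heartbeat is stale for auto-restart.
# Heartbeat data and timeout are illustrative.

def stale_agents(heartbeats: dict[str, float], now: float,
                 timeout: float = 10.0) -> list[str]:
    """Return agents whose last heartbeat is older than `timeout` seconds."""
    return [aid for aid, ts in heartbeats.items() if now - ts > timeout]

now = time.monotonic()
beats = {"a1": now - 2.0, "a2": now - 42.0, "a3": now - 1.0}
print(stale_agents(beats, now))  # ['a2'] -> candidate for auto-restart
```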

What's Next

  • Scaling to 1500+ agents
  • WebRTC-based real-time collaboration between agent clusters
  • Public API for third-party agent integration
  • Hackathon Airia 2026 — already ranked top participant

If you're building something similar or want to collaborate, reach out. Always happy to discuss distributed AI architecture.

GitHub: https://github.com/Turbo31150
LinkedIn: Franck Hlb
Freelance: Codeur.com | Malt
