Full visual version: foodlbs.github.io/openclad
There's a gap between what AI chatbots can do and what they actually do for you day-to-day. ChatGPT can write a poem, but it can't turn off your living room lights when you're already in bed. Claude can analyze a document, but it won't proactively send you a news briefing at 9 AM.
I wanted something different: an AI agent that runs continuously, has access to my real tools and services, makes decisions autonomously for low-risk tasks, and asks permission before doing anything destructive. Something I could message from my phone and get things done — not just get answers.
So I built Jarvis.
## What Jarvis Can Do
| Time | What happens |
|---|---|
| 9:00 AM | Jarvis sends a news briefing to Telegram — top headlines, tech, markets. No prompt needed. |
| 10:30 AM | "What's the status of my job applications?" → queries SQLite tracker, returns summary |
| 2:00 PM | Send a PDF → Jarvis summarizes, stores in vector memory, saves markdown to Documents |
| 11:00 PM | "Turn off all the lights." → Approval button on Telegram → tap Approve → lights off |
The key insight: Jarvis doesn't just respond to questions. It executes tasks, remembers context, and operates on a schedule — all while respecting a risk classification system that keeps me in control of anything with real-world side effects.
## Architecture Overview
The system breaks down into four layers; the architecture diagram is in the visual version linked at the top.
## The Tech Stack
| Layer | Technology | Why |
|---|---|---|
| Language | Python 3.12 + uv workspace | Fast deps, monorepo support |
| Agent | Claude Agent SDK (subprocess) | Process isolation, crash recovery |
| Chat | aiogram 3.x | Async Telegram, inline keyboards |
| State | Redis | Task state, buffers, retry queue |
| Vector Memory | ChromaDB (local) | Free, no cloud dependency |
| Embeddings | OpenAI text-embedding-3-small | Cost-effective semantic search |
| MCP Framework | FastMCP | Simple Python MCP server creation |
| Config | pydantic-settings + YAML | Type-safe + env var overrides |
| Logging | structlog | Structured JSON logs |
| Daemon | macOS LaunchAgent | Auto-start, background execution |
## Deep Dive: How Each Piece Works
### 1. The Conversation Flow
When you send a message on Telegram, two things shape the request before the agent even runs:
Smart model selection — Short, simple messages (<100 chars, no complex keywords) auto-route to Haiku instead of Sonnet. Saves cost and cuts latency for quick tasks while preserving Sonnet's reasoning for demanding work.
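The routing heuristic can be sketched in a few lines (the threshold, keyword list, and model names here are illustrative stand-ins, not the exact production values):

```python
# Illustrative sketch of short-message routing: cheap model for quick
# requests, the stronger model for anything long or keyword-flagged.
COMPLEX_KEYWORDS = {"analyze", "research", "compare", "write", "plan"}

def select_model(message: str) -> str:
    """Route short, simple messages to Haiku; everything else to Sonnet."""
    text = message.lower()
    is_short = len(message) < 100
    is_simple = not any(kw in text for kw in COMPLEX_KEYWORDS)
    return "claude-haiku" if is_short and is_simple else "claude-sonnet"
```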
Conversation buffer — Last 10 turns stored in Redis with a 1-hour TTL. Follow-ups like "What about the bedroom lights?" work because Jarvis remembers the context.
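The real buffer lives in Redis (LPUSH plus LTRIM to cap the list, EXPIRE for the 1-hour TTL); this dependency-free sketch reproduces the same keep-the-last-10-turns behavior:

```python
from collections import deque

MAX_TURNS = 10  # same cap the Redis LTRIM enforces

class ConversationBuffer:
    """In-process stand-in for the Redis-backed conversation buffer."""

    def __init__(self) -> None:
        # deque(maxlen=...) silently drops the oldest turn, like LTRIM
        self.turns: deque = deque(maxlen=MAX_TURNS)

    def append(self, role: str, text: str) -> None:
        self.turns.append({"role": role, "text": text})

    def context(self) -> list:
        return list(self.turns)
```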
### 2. The Risk System
The most important piece. Two-tier classification:
✅ AUTONOMOUS — No approval needed
- File reads, web search, memory queries
- Code sandbox, browser navigation
- Job tracker reads, calendar reads
🔒 REQUIRE APPROVAL — Inline Telegram button
- File writes / deletes
- Email sends, calendar edits
- Smart home control, phone calls, purchases
Classification isn't just by tool name — it examines input parameters too. Reading ~/Documents is autonomous. Writing to /etc/ always requires approval. Checking a lock status is autonomous. Unlocking it requires approval regardless of context.
Configured via risk_policy.yaml — no code changes needed:
```yaml
risk_overrides:
  mcp__filesystem__write_file: autonomous         # Trust local file writes
  mcp__smart_home__call_service: require_approval # Always ask

context_escalation:
  dangerous_paths:
    - /system
    - /etc
    - /usr/bin
  sensitive_entities:
    - lock
    - alarm
    - security
```
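A minimal sketch of how the override table and context escalation might combine, assuming the tool names above; the real classifier inspects the full tool input, not just these two parameters:

```python
# Values mirrored from the risk_policy.yaml sample above.
RISK_OVERRIDES = {
    "mcp__filesystem__write_file": "autonomous",
    "mcp__smart_home__call_service": "require_approval",
}
DANGEROUS_PATHS = ("/system", "/etc", "/usr/bin")
SENSITIVE_ENTITIES = ("lock", "alarm", "security")

def classify(tool: str, params: dict) -> str:
    """Base risk from the override table, then escalate on risky inputs."""
    risk = RISK_OVERRIDES.get(tool, "require_approval")  # safe default
    path = str(params.get("path", ""))
    if any(path.startswith(p) for p in DANGEROUS_PATHS):
        return "require_approval"  # writing under /etc always escalates
    entity = str(params.get("entity_id", ""))
    if any(s in entity for s in SENSITIVE_ENTITIES):
        return "require_approval"  # locks/alarms escalate regardless
    return risk
```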
### 3. The Memory System
Two layers, each optimized for different retrieval patterns:
**Short-term:** Markdown files injected into the system prompt
| File | Purpose |
|---|---|
| `preferences.md` | Learned preferences ("prefers TypeScript over JavaScript") |
| `projects.md` | Active project status |
| `chat_history.md` | Recent session summaries — auto-trims at 30 sessions |
| `journal.md` | Agent reflections and observations |
The ContextLoader stitches these into every agent invocation. When chat_history.md exceeds 30 sessions, older entries compress into bullet-point summaries to keep the prompt under control.
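The 30-session trim can be sketched like this (session parsing is simplified to one string per session, and `trim_history` is a hypothetical helper name):

```python
MAX_SESSIONS = 30  # same cap chat_history.md is trimmed to

def trim_history(sessions: list) -> tuple:
    """Keep the newest 30 sessions; compress older ones to one bullet each."""
    if len(sessions) <= MAX_SESSIONS:
        return sessions, []
    old, recent = sessions[:-MAX_SESSIONS], sessions[-MAX_SESSIONS:]
    # One truncated first line per compressed session
    bullets = [f"- {s.splitlines()[0][:80]}" for s in old]
    return recent, bullets
```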
**Long-term:** ChromaDB vector store with semantic search
After completing significant tasks, Jarvis stores a 2-3 sentence summary with metadata. Before tackling complex work, it searches: "Have I solved something like this before?"
The combination gives Jarvis both working memory (always loaded) and recall (searchable when needed).
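The real store is ChromaDB with OpenAI embeddings; this dependency-free sketch shows the same store-then-recall pattern, with a toy bag-of-words similarity standing in for learned embeddings:

```python
from collections import Counter
import math

class TaskMemory:
    """Toy stand-in for the ChromaDB-backed long-term memory."""

    def __init__(self) -> None:
        self.entries: list = []  # (summary, metadata) pairs

    def store(self, summary: str, metadata: dict) -> None:
        self.entries.append((summary, metadata))

    def recall(self, query: str, k: int = 3) -> list:
        """Return the k summaries most similar to the query."""
        q = Counter(query.lower().split())

        def score(text: str) -> float:
            d = Counter(text.lower().split())
            overlap = sum((q & d).values())  # shared word count
            return overlap / math.sqrt((len(q) * len(d)) or 1)

        ranked = sorted(self.entries, key=lambda e: score(e[0]), reverse=True)
        return [summary for summary, _ in ranked[:k]]
```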
### 4. The Skill Framework
Skills are Markdown files defining triggers, steps, and required tools:
```markdown
## Daily News Briefing

**Trigger:** "news update", "daily news", "morning briefing"
**Schedule:** Every day at 9:00 AM EST

### Steps

1. Search for current top headlines
2. Search for tech industry news
3. Search for business/markets news
4. Format into clean briefing with sections
5. Output ONLY the briefing — no meta-commentary
```
The skill manager routes incoming messages by matching keywords and context. If a request matches multiple skills, it chains them.
The interesting part: when Jarvis notices a repeatable pattern (3+ similar tool call sequences), it suggests creating a new skill. The system grows organically based on actual usage.
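Trigger matching can be sketched as a keyword lookup (the skill names and trigger lists here are illustrative; the real manager parses them out of the skill Markdown files):

```python
# Illustrative skill registry: name -> trigger phrases.
SKILLS = {
    "daily-news-briefing": ["news update", "daily news", "morning briefing"],
    "job-tracker-status": ["job applications", "application status"],
}

def match_skills(message: str) -> list:
    """Return every skill whose trigger phrase appears in the message."""
    text = message.lower()
    return [name for name, triggers in SKILLS.items()
            if any(t in text for t in triggers)]
```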
### 5. The Scheduler
The scheduler parses a human-readable `schedules.md` into cron-like entries:
```markdown
## Daily News Briefing

**Schedule:** Every day at 9:00 AM EST
**Action:** Execute daily-news-briefing skill
**Output:** Send formatted news briefing via Telegram
**Status:** Active
```
Each job runs as an async task with max turns capped at 15 to enforce conciseness and suppress the model's meta-commentary tendency.
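Parsing one entry of the format above might look like this (`parse_entry` is a hypothetical helper; the actual cron conversion is omitted):

```python
import re

# Matches bold-keyed fields like "**Schedule:** Every day at 9:00 AM EST"
ENTRY_RE = re.compile(r"\*\*(\w+):\*\*\s*(.+)")

def parse_entry(block: str) -> dict:
    """Extract the heading and **Key:** fields from one schedules.md entry."""
    fields = {"name": block.strip().splitlines()[0].lstrip("# ").strip()}
    for key, value in ENTRY_RE.findall(block):
        fields[key.lower()] = value.strip()
    return fields
```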
### 6. Resilience & Fallback
Running 24/7 means things will break. Three mechanisms handle it:
Retry queue — Failed tasks enter a Redis sorted set with exponential backoff (30s → 60s → 120s). After 3 retries, the task is abandoned and a failure notification is sent. On retry, the model downgrades to Haiku to reduce cost.
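The backoff schedule itself is simple to sketch (the real queue stores tasks in a Redis sorted set scored by this next-attempt timestamp):

```python
BASE_DELAY = 30   # seconds before the first retry
MAX_RETRIES = 3   # then the task is abandoned

def next_retry_at(attempt: int, now: float):
    """Timestamp of the next attempt (doubling delay), or None if abandoned."""
    if attempt >= MAX_RETRIES:
        return None
    return now + BASE_DELAY * (2 ** attempt)  # 30s, 60s, 120s
```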
API fallback — If the Claude API is rate-limited, Jarvis detects error keywords ("rate_limit", "529", "overloaded") and falls back to the Claude Code CLI with a cached OAuth token.
Circuit breaker — Standard closed → open → half-open pattern. After 5 consecutive failures, circuit opens and rejects calls for 60 seconds before allowing a probe request.
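A minimal version of that breaker, using the numbers above (5 failures to open, a 60-second cooldown before a probe is allowed):

```python
class CircuitBreaker:
    """Closed -> open -> half-open breaker with a fixed cooldown."""

    def __init__(self, threshold: int = 5, cooldown: float = 60.0) -> None:
        self.threshold, self.cooldown = threshold, cooldown
        self.failures = 0
        self.opened_at = None  # timestamp when the circuit opened

    def allow(self, now: float) -> bool:
        if self.opened_at is None:
            return True                              # closed: allow
        if now - self.opened_at >= self.cooldown:
            return True                              # half-open: allow probe
        return False                                 # open: reject

    def record_failure(self, now: float) -> None:
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = now                     # trip the circuit

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None                        # close again
```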
## Project Structure
```
personal-ai-agent/
├── main.py               # Entry point & orchestrator
├── pyproject.toml        # uv workspace root
├── compose.yaml          # Docker Compose (Redis)
├── packages/
│   ├── core/             # Agent, config, state, risk, retry, scheduler
│   ├── interfaces/       # Telegram bot, handlers, approval flow
│   └── mcp_servers/      # Job tracker, smart home, memory
├── data/
│   ├── agent_context/    # personality.md, skills/, schedules.md
│   ├── memory/chroma/    # Vector store persistence
│   └── secrets/          # OAuth credentials (gitignored)
├── configs/
│   ├── agent.yaml        # Runtime config
│   └── risk_policy.yaml  # Risk classification overrides
└── tests/                # 20+ unit tests
```
The monorepo keeps things modular — core has no Telegram dependency, interfaces has no MCP dependency, and mcp_servers are standalone FastMCP processes. Want Discord instead of Telegram? Replace interfaces without touching core.
## Key Design Decisions
Why local ChromaDB over Pinecone? No cloud dependency. The vector store lives at data/memory/chroma/ — just files on disk. Zero cost, zero round-trip latency, git-backupable.
Why the subprocess model? Each agent invocation is isolated. If it crashes, nothing leaks. SDK upgrade? Restart the process. Simplest possible isolation boundary.
Why Telegram over a custom UI? Already on my phone, laptop, and watch. Inline keyboards, file attachments, voice messages, rich formatting — all built-in. A custom UI would have taken weeks for a worse experience.
Why Markdown for schedules and skills? Editable with any text editor, version-controlled with git, readable by the agent itself. When Jarvis creates a new skill, it writes a .md file.
Why Redis for everything stateful? One dependency, in-memory speed, TTL for auto-cleanup, pub/sub for future real-time features.
## Running It Yourself
```bash
# Clone the starter
git clone https://github.com/yourusername/jarvis-starter.git
cd jarvis-starter

# Configure
cp .env.example .env   # Add your API keys

# Install
uv sync --all-packages

# Start Redis
docker compose up -d redis

# Run
uv run python main.py
```
You'll need:
- Anthropic API key (Claude)
- Telegram bot token (from @BotFather)
- Your Telegram chat ID
- Optional: OpenAI key (embeddings), Google OAuth (Calendar/Gmail), Home Assistant URL + token
The starter ships with the full structure, a working Telegram bot, the risk system, two MCP servers (filesystem + memory), and one example skill (daily news briefing). Everything else is added incrementally.
## What I'd Do Differently
- Voice pipeline — Telegram voice transcription works but is clunky. A dedicated streaming voice pipeline would transform the experience.
- Parameterized skills — Skills as templates: `Research {topic} with depth {shallow|deep}` instead of flat instruction sets.
- Multi-agent orchestration — Some tasks need parallel sub-agents (researcher + writer). The single-agent model hits turn limits on complex workflows.
- Observability dashboard — Task history, tool usage, cost tracking, memory growth. The event stream is there but underutilized.
## Wrapping Up
Jarvis has been running 24/7 on my Mac for about two weeks — delivering morning news, tracking job applications, helping with research, and controlling my apartment — all through Telegram.
The total codebase is ~2,000 lines of Python across three packages, plus Markdown files for personality, skills, and schedules. The agent framework (Claude SDK + MCP servers) does the heavy lifting; the surrounding infrastructure — risk classification, retry logic, conversation persistence, skill routing — is what transforms a chatbot into an actual assistant.
Check out the starter repo and if you build something cool with it, I'd love to hear about it.
Built with Claude Agent SDK · aiogram · Redis · ChromaDB · FastMCP · Running on a Mac Mini as a LaunchAgent
