The Hidden Tax of AI-Assisted Development (And How I Fixed It)

#ai #programming #python #opensource

Every AI coding session starts the same way. You open your editor, the assistant says hello, and you spend the first five minutes orienting it.

"What branch am I on?"

"What services are running?"

"Where did we leave off last session?"

"Is the test suite green?"

It's a tax you pay on every session. Multiply that by days, weeks, a whole team — it adds up to a real cost in both time and attention. And tokens, if you're paying by the token.

The Industry's Answer: Runtime Tool Calls

The standard solution is to let the assistant figure it out at runtime. MCP servers, function calling, Claude Code hooks — the assistant asks "what's running?" mid-conversation, and something answers. Repeat for every fact it needs.

This works. It's also one round-trip per fact. 50 facts = 50 round-trips. If you're paying for Claude Opus or GPT-5.5 by the token, every one of those orientation questions burns tokens. Quickly.
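
To put a number on that, here is a back-of-the-envelope sketch of the wall-clock cost of sequential lookups. The 50 ms per round-trip is an assumption (it matches the rough figure quoted later in this post), and the function name is made up for illustration. It models latency only, not token spend.

```python
# Back-of-the-envelope latency model for sequential runtime lookups.
# ASSUMPTION: ~50 ms per tool-call round-trip; real latency varies widely.
ROUND_TRIP_SECONDS = 0.050


def sequential_lookup_seconds(fact_count: int,
                              round_trip: float = ROUND_TRIP_SECONDS) -> float:
    """Total wall-clock time if every fact costs one sequential round-trip."""
    return fact_count * round_trip


for facts in (5, 50, 500):
    total = sequential_lookup_seconds(facts)
    print(f"{facts:>4} facts -> {total:.2f} s of round-trips")
```

Under that assumption, 50 facts cost 2.5 seconds of waiting before any real work starts, and the cost grows linearly with the number of facts.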

A Different Bet: Resolve Before They Read

I built Perseus to go the other direction. Instead of the assistant discovering facts at runtime, you resolve them at render time — before the assistant ever reads them.

You write a context file with directives:

@perseus v0.8

# Current State
@query "git status --short"
@query "git log --oneline -5"

# Services
@services

# Last Session
@waypoint ttl=86400

# Ports
@read .env key="API_PORT" fallback="3001"

Perseus runs those directives, resolves them to live values, and outputs a plain markdown document. Your assistant reads facts, not instructions to go find facts.
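
To make "resolve at render time" concrete, here is a minimal sketch of the idea in Python 3.8+. It is not Perseus's actual implementation: the regexes, function names, and output format are all assumptions, and it handles only the @query and @read forms shown above. Every other line passes through untouched.

```python
import re
import shlex
import subprocess
from pathlib import Path
from typing import Optional

# Hypothetical patterns covering only the two directive forms shown above.
QUERY_RE = re.compile(r'^@query\s+"(?P<cmd>.+)"\s*$')
READ_RE = re.compile(
    r'^@read\s+(?P<path>\S+)\s+key="(?P<key>[^"]+)"'
    r'(?:\s+fallback="(?P<fallback>[^"]*)")?\s*$'
)


def resolve_query(cmd: str) -> str:
    """Run a command (no shell) and return its trimmed stdout."""
    result = subprocess.run(
        shlex.split(cmd), capture_output=True, text=True, timeout=10
    )
    return result.stdout.strip()


def resolve_read(path: str, key: str, fallback: Optional[str]) -> str:
    """Look up KEY=value in a dotenv-style file; use the fallback if absent."""
    file = Path(path)
    if file.is_file():
        for line in file.read_text().splitlines():
            name, sep, value = line.partition("=")
            if sep and name.strip() == key:
                return value.strip()
    return fallback if fallback is not None else ""


def render(template: str) -> str:
    """Replace each known directive line with its resolved value."""
    out = []
    for line in template.splitlines():
        if m := QUERY_RE.match(line):
            out.append(resolve_query(m["cmd"]))
        elif m := READ_RE.match(line):
            value = resolve_read(m["path"], m["key"], m["fallback"])
            out.append(f'{m["key"]}: {value}')
        else:
            out.append(line)  # plain markdown passes through unchanged
    return "\n".join(out)
```

Calling render() on a template yields plain markdown with the directive lines swapped for values. The point of the sketch is the shape of the pipeline: all lookups happen in one pass, and the assistant only ever sees the result.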

Without Perseus                     With Perseus
────────────────────────────────    ──────────────────────────────
"Port is 3001 (check .env)"    →   Port: 3001
"47 tests (may be stale)"      →   Tests: 597 passing (run 8s ago)
"Check docker ps first"        →   mongo-dev: Up 4h 12m
"Where did we leave off?"      →   Checkpoint: webhook done, pending test run

The Speed Story

The delta is structural, not incremental:

1 directive via runtime tool call: ~50ms (one round-trip)
10,000 directives via Perseus: 0.36 seconds (total, rendered once)
By those figures, 10,000 sequential round-trips would take about 500 seconds, so the single render is roughly 1,400× faster at that scale

With caching (@cache ttl=300), the warm path resolves 500 directives in 0.28 seconds — 40× faster than cold. For a typical project context file (20-50 directives), Perseus finishes before you notice it ran.
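
The caching idea can be sketched as a small time-to-live cache. This is an illustration under assumptions, not how Perseus stores its cache: the class and method names are hypothetical, and this version lives in memory, whereas a cache that survives between renders would need to persist to disk.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Hypothetical in-memory cache: reuse a resolved value until it expires."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get_or_resolve(self, key: str, resolver: Callable[[], str]) -> str:
        """Return the cached value if still fresh, else resolve and store it."""
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # warm path: no work done
        value = resolver()  # cold path: run the expensive lookup
        self._entries[key] = (now, value)
        return value
```

With ttl_seconds=300, a directive resolved at time 0 is reused until 300 seconds have passed and is re-resolved on the next request after that. Slow-changing facts are a natural fit; values that must be exact at read time should probably skip the cache.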

Multi-Agent: The Swarm Demo

Perseus has a coordination layer called Agora. Multiple agents can write to the same task board simultaneously using filesystem-based atomic locks.

To stress-test this, I ran a 120-agent swarm — all 120 agents writing to the same task board, 150 concurrent writes. Result: 9.7 seconds, zero collisions.

No server. No database. Just @agora and @inbox directives resolved to plain markdown.
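
For readers curious what a filesystem-based atomic lock can look like, here is one common pattern: create a lock file with O_CREAT | O_EXCL, which fails if the file already exists. This is a sketch of the general technique, not Agora's actual code. The function names, the ".lock" suffix, and the timeout values are all assumptions.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def file_lock(lock_path: str, timeout: float = 10.0, poll: float = 0.01):
    """Hold an exclusive lock by atomically creating lock_path."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the file exists, so only one
            # writer can succeed at a time.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)  # someone else holds the lock; retry shortly
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)  # release the lock


def append_task(board_path: str, line: str) -> None:
    """Append one line to a shared task board while holding the lock."""
    with file_lock(board_path + ".lock"):
        with open(board_path, "a", encoding="utf-8") as board:
            board.write(line + "\n")
```

One trade-off of this pattern: if a writer crashes while holding the lock, the lock file stays behind and later writers time out. A production version would need some form of stale-lock detection.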

What Ships

20 directives: @query, @services, @waypoint, @agora, @inbox, @memory, @read, @env, @skills, @session, @date, @health, @agent, @tree, @list, @include, @if/@else/@endif, @constraint, @validate, @cache
Assistant-agnostic: outputs plain markdown. Works with Claude Code, Cursor, Codex, Rovo Dev, and anything else that reads a file
CLAUDE.md / AGENTS.md targets: perseus render --format agents-md writes an AGENTS.md file, which many coding assistants already read
MCP server: 13 tools for any MCP-compatible assistant, started with perseus mcp serve
Single file, one dependency: perseus.py (~12,000 lines) + pyyaml
Nearly 600 tests, MIT license

Why Not Just Use AGENTS.md?

AGENTS.md is your project's bio. Perseus is your project's heartbeat. One is static text you write once. The other resolves live state every time you render it.

They compose. Perseus can render to AGENTS.md — keep your static instructions, add live state, one file your assistant already reads.

Why Not Just Use MCP?

MCP is runtime. One fact per tool call. Perseus works at render time, much like a compile step: N facts in one file, resolved before the assistant reads it. They compose too: Perseus has its own MCP server that exposes 13 directive tools for assistants that prefer the runtime model.

The right question isn't "MCP or Perseus?" — it's "which facts should arrive before the assistant speaks, and which should it discover on demand?" Perseus handles the first category. MCP handles the second.

Quick Start

pip install perseus-ctx
perseus init                     # scaffold .perseus/context.md
perseus render --format agents-md  # your first live briefing

For Claude Code users:

perseus install --target claude-code  # auto-inject context at session start

Then set up a cron job to re-render every 5 minutes — your assistants always start briefed.

Bottom Line

I built Perseus because I was tired of every AI session starting with "what branch am I on? what's running? where were we?" The assistant should know before it says hello.

If you've felt the same frustration, give it a try. It's MIT licensed, one dependency, and takes 30 seconds to set up. If it saves you even one orientation exchange per session, it's paid for itself.

github.com/tcconnally/perseus | perseus.observer

What's your cold-start routine? Do you use AGENTS.md, Claude hooks, or just re-explain every session? I'm curious how others are solving this.