Building a Self-Improving God Agent with Claude AI
After running this system in production for several weeks, I can tell you it's equal parts fascinating and humbling to watch a piece of software genuinely improve itself over time. What started as a task router became something closer to an autonomous engineering team member.
Here's how we built it.
The Architecture
The core idea is simple: instead of manually triaging and assigning tasks, a God Agent acts as an autonomous orchestrator. It wakes up every 2 minutes, surveys the task queue, makes routing decisions, dispatches specialist agents, and — critically — learns from what works and what doesn't.
The stack:
- Next.js 14 (App Router) for the dashboard and API routes
- Supabase for task persistence and agent state
- Claude (claude-sonnet-4-6) as the intelligence layer
- PM2 to keep the orchestration loop alive
- TypeScript throughout, with the God Agent itself running as an .mjs daemon
┌─────────────────────────────────────┐
│           God Agent (PM2)           │  ← runs every 2 min
│         god-agent-loop.mjs          │
└──────────────┬──────────────────────┘
               │ classifies + routes
    ┌──────────┼──────────┐
    ▼          ▼          ▼
db-specialist  ui-specialist  ruflo-agents
    │                         (critical/high/medium)
    ▼
Council Mode  ← for complex decisions
(N parallel Claude instances)
The God Agent Loop
The orchestrator runs as a standalone Node process managed by PM2. Every cycle it pulls pending tasks, classifies them, and makes routing decisions.
// god-agent-loop.mjs
import Anthropic from '@anthropic-ai/sdk';
import { createClient } from '@supabase/supabase-js';
import { readFileSync, writeFileSync } from 'fs';

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment
const supabase = createClient(process.env.SUPABASE_URL, process.env.SUPABASE_KEY);
const WISDOM_PATH = './god-wisdom.json';
const CYCLE_INTERVAL_MS = 2 * 60 * 1000;

// Load persisted lessons; start fresh if the file is missing or unreadable.
async function loadWisdom() {
  try {
    return JSON.parse(readFileSync(WISDOM_PATH, 'utf8'));
  } catch {
    return { lessons: [], totalCycles: 0, successPatterns: {}, failurePatterns: {}, lastUpdated: '' };
  }
}

// classifyAndRoute and dispatchToSpecialist are not shown here.
async function runCycle() {
  const wisdom = await loadWisdom();
  const { data: tasks, error } = await supabase
    .from('tasks')
    .select('*')
    .eq('status', 'pending')
    .order('priority', { ascending: false })
    .limit(10);
  if (error) throw new Error(`Task query failed: ${error.message}`);
  if (!tasks?.length) return;

  const classifiedTasks = await classifyAndRoute(tasks, wisdom);
  for (const task of classifiedTasks) {
    await dispatchToSpecialist(task, wisdom);
  }

  wisdom.totalCycles++;
  writeFileSync(WISDOM_PATH, JSON.stringify(wisdom, null, 2));
}

// Log per-cycle failures so a rejected promise cannot take down the daemon.
const safeCycle = () => runCycle().catch(err => console.error('Cycle failed:', err));

setInterval(safeCycle, CYCLE_INTERVAL_MS);
safeCycle(); // run immediately on start
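One thing a fixed interval leaves open is what happens when a cycle takes longer than two minutes: `setInterval` fires again regardless, and two cycles could pick up the same pending tasks. Below is a minimal sketch of one way to guard against that. The wrapper name `skipIfRunning` is hypothetical and not part of the original code.

```typescript
// Hypothetical helper: wraps an async cycle so that a new tick is skipped
// while the previous invocation is still in flight.
type Cycle = () => Promise<void>;

function skipIfRunning(cycle: Cycle): () => Promise<boolean> {
  let running = false;
  return async () => {
    if (running) return false; // previous cycle still active, skip this tick
    running = true;
    try {
      await cycle();
    } finally {
      running = false; // always release the flag, even if the cycle throws
    }
    return true; // the cycle ran to completion
  };
}
```

In the loop this would be wired up as `setInterval(skipIfRunning(runCycle), CYCLE_INTERVAL_MS)`. The returned boolean only reports whether the tick actually ran, which can be handy for logging skipped cycles.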
Task Classification
The classifier sends task descriptions to Claude with context from accumulated wisdom. This is where the system starts feeling intelligent — it's not just keyword matching, it's understanding intent.
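// lib/classify-task.ts
import Anthropic from '@anthropic-ai/sdk';

// Task, WisdomStore, and ClassifiedTask are assumed to be declared elsewhere in the project.
const client = new Anthropic();

export async function classifyTask(
  task: Task,
  wisdom: WisdomStore
): Promise<ClassifiedTask> {
  // Only the 10 most recent lessons are injected, to keep the prompt short.
  const recentLessons = wisdom.lessons.slice(-10).join('\n');
  const response = await client.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 500,
    messages: [{
      role: 'user',
      content: `Classify this task and route it to the appropriate specialist.
Categories: db | ui | infra | analysis
Specialists: db-specialist | ui-specialist | ruflo-critical | ruflo-high | ruflo-medium
Recent wisdom from previous cycles:
${recentLessons}
Task: ${task.description}
Priority: ${task.priority}
Respond with JSON: { category, specialist, reasoning, estimatedComplexity }`
    }]
  });

  // The first content block is not guaranteed to be text, so narrow before reading it.
  const block = response.content[0];
  if (!block || block.type !== 'text') throw new Error('Expected a text block from the classifier');
  return JSON.parse(block.text); // throws if the reply is not bare JSON
}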
The recentLessons injection is the key. If the system learned last week that "Supabase RLS policy tasks always need the db-specialist even when they look like infra tasks," that lesson surfaces here and influences every future routing decision.
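Because every routing decision depends on parsing the model's reply, it may be worth validating that reply before trusting it. The sketch below is one possible approach, not the original implementation. The names `parseClassification` and `isOneOf`, the step that trims surrounding prose, and the `null` fallback are all assumptions. The category and specialist names are taken from the classifier prompt.

```typescript
const CATEGORIES = ['db', 'ui', 'infra', 'analysis'] as const;
const SPECIALISTS = [
  'db-specialist', 'ui-specialist', 'ruflo-critical', 'ruflo-high', 'ruflo-medium',
] as const;

interface Classification {
  category: (typeof CATEGORIES)[number];
  specialist: (typeof SPECIALISTS)[number];
  reasoning: string;
  estimatedComplexity: number;
}

// Type guard: true when `value` is one of the allowed string literals.
function isOneOf<T extends string>(allowed: readonly T[], value: unknown): value is T {
  return typeof value === 'string' && (allowed as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical helper: returns a validated classification, or null when the
// reply is not usable JSON or names an unknown category or specialist.
function parseClassification(raw: string): Classification | null {
  // Replies sometimes carry prose or a Markdown fence around the JSON,
  // so keep only the outermost {...} span.
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start === -1 || end <= start) return null;

  let data: unknown;
  try {
    data = JSON.parse(raw.slice(start, end + 1));
  } catch {
    return null;
  }
  if (typeof data !== 'object' || data === null) return null;

  const fields = data as Record<string, unknown>;
  const category = fields.category;
  const specialist = fields.specialist;
  const complexity = fields.estimatedComplexity;
  const reasoning = fields.reasoning;

  if (!isOneOf(CATEGORIES, category)) return null;
  if (!isOneOf(SPECIALISTS, specialist)) return null;
  if (typeof complexity !== 'number' || !isFinite(complexity)) return null;

  return {
    category,
    specialist,
    reasoning: typeof reasoning === 'string' ? reasoning : '',
    estimatedComplexity: complexity,
  };
}
```

A caller could treat a `null` result as "leave the task pending and retry next cycle" rather than dispatching on a malformed answer. That policy is a design choice and is not something the original post specifies.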
The Wisdom System
god-wisdom.json is the system's long-term memory. It persists across restarts, crashes, and deployments. Each completed task cycle generates a lesson.
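// lib/wisdom.ts
import Anthropic from '@anthropic-ai/sdk';

// Task and TaskResult are assumed to be declared elsewhere in the project.
const client = new Anthropic();

export interface WisdomStore {
  lessons: string[];
  totalCycles: number;
  successPatterns: Record<string, number>;
  failurePatterns: Record<string, string>;
  lastUpdated: string;
}

// Ask Claude to distill one reusable lesson from a finished task.
export async function extractLesson(
  task: Task,
  result: TaskResult,
  specialist: string
): Promise<string> {
  const response = await client.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 200,
    messages: [{
      role: 'user',
      content: `Extract a single, reusable lesson from this task execution.
Be specific and actionable. Max 2 sentences.
Task: ${task.description}
Specialist used: ${specialist}
Outcome: ${result.success ? 'SUCCESS' : 'FAILED'}
Notes: ${result.notes}
Lesson:`
    }]
  });

  // The first content block is not guaranteed to be text, so narrow before reading it.
  const block = response.content[0];
  if (!block || block.type !== 'text') throw new Error('Expected a text block from the model');
  return block.text.trim();
}

// Return a new store with the lesson appended; the input is not mutated.
export function appendLesson(wisdom: WisdomStore, lesson: string): WisdomStore {
  return {
    ...wisdom,
    lessons: [...wisdom.lessons.slice(-99), lesson], // keep last 100
    lastUpdated: new Date().toISOString()
  };
}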
After a few hundred cycles, god-wisdom.json reads like engineering documentation written by the system itself. It's genuinely useful to read.
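Since the file outlives restarts and deployments, a copy written by an older version of the code, or one edited by hand, may not match the current `WisdomStore` shape. Here is a hedged sketch of a defensive loader step. The names `normalizeWisdom`, `pickEntries`, and `MAX_LESSONS` are hypothetical; only the interface fields and the 100-lesson cap come from the code above.

```typescript
interface WisdomStore {
  lessons: string[];
  totalCycles: number;
  successPatterns: Record<string, number>;
  failurePatterns: Record<string, string>;
  lastUpdated: string;
}

const MAX_LESSONS = 100; // same cap that appendLesson enforces

const isNumber = (v: unknown): v is number => typeof v === 'number';
const isString = (v: unknown): v is string => typeof v === 'string';

// Copy only the entries of a plain object whose values pass the `keep` guard.
function pickEntries<T>(value: unknown, keep: (v: unknown) => v is T): Record<string, T> {
  const out: Record<string, T> = {};
  if (typeof value !== 'object' || value === null || Array.isArray(value)) return out;
  const source = value as Record<string, unknown>;
  for (const key of Object.keys(source)) {
    const entry = source[key];
    if (keep(entry)) out[key] = entry;
  }
  return out;
}

// Hypothetical helper: coerce whatever JSON.parse returned into a complete
// WisdomStore, dropping malformed fields instead of throwing.
function normalizeWisdom(raw: unknown): WisdomStore {
  const r = (typeof raw === 'object' && raw !== null ? raw : {}) as Record<string, unknown>;
  const rawLessons = r.lessons;
  const rawCycles = r.totalCycles;
  const rawUpdated = r.lastUpdated;

  const lessons: string[] = Array.isArray(rawLessons)
    ? rawLessons.filter((l: unknown): l is string => typeof l === 'string')
    : [];

  return {
    lessons: lessons.slice(-MAX_LESSONS),
    totalCycles: typeof rawCycles === 'number' && rawCycles >= 0 ? Math.floor(rawCycles) : 0,
    successPatterns: pickEntries(r.successPatterns, isNumber),
    failurePatterns: pickEntries(r.failurePatterns, isString),
    lastUpdated: typeof rawUpdated === 'string' ? rawUpdated : '',
  };
}
```

Wrapping the parsed file as `normalizeWisdom(JSON.parse(...))` would let later code rely on every field being present, at the cost of silently discarding entries that do not fit.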
Council Mode
For high-complexity tasks — architectural decisions, ambiguous requirements, anything the classifier marks with estimatedComplexity > 8 — the system spins up a council: multiple Claude instances with different prompt framings, then synthesizes their outputs.
// lib/council.ts
import Anthropic from '@anthropic-ai/sdk';

// Task and CouncilDecision are assumed to be declared elsewhere in the project.
const client = new Anthropic();

const COUNCIL_PERSPECTIVES = [
  'You are a skeptical senior engineer. Identify risks and edge cases.',
  'You are an optimistic architect focused on elegant solutions.',
  'You are a pragmatist focused on the fastest path to working code.'
];

// Read the first content block as text, failing loudly on any other block type.
function firstText(message: Anthropic.Message): string {
  const block = message.content[0];
  if (!block || block.type !== 'text') throw new Error('Expected a text block from the model');
  return block.text;
}

export async function conveneCouncil(task: Task): Promise<CouncilDecision> {
  // Ask every perspective in parallel.
  const opinions = await Promise.all(
    COUNCIL_PERSPECTIVES.map(perspective =>
      client.messages.create({
        model: 'claude-sonnet-4-6',
        max_tokens: 800,
        messages: [
          { role: 'user', content: `${perspective}\n\nTask: ${task.description}` }
        ]
      })
    )
  );

  // Synthesize the council
  const synthesis = await client.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 1000,
    messages: [{
      role: 'user',
      content: `Three engineers reviewed a task. Synthesize their views into a final recommendation.
${opinions.map((o, i) => `Engineer ${i + 1}:\n${firstText(o)}`).join('\n\n')}
Provide: { recommendation, consensus_level, action_items[], risks[] }`
    }]
  });
  return JSON.parse(firstText(synthesis)); // throws if the reply is not bare JSON
}
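`Promise.all` rejects as soon as any one council call fails, which throws away the opinions that did come back. Below is a hedged sketch of a more tolerant variant. The helper `gatherOpinions` and its `quorum` parameter are hypothetical and not part of the original code. `ask` stands in for the `client.messages.create` call so the example stays self-contained.

```typescript
// Stand-in for one council call: takes a perspective, resolves to its opinion text.
type Advisor = (perspective: string) => Promise<string>;

interface CouncilOpinions {
  opinions: string[]; // answers that arrived, in perspective order
  failed: number;     // how many advisors errored out
}

// Hypothetical helper: ask every perspective in parallel, keep the answers
// that arrive, and fail only if fewer than `quorum` advisors responded.
async function gatherOpinions(
  perspectives: string[],
  ask: Advisor,
  quorum = 2
): Promise<CouncilOpinions> {
  const results = await Promise.all(
    perspectives.map(perspective =>
      ask(perspective).then(
        (text): string | null => text,
        (): string | null => null // swallow the individual failure
      )
    )
  );
  const opinions = results.filter((r): r is string => r !== null);
  if (opinions.length < quorum) {
    throw new Error(`Council quorum not met: ${opinions.length}/${quorum}`);
  }
  return { opinions, failed: results.length - opinions.length };
}
```

With three perspectives and a quorum of two, a single rate-limited call would still produce a synthesis from the remaining two opinions. Whether a two-voice council is acceptable is a judgment call for the specific task.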
Council mode is expensive: 4 Claude calls per task (three opinions plus one synthesis), so the cost guard (below) is critical.