Building a Self-Improving God Agent with Claude AI

#ai #claude #typescript #nextjs

How I built an autonomous AI orchestrator that manages its own agent pool, persists wisdom across restarts, and gets smarter every two minutes

After months of wrestling with one-shot AI scripts that forget everything between runs, I built something different: a persistent orchestrator that runs continuously, learns from its mistakes, and routes work to a pool of specialist agents. We call it the God Agent. It's been running in production for six weeks. Here's how it works.

The Architecture Problem

Most AI automation looks like this: user triggers action → AI responds → done. That works for chatbots. It doesn't work when you need an agent that monitors a Supabase database, catches regressions, routes fixes to the right specialist, and remembers that last Tuesday's deploy broke the auth flow.

The God Agent inverts this. Instead of waiting for input, it wakes up every two minutes, assesses the system state, decides what needs doing, delegates to specialists, and writes down what it learned before sleeping again.

God Agent (orchestrator)
├── Runs every 120 seconds via PM2
├── Reads god-wisdom.json (persistent memory)
├── Classifies pending tasks
├── Delegates to specialist pool
│   ├── db-specialist
│   ├── ui-specialist
│   └── ruflo-critical / ruflo-high / ruflo-medium
├── Runs council mode for hard decisions
└── Writes updated wisdom back to disk
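The tree says the orchestrator runs every 120 seconds, but it does not show what happens when one cycle is still working as the next one comes due. Below is a minimal sketch of an overlap guard, assuming a long-lived process driven by a timer. `createCycleRunner` and `tick` are hypothetical names for illustration, not part of the God Agent code.

```typescript
// Hypothetical overlap guard: names here are illustrative, not from the God Agent source.
type CycleStep = () => Promise<void>;

function createCycleRunner(step: CycleStep) {
  let running = false; // true while a cycle is in flight
  let skipped = 0;     // ticks dropped because the previous cycle was still running

  return {
    // Runs one cycle; resolves to false if the tick was skipped.
    async tick(): Promise<boolean> {
      if (running) {
        skipped += 1;
        return false;
      }
      running = true;
      try {
        await step();
        return true;
      } finally {
        running = false;
      }
    },
    get skipped(): number {
      return skipped;
    },
  };
}

// Usage: drive it from any scheduler, for example a 120-second timer.
// const runner = createCycleRunner(async () => { /* assess, delegate, write wisdom */ });
// setInterval(() => void runner.tick(), 120_000);
```

Skipping a tick rather than queueing it keeps a slow cycle from piling up work behind it. The skipped counter is there so that a run of slow cycles shows up in monitoring.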

The Wisdom System

This is the part people underestimate. Without persistence, every agent run starts from zero. The wisdom system is a JSON file that survives restarts, deployments, and crashes.

// lib/wisdom.ts
import { readFileSync, writeFileSync, existsSync } from 'fs';

interface WisdomEntry {
  lesson: string;
  context: string;
  timestamp: string;
  successRate: number;
  tags: string[];
}

interface GodWisdom {
  version: number;
  lastUpdated: string;
  lessons: WisdomEntry[];
  failurePatterns: Record<string, number>;
  successfulStrategies: string[];
}

export function loadWisdom(path = './god-wisdom.json'): GodWisdom {
  if (!existsSync(path)) {
    return {
      version: 1,
      lastUpdated: new Date().toISOString(),
      lessons: [],
      failurePatterns: {},
      successfulStrategies: []
    };
  }
  return JSON.parse(readFileSync(path, 'utf-8'));
}

export function appendWisdom(
  wisdom: GodWisdom,
  entry: WisdomEntry,
  path = './god-wisdom.json' // same default as loadWisdom, so both functions target one file
): GodWisdom {
  const updated = {
    ...wisdom,
    lastUpdated: new Date().toISOString(),
    lessons: [...wisdom.lessons.slice(-99), entry] // rolling 100-entry window
  };
  writeFileSync(path, JSON.stringify(updated, null, 2));
  return updated;
}

The rolling 100-entry window matters. Without it, the wisdom file grows without bound, and any prompt that pulls in the full lesson list would eventually outgrow Claude's context window.
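For the file to survive a crash, the write itself has to be safe. A process killed halfway through `writeFileSync` can leave truncated JSON behind. One common approach, sketched here as an assumption rather than as what the God Agent does, is to write to a temporary file and rename it over the target, which replaces the file in a single step on POSIX filesystems. `saveWisdomAtomic`, `loadWisdomOrDefault`, and the trimmed `WisdomFile` type are hypothetical names.

```typescript
import { existsSync, mkdtempSync, readFileSync, renameSync, writeFileSync } from 'fs';
import { tmpdir } from 'os';
import { join } from 'path';

// Trimmed-down stand-in for the article's GodWisdom type.
interface WisdomFile {
  version: number;
  lastUpdated: string;
  lessons: unknown[];
}

// Write to a sibling temp file, then rename it over the target, so a crash
// mid-write leaves the previous file intact instead of a half-written one.
function saveWisdomAtomic(path: string, wisdom: WisdomFile): void {
  const tmpPath = `${path}.tmp`;
  writeFileSync(tmpPath, JSON.stringify(wisdom, null, 2));
  renameSync(tmpPath, path);
}

// Fall back to a fresh state when the file is missing or is not valid JSON.
function loadWisdomOrDefault(path: string, fallback: WisdomFile): WisdomFile {
  if (!existsSync(path)) return fallback;
  try {
    return JSON.parse(readFileSync(path, 'utf-8')) as WisdomFile;
  } catch {
    return fallback;
  }
}

// Usage against a throwaway directory.
const demoDir = mkdtempSync(join(tmpdir(), 'wisdom-demo-'));
const demoFile = join(demoDir, 'god-wisdom.json');
const emptyWisdom: WisdomFile = { version: 1, lastUpdated: new Date().toISOString(), lessons: [] };

saveWisdomAtomic(demoFile, { ...emptyWisdom, lessons: ['retry migrations once'] });
const reloaded = loadWisdomOrDefault(demoFile, emptyWisdom);
```

The tolerant loader is the other half of the idea: if the file is ever unreadable, the agent starts from an empty state instead of crashing on startup.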

Task Classification and Routing

When the God Agent wakes up, it pulls unprocessed tasks from Supabase and classifies each one before routing:

// agents/god-agent.mjs
import Anthropic from '@anthropic-ai/sdk';
import { createClient } from '@supabase/supabase-js';
import { loadWisdom, appendWisdom } from '../lib/wisdom.js';

const client = new Anthropic();
const supabase = createClient(process.env.SUPABASE_URL, process.env.SUPABASE_SERVICE_KEY);

async function classifyTask(task, wisdom) {
  const recentLessons = wisdom.lessons
    .filter(l => l.tags.includes('classification'))
    .slice(-5)
    .map(l => l.lesson)
    .join('\n');

  const response = await client.messages.create({
    model: 'claude-sonnet-4-5',
    max_tokens: 256,
    messages: [{
      role: 'user',
      content: `Classify this task into exactly one category: db, ui, infra, analysis.

Task: ${task.description}
Priority: ${task.priority}

Recent classification lessons:
${recentLessons || 'None yet.'}

Respond with JSON only: {"category": "db|ui|infra|analysis", "reasoning": "..."}`
    }]
  });

  return JSON.parse(response.content[0].text);
}

async function routeToSpecialist(category, task, wisdom) {
  const specialistMap = {
    db: 'db-specialist',
    ui: 'ui-specialist',
    infra: task.priority === 'critical' ? 'ruflo-critical' : 
           task.priority === 'high' ? 'ruflo-high' : 'ruflo-medium',
    analysis: task.priority === 'critical' ? 'ruflo-critical' : 'ruflo-medium'
  };

  const specialist = specialistMap[category] || 'ruflo-medium';
  return runSpecialist(specialist, task, wisdom);
}
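`classifyTask` passes the model's reply straight to `JSON.parse`, which throws if the reply is wrapped in a code fence or carries extra prose. A more defensive parser might look like the sketch below. `parseClassification` is a hypothetical helper, and returning `null` on bad input is an assumption about how the caller wants to handle failures.

```typescript
type Category = 'db' | 'ui' | 'infra' | 'analysis';

interface Classification {
  category: Category;
  reasoning: string;
}

const CATEGORIES: readonly string[] = ['db', 'ui', 'infra', 'analysis'];

// Hypothetical helper: extracts and validates the classification JSON from a model reply.
function parseClassification(raw: string): Classification | null {
  // Take the outermost {...} span so code fences or prose around the JSON are ignored.
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start === -1 || end <= start) return null;

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw.slice(start, end + 1));
  } catch {
    return null;
  }
  if (typeof parsed !== 'object' || parsed === null) return null;

  const { category, reasoning } = parsed as Record<string, unknown>;
  // Reject categories outside the four the prompt allows.
  if (typeof category !== 'string' || !CATEGORIES.includes(category)) return null;

  return {
    category: category as Category,
    reasoning: typeof reasoning === 'string' ? reasoning : '',
  };
}
```

A `null` result can then be routed to the same `ruflo-medium` default that `routeToSpecialist` already uses for unknown categories.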

The ruflo agents handle infrastructure and analysis tasks. The naming comes from our internal project — what matters is the tiering. Critical tasks get more capable (and more expensive) agent configurations with higher token limits and more aggressive retry logic.
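`routeToSpecialist` calls a `runSpecialist` function that the listing does not show. Below is one way the tiering and dispatch could fit together, as a sketch under stated assumptions. `pickSpecialist` and `createDispatcher` are hypothetical names, and the registry of handlers is an illustration, not the project's actual wiring.

```typescript
interface Task {
  description: string;
  priority: string;
}

// A specialist takes a task plus the current wisdom and resolves to its text output.
type Specialist = (task: Task, wisdom: unknown) => Promise<string>;

// Same tiering as specialistMap in the article, written as a pure function.
function pickSpecialist(category: string, priority: string): string {
  switch (category) {
    case 'db':
      return 'db-specialist';
    case 'ui':
      return 'ui-specialist';
    case 'infra':
      if (priority === 'critical') return 'ruflo-critical';
      return priority === 'high' ? 'ruflo-high' : 'ruflo-medium';
    case 'analysis':
      return priority === 'critical' ? 'ruflo-critical' : 'ruflo-medium';
    default:
      return 'ruflo-medium'; // unknown categories fall back to the cheapest tier
  }
}

// Hypothetical dispatcher: looks a specialist up by name and fails loudly if it is missing.
function createDispatcher(registry: Map<string, Specialist>) {
  return async function runSpecialist(name: string, task: Task, wisdom: unknown): Promise<string> {
    const handler = registry.get(name);
    if (!handler) {
      throw new Error(`No specialist registered as "${name}"`);
    }
    return handler(task, wisdom);
  };
}
```

Keeping the tier selection separate from the dispatch makes the routing rules testable without calling the API.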

The Specialist Agents

Each specialist is a focused Claude instance with a domain-specific system prompt and its own cost envelope:

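// agents/specialists/db-specialist.mjs
import Anthropic from '@anthropic-ai/sdk';

// Reads ANTHROPIC_API_KEY from the environment by default.
const client = new Anthropic();

export async function runDbSpecialist(task, wisdom, costTracker) {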
  const budget = costTracker.remainingBudget('db-specialist');
  if (budget < 0.05) {
    throw new Error('DB specialist daily budget exhausted');
  }

  const dbLessons = wisdom.lessons
    .filter(l => l.tags.includes('database'))
    .slice(-10);

  const response = await client.messages.create({
    model: 'claude-sonnet-4-5',
    max_tokens: 2048,
    system: `You are a Supabase/PostgreSQL specialist. You write migrations, 
    optimize queries, and fix schema issues. You never drop tables without 
    explicit confirmation. You prefer additive changes.

    Known patterns from experience:
    ${dbLessons.map(l => `- ${l.lesson}`).join('\n') || '- None yet.'}`,
    messages: [{ role: 'user', content: task.description }]
  });

  costTracker.record('db-specialist', response.usage);
  return response.content[0].text;
}

Cost Tracking and Credit Exhaustion Detection

This is non-negotiable in production. Claude costs real money, and an agent that loops without limits will drain your credits overnight.


// lib/cost-tracker.ts
interface UsageRecord {
  agent: string;
  inputTokens: number;
  outputTokens: number;
  timestamp: string;
  estimatedCost: number;
}

const PRICING = {
  'claude-sonnet-4-5': { input: 0.000003, output: 0.000015 }
};

export class CostTracker {
  private records: UsageRecord[] = [];
  private dailyCap: number;
  private perTaskLimit: number;

  constructor(dailyCap = 5.00, perTaskLimit = 0.50) {
    this.dailyCap = dailyCap;
    this.perTaskLimit = perTaskLimit;
  }

  record(agent: string, usage: { input_tokens: number; output_tokens: number }) {
    const rate = PRICING['claude-sonnet-4-5'];
    const cost = (usage.input_tokens * rate.input) + (usage.output_tokens * rate.output);

    this.records.push({
      agent,
      inputTokens: usage.input_tokens,
      outputTokens: usage.output_tokens,
      timestamp: new Date().toISOString(),
      estimatedCost: cost
    });
  }

  // ...
}
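The listing never shows `remainingBudget`, which `runDbSpecialist` calls before each request. Below is a minimal sketch of how such a check could work. It assumes the daily cap applies to each agent separately and resets at UTC midnight, and neither assumption is confirmed by the article. `BudgetTracker` is a hypothetical stand-in for the relevant part of `CostTracker`.

```typescript
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

// USD per token, taken from the article's PRICING table.
const SONNET_RATE = { input: 0.000003, output: 0.000015 };

// Assumption: the daily cap applies to each agent separately and resets at UTC midnight.
class BudgetTracker {
  private spend: { agent: string; day: string; cost: number }[] = [];
  private dailyCap: number;

  constructor(dailyCap = 5.0) {
    this.dailyCap = dailyCap;
  }

  // Records one API call and returns its estimated cost in USD.
  record(agent: string, usage: Usage, now: Date = new Date()): number {
    const cost = usage.input_tokens * SONNET_RATE.input + usage.output_tokens * SONNET_RATE.output;
    this.spend.push({ agent, day: now.toISOString().slice(0, 10), cost });
    return cost;
  }

  // USD left for this agent on the given day; never negative.
  remainingBudget(agent: string, now: Date = new Date()): number {
    const today = now.toISOString().slice(0, 10);
    const spent = this.spend
      .filter(s => s.agent === agent && s.day === today)
      .reduce((sum, s) => sum + s.cost, 0);
    return Math.max(0, this.dailyCap - spent);
  }
}

// Usage with a fixed clock so the arithmetic is easy to follow.
const noon = new Date('2025-01-15T12:00:00Z');
const tracker = new BudgetTracker(5.0);
// 200,000 * $0.000003 + 20,000 * $0.000015 = $0.60 + $0.30 = $0.90
tracker.record('db-specialist', { input_tokens: 200_000, output_tokens: 20_000 }, noon);
const left = tracker.remainingBudget('db-specialist', noon); // about $4.10
```

With this shape, the `budget < 0.05` guard in `runDbSpecialist` stops new calls once less than five cents of the day's allowance remains.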
