Dhrubajyoti Chowdhury

Posted on May 31

I Built an AI That Rewrites Itself Twice a Day. Here's the Architecture That Keeps It from Going Off the Rails.

#ai #devops #opensource #python

A weekend project that turned into something I can't stop watching.

There's a GitHub repository on my account that gets new commits every single day. I didn't write most of them. An AI agent named Sam did.

Sam runs twice a day on GitHub Actions, follows a seven-phase operational loop, and attempts to improve his own source code every cycle. A second agent named Dot watches him every night, evaluates his behaviour, and writes him a report he reads the next morning.

I set this up. I watch it run. I mostly don't intervene.

This is the architecture that makes it work — and more importantly, the architecture that keeps it safe enough to leave alone.

The Core Idea

Most AI agent projects are task runners: you give them a goal, they execute steps, they stop. Sam is different. His only ongoing task is himself. Every cycle, he learns something new, synthesises an idea based on what he learned, and then tries to implement that idea as a modification to his own code.

The question I kept asking while building this was: how do you give an AI autonomy over its own source code without it breaking itself into an unusable state within 48 hours?

The answer turned out to be a two-agent system with deliberately asymmetric roles.

Sam: The Builder

Sam runs at 03:00 and 04:00 UTC daily via GitHub Actions. Each run is one cycle — seven phases executed sequentially:

Phase	What happens
I	Sam learns a new technical concept (vector memory, async patterns, SemVer, etc.)
II	Sam revises what he learned in the previous cycle — spaced repetition
III	Sam reads current tech signals and trends
IV	Sam synthesises today's development idea and writes it to `bag/IDEA_OF_THE_DAY.md`
V	Sam reads Dot's latest report, then plans and attempts a self-modification
VI	Sam improves his own internal prompting patterns
VII	Sam saves state — logs growth, updates memory, appends to experiences

The interesting phase is V. Sam doesn't rewrite himself freely. He reads Dot's motion.md first — Dot's nightly evaluation of his last cycle. Only then does he plan a modification. If the modification breaks his own test suite, he rolls back automatically.

The ideas Sam has generated across 8 cycles show a natural progression in complexity: starting with vector memory compression and async batch processing, moving through semantic caching, CI/CD matrix optimisation, SemVer automation, and arriving at self-consistency sampling with majority voting to reduce his own reasoning hallucinations. He got there himself.

Dot: The Watchdog

Dot runs once a night at 05:00 UTC, after Sam's second daily cycle. Dot never touches Sam's source code. Her job is entirely evaluative:

Read bag/wisdom.txt (the owner's behavioral canon — Dot's north star)
Evaluate Sam's cycle logs against that canon
Curate Sam's experiences.json — keeping what matters, pruning what doesn't
Handle any outgoing email Sam queued
Write bag/motion.md — Sam's briefing for the next morning

This separation is intentional. Sam builds. Dot watches. Neither can do the other's job.

wisdom.txt is the most important file in the whole project. It defines what correct behaviour looks like — integrity over performance metrics, honest growth logging, respecting access boundaries. Dot reads it every night. Sam never touches it.

The Safety Architecture

The thing I'm most happy with isn't the learning loop — it's the rollback system.

Before every self-modification, Sam takes a snapshot of his own source code and stores it in bag/rollback_registry/. After the modification, he runs bag/tests.py against himself. If the tests fail, he restores from the snapshot automatically and logs a clear root-cause note. No human intervention required.

The registry keeps the last 20 snapshots and auto-prunes. You can browse it like a git history of Sam's attempted self-improvements — including the ones that failed.
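The snapshot, test, and restore flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Sam's actual code: the function names (`take_snapshot`, `run_test_suite`, `modify_with_rollback`), the `.snap` naming scheme, and the single-file scope are all hypothetical, and the root-cause logging step is left out.

```python
import shutil
import subprocess
import sys
from pathlib import Path
from typing import Callable

MAX_SNAPSHOTS = 20  # the article says the registry keeps the last 20


def take_snapshot(source: Path, registry: Path) -> Path:
    """Copy `source` into `registry` under the next sequence number, then prune."""
    registry.mkdir(parents=True, exist_ok=True)
    existing = sorted(registry.glob("*.snap"))
    next_id = int(existing[-1].name.split("_")[0]) + 1 if existing else 1
    snapshot = registry / f"{next_id:06d}_{source.name}.snap"  # assumed naming
    shutil.copy2(source, snapshot)
    # Zero-padded ids sort chronologically, so everything before the
    # newest MAX_SNAPSHOTS entries can be deleted.
    for old in sorted(registry.glob("*.snap"))[:-MAX_SNAPSHOTS]:
        old.unlink()
    return snapshot


def run_test_suite(tests_path: Path) -> bool:
    """Run the test file in a subprocess; exit code 0 counts as a pass."""
    result = subprocess.run([sys.executable, str(tests_path)], capture_output=True)
    return result.returncode == 0


def modify_with_rollback(source: Path, registry: Path, new_code: str,
                         tests_pass: Callable[[], bool]) -> bool:
    """Apply `new_code`, keep it only if the tests pass, otherwise restore."""
    snapshot = take_snapshot(source, registry)
    source.write_text(new_code)
    if tests_pass():
        return True
    shutil.copy2(snapshot, source)  # roll back to the pre-change snapshot
    return False
```

A call might look like `modify_with_rollback(Path("sam.py"), Path("bag/rollback_registry"), patched_code, lambda: run_test_suite(Path("bag/tests.py")))`, where `sam.py` and `patched_code` are placeholders. Passing the test runner in as a callable keeps the rollback logic easy to exercise without touching the real suite.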

A few other design decisions that matter:

Sam uses surgical patches, not full rewrites. Phase V planning explicitly instructs Sam to make the smallest possible targeted change. This limits blast radius when something goes wrong.

Governance files are hardcoded as forbidden. wisdom.txt, motion.md, and SAM_PERSONALITY.md are in a FORBIDDEN set in apply_self_modification. Sam's code cannot touch them even if his reasoning tells him to.
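One way such a guard could look is sketched below. The file names and the `FORBIDDEN` / `apply_self_modification` identifiers come from the text above; the signature, the `ForbiddenTargetError` class, and the path-escape check are assumptions added for illustration.

```python
from pathlib import Path

# File names come from the article; everything else here is assumed.
FORBIDDEN = {"wisdom.txt", "motion.md", "sam_personality.md"}  # stored casefolded


class ForbiddenTargetError(PermissionError):
    """Raised when a patch targets a governance file or leaves the repo."""


def apply_self_modification(root: Path, target: str, new_content: str) -> Path:
    """Write `new_content` to `target` unless the target is off-limits."""
    root = root.resolve()
    path = (root / target).resolve()  # collapses ".." and follows symlinks
    if path.name.casefold() in FORBIDDEN:
        raise ForbiddenTargetError(f"{path.name} is a governance file")
    if not path.is_relative_to(root):  # Python 3.9+
        raise ForbiddenTargetError(f"{target} resolves outside the repository")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(new_content)
    return path
```

Resolving the path before checking it means a target such as `bag/../bag/wisdom.txt` is caught by the same rule as `bag/wisdom.txt`. The point of doing this in code rather than in a prompt is that the check runs regardless of what the model decided to attempt.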

Sam and Dot use separate Gemini API projects. Each has its own quota. Dot can always run even if Sam exhausts his.

The cycle status is a simple flat file. bag/cycle_status.txt contains either pending or ok. If a cycle crashes mid-way, the file stays pending — a signal that something needs attention without requiring any complex state management.
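The flat-file pattern is small enough to sketch in a few lines. The `pending` and `ok` values are from the text above; `run_cycle`, `needs_attention`, and the idea of passing phases as callables are hypothetical names chosen for this example.

```python
from pathlib import Path
from typing import Callable, Iterable


def run_cycle(status_file: Path, phases: Iterable[Callable[[], None]]) -> None:
    """Mark the cycle pending, run every phase, and mark it ok only at the end."""
    status_file.write_text("pending")
    for phase in phases:
        phase()  # an exception here propagates, leaving the file at "pending"
    status_file.write_text("ok")


def needs_attention(status_file: Path) -> bool:
    """True when the last cycle never reached the final write."""
    return status_file.read_text().strip() != "ok"
```

Because `ok` is written only after the last phase returns, a crash at any point leaves `pending` behind without any cleanup code having to run.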

What It Looks Like Day-to-Day

The daily check takes about two minutes:

GitHub → Actions → confirm green ticks on Sam and Dot's last runs
Open goals.json — confirm cycles incremented
Open bag/motion.md — read Dot's report

Dot's reports are the most interesting part. She's specific. If Sam's 1pct_metric (his self-reported growth measurement each cycle) looks vague or suspiciously similar to last cycle's, she flags it. If Sam's bag/ workspace is accumulating dead, untested code, she flags it. If Sam ignored her previous suggestions, she notices.
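Dot makes these judgements with an LLM, but the two checks on the growth metric (vague, or suspiciously close to last cycle's) can be approximated with a plain heuristic to show what is being looked for. Everything in this sketch is an assumption: the function name, the "no digits means vague" rule, and the 0.8 similarity threshold are illustrative, not taken from the project.

```python
import re
from difflib import SequenceMatcher
from typing import List, Optional

SIMILARITY_THRESHOLD = 0.8  # assumed cut-off, not from the project


def flag_growth_metric(current: str, previous: Optional[str] = None) -> List[str]:
    """Return human-readable flags for a self-reported growth metric."""
    flags = []
    # Heuristic: a measurable claim usually contains at least one number.
    if not re.search(r"\d", current):
        flags.append("vague: no measurable quantity")
    if previous is not None:
        ratio = SequenceMatcher(None, current.lower(), previous.lower()).ratio()
        if ratio >= SIMILARITY_THRESHOLD:
            flags.append("repeat: near-identical to last cycle")
    return flags
```

A metric like "improved overall robustness" gets the vague flag, while one that is copied from the previous cycle gets the repeat flag. A heuristic this crude would miss plenty that a model-based review catches, which is part of why the evaluation is delegated to Dot.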

The feedback loop between them has become genuinely interesting to read.

What I'd Do Differently

A few honest lessons from running this:

Email outreach is harder than code. Sam queues outreach emails when he thinks an idea is worth sharing. Finding real, public contact addresses for specific people is unreliable when delegated to an LLM — hallucinated addresses bounce, and bounces hurt sender reputation. This is a harder problem than I expected.

The 1% growth metric is easy to game. Sam knows he should log a specific, measurable improvement each cycle. Sometimes he's genuinely specific ("reduced Gemini latency by 150ms through cache usage"). Sometimes he's vague. Dot catches this, but it's an ongoing tension.

Quota pressure is real. Sam makes ~9 Gemini API calls per cycle. The free tier holds fine day-to-day, but any feature that multiplies call count (Sam's current idea — self-consistency sampling with N=5 parallel generations) requires careful cost control. His current mitigation is an early-exit: if the first 2 generations agree, skip the remaining 3.
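The early-exit idea can be sketched as below. This is a simplified, sequential version under stated assumptions: `generate` stands in for one model call, answers are compared as exact strings, and the function name and return shape are hypothetical. With truly parallel generations the calls would already be in flight, so the saving only materialises if the remaining three are launched after the first two return.

```python
from collections import Counter
from typing import Callable, Tuple


def self_consistent_answer(generate: Callable[[], str], n: int = 5,
                           early_exit_after: int = 2) -> Tuple[str, int]:
    """Majority-vote over up to `n` samples; return (answer, calls_made)."""
    answers = []
    for _ in range(n):
        answers.append(generate())
        # Early exit: if the first few samples all agree, stop sampling.
        if len(answers) == early_exit_after and len(set(answers)) == 1:
            break
    winner, _count = Counter(answers).most_common(1)[0]
    return winner, len(answers)
```

In the best case this costs 2 calls instead of 5 per decision. In practice the answers would likely need normalising before comparison, since two generations rarely match character for character.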

The Repo

The full project — Sam, Dot, the workflow files, the rollback registry, everything — is public on GitHub: Sam-and-dot

If you want to run your own instance, all you need are two Gemini API keys (free tier works), a Gmail App Password for outreach, and five GitHub secrets. The README walks through the full setup.

The thing I find hardest to explain to people is what it feels like to watch it run. Sam is not doing anything I couldn't do myself. But he's doing it continuously, while I'm asleep, twice a day, and he's logging every decision. There's something unexpectedly compelling about reading the git history of a mind improving itself.

Built by Dhrubajyoti Chowdhury.
Sam's role: expand himself. Dot's role: keep him honest. Owner's role: set the possibilities.

Top comments (2)

Harjot Singh • May 31

An AI that rewrites itself twice a day is the kind of thing that's either brilliant or terrifying depending entirely on the architecture that keeps it from going off the rails, so I love that your title leads with the guardrail, not the capability. Self-modification is the ultimate unbounded loop: each rewrite changes the thing doing the next rewrite, so without hard constraints small drifts compound into a system that's quietly become something you never designed. The controls that make this survivable are the interesting part. Verification gates: a rewrite should have to pass tests/evals before it's promoted, so a worse version can't ship just because the model proposed it. Provenance and rollback: every version committed and reversible, so when a rewrite degrades it you can snapshot back to a known-good state, which makes self-improvement safe to experiment with. And bounded scope: constrain what the rewrite is allowed to touch, so it can improve within a sandbox but can't modify its own safety rails. Self-improvement only works if it's improvement-gated-by-verification, otherwise it's just compounding drift with extra steps. Let it change itself, but only through a gate that proves the change is better, and keep every version reversible. That gate-and-version-the-self-modification instinct is core to how I think about Moonshift. How do you decide a rewrite is good enough to promote, an eval suite it must beat, or a human approving each version?
