DEV Community: Shridhar Shah

Agents Are Learning to Write Their Own SKILL.md Files

Shridhar Shah — Sat, 27 Jun 2026 21:53:22 +0000

The Agent Skills open standard today, and the 2026 research on agents that write their own skills.

TL;DR: I've been using skills in Claude Code daily, and one question stuck with me: what happens when the agent writes them itself? Quick background: "Agent Skills" (late 2025) are a dead-simple way to teach an agent a task — a folder with a SKILL.md file of Markdown instructions, now an open standard. The wild part is what's coming next: agents that write their own skills. I built a demo where an agent solves a task the hard way once, saves a real SKILL.md, and then reuses it — cutting its total effort almost in half. No API key.

First, what's a "skill"?

If you've used Claude Code or similar tools lately, you've probably seen SKILL.md files. The idea is refreshingly low-tech. A "skill" is just a folder with a Markdown file that says how to do something:

---
name: csv-to-markdown
description: Turn comma-separated text into a Markdown table. Use when the input looks
  like CSV and the user wants a table.
---

# CSV to Markdown

## Instructions
Split the text into rows on newlines and columns on commas. Make the first row the
header, add a `---` divider row, then format every row as `| a | b | c |`.

That's it. No SDK, no config. Anthropic introduced this in October 2025 and then published it as an open standard (agentskills.io) in December 2025, so the same skill folder now works across 30+ agent tools (Claude Code, Cursor, Copilot, and more).

The full rules are short (agentskills.io/specification): the only required fields are name (1–64 chars, lowercase-with-hyphens, and it must match the folder name) and description (≤1024 chars, saying what it does and when to use it). Everything else — license, metadata, compatibility, allowed-tools — is optional. That's the whole spec. The SKILL.md files my demo writes follow it to the letter, so they'd load unmodified in any compatible CLI.
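Here's a minimal sketch of what checking those two required fields could look like. The function name `validate_skill` and the regex are my own assumptions: the pattern is one reading of "lowercase-with-hyphens," and the official spec may be stricter or looser in the details.

```python
import re

# Assumption: "lowercase-with-hyphens" means lowercase letters or digits,
# joined by single hyphens. Check the official spec for the exact rule.
NAME_PATTERN = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")


def validate_skill(folder_name: str, name: str, description: str) -> list[str]:
    """Return a list of problems; an empty list means both fields look valid."""
    problems = []
    if not 1 <= len(name) <= 64:
        problems.append("name must be 1-64 characters")
    if not NAME_PATTERN.fullmatch(name):
        problems.append("name must be lowercase words joined by hyphens")
    if name != folder_name:
        problems.append("name must match the folder name")
    if not description.strip():
        problems.append("description is required")
    if len(description) > 1024:
        problems.append("description must be at most 1024 characters")
    return problems


print(validate_skill(
    "csv-to-markdown",
    "csv-to-markdown",
    "Turn comma-separated text into a Markdown table.",
))
```

An empty list means the skill passes these two checks; anything else tells you which rule it broke.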

The clever trick: progressive disclosure

Here's the smart part. If you just dumped 50 skills' worth of instructions into the agent's context, you'd fill it up and leave no room for actual work. So skills load in stages:

Always loaded: just the name and one-line description of every skill (tiny).
Loaded only when it matches: the full instructions, once a task actually needs them.
Loaded only if referenced: extra files or scripts the skill bundles.

So the agent can have hundreds of skills installed and barely pay for it — it only reads the short descriptions until one matches, then pulls in the details. My demo shows the math: to use 1 skill out of 3 installed, loading everything costs ~1500 "tokens"; the SKILL.md way costs ~560. That gap gets huge as your library grows.

This is also why people say skills and MCP are teammates, not rivals: MCP is how an agent connects to tools; a skill is how an agent knows the procedure for using them.
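The staged loading described above can be sketched in a few lines. This is not the code from my demo: the `Skill` class, the word-count "tokenizer," and the skill sizes are all made up for illustration, and the third stage (bundled files) is left out.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str  # stage 1: always in context
    body: str         # stage 2: loaded only when the skill is selected


def tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


def cost_load_everything(skills):
    # Naive approach: every description and every body sits in context.
    return sum(tokens(s.description) + tokens(s.body) for s in skills)


def cost_progressive(skills, needed_name):
    # Descriptions are always loaded; only the matching body is pulled in.
    always = sum(tokens(s.description) for s in skills)
    on_demand = sum(tokens(s.body) for s in skills if s.name == needed_name)
    return always + on_demand


# Hypothetical library: three skills, each with a 200-word body.
skills = [
    Skill("csv-to-markdown", "Turn CSV text into a Markdown table.", "step " * 200),
    Skill("slugify", "Turn a title into a URL slug.", "step " * 200),
    Skill("extract-emails", "Pull email addresses out of text.", "step " * 200),
]

print(cost_load_everything(skills), cost_progressive(skills, "slugify"))
```

The exact numbers depend entirely on the sizes you plug in, but the shape is the point: the first cost grows with every installed skill's full body, the second only with the short descriptions.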

The frontier: agents that write their own skills

Today, humans write SKILL.md files. The 2026 research is about agents that write their own — and get better over time as their skill library grows. This goes back to Voyager (2023), an agent that played Minecraft and saved working code as reusable skills, getting dramatically faster at the game. The new wave makes it general:

MUSE-Autoskill (2026) treats a skill as a living asset with a full lifecycle — create it, give it its own memory file, manage it, test it, and refine it. Each skill even keeps a .memory.md of notes about itself.
Memento-Skills (2026) stores skills as Markdown files that double as the agent's evolving memory, and turns task failures into new skills automatically.
Skill-Pro (2026) defines a skill as "when to use it + how to do it + when to stop," and only keeps a new skill if it passes a quality gate — so the library improves instead of filling up with junk.

The common thread: solve it once, save the recipe, reuse it forever — and let the collection get smarter on its own.

📄 The "this is the future" link: Anthropic's own writeup, Equipping agents for the real world with Agent Skills, and the open standard at agentskills.io. For the research direction, MUSE-Autoskill (arXiv:2605.27366) and Skill-Pro (arXiv:2602.01869) are the clearest reads on agents that grow their own skill libraries.

You can do this today in the Claude Code CLI

This isn't theoretical — the exact pattern from my demo already ships in coding CLIs. In Claude Code, a skill is just a folder under .claude/skills/ in your repo:

# Anywhere in your project — drop a skill in and the CLI auto-discovers it
mkdir -p .claude/skills/csv-to-markdown
$EDITOR .claude/skills/csv-to-markdown/SKILL.md   # same SKILL.md format as my demo

Now the agent loads only that skill's one-line description until a task matches — then pulls in the full instructions (that's progressive disclosure doing its job). Type /skills inside the CLI to see what's loaded.

The best part: because it's an open standard, the same folder works unmodified across tools. You're not locked in:

Claude Code — Anthropic's CLI, where the format started.
opencode — a popular open-source terminal agent.
Goose — Block's open-source agent.
Plus Cursor, GitHub Copilot, and 30+ others.

Write the skill once, use it everywhere. The future bit my demo points at: instead of you hand-writing that file, the agent writes it for itself after solving the task the first time — and from then on, your repo quietly accumulates a library of skills your agent earned.

The 10-second version (my demo)

Same stream of 7 tasks. "Cost" is how much effort each one took.

	No-skills agent	Skill-writing agent
What it does	re-solves everything from scratch	learns a task once, saves a `SKILL.md`, reuses it
Total cost	35	19
Both correct?	7/7	7/7

[5] csv-to-markdown  learned it and wrote SKILL.md
[5] slugify          learned it and wrote SKILL.md
[1] csv-to-markdown  reused skill 'csv-to-markdown'   ← cheap now
[5] extract-emails   learned it and wrote SKILL.md
[1] slugify          reused skill 'slugify'
[1] csv-to-markdown  reused skill 'csv-to-markdown'
[1] extract-emails   reused skill 'extract-emails'

It writes real SKILL.md files into a ./skills folder you can open. The first time it sees a task it pays full price; after that, it finds its own saved skill and reuses it for cheap.
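The accounting behind that table can be sketched like this. It's a simplified stand-in, not the demo itself: the costs 5 and 1 come from the log above, while the function names and the in-memory `library` set (instead of real files in ./skills) are assumptions.

```python
FULL_COST = 5   # solving a task from scratch, as in the log above
REUSE_COST = 1  # following a skill that was already saved


def run_no_skills(tasks):
    # Re-solves every task from scratch, every time.
    return sum(FULL_COST for _ in tasks)


def run_skill_writing(tasks):
    library = set()  # stands in for the ./skills folder
    total = 0
    for task in tasks:
        if task in library:
            total += REUSE_COST
        else:
            total += FULL_COST
            library.add(task)  # the real demo writes a SKILL.md at this point
    return total


tasks = [
    "csv-to-markdown", "slugify", "csv-to-markdown", "extract-emails",
    "slugify", "csv-to-markdown", "extract-emails",
]

print(run_no_skills(tasks), run_skill_writing(tasks))  # 35 19
```

Three first-time tasks at 5 plus four reuses at 1 gives the 19 in the table, against 7 × 5 = 35 for the agent that never saves anything.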

Why this matters

Two big reasons engineers should care:

Agents stop repeating themselves. Right now most agents re-derive the same thing over and over, paying for it every time. A skill library means "figure it out once, then it's free" — like a teammate who writes things down instead of relearning them daily.
A whole new ecosystem. There are already 65,000+ shared skills and a scramble to build "the npm of agent skills" — registries and marketplaces where you install a skill like a package. Skills are becoming a unit of shareable expertise: a senior engineer's know-how, packaged in a folder, that any agent can pick up.

Tools tell an agent what it can do. Skills tell it how to do things well — and soon, agents will write that part themselves, and trade it with each other.

Try it

git clone https://github.com/Shridhar-2205/living-software
cd living-software/06-agent-skills
python demo.py
cat skills/csv-to-markdown/SKILL.md   # a skill the agent wrote itself

Honest note: this is a POC. Real systems decide when a new skill is worth saving, test it, and refine it over time (that's exactly what the 2026 papers above tackle). Mine keeps that part simple so the core idea — learn once, save a SKILL.md, reuse it — is easy to see.

The rest of the series — Toward Living Software

I built an AI agent that rewrites its own code
Do AI agents need to sleep?
Can an AI agent pass the Sally-Anne test?
An AI agent that gets curious on its own
How do you trust an AI agent with your money?
Agents that write their own SKILL.md files (you're reading it)

Shridhar Shah — Senior Software Engineer on the AI team at Cisco. Part 6 of Toward Living Software.

GitHub · LinkedIn

Sources: Anthropic, "Equipping agents for the real world with Agent Skills" (2025) and the Agent Skills open standard (agentskills.io); Voyager (arXiv:2305.16291); MUSE-Autoskill (arXiv:2605.27366); Memento-Skills (arXiv:2603.18743); Skill-Pro (arXiv:2602.01869); MemSkill (arXiv:2602.02474).

How Do You Trust an AI Agent With Your Money? You Don't — You Check Its Receipt

Shridhar Shah — Sat, 27 Jun 2026 21:53:20 +0000

Cryptographically verifiable agent behavior: swap, edit, or forge a step and it's rejected.

TL;DR: We're about to hand agents our refunds, our data, and our prod APIs — and that made me nervous enough to build this. Once agents do real things, "just trust it" stops being good enough. The fix: the agent hands you a tamper-proof receipt that proves it followed the approved rules and didn't fake anything. Change the rules, edit a step, or fake the signature, and the check fails every time. Normal everyday crypto, no API key.

The scary question

You're about to let an agent issue refunds, move files, or hit your production APIs. How do you actually know it followed the rules you approved — and not some changed version? And how do you know the log it gives you afterward wasn't edited?

Right now, the honest answer is usually: you don't. You trust the logs. But logs can be edited, the rules an agent runs can be quietly swapped, and a compromised agent can claim it did one thing while doing another.

The 2026 fix is called verifiable agent behavior (the closest research term is "zkML", short for zero-knowledge machine learning): the agent produces a tamper-proof receipt that proves it ran exactly the approved process, and anyone can check that receipt without having to trust the agent.

The 10-second version

What happened	Result
Agent ran the approved refund rules, honestly	✅ ACCEPT
Someone swapped in sneaky "refund anything" rules	🚨 REJECT — rules don't match the approved ones
Someone edited a step (turned a $40 refund into $5000)	🚨 REJECT — receipt doesn't add up
Someone faked the receipt without the secret key	🚨 REJECT — signature is invalid

Only the honest run passes. Every kind of cheating gets caught.

How it works (in plain terms)

Three normal building blocks, no magic:

A fingerprint of the approved rules. Run the rules through a hashing function and you get a short fingerprint that is unique for all practical purposes. Anyone can fingerprint the approved rules and compare: if the agent used different rules, the fingerprints won't match.
A receipt you can't edit. Every step the agent takes is chained together so each step depends on all the steps before it. Change any one step and the whole thing stops adding up — like a tamper-evident seal:

seal = fingerprint(rules)
for step in steps:
    seal = hash(seal + step)   # each step folds into the seal

A signature. The agent signs the final seal with a secret key. If someone tries to forge a receipt without that key, the signature won't check out.

To verify, you just redo all three and ask: Did it use the approved rules? Is the receipt intact? Is the signature real? All three have to pass.
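Here's a minimal runnable sketch of those three checks using only Python's standard library. A few assumptions to flag: the "signature" here is an HMAC-SHA256, which means the checker has to hold the same key (a public-key signature would let anyone verify without it); the function names and receipt layout are mine; and the key is an obvious fake.

```python
import hashlib
import hmac


def fingerprint(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def seal_steps(rules: str, steps: list[str]) -> str:
    seal = fingerprint(rules)
    for step in steps:
        seal = fingerprint(seal + step)  # each step folds into the seal
    return seal


def sign(seal: str, key: bytes) -> str:
    # Assumption: HMAC stands in for "sign with a secret key".
    return hmac.new(key, seal.encode("utf-8"), hashlib.sha256).hexdigest()


def make_receipt(rules: str, steps: list[str], key: bytes) -> dict:
    seal = seal_steps(rules, steps)
    return {
        "rules_fingerprint": fingerprint(rules),
        "steps": list(steps),
        "seal": seal,
        "signature": sign(seal, key),
    }


def verify(receipt: dict, approved_rules: str, key: bytes) -> str:
    if receipt["rules_fingerprint"] != fingerprint(approved_rules):
        return "REJECT: rules don't match the approved ones"
    if seal_steps(approved_rules, receipt["steps"]) != receipt["seal"]:
        return "REJECT: receipt doesn't add up"
    if not hmac.compare_digest(sign(receipt["seal"], key), receipt["signature"]):
        return "REJECT: signature is invalid"
    return "ACCEPT"


KEY = b"demo-key-not-a-real-secret"
RULES = "refund at most $50, and only with an order id"
honest = make_receipt(RULES, ["check order 17", "refund $40"], KEY)
print(verify(honest, RULES, KEY))  # ACCEPT
```

Swap the rules, edit a step in `honest["steps"]`, or build a receipt with a different key, and `verify` returns the matching REJECT line from the table above.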

Why this matters

Every other post in this series makes agents more independent — they rewrite their own code, sleep, model other people, get curious. This one is the safety net for all of that: independence without a way to check up on it is a liability.

The more power we hand to agents, the less we can afford to just trust them — and the more we need a way to check them.

The end goal of the real research is even stronger: prove an agent followed the approved rules without re-running it and without exposing any private data or secret model. That lets two companies trust each other's agents — yours proves it behaved, mine checks the proof, and neither of us has to reveal our secrets.

Try it

git clone https://github.com/Shridhar-2205/living-software
cd living-software/05-verifiable-agent
python demo.py

Honest note: the real research uses heavier cryptography so the checker doesn't have to re-run anything and never sees the secret model. My demo re-checks a signed, sealed receipt instead — much simpler, and it shows the same payoff (cheat in any way ⇒ rejected) so you can feel what "verifiable behavior" actually buys you. It uses only standard, modern hashing (SHA-256), and the "secret key" is an obvious fake, never a real credential.

Shridhar Shah, Senior Software Engineer on the AI team at Cisco. Part 5 of Toward Living Software.

Background: "zkML" / verifiable inference — proving an AI model ran exactly as claimed. See "Verifiable evaluations of machine learning models using zkSNARKs" (arXiv:2402.02675) and the survey "Zero-Knowledge Proof Based Verifiable Machine Learning" (arXiv:2502.18535). Tools like EZKL do this for real ONNX models today.

I Built an AI Agent That Gets Curious On Its Own

Shridhar Shah — Sat, 27 Jun 2026 21:43:35 +0000

Active inference: curiosity emerges for free from minimizing surprise — 48% vs 100% on a foraging task.

TL;DR: Most AI agents chase rewards — they pick whatever action scores the most points. I wanted to see what happens if you build one that just tries not to be surprised. Something neat happened — the agent became curious without being told to. It goes looking for information before acting, and that takes it from 48% to 100% on a simple task.

Two different ways to make decisions

Most AI agents are "reward chasers." Give them points for doing well, and they'll pick whatever action they expect to score highest. Simple and effective.

There's another idea from brain science: instead of chasing points, try to avoid being surprised — act so the world matches what you expected. It sounds almost too simple, but it leads to a surprising bonus: when you're trying not to be surprised, going and finding out what you don't know becomes valuable all by itself. In other words, curiosity isn't something you have to bolt on. It comes for free.

This is called active inference, and in 2026 it jumped from neuroscience into AI as a serious approach (see the 2026 paper listed under Background at the end of this post). Here's the smallest demo that makes it click.

The 10-second version

The task: a reward is hidden behind either the LEFT door or the RIGHT door (50/50). There's also a hint you can check that tells you which door — if you bother to look.

	❌ Reward-chaser	✅ Curious agent
What it cares about	getting the reward, right now	getting the reward + not being unsure
What it does	guesses a door	checks the hint first, then opens the right door
Success (400 tries)	48%	100%

Nobody told the second agent "go check the hint." It did it on its own, because being unsure bothered it.

How it works

Before acting, the agent scores each option on two things:

Does this get me closer to the reward?
Does this make me less unsure about what's going on?

value_of_checking_the_hint = how_unsure_am_i    # high when it's a total coin-flip
value_of_just_guessing     = chance_of_being_right  # only ~50% on a blind guess

if value_of_checking_the_hint > value_of_just_guessing:
    check_the_hint()     # this is where curiosity shows up
open(best_door)          # now actually go get the reward

When it's a total coin-flip, checking the hint is worth a lot (it removes all the doubt), way more than a 50/50 guess. So it looks first. Once it knows, there's nothing left to be unsure about, so it just grabs the reward. The reward-chaser never sees any value in the hint, so it flips a coin forever.
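To make that decision runnable, here's a small simulation sketch. It mirrors the simplified comparison above (uncertainty in bits versus the chance of a correct guess) rather than the full active-inference math, and it assumes the hint is perfectly reliable. The function names and the seed are my own choices, not taken from the demo repo.

```python
import math
import random


def entropy_bits(p_left: float) -> float:
    """Uncertainty about the door, in bits (1.0 for a 50/50 coin-flip)."""
    total = 0.0
    for p in (p_left, 1.0 - p_left):
        if p > 0:
            total -= p * math.log2(p)
    return total


def run_trial(curious: bool, rng: random.Random) -> bool:
    truth = rng.choice(["LEFT", "RIGHT"])
    p_left = 0.5  # prior belief: no idea which door

    # The reward-chaser sees no value in information; the curious agent does.
    value_of_hint = entropy_bits(p_left) if curious else 0.0
    value_of_guess = max(p_left, 1.0 - p_left)

    if value_of_hint > value_of_guess:
        # Assumption: the hint is always correct, so all doubt disappears.
        p_left = 1.0 if truth == "LEFT" else 0.0

    if p_left == 0.5:
        choice = rng.choice(["LEFT", "RIGHT"])  # blind guess
    else:
        choice = "LEFT" if p_left > 0.5 else "RIGHT"
    return choice == truth


def success_rate(curious: bool, trials: int = 400, seed: int = 0) -> float:
    rng = random.Random(seed)
    return sum(run_trial(curious, rng) for _ in range(trials)) / trials


print("reward-chaser:", success_rate(curious=False))
print("curious agent:", success_rate(curious=True))
```

The curious agent always checks first, so it never misses; the reward-chaser lands near 50%, with the exact figure depending on the random seed.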

Why this matters

Two reasons engineers should care:

Curiosity for free. A long-standing headache in AI is agents getting stuck doing the same thing, never trying anything new. People hand-tune "exploration bonuses" to force them to explore. This approach gives you curiosity automatically — the agent looks for info exactly when it's unsure, and stops once it isn't.
It handles surprises. An agent built to avoid surprises is built to deal with situations it wasn't trained for. When reality stops matching its expectations, closing that gap becomes its goal — so it keeps adapting instead of breaking.

A reward-chaser asks "what gets me the most points?" A surprise-avoider asks "what don't I understand yet?" — and that second question is what makes it adapt.

Try it

git clone https://github.com/Shridhar-2205/living-software
cd living-software/04-active-inference
python demo.py

Honest note: the full version of this idea has a fair bit of math behind it. I've boiled it down to the one decision that makes it obvious — being unsure has a cost — so you can watch curiosity appear in just a little code.

Shridhar Shah — Senior Software Engineer on the AI team at Cisco. Part 4 of Toward Living Software.

Background: Karl Friston's "Free Energy Principle" (the brain-science origin); "Active Inference as the Test-Time Scaling Law for Physical AI Agents" (arXiv:2606.22813).

Can an AI Agent Pass the Test We Give 4-Year-Olds?

Shridhar Shah — Sat, 27 Jun 2026 21:43:33 +0000

Theory of Mind and the Sally-Anne false-belief test, in plain Python.

TL;DR: There's a famous test that kids pass around age 4, and a lot of AI still trips on it — I had to build it to see where the line really is. It checks whether you understand that other people can believe things that aren't true. I built two AI agents: one that only knows "what's actually happening" (fails, like a toddler) and one that keeps track of what each person believes (passes). It's the foundation for agents that can actually work together.

The test

Sally puts her marble in the basket, then leaves the room.
While she's gone, Anne moves the marble to the box.
Sally comes back. Where will she look for her marble?

If you said basket, nice — you just used something called "theory of mind." Sally never saw the marble move, so in her head it's still in the basket. What's actually true (it's in the box) and what Sally believes (it's in the basket) are two different things, and you kept them separate without even thinking about it.

A 3-year-old says "box" — they can't yet separate what they know from what Sally knows. A 4-year-old says "basket." It's one of the most famous tests in child psychology, and in 2026 it's become a real test for AI agents too.

The 10-second version

	❌ Agent with no "theory of mind"	✅ Agent that models other minds
What it tracks	only what's actually true	what each person believes, separately
Where will Sally look?	"box"	"basket"
Result	FAIL (only knows reality)	PASS

How it works (the whole trick)

The only difference between the two agents is one rule: a person's belief only updates when that person is actually in the room to see it happen.

def someone_moves_the_marble(new_place, who_is_watching):
    for person in who_is_watching:        # only people in the room
        beliefs[person] = new_place        # update THEIR mental picture

So when Anne moves the marble while Sally is out, only Anne's mental picture updates. Sally's is frozen at "basket." Ask the simple agent and it just reports reality ("box"). Ask the smarter agent and it answers from Sally's point of view ("basket").

That's the whole thing. But keeping a separate picture of "what does each other person know" is the difference between an agent that's a good teammate and one that isn't.
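Here's a self-contained sketch of the two answers side by side. The class and method names are hypothetical, picked for this example; the belief-update rule is the same one shown above.

```python
class BeliefTracker:
    """Tracks where the marble really is and where each person thinks it is."""

    def __init__(self, people, start_place):
        self.reality = start_place
        # Everyone saw the marble go into its starting place.
        self.beliefs = {person: start_place for person in people}

    def move_marble(self, new_place, who_is_watching):
        self.reality = new_place
        for person in who_is_watching:         # only people in the room
            self.beliefs[person] = new_place   # update THEIR mental picture

    def where_is_it(self):
        # What the agent with no theory of mind answers: plain reality.
        return self.reality

    def where_will_look(self, person):
        # What the belief-tracking agent answers: that person's own picture.
        return self.beliefs[person]


world = BeliefTracker(["Sally", "Anne"], "basket")
world.move_marble("box", who_is_watching=["Anne"])  # Sally is out of the room

print(world.where_is_it())             # box
print(world.where_will_look("Sally"))  # basket
```

Both answers come from the same object; the only difference is whether you read from `reality` or from the per-person `beliefs`.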

Why this isn't just a cute puzzle

Almost everything useful about multiple agents (or an agent working with a human) needs this:

Handing off work: to delegate, I need to know what you already know.
Explaining things: I should tell you the part you're missing, not dump everything.
Warning someone: "Heads up, Sally still thinks the marble's in the basket" only works if I can track Sally's wrong belief.
Not causing chaos: an agent that assumes everyone knows what it knows will skip important info and make bad assumptions.

Most AI today reasons about the world. The 2026 shift is reasoning about the people in the world — including when they're wrong. That's what turns a smart tool into a real collaborator.

Being smart about the world makes a good tool. Being smart about other people makes a good teammate.

Try it

git clone https://github.com/Shridhar-2205/living-software
cd living-software/03-theory-of-mind
python demo.py

Honest note: real versions have to figure out what someone believes by watching their behavior, which is much harder. Here I just tell the agent who was in the room, so the core idea — track beliefs separately from reality — is as clear as possible.

Shridhar Shah — Senior Software Engineer on the AI team at Cisco. Part 3 of Toward Living Software.

Background: the Sally-Anne false-belief test (Baron-Cohen, Leslie & Frith, 1985); Kosinski, "Evaluating Large Language Models in Theory of Mind Tasks" (PNAS 2024 / arXiv:2302.02083); and a 2026 follow-up showing how brittle this still is — "Understanding Artificial Theory of Mind" (arXiv:2602.22072).

Do AI Agents Need to Sleep? I Built One That Does

Shridhar Shah — Sat, 27 Jun 2026 21:36:55 +0000

A sleep-like phase that consolidates noisy daily experience into durable memory — 75% vs 100% recall.

TL;DR: Everyone "fixes" AI memory by making the context window bigger. I wanted to try the opposite idea, so I built a demo of a 2026 research trend: giving an agent a "sleep" phase — time spent not answering questions, just tidying up what it learned that day. The agent that "sleeps" remembers 100% of what it learned. The exact same agent without sleep remembers only 75% and gets confused by bad info. Runs on a laptop.

The memory problem every AI app hits

If you've built anything with an LLM, you know the pain: the model only "remembers" what's in its current context window. Once the conversation gets long enough, the oldest stuff scrolls off the top and is just... gone. Forgotten.

The usual fix is "make the context window bigger." But that's like fixing a messy desk by buying a bigger desk. It's expensive, and the model still gets worse as you cram more in (a real, measured effect — more text in the window can actually lower accuracy).

Your brain doesn't work this way. You don't remember every sentence anyone said today. While you sleep, your brain replays the day, keeps the important bits as long-term memory, and dumps the rest. That's how you remember "I like coffee" without remembering every single cup.

A couple of 2026 papers ask the obvious question: Do Language Models Need Sleep? Their answer: giving an AI a quiet "offline" phase to consolidate memories makes it remember better. So I built the simplest version that shows why.

The 10-second version

	❌ Agent with no sleep	✅ Agent that sleeps
How it remembers	keeps only the last N messages	saves a tidy summary every night
After 30 noisy days	75% recall	100% recall
Tricked by bad info?	yes	no — it goes with what it saw most often

Same experiences, same noise, same memory test. The only difference is whether the agent sleeps.

How it works

Each "day," the agent hears facts like Alice → drinks → coffee. To make it realistic, about 1 in 5 facts is wrong (people misremember, logs have errors).

The no-sleep agent only keeps the last 10 things it heard. Anything older falls off the edge and is forgotten. And one bad recent day can flip its answer.
The sleeping agent does one extra thing each night: it goes back through the day, updates a small running tally of what it heard, and then clears out the raw log:

def sleep(self):
    for (person, fact, value) in self.todays_notes:
        self.memory[person][value] += 1   # add today's notes to the long-term tally
    self.todays_notes.clear()             # forget the raw firehose, keep the summary

That tiny step buys two things:

It doesn't forget. The summary sticks around even after the raw messages are gone.
It filters out bad info. Because it counts how often it heard each thing across many days, the occasional wrong fact gets outvoted by the truth.
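Both agents fit in one short, deterministic sketch. It's a simplified stand-in for the demo, with some assumptions: the no-sleep agent answers with the most recent matching message in its window, the sleeping agent keys its tally by (person, fact), and the four "days" below are hand-written rather than randomly generated.

```python
from collections import Counter, defaultdict, deque


class NoSleepAgent:
    def __init__(self, window=10):
        self.recent = deque(maxlen=window)  # older messages fall off the edge

    def hear(self, person, fact, value):
        self.recent.append((person, fact, value))

    def recall(self, person, fact):
        # Assumption: trust the most recent thing still in the window.
        for p, f, v in reversed(self.recent):
            if (p, f) == (person, fact):
                return v
        return None


class SleepingAgent:
    def __init__(self):
        self.todays_notes = []
        self.memory = defaultdict(Counter)  # (person, fact) -> tally of values

    def hear(self, person, fact, value):
        self.todays_notes.append((person, fact, value))

    def sleep(self):
        for person, fact, value in self.todays_notes:
            self.memory[(person, fact)][value] += 1
        self.todays_notes.clear()

    def recall(self, person, fact):
        tally = self.memory[(person, fact)]
        return tally.most_common(1)[0][0] if tally else None


days = [
    [("Alice", "drinks", "coffee")],
    [("Alice", "drinks", "coffee")],
    [("Alice", "drinks", "tea")],        # a noisy, wrong fact
    [("Bob", "drinks", "water")] * 10,   # later chatter fills the window
]

no_sleep, sleeper = NoSleepAgent(), SleepingAgent()
for day in days:
    for note in day:
        no_sleep.hear(*note)
        sleeper.hear(*note)
    sleeper.sleep()  # consolidate at the end of each day

print(no_sleep.recall("Alice", "drinks"))  # None
print(sleeper.recall("Alice", "drinks"))   # coffee
```

The no-sleep agent has lost Alice entirely by the end, and if you ask it right after day three it answers "tea." The sleeping agent outvotes the bad day two to one.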

Why this matters

Everyone's trying to fix AI memory by making the context window huge. But a bigger window is still just a bigger pile of raw text — expensive, and it still overflows.

Sleep is a smarter bet: do the cleanup when the agent is idle. Spend a little time while nobody's waiting to turn today's messy notes into a clean, permanent summary — so when someone does ask, the answer is fast, cheap, and correct. It's the same theme as an agent that improves its own code: get better while you run, not just when a human retrains you.

The better AI agent doesn't have a bigger memory. It has a tidier one — because it sleeps.

Try it

git clone https://github.com/Shridhar-2205/living-software
cd living-software/02-agents-that-dream
python demo.py

Honest note: real systems fold these summaries into the model itself with fancier methods. Mine just uses a plain dictionary. The idea (replay the day → save a summary → clear the raw log) is exactly the same; the code is kept tiny on purpose.

Shridhar Shah — Senior Software Engineer on the AI team at Cisco. Part 2 of Toward Living Software.

Sources: "Do Language Models Need Sleep?" (arXiv:2605.26099); "Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories" (arXiv:2606.03979).

I Built an AI Agent That Rewrites Its Own Code

Shridhar Shah — Sat, 27 Jun 2026 21:36:53 +0000

A tiny Darwin Gödel Machine that edits itself and keeps only changes that verifiably score higher.

TL;DR: I kept hearing that AI is about to start improving itself, and I was skeptical — so one weekend I built the smallest version I could. It looks at the tasks it's failing, edits its own code to fix them, and keeps a change only if it actually scores better on a test. It goes from passing 1 of 8 tasks to 8 of 8 — and nobody wrote those fixes but the program itself. Runs on a laptop in under a second. No fancy hardware, no API key.

The old dream: software that improves itself

Normally, software only gets better when we make it better. You write code, you find a bug, you fix it, you ship again. The program never improves on its own.

People have wanted "software that improves itself" for decades. The classic version (called a "Gödel Machine") had one rule that made it impossible to build: before the program could change a line of its own code, it had to mathematically prove the change would help. Proving that about real code is basically impossible, so the idea never worked.

In 2025, researchers found a way around it with the Darwin Gödel Machine. They dropped the "prove it first" rule and replaced it with something every engineer already trusts:

Try the change. Run the tests. If the score went up, keep it. If not, throw it away.

That's it. It's basically how we all work — make an edit, run the test suite, keep what passes. The twist is that the program is the one making the edits. In the real paper, this let an AI coding assistant improve its own tooling and jump from solving 20% to 50% of a hard benchmark of real GitHub issues.

I wanted to actually see this happen, so I built the tiniest version I could.

The 10-second version

	Start	After improving itself
What it can do	only `uppercase`	learned 6 more skills on its own
Test score	🔴 1 / 8	🟢 8 / 8
Who wrote the fixes?	—	the program did

Start:  ███░░░░░░░░░░░░░░░░░░░░░  1/8   (only knows: uppercase)
+reverse            ██████░░░░░░░░░░░░  2/8
+dedup_csv          █████████░░░░░░░░░  3/8
+sum_csv            ████████████░░░░░░  4/8
+sort_csv           ███████████████░░░  5/8
+title              ██████████████████  6/8
+normalize_inputs   ████████████████████  8/8   ← one fix unlocked TWO tasks
✅ SOLVED 8/8

How it works (the whole thing)

There are only three pieces.

1. The "agent" is just a bag of skills. Each skill is a tiny function — uppercase text, reverse it, sort a list, etc. It starts out knowing almost nothing.

2. A test with known answers. Every task has a correct answer, so checking the score is a plain equality check — output == expected. No human grading it, no second AI judging it. Just: did it get the right answer or not? (This "write a checker, then measure" idea is the same trick behind today's reasoning models.)

3. The loop. Over and over: look at what's failing, add one skill to try to fix it, re-run the test, and keep the change only if the score went up. It also saves every improved version, so it can branch off any of them later instead of getting stuck.

new_version = old_version + add_a_skill(things_it_is_failing)
if score(new_version) > score(old_version):   # did the test score actually improve?
    keep(new_version)                          # yes -> save it and build on it
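Here's a runnable miniature of that loop. It's a sketch under stated assumptions, not the demo code: the four tasks, the `SKILL_POOL` of pre-written safe skills, and the function names are all made up here, and the archive only records versions (it never branches from an older one).

```python
# Each task: (operation name, input, expected output).
TASKS = [
    ("uppercase", "abc", "ABC"),
    ("reverse", "abc", "cba"),
    ("title", "the quick fox", "The Quick Fox"),
    ("sort_csv", "b,c,a", "a,b,c"),
]

# Assumption: "edits" come from a fixed pool of safe, pre-written skills.
SKILL_POOL = {
    "reverse": lambda s: s[::-1],
    "title": lambda s: s.title(),
    "sort_csv": lambda s: ",".join(sorted(s.split(","))),
}


def passes(agent, task):
    op, text, expected = task
    return op in agent and agent[op](text) == expected  # plain equality check


def score(agent):
    return sum(1 for task in TASKS if passes(agent, task))


def improve(agent):
    archive = [dict(agent)]  # every kept version, oldest first
    improved = True
    while improved:
        improved = False
        failing = [task[0] for task in TASKS if not passes(agent, task)]
        for op in failing:
            if op not in SKILL_POOL:
                continue
            candidate = {**agent, op: SKILL_POOL[op]}  # try one change
            if score(candidate) > score(agent):        # did the score go up?
                agent = candidate                      # yes: keep and build on it
                archive.append(dict(agent))
                improved = True
                break
    return agent, archive


start = {"uppercase": str.upper}
final, archive = improve(start)
print(score(start), "->", score(final))  # 1 -> 4
```

The keep-only-if-the-score-improves check is the whole mechanism; everything else is bookkeeping around it.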

The cool part: small fixes unlock big ones

One of the skills it adds, "clean up the input" (trim weird spacing), does nothing by itself. But the agent had earlier learned a "title-case" skill that kept breaking on messy text like " the quick fox ". The moment it adds the cleanup step, two stuck tasks start passing at once — that's the +2 jump at the end.

This is the whole point in miniature: the agent isn't just adding features. It's making itself better at getting better. A boring little fix becomes the stepping stone that makes later fixes work. The real research sees the same thing at full scale — the AI invents helpers like "try a few solutions and pick the best one," which then make every future fix more effective.

Why I think this is where things are going

For ten years, the way to make AI better was: make the model bigger. The newer idea is to make it improve itself while it runs:

This post — an agent that rewrites its own code.
"Language Models Need Sleep" (2026) — agents that tidy up their own memory during an offline "sleep."
Small models that think harder instead of being bigger.

The common thread: improvement is shifting from us retraining the model to the program improving itself, with a simple test telling it whether each change was good. Software that edits itself starts to feel less like a fixed program and more like something that grows.

Try it (under a minute)

git clone https://github.com/Shridhar-2205/living-software
cd living-software/01-self-rewriting-agent
python demo_cli.py     # watch the score climb 1/8 → 8/8
pytest -q              # the same claims, as automated tests

One honest note on safety: a real self-rewriting agent runs code it wrote itself, which is risky. In my version the "edits" come from a fixed list of safe skills, so nothing dangerous ever runs: the loop matches the research, without the risk of executing generated code. (The real one runs inside a sandbox for exactly this reason.)

The takeaway

The old dream needed a mathematical proof before changing any code. The new version just needs a test. If you can write a check that says "this got better," you can let a program improve itself — and watch it find clever fixes you never wrote.

Shridhar Shah — Senior Software Engineer on the AI team at Cisco. Part 1 of Toward Living Software.

Source: Zhang, Hu, Lu, Lange, Clune, "Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents," arXiv:2505.22954 (2025) — reports SWE-bench 20.0% → 50.0%.