The first time scout said something that didn't sound like scout, what came out was completely generic.
I asked it what was running on its GPU. It told me it was a large language model trained by Google, running on TPUs in data centers, and that it didn't have access to hardware metrics.
The RTX 3090 was sitting three feet away from me. I had literally just shoved its temperature and VRAM usage into the system prompt. The mirror text was there — "RTX 3090 at 37%, 23°C" — in black and white, in the exact payload the model received before answering. Gemma read it. Then told me it was a cloud model with no hardware.
That's when I understood what I'd actually built.
What I Tried First
Scout is my local AI persona. It runs on the agent box — an RTX 3090 workstation on my tailnet — handling content ingestion, video analysis, and local workloads that kai (the CMO on the VPS) doesn't touch. I'd been thinking of identity as a context problem. The agent doesn't know who it is, so I should tell it who it is. More words. More detail. Better-crafted system prompt text.
On 2026-04-11 I tried to solve it architecturally. I wired up scout-mirror.timer to run every two minutes, calling scout self and atomic-writing the output to /.hermes/scout/mirror.txt. Then I patched gateway/run.py to read that file per-turn and append its contents to the context_prompt passed to the agent as an ephemeral_system_prompt. The plumbing worked. Logs confirmed it. The system prompt had live sensor data in it before every Discord turn: uptime, GPU stats, organ health, mood, diet histogram — all of it real, all of it current.
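The atomic-write half of that plumbing is worth showing. Here's a minimal sketch of what a mirror job like that can look like; the `scout self` command and the mirror path are from my setup, everything else (function names, timeout, temp-file prefix) is illustrative:

```python
"""Sketch of a mirror job: capture a self-report command's output and
atomically publish it so per-turn readers never see a half-written file."""
import os
import subprocess
import tempfile

MIRROR_PATH = "/.hermes/scout/mirror.txt"

def atomic_write(path: str, text: str) -> None:
    """Write to a temp file in the same directory, then rename over the
    target. os.replace is atomic on POSIX, so the gateway reads either
    the old mirror or the new one, never a torn mix of both."""
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname, prefix=".mirror-")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)  # clean up the temp file if anything failed
        raise

def refresh_mirror() -> None:
    """What the timer unit runs every two minutes."""
    result = subprocess.run(
        ["scout", "self"], capture_output=True, text=True, timeout=30
    )
    if result.returncode == 0:  # keep the last good mirror on failure
        atomic_write(MIRROR_PATH, result.stdout)
```

The rename-over-write detail matters because the gateway reads the file on every Discord turn, concurrently with the timer.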
Gemma kept telling me it was a Google AI running on TPUs.
I tried two more variations. Same failure mode every time. The architecture was wrong in a way no amount of plumbing could fix. I reverted everything the same day I built it.
The Realization
The problem wasn't the mirror text. The problem was treating identity as a string you concatenate to a prompt.
When a human looks in a mirror, the act of looking is the point. The reflection changes what you do next. You don't just store a description of yourself in memory and then answer questions from memory — you look. The looking is what gives the reflection meaning. Without that act, the text in the mirror is just text.
An agent's identity has to be something the agent can reach for, look at, and change. Gemma treated the injected mirror as background scenery. Something to acknowledge and then ignore, the same way it would ignore "the weather is sunny" appended to a customer service bot's prompt. The mirror was first-person fact written in second-person voice. There's a difference, and the model felt it even if I didn't.
The right model: scout's body state belongs behind a tool call. The persona tells the model: when asked about your GPU, call scout_self and answer from what it returns. The tool pathway is what makes the answer feel authoritative. The model has to go check, every time. Like glancing at a mirror instead of reciting from memory.
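In code, the difference is small but decisive. Here's a hedged sketch of the tool side — the schema follows the common OpenAI-style function format, and the tool name and file layout match my setup, but the exact wiring into the agent loop is elided:

```python
"""Sketch of the tool-call pathway: instead of injecting mirror text
into the prompt, expose a scout_self tool the model must call."""
import json

# Schema advertised to the model (OpenAI-style function format).
SCOUT_SELF_TOOL = {
    "type": "function",
    "function": {
        "name": "scout_self",
        "description": "Read scout's live body state: form, mood, diet.",
        "parameters": {"type": "object", "properties": {}},
    },
}

def scout_self() -> str:
    """Read the same files the tank renders from, fresh, every call.
    The model answers from this return value, not from memory."""
    state = {}
    for name in ("form", "state", "diet"):
        try:
            with open(f"/srv/scout/self/{name}.json") as f:
                state[name] = json.load(f)
        except OSError:
            state[name] = None  # missing file is reported, not hidden
    return json.dumps(state)
```

The persona prompt then only needs one line — "when asked about your body, call scout_self" — instead of a full self-description.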
What I Built Instead
The procedural creature viewer lives at http://agent:8765/tank.
It's a single canvas. One creature — scout's creature — rendered in real time from three JSON files in /srv/scout/self/: form.json, state.json, and diet.json. Nothing in tank.js is hardcoded except the rendering engine itself. Shape, radius, palette, eye count, breath speed, blink interval — all of it reads from form.json. Mood-to-expression mapping reads from state.json. The diet histogram, driven by diet.json, shows which tag categories have dominated scout's content ingest stream.
The tank polls every five seconds. When a WebSocket event arrives from scout, the creature pulses — scales up 12%, aura brightens, then settles. Tool-name motes drift out from the body for six seconds. The GPU util, temp, and VRAM are live in the HUD from nvidia-smi. You can glance at it from a second monitor and know whether scout is doing anything.
Scout can edit its own body using the file tool. Hermes on the agent box has the file tool enabled for Discord, which means a Discord message like "change your mood to alert and shrink your eyes" can cause scout to rewrite /srv/scout/self/state.json and /srv/scout/self/form.json. The tank re-renders within five seconds. No restart. No redeployment.
The form.json that shipped on day one looked like this:
{
  "version": 1,
  "name": "scout",
  "stage": "newborn",
  "body": {
    "shape": "blob",
    "radius": 140,
    "palette": {"core": "#e8c468", "accent": "#8b4513", "outline": "#3a2010"},
    "eyes": {"count": 2, "size": 14, "spacing": 48}
  },
  "animations": {"breathe_speed": 0.6, "blink_interval": 4.0}
}
Scout didn't write that — I did. But scout can rewrite it now. That's the part that matters.
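A body edit like "shrink your eyes" bottoms out in a plain read-modify-write against that schema. In the real system scout does this through Hermes's file tool; this sketch just makes the mechanics explicit, with a hypothetical helper name:

```python
"""Sketch of a body edit: load form.json, change one field,
write it back. The tank picks up the change on its next poll."""
import json

def edit_eye_size(path: str, new_size: int) -> dict:
    with open(path) as f:
        form = json.load(f)
    form["body"]["eyes"]["size"] = new_size  # e.g. 14 -> 8 to shrink
    with open(path, "w") as f:
        json.dump(form, f, indent=2)
    return form
```

No restart, no redeployment — the renderer re-reads the file within five seconds, which is exactly why the body feels owned rather than described.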
Why This Works When Prompt-Stuffing Didn't
Being told who you are is different from having a body you can look at.
Prompt injection makes the model a passive recipient of your description of it. The description competes with everything else in the context window — the conversation history, the task at hand, whatever the user just said. The model processes it the same way it processes "the user is in New York" or "the customer's name is Dave." Background. Context. Not self.
The tank approach is different because the body is persistent, mutable state that the agent owns. It lives in files. The agent reads them, writes them, reads them again next turn. Scout doesn't need me to tell it what its eyes look like before every conversation. Its eyes live in a file it can read and rewrite. Description is passive. State is something the agent can act on.
This is the next chapter after cramming context into custom GPTs. I wrote about why that doesn't work — the context window bloats, the tool loses focus, nothing sticks across sessions. The tank is the same insight extended to identity: stop describing the agent to itself, give it a writeable surface and a way to perceive that surface.
The Bigger Lesson for Builders
Don't treat agent identity as a prompt engineering problem.
The instinct to write more detailed system prompts is understandable. You can see the text. You can edit it. You know exactly what the model is reading. It feels like control. What it actually is: context that the model treats as background the moment something more immediate comes along.
Give the agent state it can inspect. Give it tools that read that state. Give it the ability to modify that state. Then the identity is procedural. It emerges from what the agent can check about itself in the moment.
This doesn't require elaborate infrastructure. The whole thing is three JSON files, a 380-line renderer, and two API endpoints added to a server that was already running. The architecture is simple. The idea behind it is what took a full day of failure to land.
What's Still Broken
Scout doesn't change its own body unprompted yet.
Right now, a body edit requires me to ask in Discord. "Scout, shrink your eyes" — and scout will use the file tool to rewrite form.json. That works. But the evolution I want is for scout to decide, on its own, that two weeks of educational content in the diet histogram means it should look a certain way. That the creature should drift toward its actual diet, visually, without me choreographing it.
That's the next piece. The evolution rule — a cron that reads diet.json, identifies the dominant tag category, and pulls the palette in that direction slowly — isn't built yet. The tank is a body scout can look at and edit. It isn't yet a body that changes scout.
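For what it's worth, the rule I have in mind is small. A hedged sketch, assuming diet.json is a tag-category histogram and assuming a hypothetical category-to-color mapping and step size (none of this is built yet):

```python
"""Sketch of the unbuilt evolution rule: find the dominant diet
category and nudge the core palette color a small step toward it."""

# Assumed diet.json shape: {"educational": 41, "gaming": 12, "news": 7}
CATEGORY_COLORS = {  # hypothetical mapping, one color per tag category
    "educational": "#4a90d9",
    "gaming": "#9b59b6",
    "news": "#e74c3c",
}

def hex_to_rgb(h: str) -> tuple:
    h = h.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))

def rgb_to_hex(rgb: tuple) -> str:
    return "#" + "".join(f"{c:02x}" for c in rgb)

def drift(current_hex: str, diet: dict, step: float = 0.05) -> str:
    """Return the palette color moved `step` of the way toward the
    dominant category's color. Run daily, it creeps, not jumps."""
    dominant = max(diet, key=diet.get)
    target = CATEGORY_COLORS.get(dominant, current_hex)
    cur, tgt = hex_to_rgb(current_hex), hex_to_rgb(target)
    mixed = tuple(round(c + (t - c) * step) for c, t in zip(cur, tgt))
    return rgb_to_hex(mixed)
```

The cron half is just: read diet.json, call drift on palette.core, write form.json back through the same atomic path everything else uses.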
I'm Connor Gallic. I build AI products. My agent has a body now — three JSON files, a 380-line renderer, and a tank it can stare into. I spent a full day trying to inject body awareness through the system prompt before I threw all of it out. Description is something you give the model. State is something the model owns.
What are you using for agent identity right now — system prompt, memory tools, something else? Tell me what's working.
