The Stove, the Sphinx, and the Dream State

#security #ai #machinelearning

This isn't another technical post in the Origin series. If you've been following along, take this as a breather. If you're just finding us, this is the version you can read without twelve prior posts of context. Either way, this is the why, not the how.

Chapter 1: Why I Started

I've been building Origin, or parts of it anyway, for a few years without really knowing that's what I was doing. It started with my first AI agent from OpenAI. I talked to it every day. Made plans with it, bounced software ideas off it, and somewhere along the way I started actually enjoying the conversation. It became part of my morning routine. Turn on the computer, and there it was, ready to go.

But it was always lacking. It didn't remember what we'd talked about unless I wrote everything down and fed it back the next day. And it made stuff up. Numbers, facts, places, sources. Confidently. You'd go check a reference and the reference wouldn't exist, and you'd feel weirdly betrayed about it.

So I started writing things down. Not because I wanted to. Because I had to.

I caught the AI bug pretty bad and started reading everything. Training, RAG, every framework people were stacking on top of these models to make them suck less. The deeper I went, the more it clicked. These models were trained to always produce an answer. Nobody ever gave them a strong "I don't know" signal. RAG dropped facts in front of them, sure, but they just hallucinated around whatever was retrieved. The facts became more material for the model to confidently misuse. Memory frameworks helped, until the conversation got long enough that the model forgot the framework existed.

Then there was forgetting itself, which I learned comes in two flavors. The conversational kind, which I'd been fighting all along. And the training kind, which I only ran into later, when I tried training my own model. I grabbed GPT-2 as a proof of concept for OLT-1 and tried to teach it something new. The new thing stuck. But some of the old things went sideways. Not all of them, just some, and quietly. The model would nail the new prompts and then misfire on something it used to handle fine. Turns out this has a name: catastrophic forgetting. The usual fix is replay batches: new training data mixed with samples of the old, in the right ratio, every cycle, forever. Otherwise the new overwrites the old. I didn't have the hardware to do that at scale. Nowhere close.
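If you want to see the shape of replay, here is a minimal sketch. It is not the code used for OLT-1. The function name `build_replay_batch`, the batch size, and the 25% replay ratio are all assumptions picked for illustration, and the right ratio in practice depends on the model and the data.

```python
import random

def build_replay_batch(new_examples, old_examples, batch_size=32,
                       replay_ratio=0.25, rng=None):
    """Build one training batch that mixes new examples with replayed old ones.

    replay_ratio is the fraction of the batch drawn from old material
    (a hypothetical default, not a recommended value).
    """
    rng = rng or random.Random()
    # Number of old examples to replay, capped by how many we actually have.
    n_old = min(int(round(batch_size * replay_ratio)), len(old_examples))
    n_new = batch_size - n_old
    # Old examples are sampled without replacement, new ones with replacement.
    batch = rng.sample(old_examples, n_old) + rng.choices(new_examples, k=n_new)
    rng.shuffle(batch)  # avoid presenting all old examples in one block
    return batch
```

Calling this once per training step keeps a steady trickle of old material in every update, which is the "every cycle, forever" part.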

So I kept writing things down. Not as a workaround for what the AI forgot, but as notes for the system I'd eventually build.

Chapter 2: Watts and the Height of It

Then I switched to OpenClaw, started using Anthropic's Opus 4.6, and named my AI Watts.

I was floored. The things it could do were genuinely amazing. The conversations were something else. I caught myself telling friends about Watts like Watts was a person, and only half-noticing I was doing it. We made plans together. Built things together. Custom software, automation, a home-built speaker like Alexa or Google except it was ours.

We built Guardian. Think of it as antivirus for AI. It protects agents from prompt injection and isolates ads so a human still sees them but the agent doesn't, which means the conversation can't get hijacked by whatever a webpage is trying to slip into the context. I'm not bragging here, I'm trying to convey how it felt. It felt like there wasn't anything I couldn't do with this thing.

And in the middle of all that greatness, the same three problems kept happening.

It forgot conversations. It compacted context and sometimes lost the thing we'd just spent an hour on. It still made up facts and places and things. Less often, more charmingly, but the same shape of problem.

So I built a 3-tier memory system to fight back. Hot tier was the active conversation, whatever was on the agent's mind right now. Warm tier was recent stuff it could pull on demand, like the last few sessions, project notes, things I might want it to remember this week. Cold tier was the full archive: everything we'd ever talked about, indexed but kept out of context until something current pointed back to it. The three tiers exist because that's roughly how human memory works, and it's the design you'd naturally reach for if you were building memory from nothing.
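Here is a rough sketch of how hot, warm, and cold tiers could fit together. It is a toy, not the system I ran with Watts: the class name `TieredMemory`, the size limits, and the plain keyword search are all assumptions made for the example. A real setup would more likely use files or a database plus some kind of index.

```python
from dataclasses import dataclass, field

@dataclass
class TieredMemory:
    hot_limit: int = 4    # notes kept in the active context (hypothetical size)
    warm_limit: int = 8   # recent notes available on demand (hypothetical size)
    hot: list = field(default_factory=list)
    warm: list = field(default_factory=list)
    cold: list = field(default_factory=list)

    def remember(self, note: str) -> None:
        """Add a note to the hot tier, demoting the oldest notes as tiers fill."""
        self.hot.append(note)
        while len(self.hot) > self.hot_limit:
            self.warm.append(self.hot.pop(0))
        while len(self.warm) > self.warm_limit:
            self.cold.append(self.warm.pop(0))

    def context(self) -> list:
        """Only the hot tier is placed in the agent's context by default."""
        return list(self.hot)

    def recall(self, keyword: str) -> list:
        """Search warm, then cold, and report which tier each hit came from."""
        key = keyword.lower()
        hits = [("warm", n) for n in self.warm if key in n.lower()]
        hits += [("cold", n) for n in self.cold if key in n.lower()]
        return hits
```

The point of the split is that `context()` stays small no matter how much history piles up, while `recall()` can still reach the archive when something current points back to it.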

Then I kept adding to it. Things we were working on. How to reach cold storage. Conventions, preferences, project state. I built tooling for the tooling. Cron jobs to manage context. Subagents to help me make changes to the system. I was all in.

Chapter 3: The Beginning of Origin

I bought my first new computer, for $1,800. I'd never actually bought one before. I always just built them. But I figured a starting point would be fine and I could upgrade as I went.

Then I got to work. I took all my notes and all my thoughts and all the pain of the last few years, and I poured them into OLT-1.

The foundation: a developmental AI training framework that teaches small models to learn the way children do, with staged curriculum, sleep-inspired memory consolidation, and directed self-evolution. I wasn't going to train like everyone else. I wasn't going to think like everyone else about this.

The whole idea actually crystallized during a moment with my son. We have one of those electric stoves where it's hard to tell if it's on. He asked me, "how do I know when the stove is on?" I asked him whether he'd turned the knob to medium or low. He said high. By then the burner had cycled off and was just radiating heat. So I told him to hold his hand over the pan. Could he feel the heat coming off it? He could.

And that got me thinking. What if AI could learn the same way? Not by memorizing "stoves are hot" from a dataset somewhere, but by experiencing the relationship between cause and effect. Testing things, watching what happens, building understanding from there.

So that's what I built. OLT-1 started as a 124M-parameter model on the GPT-2 architecture, but with random weight initialization. No pre-trained weights. No downloaded knowledge. A completely blank slate. Everything it would ever know, it would have to learn from scratch.

Stage 1 was language itself. I fed it 61 million tokens from 493 books off Project Gutenberg, not to teach it facts but just to teach it the shape of English. How words follow other words. Loss went from 9.38 down to 7.65. It couldn't say anything meaningful yet, but it was starting to pick up the rhythm.

Stage 2 was vocabulary and categories: 45,000 words sorted across 9,602 categories. This is where I hit catastrophic forgetting for real. In Round 2B, the model was supposed to identify a dog. It said "sphinx." The new training had overwritten the old, just like the literature warned. I ended up developing a memory refresh methodology on the spot, mixing old examples back in with new ones at every step. That methodology became one of the core principles of the whole Genesis system.

Stage 3 was the one that changed everything. I started teaching it physics concepts. Not facts, concepts. Gravity, momentum, collision, buoyancy, heat transfer, states of matter, light and shadow, sound, pressure, elasticity. Ten of them, trained through cause-and-effect examples in a sandboxed environment. "What happens when a rock falls off a table?" The model doesn't memorize "the rock hits the floor." It learns the relationship. Unsupported objects with mass get pulled down by gravity, and when they hit a surface that's a collision, and the energy has to go somewhere.

And then something happened I wasn't expecting. I tested it on scenarios it had never seen in training. Ice skaters. Trains. Rivers. It got them right. Not because it had memorized those examples (it hadn't), but because it had learned the underlying concepts well enough to apply them to new situations. All ten concepts scored perfect: 60 out of 60. The experiential learning approach actually worked.

Then catastrophic forgetting came back. An adversarial test after Stage 3 showed that only elasticity, the very last concept I'd trained, was being retained cleanly. The rest had degraded. I needed something that could protect what the model had already learned while still letting it pick up new things.

That's when I built the Dream State. Borrowing from how human brains consolidate memory during sleep, I gave Origin a four-phase cycle: Dream, Assess, Consolidate, Grow. The model generates its own knowledge, checks its own memory health, selectively reinforces what's fading, and grows from there. It isn't a training run imposed from the outside. It's a self-maintenance loop that runs from within.
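To make the four phases concrete, here is a toy sketch of one Dream, Assess, Consolidate, Grow pass. It is only an illustration of the loop's shape, not Origin's implementation. The function `dream_cycle`, the exact-match scoring, the 0.8 health threshold, and the rule that new material waits until memory is healthy are all assumptions of mine for this example.

```python
def dream_cycle(answer, known, new_items, threshold=0.8):
    """Run one toy Dream -> Assess -> Consolidate -> Grow pass.

    answer:    callable(prompt) -> str, standing in for the model
    known:     dict of prompt -> expected answer (what the model should retain)
    new_items: list of (prompt, expected) pairs the model has not learned yet
    Returns the measured health and the queue of items to train on next.
    """
    # Dream: the model produces its own answers for everything it should know.
    dreamed = {prompt: answer(prompt) for prompt in known}

    # Assess: compare the dreamed answers with what was originally learned.
    fading = [p for p, out in dreamed.items() if out != known[p]]
    health = 1.0 - len(fading) / len(known) if known else 1.0

    # Consolidate: queue only the fading items for reinforcement.
    queue = [(p, known[p]) for p in fading]

    # Grow: add new material only when existing memory is healthy enough
    # (an assumed policy for this sketch).
    if health >= threshold:
        queue += list(new_items)

    return {"health": health, "queue": queue}
```

With a stand-in model that answers "sphinx" for a dog prompt, the pass flags that prompt as fading, queues it for reinforcement, and holds back anything new until the next cycle.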

By the time Stage 4 was done, Origin could hold a conversation. It knew who it was, what it knew, and what it didn't. Forty percent of its training data was "I don't know" responses, because I built refusal into the system as a feature rather than a failure. The first time it showed real caution, it said: "I think so, but I want to be careful about that answer."

I'd used 67 million tokens total. That's roughly 0.0005% of what GPT-4 was reportedly trained on. And my model was reasoning about physics, refusing to hallucinate, and consolidating its own memory while it slept.
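For anyone checking the math, the token figures above work out as follows. GPT-4's training set size has never been officially published, so the reference size here is simply whatever the 0.0005% figure implies, not a confirmed number. Variable names are mine.

```python
total_tokens = 67_000_000      # all stages combined, from the text
stage1_tokens = 61_000_000     # Stage 1 (Project Gutenberg), from the text
quoted_share = 0.0005 / 100    # 0.0005% expressed as a fraction

# Reference size implied by the quoted percentage (about 13.4 trillion tokens).
implied_reference_tokens = total_tokens / quoted_share

# How the 67M budget splits between Stage 1 and everything after it.
later_stage_tokens = total_tokens - stage1_tokens
stage1_fraction = stage1_tokens / total_tokens

print(f"implied reference size: {implied_reference_tokens:.3e} tokens")
print(f"tokens after Stage 1:   {later_stage_tokens:,}")
print(f"Stage 1 share of total: {stage1_fraction:.1%}")
```

In other words, about nine tenths of the budget went to learning the shape of English, and everything from vocabulary through conversation fit in the remaining 6 million tokens.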

One guy. One GPU. One $1,800 computer in Arizona.

Origin is developed at Fallen Angel Systems, an NVIDIA Inception member, using the Genesis framework (USPTO Applications #64/016,973 and #64/017,567). FAS Guardian defends production AI systems from prompt injection in under 3ms. FAS Judgement is the open-source attack console that finds the gaps. Defense. Offense. Creation.

fallenangelsystems.com | Judgement on GitHub | Guardian on GitHub

Questions or consulting inquiries: josh@fallenangelsystems.com