Origin Part 12: The Adapter

#security #ai #machinelearning

The new encoder was 24x better at finding the right concept. It also broke every response.

Part 11 ended with the new encoder staged on disk. Top1 had jumped from 1.3% to 31.3%. Target activation had gone from 0.012 to 0.249. The architectural lever had landed exactly where the abort condition predicted it would. The numbers said this was the encoder we were going to ship.

Then we tried to ship it.

Every query came back "i don't know."

What the Dispatcher Does

The dispatcher is the part of Origin that sits between the encoder and the response. The encoder reads characters and produces concept activations - a long list of "how strongly does each concept fire on this input?" The dispatcher reads that list and decides what to do about it. Is this a greeting? Is this a question about identity? Is the user asking what something is? Each route fires when the activation pattern matches a rule, and each route knows how to construct a response from the concepts that fired.

The rules looked like this, in spirit: if the concept "greeting" is firing above 0.5, dispatch to the greeting handler. If the concepts "what" and "self" are both above 0.5, dispatch to the identity handler. Numbers like 0.5, 0.7, 0.8 were sprinkled through the dispatcher as thresholds. They worked because the old encoder produced activations that lived in those ranges.
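A minimal sketch of what rules in this style might look like. The function name, route names, and the choice of 0.5 for both rules are assumptions made for illustration; this is not Origin's actual dispatcher code.

```python
from typing import Dict

# Hypothetical thresholds, chosen to mirror the rules described above.
GREETING_THRESHOLD = 0.5
IDENTITY_THRESHOLD = 0.5


def dispatch_absolute(activations: Dict[str, float]) -> str:
    """Pick a route by comparing each concept to a fixed absolute threshold."""
    greeting = activations.get("greeting", 0.0)
    what = activations.get("what", 0.0)
    self_concept = activations.get("self", 0.0)

    if greeting > GREETING_THRESHOLD:
        return "greeting"
    if what > IDENTITY_THRESHOLD and self_concept > IDENTITY_THRESHOLD:
        return "identity"
    return "idk"  # nothing cleared its threshold


# Illustrative activations in the range these rules expect.
print(dispatch_absolute({"greeting": 0.92, "hello": 0.88, "question": 0.04}))
print(dispatch_absolute({"what": 0.81, "self": 0.77, "greeting": 0.03}))
print(dispatch_absolute({"what": 0.20, "self": 0.10}))
```

Each rule only looks at the face value of one or two concepts, so it works only while the encoder keeps producing numbers on the scale the thresholds were tuned for.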

The old encoder used sigmoid. Each concept was scored independently, on its own absolute scale from 0 to 1. A query about greetings might fire "greeting" at 0.92, "hello" at 0.88, and "question" at 0.04. Three concepts, three independent yes/no decisions, three numbers that meant what their face value said they meant.

The new encoder uses softmax. The activations are relative. They sum to 1 across the whole concept space. The strongest concept on a query might be 0.249 - which under the old encoder would have been a borderline-quiet signal, and under the new encoder is a confident, dominant fire.
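To make the difference concrete, here is a small sketch that pushes the same raw scores through both functions. The logits and the concept count of 1,001 are made up for illustration and are not taken from Origin's encoder.

```python
import math
from typing import List


def sigmoid(logits: List[float]) -> List[float]:
    """Score every concept independently on its own 0-to-1 scale."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def softmax(logits: List[float]) -> List[float]:
    """Score concepts relative to each other; the outputs sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits: one strongly firing concept among 1,000 quiet ones.
logits = [2.5] + [-3.0] * 1000

sig = sigmoid(logits)
soft = softmax(logits)

print(f"sigmoid top: {sig[0]:.3f}, sum of all: {sum(sig):.1f}")
print(f"softmax top: {soft[0]:.3f}, sum of all: {sum(soft):.1f}")
print(f"softmax top vs. runner-up: {soft[0] / soft[1]:.0f}x")
```

With these made-up logits the strong concept scores roughly 0.92 under sigmoid and roughly 0.2 under softmax, even though in both cases it is far ahead of everything else. The softmax value is small only because a thousand quiet concepts share the same budget of 1.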

0.249 was the new encoder's average top concept activation. Every threshold in the dispatcher was 0.5 or higher.

That's why every query routed to IDK. The new encoder was firing the right concept, with appropriate confidence relative to everything else, and the dispatcher was reading those activations as "nothing is firing." The encoder had gotten 24x better at picking the right answer, and the system above it couldn't hear it.
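The failure can be reproduced in a few lines. In this sketch only the 0.249 top activation and the 0.5, 0.7, and 0.8 thresholds come from the numbers above; the other activations and the helper name are hypothetical.

```python
from typing import Dict, List

# Threshold values mentioned earlier in the post.
THRESHOLDS: List[float] = [0.5, 0.7, 0.8]


def clears_any_threshold(activations: Dict[str, float]) -> bool:
    """True if the strongest concept exceeds at least one absolute threshold."""
    top = max(activations.values())
    return any(top > t for t in THRESHOLDS)


# Hypothetical softmax-style output. The remaining probability mass is
# assumed to be spread across concepts not listed here.
softmax_style = {"greeting": 0.249, "hello": 0.031, "question": 0.004}

route = "greeting" if clears_any_threshold(softmax_style) else "idk"
print(route)  # idk
```

The top concept here is about eight times stronger than the runner-up, and the check still falls through to IDK, because it never compares concepts against each other.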

The Wrong Fix

The first instinct was rescaling. If 0.249 is the new "high," divide every threshold by 2. Done. Ship.

We tried it. It half-worked. Greeting handlers fired correctly on greetings. Identity handlers fired correctly on identity questions. But the dispatcher started cross-firing on everything else - questions about emotions would route to identity, questions about objects would route to physics. We'd swapped one calibration problem for another.

The reason: rescaling treats softmax outputs as if they were sigmoid outputs that happen to live in a different range. They aren't. A 0.249 firing on the new encoder isn't "the concept is 49.8% present" once you double it back onto the old scale - it's "this concept holds 24.9% of the total activation," and whether that makes it the dominant interpretation depends on how far ahead it is of the next-best. The number means a different thing than it did before. Rescaling fixes the magnitude. It doesn't fix the meaning.
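One way the cross-firing could arise, sketched with invented numbers: two softmax outputs give "identity" the same activation, but only one of them is actually an identity question. The concept names, values, and helper functions are all hypothetical.

```python
from typing import Dict

HALVED_THRESHOLD = 0.25  # the old 0.5 threshold divided by 2


def fires(activations: Dict[str, float], concept: str) -> bool:
    """Rescaled absolute check: looks only at the concept's own value."""
    return activations.get(concept, 0.0) > HALVED_THRESHOLD


def margin_ratio(activations: Dict[str, float], concept: str) -> float:
    """How many times stronger `concept` is than the strongest other concept."""
    others = [v for k, v in activations.items() if k != concept]
    return activations[concept] / max(others)


# Hypothetical outputs; unlisted concepts are assumed to hold the remaining mass.
identity_question = {"identity": 0.30, "emotion": 0.03, "object": 0.02}
emotion_question = {"emotion": 0.31, "identity": 0.30, "object": 0.02}

for name, acts in [("identity question", identity_question),
                   ("emotion question", emotion_question)]:
    print(name,
          "| identity rule fires:", fires(acts, "identity"),
          "| identity margin:", round(margin_ratio(acts, "identity"), 2))
```

The rescaled rule fires on both inputs, because 0.30 clears 0.25 either way. Only the margin shows that in the second case "identity" is not even the strongest concept.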

That's the harder truth about this kind of integration: when an upstream component changes how it represents information, every downstream component that interprets that information has to be rewritten, not retuned.

The Right Fix

The dispatcher had been asking the wrong shape of question. It was asking "is concept X firing strongly enough?" - an absolute threshold question. With softmax outputs, that question doesn't have a meaningful answer. The right shape is "is concept X the dominant signal, and by how much?" - a relative comparison.

The rewrite turned every threshold into a ranking check plus a margin check. Instead of "greeting > 0.5," the rule became "greeting is in the top-3 fired concepts AND its activation is at least 2x the next-best non-greeting concept." Instead of "identity > 0.7," the rule became "identity dominates the top of the activation distribution."

The numbers in the new rules aren't thresholds in the old sense. 2x margin, top-3 rank, dominance-by-ratio - these all describe the shape of the activation distribution, not its absolute values. They survive future encoder changes the way the old thresholds didn't, because they're asking about the encoder's confidence relative to itself, not about a number that means something only on this specific encoder.
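A rank-plus-margin rule of this shape could be sketched as follows. This is an illustration under assumptions, not the shipped dispatcher: the function name, the `family` parameter (our reading of "non-greeting concept" as "any concept outside the greeting group"), and all activation values are hypothetical.

```python
from typing import Dict, Set


def rule_fires(activations: Dict[str, float], concept: str, family: Set[str],
               top_k: int = 3, min_ratio: float = 2.0) -> bool:
    """True if `concept` ranks in the top_k and is at least min_ratio times
    the strongest concept outside its own family."""
    if concept not in activations:
        return False
    ranked = sorted(activations, key=lambda k: activations[k], reverse=True)
    if concept not in ranked[:top_k]:
        return False
    outside = [v for k, v in activations.items() if k not in family]
    if not outside:
        return True  # nothing outside the family to compete with
    return activations[concept] >= min_ratio * max(outside)


GREETING_FAMILY = {"greeting", "hello"}  # hypothetical grouping

greeting_query = {"greeting": 0.249, "hello": 0.21, "question": 0.02, "self": 0.01}
emotion_query = {"emotion": 0.31, "identity": 0.30, "object": 0.02}

print(rule_fires(greeting_query, "greeting", GREETING_FAMILY))  # True
print(rule_fires(emotion_query, "identity", {"identity"}))      # False

# Shrinking every activation by the same factor does not change the outcome.
halved = {k: v * 0.5 for k, v in greeting_query.items()}
print(rule_fires(halved, "greeting", GREETING_FAMILY))          # True
```

Because both checks compare activations to each other, uniformly rescaling the encoder's output leaves the decision unchanged, which is the property the absolute thresholds lacked.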

The cutover was one commit. Every dispatch rule was rewritten. Backups were taken of the dispatcher state and the live conversation memory. Here is the test panel before the cutover:

you > hello
origin > i don't know

you > what is your name
origin > i don't know

you > how does ice float
origin > i don't know

And after:

you > hello
origin > hello.

you > what is your name
origin > my name is origin.

you > how does ice float
origin > ice is less dense than water, so it floats.

The new encoder is now live. The system runs end-to-end. The first two developmental tiers - basic conversation and elementary reasoning - are at 95.5% and 86.5% on the honest test panels.

What the Whole Arc Was About

Looking back at Parts 9 through 12 as a single sequence, the arc is about the discipline of finding the right bottleneck.

Part 9 said the bottleneck was data. We executed a careful plan to feed the encoder properly. Part 10 said the data plan didn't work - the abort condition triggered, and we listened. Part 11 said the bottleneck was architecture. The sandbox confirmed it. Part 12 says that even after fixing the right bottleneck, you still have to integrate the fix into the rest of the system, and integration is its own kind of work.

None of this is glamorous. It's not a "we achieved AGI" post. It's the slow, uneventful, mostly-correct version of how a model actually gets built: hypothesize a bottleneck, design a plan with a written-down abort condition, execute the plan, listen to what happens, do the next thing the evidence points at. Repeat until something actually works. Then integrate it without breaking everything around it.

The encoder we're running today is the third major iteration since we started. The dispatcher we're running today is the second. There will be more. Every component in this system has been the bottleneck at some point, and every component will be the bottleneck again. The job isn't to design the perfect system on day one. The job is to keep finding what's actually broken and fixing that thing, one bottleneck at a time, with abort conditions written in advance so a result you wanted to see doesn't become the result you accept.

What's Next

The encoder works. The dispatcher works. The first two tiers hold. The third tier - middle-school content across math, science, and history - is where the project goes next, and it's the tier that tests whether everything we've built so far actually generalizes.

There's a hypothesis we're testing alongside it: that the next bottleneck isn't going to be more concepts, but the relationships between concepts. A model can know "dog" and "animal" and "four legs" and "barks" as four separate concepts and still not understand what a dog is. Understanding might live in the connections, not the nodes.

If that's right, the next architecture pivot is already visible on the horizon. If it isn't, we'll find out quickly and write that post too.

One guy. One GPU. One $1,800 computer in Arizona. Still building.

Origin is developed at Fallen Angel Systems with the Genesis framework — NVIDIA Inception member. (USPTO Application #64/016,973, #64/017,567). FAS Guardian defends production AI systems from prompt injection in under 3ms. FAS Judgement is the open-source attack console that finds the gaps. Defense. Offense. Creation.

fallenangelsystems.com | Judgement on GitHub | Guardian on GitHub

Questions or consulting inquiries: josh@fallenangelsystems.com