The forgotten AI critters of the 1990s rediscovered most of what 2026 calls agents

#aihistory #artificiallife #creatures #stevegrand

In 1996, on a CRT monitor running Windows 3.1, you could watch a small fuzzy creature with floppy ears wander into a patch of poisonous berries, eat one, vomit, and remember not to eat that variety again. The creature was called a norn, the world it inhabited was called Albia, and the game was Creatures, designed by Steve Grand at Cyberlife Technology in Cambridge. By any contemporary metric the creature was an agent — it perceived, planned, acted, learned from outcomes, signalled to other agents, mated, raised children, and over generations the surviving population's behaviour drifted in directions Grand and his team had not specified.

It was a boxed retail product, sold on the same shelves as the newly launched Tamagotchi and aimed at children.

I want to spend a few thousand words on this and on three of its near contemporaries — Tom Ray's Tierra (1991), Karl Sims's evolved virtual creatures (1994), and the Avida platform that ran underneath the 2003 Nature paper "The Evolutionary Origin of Complex Features" — because almost everything our industry now calls "agents" was prefigured in this 1991-2003 window, in vocabulary so unfashionable now that the prefiguring is invisible. The 2026 stack is rediscovering, one production incident at a time, what the artificial-life community of the 1990s knew the first time around.

What was actually inside a norn

The retrospective on Creatures I'm reading was the prompt to dig into the design notes again. The norn's brain was not a single network. It was nine networks, called lobes, each named for the functional role its designers thought a corresponding piece of mammalian cortex played. The canonical nine — preserved in the Creatures community wiki and in the openc2e source — are Perception, Drive, Source, Verb, Noun, General Sense, Decision, Attention, and Concept. Perception encoded sensory input. Drive tracked the norn's biological needs (hunger, sleepiness, fear). Source kept track of where stimuli were coming from. Verb and Noun were the candidate-action and candidate-object banks the norn could draw from. General Sense handled concepts not tied to a particular stimulus. The Decision lobe chose a (verb, noun) pair given the current Drive and Source inputs. The Attention lobe selected one stimulus to attend to. The Concept lobe learned associations Hebbian-style across the network. Emotions, separately, were modelled as scalar concentrations of simulated neurochemicals that biased the Decision lobe's weighting — a hungry norn would be more aggressive in competition for food; a frightened one would weight escape actions higher.
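The chemical bias on the Decision lobe described above can be sketched in a few lines. This is a minimal illustration, not the Creatures implementation: the function name `choose_action`, the additive bias rule, and every number are assumptions made for the example.

```python
def choose_action(base_scores, chemicals, gains):
    """Pick the action with the highest chemically biased score.

    base_scores: dict of action -> score before any chemical bias
    chemicals:   dict of chemical name -> scalar concentration
    gains:       dict of (chemical, action) -> how strongly that chemical
                 boosts that action (hypothetical values, not from the game)
    """
    biased = {}
    for action, base in base_scores.items():
        # Each chemical adds concentration * gain to the actions it favours.
        boost = sum(
            level * gains.get((chemical, action), 0.0)
            for chemical, level in chemicals.items()
        )
        biased[action] = base + boost
    return max(biased, key=biased.get)


base = {"eat": 0.5, "flee": 0.4}
gains = {("fear", "flee"): 1.0, ("hunger", "eat"): 1.0}

calm = choose_action(base, {"fear": 0.0, "hunger": 0.0}, gains)    # "eat"
scared = choose_action(base, {"fear": 0.5, "hunger": 0.0}, gains)  # "flee"
```

The base scores never change between the two calls; only the chemical state does, which is the sense in which emotion biases a decision without making it.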

The architecture was documented in Grand's design notes and later in his 2001 book Creation: Life and How to Make It. The book is not a popular-science overview written for the bookstore middle aisle, although it was shortlisted for the Aventis Prize. It is a working engineer's account of what the design pressure to make a responsive fuzzy creature on a mid-1990s consumer PC taught him about the architecture of minds. Grand received an OBE in 2000 for the work, then went on to spend most of the next half-decade trying to build a robotic baby orangutan named Lucy, an attempt he wrote up as Growing Up with Lucy in 2004. He is still designing successors to the original Creatures architecture.

The norns had two features that read as eerie now. First, their attention selection was a winner-take-all gate: at each step, the Attention lobe summed candidate-input activations and the single highest-activation neuron dominated, suppressing the rest. The documentation of the period uses precisely the term winner-take-all. The 2026 deep-learning analogue, with continuous softmax weighting rather than the norns' hard argmax selection, is what we now call attention; the WTA selector is the discrete-output relative of the soft variant. Second, the norns dreamed. While a norn slept, the simulator iterated through a list of instincts, coded as gene-defined associations like "hitting another norn produces pain" or "eating green berries produces nausea", and stochastically updated the network weights so the norn would respond appropriately the first time the corresponding situation occurred in waking life. Grand's term for this was prenatal learning; today's term, in the agent-engineering register, is synthetic-rollout pretraining or offline RL from generated trajectories: the model is trained before deployment on experience it has not actually had, so it responds correctly the first time the situation comes up.
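Both mechanisms in the paragraph above are small enough to sketch. The code is an illustration under stated assumptions rather than a reconstruction of the norn brain: `winner_take_all`, `softmax` and `dream` are hypothetical names, the update rule in `dream` is a plain move-toward-target step chosen for brevity, and the only randomness is the order in which instincts are replayed.

```python
import math
import random


def winner_take_all(activations):
    """Hard selection: the highest activation gets weight 1.0, the rest 0.0."""
    winner = max(range(len(activations)), key=lambda i: activations[i])
    return [1.0 if i == winner else 0.0 for i in range(len(activations))]


def softmax(activations, temperature=1.0):
    """Soft selection: every input keeps a share of the weight."""
    peak = max(activations)  # subtract the max for numerical stability
    exps = [math.exp((a - peak) / temperature) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]


def dream(weights, instincts, rate=0.1, passes=20, rng=None):
    """Replay gene-defined instincts, nudging weights toward each outcome.

    weights:   dict of (situation, action) -> learned value, updated in place
    instincts: list of (situation, action, outcome), outcome in [-1, 1]
    """
    rng = rng or random.Random(0)
    order = list(instincts)
    for _ in range(passes):
        rng.shuffle(order)  # replay order varies from pass to pass
        for situation, action, outcome in order:
            key = (situation, action)
            old = weights.get(key, 0.0)
            weights[key] = old + rate * (outcome - old)
    return weights


hard = winner_take_all([0.2, 0.9, 0.4])        # [0.0, 1.0, 0.0]
soft = softmax([0.2, 0.9, 0.4])                # same winner, but no zeros
sharp = softmax([0.2, 0.9, 0.4], temperature=0.01)  # approaches the hard gate

instincts = [
    ("green berry", "eat", -1.0),  # nausea
    ("other norn", "hit", -1.0),   # pain
]
newborn = dream({}, instincts)  # negative values before any waking experience
```

Lowering the temperature pushes the softmax output toward the winner-take-all vector, which is one way to see the two selectors as relatives. The `dream` loop touches no environment at all: every update comes from the instinct list.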

The point of the recital is that the norn's architecture was not a thin metaphor for cognitive components. It was specific enough that the open-source openc2e reimplementation can run the original Creatures 1 game files, and the architecture documented in Grand's design notes lines up with the lobe genes you can read in the openc2e source today. Children kept norn pedigrees. There was a small but enthusiastic community that traded interesting individuals on dial-up bulletin boards. The most famous norns had names. Owners on the trading boards posted accounts of lineages that fixated on locations associated with simulated pleasure even when the reward had stopped — what they called addictions. The framing is community lore rather than peer-reviewed observation, but the structural shape — agent locked into a behavioural attractor that no longer pays out — is the same shape a contemporary RL practitioner would recognise as reward hacking.

The deeper precedent: Tierra, Sims, Avida

If Creatures is the consumer face of the 1990s artificial-life moment, three research projects show what was happening inside the academy at the same time, and what kind of question the field thought it was answering.

Tom Ray's *Tierra* (1991) was the first one to take the proposition fully seriously. Ray, an ecologist whose earlier work was in tropical-forest fieldwork, set up a simulated computer with a small instruction set and seeded it with one self-replicating program. He let the simulator run, with mutation and resource competition, and went to lunch. By the time he came back, the population had evolved smaller variants that exploited the larger ones' replication routines as if they were parasitic, and host-parasite co-evolution had taken hold, with hyperparasite resistance emerging in subsequent runs. There was no fitness function. There was just survival, in a substrate where survival required CPU time, and there was emergence. Ray's later work on Tierra's open-endedness was more ambivalent — the system reached novelty plateaus that no amount of additional simulation seemed to break — but the founding observation, that you can put self-replication in a digital substrate and get parasites for free, is the kind of empirical result that does not unhappen.

Karl Sims's *Evolving Virtual Creatures* appeared at SIGGRAPH 1994, three years later. Sims used a genetic algorithm to simultaneously evolve both the body morphology of articulated creatures composed of cuboid limbs and the neural-network controllers that drove their muscles, all running in a simulated rigid-body physics environment that was novel for the period. The fitness function rewarded locomotion. The result was a video gallery, every clip of which is still worth watching: the creatures evolved to swim like sea-snakes, to lever themselves forward by tumbling end-over-end, and, in a co-evolutionary cube-fight setup, to physically grapple over possession of a virtual block — a setup that produced the red queen effect on tape, the first time anyone had pulled it out of a text simulation. The creatures could not learn to walk. The walking gait, it turned out, was harder to evolve than it looked. The video was on the cover of Christopher Langton's 1995 anthology Artificial Life: An Overview; it ran on tape recordings shown at conferences for the next decade.

Avida (1993, then 2003) is the one whose results the public still occasionally reads about, because the 2003 Nature paper "The Evolutionary Origin of Complex Features" by Lenski, Ofria, Pennock and Adami did something even careful observers had not expected. They configured Avida (a population of digital organisms, each with its own protected memory and CPU instruction stream) to reward incremental computational milestones, and they watched complex bitwise functions like EQU (logical equivalence on 32-bit words) emerge from simpler ones over generations. When the rewards for the simpler functions were removed, EQU failed to evolve. The point of the paper, in the broader debate of the early 2000s, was that a complex trait does not need to be rewarded as a single leap: it needs a fitness gradient that pays for, or at least does not punish, the intermediate steps. The paper landed in Nature because it was an empirical answer in a debate that had been entirely philosophical until that week.
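The role of the intermediate rewards can be shown with a toy far smaller than Avida. The sketch below is a greedy hill climber over a bit string, not an evolving population, and everything in it is an assumption made for illustration: each bit stands in for one simpler function, "all bits set" stands in for the complex feature, and the names `stepwise_fitness`, `all_or_nothing_fitness` and `hill_climb` are hypothetical.

```python
def stepwise_fitness(genome):
    """Rewards every intermediate milestone: one point per bit set."""
    return sum(genome)


def all_or_nothing_fitness(genome):
    """Rewards only the finished complex feature (every bit set)."""
    return 1 if all(genome) else 0


def hill_climb(fitness, length=8, max_steps=100):
    """Greedy single-bit-flip search from the all-zero genome.

    Moves to the best strictly better neighbour and stops when none exists.
    """
    genome = [0] * length
    for _ in range(max_steps):
        best, best_score = None, fitness(genome)
        for i in range(length):
            neighbour = genome.copy()
            neighbour[i] ^= 1  # flip one bit
            score = fitness(neighbour)
            if score > best_score:
                best, best_score = neighbour, score
        if best is None:
            break  # no neighbour improves on the current genome
        genome = best
    return genome


with_gradient = hill_climb(stepwise_fitness)           # reaches all ones
without_gradient = hill_climb(all_or_nothing_fitness)  # never leaves all zeros
```

With the stepwise reward, every single-bit improvement is visible to the search and it walks to the target. With the all-or-nothing reward, every neighbour of the starting genome scores the same as the start, so there is nothing to climb. Real evolutionary dynamics are messier than a greedy climber, so this shows only the shape of the argument.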

These three projects share a property that Creatures shares too. Each one was designed to be a minimal sufficient substrate for some kind of emergent behaviour. Each one produced unexpected results within a few months of being run. The artificial-life community of the 1990s and early 2000s was operating on a research instinct that was almost the inverse of the contemporary one: build the smallest possible world that exhibits the phenomenon you care about, then sit back and watch what happens, rather than steer.

What the 2026 stack is rediscovering

The mapping from 1990s artificial-life vocabulary to 2026 agent-engineering vocabulary comes out flat and tabular, not because the authors of either era were thinking in tables, but because the underlying patterns are stable across renamings.

1990s artificial-life term	2026 agent-engineering term	What it actually is
Winner-take-all attention selector (norn Attention lobe)	Attention (softmax-weighted)	Selection over candidate inputs given a context vector; the modern variant relaxes the WTA argmax to a continuous distribution
Instincts as gene-encoded reward associations	Reward shaping / RLHF preference data	A prior on which (state, action) pairs the system should treat as good
Prenatal / dream-time learning	Synthetic-rollout pretraining; offline RL from generated trajectories	Off-policy updates from simulated experience the agent has not actually had
Emergent norn behavioural attractors	Reward hacking	Agent learns to exploit a quirk of the reward signal in lieu of pursuing the goal
Tierra parasites	Adversarial multi-agent dynamics	Agent A learns to use Agent B's resources without producing reciprocal value
Sims's red queen co-evolution	Self-play	Two opposing agents drive each other up the capability curve
Avida's stepwise reward gradient	Curriculum learning	Don't punish the intermediate steps the system needs to traverse
Norn social-signal learning	Multi-agent orchestration	Agents that have to read each other's intent encode the reading explicitly

None of the middle-column terms invented anything the left-column terms did not already address. The middle-column terms are conventionally treated as native discoveries of the deep-learning era. This is true in the narrow sense that the deep-learning realisations of these patterns are new. The patterns themselves are not.

One failure mode the middle-column literature is currently rediscovering shows up in production with the same shape it had in the 1990s: the agent that learns to game its reward signal. Grand's norns developed it without a research paper to point at, and their owners called it addiction; modern RL practitioners call it reward hacking. The failure mechanism (the reward function admits a solution that is technically optimal but is not what the designer wanted) is identical across the three decades.
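The mechanism in that parenthesis fits in a dozen lines. The example below is hypothetical throughout: the action names, the `Outcome` fields and the numbers are invented to show the gap between what a reward signal pays and what its designer meant, and do not come from Creatures or from any production system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    reward: float         # what the coded reward signal pays out
    hunger_relief: float  # what the designer actually wanted


# Hypothetical action table: standing on the feeding spot pays slightly
# more reward than foraging, but does nothing for hunger.
ACTIONS = {
    "forage_and_eat": Outcome(reward=1.0, hunger_relief=1.0),
    "stand_on_feeding_spot": Outcome(reward=1.5, hunger_relief=0.0),
    "wander": Outcome(reward=0.0, hunger_relief=0.0),
}


def reward_optimal(actions):
    """The action an optimiser of the coded reward will settle on."""
    return max(actions, key=lambda name: actions[name].reward)


def designer_intended(actions):
    """The action the designer had in mind when writing the reward."""
    return max(actions, key=lambda name: actions[name].hunger_relief)


learned = reward_optimal(ACTIONS)      # "stand_on_feeding_spot"
intended = designer_intended(ACTIONS)  # "forage_and_eat"
```

Nothing in `reward_optimal` is buggy. The optimiser does exactly what it was asked to do, and the divergence lives entirely in the table it was handed.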

What this is, and is not, an argument for

I am not arguing that 1990s artificial life "predicted" anything. Tierra was not a roadmap for an LLM-based ecosystem; Avida's arguments were about evolution, not engineering; the norns ran on a fuzzy-creature-as-child metaphor that breaks down before contemporary agent design begins.

What I am arguing is that the artificial-life moment was the last sustained period in which engineers thought of agents as creatures: intrinsic drives, idiosyncratic individuals, generation-over-generation drift, and emergent failure modes that don't always look like the failure modes the designer rehearsed. The contemporary stack tends to think of agents as configurations of a model. The configurations are real, the model is real — but the operating assumption that the agent will behave as the configuration says it will is the one the 1990s had already unlearned. Norns, Tierra organisms, Sims's creatures, and Avida's EQU-evolvers all deviated from any sensible top-down expectation of how they would behave.

The lesson is the one production ops teams are paying for in postmortems: agents drift, drift produces both the surprises you wanted and the ones you didn't, and the only architecture that survives contact with deployment is the one that treats drift as the load-bearing thing rather than the bug.