vishalmysore
AI Needs RNA, Not Just Weights

There is a creature at the bottom of the ocean that solves intelligence differently than every other animal on Earth. The octopus has no centralized command-and-control architecture. Two-thirds of its five hundred million neurons live not in its brain, but distributed across eight semi-autonomous arms — each capable of local decision-making, sensation, and response without a round trip to headquarters. More remarkably, the octopus edits its own RNA in real time, reconfiguring the proteins that make its neurons fire differently depending on water temperature, prey, threat, and experience. It does not reboot. It does not retrain. It edits its expression of what it already knows.

We are building AI systems that share almost none of these properties.

This article is not a claim that AI literally needs ribonucleic acid. It is a proposal for a biologically inspired architecture principle — one grounded in the gap between how living intelligence actually works and how our current AI systems are engineered. The argument is simple: we have built very good DNA. We have not yet built the RNA.


Part I — The Frozen Model Problem

A large language model is trained once. Over weeks or months, on hardware consuming megawatts of power, billions of parameters are adjusted by gradient descent until the model can predict the next token in a sequence with remarkable accuracy. Then training ends. The weights are frozen. The model is deployed.

From that moment forward, the model is static. It can respond to new prompts, but it cannot truly adapt to them. Its knowledge is bounded by its training cutoff. Its personality is fixed by its alignment fine-tune. Its competencies are whatever emerged from the pre-training distribution. Asking a deployed LLM to learn something new is like asking a photograph to move.

"We have created extraordinarily capable static systems and mistaken their fluency for adaptability."

The workarounds we reach for reveal the depth of the problem. Context windows provide temporary, session-scoped information — but they are ephemeral. Once cleared, the model reverts entirely. Retrieval-augmented generation pipes external knowledge into the prompt — but the model does not actually learn from it; it merely reads it. Fine-tuning provides genuine adaptation, but at costs measured in time, compute, and the constant risk of catastrophic forgetting: the phenomenon where adapting to new information overwrites prior knowledge in ways that cannot be predicted or controlled.

Prompt engineering — the art of coaxing behavior through carefully structured inputs — is our most widely used adaptation mechanism. It is also the most revealing limitation. The fact that we have built an entire subdiscipline around phrasing instructions differently to get different behavior from a model that cannot actually change is a sign that something fundamental is missing from the architecture.

The Core Constraints

  • Frozen weights — parameters locked at deployment; no modification during inference
  • Ephemeral context — session memory evaporates; nothing persists across interactions
  • Expensive adaptation — fine-tuning requires significant compute and risks stability
  • Monolithic architecture — one model serves all tasks, contexts, and users identically
  • No runtime self-modification — the model cannot change itself in response to what it encounters

None of these constraints are fundamental laws of computation. They are engineering choices — choices shaped by what was tractable, measurable, and deployable at scale. But if we look at how biological intelligence solves the same problems, it becomes clear we may have optimized for the wrong layer of the stack.


Part II — What the Octopus Figured Out

The study that first drew widespread attention to octopus RNA editing was published in Cell in 2017. Researchers at the Marine Biological Laboratory found that the octopus, unlike virtually all other animals, edits the majority of its RNA transcripts — the working copies of genetic instructions used to build proteins. Where humans edit perhaps one or two percent of protein-coding transcripts, the octopus edits approximately sixty percent.

To understand why this matters, a brief detour into molecular biology is warranted.

DNA, RNA, and the Difference Between Blueprint and Production

DNA is the master blueprint of a living cell. It encodes the instructions for building every protein the organism will ever need. But DNA does not directly build proteins — it is transcribed into RNA first. RNA is the working copy: a temporary, single-stranded molecule that carries the genetic message from the nucleus to the ribosomes where proteins are assembled. In most organisms, this process is relatively faithful. The RNA copy closely matches the DNA template.

RNA editing changes this. Specific enzymes called ADARs (adenosine deaminases acting on RNA) can chemically alter individual nucleotides in the RNA transcript after it has been copied from the DNA but before it has been translated into protein. A single nucleotide change can alter which amino acid gets incorporated into the resulting protein — changing its shape, its electrical properties, its function. The DNA is untouched. The gene itself is unchanged. But the protein that gets built is different.

The octopus edits the expression of what it already knows — without altering the underlying source code.

In the octopus, this mechanism is used to tune neural proteins in real time. As water temperature changes, the octopus edits RNA transcripts for ion channel proteins in its neurons — keeping its nervous system functional across a temperature range that would otherwise cause it to either seize or shut down. It is not evolving. It is not retraining. It is performing a targeted, reversible modification of its own neural hardware, at the molecular level, in response to its immediate environment.

The octopus trades evolutionary flexibility for operational flexibility. Most organisms let evolution do the adaptation work across generations, preserving the integrity of individual genomes. The octopus made a different bet: keep the genome conservative, but give the transcriptome the freedom to reconfigure at runtime. It is, in software engineering terms, as if the octopus chose runtime configuration over compile-time optimization.

Decentralized Intelligence

The RNA editing story is only half of what makes the octopus architecturally interesting. The other half is the distribution of intelligence itself.

An octopus arm, severed from the body, will continue to respond to stimuli for over an hour. It will attempt to pass food to where the mouth used to be. It has not lost its intelligence — because much of that intelligence was never centralized to begin with. Each arm contains a ganglion, a cluster of neurons capable of local processing and decision-making. The central brain sets high-level goals; the arms execute them with semi-autonomous local competency.

This is not just a biological curiosity. It is an architectural pattern with direct analogues to modern AI system design — and it suggests that the centralized, monolithic model architecture we have built may not be the only viable approach to general intelligence.


Part III — A Framework: Mapping Biology to AI

The value of a biological analogy depends entirely on whether it maps cleanly onto the engineering problem at hand. Loose metaphors are aesthetically pleasing but operationally useless. What follows is an attempt at a precise mapping — one where each biological concept corresponds to a concrete AI engineering challenge.

| Biology | Role | AI Equivalent | Current State |
|---|---|---|---|
| DNA | Master blueprint; rarely changes | Base model weights | Frozen post-training |
| RNA | Working copy; dynamic and temporary | Runtime adaptive layers | Largely absent |
| RNA Editing | Live modification of the working copy | Dynamic weight modification | Partial — LoRA, adapters |
| Neurons | Signal processing units | Network activations | Implemented |
| Evolution | Slow, generational weight optimization | Pre-training / fine-tuning | Slow and expensive |
| Epigenetics | Gene expression without DNA change | Prompt engineering / in-context learning | Impermanent |
| Arm Ganglia | Decentralized local intelligence | Specialized sub-agents / MoE | Emerging |

The most striking column is the last one, Current State. The biological architecture that enables runtime adaptability — the RNA layer — has no direct equivalent in current deployed AI systems. What we have instead are approximations: adapters that add lightweight parameter deltas without modifying the base model; prompt engineering that alters behavior without touching weights; retrieval mechanisms that augment knowledge without encoding it.

These are all, in biological terms, epigenetic mechanisms. They change the expression of what the model can do without changing the underlying weights. They are impermanent, shallow, and constrained by what the base model already knows. They are not RNA — they are proxies for RNA in a system not architected to have it.


Part IV — A Proposed Architecture: The Living AI Stack

If we take the biological analogy seriously as an engineering specification rather than a metaphor, what would an AI architecture built on these principles actually look like? Below is a concrete proposal for what I am calling the Living AI Stack — five layers, each with a biological counterpart and a specific engineering role.

Layer 1 — The Core Model (DNA)

The base model weights remain the foundation. A large, capable, general-purpose model trained on broad data — your GPT-4, your Llama 3, your Claude. This is the DNA: the universal blueprint that encodes deep knowledge of language, reasoning, and the world. It changes slowly, through deliberate training runs, and its integrity is treated as sacrosanct. You do not edit the DNA casually.

The key architectural shift is what this layer is not asked to do. It is not asked to be everything — to handle every task, context, and user with the same fixed configuration. It is a stable, high-quality foundation. Adaptability happens above it.

Layer 2 — Dynamic Adapters (RNA)

Above the base model sits a layer of lightweight, swappable parameter deltas — analogous to RNA transcripts. These are task-specific, context-specific, or user-specific adapter modules: small enough to load in milliseconds, powerful enough to meaningfully redirect model behavior, and disposable when no longer needed.

This concept already has early implementations. Low-rank adaptation (LoRA) and its variants allow a small number of additional parameters to steer a large model's behavior without modifying the base weights. Prefix tuning prepends learned virtual tokens that shape the model's attention. These techniques work, but they are currently deployed statically — loaded at inference start and fixed for the session. The architectural upgrade is to make them genuinely dynamic: loaded, modified, and unloaded in response to real-time signals from the environment.

The Hot-Patch Analogy: Software engineers will recognize a familiar pattern: hot patching. In a running system, a hot patch applies a behavioral change without stopping the process. The RNA layer is, architecturally, a form of continuous neural hot-patching — where the "patch" is not code but learned behavioral parameters, applied and removed in response to context.
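As a concrete sketch of this hot-patching idea, the toy code below keeps a frozen base weight matrix and composes hot-swappable low-rank deltas in LoRA style (effective weight = base + B·A per active adapter). The class and method names (`AdapterHost`, `load`, `unload`) are illustrative assumptions, not a real library's API, and the actual training of a delta is elided.

```python
import numpy as np

class AdapterHost:
    """Sketch of a frozen base weight with hot-swappable low-rank
    deltas (LoRA-style). Names here are hypothetical, not a real API."""

    def __init__(self, base_weight: np.ndarray):
        self.base = base_weight          # "DNA": never modified
        self.base.setflags(write=False)  # enforce immutability
        self.adapters = {}               # name -> (A, B) low-rank pair
        self.active = set()

    def load(self, name: str, rank: int, rng: np.random.Generator):
        out_dim, in_dim = self.base.shape
        # LoRA parameterizes the delta as B @ A, with rank << min(dims)
        A = rng.normal(0, 0.02, size=(rank, in_dim))
        B = np.zeros((out_dim, rank))    # zero-init: no effect until trained
        self.adapters[name] = (A, B)
        self.active.add(name)

    def unload(self, name: str):
        self.active.discard(name)        # fully reversible: base untouched

    def effective_weight(self) -> np.ndarray:
        # W_eff = W_base + sum of the active low-rank deltas
        W = self.base.copy()
        for name in self.active:
            A, B = self.adapters[name]
            W += B @ A
        return W
```

Loading and unloading are dictionary operations on small matrices, which is what makes millisecond-scale swapping plausible; the base weights are marked read-only so no code path can mutate the "genome" by accident.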

Layer 3 — The Context Rewriter (RNA Editing)

RNA editing does not swap transcripts — it surgically modifies individual nucleotides within them. The AI analogue is a meta-layer capable of targeted, real-time modification of the model's effective behavior at the activation level.

Recent research in mechanistic interpretability has produced tools that make this tractable. Activation steering — the insertion of learned vectors into a model's residual stream during inference — can reliably alter specific behavioral attributes without modifying weights. Sparse autoencoders trained to decompose model internals into interpretable features can identify and patch specific circuits. Contrastive activation addition (CAA) can shift a model's stance on a topic through direct geometric intervention in activation space.

These techniques are RNA editing: they modify the expression of the model's knowledge without touching the underlying parameters. They are reversible, targeted, and can be applied at inference time. What they currently lack is systematic integration into a production architecture — a framework that decides when to edit, what to edit, and how to verify the edit was correct.
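The geometric intervention described above can be sketched in a few lines: a steering vector computed, CAA-style, as the mean activation difference over contrasting prompt pairs, then added to a hidden state at inference time. The arrays below are toy stand-ins for a real model's residual stream, and the function names are my own, not from any steering library.

```python
import numpy as np

def contrastive_steering_vector(pos_acts, neg_acts):
    """Steering vector as the mean difference between activations
    collected on contrasting ("positive" vs "negative") prompts."""
    return np.mean(pos_acts, axis=0) - np.mean(neg_acts, axis=0)

def steer(hidden, vector, alpha=1.0):
    # Inference-time edit: h' = h + alpha * v. Weights are untouched,
    # and removing the intervention restores original behavior exactly.
    return hidden + alpha * vector
```

The reversibility claim in the text falls directly out of the arithmetic: the edit is additive, so subtracting `alpha * vector` undoes it.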

Layer 4 — Arm Agents (Distributed Ganglia)

Rather than routing all intelligence through a single monolithic model, the arm-agent layer distributes cognitive work across specialized, semi-autonomous sub-agents. Each agent is a domain expert: one handles code, one handles retrieval, one handles multi-step reasoning, one handles tool use. They receive high-level directives from an orchestrator but execute with local autonomy — much as octopus arms receive a general intention from the central brain and implement it through their own ganglionic intelligence.

Mixture-of-Experts architectures begin to address this at the model level, routing different input tokens through different specialized sub-networks. Multi-agent frameworks like AutoGPT and CrewAI address it at the system level. Neither fully realizes the biological pattern — MoE lacks the true autonomy of arm ganglia, while current multi-agent frameworks lack efficient coordination mechanisms. The mature version of this layer will combine both: specialized sub-models with genuine local competency, coordinated by an orchestrator with a light touch.
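A minimal sketch of the orchestrator-with-ganglia pattern, under hypothetical names: the orchestrator routes a high-level goal to a domain arm and never dictates the steps, which stay inside the arm's local policy.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ArmAgent:
    """A semi-autonomous sub-agent: receives a goal, decides locally
    how to execute it. Illustrative sketch, not a framework API."""
    domain: str
    execute: Callable[[str], str]  # local policy; opaque to the orchestrator

class Orchestrator:
    def __init__(self, arms):
        self.arms = {arm.domain: arm for arm in arms}

    def dispatch(self, domain: str, goal: str) -> str:
        # Light-touch coordination: route the goal, not the steps.
        arm = self.arms.get(domain)
        if arm is None:
            raise KeyError(f"no arm registered for domain {domain!r}")
        return arm.execute(goal)
```

The design choice worth noticing is that `execute` is a black box to the orchestrator: like an arm ganglion, the sub-agent owns its own procedure, and the central layer only sets intent.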

Layer 5 — Persistent Memory (Epigenetic State)

Epigenetic markers do not change DNA, but they change which parts of the DNA get read — and those changes can persist across cell divisions. The AI equivalent is a writable external memory that persists across sessions and shapes how the model attends to and processes new information.

Vector databases provide a version of this: retrieved embeddings inject prior knowledge into the context without modifying the model. But current implementations are read-heavy and write-light — the model queries memory but rarely writes to it in a structured way. The epigenetic layer should be a true read-write store, updated through each interaction, and feeding back into the adapter layer to shape which RNA transcripts are loaded for the next session. This is how the model accumulates personalization, domain expertise, and institutional memory — not through retraining, but through accumulated epigenetic state.
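A toy read-write epigenetic store, under the stated assumption that domain-usage frequency drives adapter preloading; a production version would use embeddings and a vector database rather than plain counters. All names are hypothetical.

```python
import time
from collections import defaultdict

class EpigeneticStore:
    """Sketch of a persistent read-write memory whose accumulated
    state feeds back into adapter selection for the next session."""

    def __init__(self):
        self.records = []                   # append-only interaction log
        self.domain_counts = defaultdict(int)

    def write(self, domain: str, note: str):
        # Write path: every interaction leaves persistent state behind.
        self.records.append((time.time(), domain, note))
        self.domain_counts[domain] += 1

    def adapters_to_preload(self, top_k: int = 2):
        # Feedback path: accumulated state decides which "RNA
        # transcripts" (adapters) to load before the next session.
        ranked = sorted(self.domain_counts, key=self.domain_counts.get,
                        reverse=True)
        return ranked[:top_k]
```

The point of the sketch is the loop, not the data structure: writes happen on every interaction, and reads shape the adapter layer rather than just the prompt.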


Part V — What This Would Enable

The practical implications of a fully realized Living AI Stack are significant enough to warrant concrete examination rather than hand-waving at "more powerful AI."

True Personalization

Today's AI personalization is largely cosmetic: a system prompt that sets tone, a few preference flags, maybe a retrieved summary of past interactions. With a genuine RNA layer, personalization would operate at the parameter level — the model's actual computational behavior shaped by accumulated adapters that encode a user's style, preferences, domain vocabulary, and interaction history. This is not a different prompt; it is a genuinely different configuration of the same underlying intelligence.

Rapid Domain Adaptation

A hospital deploying a general-purpose LLM today faces a choice: fine-tune on medical data (expensive, slow, risky) or rely on retrieval augmentation (shallow, impermanent). With dynamic adapters, a medical RNA module could be loaded in milliseconds, configuring the model for clinical reasoning, drug interaction awareness, and appropriate uncertainty communication — then unloaded when the session ends, leaving the base model unchanged. The same model serves radically different domains through adapter hot-swapping rather than through competing fine-tunes.

Continual Learning Without Catastrophe

Catastrophic forgetting — the tendency of neural networks to overwrite prior knowledge when trained on new data — is one of the deepest unsolved problems in machine learning. The RNA architecture suggests a structural solution: keep the base model frozen and route new learning into the adapter and memory layers. The DNA is never overwritten. New knowledge accumulates as additive epigenetic state. Forgetting becomes a policy choice, not an architectural inevitability.

Lower Infrastructure Cost

Training a frontier model costs tens of millions of dollars. Fine-tuning costs hundreds of thousands. LoRA adapters cost thousands. Activation steering interventions cost dollars. Prompt engineering costs nothing but human time. The Living AI Stack is, among other things, a cost architecture: it pushes adaptation work down to the cheapest possible layer, reserving expensive operations for changes that genuinely require them.


Part VI — The Risks Are Real, and They Are Not Small

A system that can modify itself at runtime is a system that can modify itself in unexpected ways. The risks of the Living AI Stack deserve as much engineering attention as its benefits — and in several cases, those risks are not yet solved problems.

Alignment Drift

Alignment is hard enough to maintain in a frozen model. A model that continuously updates its adapter layers and epigenetic memory may drift from its alignment constraints in ways that are gradual, compounding, and difficult to detect. Each individual modification may be small and seemingly benign; the cumulative drift may not be. Biology offers a cautionary analogue here too — uncontrolled RNA editing is implicated in neurodegenerative diseases and several cancers. Dynamic systems that edit themselves need robust error correction and integrity verification mechanisms. We do not yet have their AI equivalents.

Adversarial Manipulation

A writable memory layer and a dynamic adapter system create new attack surfaces for adversarial actors. Prompt injection into persistent memory — where a malicious input writes corrupted state that shapes future sessions — is a particularly serious concern. A deployed system whose RNA layer can be poisoned through sufficiently clever inputs is not just exploitable; it is persistently compromised in ways that may be invisible to conventional monitoring. Security architectures for Living AI systems will need to be designed from first principles, not retrofitted from existing LLM safety work.

Reproducibility and Auditability

Regulated industries — medicine, law, finance — require that AI outputs be reproducible and auditable. A system whose behavior varies based on runtime state violates this requirement by default. The same query, submitted at different times with different adapter configurations and memory states, may produce meaningfully different responses. This is not a fatal flaw — humans are also non-reproducible — but it demands new frameworks for logging, versioning, and auditing adaptive AI system behavior.

Catastrophic Self-Modification

The most dramatic risk is a system that edits itself into a pathological state. Neural networks are known to have sharp loss landscapes where small parameter perturbations cause large behavioral changes. An RNA editing mechanism that applies too aggressive a modification to a critical circuit could produce behavior that is not just different but broken in ways that are difficult to diagnose and reverse. Biological cells have extensive machinery dedicated to detecting and correcting RNA editing errors. AI systems will need analogous safeguards.

The Engineering Constraint: None of these risks argue against building RNA-equivalent AI systems. They argue for building them carefully — with integrity verification at every modification point, immutable audit logs of all state changes, and hard limits on the scope of runtime self-modification. The goal is a system that is adaptive like an octopus, not unstable like a cancer cell.
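One of those safeguards, the immutable audit log, can be sketched as a hash chain: each recorded modification commits to the previous entry's digest, so any after-the-fact tampering breaks verification. Class and field names here are illustrative, not a specification.

```python
import hashlib
import json

class ModificationLog:
    """Append-only audit log for runtime self-modification events.
    Each entry is hash-chained to its predecessor."""

    def __init__(self):
        self.entries = []

    def record(self, change: dict) -> str:
        prev = self.entries[-1]["hash"] if self.entries else "genesis"
        payload = json.dumps(change, sort_keys=True)  # canonical form
        digest = hashlib.sha256((prev + payload).encode()).hexdigest()
        self.entries.append({"change": change, "prev": prev, "hash": digest})
        return digest

    def verify(self) -> bool:
        # Recompute the chain; any edited past entry invalidates it.
        prev = "genesis"
        for e in self.entries:
            payload = json.dumps(e["change"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True
```

A production system would anchor the chain head somewhere external to the model's own control, but even this toy version makes silent retroactive edits detectable.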


Part VII — From Static Models to Living Systems

The trajectory of AI development over the next decade will be determined by which architectural bets the field places now. The current bet is a large one: that scale, in the form of larger models trained on more data with more compute, will continue to yield capability improvements sufficient to justify the costs. That bet may continue to pay off. But it is worth examining whether it is the only bet on the table.

The biological record suggests that intelligence did not evolve through scale alone. The octopus and the human being show comparable behavioral complexity in certain domains despite the octopus nervous system containing orders of magnitude fewer neurons (roughly 500 million against the human brain's 86 billion). The difference is architecture. The octopus solved intelligence through distributed processing, runtime adaptability, and a molecular-level mechanism for tuning neural hardware in response to environmental signals — not through having the most neurons.

I am not claiming octopus intelligence is equivalent to human intelligence, nor that current AI scaling is not effective. I am claiming that the gap between current AI architecture and what the biological record suggests is possible is large enough to be worth engineering attention.

DNA gave life its memory.

RNA gave it the ability to act.

AI has the memory. Now it needs the RNA.

The path from here to Living AI is not a single research breakthrough. It is a series of engineering decisions that, taken together, shift the architecture from static to adaptive: making adapters truly dynamic rather than session-fixed; integrating activation steering into production inference pipelines; building read-write persistent memory with proper integrity guarantees; designing multi-agent systems with genuine local autonomy rather than centralized orchestration with a thin veneer of distribution.

None of these are impossible. Several are partially built already, scattered across research labs and production systems that have not yet been integrated into a coherent architectural vision. What is missing is not the components — it is the blueprint. The recognition that we are building, in biological terms, a very sophisticated genome delivery mechanism, and that we have not yet built the cell.


Final Take

The octopus did not wait for evolution to solve its temperature problem. It developed a mechanism to solve it in real time, using the intelligence it already had, without rewriting its own source code. That is not a metaphor for what AI should aspire to. It is a proof of concept that runtime self-modification, properly constrained, produces robust and adaptable intelligence.

Our AI systems are remarkable. They are also, in a deep architectural sense, frozen. They know a great deal about the world, but they cannot truly change in response to it. They have DNA and no RNA — a genome without a cell to express it dynamically.

Building the RNA layer will require solving hard problems in safety, interpretability, and systems architecture. It will require abandoning some convenient assumptions about what it means for a model to be "deployed." It will require taking seriously the insight that the most impressive general intelligence we have studied — biological intelligence — solved adaptability not by being large, but by being alive in a way that our current systems are not.

That is the direction worth building toward.


Further Reading

  • Liscovitch-Brauer et al. (2017). "Trade-off between Transcriptome Plasticity and Genome Evolution in Cephalopods." Cell, 169(2), 191–202.
  • Hu et al. (2022). "LoRA: Low-Rank Adaptation of Large Language Models." ICLR 2022.
  • Zou et al. (2023). "Representation Engineering: A Top-Down Approach to AI Transparency." arXiv:2310.01405.
  • Turner et al. (2023). "Activation Addition: Steering Language Models Without Optimization." arXiv:2308.10248.
  • McCloskey & Cohen (1989). "Catastrophic Interference in Connectionist Networks." Psychology of Learning and Motivation, 24, 109–165.
  • Anthropic Interpretability Team (2024). "Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet." Anthropic Research.
  • Hochreiter & Schmidhuber (1997). "Long Short-Term Memory." Neural Computation, 9(8), 1735–1780.
