Why the Kotlin creator's new language solves yesterday's problem — and what to build instead.
The Punchline First
Code is a transitional artifact. Like assembly language behind C — it doesn't disappear, it becomes invisible. LLMs are the compiler. Your intent is the source.
But intent without structure is vibe coding. And structured intent without separation of concerns is CodeSpeak — a beautiful bridge over a river that's drying up.
What actually survives the transition? Not code. Not specs. Decisions.
The Musk Corollary
The trajectory is clear: programs will be written directly in machine code — no human-readable languages in between. Musk said as much. The logic is simple.
If LLMs can translate intent straight into machine execution, then Python, JavaScript, Go — all of them — are the new assembly language. They don't disappear. They become invisible. The LLM uses compilers, runtimes, SQL, APIs natively — but no human ever reads that layer. Just as you never read the x86 instructions your C compiler emits.
This kills the bottom half of the stack. But it leaves the top half wide open. If the question "what language to write in" vanishes, the question "what exactly to build" becomes the only one that matters. Musk describes the execution layer collapsing. He says nothing about what replaces the intent layer.
That's the gap. Code was the intent layer — badly. Specs tried to be — CodeSpeak is the latest attempt. DNA is the answer that doesn't depend on either.
CodeSpeak: Right Diagnosis, Wrong Treatment
Andrey Breslav — the creator of Kotlin, a language used by 7 million developers — launched CodeSpeak in early 2026. The pitch: write plain-English specifications, LLMs compile them to Python/JS/Go. Maintain specs, not code. Shrink your codebase 5–10×.
The diagnosis is correct. Natural language is ambiguous. LLMs guess. Results are unpredictable. Engineers waste time debugging AI-generated code instead of shipping.
The treatment is a formal specification language — a DSL that sits between English and code, removing ambiguity for the LLM.
The problem: CodeSpeak is still in alpha, solving a 2024 problem. By the time it reaches production stability, the problem may not exist.
Why the Problem Disappears
Context windows are growing. When an LLM sees your entire project — every file, every commit, every past decision — ambiguity collapses. It doesn't need a formal spec to know your patterns.
LLMs are learning to ask. Claude Code in plan mode already asks "did you mean X or Y?" when it detects ambiguity. This is CodeSpeak's ambiguity checker — built into the agent, no DSL needed.
Agents are becoming stateful. Memory, CLAUDE.md, skills, project context — the agent accumulates your decisions across sessions. It "knows" what you mean because it remembers what you meant last time.
Inference cost is plummeting. When generation costs $0.01 and takes 5 seconds — regeneration is cheaper than spec maintenance. "Maintain nothing, regenerate everything."
CodeSpeak is a fax machine perfected in 1995. Technically impeccable. Solving a real problem. But email already exists and scales faster.
What Actually Survives
This emerged from building a scientific knowledge platform for geomorphological publications. The project went through three architectural generations. Each added infrastructure — databases, vector stores, embedding models, reranking pipelines. Thousands of lines of Python. 211 passing tests.
Then I asked: what if context windows grow to 10M tokens? What if I can just load all my distilled knowledge into the LLM and ask directly?
Answer: 80% of the architecture becomes unnecessary. The databases, the vector search, the chunking strategies — all of it is infrastructure for working around a limitation that's disappearing.
What remains? The decisions.
"Facts and claims are separate entities, because different authors draw different conclusions from the same measurements." This is true regardless of whether I use PostgreSQL, files, or a 10M-token context window.
"Every assertion is traceable to a specific page in the original document." True for any implementation.
"Roundness (Wadell) ≠ Circularity. Both are needed, both are stored, they don't substitute for each other." True forever.
These decisions don't live in code. They don't live in specs. They live in the expert's head — and they're the only thing that doesn't become obsolete when the stack changes.
DNA/RNA: A Methodology
I formalized this as a two-layer system, borrowing from biology.
DNA — the genetic code of your system. A 2–5 page document containing only decisions that are true regardless of implementation. No technology names. No frameworks. No model versions. Just: what exists, what's forbidden, what's valuable, how to act.
Philosophically, DNA has four layers:
- Ontology — what entities exist and how they relate. "Fact ≠ Claim." "Measurement ≠ Calibration ≠ Interpretation."
- Deontics — what's permitted and what's forbidden. "Originals are immutable." "No assertion enters the system without evidence."
- Axiology — what's valuable. "Completeness over speed." "Quality over throughput."
- Praxeology — how to act. "Triage is built into the process." "Infrastructure is temporary, knowledge is permanent."
If you remove all technology names and the document still makes sense — it's DNA. If it doesn't — you've mixed in implementation.

DNA = Ontology + Deontics + Axiology + Praxeology + Domain + Evolution
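The "remove all technology names" litmus test can even be made mechanical. A minimal sketch in Python, assuming a hand-maintained denylist (the terms below are illustrative, not exhaustive):

```python
import re

# Illustrative denylist of technology names -- extend for your own stack.
# The point is the litmus test above: a DNA document should contain none.
TECH_TERMS = {
    "postgresql", "sqlite", "python", "javascript", "go",
    "react", "docker", "kubernetes", "gpt-4", "claude",
}

def dna_lint(text: str) -> list[str]:
    """Return every technology name found in a candidate DNA document."""
    words = re.findall(r"[a-z][a-z0-9.+-]*", text.lower())
    return sorted({w for w in words if w in TECH_TERMS})

dna = "Facts and claims are separate entities. Originals are immutable."
mixed = "Facts live in a PostgreSQL table, embeddings in a vector store."

print(dna_lint(dna))    # []
print(dna_lint(mixed))  # ['postgresql']
```

An empty result doesn't prove the document is DNA, but a non-empty one proves it isn't: implementation has leaked in.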
RNA (Harness) — the expression of DNA for a specific environment. Translates invariants into machine-checkable rules for a specific stack, agent, and CI pipeline.
DNA says: "Facts and claims are separate entities."
RNA says: "Tables facts and claims are separate. Test: no record in facts without a fact_type from the allowed list."
DNA changes when your understanding of the domain changes (rare — years). RNA changes when you switch stacks (common — months). Same species, different habitat.
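A sketch of how one such RNA rule becomes machine-checkable, assuming a SQL stack. Table and column names follow the example above; the allowed fact_type values and the sample data are invented for illustration:

```python
import sqlite3

# RNA for the invariant "facts and claims are separate entities",
# expressed as a schema rule for one particular habitat (here: SQLite).
# The allowed fact_type values are illustrative assumptions.
ALLOWED_FACT_TYPES = ("measurement", "observation", "calibration")

conn = sqlite3.connect(":memory:")
conn.executescript(f"""
    CREATE TABLE facts (
        id INTEGER PRIMARY KEY,
        fact_type TEXT NOT NULL
            CHECK (fact_type IN {ALLOWED_FACT_TYPES!r}),
        body TEXT NOT NULL
    );
    CREATE TABLE claims (  -- separate table, per the DNA
        id INTEGER PRIMARY KEY,
        author TEXT NOT NULL,
        body TEXT NOT NULL
    );
""")

# A valid fact goes in; an unknown fact_type is rejected by the schema
# itself, so no agent-generated code can violate the invariant silently.
conn.execute("INSERT INTO facts (fact_type, body) VALUES (?, ?)",
             ("measurement", "median grain size 0.3 mm"))
try:
    conn.execute("INSERT INTO facts (fact_type, body) VALUES (?, ?)",
                 ("opinion", "sand is nice"))
except sqlite3.IntegrityError:
    print("rejected: fact_type not in the allowed list")
```

Swap stacks and only this file changes; the sentence "facts and claims are separate entities" does not.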
The Hierarchy
DNA — invariants, for humans (years)
↓
RNA/Harness — enforcement, for agents (months)
├── CLAUDE.md (agent contract)
├── Skills (codified experience)
└── Plugins/MCP (agent's tools)
↓
Requirements → TechnicalDesign → DevPrompts → Code (days)
↑
DNA Audit — third quality loop (feedback to DNA)

The higher the layer, the longer it lives and the more it belongs to the human. The lower the layer, the faster it changes and the more it belongs to the agent.
Each lower level derives from the upper. Contradiction with the upper level is an error in the lower, not the upper.
The key insight: unit tests check if code works. Integration tests check if components work together. DNA Audit checks if the right code was written at all.
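What a DNA Audit check might look like in practice: it takes a DNA invariant ("every assertion is traceable to a specific page in the original document") and verifies it against what the system actually contains. A hedged sketch; the record shape is hypothetical:

```python
# A DNA Audit check sits above unit tests: it does not ask "does this
# function work?" but "does the system as built still honor a DNA
# invariant?". The invariant here: every assertion is traceable to a
# specific page in an original document. Record fields are hypothetical.

def audit_traceability(assertions: list[dict]) -> list[str]:
    """Return one violation message per untraceable assertion."""
    violations = []
    for a in assertions:
        if not a.get("source_document") or not a.get("source_page"):
            violations.append(f"assertion {a.get('id')} is not traceable")
    return violations

records = [
    {"id": 1, "text": "roundness 0.6",
     "source_document": "smith2019.pdf", "source_page": 14},
    {"id": 2, "text": "well-sorted sand"},  # drifted in without a source
]

print(audit_traceability(records))  # ['assertion 2 is not traceable']
```

Record 2 might pass every unit test and every integration test; only the audit notices that it should never have entered the system at all.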
Why This Beats CodeSpeak
CodeSpeak formalizes intent into a DSL. DNA/RNA separates what you know from how it's implemented.
CodeSpeak: endpoint POST /auth/login { request { body { email: string @required } } }
DNA: "Every user action is authenticated. Authentication failure returns a reason, not a generic error."
CodeSpeak is tied to a target language and an LLM's ability to parse the DSL. DNA is natural language — readable by any human, any LLM, any future agent.
CodeSpeak ages when models improve. DNA ages only when your domain understanding changes.
CodeSpeak adds a layer between intent and execution. DNA removes one — it's what you'd tell a competent colleague on their first day, stripped of all implementation noise.
Harness Engineering: RNA Without DNA
On March 16, 2026, an article on Harness Engineering described an OpenAI experiment: a small team built a production product, roughly one million lines of code, without a single human-written line. Every line was generated by Codex agents, in an estimated one-tenth of the time it would have taken by hand.
The key discovery: architectural intent must be mechanically enforced — linters, CI, "golden principles" baked into the repository — because agents replicate patterns at scale. Without guardrails, the codebase decays faster than humans can review it.
Their solution: a small AGENTS.md entrypoint. Opinionated rules in the repo. Background tasks scanning for deviations. This is exactly what we call RNA. It works. It's production-tested at million-line scale.
But it has no root.
AGENTS.md says "prefer shared utility packages over hand-rolled helpers." Why? Where is that decision recorded? In someone's head. Or in a Slack thread. Or nowhere. When the team changes, the golden principles need to be re-derived from scratch.
DNA is that root. Harness Engineering is the best implementation of RNA we've seen. DNA/RNA completes the picture: decisions that don't change (DNA) → rules that enforce them (RNA/Harness) → code that's generated, tested, and disposable.
Who Is the Developer?
The person who writes DNA is not a programmer. They're an architect of the solution — someone who knows exactly what they need, why, and what the constraints are. They decompose complex problems into invariants. They define "good" and "bad" results. They formulate constraints in natural language, completely and precisely.
This is not vibe coding. Vibe coding is "make me something nice, I don't know what." DNA is maximum rationality: I know the domain, I know the constraints, I know the quality criteria. I don't need an intermediate language to express this. I need an executor that understands natural language in full context.
The difference between a domain expert with DNA and a vibe coder is not the tool — it's the head. The tool is the same (LLM). But one says "make me an app" and the other says "facts and claims are separate entities, measurements store raw values and calibrations separately, don't split text inside borehole descriptions."
The second gets a working system. The first gets a prototype that collapses on real data.
Try It
- Take your current project.
- List every decision that would survive a complete rewrite in a different language, different database, different framework.
- Write them down. Two to five pages. No technology names.
- That's your DNA.
Everything else is RNA — important, but replaceable.
Stress Test
"This is just documentation." — No. Documentation describes what was built. DNA prescribes what must not be violated. It's closer to a constitution than a manual.
"ADRs already exist." — ADRs log individual decisions. DNA is a hierarchy with a root. ADRs say "we chose PostgreSQL because X." DNA says "structured queries on chronological ranges must return complete results, no omissions" — true for PostgreSQL, files, or a 10M-token context window.
"Harness Engineering covers this." — Harness Engineering (March 2026) is the best real-world validation of our RNA layer. OpenAI proved it works at million-line scale. But their golden principles have no recorded origin — they're enforcement without a root document. DNA is that root. We're not competing with Harness Engineering. We're completing it.
"This only works for domain-heavy projects." — Correct. CRUD apps don't need DNA. But any project where the domain expert knows something the programmer doesn't — medicine, geology, finance, law, science — benefits from separating that knowledge from implementation.
"What if the LLM ignores the DNA?" — Same as any contract: enforce it. RNA translates DNA into tests, CI checks, agent rules. The DNA Audit catches drift. It's not faith — it's verification.
"Breslav is wrong." — No. Breslav is right that the valuable part of programming is expressing intent, not writing code. He calls it "essential complexity." We agree completely — we just disagree on the container. A DSL ages with the technology. Natural language + a separation principle (DNA vs RNA) doesn't. Kotlin hit the right window: Java stagnating, Android rising. CodeSpeak is aiming at a window that's closing.
The code will be rewritten. The stack will change. The agent will improve. The DNA stays.
The code is dead. The decisions are alive. The question is whether you've written yours down.
This methodology was developed empirically during the construction of a scientific verification platform. The biological metaphor (DNA/RNA) emerged from analyzing CodeSpeak (Andrey Breslav), Harness Engineering (OpenAI), and Spec-Driven Development in the context of real development with AI agents.