
Lei Hua
The Man Who Summoned Ghosts: Andrej Karpathy in the AI Era | Prologue: I Met nanoGPT Before I Met Him


A thinker profile of Andrej Karpathy across the AI era.

Originally published on Lei Hua's Substack.

I met nanoGPT before I met Karpathy. Only later did I realize that what I had encountered was not merely a person, but a way of working.

I.

Like many people, my real understanding of deep neural networks did not begin with papers, and it did not begin in a classroom. It began with a handful of Karpathy repositories on GitHub.

micrograd: a Python file of fewer than 200 lines that lays bare the mechanism of backpropagation. It was the first time gradient descent stopped being an abstract term I had to trust, and became code I could read, step through, and modify.
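Backpropagation at that scale fits in a sketch. What follows is not micrograd itself, but a toy `Value` class in its spirit: each arithmetic operation records its inputs and a local derivative rule, and `backward()` walks the graph in reverse, applying the chain rule.

```python
# A minimal autograd sketch in the spirit of micrograd (not Karpathy's code):
# a scalar Value that records operations and backpropagates gradients.

class Value:
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._children = _children
        self._backward = lambda: None

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad           # d(a+b)/da = 1
            other.grad += out.grad          # d(a+b)/db = 1
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad   # d(a*b)/da = b
            other.grad += self.data * out.grad   # d(a*b)/db = a
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for c in v._children:
                    visit(c)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

# y = a * b + a  →  dy/da = b + 1 = 4, dy/db = a = 2
a, b = Value(2.0), Value(3.0)
y = a * b + a
y.backward()
print(a.grad, b.grad)  # 4.0 2.0
```

Gradient descent then becomes the obvious next line: nudge each `data` against its `grad`. That is the whole demystification: the "abstract term you had to trust" is thirty lines you can step through in a debugger.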

makemore: the same posture of doing the fullest thing in the fewest lines, starting with character-level language models and walking all the way toward transformers. It was the first time I understood that a language model was not a black box hidden inside a company's cloud. It was code that could run on a laptop.

nanoGPT: GPT-2 rewritten in roughly a thousand lines of Python, with training scripts and data preparation laid out in plain view. It was the first time I believed that understanding how ChatGPT-like systems are trained was something an ordinary engineer could do. No OpenAI badge. No PhD. Just a README that more or less says: clone it, then run it.

Only then did I start to see the person behind the repositories.

I watched him build GPT-2 from scratch in a single four-hour take on YouTube. I watched him explain LLMs in one hour for a broad audience. I watched him repeat the same ideas, in ten different registers, at Microsoft Build, Sequoia, YC, No Priors, and Dwarkesh. And I watched him become clearer each year, more restrained each year, more publicly willing to recalibrate himself each year.

Many people may have met Karpathy in the same order: first his code, then his judgment, and only at the end the person.

II. What This Series Is Not Trying to Do

I need to say this first.

This series is not a love letter. My respect for Karpathy should not turn the piece into flattery, because flattery is useless. A figure shaped by praise alone does not help the reader see themselves.

This series is also not an indictment. Even after the October 2025 Dwarkesh episode, where he was briefly flattened into the role of an AI bubble-popper, I do not want to dramatize the story as "look, he changed." The real situation is more complicated, and more worth understanding.

So what is this series trying to do?

It is trying to do something narrow but deep: to follow, across the years in which AI reshaped the world around him, the public record of one technical mind as its thoughts, judgments, emotions, and posture were pushed by facts into repeated recalibration.

The difficulty is not collecting the material. His material is public, rich, and unusually complete: YouTube, bearblog, X, Sequoia, YC, No Priors, Dwarkesh. Across those venues, he has left a clear trail of language. The difficulty is resisting the temptation to turn change into drama.

One temptation is to say he became pessimistic: from the 2022 romance of neural networks as another kind of natural intelligence, to the 2025 sharpness of "it's slop." That story is simple, useful, and wrong.

Another temptation is to say he predicted everything: LLM OS, System 1 / System 2 LLMs, the AlphaGo step two analogy, each later seemingly confirmed by o1, RLVR, DeepSeek R1, and the rise of agentic workflows. That story is wrong too.

Because he is not primarily predicting. He is doing something deeper and subtler.

III. He Is Not a Prophet. He Is a Translator.

If this entire series leaves only one judgment about Karpathy, it is this:

He is not mainly an inventor. He is a translator. He is not mainly a prophet. He is a namer.

An LLM is the kernel process of a new OS. The context window is RAM. Software 3.0 is programming in English. Inscrutable artifacts. Lossy zip files of internet knowledge. Cognitive interns. Ghosts versus animals. Jagged intelligence. March of nines. Vibe coding. Agentic engineering. AI psychosis.

Most of these are not new concepts in isolation. OS, RAM, compression, kernel processes: these come from earlier computing culture. System 1 and System 2 belong to Daniel Kahneman's cognitive science vocabulary. The bitter lesson belongs to Richard Sutton. March of nines comes from reliability engineering.

What Karpathy does is different. Again and again, he knows exactly what his audience already understands, and exactly which name is missing. Then he grafts that older vocabulary onto the new phenomenon of LLMs with unusual precision.

That is not simple. The hard part is not inventing a new concept. The hard part is finding the metaphor that lets the previous generation of programmers catch the new reality without flinching.

I learned deep learning from his GitHub repositories not because he used some secret teaching magic, but because he placed a new phenomenon inside tools an earlier generation already knew: Python, Jupyter, the command line, the debugger. He made it touchable, editable, and legible.

Sometimes education is exactly that: not inventing something new, but placing the new thing inside the grammar the student already speaks.

That translator posture has a special value in the AI era, because this is an era in which new phenomena grow faster than our language. Every month brings new models, new abilities, new failure modes, new workflows. The industry barely has time to name them before the outside world is already behind. In such a moment, the people patient enough to give new phenomena good names are doing something more important than branding. They are saving the era language-cost.

IV. Four Stable Cores, One World That Keeps Changing

If you only hear his 2025 line that AGI still feels a decade away, it is easy to read him as a pessimist.

But stretch the timeline from 2022 to 2026 and a different picture appears. He did not suddenly flip. The world changed, and the world forced different things into view.

What I want to preserve is not any single sharp sentence, but four stable cores that have barely moved:

  1. Minimalism, readability, and demystification of the training stack. From micrograd to nanoGPT to microGPT, each generation says: do the most complete thing in the fewest lines of code. This is technical taste, but it is also a moral posture. He does not want frontier models to look like magic.

  2. The dignity of education. Across his public work, this line only strengthens. Eureka Labs is not just a startup. It is the physical form of that lifelong thread.

  3. An allergy to hype. As early as State of GPT in 2023, he was saying "low-stakes + human-in-the-loop." The later "slop" and "march of nines" are the same caution spoken at higher volume.

  4. A preference for open and pluralistic ecosystems. In 2024, the image was a coral reef. In 2026, it becomes a tactical argument for building RL environments in verifiable domains the big labs have not claimed. The romance becomes strategy, but the anti-centralization undercurrent remains.

Those four things stay. What changes is reality: agents begin writing code for him, an education mission has to become a product, public speech acquires costs, and even the question of how he works is rewritten by the tools. By the end, he publicly describes himself as living in a kind of "AI psychosis." That is not theatrical. It is honest: the tools have begun to reshape the human mind.

V. The Question This Series Wants to Ask

The question is not whether Karpathy is right.

The question is:

When a person is forced by the era to keep rewriting their own judgment, what does it mean for them not to lose themselves?

This series follows that question chronologically: from the fall of 2022, when he had been out of Tesla for three months and sat down for Lex Fridman episode 333, to the spring of 2026, when he returned to the same Sequoia stage and said that he had never felt more behind as a programmer.

I place this prologue first because I want the conclusion on the table from the beginning:

May you recalibrate gracefully in this era.

Next, we begin in the fall of 2022. Karpathy has just left Tesla. He sits down in Lex Fridman's studio. That conversation is the zero-point of the next three years.

