Adam - The Developer ✨

Posted on Jun 22 • Edited on Jun 25

When Software Started Writing Software: A Developer’s History of AI

#ai #machinelearning #programming #productivity

Blending strict logic with pattern matching

If you've shipped software in the last three years, you've probably watched your job description quietly rewrite itself. You went from writing code, to writing code with an autocomplete, to writing code with a collaborator, to increasingly writing a spec and watching an agent write, test, and ship the code for you.

That didn't happen overnight. It's the latest chapter in a 70-year story that started with researchers trying to teach machines to play checkers. Let's walk through it, not as a dry timeline, but as the story of how "intelligence" kept getting redefined every time machines got good at the last definition.

1. The Symbolic Era: Intelligence as Logic (1950s–1980s)
2. Statistics Quietly Eats Symbols (1990s–2000s)
3. Deep Learning Breaks the Ceiling (2012–2017)
4. The Transformer and the Birth of "General-ish" Intelligence (2017–2022)
5. From Chatbot to Coworker: The Agentic Turn (2023–Today)
So What Changed, Really?

1. The Symbolic Era: Intelligence as Logic (1950s–1980s)

The founding bet of AI, made official at the 1956 Dartmouth Workshop, was simple and audacious: thought is computation. If you could represent knowledge as symbols and rules, and manipulate those symbols correctly, you'd get intelligence.

This gave us:

Logic Theorist (1956): proved mathematical theorems by searching through logical statements.
ELIZA (1966): pattern-matched your sentences back at you and convinced people it understood them. The first chatbot, and the first time humans projected understanding onto a system that had none.
Expert systems (1970s–80s): programs like MYCIN encoded a domain expert's rules as IF-THEN statements. MYCIN could diagnose bacterial infections about as well as a human specialist, using a few hundred hand-written rules.

This was weak, narrow intelligence in the most literal sense: a system that was a genius in one box and knew nothing outside it. The fatal flaw was scale: every rule had to be written by hand by a human expert. Knowledge didn't generalize, and it didn't learn from data. When funding agencies realized these systems couldn't handle the messiness of the real world, the money dried up. The field had already been through one AI winter in the 1970s; the collapse of the expert-system market in the late 1980s brought the second.
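To make the IF-THEN idea concrete, here is a minimal sketch of a forward-chaining rule engine. The rules and fact names are invented for illustration and are not MYCIN's actual rules; MYCIN also attached certainty factors to its conclusions, which this toy leaves out.

```python
# Toy forward-chaining rule engine.
# The rules and fact names below are hypothetical, made up for this example.

RULES = [
    # (facts that must all be known, conclusion to add)
    ({"gram_negative", "rod_shaped"}, "suspect_organism_a"),
    ({"suspect_organism_a", "hospital_acquired"}, "recommend_treatment_x"),
]


def infer(facts, rules=RULES):
    """Apply IF-THEN rules repeatedly until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its conditions are already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


if __name__ == "__main__":
    # Rules chain: the first conclusion enables the second rule.
    print(sorted(infer({"gram_negative", "rod_shaped", "hospital_acquired"})))
    # A case no rule anticipates produces nothing new.
    print(sorted(infer({"gram_positive"})))
```

The second call is the scaling problem in miniature: anything the rule author didn't foresee simply falls through, and the only fix is a human writing another rule.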

2. Statistics Quietly Eats Symbols (1990s–2000s)

While "AI" was a dirty word in grant applications, a different idea was gaining ground: instead of telling a machine the rules, show it examples and let it find the rules itself.

Neural networks had existed since the 1950s (the perceptron) but were widely written off after Minsky and Papert detailed their mathematical limits in Perceptrons (1969).
Backpropagation, popularized in 1986, gave multi-layer networks a real way to learn, and yet networks still mostly lost to simpler methods for another two decades. The algorithm's existence wasn't enough; there wasn't enough labeled data or compute to let it show what it could do. It sat half-revived, a promising idea nobody could afford to run at scale.
Support Vector Machines, decision trees, and Bayesian methods dominated practical machine learning: spam filters, recommendation engines, fraud detection.
IBM's Deep Blue beat Garry Kasparov in 1997, but largely through brute-force search over chess positions, not learning. Still narrow, still impressive.

This era's intelligence was statistical rather than logical: pattern recognition over labeled data. It worked well on narrow, well-defined tasks but needed mountains of hand-labeled examples and feature engineering done by humans. The "intelligence" was still mostly in the human designing the features; the model was just fitting a curve.
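Here's a small sketch of "show it examples and let it find the rules", using the classic single-layer perceptron mentioned above. The function names, learning rate, and epoch count are my own choices for illustration. It learns AND from four examples, and it also shows the kind of limit Minsky and Papert pointed at: no single-layer perceptron can get XOR fully right, because XOR isn't linearly separable.

```python
def predict(w, b, x1, x2):
    """Threshold unit: output 1 if the weighted sum is positive, else 0."""
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0


def train_perceptron(samples, epochs=50, lr=1.0):
    """Perceptron rule: adjust weights only when a sample is misclassified."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            delta = target - predict(w, b, x1, x2)
            if delta != 0:
                w[0] += lr * delta * x1
                w[1] += lr * delta * x2
                b += lr * delta
                errors += 1
        if errors == 0:  # every sample classified correctly, stop early
            break
    return w, b


def accuracy(samples, w, b):
    """Fraction of samples the trained unit classifies correctly."""
    correct = sum(predict(w, b, x1, x2) == t for (x1, x2), t in samples)
    return correct / len(samples)


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

if __name__ == "__main__":
    print("AND accuracy:", accuracy(AND, *train_perceptron(AND)))
    print("XOR accuracy:", accuracy(XOR, *train_perceptron(XOR)))
```

Nobody wrote the AND rule; the weights were fitted from examples. The catch is that the inputs were already tidy numbers a human chose, which is the feature-engineering dependence described above.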

3. Deep Learning Breaks the Ceiling (2012–2017)

The ingredients for the next leap had been sitting around for a decade: more data (the internet), more compute (GPUs originally built for video games), and algorithmic tricks for training deeper networks without them collapsing into noise. None of this looked inevitable at the time. Most of the field had moved on from neural networks; deep nets were a fringe bet kept alive by a small number of labs who kept getting told they were wasting their careers.

The spark was AlexNet (2012), a deep convolutional neural network that crushed the ImageNet image-classification competition, slashing the error rate compared to the next-best approach. That one result told the field something important: stack enough layers, feed them enough data, and the network finds its own features, no human feature engineering required. It wasn't a smooth continuation of the field's direction; it was closer to a coup. Within a couple of years, techniques that had been a punchline became the default starting point for almost every computer vision paper.

What followed was a five-year sprint:

Word2Vec (2013) / GloVe (2014): words became vectors, and "meaning" became something you could do arithmetic on.
Generative Adversarial Networks (2014): two networks competing, one generating fakes, one detecting them, together learning to produce eerily convincing images.
AlphaGo (2016): beat the world's best Go player (the board game, not go run main.go energy) using deep learning plus reinforcement learning plus tree search, on a game once thought too intuitive for machines.

Each of these was still task-bound: a vision model couldn't write a sentence, and a Go-playing model couldn't recognize a cat. But the skill inside each one was no longer handed to it by a human; the model discovered its own representation of the problem. That shift in mechanism, more than any single result, is what made the next jump possible.
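The "arithmetic on meaning" idea from the Word2Vec item is easy to sketch. The vectors below are hand-made 3-dimensional toys with axes I picked so the result is readable; real embeddings are learned from text, have hundreds of dimensions, and don't come with labeled axes.

```python
import math

# Hypothetical toy vectors, NOT real Word2Vec output.
# Invented axes: [royalty, maleness, femaleness]
VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.1, 0.1],
}


def cosine(a, b):
    """Cosine similarity: how closely two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors=VECTORS):
    """Return the word nearest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


if __name__ == "__main__":
    print(analogy("king", "man", "woman"))
```

With these toy numbers, king minus man plus woman lands closest to queen. The interesting part in the real systems is that nobody assigns the axes; they fall out of training.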

4. The Transformer and the Birth of "General-ish" Intelligence (2017–2022)

In 2017, a Google paper titled "Attention Is All You Need" introduced the transformer architecture. Instead of processing text sequentially like older recurrent networks, transformers let every word attend to every other word at once. It was a better way to model sequences and it turned out to scale beautifully.
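A rough sketch of what "every word attends to every other word" means in code. This is scaled dot-product self-attention in plain Python, with one big simplification that I want to flag: real transformers first pass each token through learned projection matrices to get separate queries, keys, and values, and use multiple heads. Here the token vectors are used directly for all three, and the input numbers are made up.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with Q = K = V = tokens.

    Learned projections and multiple heads are omitted (a simplification).
    """
    d = len(tokens[0])
    outputs = []
    for query in tokens:
        # Score this token against every token in the sequence, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in tokens
        ]
        weights = softmax(scores)
        # The new representation is a weighted mix of all token vectors.
        mixed = [
            sum(w * value[i] for w, value in zip(weights, tokens))
            for i in range(d)
        ]
        outputs.append(mixed)
    return outputs


if __name__ == "__main__":
    sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three made-up token vectors
    for row in self_attention(sequence):
        print([round(v, 3) for v in row])
```

Each output row depends only on the inputs, not on any other output row, so all rows can be computed in parallel. That independence is a large part of why the architecture scaled so well on GPUs compared with recurrent networks, which have to walk the sequence step by step.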

That architectural choice, combined with the realization that you could pretrain a single giant model on a huge slice of the internet and then adapt it to almost any language task, produced the GPT lineage:

GPT (2018) → GPT-2 (2019) → GPT-3 (2020): each generation showed that scaling up parameters and data kept producing qualitatively new abilities, not just marginal accuracy gains. GPT-3 could write code, translate, summarize, and hold a conversation, despite never being explicitly trained to do any of those things.
Instruction tuning and RLHF (2021–2022): raw language models predict the next token; they don't inherently want to be helpful. Techniques like reinforcement learning from human feedback turned raw next-token predictors into assistants that follow instructions and refuse harmful ones.
ChatGPT (November 2022): Sam Altman announced it on Twitter (now X) with the energy of a minor release: "today we launched ChatGPT. try talking with it here." Understated copy for the moment the old era ended and a new history began. Research left the lab; your non-technical relatives showed up. A hundred million users in two months. The Turing Test stopped being a thought experiment and became something people ran into accidentally over breakfast.

This is the point where AI stopped meaning "deep in one box, blank everywhere else" and started meaning compositional: a single set of weights that could combine skills it was never explicitly trained to combine. One model could write a sonnet, debug Python, and explain the sonnet's meter, not because it was three different systems, but because language turned out to be a surprisingly good universal interface to a huge range of human tasks.

5. From Chatbot to Coworker: The Agentic Turn (2023–Today)

A language model that just answers questions is a powerful autocomplete. The next phase of the story is about giving that model hands: the ability to call tools, write and execute code, browse the web, remember state across steps, and chain its own reasoning into multi-step plans.

A few threads converged here:

Tool use / function calling let models stop just describing actions and start taking them: querying a database, hitting an API, or running a calculation instead of guessing at the answer.
Retrieval-augmented generation (RAG) gave models access to information beyond their training data, grounding answers in real documents instead of frozen memory.
Chain-of-thought and reasoning models showed that letting a model "think out loud" before answering, and eventually training it specifically to reason longer on hard problems, produced dramatically better results on math, logic, and multi-step planning.
Agentic frameworks stitched these into loops: plan → act → observe → revise. Wrapped in orchestration code, retry logic, and well-designed tools, a model could chase a goal across many steps instead of answering once and stopping. Left alone long enough, it still drifts or takes wrong turns; the scaffolding exists to catch that. "Agent" describes a system, not a self-sufficient mind.
Multi-agent orchestration: models that spin up other model instances to parallelize work, each with a narrower role, then combine results. The specialist of the symbolic era is back, except now it's a transformer playing a role inside a swarm coordinated by another transformer, instead of a human-written rule.
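The plan → act → observe → revise loop described above can be sketched in a few dozen lines. Everything here is a stand-in: `scripted_model` is a hard-coded function playing the role of an LLM call, the `divide` tool and the history format are invented for the example, and no real framework's API is being shown.

```python
def divide(a, b):
    return a / b


# Hypothetical tool registry: names the "model" is allowed to call.
TOOLS = {"divide": divide}


def scripted_model(goal, history):
    """Stand-in for an LLM. A real system would send goal + history to a model."""
    if not history:
        # First attempt takes a wrong turn on purpose.
        return {"tool": "divide", "args": {"a": 10, "b": 0}}
    last = history[-1]
    if last["error"]:
        # Revise: the previous step failed, so try different arguments.
        return {"tool": "divide", "args": {"a": 10, "b": 2}}
    return {"final": last["result"]}


def run_agent(goal, model=scripted_model, tools=TOOLS, max_steps=5):
    """Loop until the model returns a final answer or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        action = model(goal, history)  # plan
        if "final" in action:
            return action["final"], history
        try:
            tool = tools[action["tool"]]  # unknown tool names raise KeyError
            result, error = tool(**action["args"]), None  # act
        except Exception as exc:
            # Failures become observations instead of crashing the loop.
            result, error = None, repr(exc)
        history.append({"action": action, "result": result, "error": error})  # observe
    raise RuntimeError("step budget exhausted without a final answer")


if __name__ == "__main__":
    answer, trace = run_agent("divide ten by two")
    print(answer)
    for step in trace:
        print(step)
```

Notice how little of this is the model. The step budget, the error capture, and the trace are ordinary deterministic code, and they are what keeps a wrong turn from becoming a runaway loop.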

This is the world a developer in 2026 actually lives in. Code isn't just suggested line-by-line; it's planned, written across multiple files, tested, and debugged in a loop that needs less hands-on steering than it used to, though it still needs real review, real guardrails, and real human judgment about when to trust the output. The same pattern shows up outside coding: research agents that browse, synthesize, and cite sources across dozens of pages; operations agents that read a ticket, check a calendar, and draft a response; design agents that take a brief and return a working prototype. None of this is autonomy in the strong sense yet. It's interactive capability (a model in a loop with tools and a feedback signal), and it's powerful precisely because of how tightly that loop is engineered, not in spite of it.

Easy to miss when you're staring at capability charts: every jump here was also an economics story. Expert systems died because expert time didn't scale. Statistical ML rode cheap labeled data and storage. Deep learning rode gaming GPUs. LLMs rode internet-scale data and transformer parallelism. Agents are having their moment because inference got cheap enough to run a model in a loop hundreds of times per task without laughing. The recurring question isn't "can we build it?" but "can we afford to run it enough times to be useful?" That bottleneck moved; it didn't disappear.

So What Changed, Really?

If you zoom out, the history of AI is a story about where the intelligence lives:

Era	Where the "smarts" lived
Symbolic AI	In rules a human expert wrote by hand
Statistical ML	In features a human engineer chose
Deep learning	In representations the model learned itself
Large language models	In patterns learned from most of the public internet
Agentic systems	In the model's own planning, tool use, and self-correction across time

Each era didn't replace the last so much as absorb it. Today's agents still do statistical pattern matching under the hood; they still occasionally fail in the brittle, overconfident ways the old expert systems did, just less often and less predictably.

If you compress the five eras above, there are really only three stages that mattered:

Hand-coded intelligence: rules a human wrote (symbolic AI).
Learned representations: patterns a model found in data, with steadily less human-chosen structure (statistical ML → deep learning → LLMs).
Interactive systems: models that act, observe consequences, and revise, instead of just outputting a single answer (agents).

The five-era version tells a better story; the three-stage compression is what actually changed underneath, and it leaves two shifts between stages.

Shift one: who writes the rules — human or data.
Shift two: passive answer vs. active loop. Most of "AI got so much better" traces to one of those.

This isn't a straight march toward AGI. It's researchers repeatedly asking "what if the machine decided this part?" and finding that worked, once the economics finally allowed it.

Whether the next chapter is "agents that reliably run entire businesses" or "another winter while the hype outpaces the engineering" is genuinely an open question, and depending on who you ask, both are already happening at once.

Top comments (12)

Ekong Ikpe • Jun 22

Nice read. An honest takeaway while deep thinking 🤔

The "right" intent should not be to replace deterministic logic, but to figure out how to orchestrate the two.
Use strict, unyielding code for the core engine (state, file management, calculations) and use the pattern-matcher purely as a fluid interface or data translator.

Adam - The Developer ✨ • Jun 22

pretty much where I land too. Use AI for interpretation and flexibility, use deterministic code for anything that absolutely must be correct.

Theo Valmis • Jun 22

The through-line in that history is that every leap automated a layer of mechanism and left the judgment layer exactly where it was. Compilers automated writing assembly, frameworks automated boilerplate, and each time the prediction was that the skill would vanish, while what actually happened is the skill moved up a level to deciding what to build and whether it's right. AI writing software is the same move at a larger scale: it automates the producing, not the deciding. The history is reassuring and a warning at once, because the people who thrived across each leap weren't the fastest typists, they were the ones who owned the judgment the automation couldn't. That part still hasn't been automated, and it's still where the value concentrates.

Adam - The Developer ✨ • Jun 23

That's a great way to put it. Every wave of automation seems to move the value up a layer rather than eliminate it entirely.

Mike Czerwinski • Jun 24

Both happening at once is what the layer split inside the history actually predicts. Theo Valmis's "producing vs deciding" cut below names half of it. The half worth adding: there is a third layer the history of AI has not automated either, which is auditing whether the decision was right after the fact. Producing automated, mostly. Deciding partly automated, mostly delegated. Auditing is still where consequence locality binds, and it is the part most agentic systems quietly skip because the demo only shows the produce-decide loop, not the layer that catches when the loop got it wrong.

That asymmetry is what makes the winter-or-transformation question read as both-at-once. The producing layer is transforming fast and cheaply. The deciding layer is moving but constrained by who actually pays when the decision is wrong. The auditing layer is mostly winter, because almost nobody is shipping verifier evaluations that test whether the verifier catches deliberate violations. Capability is scaling. Auditability is not, yet. The gap between "interactive capability" and the strong sense of autonomy the post carefully avoids claiming runs straight through that audit layer.

The line that stays with me on a second read: "it's powerful precisely because of how tightly that loop is engineered, not in spite of it." Same shape one floor up. Power lives in the loop discipline, and the missing discipline right now is the audit layer, not the agent layer.

Adam - The Developer ✨ • Jun 24

Fascinating perspective, Mike. The asymmetry between scaling capability and lagging "auditability" perfectly explains why it feels like we're in a winter and a golden age at the exact same time. The loop discipline is where the real engineering is.

Mike Czerwinski • Jun 24

"Loop discipline is where the real engineering is" is the line I want to keep. It compresses the whole capability-vs-auditability asymmetry into something a reader can actually carry. The piece I keep coming back to is that loop discipline is also the only part of the stack that does not get cheaper when the model gets better. Better models tighten the produce side. The loop still has to be engineered, planted-fault tested, and maintained by somebody who pays attention. Capability inflation leaves loop discipline exactly where it was.

Kushal Baral • Jun 22

I like how each stage gradually shifted more decision-making from people to machines. Now the bigger question isn't whether AI can do something, but what we should still be responsible for ourselves :)

Adam - The Developer ✨ • Jun 23

Agreed. Feels like we're slowly moving from capability questions to responsibility questions.

Mudassir Khan • Jun 25

the "moved but not disappeared" framing for the inference cost bottleneck is the part worth dwelling on. running a model in a loop 100x per task is now affordable — but the same cheapness that lets you ship the agent also lets you ship it before you ship the observability for it. RAG plus structured tool calls running 50 iterations used to cost enough to self select for teams who could justify the infra. now any team ships it, which means more teams find out on iteration 47 that something drifted.

at what point do orchestration frameworks bundle step level replay and drift detection by default, vs. every team building that bespoke?

Adam - The Developer ✨ • Jun 25

"Iteration 47" hits way too close to home. 💀 The fact that it’s now cheaper to run a loop 100x than it is to store and analyze the traces is the ultimate irony. Honestly, the first orchestration framework to bundle step-level replay by default is going to win a lot of developer hearts.