Cristian Tala
Karpathy Stopped Writing Code. So Did I. Here Is the Framework I Use to Direct AI Agents and Build Real Software

Andrej Karpathy, OpenAI co-founder, said this week that he hasn't written code since December 2025.

I read that and thought: I did exactly that today.

This week I built a complete WordPress plugin — LeanAutoLinks — using a team of 6 AI agents coordinated with Claude Code's Agent Teams. The plugin processes 16,000+ posts and generates internal links in 90 seconds. Every existing plugin I found in the market either took minutes, blocked the server, or simply didn't work at that scale.

I didn't write a single line of PHP.

What I did do was design roles, define metrics, set autonomy rules, and orchestrate the team. That's exactly what Karpathy describes when he says he went from programmer to AI director.

This post is the complete framework I used. Not theory. What actually worked.

The Shift Karpathy Is Describing (And Why It Matters)

Andrej Karpathy isn't an AI influencer. He's one of the most serious researchers in the field: OpenAI co-founder, former Head of AI at Tesla, creator of deep learning courses used in universities worldwide. When he talks, it's worth paying attention.

What he said this week was blunt:

"I used to do 80% of the coding manually and delegate 20% to AI. Now it's reversed: agents do 80% and I do 20%. And I still don't know exactly when that crossover happened."

Fortune, March 21, 2026

He used the phrase "state of psychosis" to describe how an individual now feels when they can build what previously required an entire team. Not because he's unstable — but because the scale of what's possible breaks previous intuitions about how much effort things require.

Economic Times reported that Karpathy now literally spends hours directing AI agents instead of writing code. Not a metaphor. His actual workflow.

And Forbes was more direct: junior developers are paying the price. When someone with Karpathy's background can orchestrate agent teams to produce quality code, what happens to the developer who just graduated from a bootcamp and knows how to build CRUDs in React?

But the most important point — the one Fortune emphasized — is this:

The bottleneck is no longer computation. It's the ability to direct agents.

This changes everything. The question is no longer "can you code?" It's "do you know what to ask agents for? Do you know how to design the team? Do you know when to give them autonomy and when to add constraints?"

Those are the skills that matter now.

My Real Experience: From 0 to Plugin in Production

The Problem

ecosistemastartup.com has over 16,000 published posts. It's a news site for Latin American entrepreneurs that runs automatically with n8n, WordPress, and an AI stack.

The problem: internal links across that volume of content were nonexistent or inconsistent. The auto-linking plugins I tested in the market did one of three things:

  • Processed 5-10 posts per request and blocked the server
  • Took hours to process the full site
  • Generated links so noisy they degraded the user experience

I needed something that:

  • Processed the complete inventory in seconds, not minutes
  • Didn't degrade server performance in production
  • Was smart about what to link (not spam, actual internal linking)
  • Had the ability to roll back if something went wrong

It didn't exist. I decided to build it.

The Decision: Claude Code Agent Teams

I had known about Agent Teams since Claude Code launched the feature. The idea is simple: instead of a single agent doing everything, you define a team with specialized roles. Each agent has specific context, responsibilities, and constraints. They can coordinate, pass information, and check in with each other.

I designed a team of 6 agents:

The 6 Agents

1. Strategist (Orchestrator)
The director. Defines overall architecture, coordinates flow between agents, makes high-level decisions. Has access to all team outputs. Doesn't write code directly.

2. Research Agent
Investigates existing plugins, analyzes competitor code, identifies what works and what fails. Documents findings before anyone writes a line of code. Its output is a technical brief that the Architect consumes.

3. SEO Agent
Defines internal linking rules: which keywords to link, which posts have priority, how to avoid over-optimization. Has veto power over any decision affecting the site's SEO profile.

4. Performance Agent (with veto power)
The most critical agent on the team. Its only obsession is performance: execution time, database queries, memory consumption. Has absolute veto power — if an implementation doesn't pass its benchmarks, the Architect must redo it.

5. Architect
Designs and implements the code. Works within the Performance Agent's constraints and the SEO Agent's rules. Builds in phases, not monolithically.

6. QA Agent
Testing. Defines test cases, runs validations, documents bugs. Doesn't approve any phase until it passes its criteria.
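The six roles above can be sketched as plain data. To be clear, this is not Claude Code's actual Agent Teams configuration format — it's an illustration of the design idea: make each role's scope, veto power, and right to write code explicit before any work starts.

```python
from dataclasses import dataclass

@dataclass
class AgentRole:
    """One specialized role on the team (illustrative, not Agent Teams syntax)."""
    name: str
    responsibility: str
    can_veto: bool = False      # only Performance and SEO hold veto power
    writes_code: bool = False   # only the Architect implements

TEAM = [
    AgentRole("Strategist", "architecture and coordination between agents"),
    AgentRole("Research", "analyze existing plugins, produce a technical brief"),
    AgentRole("SEO", "internal-linking rules, over-optimization guardrails", can_veto=True),
    AgentRole("Performance", "execution time, database queries, memory", can_veto=True),
    AgentRole("Architect", "design and implement the code", writes_code=True),
    AgentRole("QA", "test cases, validations, bug reports"),
]

veto_holders = [a.name for a in TEAM if a.can_veto]
```

Writing the team down like this forces the two decisions that matter most — who can block and who can build — before the first prompt is sent.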

The Autonomy Rules

This is what most differentiates an agent team that works from one that gets paralyzed or produces garbage:

RULE 1: Don't ask; propose and execute.

- Each agent can make decisions within its domain without approval
- If there's ambiguity, choose the most conservative option and document the reasoning

RULE 2: Strict phases. Nobody skips steps.

- Research → Design → Implementation → Performance Check → QA → Deploy
- If the Performance Agent vetoes in phase 4, go back to Implementation

RULE 3: Veto power is absolute for Performance and SEO.

- If Performance Agent says "this is slow", it gets redone. No exceptions.
- If SEO Agent says "this over-optimizes", it gets adjusted.

RULE 4: Metrics first, code second.

- The goal isn't "create an auto-linking plugin"
- The goal is "process 16,000 posts in less than 2 minutes without degrading performance"
- If the code meets the metric, we win.

RULE 5: Document everything.

- Every design decision has a comment explaining why
- Future agents (or me) need to understand the reasoning

The result: LeanAutoLinks — in production, processing 16,000 posts in 90 seconds.

The Complete Framework: How to Design an AI Agent Team

This is what I learned. You can apply it to any software project.

Step 1: Define the Problem with Metrics (Not Features)

The most common mistake: "I need a plugin that does automatic internal linking."

That's a feature, not a problem.

The real problem: "I have 16,000 posts without consistent internal links. Existing plugins take 4+ minutes and block the server. I need to process the complete inventory in less than 120 seconds with no impact on the production server."

See the difference? The second one has:

  • Scale of the problem (16,000 posts)
  • Success benchmark (less than 120 seconds)
  • Critical constraint (no production impact)

When agents have clear metrics, they can make autonomous decisions. If the metric is "make it work," each agent has a different interpretation of "working." If the metric is "90 seconds for 16,000 posts," everyone measures the same thing.
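"Everyone measures the same thing" can be made literal with a shared acceptance check. A minimal sketch, using this project's numbers (16,000 posts, 120-second budget) as the defaults:

```python
def meets_target(posts_processed: int, elapsed_seconds: float,
                 min_posts: int = 16_000, max_seconds: float = 120.0) -> bool:
    """Single success criterion every agent evaluates against:
    the full inventory processed inside the time budget."""
    return posts_processed >= min_posts and elapsed_seconds <= max_seconds
```

A vague goal like "make it work" can't be encoded this way — which is exactly the test for whether you've defined a metric or a wish.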

Step 2: Design Roles, Not Tasks

The difference between assigning tasks and assigning roles:

Tasks: "Agent 1: write the scraping function. Agent 2: write the link insertion function."

Roles: "Performance Agent: you're responsible for ensuring no function in the codebase degrades server response time. You have veto power over any implementation."

Tasks produce executor agents. Roles produce agents that think.

An agent with a Performance role has incentive to examine all the code, not just "its function." It asks questions no task would have prompted: "Are we using indexes on queries? What happens if this process runs while there's high traffic?"

Roles create ownership. And ownership produces better software.

Step 3: Set Explicit Autonomy Rules

Two extremes that kill agent team performance:

Too much autonomy: Agents go in different directions, produce inconsistent code, nobody has the global picture.

Too much control: Agents ask about everything. "Can I use this library? What indentation do you prefer? Confirm before continuing?" That's not an agent anymore — it's a chatbot with extra steps.

The middle ground: autonomy within explicit constraints.

Define what they can decide alone:

  • Choosing between two equivalent technical implementations → autonomous
  • Using an external library not mentioned → proposes and executes if it passes performance criteria
  • Changing the overall plugin architecture → requires check-in with Strategist

Define what requires veto:

  • Anything that touches database queries → Performance Agent approves
  • Anything that affects URLs or meta tags → SEO Agent approves

With these clear rules, agents move forward without getting paralyzed.
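The autonomy boundaries above amount to a routing table: given a kind of decision, who (if anyone) must sign off. A hedged sketch — the category names are mine, invented for illustration, not a general standard:

```python
from typing import Optional

# Decisions an agent may take alone, per RULE 1.
AUTONOMOUS = {"equivalent_implementation", "new_library_passing_perf_criteria"}

# Decisions that require a specific approver, per the veto rules.
VETO_MAP = {
    "database_query": "Performance Agent",
    "url_or_meta_change": "SEO Agent",
    "architecture_change": "Strategist",
}

def required_approval(decision: str) -> Optional[str]:
    """Return who must approve this decision, or None if the agent acts alone."""
    if decision in AUTONOMOUS:
        return None
    return VETO_MAP.get(decision)
```

The value of writing it out is that "ask or act?" stops being a judgment call the agent makes mid-task.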

Step 4: Research Before Code

90% of people using AI agents for programming skip this step. And it's the one that makes the most difference.

Before the Architect wrote a line of PHP, the Research Agent spent time analyzing:

  • The 5 most popular auto-linking plugins on WordPress.org
  • Their 1-star reviews (why do they fail?)
  • The source code of the top 2 (what queries do they use? how do they handle volume?)
  • Known limits of the WordPress API for this type of operation

That research produced critical findings:

  • Popular plugins use str_replace() on content post-query. With 16,000 posts, that's 16,000 PHP operations. Slow.
  • The alternative: use MySQL directly with a single UPDATE that does the regex in the database. Orders of magnitude faster.
  • The real bottleneck isn't PHP, it's the MySQL connection. The optimal batch size for this server is ~500 posts per transaction.

Without that research, the Architect would have started with the obvious implementation (PHP loop) and we'd have hit the same problems as existing plugins. The Research Agent identified the right path before writing any code.
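The "push the work into MySQL" finding can be illustrated with a sketch: instead of looping over 16,000 posts in PHP, build one batched UPDATE per ~500 posts that does the replacement in the database (MySQL 8's REGEXP_REPLACE). This is a simplified illustration, not the plugin's actual code — real code must use parameterized queries and word-boundary-safe, HTML-aware patterns:

```python
def batched_link_updates(post_ids, keyword, url, batch_size=500):
    """Yield one UPDATE statement per batch, so the regex replacement runs
    in MySQL rather than as 16,000 individual PHP string operations.
    Sketch only: no escaping or HTML awareness."""
    anchor = f'<a href="{url}">{keyword}</a>'
    for i in range(0, len(post_ids), batch_size):
        ids = ",".join(str(p) for p in post_ids[i:i + batch_size])
        yield (
            "UPDATE wp_posts "
            f"SET post_content = REGEXP_REPLACE(post_content, '\\\\b{keyword}\\\\b', '{anchor}') "
            f"WHERE ID IN ({ids})"
        )
```

The batch size of ~500 is the number the Research Agent found optimal for this server; on different hardware it's something to benchmark, not copy.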

Step 5: Power of Veto — Constraints That Produce Better Software

The Performance Agent had one simple rule: no implementation averaging more than 100ms per post passes to QA.

This produced something interesting: the Architect, knowing there was a veto, designed differently from the start. Instead of "make it work and optimize later," it designed for performance from the first line.

Veto power isn't bureaucracy. It's a design mechanism.

When agents know there are non-negotiable constraints, they internalize those constraints and build with them in mind. The quality of the final product improves because the quality standard is embedded in the team's structure, not applied as a review at the end.
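The Performance Agent's gate is simple enough to sketch directly: time a processing function over a sample of posts and veto if the average exceeds the 100 ms-per-post budget. A minimal illustration, assuming `process` is whatever function handles one post:

```python
import time

def benchmark_per_post(process, posts, budget_ms=100.0):
    """Return (avg_ms, passed). Failing the budget means veto:
    back to the Implementation phase, per Rule 2."""
    start = time.perf_counter()
    for post in posts:
        process(post)
    avg_ms = (time.perf_counter() - start) * 1000 / max(len(posts), 1)
    return avg_ms, avg_ms <= budget_ms
```

Because the gate is a number, not an opinion, the Architect can run it itself before ever submitting work for review — which is how the constraint gets internalized.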

Step 6: Phases, Not "Build Everything"

The most destructive pattern I see in people using AI agents: "Build me the complete plugin."

That produces one of two outcomes:

  • A monolith that sort of works but has critical problems in details
  • An agent that gets lost in scope and doesn't produce anything

The correct approach: phases with explicit checkpoints.

For LeanAutoLinks:

  1. Research phase (Research Agent output: technical brief)
  2. Design phase (Architect produces architecture doc, no code yet)
  3. Implementation phase (Architect builds in iterations)
  4. Performance phase (Performance Agent runs benchmarks, veto if necessary)
  5. QA phase (QA Agent validates all test cases)
  6. Deploy phase (carefully, with rollback option)

Each phase has a concrete deliverable. No phase starts until the previous one is approved. This sounds bureaucratic but it's actually faster — you catch problems when they're cheap to fix.
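The phase order, including the Performance veto sending work back to Implementation, can be modeled as a tiny state machine. An illustrative sketch (the `gates` dict simulates each phase's pass/fail outcome):

```python
PHASES = ["Research", "Design", "Implementation", "Performance", "QA", "Deploy"]

def run_pipeline(gates):
    """Walk phases in strict order; a failed Performance gate loops the flow
    back to Implementation (Rule 2). Returns the sequence of phases visited."""
    history, i = [], 0
    while i < len(PHASES):
        phase = PHASES[i]
        history.append(phase)
        if phase == "Performance" and not gates.get(phase, True):
            gates[phase] = True                 # assume the rework fixes it
            i = PHASES.index("Implementation")  # veto: redo, don't skip ahead
            continue
        i += 1
    return history
```

One failed benchmark costs a loop through two phases — cheap compared to discovering the same problem after deploy.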

What Karpathy Says About Junior Developers

I want to be direct about something Forbes pointed out.

If agents can do 80% of the code, what happens to developers who only know how to code?

The short answer: this is a real transition, and it's going to hurt some people.

But the longer answer is more interesting.

The people who will survive and thrive in this new environment aren't those who know the most syntax. They're the ones who:

  1. Understand problems deeply before jumping to solutions
  2. Know how to break down a complex problem into coordinated agent roles
  3. Can evaluate whether output is correct — not line by line, but architecturally
  4. Have domain expertise the agents lack — business context, user knowledge, what actually matters

This isn't the end of software development. It's a reorganization of what "software development" means.

The developer who knows how to direct an agent team of 6 with clear metrics, explicit constraints, and well-defined roles — that person can produce what previously required a complete team. That's not a threat to developers who adapt. It's a superpower.

What About Jobs?

The Forbes article focused on the economic impact on junior developers. It's a real concern.

But historically, every tool that multiplied developer productivity created more demand for software, not less.

The same will happen with code. As agents can build software faster and cheaper, we won't need less software. We'll build more. More tools, more automations, more integrations, more products.

The software market won't contract. It will expand. What will change is who can participate in it.

Another Example: An SEO Micro SaaS in One Day

And LeanAutoLinks wasn't the only project this week.

I also built an internal keyword rank tracking and gap analysis tool using the same agent approach. The problem was simple: I needed to monitor my keyword positions in Google and automatically detect content opportunities. Existing tools (Ahrefs, Semrush, Serpstat) cost $100-400/month and I barely use most of their features.

So I built a system that:

  • Syncs ranked keywords from my domains by connecting to SEO data APIs (~$0.01 per request vs $100+/month for traditional tools)
  • Detects content gaps automatically: queries where I have Google impressions but no content covering that topic
  • Creates glossary entries automatically when it detects terms people search for that we don't cover
  • Runs as a cron — every Monday it analyzes, detects opportunities, and generates the missing posts

The first run detected 742 queries with opportunities, created 4 glossary entries automatically, and identified a high-volume guide (19,000+ impressions) we didn't have covered.

Zero graphical interface. Zero fancy dashboard. Just scripts that do the work and an AI agent that orchestrates them.
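The core of the gap detector reduces to a filter: queries with real impressions but no content covering them. A simplified sketch — thresholds and the exact matching logic (the real system compares against post titles and slugs) are assumptions here:

```python
def find_content_gaps(search_queries, covered_topics, min_impressions=100):
    """Return queries with >= min_impressions and no covering content,
    highest-impression opportunities first."""
    covered = {t.lower() for t in covered_topics}
    return sorted(
        (q for q, imp in search_queries.items()
         if imp >= min_impressions and q.lower() not in covered),
        key=lambda q: -search_queries[q],
    )
```

Run weekly from a cron, the output of this filter is what feeds the glossary-entry and post-generation steps downstream.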

This is the point Karpathy describes: an individual's capacity to build custom tools has multiplied exponentially. Before you needed a team to build an SEO SaaS. Now you can build exactly what you need in a day, with agents executing while you define the problem.

Same logic as always: define the problem with metrics, design the agents, let them execute.

If you use n8n for automation, you can connect these agents with workflows that run periodically and deliver reports without manual intervention.

If you have a server to host your projects, Hostinger has VPS plans that run Docker perfectly for this type of deployment.

Build in Public with AI as a Content Strategy

What you're reading right now is a direct consequence of that.

I built LeanAutoLinks with AI agents. I documented the process. That process becomes content that shows how the framework works in practice. The content generates visibility. Visibility generates conversations and questions. Questions generate more projects where I can apply the framework.

It's a self-reinforcing loop.

If you're building with AI, document it. Not the code — the process, the decisions, the mistakes, the results. That has value that code alone doesn't.

Conclusion

Karpathy says the bottleneck is no longer writing code. It's directing agents.

I've spent a week doing exactly that, and the results are real: a plugin in production, processing 16,000 posts in 90 seconds, with no traditional development team behind it.

The framework I used isn't complicated:

  • Define the problem with metrics
  • Design roles, not tasks
  • Set explicit autonomy rules
  • Research before implementing
  • Include a Performance Agent with veto power
  • Respect the phase order

What makes the difference isn't the tool. It's the team design and the constraints you give it.

Next Steps

If you want to go deeper on this, I have two resources:

The full post on how I built LeanAutoLinks — with all the technical details, each agent's prompt code, and how I configured Agent Teams from scratch: I Built a WordPress Plugin with an AI Agent Team

My community — I'm publishing real-time projects I build with agents, the process, the mistakes, and the results. If you want to learn to do this yourself, it's where we work on it together: Join my community of founders at Cágala, Aprende, Repite

The future of software development isn't writing less code. It's knowing what to build, for what purpose, with what constraints — and letting agents execute it.

That was always the architect's job. Now anyone can do it.


📝 Originally published in Spanish at cristiantala.com
