Ricardo M Santos for NotTheCode

Posted on • Originally published at notthecode.com

Autonomous AI Agents: The Shift from Coder to Intent Architect

Autonomous AI agents don’t make teams faster. They make teams weirder: every developer becomes a manager, whether they asked for it or not. The hard part isn’t getting an agent to write code—it’s learning how to direct, constrain, and verify a system that can generate plausible output at industrial scale.

Most “autonomous AI agents” content stops at agent loops, tools, memory, and frameworks. Useful, but incomplete. The failure mode I keep seeing isn’t “the agent couldn’t code.” It’s “the human didn’t lead.”

The new team structure (you are the lead)

An autonomous AI agent is usually described as an LLM in a loop: it takes a goal, plans steps, calls tools (git, search, DB, CI), observes results, and iterates until it decides it’s done.

That definition is technically correct and operationally misleading.

On a real project, an agent behaves less like a library and more like a junior teammate with unlimited energy and no fear of being wrong. It will:

  • move fast,
  • fill in missing requirements with confident guesses,
  • produce code that “looks right”,
  • and occasionally ship a bug that passes every test you thought mattered.

If you treat the agent like a power tool, you’ll optimize for output volume (“generate more code”). If you treat it like a teammate, you’ll optimize for direction and verification (“generate the right change, for the right reason”).

That’s the framing behind Code Reviewing the Robot (The Infinite Intern): treat the agent like an infinite intern—high throughput, low judgment, and in constant need of a clear brief and a hard review.

From executor to thinker

Autonomous agents change what software work is. The center of gravity moves from typing to reasoning.

The death of the syntax premium

For years, many developers built identity around “I know the right API” or “I can crank out working code quickly.” Agents are flattening that advantage. They can recall the API surface area faster than you can. They can produce a plausible implementation in any house style in seconds.

That doesn’t mean experience stopped mattering. It means experience moved upstream.

The teams doing well with agents aren’t the ones with the fanciest prompts. They’re the ones who can:

  • turn ambiguity into a concrete plan,
  • anticipate edge cases before code exists,
  • and notice when code is “correct” but misaligned with the real problem.

Introducing Intent Architecture

I call that skill Intent Architecture: the ability to express intent precisely enough that a fast, literal system can execute it without inventing requirements.

Intent Architecture is not “prompt engineering,” because prompts are often treated as copywriting tricks. It’s not “context engineering” either, because dumping context doesn’t automatically create clarity.

Intent Architecture is closer to what good tech leads already do:

  • define boundaries (“what we are not doing”),
  • name invariants (“what must always be true”),
  • specify acceptance checks,
  • and choose tradeoffs explicitly.

This is also where teams collide with the Context Gap: the difference between what you assume and what the agent can infer. The agent doesn’t share your product history, your scars, or the unwritten rules. If you don’t bridge that gap, it will bridge it for you—with fiction. (Related: The Context Gap.)

The Junior Paradox (the blind leading the robot)

Juniors are being asked to manage agents from day one.

That’s the paradox: a junior developer can now “produce” senior-looking output by directing an agent, but they may not have the judgment to evaluate it. They’re effectively supervising a tireless worker while still learning what “good” looks like.

I’ve watched this pattern play out in PR review:

  • The diff is huge because the agent happily refactored adjacent code “for consistency.”
  • The code is clean, typed, well-formatted, and full of thoughtful comments.
  • The logic is subtly wrong because it copied a pattern from a different subsystem.
  • Tests pass because the tests encode the same wrong assumption.

Nobody did anything malicious. The failure was managerial: the brief didn’t constrain the blast radius, and the review didn’t interrogate intent.

If this sounds harsh on juniors, it’s not meant to be. It’s an argument for changing what we teach and what we reward. In an agentic workflow, judgment becomes the first-class skill, not the capstone.

Managing the agentic workflow

You don’t “pair program” with an autonomous agent the way you pair with a human. You manage it. Two steps matter more than everything else: the Brief and the Review.

The brief (structuring the request)

A good brief is not long. It’s specific where it counts.

When I’m asking an agent to implement something non-trivial, I try to include:

  • Goal: the user-visible behavior in one or two sentences.
  • Non-goals: what not to change (especially refactors).
  • Constraints: performance, security, compatibility, migration rules.
  • Interfaces: where the change must live (files/modules/services).
  • Acceptance checks: the tests, scenarios, or invariants that must hold.

If you want a concrete artifact, write it like a small spec object the agent can’t “reinterpret.” For example, a TypeScript “intent contract”:

export type ChangeRequest = {
  goal: string;
  nonGoals: string[];
  constraints: string[];
  touchPoints: {
    mustChange: string[];
    mustNotChange: string[];
  };
  acceptance: {
    testsToAddOrUpdate: string[];
    scenarios: string[];
    invariants: string[];
  };
};


The point isn’t to build a framework. The point is to force clarity before code exists.

A brief like “Add caching to improve performance” is an invitation to creative writing. A brief like “Cache GetPricing(customerId) for 60 seconds; do not cache errors; must invalidate on PricingUpdated event; no changes outside PricingService and its tests” is management.
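That narrow caching brief maps directly onto the `ChangeRequest` shape from earlier. Here is what it might look like filled in (the file paths, event name, and test names are illustrative, not from a real repo):

```typescript
// Hypothetical instance of the ChangeRequest shape defined earlier;
// paths, event names, and thresholds are illustrative.
type ChangeRequest = {
  goal: string;
  nonGoals: string[];
  constraints: string[];
  touchPoints: { mustChange: string[]; mustNotChange: string[] };
  acceptance: {
    testsToAddOrUpdate: string[];
    scenarios: string[];
    invariants: string[];
  };
};

const cachePricingBrief: ChangeRequest = {
  goal: "Cache GetPricing(customerId) results for 60 seconds.",
  nonGoals: ["No refactoring outside PricingService", "No new caching library"],
  constraints: ["Do not cache errors", "Invalidate on PricingUpdated event"],
  touchPoints: {
    mustChange: ["src/pricing/PricingService.ts"],
    mustNotChange: ["src/billing/", "src/checkout/"],
  },
  acceptance: {
    testsToAddOrUpdate: ["tests/pricing/PricingService.cache.test.ts"],
    scenarios: [
      "Second call within 60s returns the cached price without a backend hit",
      "A failed lookup is never served from cache",
    ],
    invariants: ["Cached entries expire within 60 seconds"],
  },
};
```

Notice how the “do not cache errors” rule lives in `constraints` and again, as an observable behavior, in `scenarios`. Stating the same rule twice is deliberate: one copy constrains generation, the other copy gets verified.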

This is also where “autonomy” should be negotiated. If you give an agent a broad goal and a repo, it will produce broad changes. If you want narrow change, you must specify narrow change.

The review (quality control)

Agents are great at producing code that is syntactically valid and stylistically consistent. That’s exactly why review becomes harder: the code looks trustworthy.

This is where teams start paying the Silent Tax—the long-term cost of “fine for now” code that drifts from intent and quietly increases complexity. The tax doesn’t appear as a single outage; it shows up as slow merges, fragile behavior, and a growing fear of touching certain modules. (Related: The Silent Tax.)

A useful review pattern for agent-written changes is to review in this order:

  1. Intent alignment: Does the code match the brief’s goal and non-goals?
  2. Domain truth: Are the assumptions true in this system?
  3. Failure modes: What happens when dependencies fail, inputs are weird, or state is stale?
  4. Blast radius: Did the agent change anything it wasn’t asked to change?
  5. Observability: Will you know when it breaks?
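None of this needs tooling, but if you want the order to survive contact with a busy team, it can live as a tiny script that renders a PR-comment checklist. A sketch, not a framework; the wording mirrors the list above:

```typescript
// Sketch: render the five-step review order as a markdown task list
// that can be pasted into a PR template or comment.
const agentReviewOrder: ReadonlyArray<[string, string]> = [
  ["Intent alignment", "Does the code match the brief's goal and non-goals?"],
  ["Domain truth", "Are the assumptions true in this system?"],
  ["Failure modes", "What happens when dependencies fail, inputs are weird, or state is stale?"],
  ["Blast radius", "Did the agent change anything it wasn't asked to change?"],
  ["Observability", "Will we know when it breaks?"],
];

const checklist = agentReviewOrder
  .map(([name, question], i) => `- [ ] ${i + 1}. ${name}: ${question}`)
  .join("\n");

console.log(checklist);
```

The order matters: there is no point debating failure modes (step 3) on code that fails intent alignment (step 1).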

One practical trick: ask the agent to write a short “risk memo” before you merge.

Example prompt (kept intentionally plain):

List the top 5 ways this change could fail in production. For each, show where the code would misbehave and how we’d detect it (logs/metrics/tests).

You’re not asking for perfection. You’re forcing the agent to surface its own weak spots, and you’re giving the human reviewer a map.
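If you would rather get that memo in a machine-checkable shape than as free text, a minimal structure might look like this (the field names are my assumption, not any standard):

```typescript
// Hypothetical shape for the agent's pre-merge risk memo.
type Risk = {
  failure: string;   // how the change could fail in production
  location: string;  // where in the code it would misbehave
  detection: string; // the log, metric, or test that would surface it
};

type RiskMemo = {
  change: string; // which PR or change this memo covers
  risks: Risk[];  // the prompt above asks for the top 5
};

// Reject memos that don't say how a failure would be detected:
// "we'd notice eventually" is not a detection strategy.
function isActionable(memo: RiskMemo): boolean {
  return (
    memo.risks.length > 0 &&
    memo.risks.every((r) => r.detection.trim().length > 0)
  );
}
```

The check is crude on purpose. The value isn’t the validation; it’s that the agent must commit to a detection story per risk, which the human reviewer can then challenge.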

If you’re in .NET, another strong move is to make invariants executable. When the agent is changing logic-heavy paths, I’d rather see a small set of targeted tests than a long explanation.

[Fact]
public async Task Does_not_cache_failures()
{
    var sut = CreatePricingService(failOnce: true);

    await Assert.ThrowsAsync<HttpRequestException>(() => sut.GetPricing("cust-1"));
    // Second call should retry, not read a cached failure
    var price = await sut.GetPricing("cust-1");

    Assert.NotNull(price);
}


Agents can write tests. Humans need to decide which truths are worth pinning down.

The manager’s dilemma (what seniors should teach now)

If you’re a tech lead or engineering manager, the uncomfortable shift is this: teaching syntax, libraries, and patterns is no longer the highest-leverage thing you can do.

Those still matter, but agents made them easier to acquire and easier to fake.

The skills that don’t compress well are:

  • problem framing,
  • tradeoff selection,
  • and the ability to detect when an implementation is “plausible” but wrong.

That means mentoring changes shape. Code review becomes less about formatting and more about interrogating intent:

  • “What problem is this solving?”
  • “What did we decide not to solve?”
  • “Which assumption would make this approach dangerous?”
  • “How will we know we broke it?”

That’s not philosophical. It’s how you keep an agentic team from shipping confident nonsense at scale.

Survival skills for the agent era

If autonomous AI agents are going to be “team members,” developers need the skills that make someone a safe lead.

A short list I trust (because I’ve watched teams fail without it):

  • Decomposition under ambiguity: turning “it should work like X” into testable statements.
  • Boundary setting: explicitly stating what must not change.
  • Evaluation literacy: knowing how to verify correctness beyond “tests are green.”
  • Adversarial thinking: imagining how this breaks when reality disagrees.
  • Taste for small diffs: preferring incremental, observable change over sprawling rewrites.

The punchline is not “everyone must become an architect.” It’s narrower: everyone must become competent at Intent Architecture, because directing autonomous agents is a leadership task, even when your title isn’t.

If you adopt that framing, the agent stops being magic. It becomes what it always was: a fast worker that needs a competent lead.
