Leo Han

Posted on Jun 9


#agents #ai #architecture #softwareengineering

Building AI Agents the Right Way: Lessons from Anthropic Engineering

Why Following a Tutorial Will Make Your System Worse

The years 2025–2026 have seen an explosion in AI Agent adoption, yet most teams keep making the same mistakes. As the Anthropic Engineering team has learned through extensive real-world experience, blindly "following a tutorial to build an Agent" tends to produce a worse system, not a better one.

This guide distills the core principles and practical methods for building effective, reliable Agent systems.

Principle 1: Don't Use Agents Just to Use Agents


This is the most important principle of all. Agents are not a silver bullet. Many teams jump straight to handing every task to an Agent, only to discover that latency skyrockets, costs spiral out of control, and system stability falls below what their original deterministic logic achieved.

Agents are appropriate for:

Open-ended reasoning tasks (multi-step, non-deterministic paths)
Tasks requiring interaction with multiple external tools
Tasks where the objective is clear but the execution path needs dynamic adjustment

Agents are not appropriate for:

Simple CRUD operations — use a direct API call
Deterministic data processing — use a traditional pipeline
Problems solvable by a single model call — no need to wrap it in an Agent framework

In one sentence: if an API call can do it, don't bring in an Agent.
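
The checklist above can be written down as a small routing function. This is only a sketch: the `Task` fields and the strategy names are hypothetical labels chosen for illustration, and a real system would derive them from its own task metadata.

```python
from dataclasses import dataclass


@dataclass
class Task:
    description: str
    deterministic: bool        # the path to the result is known in advance
    tools_needed: int          # how many external tools the task must touch
    single_call_enough: bool   # one model call can produce the answer


def choose_strategy(task: Task) -> str:
    """Pick the simplest mechanism that can handle the task."""
    if task.deterministic:
        # CRUD and fixed data processing: no model needed at all
        return "direct_call_or_pipeline"
    if task.single_call_enough and task.tools_needed <= 1:
        # One prompt, one answer: no Agent framework needed
        return "single_model_call"
    # Open-ended, multi-step, multi-tool work is where an Agent earns its cost
    return "agent"


print(choose_strategy(Task("Update a user's email", True, 1, False)))
print(choose_strategy(Task("Summarize this paragraph", False, 0, True)))
print(choose_strategy(Task("Investigate a failing deploy", False, 3, False)))
```

The point of the ordering is that the Agent is the fallback, not the default: a task only reaches it after the cheaper options have been ruled out.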

Principle 2: Plan → Execute, Step by Step

An Agent is not as simple as "throw an instruction at the model and wait for the result." A robust Agent architecture follows a Plan → Execute two-phase model:

Plan phase: The Agent first thinks holistically and devises an execution plan. During this phase, it calls no external tools — it relies purely on reasoning to break the task into executable steps.

Execute phase: The Agent carries out the plan step by step, and each step has clear inputs, outputs, and validation conditions.

Step 1 → Step 2 → Step 3
  ↓         ↓        ↓
Observe   Observe   Final Result

This "plan first, execute later" pattern is far more stable than "think while doing," because the planning phase provides the Agent with a global perspective, preventing it from getting stuck in local optima or drifting off course.
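
Here is a minimal sketch of the two phases, assuming the plan is a list of steps that each carry their own validation condition. `make_plan` is a hypothetical stand-in for a model call that reasons without tools; here it returns a hard-coded plan so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Step:
    name: str
    run: Callable[[Any], Any]         # takes the previous output, returns a new one
    validate: Callable[[Any], bool]   # validation condition for this step's output


def make_plan(task: str) -> list[Step]:
    """Plan phase: no tools are called here, only the task is broken into steps."""
    return [
        Step("split", lambda text: text.split(), lambda out: len(out) > 0),
        Step("count", lambda words: len(words), lambda out: out > 0),
        Step("report", lambda n: f"{n} words", lambda out: out.endswith("words")),
    ]


def execute(plan: list[Step], initial_input: Any) -> tuple[Any, list[str]]:
    """Execute phase: run each step in order and record an observation per step."""
    observations: list[str] = []
    current = initial_input
    for step in plan:
        current = step.run(current)
        if not step.validate(current):
            raise ValueError(f"step {step.name!r} failed its validation condition")
        observations.append(f"{step.name}: {current!r}")
    return current, observations


plan = make_plan("count the words in the input")
result, observations = execute(plan, "plan first execute later")
print(result)  # 4 words
```

Because the whole plan exists before the first step runs, a failed validation points at one specific step instead of leaving the Agent to guess where it went off course.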

Principle 3: Memory Is the Soul of an Agent

The Agent's core challenge isn't "intelligence" — it's "memory." Memory operates across two dimensions:

Internal Memory: The Agent's context within a single execution — including the task description, completed steps, observations, and intermediate reasoning. This is essentially the LLM's context window. Managing the context well (not losing critical information, not retaining redundancy) is the foundation of Agent stability.

External Storage: Persistent memory that spans multiple executions. This includes:

Task history and result caching
Lessons learned from experience
User preferences and feedback

External storage enables the Agent to learn from history, avoid repeating mistakes, and gradually optimize its behavior over time.
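
Both kinds of memory can be sketched in a few lines. The class names, the character budget, and the JSON file layout below are assumptions made for illustration; a production system would more likely count tokens and use a real database.

```python
import json
import tempfile
from pathlib import Path


class InternalMemory:
    """Context for a single run: the task stays pinned, old entries are dropped first."""

    def __init__(self, task: str, max_chars: int):
        self.task = task
        self.max_chars = max_chars
        self.entries: list[str] = []

    def add(self, entry: str) -> None:
        if entry in self.entries:  # do not retain redundant entries
            return
        self.entries.append(entry)
        # Drop the oldest entries until the rendered context fits the budget
        while len(self.render()) > self.max_chars and len(self.entries) > 1:
            self.entries.pop(0)

    def render(self) -> str:
        return "\n".join([f"TASK: {self.task}", *self.entries])


class ExternalStore:
    """Persistent memory across runs, kept in a JSON file."""

    def __init__(self, path: Path):
        self.path = path

    def load(self) -> dict:
        if self.path.exists():
            return json.loads(self.path.read_text(encoding="utf-8"))
        return {"lessons": [], "preferences": {}}

    def add_lesson(self, lesson: str) -> None:
        data = self.load()
        if lesson not in data["lessons"]:
            data["lessons"].append(lesson)
        self.path.write_text(json.dumps(data, indent=2), encoding="utf-8")


memory = InternalMemory(task="summarize report", max_chars=60)
memory.add("step 1: fetched the report")
memory.add("step 2: extracted key figures")
print(memory.render())  # the task line survives, step 1 was dropped to fit

with tempfile.TemporaryDirectory() as tmp:
    store = ExternalStore(Path(tmp) / "agent_memory.json")
    store.add_lesson("validate dates before querying")
    print(store.load()["lessons"])
```

Dropping the oldest entry is the crudest possible policy; summarizing old entries instead of deleting them is a common refinement.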

Principle 4: Embrace Sub-Agents

Complex tasks should not be handled by a single monolithic Agent. The Sub-Agent pattern is standard equipment for production-grade Agent systems:

Master Agent (Orchestrator): responsible for task understanding, decomposition, and scheduling
Sub-Agents: each handles a specialized sub-task (search, code generation, data analysis, format conversion, etc.)

Benefits of this architecture:

Each sub-agent has more focused instructions, leading to fewer hallucinations
Multiple sub-tasks can run in parallel
Sub-agents are decoupled, making problems easier to isolate
Different models can be used for different sub-agents (expensive ones for reasoning, cheap ones for formatting)
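
A rough sketch of the orchestrator pattern, using Python's standard `concurrent.futures` to run sub-tasks in parallel. The three sub-agent functions and `decompose` are hypothetical placeholders: in a real system each would be a model call with its own focused prompt, possibly on a different model.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


# Hypothetical sub-agents; each one stands in for a specialized model call
def search_agent(query: str) -> str:
    return f"search results for {query!r}"


def code_agent(spec: str) -> str:
    return f"code implementing {spec!r}"


def analysis_agent(data: str) -> str:
    return f"analysis of {data!r}"


SUB_AGENTS: dict[str, Callable[[str], str]] = {
    "search": search_agent,
    "code": code_agent,
    "analysis": analysis_agent,
}


def decompose(task: str) -> list[tuple[str, str]]:
    """Stand-in for the orchestrator's decomposition step (normally a model call)."""
    return [("search", task), ("code", task), ("analysis", task)]


def orchestrate(task: str) -> dict[str, str]:
    """Fan sub-tasks out to sub-agents in parallel, then collect the results by name."""
    subtasks = decompose(task)
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        futures = {
            name: pool.submit(SUB_AGENTS[name], payload)
            for name, payload in subtasks
        }
        return {name: future.result() for name, future in futures.items()}


results = orchestrate("rate limiter")
print(results["search"])  # search results for 'rate limiter'
```

Since each sub-agent is just an entry in `SUB_AGENTS`, a failing one can be tested or swapped out without touching the others.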

Principle 5: Avoid Premature Correction

Errors during Agent execution are normal, but your error-handling strategy determines the final system quality.

The common mistake: the Agent takes one step, finds the result unsatisfactory, immediately self-corrects, revises the plan, and re-executes. This causes "oscillation" — the Agent ping-pongs between directions and never finishes.

Wrong pattern:
Original Bug → AI Auto-Fix → New Bug Introduced → Fix Again → ...

Right pattern:
Original Bug → Complete Full Execution First → Evaluate Holistically → One-Time Fix

The correct approach: let the Agent complete a full execution first, then make a one-time correction based on the overall result — rather than "fine-tuning" at every step. In other words, give the Agent room to make mistakes, but limit how often it can correct itself.
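
One way to enforce this is a hard correction budget around the full run. The sketch below is an illustration under assumptions, not a prescribed implementation: `execute_plan`, `evaluate`, and `revise_plan` are hypothetical callables, and the toy plan is a list of numbers that must sum to a target.

```python
from typing import Any, Callable


def run_with_correction_budget(
    plan: Any,
    execute_plan: Callable[[Any], Any],
    evaluate: Callable[[Any], bool],
    revise_plan: Callable[[Any, Any], Any],
    max_corrections: int = 1,
) -> tuple[Any, int]:
    """Run the whole plan, judge the overall result, and correct at most N times."""
    result = execute_plan(plan)  # finish the full execution before judging anything
    corrections = 0
    while not evaluate(result) and corrections < max_corrections:
        plan = revise_plan(plan, result)  # one holistic revision, not per-step tweaks
        result = execute_plan(plan)
        corrections += 1
    return result, corrections


# Toy example: the "plan" is a list of numbers whose sum should reach 10
TARGET = 10
result, used = run_with_correction_budget(
    plan=[1, 2, 3],
    execute_plan=sum,
    evaluate=lambda total: total == TARGET,
    revise_plan=lambda plan, total: plan + [TARGET - total],
)
print(result, used)  # 10 1
```

With `max_corrections=0` the function returns the first result as-is, which makes the budget easy to tune per task.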

Principle 6: Tool Chain Design

More tools don't make a better Agent. The key principles for tool chain design:

Atomicity: each tool does one thing, and does it well. Don't design "all-purpose" tools.
Standardized I/O: all tools use a unified input/output format to reduce the Agent's cognitive load.
Clear error returns: when a tool call fails, the return must precisely describe "what went wrong" and "possible correction paths" — not a vague "Error."
Least privilege: each sub-agent is granted only the minimum toolset needed to complete its task.
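
The four principles above fit in a short sketch. Everything named here (`ToolResult`, the two toy tools, the agent names in `ALLOWED`) is hypothetical and exists only to show the shape: one return type for every tool, errors that carry a hint, and a per-agent allowlist.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ToolResult:
    """Standardized output: every tool returns this same shape."""
    ok: bool
    data: Any = None
    error: Optional[str] = None   # what went wrong
    hint: Optional[str] = None    # a possible correction path


def read_number(text: str) -> ToolResult:
    """Atomic tool: parse one number, nothing else."""
    try:
        return ToolResult(ok=True, data=float(text))
    except ValueError:
        return ToolResult(
            ok=False,
            error=f"cannot parse {text!r} as a number",
            hint="pass a plain decimal string such as '3.5'",
        )


def square(value: float) -> ToolResult:
    """Atomic tool: square one number."""
    return ToolResult(ok=True, data=value * value)


TOOLS: dict[str, Callable[[Any], ToolResult]] = {
    "read_number": read_number,
    "square": square,
}

# Least privilege: each sub-agent sees only the tools it needs
ALLOWED: dict[str, set[str]] = {
    "parser_agent": {"read_number"},
    "math_agent": {"square"},
}


def call_tool(agent: str, tool: str, argument: Any) -> ToolResult:
    allowed = ALLOWED.get(agent, set())
    if tool not in allowed:
        return ToolResult(
            ok=False,
            error=f"{agent!r} is not allowed to call {tool!r}",
            hint=f"available tools: {sorted(allowed)}",
        )
    return TOOLS[tool](argument)


print(call_tool("parser_agent", "read_number", "3.5"))
print(call_tool("parser_agent", "read_number", "abc"))
print(call_tool("parser_agent", "square", 2.0))
```

Even the permission failure comes back as a `ToolResult`, so the Agent only ever has to handle one format.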

Principle 7: Start from Mature Prompts

Don't write an Agent's system prompt from scratch. Search for and draw on mature prompts that have been thoroughly tested in the community, then adapt them to your specific scenario.

A good Agent prompt typically includes:

Clear role definition and capability boundaries
Explicit tool-calling formats with examples
Error-handling strategies
Termination conditions and output format requirements

According to the source video, Anthropic Engineering's practice shows that prompt engineering matters far more in the Agent context than in ordinary chat scenarios: a well-structured prompt is reported to reduce failure rates from 40% to below 5%.
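
To make the four components listed above concrete, here is a small sketch that assembles them into one system prompt. The field names, section labels, and sample text are all made up for illustration; a real prompt would be adapted from one that has already been tested, as this principle recommends.

```python
from dataclasses import dataclass


@dataclass
class AgentPrompt:
    role: str             # role definition and capability boundaries
    tool_format: str      # tool-calling format, with an example
    error_handling: str   # what to do when a tool call fails
    termination: str      # when to stop and how to format the output

    def render(self) -> str:
        sections = [
            ("ROLE AND BOUNDARIES", self.role),
            ("TOOL CALLING", self.tool_format),
            ("ERROR HANDLING", self.error_handling),
            ("TERMINATION AND OUTPUT", self.termination),
        ]
        return "\n\n".join(f"[{title}]\n{body}" for title, body in sections)


prompt = AgentPrompt(
    role="You are a log-analysis assistant. You only read logs; you never modify systems.",
    tool_format='Call tools as JSON, for example: {"tool": "read_log", "path": "app.log"}',
    error_handling="If a tool returns an error, follow its hint once, then report the failure.",
    termination="Stop after the summary is written. Output plain text, at most 10 lines.",
)
print(prompt.render())
```

Keeping the sections as separate fields makes it easy to swap one of them during testing without rewriting the whole prompt.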

The Complete Agent System Pipeline

User Input
    ↓
Task Parsing & Classification (Orchestrator Agent)
    ↓
Plan Phase: Devise Execution Plan
    ↓
Distribute Sub-tasks to Sub-Agents
    ↓
Sub-Agent 1      Sub-Agent 2      Sub-Agent 3
(Search)         (Code)           (Analysis)
    ↓                ↓                ↓
Aggregate & Validate Results (Orchestrator Agent)
    ↓
(If Necessary) One-Time Correction
    ↓
Final Output

Conclusion

Building an AI Agent is not about buying a framework and tweaking a few parameters. A truly effective Agent system requires thoughtful design decisions at the architectural level. Remember these seven core principles: use Agents sparingly, plan before executing, manage internal and external memory, decompose complexity with sub-agents, avoid premature correction, design tool chains carefully, and start from mature prompts.

The most important lesson is this: the goal of an Agent is not to look "intelligent" — it's to complete work stably and predictably. If following a tutorial verbatim produces a working solution, that task likely never needed an Agent in the first place.

This article is adapted from the video "Building AI Agents — Follow the Tutorial and Your System Will Only Get Worse," drawing on practical experience from the Anthropic Engineering team and covering the seven core principles of Agent construction along with the complete pipeline design.
