AI Agents: Why Simple Chains Beat Complex Orchestration
I've built nine AI features into CitizenApp, and I keep seeing the same pattern: developers get seduced by "agentic" architectures when a straightforward chain of function calls would work better.
Let me be direct: most AI agent frameworks are over-engineered. They look impressive in demos, but they introduce latency, unpredictability, and debugging nightmares in production. I prefer explicit chains with clear control flow because I can reason about them at 3am when something breaks.
What People Mean by "AI Agents"
When folks say "agents," they usually mean one of two things:
- Autonomous decision-making loops – An LLM decides what tool to call, calls it, sees the result, decides the next step
- Function calling with retry logic – Structured tool use with error handling and fallback strategies
The first one sounds magical. It's also fragile.
Here's why: every decision loop adds latency and a chance for the model to hallucinate. If you're building a user-facing feature, you can't afford to have your AI agent decide to call the wrong endpoint three times before giving up.
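To put a rough number on that fragility, here's a back-of-the-envelope sketch. The 95% per-decision success rate is a made-up illustrative value, not a measurement, and treating each decision as independent is a simplifying assumption.

```python
def chain_success_probability(per_step_success: float, steps: int) -> float:
    """Probability that all `steps` decisions are correct, assuming each
    decision succeeds independently with the same probability."""
    return per_step_success ** steps


if __name__ == "__main__":
    # Hypothetical per-decision success rate, for illustration only.
    for steps in (1, 3, 5):
        p = chain_success_probability(0.95, steps)
        print(f"{steps} decision(s): all correct with probability {p:.2f}")
```

Under those assumptions, five sequential decisions all come out right only about 77% of the time, even though each one looks reliable on its own.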
The CitizenApp Approach: Explicit Chains
In CitizenApp, I use what I call "orchestrated chains" – the developer defines the flow, the AI fills in the details.
Here's a real example from our document classification feature:
    async function classifyAndExtractDocument(
      documentText: string,
      userId: string
    ): Promise<ClassificationResult> {
      const fields = ['documentType', 'issueDate', 'amount', 'parties'];

      // Step 1: Extract structured data
      const extracted = await extractWithClaude(documentText, { fields });

      // Step 2: Validate against the schema for the extracted document type
      let validated = validateSchema(extracted, extracted.documentType);

      // Step 3: If validation fails, ask Claude to correct (one retry only)
      if (!validated.success) {
        const corrected = await extractWithClaude(documentText, {
          fields,
          instructions: `Previous attempt failed validation: ${validated.errors.join(', ')}. Please re-extract with these constraints in mind.`,
        });
        // Re-validate so the retry passes the same checks as the first attempt
        validated = validateSchema(corrected, corrected.documentType);
        if (!validated.success) {
          throw new Error(`Extraction failed validation: ${validated.errors.join(', ')}`);
        }
      }

      // Step 4: Enrich with business logic
      return enrichDocumentData(validated.data, userId);
    }
Notice: no loops, no tool-calling framework, no "let the AI figure it out." The developer controls the flow. Claude does what it's good at: understanding text and extracting meaning.
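One practical payoff of developer-owned control flow is that the model call can be swapped for a stub in tests. Here's a minimal Python sketch of the same extract, validate, retry-once shape. The `Extractor` signature, `missing_fields` validator, and `make_fake_extractor` helper are all hypothetical stand-ins, not CitizenApp code.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# Hypothetical signature: (document text, optional correction hint) -> fields.
Extractor = Callable[[str, Optional[str]], Awaitable[dict]]

REQUIRED_FIELDS = ("documentType", "issueDate", "amount", "parties")


def missing_fields(extracted: dict) -> list:
    """Stand-in validator: list required fields that are absent or empty."""
    return [f for f in REQUIRED_FIELDS if extracted.get(f) in (None, "", [])]


async def classify_document(text: str, extract: Extractor) -> dict:
    """Extract, validate, retry at most once, then fail loudly."""
    extracted = await extract(text, None)
    errors = missing_fields(extracted)
    if errors:
        hint = f"Previous attempt was missing: {', '.join(errors)}."
        extracted = await extract(text, hint)
        errors = missing_fields(extracted)
        if errors:
            raise ValueError(f"Extraction failed validation: {errors}")
    return extracted


def make_fake_extractor(responses: list):
    """Build a stub that replays canned responses and records each hint."""
    calls = []

    async def fake(text: str, instructions: Optional[str]) -> dict:
        calls.append(instructions)
        return responses[len(calls) - 1]

    return fake, calls


if __name__ == "__main__":
    good = {"documentType": "invoice", "issueDate": "2024-01-05",
            "amount": 120.0, "parties": ["A", "B"]}
    # First canned response is incomplete, so the chain should retry once.
    fake, calls = make_fake_extractor([{"documentType": "invoice"}, good])
    result = asyncio.run(classify_document("...", fake))
    print(result["documentType"], "after", len(calls), "model call(s)")
```

Because the flow is fixed, the test can assert exactly how many model calls happened and what hint the retry received. That's much harder when the model decides the next step.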
When (Rarely) You Need Real Agents
I use actual agentic loops in exactly one place in CitizenApp: our research assistant. Here's why it works there:
- The user doesn't expect a response in under 500 ms
- The task is inherently exploratory (the AI discovers what it needs to know)
- Failure is recoverable (the assistant can try another search or re-frame the question)
    from anthropic import AsyncAnthropic

    # Assumed setup: `claude` is an async Anthropic client that reads
    # ANTHROPIC_API_KEY from the environment. RESEARCH_SYSTEM_PROMPT,
    # RESEARCH_TOOLS, execute_research_tool and extract_final_answer
    # are defined elsewhere in the app.
    claude = AsyncAnthropic()

    async def research_assistant(query: str, user_id: str, max_iterations: int = 5):
        """
        Actual agent loop. Used sparingly. Only when the problem
        is exploratory and latency isn't critical.
        """
        # Seed the conversation with the user's question
        conversation_history = [{"role": "user", "content": query}]

        for _ in range(max_iterations):
            # Get model's decision
            response = await claude.messages.create(
                model="claude-3-5-sonnet-20241022",
                max_tokens=1024,
                system=RESEARCH_SYSTEM_PROMPT,
                tools=RESEARCH_TOOLS,
                messages=conversation_history,
            )

            # Check if done: anything other than a tool request ends the loop
            if response.stop_reason != "tool_use":
                return extract_final_answer(response)

            # Process tool calls
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = await execute_research_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result,
                    })

            # Add to history and continue
            conversation_history.append({"role": "assistant", "content": response.content})
            conversation_history.append({"role": "user", "content": tool_results})

        return {"error": "Max iterations reached"}
This works because the research assistant runs asynchronously in the background: the user sees "researching..." and waits for the result. That wait would not be acceptable for a synchronous API response.
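The "start now, show a status, deliver later" shape can be sketched with plain asyncio. Everything here is an assumption for illustration: the in-memory `jobs` store, the `start_research` helper, and the `run_research` placeholder that stands in for the slow agent loop. A real app would likely use a persistent queue or database.

```python
import asyncio
import uuid

# In-memory job store: an assumption for this sketch only.
jobs = {}


async def run_research(query: str) -> str:
    """Placeholder for the slow agent loop."""
    await asyncio.sleep(0.05)
    return f"findings for {query!r}"


async def _run_job(job_id: str, query: str) -> None:
    """Run the research and record the outcome on the job record."""
    try:
        jobs[job_id]["result"] = await run_research(query)
        jobs[job_id]["status"] = "done"
    except Exception as exc:  # record failures instead of losing them
        jobs[job_id]["status"] = "failed"
        jobs[job_id]["error"] = str(exc)


def start_research(query: str) -> str:
    """Kick off research and return a job id immediately.

    Must be called while an event loop is running.
    """
    job_id = uuid.uuid4().hex
    jobs[job_id] = {"status": "researching", "result": None}
    # Keep a reference to the task so it is not garbage-collected mid-run.
    jobs[job_id]["task"] = asyncio.create_task(_run_job(job_id, query))
    return job_id


async def main() -> None:
    job_id = start_research("zoning rules")
    print(jobs[job_id]["status"])  # the caller can show this right away
    await jobs[job_id]["task"]     # a real client would poll instead
    print(jobs[job_id]["status"], jobs[job_id]["result"])


if __name__ == "__main__":
    asyncio.run(main())
```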
The Performance Cost
Here's what burned me: I started with LangChain's agent executor for a simpler use case. On paper, it looked elegant. In practice:
- Latency: Each loop iteration added 200-400 ms just for the API call round-trip
- Costs: Agentic loops meant more API calls. A task that could be done in one smart prompt became 3-4 model calls
- Debugging: When the agent did something unexpected (and it will), tracing why was like debugging a black box
I switched back to explicit chains. Same capabilities, 70% less latency, and a fraction of the cost.
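As a sanity check on those numbers, here's a rough estimate. It assumes latency scales linearly with the number of sequential model calls, which ignores differences in prompt size and generation time. The 300 ms per-call figure is just the midpoint of the range quoted above.

```python
def sequential_latency_ms(calls: int, per_call_ms: float) -> float:
    """Round-trip latency when model calls run one after another."""
    return calls * per_call_ms


def relative_reduction(before: float, after: float) -> float:
    """Fraction saved when going from `before` to `after`."""
    return 1 - after / before


if __name__ == "__main__":
    per_call_ms = 300  # midpoint of the 200-400 ms range
    single = sequential_latency_ms(1, per_call_ms)
    for agent_calls in (3, 4):
        looped = sequential_latency_ms(agent_calls, per_call_ms)
        saved = relative_reduction(looped, single)
        print(f"{agent_calls} calls -> 1 call: {looped:.0f} ms -> "
              f"{single:.0f} ms ({saved:.0%} less)")
```

Collapsing 3-4 sequential calls into one works out to a 67-75% reduction under that assumption, which is in the same range as the figure above.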
My Rules of Thumb
Use explicit chains when:
- Response latency matters (most user-facing features)
- The flow is somewhat predictable
- You're building a feature, not a research tool
- Costs are a concern (spoiler: they always are)
Use agent loops when:
- The problem is genuinely exploratory
- Users can tolerate a response time of more than 2-3 seconds
- The task naturally requires multiple decision points
- You have a good error budget
The Right Tool
For most SaaS features, Claude + structured outputs + explicit orchestration beats "agentic" frameworks every time.
    const result = await extractStructuredData(
      input,
      zodSchema(MyOutputShape)
    );
This is less "AI" (less autonomous), but more reliable. And in production, reliability beats magic.
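The core of the structured-output pattern is that application code, not the model, decides whether the output is acceptable. Here's a minimal Python sketch using only the standard library. The `Invoice` shape and `parse_invoice` helper are hypothetical, and a schema library such as zod would do the same job with less code.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    """Hypothetical output shape; swap in whatever your feature needs."""
    document_type: str
    amount: float


def parse_invoice(raw_model_output: str) -> Invoice:
    """Parse the model's JSON text and reject anything off-schema."""
    try:
        data = json.loads(raw_model_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model did not return valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")

    document_type = data.get("document_type")
    amount = data.get("amount")
    if not isinstance(document_type, str) or not document_type:
        raise ValueError("document_type must be a non-empty string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        raise ValueError("amount must be a number")
    return Invoice(document_type=document_type, amount=float(amount))


if __name__ == "__main__":
    print(parse_invoice('{"document_type": "invoice", "amount": 120}'))
```

A `ValueError` here is a normal, catchable failure that the surrounding chain can handle, for example by retrying once with the error message in the prompt.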
Gotcha: Tool Use Isn't the Same as Agents
I lumped these together early on. They're not the same thing.
Tool use = the model can call functions. Developers still control the loop.
Agents = the model decides what to do, including whether to use tools and when to stop.
Tool use is great. Tool use + explicit orchestration is my preferred pattern. Agents make me nervous. Your mileage may vary depending on your risk tolerance and latency requirements.
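Here's a sketch of what "tool use with developer-owned control flow" can look like: the model may request a tool, but the code allows at most one tool call and one follow-up, so the model never decides when to stop. The `Model` interface, the `lookup_postcode` tool, and the fake model are all hypothetical and exist only to make the example runnable.

```python
from typing import Callable

# Hypothetical model interface: takes a prompt and returns either
# {"tool": name, "input": value} or {"answer": text}.
Model = Callable[[str], dict]

# Hypothetical tool registry with a single toy tool.
TOOLS = {
    "lookup_postcode": lambda city: {"Springfield": "12345"}.get(city, "unknown"),
}


def answer_with_one_tool_call(question: str, model: Model) -> str:
    """At most one tool call, then exactly one follow-up model call."""
    first = model(question)
    if "answer" in first:
        return first["answer"]

    tool = TOOLS.get(first.get("tool", ""))
    if tool is None:
        raise ValueError(f"Model requested an unknown tool: {first.get('tool')!r}")
    tool_output = tool(first.get("input", ""))

    second = model(f"{question}\nTool result: {tool_output}")
    if "answer" not in second:
        raise RuntimeError("Model asked for another tool; this flow allows only one")
    return second["answer"]


def fake_model(prompt: str) -> dict:
    """Stub model: requests the tool first, then echoes the tool result."""
    if "Tool result: " in prompt:
        return {"answer": prompt.split("Tool result: ")[1]}
    return {"tool": "lookup_postcode", "input": "Springfield"}


if __name__ == "__main__":
    print(answer_with_one_tool_call("Postcode of Springfield?", fake_model))
```

An agent would replace the fixed two-step flow with a loop that keeps going until the model says it is done. That is exactly the part this sketch refuses to hand over.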
The unsexy truth: most winning AI features aren't "agents" at all. They're Claude doing one thing very well, wrapped in clear application logic.