Everyone can call an agent framework. Almost nobody can rebuild what's under it. That gap doesn't matter until something breaks: a tool call loops forever, the token bill explodes, and the framework's abstraction is suddenly the thing standing between you and the fix.
Here's the uncomfortable secret: a coding agent — the thing under Claude Code, Cursor, and every "AI engineer" product — is one loop. Ask the model what to do, run the tools it asks for, feed the results back, repeat until it says done. This post builds that loop in plain Node, zero dependencies, and shows the two traps almost every first implementation falls into.
The loop
Any model — real or mocked — is just a function of { messages, tools } that returns either { toolCalls: [...] } or { done: true, text }. Everything plugs into this:
export async function runAgent({ task, model, tools, maxSteps = 10, log = () => {} }) {
  // The transcript is the agent's only state; it grows by two entries per tool call.
  const messages = [{ role: 'user', content: task }];
  for (let step = 1; step <= maxSteps; step++) {
    const out = await model({ messages, tools });
    if (out.done) return { text: out.text, steps: step, messages };
    for (const call of out.toolCalls) {
      const tool = tools[call.name];
      let result;
      try {
        // await covers sync and async tools alike; String() on a bare
        // promise would produce "[object Promise]" instead of the result.
        result = tool
          ? String(await tool.run(call.args || {}))
          : `error: unknown tool "${call.name}"`;
      } catch (e) {
        // A thrown error becomes a result string the model can react to.
        result = `error: ${e.message}`;
      }
      log(`step ${step}: ${call.name} -> ${result.slice(0, 70)}`);
      messages.push({ role: 'assistant', toolCall: call });
      messages.push({ role: 'tool', name: call.name, content: result });
    }
  }
  return { text: '(max steps reached without finishing)', steps: maxSteps, messages };
}
Four decisions in there are load-bearing:
- The model is a parameter. A scripted mock and a live LLM satisfy the same contract, so the loop never changes — and your CI can run the agent with no API key.
- maxSteps caps the loop. An agent without a step budget is an infinite loop with a credit card.
- Tool failures are data, not crashes. The try/catch turns a thrown ENOENT into a result string the model sees, so it can react: fix the path, try another approach. MCP encodes the same idea as isError. Without this, one bad read kills the whole run.
- The transcript is the state. The model is stateless; the growing messages array is the agent's entire memory.
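To make the first and third decisions concrete, here is a minimal sketch of a scripted mock. It is an assumption about what such a mock could look like, not the one shipped in the repository: scriptedModel, its turns parameter, and the file names are hypothetical. Each turn is either a canned output or a function of the transcript, so the mock can react to an error string the same way a live model would.

```javascript
// Hypothetical helper: replays canned turns through the loop's model contract,
// ({ messages, tools }) -> { toolCalls: [...] } or { done: true, text }.
function scriptedModel(turns) {
  let next = 0;
  return async ({ messages }) => {
    // Running out of script ends the run instead of looping until maxSteps.
    if (next >= turns.length) {
      return { done: true, text: '(script exhausted)' };
    }
    const turn = turns[next++];
    // A function turn gets to inspect the transcript before answering.
    return typeof turn === 'function' ? turn(messages) : turn;
  };
}

// Illustrative script: try to read a file, and create it if the read failed.
const mock = scriptedModel([
  { toolCalls: [{ name: 'read_file', args: { path: 'missing.txt' } }] },
  (messages) => {
    const last = messages[messages.length - 1];
    const failed = Boolean(last) && String(last.content).startsWith('error:');
    return failed
      ? { toolCalls: [{ name: 'write_file', args: { path: 'missing.txt', content: 'hi' } }] }
      : { done: true, text: 'file already existed' };
  },
  { done: true, text: 'created missing.txt' },
]);
```

Passing mock as the model argument of runAgent exercises the whole loop, including the error path, with no network access and no API key.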
Trap #1: path.resolve is not a sandbox
Give the agent file tools and you'll probably write this:
const resolve = (p) => path.resolve(cwd, p); // "all paths stay in the workspace"... right?
Wrong — and you can prove it in one line: path.resolve(cwd, '../escape.txt') happily returns a path outside cwd, and so does any absolute path. Your "sandboxed" agent can write anywhere the process can.
Real containment is resolve, then verify:
const resolve = (p) => {
  const full = path.resolve(cwd, p);
  const rel = path.relative(cwd, full);
  if (rel.startsWith('..') || path.isAbsolute(rel)) {
    throw new Error(`path escapes the workspace: ${p}`);
  }
  return full;
};
path.relative answers "how do I get from the workspace to this path?" — if the answer starts with .., the path left the building. And notice what happens next: the throw lands in the loop's try/catch and comes back to the model as error: path escapes the workspace — containment and error-feedback working together instead of crashing the run.
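A quick way to convince yourself is to probe the check with a few inputs. The sketch below is a slightly stricter variant, not the post's exact code: makeResolver, probe, and the /tmp/workspace root are hypothetical, and the test for a leading .. is narrowed to whole path segments so that a legitimate file named ..hidden inside the workspace is not rejected. It still assumes no symlinks; a symlink inside the workspace that points outside it would pass this check, so a real tool may also want to compare real paths.

```javascript
import path from 'node:path';

// Hypothetical factory; `workspace` plays the role of the post's `cwd`.
function makeResolver(workspace) {
  const root = path.resolve(workspace);
  return (p) => {
    const full = path.resolve(root, p);
    const rel = path.relative(root, full);
    // Escapes if the relative path climbs out ('..' or '../...') or, on
    // Windows, lands on another drive (path.relative then returns an absolute path).
    const escapes =
      rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel);
    if (escapes) throw new Error(`path escapes the workspace: ${p}`);
    return full;
  };
}

// Reports the outcome as a string, the same shape the agent loop feeds back.
function probe(resolve, p) {
  try {
    resolve(p);
    return 'allowed';
  } catch (e) {
    return `error: ${e.message}`;
  }
}

const resolveInWorkspace = makeResolver('/tmp/workspace');
for (const p of ['notes/a.txt', '../escape.txt', '/etc/passwd', '..hidden']) {
  console.log(`${p} -> ${probe(resolveInWorkspace, p)}`);
}
```

The first and last inputs are allowed; the two in the middle come back as error strings rather than exceptions, which is the form the model would see.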
Trap #2: thinking the wire format is the hard part
Wiring a real model in is smaller than people expect. A provider adapter comes down to two translations: your transcript → the provider's message format, and the response → your loop's contract. Over raw HTTP against the Anthropic Messages API, the response side is just:
if (msg.stop_reason === 'tool_use') {
  return {
    toolCalls: msg.content
      .filter((b) => b.type === 'tool_use')
      .map((b) => ({ id: b.id, name: b.name, args: b.input })),
  };
}
return {
  done: true,
  text: msg.content
    .filter((b) => b.type === 'text')
    .map((b) => b.text)
    .join('\n'),
};
The only subtlety: the id on each tool_use block must come back on the matching tool_result — which is why the loop stores the whole call object in the transcript. The id rides along for free.
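The other direction, transcript to provider format, might look roughly like this. It is a sketch under assumptions rather than the repository's adapter: toProviderMessages and the sample transcript (including the id toolu_01) are hypothetical, and it relies on each tool entry directly following its assistant toolCall entry, which the loop above guarantees. The block types tool_use and tool_result and the tool_use_id field follow the Anthropic Messages API.

```javascript
// Hypothetical translator: loop transcript -> Anthropic-style messages.
function toProviderMessages(messages) {
  const out = [];
  let pendingId = null; // id of the tool call still waiting for its result
  for (const m of messages) {
    if (m.role === 'assistant' && m.toolCall) {
      const { id, name, args } = m.toolCall;
      pendingId = id;
      out.push({
        role: 'assistant',
        content: [{ type: 'tool_use', id, name, input: args || {} }],
      });
    } else if (m.role === 'tool') {
      // Tool results travel back in a user message, tagged with the call's id.
      out.push({
        role: 'user',
        content: [{ type: 'tool_result', tool_use_id: pendingId, content: m.content }],
      });
      pendingId = null;
    } else {
      out.push({ role: m.role, content: m.content });
    }
  }
  return out;
}

// Illustrative transcript in the shape runAgent builds.
const sample = [
  { role: 'user', content: 'create greet.txt' },
  { role: 'assistant', toolCall: { id: 'toolu_01', name: 'write_file', args: { path: 'greet.txt' } } },
  { role: 'tool', name: 'write_file', content: 'wrote greet.txt (15 bytes)' },
];
const wire = toProviderMessages(sample);
console.log(JSON.stringify(wire, null, 2));
```

Because the loop stored the whole call object, the id is available at translation time without any extra bookkeeping in the loop itself.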
Run it
The full version of everything above — loop, tools with real containment, a scripted mock model, and the raw-HTTP adapter — is runnable in ten seconds, no API key:
git clone https://github.com/TheSeydiCharyyev/build-your-own-agent
node build-your-own-agent/reference/06-coding-agent/example.mjs
step 1: write_file({"path":"greet.txt",...}) -> wrote greet.txt (15 bytes)
step 2: read_file({"path":"greet.txt"}) -> hello from byoa
step 3: done
final: Created greet.txt and verified its contents: "hello from byoa"
Same loop against a live model: node example.mjs --real with an ANTHROPIC_API_KEY. The loop file doesn't change — that separation is the whole point.
The loop is 1 of 10 components
The agent loop is just one piece of the stack. The full map — build-your-own-agent — indexes the single strongest from-scratch tutorial for each of the 10 components: the agent loop, tool calling, memory, RAG, MCP, coding agents, token accounting, evals, multi-agent, and guardrails. Where nothing good existed, it ships its own step-by-step tutorials, each building from an empty file to running code:
- Build a coding agent — the full version of this post: loop, tools, sandboxing, mock and live models
- Build an MCP server + client — the whole "protocol" is JSON-RPC and 3 methods
- Build token accounting — cost math and prompt-cache economics; the demo reproduces the ~80% cache saving from first principles
Every curated link is fetched and re-checked weekly by CI, every demo runs on every push, and no resource appears twice across the index. It's deliberately an index, not another course — vendor-neutral and framework-free.
If you know a stronger from-scratch resource for any section, the bar is high on purpose — bring it.