This is the mental model I use when designing or reviewing any agent system—everything else is just implementation detail.
The term "AI Agent" is everywhere, but most explanations either oversimplify it ("it's an AI that does stuff") or overcomplicate it with academic jargon. Neither is useful if you want to build one.
This post will give you:
- A clear mental model of what an agent actually is.
- A detailed look at how the execution loop works internally.
- A copy-paste-ready agent you can run in 5 minutes using NodeLLM.
What is an AI Agent?
An AI Agent is not magic. It is a simple architecture pattern with three components:
- A Brain (the LLM): This is the reasoning engine. It decides what to do next based on the current context.
- Hands (Tools): These are functions that the LLM can choose to call. They let the agent interact with the real world—search the web, query a database, send an email. For a deeper dive into how this works, check out Tool Calling in LLMs.
- A Loop (the Orchestrator): This is the code that connects the brain to the hands. It sends the user's request to the LLM, executes any tool calls, feeds the results back, and repeats until the LLM decides it has a final answer.
The key insight: An LLM cannot actually do anything. It can only generate text. The "agent" behavior emerges from the loop that interprets the LLM's output and takes action on its behalf.
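In code terms the pattern is small enough to sketch with a couple of TypeScript types. The names below are purely illustrative and not tied to any particular SDK:

// "Hands": a description the LLM can see, plus the function your code actually runs
interface ToolDef {
  name: string;
  description: string;
  parameters: object; // JSON Schema describing the arguments
  execute(args: unknown): Promise<unknown>;
}

// "Brain": given the conversation so far and the available tools,
// it returns either plain text or a request to call some tools
type Brain = (messages: unknown[], tools: ToolDef[]) => Promise<
  | { kind: "text"; content: string }
  | { kind: "tool_calls"; calls: { name: string; args: unknown }[] }
>;

// The "Loop" (orchestrator) is the code you write around these two pieces.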
The Execution Loop (How It Works)
Let's trace a single request through an agent using a classic example:
User: "What's the weather in London and Paris?"
Turn 1:
- Your code sends the user message + a description of available tools to the LLM.
- The LLM responds with a special "tool_calls" output:

{
  "tool_calls": [
    { "id": "call_1", "function": { "name": "get_weather", "arguments": "{\"city\": \"London\"}" } },
    { "id": "call_2", "function": { "name": "get_weather", "arguments": "{\"city\": \"Paris\"}" } }
  ]
}

- Your code (the orchestrator) sees tool_calls, so it does not return to the user yet.
- Your code executes both functions (potentially in parallel).
- Your code appends the results to the conversation history as new messages with role tool.
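At this point the conversation history your code maintains looks roughly like this. The message shapes follow the OpenAI-style chat format, and the weather values match the example answer below; treat it as an illustration rather than exact SDK output:

const messages = [
  { role: "user", content: "What's the weather in London and Paris?" },
  // The assistant turn that requested the tools (the tool_calls shown above)
  { role: "assistant", content: null, tool_calls: [ /* call_1, call_2 */ ] },
  // One result message per call, linked back by tool_call_id
  { role: "tool", tool_call_id: "call_1", content: "{\"temp_c\": 12, \"condition\": \"rainy\"}" },
  { role: "tool", tool_call_id: "call_2", content: "{\"temp_c\": 18, \"condition\": \"sunny\"}" }
];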
Turn 2:
- Your code sends the updated history (including the tool results) back to the LLM.
- The LLM now has all the information it needs. It responds with a regular text message:

"The weather in London is 12°C and rainy. Paris is 18°C and sunny."

- Your code sees there are no tool_calls in this response, so the loop ends.
- The final text is returned to the user.
This "loop until no more tool calls" pattern is the core of every agent framework.
Why This Is Hard Without a Framework
If you try to build this loop yourself using the raw OpenAI or Anthropic SDKs, you'll quickly find yourself solving:
- Schema Boilerplate: You must manually write JSON Schema for every tool.
- Loop Management: You must implement the recursive call logic.
- Parallel Execution: Handling concurrent tool calls safely.
- Error Handling: What happens if a tool throws? Do you retry? Tell the LLM?
- Streaming: How do you handle tool calls that arrive during a stream?
- Runaway Loops: What if the LLM keeps calling tools forever?
NodeLLM handles all of this out of the box.
Practical Implementation: The System Inspector
The weather example is great for understanding the concept, but in production we want agents that interact with our actual infrastructure.
Here is a complete, working agent using NodeLLM. Instead of weather, we'll build a System Inspector with two tools: one to check machine resources (CPU/Memory) and one to inspect project files.
1. Install
npm install @node-llm/core
2. Define Your Tools
import { createLLM, Tool, z } from "@node-llm/core";
import os from "node:os";
import fs from "node:fs/promises";

// Tool 1: Get System Resources
class SystemInfoTool extends Tool {
  name = "get_system_info";
  description = "Returns CPU and Memory usage of the current machine.";
  schema = z.object({});

  async execute() {
    const freeMem = Math.round(os.freemem() / 1024 / 1024);
    const totalMem = Math.round(os.totalmem() / 1024 / 1024);
    return {
      os: os.type(),
      architecture: os.arch(),
      memory: `${freeMem}MB free / ${totalMem}MB total`,
      cpus: os.cpus().length
    };
  }
}

// Tool 2: Inspect Project Files
class FileInspectorTool extends Tool {
  name = "list_files";
  description = "Lists files in the current directory to understand project structure.";
  schema = z.object({
    dir: z.string().default(".").describe("Directory to list")
  });

  async execute({ dir }) {
    try {
      const files = await fs.readdir(dir);
      return { directory: dir, files: files.slice(0, 10) }; // Limit for brevity
    } catch (err) {
      return { error: `Could not read directory: ${err.message}` };
    }
  }
}
3. Run the Agent
async function main() {
  const llm = createLLM({ provider: "openai" });

  const chat = llm
    .chat("gpt-4o")
    .system("You are a system administrator agent. Help the user understand their environment.")
    .withTools([SystemInfoTool, FileInspectorTool]);

  // NodeLLM handles the entire agentic loop (Turn 1 -> Run Tools -> Turn 2)
  const response = await chat.ask(
    "How much memory do I have left, and what's in my current folder?"
  );

  console.log("Agent Response:", response.content);
}

main();
4. How to Run Locally
To run this example, you need an OpenAI API key. Save the code above as agent.ts and run:
export OPENAI_API_KEY='your_key_here'
npx tsx agent.ts
NodeLLM will automatically pick up the OPENAI_API_KEY from your environment.
5. What Happens When You Run This
- The LLM receives the request and sees the get_system_info and list_files tools.
- It realizes it needs both to answer the question, so it generates two tool calls.
- NodeLLM executes the os and fs calls on your behalf.
- The results are fed back to the LLM.
- The LLM synthesizes a final answer like: "You have 4096MB of free memory out of 16384MB. In your current folder, I found package.json, src, and README.md."
Zero boilerplate. Zero loop management. Just tools and a question.
Safety Features Built-In
NodeLLM includes guards to prevent runaway agents:
Loop Protection
// Max 5 tool execution turns by default (configurable)
const llm = createLLM({ maxToolCalls: 10 });
Human-in-the-Loop
chat
  .withToolExecution("confirm")
  .onConfirmToolCall(async (call) => {
    console.log(`Agent wants to call: ${call.function.name}`);
    return await askUserForApproval(); // true = execute, false = cancel
  });
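The askUserForApproval function is whatever approval flow fits your application. For a terminal agent, a minimal version using Node's built-in readline could look like this (an illustrative helper, not part of NodeLLM):

import readline from "node:readline/promises";
import { stdin, stdout } from "node:process";

// Ask the operator on stdin before a tool call is allowed to run
async function askUserForApproval(): Promise<boolean> {
  const rl = readline.createInterface({ input: stdin, output: stdout });
  const answer = await rl.question("Allow this tool call? (y/n) ");
  rl.close();
  return answer.trim().toLowerCase() === "y";
}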
Fatal Error Handling
import { Tool, ToolError, z } from "@node-llm/core";

class DangerousTool extends Tool {
  name = "dangerous_tool";
  description = "Example tool that refuses destructive actions.";
  schema = z.object({ action: z.string() });

  async execute({ action }) {
    if (action === "delete_everything") {
      // Refuse the action by throwing a ToolError marked as fatal
      throw new ToolError("Blocked dangerous action", this.name, { fatal: true });
    }
  }
}
When NOT to Use an Agent
Agents are powerful, but they are often overkill. You should stick to simple LLM calls or deterministic code if:
- The task is single-step: If you just need a summary or a translation, a plain chat.ask() is faster and cheaper (see the snippet after this list).
- No external tools are needed: If the LLM has all the info in its weights or context, don't wrap it in an execution loop.
- Determinism is required: If the logic is a fixed set of if/else statements that can be written in TypeScript, don't let an LLM "reason" its way through it. It's slower, more expensive, and prone to hallucination.
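For comparison, the single-step case needs none of the agent machinery. Using the same NodeLLM setup as the example above, it is just one call with no withTools and no loop (the prompt here is a placeholder):

const llm = createLLM({ provider: "openai" });

// One request, one response: no tools, no orchestration
const summary = await llm
  .chat("gpt-4o")
  .ask("Summarize this release note in two sentences: ...");

console.log(summary.content);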
Senior engineers know that the best code is the code you didn't have to write. Use agents for dynamic orchestration, not for static business logic.
Conclusion
An AI Agent is just a loop with side effects: ask the LLM, execute tools, repeat.
The complexity is in the details—schema generation, parallel execution, streaming, error handling, and safety. NodeLLM handles all of that, so you can focus on defining what your agent can do, not how to make it work.
npm install @node-llm/core
Start building agents today. Check out the Agentic Workflows Guide, browse the NodeLLM Documentation, or explore the Brand Perception Checker for a real-world example.