AI agents aren't magic. They're just a loop. This workshop breaks down how to build a real Todo Agent in 11 commits, teaching you the core pattern behind every AI agent system.
Everyone's talking about AI agents, but most explanations are either too abstract ("agents can reason and act!") or too complex (production systems with 50 dependencies).
This workshop takes a different approach: build one from scratch, one commit at a time. By the end, you'll understand the fundamental loop that powers everything from ChatGPT plugins to autonomous coding assistants.
GitHub Repository: https://github.com/sumitvairagar/simple-agent-workshop
The Journey: 11 Commits to Understanding
Step 1: Set Up the Project Skeleton
Every project starts somewhere. We begin with just a .gitignore and a README. Nothing runs yet, but we know what we're building: a Todo Assistant that can add tasks, list them, mark them done, and even prioritize using AI.
Key Learning: Start simple. Define the goal before writing code.
Step 2: Add Dependencies and TypeScript Config
We install the essentials:
- `openai`: the official SDK to talk to GPT
- `dotenv`: load API keys securely
- `tsx`: run TypeScript without a compile step
- `typescript`: type safety and autocomplete
Key Learning: Modern AI development needs surprisingly few dependencies.
Step 3: Say Hello to GPT
Our first message to GPT and back!
import OpenAI from "openai";
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }]
});
console.log(response.choices[0].message.content);
What to Notice:
- We send a `messages` array with a role and content
- GPT replies in `choices[0].message.content`
- `finish_reason` tells us WHY GPT stopped (`"stop"` means it's done)
Key Learning: The OpenAI API is just HTTP requests. No magic.
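To make that concrete, here is the same request built as the raw JSON payload the SDK ultimately POSTs. This is a sketch; the endpoint and header shown in the comment are the standard Chat Completions ones:

```typescript
// The SDK call above boils down to one JSON POST. Building the
// payload by hand shows there's nothing hidden in it.
const payload = {
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
};

const body = JSON.stringify(payload);

// You could send this yourself without the SDK:
//   fetch("https://api.openai.com/v1/chat/completions", {
//     method: "POST",
//     headers: {
//       "Content-Type": "application/json",
//       Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
//     },
//     body,
//   });
console.log(body);
```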
Step 4: Give GPT a Memory with Conversation History
The Problem: GPT has NO memory between calls. Every request starts fresh.
The Solution: Keep a messages array and send the full conversation every time.
The Golden Rule:
- Push the user message into history
- Call the API with the full history array
- Push GPT's reply into history too (never skip this!)
const messages = [];
messages.push({ role: "user", content: "My name is Alice" });
// ... call API ...
messages.push(response.choices[0].message);
messages.push({ role: "user", content: "What's my name?" });
// GPT correctly remembers: "Your name is Alice"
Key Learning: Conversation memory is just an array. You manage it, not GPT.
Step 5: Stream GPT's Reply Word by Word
Instead of waiting for the full response, tokens appear as GPT writes them. This makes everything feel alive and instant.
const stream = await openai.chat.completions.create({
model: "gpt-4o",
messages,
stream: true
});
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content || "";
process.stdout.write(content);
}
Key Learning: Streaming is the difference between "loading..." and feeling like you're talking to something intelligent.
Step 6: Give GPT Tools to Work With
Tools give GPT superpowers beyond just chatting. Each tool is a description that tells GPT:
- What it's called (`add_todo`, `list_todos`, `mark_done`)
- When to use it (the description)
- What inputs to pass
const tools = [
{
type: "function",
function: {
name: "add_todo",
description: "Add a new task to the todo list",
parameters: {
type: "object",
properties: {
task: { type: "string", description: "The task to add" }
}
}
}
}
];
Important: GPT does NOT run tools itself. It just tells us "please run add_todo with this input" and we do the actual work.
When GPT wants a tool, finish_reason changes to "tool_calls".
Key Learning: Tools are just JSON schemas. GPT reads them and decides when to use them.
Step 7: Actually Run the Tools GPT Asks For
Now we close the loop β when GPT says "call this tool", we do it!
The Flow:
- Find all `tool_calls` in the response
- Run the matching function (`add_todo`, `list_todos`, `mark_done`)
- Push each result back as a `"tool"` role message
- Include the `tool_call_id` so GPT knows which result matches which request
for (const toolCall of response.choices[0].message.tool_calls) {
  // The arguments arrive as a JSON string -- parse before use
  const args = JSON.parse(toolCall.function.arguments);
  const result = executeTool(toolCall.function.name, args);
  messages.push({
    role: "tool",
    tool_call_id: toolCall.id,
    content: JSON.stringify(result)
  });
}
Key Learning: Tool execution is just function calls. You write the functions, GPT decides when to call them.
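The dispatcher itself can be as simple as a switch over tool names. Here is one minimal way to write `executeTool` (the function shape and the in-memory store are illustrative, not copied from the workshop repo):

```typescript
// A minimal in-memory todo store and tool dispatcher.
// The case labels match the tool names GPT was given in the schemas;
// the "database" is just an array.
type Todo = { task: string; done: boolean };
const todos: Todo[] = [];

function executeTool(name: string, args: { task?: string; index?: number }) {
  switch (name) {
    case "add_todo":
      todos.push({ task: args.task ?? "", done: false });
      return { ok: true, count: todos.length };
    case "list_todos":
      return { todos };
    case "mark_done":
      if (args.index !== undefined && todos[args.index]) {
        todos[args.index].done = true;
        return { ok: true };
      }
      return { ok: false, error: "no such todo" };
    default:
      return { ok: false, error: `unknown tool: ${name}` };
  }
}

executeTool("add_todo", { task: "buy milk" });
executeTool("mark_done", { index: 0 });
console.log(todos); // [{ task: "buy milk", done: true }]
```

Whatever each case returns gets `JSON.stringify`-ed into the `"tool"` message, so plain objects are the easiest return type.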
Step 8: The Agent Loop — This Is Where the Magic Happens
This is THE commit. Everything before was setup. This is the agent.
The whole idea in plain English:
Keep asking GPT what to do next.
If it wants a tool, run it, hand GPT the result, and ask again.
If it says it's done β stop and show the answer.
while (true) {
const response = await callGPT(messages);
if (response.finish_reason === "tool_calls") {
// Run the tools and add results to messages
executeTools(response.tool_calls);
continue; // Ask GPT again
}
if (response.finish_reason === "stop") {
break; // GPT is done
}
}
That loop is what makes this an "agent" instead of a chatbot.
GPT decides:
- Which tools to call
- How many times
- When to stop
We just follow its lead.
Key Learning: The agent loop is the heart of every AI agent system. Master this pattern, and you understand 90% of AI agents.
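You can watch that control flow without spending a single API call by running the same loop against a scripted stand-in for GPT. Everything below is a simulation; only the shape of the responses mirrors the real API:

```typescript
// A scripted stand-in for GPT: first it asks for a tool,
// then it says it's done. The loop is the same as the real one --
// only callGPT is faked.
type FakeResponse = {
  finish_reason: "tool_calls" | "stop";
  tool_calls?: { name: string; args: string }[];
  content?: string;
};

const script: FakeResponse[] = [
  { finish_reason: "tool_calls", tool_calls: [{ name: "add_todo", args: '{"task":"buy milk"}' }] },
  { finish_reason: "stop", content: "Added 'buy milk' to your list!" },
];

let call = 0;
function callGPT(): FakeResponse {
  return script[call++];
}

const toolLog: string[] = [];
let finalAnswer = "";

while (true) {
  const response = callGPT();
  if (response.finish_reason === "tool_calls") {
    // "Run" each requested tool and record it, then ask again.
    for (const tc of response.tool_calls ?? []) toolLog.push(tc.name);
    continue;
  }
  finalAnswer = response.content ?? "";
  break; // finish_reason === "stop": GPT is done
}

console.log(toolLog, finalAnswer);
```

Swapping the scripted `callGPT` for a real API call is the entire difference between this sketch and the working agent.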
Step 9: Add a System Prompt to Guide the Agent's Behaviour
Without instructions, GPT will do something... but inconsistently.
The system prompt is like a job description β it tells GPT:
- What role it's playing (todo list assistant)
- The rules to follow (add, list, mark done)
- How to behave (brief and friendly)
messages.push({
role: "system",
content: `You are a helpful todo list assistant.
Rules:
- Use add_todo to add tasks
- Use list_todos to show all tasks
- Use mark_done to complete tasks
- Be brief and friendly`
});
Try This: Comment out the system prompt and run again. The difference is night and day.
Key Learning: System prompts are how you control agent behavior. They're more important than you think.
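One detail worth making explicit: the system message belongs at the front of the history, seeded once before any user turns, so every API call carries the instructions along. A tiny sketch (the `Msg` type is my own shorthand):

```typescript
// Seed the history with the system prompt BEFORE any user messages.
type Msg = { role: "system" | "user" | "assistant"; content: string };

const SYSTEM_PROMPT = `You are a helpful todo list assistant.
Rules:
- Use add_todo to add tasks
- Use list_todos to show all tasks
- Use mark_done to complete tasks
- Be brief and friendly`;

const messages: Msg[] = [{ role: "system", content: SYSTEM_PROMPT }];
messages.push({ role: "user", content: "add buy milk" });

// messages[0] stays the system prompt for the whole session.
console.log(messages[0].role);
```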
Step 10: Stream the Final Answer to the User in Real Time
After all the tools are done, we do one final call to GPT and stream the summary token by token to the terminal.
Why Two Separate Calls?
- During the loop we need the complete response to see tool requests
- Streaming and tool use can't be combined cleanly, so pick one
- So: tool use in the loop, streaming for the final pretty answer
Key Learning: Production systems often separate "thinking" (tool use) from "presentation" (streaming).
Step 11: Add a Prioritize Tool That Calls GPT Under the Hood
One GPT call using another GPT call as a tool.
The main agent manages the todo workflow. When the user asks to prioritize, it hands off the thinking to a smaller focused GPT call (gpt-4o-mini) that reorders tasks.
Main agent → calls prioritize → GPT sub-call → reordered list → back to main agent
Key Learning: This "agent inside an agent" pattern is how real production systems handle tasks that are too big or specialized for one model to do alone.
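One way to sketch the delegation: write `prioritize` so it takes an `ask` function. In production you pass a real gpt-4o-mini call; here a stub stands in for it. The signature and names are my own illustration, not the workshop repo's code:

```typescript
// The prioritize tool delegates the actual "thinking" to a model call.
// Injecting `ask` keeps the pattern visible without requiring the API:
// in production, ask() wraps openai.chat.completions.create with gpt-4o-mini.
async function prioritize(
  tasks: string[],
  ask: (prompt: string) => Promise<string>
): Promise<string[]> {
  const prompt = `Reorder these tasks by urgency, one per line:\n${tasks.join("\n")}`;
  const reply = await ask(prompt);
  // The sub-call returns plain text; parse it back into a list.
  return reply.split("\n").map((line) => line.trim()).filter(Boolean);
}

// Stubbed sub-call: pretend the model decided taxes come first.
const stub = async (_prompt: string) => "file taxes\nbuy milk\nwater plants";

prioritize(["buy milk", "file taxes", "water plants"], stub).then((ordered) =>
  console.log(ordered)
);
```

Because the sub-call is just a function, the main agent sees `prioritize` as an ordinary tool; it never knows another model was consulted.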
The Big Picture
After these 11 steps, you've built a real AI agent. Not a toy, not a demo, but a working system that:
- Maintains conversation memory
- Uses tools to interact with the world
- Makes decisions autonomously
- Streams responses for great UX
- Can even delegate to other AI calls
And the core pattern? It's just a loop:
1. User sends message
2. GPT decides which tool to use
3. Execute the tool
4. Send result back to GPT
5. GPT responds to user
That's it. That's the pattern behind ChatGPT plugins, GitHub Copilot, autonomous coding agents, and every other AI agent system.
Try It Yourself
The workshop is designed to be hands-on. Clone the repo and use the demo script to navigate through each commit:
git clone https://github.com/sumitvairagar/simple-agent-workshop.git
cd simple-agent-workshop
./simple-agent-workshop-demo.sh
Commands:
- `n`: next step
- `p`: previous step
- `l`: list all steps
- `g 5`: jump to step 5
- `q`: quit
At each step, read the code, run `npm start`, and see how it works.
Why Software Engineers Need to Understand This
AI agents aren't replacing software engineers. They're becoming a core tool in our toolkit.
Understanding how they work means:
- You can build AI features into your products
- You can debug when they fail (and they will)
- You can architect systems that use AI effectively
- You can evaluate which problems AI can actually solve
This isn't optional knowledge anymore. It's foundational.
Just like every engineer should understand HTTP, databases, and async programming β understanding the agent loop is becoming a core skill.
What's Next?
This workshop teaches the fundamentals. Real production systems add:
- Error handling and retries
- Rate limiting and cost controls
- Observability and logging
- Multi-agent orchestration
- Memory systems beyond conversation history
- Security and sandboxing
But they all build on this same core loop.
Master the basics first. Then level up.
Resources
- GitHub Repo: https://github.com/sumitvairagar/simple-agent-workshop
- OpenAI API Docs: https://platform.openai.com/docs
- TypeScript: https://www.typescriptlang.org
Built something cool with this? Share it in the comments! I'd love to see what you create.