AI agents are everywhere in 2026 — OpenClaw, Claude Code, Cursor. But how do they actually work under the hood? I built one from scratch using Python and Google's Gemini API to find out.
What Is an AI Agent (Really)?
Strip away the hype: an AI agent is a loop.
while task_not_done:
    1. Send context to LLM
    2. LLM decides: respond OR call a tool
    3. If tool call → execute it, feed result back to LLM
    4. If response → done
That's it. The magic is in function calling — the LLM doesn't just generate text, it generates structured tool calls that your code executes.
What I Built
An autonomous agent using the google-genai SDK that can:
- Read files from the local filesystem
- Write files with generated content
- Run Python scripts and capture output
- Chain multiple tools in sequence to solve complex tasks
Example interaction:
You: "Read data.csv, analyze it, and write a summary to report.txt"
Agent: [calls get_file_content("data.csv")]
Agent: [analyzes the data]
Agent: [calls write_file("report.txt", "Summary: ...")]
Agent: "Done. I've analyzed the CSV and written the summary to report.txt."
The agent decides on its own which tools to use and in what order. No hardcoded workflow.
How Function Calling Works
You define tools as Python functions with clear descriptions:
import subprocess

def get_file_content(file_path: str) -> str:
    """Read and return the content of a file at the given path."""
    with open(file_path, "r") as f:
        return f.read()

def write_file(file_path: str, content: str) -> str:
    """Write content to a file at the given path."""
    with open(file_path, "w") as f:
        f.write(content)
    return f"Successfully wrote to {file_path}"

def run_python_file(file_path: str) -> str:
    """Execute a Python file and return its output."""
    result = subprocess.run(["python", file_path], capture_output=True, text=True)
    return result.stdout or result.stderr
You pass these to the Gemini API as tool declarations. The model returns a function_call object instead of text when it wants to use a tool. Your code executes the function, sends the result back, and the loop continues.
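The execution side — turning the model's function_call into an actual tool run — can be sketched independently of the SDK. The SimpleNamespace below just mimics the shape of the SDK's function_call object (a name plus an args mapping); the registry and helper names are illustrative, not the SDK's own:

```python
from types import SimpleNamespace
import os
import tempfile

def write_file(file_path: str, content: str) -> str:
    """Write content to a file at the given path."""
    with open(file_path, "w") as f:
        f.write(content)
    return f"Successfully wrote to {file_path}"

def get_file_content(file_path: str) -> str:
    """Read and return the content of a file at the given path."""
    with open(file_path, "r") as f:
        return f.read()

# Registry mapping tool names (as declared to the API) to Python callables.
TOOLS = {"write_file": write_file, "get_file_content": get_file_content}

def execute_tool_call(function_call) -> str:
    """Run the tool the model requested, with the arguments it supplied."""
    fn = TOOLS.get(function_call.name)
    if fn is None:
        return f"Unknown tool: {function_call.name}"
    return fn(**dict(function_call.args))

# Stand-in shaped like the SDK's function_call object (name + args).
path = os.path.join(tempfile.mkdtemp(), "note.txt")
call = SimpleNamespace(name="write_file", args={"file_path": path, "content": "hi"})
print(execute_tool_call(call))
```

The dictionary lookup is the whole trick: the model never executes anything itself, it only names a tool and supplies arguments, and your code decides whether and how to run it.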
Key Lessons
Absolute Paths Matter
My biggest debugging headache: the agent would pass raw user input as file paths instead of resolving them to absolute paths first. A resolve_path() utility function ended hours of mysterious "file not found" errors.
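The post doesn't reproduce resolve_path() itself; a minimal sketch of what such a utility might look like, assuming the agent resolves relative input against a fixed working directory:

```python
import os

# Assumption for this sketch: the agent treats its current working
# directory as the base for all relative paths.
WORKING_DIR = os.getcwd()

def resolve_path(file_path: str) -> str:
    """Turn raw user/model input into a normalized absolute path."""
    if os.path.isabs(file_path):
        return os.path.normpath(file_path)
    return os.path.normpath(os.path.join(WORKING_DIR, file_path))
```

Normalizing here also collapses "./data.csv" and "data.csv" into the same path, so tool results stay consistent no matter how the model phrases the argument.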
The Agent Loop Is Everything
The core loop handles three states:
- Model wants to call a tool → execute it, feed result back
- Model generates text → return to user
- Model generates nothing → something went wrong, break
Getting this loop right is 80% of building an agent.
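A minimal sketch of that three-state loop, with a stubbed model standing in for the real API call (fake_model and the message shapes are assumptions for illustration, not the actual SDK types):

```python
def fake_model(history):
    """Stand-in for the LLM call: requests one tool, then answers."""
    if not any(m["role"] == "tool" for m in history):
        return {"tool_call": {"name": "get_time", "args": {}}, "text": None}
    return {"tool_call": None, "text": "It is 12:00."}

TOOLS = {"get_time": lambda: "12:00"}

def run_agent(prompt, model=fake_model, max_steps=10):
    history = [{"role": "user", "content": prompt}]
    for _ in range(max_steps):
        reply = model(history)
        if reply["tool_call"]:
            # State 1: model wants a tool -> execute it, feed result back.
            call = reply["tool_call"]
            result = TOOLS[call["name"]](**call["args"])
            history.append({"role": "tool", "content": result})
        elif reply["text"]:
            # State 2: model produced text -> return it to the user.
            return reply["text"]
        else:
            # State 3: model produced nothing -> something went wrong, bail.
            break
    return "Agent stopped without a final answer."
```

The max_steps cap is worth keeping even in a toy version: without it, a model that keeps requesting tools can loop forever.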
System Prompts Shape Behavior
The difference between a helpful agent and a chaotic one is the system prompt. Mine specifies:
- Available tools and when to use them
- How to handle errors
- When to ask for clarification vs. just doing it
- Output format expectations
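The actual prompt isn't reproduced in the post; an illustrative sketch covering those four points might look like this (wording and rules are invented for the example):

```python
# Illustrative system prompt covering tools, errors, clarification, output.
SYSTEM_PROMPT = """You are a coding agent with three tools:
- get_file_content(file_path): read a file
- write_file(file_path, content): write a file
- run_python_file(file_path): execute a Python script

Rules:
- On a tool error, report the error message; retry at most once.
- If the request is ambiguous about which file to touch, ask first.
- Otherwise act without asking for confirmation.
- End every task with a short plain-text summary of what you did."""
```

Even a short prompt like this constrains behavior sharply: listing the tools by name keeps the model from hallucinating capabilities it doesn't have.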
The Boot.dev Module
This project was part of Boot.dev's "AI Agents with Python" module. The course structure:
- Start with raw API calls
- Add function declarations
- Build the agent loop
- Handle multi-step tool chains
- Add error handling and edge cases
Each step builds on the last. By the end you understand exactly how tools like Claude Code work at the fundamental level — they're just this loop, scaled up with more tools and better prompts.
Why This Matters
Understanding how agents work from the inside out changes how you think about AI tooling. You stop being a consumer of agents and start being a builder. My next project — AutoApply — uses this exact pattern at a larger scale: multiple agents orchestrated by Paperclip, each with their own tools, coordinating to automate job applications.
Try It
git clone https://github.com/marinpesa15/ai-agent.git
cd ai-agent
pip install google-genai
export GOOGLE_API_KEY=your-key
python main.py
From sysadmin to AI developer — one project at a time. Building in public.
GitHub: github.com/marinpesa15/ai-agent