DEV Community

Marin Pesa
Marin Pesa

Posted on

AI-Agent with Python and Gemini AI

CONTENT:


AI agents are everywhere in 2026 — OpenClaw, Claude Code, Cursor. But how do they actually work under the hood? I built one from scratch using Python and Google's Gemini API to find out.

What Is an AI Agent (Really)?

Strip away the hype: an AI agent is a loop.

while task_not_done:
    1. Send context to LLM
    2. LLM decides: respond OR call a tool
    3. If tool call → execute it, feed result back to LLM
    4. If response → done
Enter fullscreen mode Exit fullscreen mode

That's it. The magic is in function calling — the LLM doesn't just generate text, it generates structured tool calls that your code executes.

What I Built

An autonomous agent using the google-genai SDK that can:

  • Read files from the local filesystem
  • Write files with generated content
  • Run Python scripts and capture output
  • Chain multiple tools in sequence to solve complex tasks

Example interaction:

You: "Read data.csv, analyze it, and write a summary to report.txt"

Agent: [calls get_file_content("data.csv")]
Agent: [analyzes the data]
Agent: [calls write_file("report.txt", "Summary: ...")]
Agent: "Done. I've analyzed the CSV and written the summary to report.txt."
Enter fullscreen mode Exit fullscreen mode

The agent decides on its own which tools to use and in what order. No hardcoded workflow.

How Function Calling Works

You define tools as Python functions with clear descriptions:

def get_file_content(file_path: str) -> str:
    """Read and return the content of a file at the given path."""
    with open(file_path, "r") as f:
        return f.read()

def write_file(file_path: str, content: str) -> str:
    """Write content to a file at the given path."""
    with open(file_path, "w") as f:
        f.write(content)
    return f"Successfully wrote to {file_path}"

def run_python_file(file_path: str) -> str:
    """Execute a Python file and return its output."""
    result = subprocess.run(["python", file_path], capture_output=True, text=True)
    return result.stdout or result.stderr
Enter fullscreen mode Exit fullscreen mode

You pass these to the Gemini API as tool declarations. The model returns a function_call object instead of text when it wants to use a tool. Your code executes the function, sends the result back, and the loop continues.

Key Lessons

Absolute Paths Matter

My biggest debugging headache: the agent would pass raw user input as file paths instead of resolving them to absolute paths first. A resolve_path() utility function fixed hours of mysterious "file not found" errors.

The Agent Loop Is Everything

The core loop handles three states:

  1. Model wants to call a tool → execute it, feed result back
  2. Model generates text → return to user
  3. Model generates nothing → something went wrong, break

Getting this loop right is 80% of building an agent.

System Prompts Shape Behavior

The difference between a helpful agent and a chaotic one is the system prompt. Mine specifies:

  • Available tools and when to use them
  • How to handle errors
  • When to ask for clarification vs. just doing it
  • Output format expectations

The Boot.dev Module

This project was part of Boot.dev's "AI Agents with Python" module. The course structure:

  1. Start with raw API calls
  2. Add function declarations
  3. Build the agent loop
  4. Handle multi-step tool chains
  5. Add error handling and edge cases

Each step builds on the last. By the end you understand exactly how tools like Claude Code work at the fundamental level — they're just this loop, scaled up with more tools and better prompts.

Why This Matters

Understanding how agents work from the inside out changes how you think about AI tooling. You stop being a consumer of agents and start being a builder. My next project — AutoApply — uses this exact pattern at a larger scale: multiple agents orchestrated by Paperclip, each with their own tools, coordinating to automate job applications.

Try It

git clone https://github.com/marinpesa15/ai-agent.git
cd ai-agent
pip install google-genai
export GOOGLE_API_KEY=your-key
python main.py
Enter fullscreen mode Exit fullscreen mode

From sysadmin to AI developer — one project at a time. Building in public.


GitHub: github.com/marinpesa15/ai-agent

Top comments (0)