Artificial intelligence has shifted from static prompt–response patterns to systems capable of taking structured actions. These systems are known as AI agents. Although the term is often stretched in marketing, the underlying architecture is practical and grounded in well-understood software principles.
_I took the 5-day AI Agents Intensive course with Google and Kaggle and promised myself I would document what I learned each day.
It has been a while since I finished the course; I kept putting off writing this article._
This article is the first in a five-part series in which I go through each day of the course and share what I learned.
This Day One post outlines the foundational concepts needed to build an AI agent and sets the stage for the subsequent posts, which will explore implementation details, tool integration, orchestration, governance, and evaluation.
What Is an AI Agent?
Technically, an AI agent is a software system that uses a language model, tools, and state management to complete a defined objective.
It operates through a controlled cycle of reasoning and action, instead of remaining a passive text generator.
A typical AI agent includes:
- A model for reasoning
- A set of tools for retrieving information or executing operations
- An orchestrator that manages the interaction between the model and those tools
- A deployment layer for running the system at scale
This structure turns a model from a text interface into an operational component that can support business processes or technical workflows.
The AI Agent Workflow: The Think–Act–Observe Cycle
All agent systems follow a predictable control loop.
This loop is essential because it governs correctness, safety, and resource usage.
1. Mission Acquisition
The system receives a task, either from a user request or an automated trigger.
Example: “Retrieve the status of order #12345.”
2. Context Assessment
The agent evaluates available information:
- Prior messages
- Stored state
- Tool definitions
- Policy rules
3. Reasoning Step
The model generates a plan.
Example:
- Identify the correct tool for order lookup
- Identify the tool for shipping data retrieval
- Determine response structure
4. Action Execution
The orchestrator calls the selected tool with validated parameters.
5. Observation and Iteration
The agent incorporates tool output back into its context, reassesses the task, and continues until completion or termination.
This controlled loop keeps the agent's behavior bounded and supports predictable outcomes in production systems.
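To make the cycle concrete, here is a deliberately tiny, framework-free sketch in Python. The "reasoning" is hard-coded so the example runs on its own; in a real agent, a language model would decide the next action and the tool would call a real system:

def lookup_order(order_id: str) -> str:
    """Toy tool: pretend to query an order system."""
    return f"Order {order_id} was shipped and is expected to arrive Friday."

def run_agent_loop(task: str, max_steps: int = 5) -> str:
    """A stripped-down think-act-observe loop with a hard step limit."""
    observations = []  # the agent's working context for this task
    for step in range(max_steps):
        # Context assessment + reasoning: decide the next action.
        # (Hard-coded here; a model would make this decision in practice.)
        if "order" in task.lower() and not observations:
            tool_name, arguments = "lookup_order", {"order_id": "12345"}
        else:
            # Enough information gathered: terminate and answer.
            return observations[-1] if observations else "Nothing to do."
        # Action execution: the orchestrator calls the selected tool.
        result = lookup_order(**arguments) if tool_name == "lookup_order" else ""
        # Observation: feed the result back into the context and iterate.
        observations.append(result)
    return "Stopped: step limit reached."

print(run_agent_loop("Retrieve the status of order #12345."))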
Core Architecture of an AI Agent System
1. Model Layer
The model performs all reasoning.
Selection depends on:
- Latency requirements
- Cost boundaries
- Task complexity
- Input/output formats
Multiple models may be used for routing, classification, or staging tasks.
However, initial implementations usually rely on a single model for simplicity.
2. Tool Layer
Tools provide operational capability.
A tool is a function with strict input/output schemas and clear documentation.
They fall into categories such as:
- Data retrieval (APIs, search functions, database operations)
- Data manipulation (formatting, filtering, transformation)
- Operational actions (ticket creation, notifications, calculations)
Effective tool design keeps actions narrow, predictable, and well-documented.
Tools form the “action surface” of the agent and determine how reliably the system can complete assigned objectives.
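As a concrete example, here is a sketch of a narrow data-retrieval tool: just a typed Python function with a descriptive docstring. The function name, arguments, and returned data are invented for illustration, and the body is a stub where a real implementation would query your order system:

def get_order_status(order_id: str) -> dict:
    """Look up the current status of a customer order.

    Args:
        order_id: The order identifier, for example "12345".

    Returns:
        A dictionary with the order's status and, if shipped, a tracking number.
    """
    # Stub data for illustration; a real tool would call an API or database here.
    return {"order_id": order_id, "status": "shipped", "tracking_number": "1Z999AA10123456784"}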
3. Orchestration Layer
This layer supervises the system. It is responsible for:
- Running the reasoning loop
- Applying system rules
- Tracking state
- Managing tool invocation
- Handling errors
- Regulating cost and step limits
It is also the layer where developers define the agent’s operational scope and boundaries.
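Most of these responsibilities come down to wrapping every tool invocation with checks before and after it runs. Below is a rough, framework-free illustration of two of them, step limits and error handling; the function and its names are mine, not part of any library:

def guarded_tool_call(tool, arguments: dict, step: int, max_steps: int = 10) -> str:
    """Sketch of orchestration-layer guardrails around a single tool call."""
    if step >= max_steps:
        # Regulate step limits: refuse to run past the configured budget.
        raise RuntimeError("Step budget exhausted; terminating the agent run.")
    try:
        return str(tool(**arguments))
    except Exception as exc:
        # Handle errors: return the failure as an observation so the model
        # can reassess instead of the whole run crashing.
        return f"Tool '{getattr(tool, '__name__', 'tool')}' failed: {exc}"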
4. Deployment Layer
An agent becomes useful only when deployed as a service.
A typical deployment includes:
- An API interface
- Logging and observability
- Access controls
- Storage for session data or long-term records
- Continuous integration workflows
This layer ensures the agent behaves as a reliable software component rather than a prototype.
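What this looks like depends entirely on your stack. As one rough sketch of the API-plus-logging piece, here is a minimal FastAPI service; FastAPI, the /ask route, and the run_agent placeholder are my own choices for illustration and are not prescribed by ADK:

import logging

from fastapi import FastAPI
from pydantic import BaseModel

logging.basicConfig(level=logging.INFO)  # basic observability
app = FastAPI()

class Query(BaseModel):
    question: str

async def run_agent(question: str) -> str:
    # Placeholder: call your agent runner here (for example the
    # InMemoryRunner built later in this article) and return its text.
    return f"(agent response to: {question})"

@app.post("/ask")
async def ask(query: Query) -> dict:
    logging.info("Incoming question: %s", query.question)
    answer = await run_agent(query.question)
    return {"answer": answer}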
Capability Levels in AI Agents
Understanding agent capability levels helps to set realistic expectations.
Level 0: Model-Only Systems
The model answers queries without tools or memory.
Suitable for text generation or explanation tasks.
Level 1: Tool-Connected Systems
The model uses a small set of tools to complete direct tasks.
Example: Querying external APIs for factual information.
Level 2: Multi-Step Systems
The agent performs planning and executes sequences of tool calls.
This level supports tasks that require intermediate decisions.
Level 3: Multi-Agent Systems
Two or more agents collaborate.
A coordinator routes tasks to specialized agents based on capability or domain.
Level 4: Self-Improving Systems
Agents that can create new tools or reconfigure workflows based on observed gaps.
Primarily research-grade today.
Building Your Practical First Agent
Developers do not need a complex system to get a simple agent running.
A small, well-defined project is enough to understand the architecture.
Keep in mind that I ran all of this code in a Kaggle notebook and used Google's Gemini for the project. The screenshots accompanying the code blocks are from my own run.
Step 1. Configure Your Gemini API Key
Every ADK project must expose your Gemini API key to the runtime. This block sets the key as an environment variable, which the ADK automatically detects.
import os
# Replace with your actual key or load it from your environment manager
os.environ["GOOGLE_API_KEY"] = "YOUR_API_KEY_HERE"
print("API key configured.")
Step 2. Import ADK Core Components
These are the foundational ADK modules we'll interact with: agent definitions, model bindings, runtimes, and built-in tools. This is the minimum import set required to stand up a functional agent.
from google.adk.agents import Agent
from google.adk.models.google_llm import Gemini
from google.adk.runners import InMemoryRunner
from google.adk.tools import google_search
from google.genai import types
Step 3. Optional: Retry Settings
LLM APIs occasionally return transient errors under heavy load. The retry configuration defines a standard exponential backoff strategy so your agent can recover automatically without failing user tasks.
retry_config = types.HttpRetryOptions(
    attempts=5,
    exp_base=7,
    initial_delay=1,
    http_status_codes=[429, 500, 503, 504],
)
Step 4. Define Your First Agent
This is the most important construct. An agent is defined by its behavior (instruction), identity (name/description), model, and available tools. The structure below is portable across any environment.
root_agent = Agent(
    name="helpful_assistant",
    description="A simple agent that can answer general questions.",
    model=Gemini(
        model="gemini-2.5-flash-lite",
        retry_options=retry_config,
    ),
    instruction="You are a helpful assistant. Use web search for current information.",
    tools=[google_search],
)
Step 5. Create a Runner
The Runner orchestrates conversations, tool calls, and message history. For prototyping, InMemoryRunner is the simplest option because it requires no infrastructure or persistent storage.
runner = InMemoryRunner(agent=root_agent)
Step 6. Run Your Agent
run_debug() executes a complete agent cycle—thought generation, tool selection, action execution, and final synthesis. This is the quickest way to validate that your agent is correctly wired.
response = await runner.run_debug(
    "What is Google's Agent Development Kit? What languages are supported?"
)
print(response.text)
Step 7. Try a Query That Requires Live Information
This example demonstrates that the agent will automatically invoke the Google Search tool when the prompt requires real-time information not contained in the model’s training data.
response = await runner.run_debug("What's the weather in London right now?")
print(response.text)
Step 8. Scaffold an ADK Project Folder (Optional)
ADK includes a CLI for generating full project scaffolds. This is useful when you're ready to move from experimentation into an actual multi-file agent application.
adk create sample-agent --model gemini-2.5-flash-lite --api_key $GOOGLE_API_KEY
Step 9. Launch the ADK Web UI (Optional)
The ADK Web UI is a local development interface for inspecting agent traces, debugging tool calls, and testing messages. Start it from any terminal—no Kaggle or notebook integration required.
adk web
After launching, the UI becomes available at:
http://localhost:8000
In the next articles in this series, I will cover:
- Designing reliable tool schemas
- Structuring agent instructions
- Using Model Context Protocol (MCP) in real applications
- Implementing human-in-the-loop workflows
- Tracking performance and diagnosing failures
- Hardening agents against incorrect tool usage
That's all for day 1! Can't wait to get back here for day 2!
Did you know that the 5-Day AI Agents Intensive course is now publicly available to learn from? Head on over here!
Let's connect:
LinkedIn




