Artificial intelligence has shifted from static prompt–response patterns to systems capable of taking structured actions. These systems are known as AI agents. Although the term is often stretched in marketing, the underlying architecture is practical and grounded in well-understood software principles.
_I took the 5-day AI Agents Intensive course with Google and Kaggle and promised myself I would document what I learned each day.
It has been a while since I finished the course; I kept putting off writing this article._
This article is the first in a five-part series in which I go through each day of the course and share what I learned.
This Day One post outlines the foundational concepts needed to build an AI agent and sets the stage for the subsequent posts, which will explore implementation details, tool integration, orchestration, governance, and evaluation.
What Is an AI Agent?
Technically, an AI agent is a software system that uses a language model, tools, and state management to complete a defined objective.
It operates through a controlled cycle of reasoning and action, instead of remaining a passive text generator.
A typical AI agent includes:
- A model for reasoning
- A set of tools for retrieving information or executing operations
- An orchestrator that manages the interaction between the model and those tools
- A deployment layer for running the system at scale
This structure turns a model from a text interface into an operational component that can support business processes or technical workflows.
The AI Agent Workflow: The Think–Act–Observe Cycle
All agent systems follow a predictable control loop.
This loop is essential because it governs correctness, safety, and resource usage.
1. Mission Acquisition
The system receives a task, either from a user request or an automated trigger.
Example: “Retrieve the status of order #12345.”
2. Context Assessment
The agent evaluates available information:
- Prior messages
- Stored state
- Tool definitions
- Policy rules
3. Reasoning Step
The model generates a plan.
Example:
- Identify the correct tool for order lookup
- Identify the tool for shipping data retrieval
- Determine response structure
4. Action Execution
The orchestrator calls the selected tool with validated parameters.
5. Observation and Iteration
The agent incorporates tool output back into its context, reassesses the task, and continues until completion or termination.
This controlled loop keeps the agent's behavior bounded and supports predictable outcomes in production systems.
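To make the cycle concrete, here is a deliberately tiny, framework-free sketch in Python. The "reasoning" is hard-coded so the example runs on its own; in a real agent, a language model would decide the next action and the tool would call a real system:

def lookup_order(order_id: str) -> str:
    """Toy tool: pretend to query an order system."""
    return f"Order {order_id} was shipped and is expected to arrive Friday."

def run_agent_loop(task: str, max_steps: int = 5) -> str:
    """A stripped-down think-act-observe loop with a hard step limit."""
    observations = []  # the agent's working context for this task
    for step in range(max_steps):
        # Context assessment + reasoning: decide the next action.
        # (Hard-coded here; a model would make this decision in practice.)
        if "order" in task.lower() and not observations:
            tool_name, arguments = "lookup_order", {"order_id": "12345"}
        else:
            # Enough information gathered: terminate and answer.
            return observations[-1] if observations else "Nothing to do."
        # Action execution: the orchestrator calls the selected tool.
        result = lookup_order(**arguments) if tool_name == "lookup_order" else ""
        # Observation: feed the result back into the context and iterate.
        observations.append(result)
    return "Stopped: step limit reached."

print(run_agent_loop("Retrieve the status of order #12345."))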
Core Architecture of an AI Agent System
1. Model Layer
The model performs all reasoning.
Selection depends on:
- Latency requirements
- Cost boundaries
- Task complexity
- Input/output formats
Multiple models may be used for routing, classification, or staging tasks.
However, initial implementations usually rely on a single model for simplicity.
2. Tool Layer
Tools provide operational capability.
A tool is a function with strict input/output schemas and clear documentation.
They fall into categories such as:
- Data retrieval (APIs, search functions, database operations)
- Data manipulation (formatting, filtering, transformation)
- Operational actions (ticket creation, notifications, calculations)
Effective tool design keeps actions narrow, predictable, and well-documented.
Tools form the “action surface” of the agent and determine how reliably the system can complete assigned objectives.
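As a concrete example, here is a sketch of a narrow data-retrieval tool: just a typed Python function with a descriptive docstring. The function name, arguments, and returned data are invented for illustration, and the body is a stub where a real implementation would query your order system:

def get_order_status(order_id: str) -> dict:
    """Look up the current status of a customer order.

    Args:
        order_id: The order identifier, for example "12345".

    Returns:
        A dictionary with the order's status and, if shipped, a tracking number.
    """
    # Stub data for illustration; a real tool would call an API or database here.
    return {"order_id": order_id, "status": "shipped", "tracking_number": "1Z999AA10123456784"}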
3. Orchestration Layer
This layer supervises the system. It is responsible for:
- Running the reasoning loop
- Applying system rules
- Tracking state
- Managing tool invocation
- Handling errors
- Regulating cost and step limits
It is also the layer where developers define the agent’s operational scope and boundaries.
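Most of these responsibilities come down to wrapping every tool invocation with checks before and after it runs. Below is a rough, framework-free illustration of two of them, step limits and error handling; the function and its names are mine, not part of any library:

def guarded_tool_call(tool, arguments: dict, step: int, max_steps: int = 10) -> str:
    """Sketch of orchestration-layer guardrails around a single tool call."""
    if step >= max_steps:
        # Regulate step limits: refuse to run past the configured budget.
        raise RuntimeError("Step budget exhausted; terminating the agent run.")
    try:
        return str(tool(**arguments))
    except Exception as exc:
        # Handle errors: return the failure as an observation so the model
        # can reassess instead of the whole run crashing.
        return f"Tool '{getattr(tool, '__name__', 'tool')}' failed: {exc}"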
4. Deployment Layer
An agent becomes useful only when deployed as a service.
A typical deployment includes:
- An API interface
- Logging and observability
- Access controls
- Storage for session data or long-term records
- Continuous integration workflows
This layer ensures the agent behaves as a reliable software component rather than a prototype.
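What this looks like depends entirely on your stack. As one rough sketch of the API-plus-logging piece, here is a minimal FastAPI service; FastAPI, the /ask route, and the run_agent placeholder are my own choices for illustration and are not prescribed by ADK:

import logging

from fastapi import FastAPI
from pydantic import BaseModel

logging.basicConfig(level=logging.INFO)  # basic observability
app = FastAPI()

class Query(BaseModel):
    question: str

async def run_agent(question: str) -> str:
    # Placeholder: call your agent runner here (for example the
    # InMemoryRunner built later in this article) and return its text.
    return f"(agent response to: {question})"

@app.post("/ask")
async def ask(query: Query) -> dict:
    logging.info("Incoming question: %s", query.question)
    answer = await run_agent(query.question)
    return {"answer": answer}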
Capability Levels in AI Agents
Understanding agent capability levels helps to set realistic expectations.
Level 0: Model-Only Systems
The model answers queries without tools or memory.
Suitable for text generation or explanation tasks.
Level 1: Tool-Connected Systems
The model uses a small set of tools to complete direct tasks.
Example: Querying external APIs for factual information.
Level 2: Multi-Step Systems
The agent performs planning and executes sequences of tool calls.
This level supports tasks that require intermediate decisions.
Level 3: Multi-Agent Systems
Two or more agents collaborate.
A coordinator routes tasks to specialized agents based on capability or domain.
Level 4: Self-Improving Systems
Agents that can create new tools or reconfigure workflows based on observed gaps.
Primarily research-grade today.
Building Your Practical First Agent
Developers do not need a complex system to get a simple agent running.
A small, well-defined project is enough to understand the architecture.
Keep in mind that I ran all of this code in a Kaggle notebook and used Google's Gemini for the project. The screenshots accompanying the code blocks are from my own run.
Step 1. Configure Your Gemini API Key
Every ADK project must expose your Gemini API key to the runtime. This block sets the key as an environment variable, which the ADK automatically detects.
import os
# Replace with your actual key or load it from your environment manager
os.environ["GOOGLE_API_KEY"] = "YOUR_API_KEY_HERE"
print("API key configured.")
Step 2. Import ADK Core Components
These are the foundational ADK modules we'll interact with: agent definitions, model bindings, runtimes, and built-in tools. This is the minimum import set required to stand up a functional agent.
from google.adk.agents import Agent
from google.adk.models.google_llm import Gemini
from google.adk.runners import InMemoryRunner
from google.adk.tools import google_search
from google.genai import types
Step 3. Optional: Retry Settings
LLM APIs occasionally return transient errors under heavy load. The retry configuration defines a standard exponential backoff strategy so your agent can recover automatically without failing user tasks.
retry_config = types.HttpRetryOptions(
    attempts=5,
    exp_base=7,
    initial_delay=1,
    http_status_codes=[429, 500, 503, 504],
)
Step 4. Define Your First Agent
This is the most important construct. An agent is defined by its behavior (instruction), identity (name/description), model, and available tools. The structure below is portable across any environment.
root_agent = Agent(
    name="helpful_assistant",
    description="A simple agent that can answer general questions.",
    model=Gemini(
        model="gemini-2.5-flash-lite",
        retry_options=retry_config,
    ),
    instruction="You are a helpful assistant. Use web search for current information.",
    tools=[google_search],
)
Step 5. Create a Runner
The Runner orchestrates conversations, tool calls, and message history. For prototyping, InMemoryRunner is the simplest option because it requires no infrastructure or persistent storage.
runner = InMemoryRunner(agent=root_agent)
Step 6. Run Your Agent
run_debug() executes a complete agent cycle—thought generation, tool selection, action execution, and final synthesis. This is the quickest way to validate that your agent is correctly wired.
response = await runner.run_debug(
    "What is Google's Agent Development Kit? What languages are supported?"
)
print(response.text)
Step 7. Try a Query That Requires Live Information
This example demonstrates that the agent will automatically invoke the Google Search tool when the prompt requires real-time information not contained in the model’s training data.
response = await runner.run_debug("What's the weather in London right now?")
print(response.text)
Step 8. Scaffold an ADK Project Folder (Optional)
ADK includes a CLI for generating full project scaffolds. This is useful when you're ready to move from experimentation into an actual multi-file agent application.
adk create sample-agent --model gemini-2.5-flash-lite --api_key $GOOGLE_API_KEY
Step 9. Launch the ADK Web UI (Optional)
The ADK Web UI is a local development interface for inspecting agent traces, debugging tool calls, and testing messages. Start it from any terminal—no Kaggle or notebook integration required.
adk web
After launching, the UI becomes available at:
http://localhost:8000
In the next articles in this series, I will cover:
- Designing reliable tool schemas
- Structuring agent instructions
- Using Model Context Protocol (MCP) in real applications
- Implementing human-in-the-loop workflows
- Tracking performance and diagnosing failures
- Hardening agents against incorrect tool usage
That's all for day 1! Can't wait to get back here for day 2!
Did you know that the 5-Day AI Agents Intensive course is now publicly available to learn from? Head on over here!
Let's connect:
LinkedIn




