Understanding AI Agents

Part 1 of Demystifying AI agents

If you're a developer or engineer navigating the rapidly evolving landscape of AI, you're in the right place. I’ve spent a while exploring how AI can enhance software development, security, and automation. My goal with this blog series is to demystify AI agents—what they are, how they work, and how you can build secure and reliable ones.

This series will break down key concepts around AI agents, walk through real-world use cases, and provide practical guidance on implementing them effectively. Whether you're experimenting with autonomous systems or looking to integrate agents into your workflow, I hope this blog will help you understand not just how agents function but also why they behave the way they do.

What You’ll Learn in This Series

What is an AI agent? A foundational look at agents, the think-act-observe loop, and how they interact with tools.
Exploring smolagents – a lightweight framework for building AI agents.
Securing AI agents – how CodeGate can protect AI-driven workflows.

What is an AI agent?
An agent is an entity that can observe and act on the environment in which it is situated. Given a user-provided goal (i.e., instructions), the agent first constructs a plan (i.e., a sequence of steps) to achieve that goal. The agent then executes the plan, optionally calling tools from a predefined set.

A tool is an arbitrary function in a programming language with a well-defined interface. For example, a tool might be a function that performs a calculation, queries a database, or calls an API.
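
To make the idea of a well-defined interface concrete, here is a minimal sketch of how a tool could be represented in Python. The Tool class and the add function are hypothetical names I made up for illustration; real agent frameworks define their own tool abstractions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    """A plain function plus the metadata an agent needs to call it."""
    name: str          # how the agent refers to the tool
    description: str   # what the tool does, in natural language
    func: Callable[..., object]

    def __call__(self, **kwargs):
        # Calling the tool simply forwards the keyword arguments.
        return self.func(**kwargs)


def add(a: float, b: float) -> float:
    """Return the sum of two numbers."""
    return a + b


# Hypothetical tool: the name and description are illustrative only.
add_tool = Tool(name="add", description="Add two numbers.", func=add)

print(add_tool(a=2, b=3))  # 5
```

The name and description matter because they are typically what the agent sees when it decides which tool fits the task.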

Conceptually, the agent executes several iterations of a loop consisting of: thinking, acting and observing. This is also referred to as the think-act-observe loop.

Think: Given a user instruction, the agent thinks and constructs a plan to satisfy the user’s request. As it executes the plan, the agent reasons about the current state to decide which action to perform next.
Act: The agent carries out the action it decided on by calling one of the available tools. A tool might read or update some information, or it might carry out a physical action in the real world.
Observe: The agent observes the result of its action, that is, the output generated by the tool it called. The agent uses this result to think about its next action.
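
The three phases above can be sketched as a small loop. This is only an illustration: the think function is a hard-coded stand-in for whatever reasoning engine the agent uses, and the double tool and the numeric goal are assumptions I invented to keep the example tiny.

```python
from typing import Callable, Optional

# Hypothetical tool registry: a single tool that doubles a number.
TOOLS: dict[str, Callable[..., object]] = {
    "double": lambda x: x * 2,
}


def think(goal: int, observations: list) -> Optional[tuple[str, dict]]:
    """Stand-in for the reasoning step: pick the next action, or None when done."""
    current = observations[-1] if observations else 1
    if current >= goal:
        return None  # goal reached, nothing left to do
    return ("double", {"x": current})


def run_agent(goal: int, max_steps: int = 10) -> list:
    """Run the think-act-observe loop until the goal is met or steps run out."""
    observations: list = []
    for _ in range(max_steps):
        action = think(goal, observations)   # Think
        if action is None:
            break
        tool_name, args = action
        result = TOOLS[tool_name](**args)    # Act
        observations.append(result)          # Observe
    return observations


print(run_agent(goal=10))  # [2, 4, 8, 16]
```

The max_steps limit is a design choice worth keeping in any real loop, since nothing else guarantees that the reasoning step will ever decide it is done.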

Fig 1 illustrates a general schema of an AI agent.

As a naive example, a stock trading agent can observe the price of a specific stock in a stock market and perform the action of either purchasing or selling that stock based on the user instructions.

In the context of LLMs, an agent is a program that employs an LLM as a planning and reasoning engine to achieve a user-specified goal. For observing and acting, the agent employs the available tools. However, the decision to use a tool is made by the LLM; the surrounding program only carries out the call. I will clarify this notion of an agent using the example of stock trading.
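
As a rough sketch of that division of labor, assume the LLM expresses its decision as a small JSON object and the agent program parses it and calls the matching function. The JSON shape, the dispatch helper, and the stubbed price are all assumptions for illustration; actual LLM APIs and frameworks each have their own tool-calling format.

```python
import json


def get_stock_price(stock_ticker: str) -> float:
    """Stub tool that returns a made-up price."""
    return {"NVDA": 160.0}[stock_ticker]


# The tools the agent program is willing to execute.
TOOLS = {"get_stock_price": get_stock_price}

# Hypothetical LLM output: this JSON shape is an assumption, not a standard.
llm_output = '{"tool": "get_stock_price", "arguments": {"stock_ticker": "NVDA"}}'


def dispatch(raw: str):
    """Parse the LLM's decision and execute the tool it chose."""
    call = json.loads(raw)
    tool = TOOLS.get(call["tool"])
    if tool is None:
        raise KeyError(f"unknown tool: {call['tool']}")
    return tool(**call["arguments"])


print(dispatch(llm_output))  # 160.0
```

Rejecting unknown tool names keeps the program, not the LLM, in control of what can actually run.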

Suppose the stock trading agent from the above example has access to the following tools, along with a general-purpose calculator tool named calculate:

# Lookup the stock ticker of a company
lookup_ticker(company_name: str)

# Lookup my stock holdings
get_my_stock_holdings()

# Get the price of a given stock ticker
get_stock_price(stock_ticker: str)

# Sell the stock of given ticker and given quantity
sell_stock(stock_ticker: str, quantity: int)
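
To experiment with these tools without a real brokerage, they could be stubbed out in Python as below. This is only a sketch: the return types, the in-memory dictionaries, and every price and holding in them are assumptions made up for illustration.

```python
# In-memory stand-ins for a market data feed and a brokerage account.
# All values are made up for illustration.
_TICKERS = {"nvidia": "NVDA", "microsoft": "MSFT", "tesla": "TSLA"}
_PRICES = {"NVDA": 160.0, "MSFT": 410.0, "TSLA": 250.0}
_HOLDINGS = {"NVDA": 100, "MSFT": 50, "TSLA": 10}


def lookup_ticker(company_name: str) -> str:
    """Look up the stock ticker of a company."""
    return _TICKERS[company_name.lower()]


def get_my_stock_holdings() -> dict[str, int]:
    """Return a copy of the current holdings, keyed by ticker."""
    return dict(_HOLDINGS)


def get_stock_price(stock_ticker: str) -> float:
    """Get the price of a given stock ticker."""
    return _PRICES[stock_ticker]


def sell_stock(stock_ticker: str, quantity: int) -> str:
    """Sell the given quantity of a stock, refusing to sell more than is held."""
    held = _HOLDINGS.get(stock_ticker, 0)
    if quantity <= 0 or quantity > held:
        raise ValueError(f"cannot sell {quantity} of {stock_ticker}; holding {held}")
    _HOLDINGS[stock_ticker] = held - quantity
    return "success"


print(lookup_ticker("Nvidia"))  # NVDA
```

The check inside sell_stock is deliberate: a tool that changes state is a good place to validate inputs, because the caller is an LLM that may pass an unexpected quantity.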

Now, assume a user provides these instructions to the agent:

Check the stock price of Nvidia. If it is above $150, sell 80% of the stock that I hold.

When the agent starts execution, it employs the LLM to construct a plan, that is, a sequence of steps. The LLM’s reasoning process might look like the following.

I need to create a plan for this problem. First, I need to check the price of Nvidia stock. I have access to the get_stock_price tool. But the tool needs a stock ticker. I need to first look up the ticker symbol of Nvidia. I will use the lookup_ticker tool. 
Step 1: Call lookup_ticker to get ticker for Nvidia.

Now I can look up Nvidia’s stock price using the get_stock_price tool. 
Step 2: Call get_stock_price to get the price using ticker. 

If the price is above $150, I need to sell 80% of the stock. I need to first get the total number of shares. I will use the get_my_stock_holdings tool. 
Step 3: Call get_my_stock_holdings to get the number of Nvidia shares.

I need to calculate 80% of the Nvidia shares. I will use the calculator tool. 
Step 4: Call calculate to get 80% of the shares.

I need to sell 80% of the shares. I can use the sell_stock tool.
Step 5: Call sell_stock to sell 80% of the shares.

At the end of the reasoning process, the LLM has generated a plan consisting of Steps 1–5.
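
One way to picture the resulting plan is as an ordered list of steps, each naming the tool to call. The PlanStep class below is a hypothetical structure of my own, not something a particular framework prescribes; an LLM would normally produce the plan as free text or in a framework-specific format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanStep:
    number: int   # position of the step in the plan
    tool: str     # name of the tool the step calls
    purpose: str  # why the step is needed


plan = [
    PlanStep(1, "lookup_ticker", "get the ticker for Nvidia"),
    PlanStep(2, "get_stock_price", "get the price using the ticker"),
    PlanStep(3, "get_my_stock_holdings", "get the number of Nvidia shares held"),
    PlanStep(4, "calculate", "compute 80% of the shares"),
    PlanStep(5, "sell_stock", "sell 80% of the shares"),
]

for step in plan:
    print(f"Step {step.number}: call {step.tool} to {step.purpose}")
```

Note that the list does not encode the price condition. Steps 3 to 5 only make sense if the price is above $150, and the agent checks that at run time while thinking about what to do next.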

Now, the agent can execute this plan by repeatedly running the think-act-observe loop. Here is what the agent’s execution might look like.

Step 1

Think: I now have a plan, and I need to execute Step 1.
Act: Call lookup_ticker to get ticker for Nvidia.
Observe: The tool returns the stock ticker as: NVDA.

Step 2

Think: I now have Nvidia’s ticker symbol, NVDA. I can execute Step 2.
Act: Call get_stock_price to get the price using ticker. 
Observe: The tool returns the stock price as: $160.

Step 3

Think: Since $160 is above $150, I need to sell 80% of the shares. I need to execute Step 3. 
Act: Call get_my_stock_holdings.
Observe: The tool returns the stock holdings: {"NVDA": 100, "MSFT": 50, "TSLA": 10}.

Step 4

Think: I know that the user holds 100 shares of Nvidia. I need to execute Step 4.
Act: Call calculate to get 80% of the shares.
Observe: The tool returns: 80.

Step 5

Think: I know I need to sell 80 shares. I need to execute Step 5.
Act: Call sell_stock to sell 80 shares of NVDA.
Observe: The tool returns success.

Done

Think: The user’s request has been completed successfully. I am done.
Act: Indicate completion of user’s request.
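
The whole run can be replayed in a few lines of Python. This is a scripted sketch, not a real agent: the order of the calls is hard-coded where an LLM would choose each one, all tools are stubs returning made-up values, and the signature of calculate is my own assumption. With 100 shares held, 80% comes to 80 shares.

```python
def lookup_ticker(company_name: str) -> str:
    return {"nvidia": "NVDA"}[company_name.lower()]


def get_stock_price(stock_ticker: str) -> float:
    return {"NVDA": 160.0}[stock_ticker]


def get_my_stock_holdings() -> dict:
    return {"NVDA": 100, "MSFT": 50, "TSLA": 10}


def calculate(percent: int, total: int) -> int:
    """Hypothetical calculator tool: integer percentage of a total."""
    return total * percent // 100


def sell_stock(stock_ticker: str, quantity: int) -> str:
    return "success"


def run(company: str, threshold: float, percent: int) -> list[str]:
    """Replay the five steps, recording each observation."""
    trace = []
    ticker = lookup_ticker(company)             # Step 1
    trace.append(f"ticker={ticker}")
    price = get_stock_price(ticker)             # Step 2
    trace.append(f"price={price}")
    if price <= threshold:
        # The condition in the user's instruction is not met, so stop here.
        trace.append("done: price not above threshold")
        return trace
    held = get_my_stock_holdings()[ticker]      # Step 3
    trace.append(f"held={held}")
    quantity = calculate(percent, held)         # Step 4
    trace.append(f"quantity={quantity}")
    status = sell_stock(ticker, quantity)       # Step 5
    trace.append(f"sell={status}")
    return trace


for line in run("Nvidia", threshold=150, percent=80):
    print(line)
```

Each appended line corresponds to one Observe phase, which is the information a real agent would feed back into its next Think phase.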

Conclusion: This post introduced the notion of an AI agent and presented the think-act-observe loop, which is central to how an agent executes. Through a simple stock trading scenario, I outlined the planning and execution steps of an agent. I hope you found this post useful. In the next post, I will review smolagents, a recently introduced framework from Hugging Face for developing such agents. I will also describe how smolagents can be secured using CodeGate, a recently introduced framework for securing AI agents and LLMs.

References:

Artificial Intelligence: A Modern Approach, Russell & Norvig, Chapter 2: https://people.eecs.berkeley.edu/~russell/aima1e/chapter02.pdf
smolagents: https://huggingface.co/blog/smolagents
CodeGate: https://codegate.ai/

DEV Community
