🦄 Making great presentations more accessible.
This project enhances multilingual accessibility and discoverability while preserving the original content. Detailed transcriptions and keyframes capture the nuances and technical insights that convey the full value of each session.
Note: A comprehensive list of re:Invent 2025 transcribed articles is available in this Spreadsheet!
Overview
📖 AWS re:Invent 2025 - Function calling vs agents: Choose the right AI approach (DEV204)
In this video, Guido Nebiolo from Reply explains how to select the right AI integration approach for enterprise workflows. He compares three paradigms: function calling for simple, deterministic tasks like retrieving account balances; single autonomous agents using the ReAct pattern for multi-step problems like travel planning; and multi-agent systems with hierarchical or network architectures for complex, domain-specialized tasks like legal due diligence. Key decision factors include task complexity, predictability versus adaptability, cost and latency constraints, control over business logic, and required autonomy levels. He emphasizes starting simple, planning for observability, and leveraging AWS tools like Amazon Bedrock for implementation.
Note: This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.
Main Part
Function Calling: The Bridge Between LLMs and Deterministic APIs
Thank you. I think you can hear me. Thank you all for attending this session. As mentioned, my name is Guido Nebiolo. I work for Reply. We are a system integrator, and I lead a group of people focused entirely on AI and generative AI. But I think all of you are here not to listen to me talk about myself, but to talk about an important design choice that many companies are facing as they adopt generative AI.
The question is how to select the right approach when integrating AI-powered workflows in real-world systems. Depending on the complexity and the requirements of the task, we might design our system with several different paradigms. We may simply use function calling within an LLM. We may deploy a single autonomous agent, or we may orchestrate multiple agents working together. Each approach offers benefits and trade-offs, and as generative AI adoption expands across industries, making the right architectural choice becomes increasingly important for performance, safety, and especially for business value.
Let's start with the simplest approach: function calling. Modern LLMs, like the models available on Amazon Bedrock, can interface with external systems through function calling. It's a well-known feature that allows the developer to define APIs and functions that an LLM can invoke when it needs to access external data, do calculations, or work with code. Instead of generating free-form text like "temperature tomorrow should be around 25 degrees," the model can call a function and get back the exact value.
This is a game changer for reliability because structured and deterministic outputs replace unpredictable text generation. This makes AI safer to integrate into real-world business workflows: you can fetch live data, perform calculations, retrieve inventories, check customer balances, and verify booking details, all with a deterministic external function call. For example, think of a banking assistant. When a customer asks "what's my current balance," we don't want the model to guess the answer, but to simply trigger a backend function and retrieve the exact value from a customer database. Function calling is essentially the bridge between the probabilistic reasoning typical of LLMs and the deterministic world of software APIs.
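That banking flow can be sketched as a minimal dispatch loop in plain Python. Everything here (the `get_account_balance` function, the fake customer table) is hypothetical, just to show where the deterministic boundary sits:

```python
# Minimal sketch of function calling: the model emits a structured
# tool request, and the application executes a deterministic function.
# All names and data here are hypothetical, not a real banking API.

ACCOUNTS = {"cust-001": 2543.17}  # stand-in for a customer database

def get_account_balance(customer_id: str) -> float:
    """Deterministic backend lookup -- no guessing by the model."""
    return ACCOUNTS[customer_id]

# The functions the model is allowed to call, keyed by name.
TOOLS = {"get_account_balance": get_account_balance}

def dispatch(tool_call: dict) -> float:
    """Execute the function named in a model-generated tool call."""
    func = TOOLS[tool_call["name"]]
    return func(**tool_call["arguments"])

# A tool call as the model might emit it for "what's my current balance?"
result = dispatch({"name": "get_account_balance",
                   "arguments": {"customer_id": "cust-001"}})
print(result)  # 2543.17
```

The model never touches the database; it only produces the structured request, and the application stays in control of execution.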
Let's now bring this to a concrete travel scenario. Imagine I need to plan my trip to re:Invent in Las Vegas for next week. I start prompting an LLM: "next week I will go to Las Vegas for a few days' trip. What do I have to put in my luggage?" A standard LLM would give a generic list of clothes, jackets, sunglasses, things like that, without taking into consideration that when I say "next week," I'm already providing a lot of information, like the season, that determines what I really need to bring with me during my stay.
Now, if I equip the model with function calling capabilities, it can enhance its answer. The model recognizes that my request involves a place and a date, and can call a function to get the forecast for that specific place and time. With the real forecast, it can now suggest the things I actually need. The result is far more accurate and personalized.
Here is an example of doing function calling with Amazon Bedrock. It's pretty simple: we include the tools the LLM can access right in the request alongside the prompt. Function calling is extremely efficient, but only for known and simple interactions. Once we scale up to real-world scenarios, we have enterprise workflows with dozens of tools, interdependent functions, and more dynamic requirements, and things get harder really quickly. The more functions involved, the more back and forth between the LLM and the application. The system still needs to hold the business logic and drive this back and forth with the LLM. At this point, we start moving the reasoning inside the AI itself, and this brings us to agents.
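As a rough sketch of what such a tool declaration looks like with the Amazon Bedrock Converse API, the `toolConfig` payload describes each function's name and JSON input schema; the `get_weather` tool and its fields here are illustrative:

```python
# Sketch of how a tool is declared for the Amazon Bedrock Converse API.
# The definition travels with the request; the model decides when to
# use the tool. The get_weather tool and its schema are illustrative.

tool_config = {
    "tools": [
        {
            "toolSpec": {
                "name": "get_weather",
                "description": "Get the weather forecast for a city and date.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "city": {"type": "string"},
                            "date": {"type": "string"},
                        },
                        "required": ["city", "date"],
                    }
                },
            }
        }
    ]
}

# With boto3 this would be passed to bedrock-runtime's converse(), e.g.:
# client = boto3.client("bedrock-runtime")
# response = client.converse(modelId=..., messages=..., toolConfig=tool_config)
```

When the model decides the tool is needed, the response comes back with a structured tool-use request instead of free text, and the application executes the function and returns the result.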
From ReAct Pattern to Multi-Agent Systems: Enabling Autonomous AI Problem-Solving
The ReAct pattern, which stands for reasoning and acting, showed us that an LLM can do more than just predict text or call functions directly. It can reason: first I need to check one thing, then, depending on the result, I will call another tool, just like humans think about how to solve a problem. By integrating tool descriptions into the prompt, the model can dynamically choose which tool to use, plan multiple steps ahead, analyze intermediate results, and adjust its actions according to what it obtains. This allows us to build systems where the AI is not a passive responder but actively drives problem solving, and this leads us directly into agentic architecture.
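A minimal, illustrative version of that reason-act-observe loop, with a scripted stub standing in for the LLM (a real system would call a model at each step):

```python
# Toy ReAct-style loop: the "model" alternates actions with
# observations until it decides to answer. All names are made up.

def scripted_model(history):
    """Stand-in for an LLM: returns the next action or final answer."""
    if not any(step[0] == "observation" for step in history):
        return ("action", "check_weather", {"city": "Las Vegas"})
    return ("answer", "Pack light layers; it will be around 15°C.", None)

AGENT_TOOLS = {"check_weather": lambda city: f"15°C and sunny in {city}"}

def react_loop(model, tools, max_steps=5):
    history = []
    for _ in range(max_steps):
        kind, payload, args = model(history)
        if kind == "answer":
            return payload
        observation = tools[payload](**args)          # act
        history.append(("observation", observation))  # observe, reason again
    raise RuntimeError("no answer within step budget")

answer = react_loop(scripted_model, AGENT_TOOLS)
print(answer)
```

The loop structure, not the stub, is the point: each iteration feeds the latest observation back into the model's context before the next decision.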
An agent is a step beyond function calling, as we said. It's an autonomous AI process that receives a goal, reasons about how to solve it, decomposes the task into multiple steps, and decides when to call functions, when to use tools, and when to fetch external data. Let's see this in practice with the scenario we had before. Imagine we want to extend it: instead of simply calling a single get-weather function, we want a system that is autonomously able to handle multiple aspects.
For example, retrieving the weather forecast, checking the hotel availability, verifying the flight to come here, and even suggesting activities depending on the length of my stay and the season. The agent is supplied with multiple tools: a weather API, a flight search tool, a hotel booking tool, and maybe even an integration with my calendar to check what I have scheduled. The key difference here is that we no longer manually orchestrate the different steps. An agent is able to reason step by step, deciding when to invoke each tool as its understanding of the problem evolves.
For example, say that we want to plan a trip to Las Vegas next week. The agent can decide to call the different tools: query the weather forecast for those days, check the availability of flights and hotels, and finally recommend the things I have to pack. Sometimes one agent is not enough. Some business problems are more complex and need to be broken into specialized roles to deliver better results. This is where multi-agent systems come in.
Instead of one single agent juggling between different domains, we can create multiple agents that can collaborate. Each one can be focused on a particular area, having its own expertise, its own dedicated tools, and its own dedicated instructions. Continuing with the travel scenario, but taking it a little bit further, we can have a weather agent focusing purely on forecasting, a flight agent focusing on flight searches, a hotel agent for accommodations, and an activities agent that suggests all the things that we can do during this period.
Sitting on top, we can have a coordinator agent that essentially orchestrates all of them. Each agent can work independently and efficiently, staying within the boundaries of its own domain, improving accuracy and modularity. In this example, we can improve a single agent without touching the others and without the risk of breaking things that are working. This is how human organizations work: different specialists that collaborate through a manager.
Architectural Patterns and Challenges in Multi-Agent Orchestration
Before diving into challenges, let's look at two foundational multi-agent architectures that you can choose from. The first one is the hierarchical pattern, also called supervisor. In this pattern, we have a central orchestrator, what we can call a supervisor, that breaks down the overall goal into different subtasks and distributes them to specialized worker agents. Each worker is focusing only on its own domain without calling the others. The supervisor collects and consolidates all the results. This approach provides clear control on who is in charge of what, easy debugging, and straightforward sequencing of all the different actions.
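A toy sketch of the supervisor pattern, with plain functions standing in for the worker agents (all agent names and outputs here are made up):

```python
# Hierarchical (supervisor) pattern: a coordinator splits the goal
# into subtasks, routes each to a specialist worker, and consolidates
# the results. Workers are plain functions for illustration only.

def weather_agent(task):
    return "sunny, 15°C"

def flight_agent(task):
    return "nonstop flight found for Monday"

def hotel_agent(task):
    return "hotel available near the venue"

WORKERS = {"weather": weather_agent, "flights": flight_agent, "hotels": hotel_agent}

def supervisor(goal):
    """Break the goal into domain subtasks, fan out, and consolidate."""
    subtasks = ["weather", "flights", "hotels"]  # a real supervisor would plan these
    return {name: WORKERS[name](goal) for name in subtasks}

plan = supervisor("Plan a trip to Las Vegas next week")
```

Each worker stays inside its own domain and never calls a sibling directly, which is what makes debugging and sequencing straightforward in this pattern.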
The second one is the network, also called swarm. Here we have multiple agents operating as peers in a mesh with a shared blackboard. There's no single boss. Instead, agents communicate directly to share information and coordinate the different tasks. This pattern offers more resiliency, because if one agent fails the others can continue to work, and it enables parallel problem solving, but it can be very complex to manage.
As someone wiser than me once said, with great power comes great responsibility. We have several challenges with multi-agent architecture. The first one is orchestration complexity, which involves determining who can talk to whom, when, and in what order. The second one is latency and costs. Multiple agents means more LLM calls, and more LLM calls with more tools potentially lead to longer response times.
The third challenge is debugging. As a distributed system, if something goes wrong, we need detailed observability, tracing the full chain of agent decisions across the different agents. Last but not least, we have security and safety. Each agent can access different sensitive data, and permissions must be carefully controlled in a way that ensures all the agents stay within their own boundaries.
Imagine a real-world failure: two agents calling each other recursively and endlessly because of a misaligned prompt, or a tool API that has been changed without notice, leaving the agents unable to call that specific function. No one notices until a user complains. That's why strong observability and monitoring, for example using Strands with OpenTelemetry, is crucial when deploying multi-agent architectures.
Choosing the Right Approach: Decision Framework and Real-World Applications
When choosing between function calling and multi-agent architecture, it's not about picking the most advanced option because it can be difficult to manage. It's about selecting the right level of intelligence and autonomy that we need for our use case. The first key driver in my opinion to take into consideration is task complexity. It's about how many steps and how many moving parts are in our problem.
Is it a simple question-answer flow, or is it a multi-step task that evolves with the context? If the task is pretty straightforward, function calling is enough: quicker and easier. If it involves multiple subtasks, contextual interpretation, and chained decisions, an agent, like the travel-booking example, can be a good choice. When instead a task involves parallel domains, for example legal compliance and finance on the same problem, and requires coordination, a multi-agent system is usually a better fit.
The second consideration is predictability versus adaptability. The question we want to answer is how tightly do we need to control the system's behavior, or do we prefer that it adapts in some open-ended way to solve the problem we gave it? Function calling offers full predictability. It's a single shot: the LLM suggests the function, the system executes it, and returns the result, giving deterministic control.
Then we have agents, which give more adaptability. Essentially, they decide dynamically which tool they need to use based on the user intent and the evolution of the context, the evolution of the problem solution. And then we have multi-agent systems that allow emergent behavior. Essentially, agents may solve subproblems differently based on their domain knowledge. This is very powerful, but obviously less predictable, and that's where good observability becomes critical.
The third consideration is cost and latency. Do you prefer performance or budgets? Where are your constraints? Essentially, do you need a very responsive system, or can you live with a system that solves the problem in an asynchronous way? Function calls are cheaper and very fast, with minimal token use and few model calls. Agents are more expensive because they often reason through multiple steps to solve a problem, and multi-agent systems multiply cost and latency.
Then we have control over the business logic. Essentially, who defines how the system behaves—you or the model itself?
With function calling, you control everything. With agents, the logic is partially embedded in the prompt and the reasoning steps that the LLM performs. With multiple agents, you are essentially delegating everything to them to solve the problem in the way that they think is right. Last but not least, we have autonomy. How much initiative should the AI take?
Function calling is reactive: the user asks something, the model responds. Agents are instead proactive. They can decompose goals, decide the different steps to take, and even ask follow-up questions if needed to solve the problem. With multi-agent systems, we have collaborative autonomy: agents can assign roles among themselves, distribute tasks, and solve problems with minimal to no human input.
Here's a recap table to visualize it. Low complexity and low autonomy: use function calling. Moderate complexity and adaptive reasoning: a single agent. High complexity with collaboration across multiple domains requires multi-agent orchestration. For example, retrieving a product price is simple function calling; we don't need anything more complex than that. For planning a vacation, use a single agent: it's a single domain in which the agent has different tools and APIs it can call, and it can decompose the problem while staying within the boundaries of the domain. Building an enterprise legal advisor, instead, involves multiple tasks across different domains, and that's where a multi-agent system is needed.
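The recap table can be condensed into a tiny decision helper. The thresholds are just the talk's rules of thumb, not a formal framework:

```python
# The session's recap table as a toy decision helper.
# complexity: "low", "moderate", or "high"; needs_multi_domain: does
# the task span parallel specialized domains that must collaborate?

def choose_approach(complexity: str, needs_multi_domain: bool) -> str:
    if needs_multi_domain or complexity == "high":
        return "multi-agent system"
    if complexity == "moderate":
        return "single agent"
    return "function calling"

print(choose_approach("low", False))       # product price lookup
print(choose_approach("moderate", False))  # vacation planning
print(choose_approach("high", True))       # enterprise legal advisor
```

In practice the boundaries are fuzzier than three labels, but encoding the default answer this way makes the "start simple" advice concrete: you only escalate when a condition forces you to.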
Let's map some real-world use cases. Function calling: FAQ bots, currency conversion, order status lookups, all the things that can be solved very simply. With single agents, we can have travel planning assistants, personal research assistants, content summarization: the typical assistants you will be using in your everyday job within a few years. With multi-agent systems, instead, we have use cases like complex report generation and enterprise-wide troubleshooting. Even project management or legal due diligence can be performed by multiple agents combined in the same solution.
The thing is, the more ambiguity, the more specialization or collaboration the problem involves, the more an agentic approach helps. The less specialization, the more we are helped by function calling. Let's wrap up. First things first, always start simple. Don't overcomplicate the problem. Use function calling whenever deterministic output is sufficient, or you can use single agents for multiple-step tasks that need adaptive reasoning. Use multi-agents when you need different domain specialization and complex workflows.
Second, always plan for observability. Things will definitely break when using LLMs, so plan for observability, for cost control, and especially for governance. And third, remember that AWS provides different end-to-end tools that can support you at every level. We have Amazon Bedrock, Lambda, Step Functions, even agent orchestration. All these services can be used to build agentic AI solutions.
I hope this gave you a clear model on how to apply AI inside your projects. Here is my contact, feel free to get in touch with me. Remember to leave feedback for this session through the mobile app. Thank you for your attention. Thank you so much.
Note: This article is entirely auto-generated using Amazon Bedrock.