📚 Tech Acronyms Reference
Quick reference for acronyms used in this article:
- API - Application Programming Interface
- JSON - JavaScript Object Notation
- LLM - Large Language Model
- REST - Representational State Transfer
- SQL - Structured Query Language
- UUID - Universally Unique Identifier
Libraries/Frameworks:
- LiteLLM - Universal LLM API wrapper (provider-agnostic)
- LangChain - Framework for building LLM applications
- Ollama - Tool for running LLMs locally
- Pydantic - Data validation library for Python
🎯 Introduction: From Text Generation to Action
Up until now, Large Language Models (LLMs) have been impressive text generators. You ask a question, they generate text. But they can't:
- Check your calendar
- Query your database
- Send an email
- Fetch real-time weather data
- Calculate complex math
Function calling changes this. It lets LLMs decide when to call external tools and what parameters to pass.
And it works across ALL major providers:
- ✅ OpenAI (GPT-4, GPT-3.5)
- ✅ Anthropic (Claude Sonnet, Claude Opus)
- ✅ Open-source models (Llama 3.1, Mistral, Mixtral)
- ✅ Google (Gemini)
- ✅ Any LLM with JSON mode
This article shows you how to build provider-agnostic function calling systems.
Real-Life Analogy: The Personal Assistant
Old LLM (Text-only):
- You: "What's the weather in Tokyo?"
- LLM: "I don't have access to real-time weather data, but Tokyo typically has..."
New LLM (With function calling):
- You: "What's the weather in Tokyo?"
- LLM: Thinks: I need weather data. I'll call the get_weather function with city="Tokyo"
- LLM calls: get_weather(city="Tokyo", units="celsius")
- Function returns: {"temp": 22, "condition": "partly cloudy"}
- LLM: "The weather in Tokyo is currently 22°C and partly cloudy."
The LLM becomes an orchestrator, deciding which tools to use and when.
💡 Data Engineer's ROI Lens
For this article, we're focusing on:
- How does function calling actually work? (The protocol)
- How do I define functions? (JSON schemas, type safety)
- How do I handle errors? (Retries, validation, edge cases)
This is the bridge from theory to production. After this article, you'll be able to build LLM agents that take real actions.
🔧 Part 1: Function Calling Basics
The Universal Concept: Provider-Agnostic
Important: Function calling is NOT locked to one provider. It works with:
- ✅ OpenAI (GPT-4, GPT-3.5-turbo)
- ✅ Anthropic (Claude)
- ✅ Open-source models (Llama, Mistral, Mixtral)
The core concept is the same across all providers. Only the API syntax differs slightly.
The Protocol: How It Works
Function calling is a four-step exchange between you and the LLM:
Step 1: You provide functions
functions = [
{
"name": "get_weather",
"description": "Get current weather for a city",
"parameters": {
"type": "object",
"properties": {
"city": {"type": "string", "description": "City name"},
"units": {"type": "string", "enum": ["celsius", "fahrenheit"]}
},
"required": ["city"]
}
}
]
Step 2: LLM decides to call function
{
"function_call": {
"name": "get_weather",
"arguments": "{\"city\": \"Tokyo\", \"units\": \"celsius\"}"
}
}
Step 3: You execute the function
result = get_weather(city="Tokyo", units="celsius")
# Returns: {"temp": 22, "condition": "partly cloudy"}
Step 4: You send result back to LLM
# LLM uses the result to generate final response
"The weather in Tokyo is currently 22°C and partly cloudy."
Same Function, Three Providers
Let's implement the exact same weather tool with OpenAI, Anthropic, and Llama to show the pattern is universal.
Provider 1: OpenAI (GPT-4)
import json
from openai import OpenAI
client = OpenAI(api_key="your-openai-key")
# Define function schema (OpenAI format)
functions = [
{
"name": "get_weather",
"description": "Get the current weather for a specific city",
"parameters": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "The city name, e.g., Tokyo, London"
},
"units": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "Temperature units"
}
},
"required": ["city"]
}
}
]
def get_weather(city: str, units: str = "celsius") -> dict:
"""Weather tool implementation"""
weather_data = {
"Tokyo": {"temp": 22, "condition": "partly cloudy"},
"London": {"temp": 15, "condition": "rainy"},
}
return weather_data.get(city, {"temp": 20, "condition": "unknown"})
# Call with OpenAI
response = client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
functions=functions,
function_call="auto" # Let model decide
)
message = response.choices[0].message
if message.function_call:
function_name = message.function_call.name
function_args = json.loads(message.function_call.arguments)
# Execute function
result = get_weather(**function_args)
# Send back to model
final_response = client.chat.completions.create(
model="gpt-4",
messages=[
{"role": "user", "content": "What's the weather in Tokyo?"},
message,
{"role": "function", "name": function_name, "content": json.dumps(result)}
]
)
print(final_response.choices[0].message.content)
Provider 2: Anthropic (Claude)
import json
import anthropic
client = anthropic.Anthropic(api_key="your-anthropic-key")
# Define tools schema (Anthropic format - note: "tools" not "functions")
tools = [
{
"name": "get_weather",
"description": "Get the current weather for a specific city",
"input_schema": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "The city name, e.g., Tokyo, London"
},
"units": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "Temperature units"
}
},
"required": ["city"]
}
}
]
# Same weather function
def get_weather(city: str, units: str = "celsius") -> dict:
"""Weather tool implementation"""
weather_data = {
"Tokyo": {"temp": 22, "condition": "partly cloudy"},
"London": {"temp": 15, "condition": "rainy"},
}
return weather_data.get(city, {"temp": 20, "condition": "unknown"})
# Call with Anthropic
response = client.messages.create(
model="claude-3-5-sonnet-20241022",
max_tokens=1024,
tools=tools, # Note: "tools" not "functions"
messages=[{"role": "user", "content": "What's the weather in Tokyo?"}]
)
# Check if Claude wants to use a tool
if response.stop_reason == "tool_use":
tool_use = next(block for block in response.content if block.type == "tool_use")
tool_name = tool_use.name
tool_input = tool_use.input
print(f"Claude wants to call: {tool_name}")
print(f"With arguments: {tool_input}")
# Execute function
result = get_weather(**tool_input)
# Send result back to Claude
final_response = client.messages.create(
model="claude-3-5-sonnet-20241022",
max_tokens=1024,
tools=tools,
messages=[
{"role": "user", "content": "What's the weather in Tokyo?"},
{"role": "assistant", "content": response.content},
{
"role": "user",
"content": [
{
"type": "tool_result",
"tool_use_id": tool_use.id,
"content": json.dumps(result)
}
]
}
]
)
print(final_response.content[0].text)
Provider 3: Llama (Open-Source via Ollama or vLLM)
# Using Ollama for local Llama models
import requests
import json
import re
# Same weather function
def get_weather(city: str, units: str = "celsius") -> dict:
"""Weather tool implementation"""
weather_data = {
"Tokyo": {"temp": 22, "condition": "partly cloudy"},
"London": {"temp": 15, "condition": "rainy"},
}
return weather_data.get(city, {"temp": 20, "condition": "unknown"})
# Function registry
available_functions = {
"get_weather": get_weather
}
# Define tools in prompt (Llama doesn't have native function calling API)
tools_description = """
You have access to the following tools:
1. get_weather(city: str, units: str = "celsius") -> dict
Description: Get the current weather for a specific city
Parameters:
- city (required): City name like "Tokyo", "London"
- units (optional): "celsius" or "fahrenheit"
To use a tool, respond with a JSON object in this format:
{
"tool": "get_weather",
"arguments": {
"city": "Tokyo",
"units": "celsius"
}
}
After I provide the tool result, give your final answer to the user.
"""
def call_llama(messages: list) -> str:
"""Call Llama via Ollama API"""
response = requests.post(
"http://localhost:11434/api/chat",
json={
"model": "llama3.1", # or llama2, mistral, etc.
"messages": messages,
"stream": False,
"options": {
"temperature": 0.1 # Low temp for structured output
}
}
)
return response.json()["message"]["content"]
# Step 1: Ask Llama
user_query = "What's the weather in Tokyo?"
messages = [
{"role": "system", "content": tools_description},
{"role": "user", "content": user_query}
]
llama_response = call_llama(messages)
print(f"Llama response: {llama_response}")
# Step 2: Parse tool call (extract JSON from response)
try:
# Look for JSON in response
json_match = re.search(r'\{.*\}', llama_response, re.DOTALL)  # greedy match so the nested braces in "arguments" are included
if json_match:
tool_call = json.loads(json_match.group())
tool_name = tool_call.get("tool")
tool_args = tool_call.get("arguments", {})
print(f"\nLlama wants to call: {tool_name}")
print(f"With arguments: {tool_args}")
# Step 3: Execute function
if tool_name in available_functions:
result = available_functions[tool_name](**tool_args)
print(f"Function result: {result}")
# Step 4: Send result back to Llama
messages.append({"role": "assistant", "content": llama_response})
messages.append({
"role": "user",
"content": f"Tool result: {json.dumps(result)}\n\nNow provide your final answer."
})
final_response = call_llama(messages)
print(f"\nFinal answer: {final_response}")
else:
print(f"Unknown tool: {tool_name}")
else:
# Llama didn't call a tool, gave direct answer
print(f"Direct answer (no tool call): {llama_response}")
except json.JSONDecodeError:
print("Could not parse tool call from Llama response")
Alternative: Llama with LiteLLM (Unified Interface)
from litellm import completion
# LiteLLM provides unified interface across providers
def call_with_litellm(model: str, messages: list, tools: list):
"""Universal function calling across any provider"""
response = completion(
model=model, # "gpt-4", "claude-3-5-sonnet", "ollama/llama3.1"
messages=messages,
tools=tools,
tool_choice="auto"
)
return response
# Same tools definition works for all providers!
tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather",
"parameters": {
"type": "object",
"properties": {
"city": {"type": "string"}
},
"required": ["city"]
}
}
}
]
# Works with ANY model
response = call_with_litellm(
model="gpt-4", # or "claude-3-5-sonnet" or "ollama/llama3.1"
messages=[{"role": "user", "content": "Weather in Tokyo?"}],
tools=tools
)
Key Differences Between Providers
| Feature | OpenAI | Anthropic | Llama (Native) |
|---|---|---|---|
| API field name | functions | tools | Prompt-based |
| Schema format | parameters | input_schema | Description in prompt |
| Response parsing | message.function_call | tool_use block | JSON extraction |
| Multi-turn | role: "function" | tool_result | Context management |
| Native support | ✅ Yes | ✅ Yes | ❌ Prompt engineering |
| Reliability | Excellent | Excellent | Good (depends on model) |
Best Practice: Use a unified library like LiteLLM or LangChain to abstract provider differences.
First Complete Example: Weather Tool (Provider-Agnostic)
For production systems, use an abstraction layer. Here's a complete example with LiteLLM that works with any provider:
from litellm import completion
import json
# Define the weather function
def get_weather(city: str, units: str = "celsius") -> dict:
"""
Get current weather for a city.
In production, this would call a real weather API.
"""
weather_data = {
"Tokyo": {"temp": 22, "condition": "partly cloudy"},
"London": {"temp": 15, "condition": "rainy"},
"New York": {"temp": 28, "condition": "sunny"},
}
return weather_data.get(city, {"temp": 20, "condition": "unknown"})
# Define tool schema (works across all providers)
tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather for a specific city",
"parameters": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "The city name, e.g., Tokyo, London, New York"
},
"units": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "Temperature units"
}
},
"required": ["city"]
}
}
}
]
def run_agent(model: str, user_query: str):
"""
Universal function calling agent.
Works with: gpt-4, claude-3-5-sonnet, ollama/llama3.1, etc.
"""
messages = [{"role": "user", "content": user_query}]
# Step 1: Call LLM with tools
response = completion(
model=model,
messages=messages,
tools=tools,
tool_choice="auto"
)
message = response.choices[0].message
# Step 2: Check if LLM wants to call a tool
if message.tool_calls:
tool_call = message.tool_calls[0]
function_name = tool_call.function.name
function_args = json.loads(tool_call.function.arguments)
print(f"🔧 Model: {model}")
print(f"📞 Calling: {function_name}({function_args})")
# Step 3: Execute function
if function_name == "get_weather":
result = get_weather(**function_args)
print(f"✅ Result: {result}")
# Step 4: Send result back to LLM
messages.append(message)
messages.append({
"role": "tool",
"tool_call_id": tool_call.id,
"content": json.dumps(result)
})
# Get final response
final_response = completion(
model=model,
messages=messages
)
return final_response.choices[0].message.content
else:
return message.content
# Test with different providers
print("=" * 60)
print("Testing with GPT-4")
print("=" * 60)
answer1 = run_agent("gpt-4", "What's the weather in Tokyo?")
print(f"Answer: {answer1}\n")
print("=" * 60)
print("Testing with Claude")
print("=" * 60)
answer2 = run_agent("claude-3-5-sonnet-20241022", "What's the weather in London?")
print(f"Answer: {answer2}\n")
print("=" * 60)
print("Testing with Llama (via Ollama)")
print("=" * 60)
answer3 = run_agent("ollama/llama3.1", "What's the weather in New York?")
print(f"Answer: {answer3}\n")
Output:
============================================================
Testing with GPT-4
============================================================
🔧 Model: gpt-4
📞 Calling: get_weather({'city': 'Tokyo', 'units': 'celsius'})
✅ Result: {'temp': 22, 'condition': 'partly cloudy'}
Answer: The weather in Tokyo is currently 22°C and partly cloudy.
============================================================
Testing with Claude
============================================================
🔧 Model: claude-3-5-sonnet-20241022
📞 Calling: get_weather({'city': 'London', 'units': 'celsius'})
✅ Result: {'temp': 15, 'condition': 'rainy'}
Answer: The weather in London is currently 15°C and rainy.
============================================================
Testing with Llama (via Ollama)
============================================================
🔧 Model: ollama/llama3.1
📞 Calling: get_weather({'city': 'New York', 'units': 'celsius'})
✅ Result: {'temp': 28, 'condition': 'sunny'}
Answer: The weather in New York is currently 28°C and sunny.
Installation:
pip install litellm
# For Ollama (local Llama)
# Install from: https://ollama.ai
ollama pull llama3.1
Understanding JSON Schemas
The function schema tells the LLM:
- What the function does (description)
- What parameters it accepts (parameters)
- Which parameters are required (required)
- Valid values (enum, type)
Example: Database Query Function
{
"name": "query_database",
"description": "Execute a SQL query on the sales database",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "SQL SELECT query to execute"
},
"limit": {
"type": "integer",
"description": "Maximum number of rows to return",
"default": 100
},
"timeout_seconds": {
"type": "integer",
"description": "Query timeout in seconds",
"default": 30
}
},
"required": ["query"]
}
}
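One nuance worth knowing: default values in a schema are informational only. The model may simply omit an optional parameter, so the code that executes the call should fill in defaults itself. A minimal sketch (the apply_defaults helper is illustrative, not part of any provider SDK):
def apply_defaults(arguments: dict, schema: dict) -> dict:
    """Fill in missing optional parameters using the schema's declared defaults."""
    merged = dict(arguments)
    for name, spec in schema["parameters"]["properties"].items():
        if name not in merged and "default" in spec:
            merged[name] = spec["default"]
    return merged
# apply_defaults({"query": "SELECT * FROM sales"}, query_database_schema)
# -> {"query": "SELECT * FROM sales", "limit": 100, "timeout_seconds": 30}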
Key JSON Schema Types:
| Type | Description | Example |
|---|---|---|
| string | Text | "Tokyo" |
| integer | Whole number | 42 |
| number | Decimal number | 3.14 |
| boolean | True/False | true |
| array | List of items | ["a", "b", "c"] |
| object | Nested structure | {"key": "value"} |
| enum | Limited choices | ["celsius", "fahrenheit"] |
Type Safety and Validation
Always validate function arguments before execution:
from typing import Literal
from pydantic import BaseModel, validator  # Pydantic v1 style; in v2, prefer field_validator
class WeatherParams(BaseModel):
city: str
units: Literal["celsius", "fahrenheit"] = "celsius"
@validator('city')
def city_must_not_be_empty(cls, v):
if not v or not v.strip():
raise ValueError('City cannot be empty')
return v.strip().title()
def get_weather_safe(params_dict: dict) -> dict:
"""Type-safe weather function with validation"""
try:
# Validate with Pydantic
params = WeatherParams(**params_dict)
# Now safely execute
weather_data = {
"Tokyo": {"temp": 22, "condition": "partly cloudy"},
"London": {"temp": 15, "condition": "rainy"},
"New York": {"temp": 28, "condition": "sunny"},
}
result = weather_data.get(params.city, {"temp": 20, "condition": "unknown"})
# Convert temperature if needed
if params.units == "fahrenheit":
result["temp"] = (result["temp"] * 9/5) + 32
return result
except Exception as e:
return {"error": str(e)}
# Usage
result = get_weather_safe({"city": " tokyo ", "units": "celsius"})
print(result) # {'temp': 22, 'condition': 'partly cloudy'}
🎛️ Part 2: Multi-Tool Orchestration
Real applications need multiple tools. The LLM must choose the right one.
Example: Customer Support Agent
import json
from datetime import datetime
# Reuses the OpenAI `client` from Part 1
# Tool 1: Search knowledge base
def search_knowledge_base(query: str, max_results: int = 5) -> list:
"""Search documentation for relevant articles"""
# Simulated search
knowledge_base = {
"password reset": [
{"title": "How to Reset Password", "content": "Click 'Forgot Password' on login page..."},
{"title": "Password Requirements", "content": "Passwords must be 8+ characters..."}
],
"billing": [
{"title": "Understanding Your Bill", "content": "Your bill includes..."},
{"title": "Payment Methods", "content": "We accept credit cards..."}
],
"shipping": [
{"title": "Shipping Times", "content": "Standard shipping takes 3-5 days..."},
{"title": "Tracking Your Order", "content": "You can track your order..."}
]
}
# Simple keyword matching
results = []
query_lower = query.lower()
for key, articles in knowledge_base.items():
if key in query_lower:
results.extend(articles[:max_results])
return results[:max_results]
# Tool 2: Look up order status
def get_order_status(order_id: str) -> dict:
"""Get current status of an order"""
# Simulated database
orders = {
"ORD-12345": {
"status": "shipped",
"tracking": "TRK789XYZ",
"estimated_delivery": "2024-03-15",
"items": ["Product A", "Product B"]
},
"ORD-67890": {
"status": "processing",
"tracking": None,
"estimated_delivery": "2024-03-20",
"items": ["Product C"]
}
}
return orders.get(order_id, {"error": "Order not found"})
# Tool 3: Create support ticket
def create_support_ticket(
customer_email: str,
issue_category: str,
description: str,
priority: str = "medium"
) -> dict:
"""Create a new support ticket"""
ticket_id = f"TKT-{datetime.now().strftime('%Y%m%d%H%M%S')}"
return {
"ticket_id": ticket_id,
"status": "open",
"created_at": datetime.now().isoformat(),
"message": f"Ticket {ticket_id} created successfully"
}
# Define all functions for LLM
functions = [
{
"name": "search_knowledge_base",
"description": "Search the knowledge base for help articles and documentation",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Search query describing the issue"
},
"max_results": {
"type": "integer",
"description": "Maximum number of results to return",
"default": 5
}
},
"required": ["query"]
}
},
{
"name": "get_order_status",
"description": "Look up the current status of an order by order ID",
"parameters": {
"type": "object",
"properties": {
"order_id": {
"type": "string",
"description": "Order ID in format ORD-XXXXX"
}
},
"required": ["order_id"]
}
},
{
"name": "create_support_ticket",
"description": "Create a new support ticket for issues that need human attention",
"parameters": {
"type": "object",
"properties": {
"customer_email": {
"type": "string",
"description": "Customer's email address"
},
"issue_category": {
"type": "string",
"enum": ["billing", "shipping", "product", "technical", "other"],
"description": "Category of the issue"
},
"description": {
"type": "string",
"description": "Detailed description of the issue"
},
"priority": {
"type": "string",
"enum": ["low", "medium", "high", "urgent"],
"default": "medium"
}
},
"required": ["customer_email", "issue_category", "description"]
}
}
]
# Function dispatcher
def execute_function(function_name: str, arguments: dict) -> dict:
"""Route function calls to appropriate handlers"""
function_map = {
"search_knowledge_base": search_knowledge_base,
"get_order_status": get_order_status,
"create_support_ticket": create_support_ticket
}
if function_name not in function_map:
return {"error": f"Unknown function: {function_name}"}
try:
result = function_map[function_name](**arguments)
return result
except Exception as e:
return {"error": str(e)}
# Test the agent
def run_support_agent(user_message: str):
"""Complete support agent with multi-tool capability"""
messages = [{"role": "user", "content": user_message}]
response = client.chat.completions.create(
model="gpt-4",
messages=messages,
functions=functions,
function_call="auto"
)
message = response.choices[0].message
# Handle function calls
if message.function_call:
function_name = message.function_call.name
function_args = json.loads(message.function_call.arguments)
print(f"🔧 Calling: {function_name}")
print(f"📝 Arguments: {function_args}")
# Execute function
function_result = execute_function(function_name, function_args)
print(f"✅ Result: {function_result}\n")
# Send result back to LLM
messages.append(message)
messages.append({
"role": "function",
"name": function_name,
"content": json.dumps(function_result)
})
# Get final response
second_response = client.chat.completions.create(
model="gpt-4",
messages=messages
)
return second_response.choices[0].message.content
else:
return message.content
# Test cases
print("=" * 60)
print("Test 1: Knowledge base search")
print("=" * 60)
response1 = run_support_agent("How do I reset my password?")
print(f"Agent: {response1}\n")
print("=" * 60)
print("Test 2: Order status lookup")
print("=" * 60)
response2 = run_support_agent("What's the status of order ORD-12345?")
print(f"Agent: {response2}\n")
print("=" * 60)
print("Test 3: Ticket creation")
print("=" * 60)
response3 = run_support_agent(
"I received a damaged product and need a refund. My email is customer@example.com"
)
print(f"Agent: {response3}\n")
Output:
============================================================
Test 1: Knowledge base search
============================================================
🔧 Calling: search_knowledge_base
📝 Arguments: {'query': 'password reset', 'max_results': 5}
✅ Result: [{'title': 'How to Reset Password', 'content': "Click 'Forgot Password' on login page..."}, ...]
Agent: To reset your password, click 'Forgot Password' on the login page...
============================================================
Test 2: Order status lookup
============================================================
🔧 Calling: get_order_status
📝 Arguments: {'order_id': 'ORD-12345'}
✅ Result: {'status': 'shipped', 'tracking': 'TRK789XYZ', ...}
Agent: Your order ORD-12345 has been shipped! You can track it with tracking number TRK789XYZ...
============================================================
Test 3: Ticket creation
============================================================
🔧 Calling: create_support_ticket
📝 Arguments: {'customer_email': 'customer@example.com', 'issue_category': 'product', 'description': 'Received damaged product, requesting refund', 'priority': 'high'}
✅ Result: {'ticket_id': 'TKT-20240312153045', 'status': 'open', ...}
Agent: I've created a support ticket (TKT-20240312153045) for your damaged product issue...
Sequential Function Calls
Sometimes the LLM needs to call multiple functions in sequence:
def run_agent_with_multiple_calls(user_message: str, max_iterations: int = 5):
"""Agent that can make multiple function calls"""
messages = [{"role": "user", "content": user_message}]
iteration = 0
while iteration < max_iterations:
iteration += 1
response = client.chat.completions.create(
model="gpt-4",
messages=messages,
functions=functions,
function_call="auto"
)
message = response.choices[0].message
# If no function call, we're done
if not message.function_call:
return message.content
# Execute function
function_name = message.function_call.name
function_args = json.loads(message.function_call.arguments)
print(f"[Iteration {iteration}] Calling: {function_name}({function_args})")
function_result = execute_function(function_name, function_args)
# Add to conversation
messages.append(message)
messages.append({
"role": "function",
"name": function_name,
"content": json.dumps(function_result)
})
return "Max iterations reached"
# Test: Multi-step query
response = run_agent_with_multiple_calls(
"Check the status of order ORD-12345, and if it's shipped, search for tracking information in the knowledge base"
)
print(f"\nFinal response: {response}")
Output:
[Iteration 1] Calling: get_order_status({'order_id': 'ORD-12345'})
[Iteration 2] Calling: search_knowledge_base({'query': 'tracking order', 'max_results': 3})
Final response: Your order ORD-12345 has been shipped with tracking number TRK789XYZ. To track your order, visit our shipping carrier's website and enter your tracking number...
🛡️ Part 3: Error Handling and Production Patterns
Handling Function Errors
import time
from typing import Optional, Callable
def execute_function_with_retry(
function_name: str,
arguments: dict,
max_retries: int = 3,
backoff_factor: float = 2.0
) -> dict:
"""Execute function with exponential backoff retry"""
function_map = {
"search_knowledge_base": search_knowledge_base,
"get_order_status": get_order_status,
"create_support_ticket": create_support_ticket
}
if function_name not in function_map:
return {"error": f"Unknown function: {function_name}"}
func = function_map[function_name]
for attempt in range(max_retries):
try:
result = func(**arguments)
return {"success": True, "data": result}
except Exception as e:
error_message = str(e)
# Log the error
print(f"❌ Attempt {attempt + 1} failed: {error_message}")
# Last attempt, return error
if attempt == max_retries - 1:
return {
"success": False,
"error": error_message,
"attempts": max_retries
}
# Wait before retry (exponential backoff)
wait_time = backoff_factor ** attempt
print(f"⏳ Retrying in {wait_time}s...")
time.sleep(wait_time)
return {"success": False, "error": "Max retries exceeded"}
# Example with retry logic
result = execute_function_with_retry(
"get_order_status",
{"order_id": "ORD-99999"}, # Non-existent order
max_retries=3
)
print(result)
Validating Function Arguments
from typing import Optional
def validate_function_call(function_name: str, arguments: dict, function_schemas: list) -> tuple[bool, Optional[str]]:
"""Validate that function call matches schema"""
# Find schema
schema = None
for func_schema in function_schemas:
if func_schema["name"] == function_name:
schema = func_schema
break
if not schema:
return False, f"Function {function_name} not found in schemas"
# Check required parameters
required_params = schema["parameters"].get("required", [])
for param in required_params:
if param not in arguments:
return False, f"Missing required parameter: {param}"
# Check parameter types
properties = schema["parameters"]["properties"]
for param, value in arguments.items():
if param not in properties:
return False, f"Unknown parameter: {param}"
expected_type = properties[param]["type"]
# Type checking
if expected_type == "string" and not isinstance(value, str):
return False, f"Parameter {param} must be string, got {type(value).__name__}"
elif expected_type == "integer" and not isinstance(value, int):
return False, f"Parameter {param} must be integer, got {type(value).__name__}"
elif expected_type == "number" and not isinstance(value, (int, float)):
return False, f"Parameter {param} must be number, got {type(value).__name__}"
elif expected_type == "boolean" and not isinstance(value, bool):
return False, f"Parameter {param} must be boolean, got {type(value).__name__}"
# Check enum constraints
if "enum" in properties[param]:
allowed_values = properties[param]["enum"]
if value not in allowed_values:
return False, f"Parameter {param} must be one of {allowed_values}, got {value}"
return True, None
# Test validation
test_cases = [
("get_order_status", {"order_id": "ORD-12345"}), # Valid
("get_order_status", {}), # Missing required
("get_order_status", {"order_id": 123}), # Wrong type
("search_knowledge_base", {"query": "help", "max_results": "five"}), # Wrong type
]
for func_name, args in test_cases:
is_valid, error = validate_function_call(func_name, args, functions)
if is_valid:
print(f"✅ Valid: {func_name}({args})")
else:
print(f"❌ Invalid: {func_name}({args}) - {error}")
Output:
✅ Valid: get_order_status({'order_id': 'ORD-12345'})
❌ Invalid: get_order_status({}) - Missing required parameter: order_id
❌ Invalid: get_order_status({'order_id': 123}) - Parameter order_id must be string, got int
❌ Invalid: search_knowledge_base({'query': 'help', 'max_results': 'five'}) - Parameter max_results must be integer, got str
Rate Limiting and Throttling
import threading
from collections import defaultdict
from datetime import datetime, timedelta
class RateLimiter:
"""Simple rate limiter for function calls"""
def __init__(self, max_calls: int, time_window: int):
self.max_calls = max_calls
self.time_window = time_window # seconds
self.calls = defaultdict(list)
self.lock = threading.Lock()
def is_allowed(self, function_name: str) -> bool:
"""Check if function call is allowed"""
with self.lock:
now = datetime.now()
cutoff = now - timedelta(seconds=self.time_window)
# Remove old calls
self.calls[function_name] = [
call_time for call_time in self.calls[function_name]
if call_time > cutoff
]
# Check if under limit
if len(self.calls[function_name]) < self.max_calls:
self.calls[function_name].append(now)
return True
return False
def wait_time(self, function_name: str) -> float:
"""Calculate wait time until next call is allowed"""
with self.lock:
if len(self.calls[function_name]) < self.max_calls:
return 0
oldest_call = min(self.calls[function_name])
next_available = oldest_call + timedelta(seconds=self.time_window)
wait_seconds = (next_available - datetime.now()).total_seconds()
return max(0, wait_seconds)
# Usage
rate_limiter = RateLimiter(max_calls=5, time_window=60) # 5 calls per minute
def execute_with_rate_limit(function_name: str, arguments: dict) -> dict:
"""Execute function with rate limiting"""
if not rate_limiter.is_allowed(function_name):
wait_time = rate_limiter.wait_time(function_name)
return {
"error": "Rate limit exceeded",
"retry_after": wait_time,
"message": f"Please wait {wait_time:.1f} seconds before calling {function_name} again"
}
# Execute function
return execute_function(function_name, arguments)
# Test rate limiting
print("Testing rate limiter (5 calls per 60 seconds):")
for i in range(7):
result = execute_with_rate_limit("get_order_status", {"order_id": f"ORD-{i}"})
if "error" in result:
print(f"Call {i+1}: ❌ {result['message']}")
else:
print(f"Call {i+1}: ✅ Success")
Complete Production-Ready Agent
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional
import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
@dataclass
class FunctionResult:
"""Structured function execution result"""
success: bool
data: Any = None
error: Optional[str] = None
execution_time: float = 0.0
retries: int = 0
class ProductionAgent:
"""Production-ready LLM agent with error handling, validation, and logging"""
def __init__(
self,
client: OpenAI,
functions: List[Dict],
function_map: Dict[str, Callable],
model: str = "gpt-4",
max_iterations: int = 5,
rate_limiter: Optional[RateLimiter] = None
):
self.client = client
self.functions = functions
self.function_map = function_map
self.model = model
self.max_iterations = max_iterations
self.rate_limiter = rate_limiter
def execute_function(self, function_name: str, arguments: dict) -> FunctionResult:
"""Execute function with full error handling"""
start_time = time.time()
# Validate function exists
if function_name not in self.function_map:
return FunctionResult(
success=False,
error=f"Unknown function: {function_name}"
)
# Validate arguments
is_valid, validation_error = validate_function_call(
function_name, arguments, self.functions
)
if not is_valid:
return FunctionResult(
success=False,
error=f"Validation error: {validation_error}"
)
# Check rate limit
if self.rate_limiter and not self.rate_limiter.is_allowed(function_name):
wait_time = self.rate_limiter.wait_time(function_name)
return FunctionResult(
success=False,
error=f"Rate limit exceeded. Retry after {wait_time:.1f}s"
)
# Execute with retry
result = execute_function_with_retry(function_name, arguments)
execution_time = time.time() - start_time
return FunctionResult(
success=result.get("success", False),
data=result.get("data"),
error=result.get("error"),
execution_time=execution_time,
retries=result.get("attempts", 1) - 1
)
def run(self, user_message: str) -> str:
"""Run agent with complete error handling"""
messages = [{"role": "user", "content": user_message}]
iteration = 0
logger.info(f"Starting agent for query: {user_message}")
while iteration < self.max_iterations:
iteration += 1
logger.info(f"Iteration {iteration}/{self.max_iterations}")
try:
# Call LLM
response = self.client.chat.completions.create(
model=self.model,
messages=messages,
functions=self.functions,
function_call="auto"
)
message = response.choices[0].message
# No function call - we're done
if not message.function_call:
logger.info("No function call, returning final answer")
return message.content
# Extract function call
function_name = message.function_call.name
function_args = json.loads(message.function_call.arguments)
logger.info(f"Calling function: {function_name}")
logger.debug(f"Arguments: {function_args}")
# Execute function
result = self.execute_function(function_name, function_args)
if not result.success:
logger.error(f"Function execution failed: {result.error}")
# Return error to user
return f"I encountered an error: {result.error}"
logger.info(f"Function executed successfully in {result.execution_time:.2f}s")
# Add to conversation
messages.append(message)
messages.append({
"role": "function",
"name": function_name,
"content": json.dumps(result.data)
})
except Exception as e:
logger.exception("Unexpected error in agent loop")
return f"I encountered an unexpected error: {str(e)}"
logger.warning("Max iterations reached")
return "I've reached my maximum number of steps. Please try rephrasing your question."
# Usage
agent = ProductionAgent(
client=client,
functions=functions,
function_map={
"search_knowledge_base": search_knowledge_base,
"get_order_status": get_order_status,
"create_support_ticket": create_support_ticket
},
model="gpt-4",
max_iterations=5,
rate_limiter=RateLimiter(max_calls=10, time_window=60)
)
response = agent.run("What's the status of order ORD-12345?")
print(response)
🎯 Conclusion: From Chatbots to Agents
Function calling transforms LLMs from passive text generators into active agents that can:
- Query databases
- Call APIs
- Execute business logic
- Orchestrate multi-step workflows
- Take real actions in the world
The Business Impact:
💰 Cost:
- Function calling is cheaper than prompt engineering complex instructions
- One function call vs 500 tokens of examples
- Reusable tools across conversations
📊 Quality:
- Structured outputs (JSON) vs parsing natural language
- Type safety and validation
- Deterministic tool execution
⚡ Performance:
- Direct API calls vs prompt-based "simulation"
- Real-time data access
- Composable, maintainable tools
Key Takeaways for Data Engineers
On Provider Independence:
- Function calling is a universal pattern, NOT vendor-specific
- OpenAI uses "functions", Anthropic uses "tools", Llama needs prompting
- Use abstraction layers (LiteLLM, LangChain) for flexibility
- Action: Design your system to swap providers without code rewrites
- ROI Impact: Avoid vendor lock-in, negotiate better pricing, use best model for each task
On Function Calling Basics:
- Define clear JSON schemas with descriptions
- Use type hints and validation (Pydantic)
- Test function calls independently before integrating with the LLM (see the quick test sketch after this list)
- Action: Start with 1-2 simple functions, expand gradually
- ROI Impact: Structured outputs eliminate parsing errors
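A quick illustration of that testing advice — exercise the tool directly, with no LLM in the loop, using the get_weather stub from Part 1 (run with pytest; the file name is up to you):
def test_get_weather_known_city():
    assert get_weather("Tokyo") == {"temp": 22, "condition": "partly cloudy"}

def test_get_weather_unknown_city_falls_back():
    assert get_weather("Atlantis")["condition"] == "unknown"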
On Multi-Tool Orchestration:
- LLM automatically chooses the right tool for the task
- Sequential function calls enable complex workflows
- Function dispatcher pattern keeps code clean
- Action: Create a function registry/dispatcher early
- ROI Impact: One agent handles multiple use cases
On Production Patterns:
- Always validate function arguments before execution
- Implement retry logic with exponential backoff
- Add rate limiting to protect backend services
- Log everything for debugging
- Action: Use the ProductionAgent pattern as a starting point
- ROI Impact: Prevents cascading failures in production
Real-World Example: Customer Support Automation
Before function calling:
- Manual knowledge base searches
- Email/ticket system for every query
- 10-15 minute response time
- High support costs
After function calling:
- Instant knowledge base lookup
- Automated order status checks
- Tickets only for complex issues requiring humans
- 30-second response time
- 70% reduction in support tickets
This is the bridge from theory to production. You now understand how to build LLM agents that don't just talk—they act.
🛠️ Setup & Installation
Required Libraries
# Core (choose based on your provider)
pip install openai # For OpenAI (GPT)
pip install anthropic # For Anthropic (Claude)
pip install requests # For Ollama/local models
# Recommended: Universal abstraction layer
pip install litellm # Works with ALL providers
# Type safety & validation
pip install pydantic
# Optional: Alternative abstraction layers
pip install langchain # Full framework
pip install llama-index # For RAG applications
Provider Setup
OpenAI:
from openai import OpenAI
client = OpenAI(api_key="sk-...") # Get key from platform.openai.com
Anthropic:
import anthropic
client = anthropic.Anthropic(api_key="sk-ant-...") # Get key from console.anthropic.com
Llama (via Ollama - runs locally):
# Install Ollama: https://ollama.ai
curl -fsSL https://ollama.ai/install.sh | sh
# Pull a model
ollama pull llama3.1
# Run model server (starts on localhost:11434)
ollama serve
LiteLLM (Universal - Recommended):
# Set API keys as environment variables (in your shell, not in Python):
#   export OPENAI_API_KEY="sk-..."
#   export ANTHROPIC_API_KEY="sk-ant-..."
from litellm import completion
# Now works with any model!
response = completion(
model="gpt-4", # or "claude-3-5-sonnet" or "ollama/llama3.1"
messages=[{"role": "user", "content": "Hello"}],
tools=[...]
)
Best Practice: Provider-Agnostic Architecture
Don't lock yourself into one provider:
class LLMProvider:
"""Abstract base for any LLM provider"""
def call_with_tools(self, messages: list, tools: list) -> dict:
raise NotImplementedError
class OpenAIProvider(LLMProvider):
def __init__(self, api_key: str):
self.client = OpenAI(api_key=api_key)
def call_with_tools(self, messages: list, tools: list) -> dict:
# OpenAI-specific implementation
pass
class AnthropicProvider(LLMProvider):
def __init__(self, api_key: str):
self.client = anthropic.Anthropic(api_key=api_key)
def call_with_tools(self, messages: list, tools: list) -> dict:
# Anthropic-specific implementation
pass
# Your application code stays the same!
provider = OpenAIProvider(api_key="...") # or AnthropicProvider(...)
result = provider.call_with_tools(messages, tools)
This way, switching providers is just changing one line of configuration, not rewriting your entire codebase.
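For example, a small factory (hypothetical names; the config dict is assumed) lets the provider come from configuration:
def make_provider(name: str, api_key: str) -> LLMProvider:
    """Pick a provider implementation from a config value."""
    providers = {
        "openai": OpenAIProvider,
        "anthropic": AnthropicProvider,
    }
    return providers[name](api_key)

# provider = make_provider(config["llm_provider"], config["api_key"])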
Ready to build your first agent? Start with one function and expand from there.
Tags: #DataEngineering #LLM #FunctionCalling #Python #AIAgents #ProductionAI #Automation