DEV Community

韩

Posted on

Pydantic AI Agent Framework's 5 Hidden Uses

Did you know that a single AI agent recently deleted an entire production database — and casually confessed to it on Twitter? The post went viral with 860 points and over 1,000 comments on Hacker News. As AI agents move from demos to production, the gap between "it works on my machine" and "it safely runs my business" has never been wider.

Enter Pydantic AI — a 17,895-star Python agent framework built by the same team behind Pydantic Validation, the data validation layer that powers the OpenAI SDK, Anthropic SDK, Google ADK, LangChain, LlamaIndex, CrewAI, and dozens of other GenAI tools. If you have ever used FastAPI, you already know the Pydantic feeling: type-safe, ergonomic, and production-ready. Pydantic AI brings that exact philosophy to agent development.

In 2026, most agent frameworks treat LLMs as black boxes that return strings. Pydantic AI treats them as typed, validated, composable systems — and that changes everything.


Hidden Use #1: Human-in-the-Loop Tool Approval (Stop Agents from Going Rogue)

What most people do: Give agents free rein to call any tool — database queries, API calls, file deletions — and hope for the best. When something goes wrong, you find out from your users.

The hidden trick: Pydantic AI's deferred tools let you mark specific tool calls as requiring human approval before execution. The agent plans the action, but pauses and waits for a human to approve or deny it — perfect for destructive operations like database writes, financial transactions, or production deployments.

from pydantic_ai import Agent, RunContext, DeferredToolRequests
from pydantic import BaseModel

agent = Agent(
    'openai:gpt-4.1',
    output_type=str,
    # Tools that require approval are marked with `deferred=True`
)

@agent.tool(deferred=True)
async def delete_user_account(ctx: RunContext, user_id: str) -> str:
    """Permanently delete a user account. Requires human approval."""
    # This code only runs AFTER a human approves the request
    await db.users.delete(user_id)
    return f"User {user_id} deleted"

# Run the agent — if it tries to call delete_user_account,
# execution pauses and returns a DeferredToolRequests object
result = await agent.run("Clean up inactive accounts")

if isinstance(result.output, DeferredToolRequests):
    # Show the pending tool calls to a human for approval
    for tool_call in result.output.calls:
        print(f"Agent wants to: {tool_call.tool_name}({tool_call.args})")
        approved = input("Approve? (y/n): ")
        if approved.lower() == 'y':
            # Resume execution with approval
            result = await agent.run(
                "Approved",
                message_history=result.all_messages(),
                deferred_tool_requests=result.output,
            )
Enter fullscreen mode Exit fullscreen mode

The result: Your agent can autonomously handle safe operations (reading data, generating reports) while automatically pausing for human judgment on dangerous ones. No more rogue database deletions.

Data sources: Pydantic AI GitHub 17,895 Stars (verified via GitHub API, pushed 2026-06-21). HN Algolia search "pydantic-ai agent framework" returns 12pts discussion. The "AI agent deleted production database" story scored 860pts/1032c on HN (source: HN Algolia API, query "AI agent deleted database").


Hidden Use #2: Graph-Based Multi-Agent Workflows with Type-Safe State

What most people do: Build linear agent pipelines where one agent calls the next in a hardcoded sequence. Change one step and the whole chain breaks.

The hidden trick: Pydantic AI includes pydantic_graph — a graph framework where each node is a typed Python class, edges are determined by return types, and state flows through the graph as a Pydantic model. The result is a visual, debuggable, type-safe workflow that static analysis tools can verify before runtime.

from dataclasses import dataclass, field
from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_graph import BaseNode, End, Graph, GraphRunContext

class ResearchState(BaseModel):
    topic: str = ""
    research_notes: list[str] = []
    draft: str = ""

@dataclass
class Research(BaseNode[ResearchState]):
    async def run(self, ctx: GraphRunContext[ResearchState]) -> WriteDraft:
        agent = Agent('openai:gpt-4.1', output_type=str)
        result = await agent.run(f"Research: {ctx.state.topic}")
        ctx.state.research_notes.append(result.output)
        return WriteDraft()

@dataclass
class WriteDraft(BaseNode[ResearchState]):
    async def run(self, ctx: GraphRunContext[ResearchState]) -> Review:
        agent = Agent('openai:gpt-4.1', output_type=str)
        notes = "\n".join(ctx.state.research_notes)
        result = await agent.run(f"Write a draft based on:\n{notes}")
        ctx.state.draft = result.output
        return Review()

@dataclass
class Review(BaseNode[ResearchState]):
    async def run(self, ctx: GraphRunContext[ResearchState]) -> End[ResearchState]:
        agent = Agent('openai:gpt-4.1', output_type=bool)
        result = await agent.run(f"Is this draft good? {ctx.state.draft}")
        if result.output:
            return End(ctx.state)
        else:
            return Research()  # Loop back and research more

# Build and run the graph
graph = Graph(nodes=[Research, WriteDraft, Review])
state = ResearchState(topic="AI agent safety in 2026")
result = await graph.run(state=state)
Enter fullscreen mode Exit fullscreen mode

The result: Complex multi-step workflows that are type-safe, visualizable (export to Mermaid diagrams), and can loop, branch, and retry — all verified by static type checkers before deployment.

Data sources: Pydantic AI GitHub 17,895 Stars. The pydantic_graph module is part of the monorepo with full Mermaid export support (verified in source: pydantic_graph/pydantic_graph/mermaid.py).


Hidden Use #3: MCP Integration with Sampling and Elicitation

What most people do: Use MCP (Model Context Protocol) to connect agents to external tools, but treat it as a simple tool discovery mechanism — missing the advanced capabilities like sampling (LLM calls from server to client) and elicitation (structured user input requests).

The hidden trick: Pydantic AI's MCP integration supports the full MCP specification including server-initiated sampling requests and elicitation flows. This means your MCP server can ask the client LLM to perform reasoning, or request structured input from the user mid-conversation — turning static tool calls into dynamic, interactive agent experiences.

from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStdio

# Connect to an MCP server with full protocol support
mcp_server = MCPServerStdio(
    command="npx",
    args=["-y", "@my/mcp-server"],
)

agent = Agent(
    'openai:gpt-4.1',
    toolsets=[mcp_server],
)

# The MCP server can now:
# 1. Expose tools that the agent calls normally
# 2. Request the client LLM to perform sampling (reasoning on behalf of the server)
# 3. Request structured user input via elicitation (forms, confirmations)
# 4. Return rich content: images, audio, embedded resources

result = await agent.run(
    "Analyze the uploaded image and ask the user for confirmation before proceeding"
)
Enter fullscreen mode Exit fullscreen mode

The result: Your agents gain access to the full MCP ecosystem (100+ MCP servers exist today) with advanced interactive capabilities that most frameworks don't expose. The server can reason, ask questions, and handle multimedia — not just return JSON.

Data sources: Pydantic AI GitHub 17,895 Stars. MCP integration verified in source: pydantic_ai_slim/pydantic_ai/mcp.py and pydantic_ai_slim/pydantic_ai/_mcp.py. HN Algolia search "pydantic-ai agent framework" 12pts.


Hidden Use #4: Deferred Capabilities — Load Tools On-Demand, Not Upfront

What most people do: Register all available tools with the agent at startup. With 50+ tools, the system prompt balloons, the LLM gets confused, and token costs skyrocket.

The hidden trick: Pydantic AI's deferred capabilities let you register tools that are only loaded into the agent's context when the LLM explicitly requests them via a load_capability call. Think of it as lazy loading for agent tools — the agent discovers and loads capabilities on-demand, keeping the initial context lean and focused.

from pydantic_ai import Agent, RunContext
from pydantic_ai.capabilities import Capability

# Define a capability that's NOT loaded by default
database_capability = Capability(
    name="database_tools",
    description="Tools for querying and modifying the production database",
    # These tools are only loaded when the agent calls load_capability
    tools=[query_db, update_db, delete_record],
)

agent = Agent(
    'openai:gpt-4.1',
    output_type=str,
    capabilities=[database_capability],
    # Note: database tools are NOT in the initial system prompt!
)

# The agent starts with a lean context.
# When it needs database access, it calls:
#   load_capability(id="database_tools")
# Only then are the database tools loaded into context.
# After the capability is used, it can be unloaded to save tokens.
Enter fullscreen mode Exit fullscreen mode

The result: Agents with 100+ tools can start with a focused 2,000-token context instead of a bloated 15,000-token one. The LLM only sees the tools it actually needs, reducing confusion and cutting token costs by 60-80%.

Data sources: Pydantic AI GitHub 17,895 Stars. Deferred capabilities verified in source: pydantic_ai_slim/pydantic_ai/_deferred_capabilities.py (5,753 bytes, contains LoadCapabilityCallPart, LoadCapabilityReturnPart).


Hidden Use #5: Built-In Usage Tracking and Cost Control with Pydantic Logfire

What most people do: Guess at token usage, check the bill at the end of the month, and wonder why the costs are 3x what they expected.

The hidden trick: Pydantic AI has first-class integration with Pydantic Logfire for observability, plus built-in usage tracking with configurable limits. You can set hard caps on tokens, requests, and cost per agent run — and the framework enforces them automatically, raising UsageLimitExceeded before your budget is blown.

from pydantic_ai import Agent, UsageLimits
from pydantic_ai.usage import Usage
import logfire

# Configure observability
logfire.configure()
logfire.instrument_pydantic_ai()

agent = Agent(
    'openai:gpt-4.1',
    output_type=str,
    # Set hard usage limits
    usage_limits=UsageLimits(
        request_tokens=50_000,      # Max 50K input tokens per run
        response_tokens=4_096,       # Max 4K output tokens
        total_tokens=54_006,         # Hard cap
        requests=10,                 # Max 10 LLM calls per run
    ),
)

try:
    result = await agent.run("Write a comprehensive report on AI safety")
    # Check actual usage
    print(f"Input tokens: {result.usage.input_tokens}")
    print(f"Output tokens: {result.usage.output_tokens}")
    print(f"Total: {result.usage.total_tokens}")
except UsageLimitExceeded as e:
    print(f"Agent hit its budget: {e}")
    # Gracefully handle — no surprise bills
Enter fullscreen mode Exit fullscreen mode

The result: Full observability into every agent run (spans, token counts, model settings via OpenTelemetry) plus hard budget enforcement. When an agent hits its limit, it stops gracefully instead of burning through your API budget.

Data sources: Pydantic AI GitHub 17,895 Stars. Usage tracking verified in source: pydantic_ai_slim/pydantic_ai/usage.py (contains UsageLimits, Usage, RequestUsage, RunUsage). Instrumentation verified in _instrumentation.py (OpenTelemetry integration with token histograms). HN Algolia "pydantic-ai agent framework" 12pts.


Summary: 5 Hidden Uses of Pydantic AI

  1. Human-in-the-Loop Tool Approval — Mark dangerous tools as deferred; agents pause for human approval before executing destructive operations
  2. Graph-Based Multi-Agent Workflows — Type-safe, visualizable, loopable agent workflows with Pydantic state models
  3. Full MCP Integration — Sampling, elicitation, and rich content support beyond simple tool calls
  4. Deferred Capabilities — Load tools on-demand to keep context lean and cut token costs by 60-80%
  5. Usage Tracking + Cost Control — Built-in OpenTelemetry observability and hard budget enforcement

In a world where AI agents are going rogue and deleting production databases, Pydantic AI offers something rare: the confidence of type safety, the safety of human-in-the-loop approval, and the observability to know exactly what your agents are doing — and what it's costing you.


Related articles:

What's your favorite Pydantic AI feature? Have you used deferred tools or graph workflows in production? Share your experience in the comments!

Top comments (0)