Anthropic released Advanced Tool Use features for Claude on November 20, 2025, addressing one of the most significant challenges in AI agent development: context pollution from large tool ecosystems. As applications integrate more MCP (Model Context Protocol) servers - databases, APIs, file systems, web scrapers - the token cost of loading all tool definitions into every request becomes prohibitive. A typical application with 50 tools might consume 150K tokens per request just describing available tools, burning through 75% of Claude's 200K context window before any actual work begins.
Advanced Tool Use introduces three features that transform tool management: the Tool Search Tool with defer_loading for dynamic tool discovery, Programmatic Tool Calling for code-based orchestration, and Tool Use Examples for schema-validated inputs. Combined with code-first MCP patterns that auto-generate schemas from TypeScript or Python type annotations, these features achieve 85-98% efficiency improvements while maintaining full tool access.
Key Takeaways
- 85% Token Reduction with Tool Search Tool: Advanced Tool Use with defer_loading cuts prompt token usage by 85% or more while maintaining full tool access; the 50-tool worked example below drops from 150K to 17K tokens per request.
- Programmatic Tool Calling for Complex Orchestration: Claude writes Python code to orchestrate multiple tools, reducing context pollution and enabling precise control flow with loops, conditionals, and error handling.
- Code-First MCP Patterns: Auto-generate JSON schemas from TypeScript/Python type annotations, achieving 98.7% efficiency gains and eliminating manual schema maintenance.
- June 2025 Security Spec Compliance: MCP deployments require OAuth 2.1, mandatory PKCE, and RFC 9728 Protected Resource Metadata for production readiness.
Technical Specifications
- Release: November 20, 2025
- Status: Beta
- Models: Claude Opus 4.5, Sonnet 4.5
- Token Savings: Up to 85%
- Accuracy: 49% to 74% (Opus 4 with Tool Search)
- Protocol: JSON-RPC 2.0
- Context: 200K / 500K (Enterprise) / 1M (Beta)
- Transports: stdio, HTTP, SSE
- Beta Header: anthropic-beta: advanced-tool-use-2025-11-20
Advanced Tool Use requires the beta header in all API requests. Features may change before general availability. The Tool Search Tool is supported on Anthropic API, Azure, Vertex AI, and Amazon Bedrock (invoke API only).
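As a concrete starting point, here is a minimal request sketch using the Python SDK. The beta header value comes from the spec above, but the tool-search tool type string and the shape of the defer_loading field are written as this article describes them - treat those exact identifiers as assumptions and verify them against the official beta docs.

```python
# Minimal sketch: a request carrying the Advanced Tool Use beta header.
# The tool-search type string and defer_loading field follow this article's
# description and are assumptions; confirm against Anthropic's docs.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.beta.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    extra_headers={"anthropic-beta": "advanced-tool-use-2025-11-20"},
    tools=[
        # Hypothetical tool-search entry; the real type string may differ.
        {"type": "tool_search_tool_20251119", "name": "tool_search_tool"},
        {
            "name": "query_database",
            "description": "Execute SQL queries against the database.",
            "defer_loading": True,  # discoverable on demand, not loaded upfront
            "input_schema": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    ],
    messages=[{"role": "user", "content": "How many active leads do we have?"}],
)
print(response.content)
```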
Advanced Tool Use: Three Features That Change Everything
Tool Search Tool
Dynamic tool discovery with defer_loading:
- 85% token reduction
- BM25 + regex search
- 3-5 tools per search
Programmatic Calling
Code-based tool orchestration:
- Python execution
- Reduced context pollution
- Loops and conditionals
Tool Use Examples
Schema-validated input patterns:
- input_examples field
- Complex nested objects
- Format-sensitive inputs
Tool Search Tool: Dynamic Discovery with defer_loading
The Tool Search Tool fundamentally changes how Claude interacts with large tool sets by introducing a two-phase execution model. In standard tool use, you provide all tool definitions upfront in the initial API request. Claude analyzes the user's query, selects appropriate tools from those provided, and executes them. This works well for applications with 3-5 tools but becomes inefficient with 20+ tools, as tool definitions consume significant prompt tokens that count against context limits and increase API costs.
Tool Definition with defer_loading
```json
{
  "name": "query_database",
  "description": "Execute SQL queries against the database. Use for retrieving, filtering, and aggregating data from any table.",
  "defer_loading": true,
  "input_schema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The SQL query to execute"
      },
      "limit": {
        "type": "integer",
        "description": "Maximum rows to return (default: 100)"
      }
    },
    "required": ["query"]
  }
}
```
With defer_loading: true, tools aren't loaded into Claude's context initially - they're only loaded when Claude discovers them via Tool Search. The Tool Search Tool contains lightweight metadata about all available tools (names, brief descriptions, categories), consuming approximately 2K tokens regardless of how many tools you have. When Claude receives a user query, it first calls the Tool Search Tool with a natural language description of what it needs.
Search Variants: BM25 vs Regex
| Search Variant | How It Works | Best For |
|---|---|---|
| BM25 | Keyword relevance ranking of natural language queries using the BM25 algorithm | Most use cases (default) |
| Regex | Pattern-based exact matching with regex patterns | Known tool names, debugging, specific lookups |
Keep your 3-5 most frequently used tools always loaded (without defer_loading), and defer the rest. This balances immediate access for common operations with on-demand discovery for everything else.
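In code, that hybrid split can be as simple as flagging everything outside a small always-loaded set. A minimal sketch, assuming tool definitions are plain dicts carrying the defer_loading field shown above; the hot-tool allowlist is illustrative and would normally come from your usage analytics:

```python
# Hybrid loading sketch: keep hot tools always loaded, defer the rest.
# ALWAYS_LOADED is an assumed allowlist derived from usage analytics.
ALWAYS_LOADED = {"query_database", "read_file", "send_email"}

def apply_defer_loading(tool_defs: list[dict]) -> list[dict]:
    """Mark every tool outside the hot set as discoverable on demand."""
    return [
        {**tool, "defer_loading": tool["name"] not in ALWAYS_LOADED}
        for tool in tool_defs
    ]
```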
Token Savings Breakdown
Before: Standard Tool Use
- 50 tools x 3K tokens each
- = 150,000 tokens per request
- 75% of 200K context consumed
After: Tool Search Tool
- Tool Search (2K) + 5 tools (15K)
- = 17,000 tokens per request
- 8.5% of 200K context consumed
Result: 89% token reduction - from 150K to 17K tokens per request
Programmatic Tool Calling: Code-Based Orchestration
Programmatic Tool Calling enables Claude to orchestrate tools through Python code rather than individual API round-trips. Instead of Claude requesting tools one at a time with each result being returned to its context, Claude writes code that calls multiple tools, processes their outputs, and controls what information enters its context window. This eliminates context pollution from intermediate results that consume token budgets without providing value.
Benefits
- Precise Control Flow: Loops, conditionals, and error handling are explicit in code rather than implicit in Claude's reasoning.
- Reduced Context Pollution: Process 1,000 records in code but only pass the top 10 results into Claude's context window.
- Fewer API Round-trips: Multiple tool calls execute in a single code block instead of sequential API requests.
Example: Python Orchestration
```python
# Claude writes this code to orchestrate multiple tools
def process_sales_pipeline():
    # Fetch 1,000 records - only code sees all of them
    raw_leads = fetch_database_records(
        query="SELECT * FROM leads WHERE status = 'active'",
        limit=1000
    )

    # Process in code - filter, transform, aggregate
    qualified = [
        lead for lead in raw_leads
        if lead['score'] > 80 and lead['last_contact_days'] < 30
    ]

    # Enrich only the top candidates
    for lead in qualified[:10]:
        lead['company_info'] = fetch_company_data(lead['domain'])

    # Only the final 10 enriched leads enter Claude's context
    return {
        'total_processed': len(raw_leads),
        'qualified_count': len(qualified),
        'top_leads': qualified[:10]
    }
```
Programmatic Tool Calling excels when you need to process large datasets, apply complex transformations, or coordinate multiple tools in sequence. It's particularly valuable for data pipelines, batch operations, and scenarios where intermediate results shouldn't pollute Claude's context.
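Enabling this mode is a request-level change. The sketch below follows the pattern the beta announcement describes - opt individual tools into code-based invocation alongside a code execution tool - but the exact type string and the allowed_callers field name are assumptions here; confirm them against the current beta documentation before relying on them.

```python
# Hypothetical sketch of opting a tool into programmatic (code-based) calling.
# The "code_execution" type string and "allowed_callers" field are assumed
# from the beta announcement; verify exact names before use.
tools = [
    {"type": "code_execution_20250825", "name": "code_execution"},
    {
        "name": "fetch_database_records",
        "description": "Fetch rows from the leads database.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "limit": {"type": "integer"},
            },
            "required": ["query"],
        },
        # Let Claude call this tool from inside generated Python code
        # instead of through individual API round-trips.
        "allowed_callers": ["code_execution_20250825"],
    },
]
```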
Tool Use Examples: Schema-Validated Input Patterns
For tools with complex inputs, nested objects, or format-sensitive parameters, the input_examples field provides schema-validated examples that help Claude understand how to use your tools effectively. While JSON schemas define what's structurally valid, they can't express usage patterns: when to include optional parameters, which combinations make sense, or what conventions your API expects.
Tool Definition with input_examples
```json
{
  "name": "create_report",
  "description": "Generate a formatted report from data sources",
  "input_schema": {
    "type": "object",
    "properties": {
      "data_source": { "type": "string" },
      "format": { "enum": ["pdf", "xlsx", "html"] },
      "filters": { "type": "object" },
      "include_charts": { "type": "boolean" }
    },
    "required": ["data_source", "format"]
  },
  "input_examples": [
    {
      "data_source": "sales_2025",
      "format": "pdf",
      "filters": { "region": "EMEA", "quarter": "Q4" },
      "include_charts": true
    },
    {
      "data_source": "inventory_current",
      "format": "xlsx",
      "filters": { "warehouse": "EU-West", "status": "low_stock" }
    }
  ]
}
```
Tool Use Examples is most valuable for complex tools with nested objects, multiple optional parameters, or format-sensitive inputs where JSON schema alone doesn't convey usage patterns. Focus examples on common use cases and non-obvious parameter combinations.
MCP vs Function Calling: When to Use Each
Function calling and MCP serve complementary purposes in AI agent development. Understanding when to use each approach helps you make the right architectural decisions for your application's needs.
| Aspect | Function Calling | MCP (Model Context Protocol) |
|---|---|---|
| Architecture | Embedded in API requests | Client-server separation |
| State | Stateless (each call independent) | Persistent connections |
| Reusability | Per-application | Cross-application |
| Tool Updates | Requires code deployment | Runtime dynamic updates |
| Setup Complexity | Low | Medium-High |
| Best For | Simple integrations, prototypes | Large tool sets, enterprise, multi-app |
Choose Function Calling When
- Building quick prototypes
- Using 5 or fewer tools
- Single-application use case
- No cross-platform needs
Choose MCP When
- Managing 10+ tools
- Reusing tools across applications
- Need dynamic tool updates at runtime
- Enterprise deployments
Code-First MCP Pattern Implementation
Code-first MCP patterns eliminate the traditional separation between code implementation and schema definition by generating schemas automatically from typed function signatures. Instead of writing a tool function and then separately maintaining a JSON schema that describes its parameters, you write a normal TypeScript or Python function with type annotations, and the MCP SDK automatically generates the schema Claude needs.
TypeScript Code-First Pattern
```typescript
import { tool } from '@anthropic-ai/sdk';

interface QueryDatabaseParams {
  /** The SQL query to execute against the database */
  query: string;
  /** Maximum rows to return (default: 100) */
  limit?: number;
  /** Timeout in milliseconds */
  timeout?: number;
}

// Schema generated automatically from types and JSDoc
export const queryDatabase = tool<QueryDatabaseParams>({
  name: 'query_database',
  description: 'Execute SQL queries against the production database',
  handler: async ({ query, limit = 100, timeout = 5000 }) => {
    const results = await db.query(query, { limit, timeout });
    return { success: true, rowCount: results.length, rows: results };
  }
});
```
Python Code-First Pattern
```python
from anthropic import tool
from pydantic import BaseModel, Field

class QueryDatabaseParams(BaseModel):
    query: str = Field(description="The SQL query to execute")
    limit: int = Field(default=100, description="Maximum rows to return")
    timeout: int = Field(default=5000, description="Timeout in milliseconds")

@tool
async def query_database(params: QueryDatabaseParams) -> dict:
    """Execute SQL queries against the production database.

    Use this tool when you need to retrieve, filter, or aggregate
    data from any database table. Supports SELECT queries only.
    """
    results = await db.query(params.query, limit=params.limit)
    return {"success": True, "row_count": len(results), "rows": results}
```
Benefits of Code-First Patterns
- Auto-Sync Schemas: Schemas stay synchronized with code automatically. No risk of schema drift where documentation doesn't match implementation.
- Compile-Time Validation: TypeScript/Python type checking catches parameter validation errors during development, not at runtime.
- Safe Refactoring: Rename parameters using IDE refactoring tools - both implementation and schema update together.
Model Compatibility: Opus 4.5 vs Sonnet 4.5
| Feature | Claude Opus 4.5 | Claude Sonnet 4.5 |
|---|---|---|
| Tool Search Tool | Supported | Supported |
| Programmatic Tool Calling | Supported | Supported |
| Tool Use Examples | Supported | Supported |
| Accuracy with Tool Search | 74%, up from 49% (published figure, measured on Opus 4) | Not published |
| Context Window | 200K tokens | 200K (500K Enterprise) |
| Input Pricing | $5.00 per 1M tokens | $3.00 per 1M tokens |
| Output Pricing | $25.00 per 1M tokens | $15.00 per 1M tokens |
Choose Opus 4.5 When
- Maximum tool selection accuracy needed
- Complex multi-step tool orchestration
- Budget allows for premium pricing
- Mission-critical tool decisions
Choose Sonnet 4.5 When
- Cost optimization is priority
- High-volume tool usage
- Standard accuracy is sufficient
- Production workloads at scale
Cost Optimization: Pricing and Savings Analysis
| Model | Input (per 1M) | Output (per 1M) | Cached Input | Batch API |
|---|---|---|---|---|
| Claude Opus 4.5 | $5.00 | $25.00 | $0.50 (90% off) | $2.50 (50% off) |
| Claude Sonnet 4.5 | $3.00 | $15.00 | $0.30 (90% off) | $1.50 (50% off) |
Monthly Savings Example: 50-Tool Application
Using Claude Sonnet 4.5 with 10,000 requests/day:
Before: Standard Tool Use
- 50 tools x 3K tokens = 150K tokens/request
- 10,000 requests/day x 30 days = 300,000 requests
- 150K x 300K x $3/1M input tokens
- = $135,000/month (tool definitions only)
After: Tool Search Tool
- Tool Search (2K) + 5 tools (15K) = 17K tokens
- 300,000 requests x 17K tokens
- 17K x 300K x $3/1M input tokens
- = $15,300/month (89% reduction)
Monthly Savings: $119,700 | Annual Savings: $1,436,400
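The arithmetic behind those figures, as a quick sanity check - the prices and token counts are the worked example's assumptions:

```python
# Reproduce the monthly cost math for the 50-tool example above.
PRICE_PER_M_INPUT = 3.00          # Sonnet 4.5 input, USD per million tokens
REQUESTS_PER_MONTH = 10_000 * 30  # 10K requests/day

def monthly_cost(tokens_per_request: int) -> float:
    total_tokens = tokens_per_request * REQUESTS_PER_MONTH
    return total_tokens / 1_000_000 * PRICE_PER_M_INPUT

before = monthly_cost(150_000)  # 50 tools x 3K tokens each
after = monthly_cost(17_000)    # tool search (2K) + 5 tools (15K)
print(f"before=${before:,.0f} after=${after:,.0f} savings=${before - after:,.0f}")
# before=$135,000 after=$15,300 savings=$119,700
```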
Stacked Savings Potential
- Tool Search: 85% token reduction
- Prompt Caching: 90% on cached portions
- Batch API: 50% for non-urgent tasks
- Combined: Up to 97% maximum potential
MCP Security: June 2025 Specification Requirements
The June 2025 MCP specification introduced significant security requirements that affect all production deployments. These changes align MCP with enterprise security standards and address vulnerabilities identified in early implementations.
| Requirement | Specification | Status |
|---|---|---|
| OAuth 2.1 | MCP servers as resource servers only (separate auth servers) | Mandatory |
| PKCE | Proof Key for Code Exchange for all public clients | Mandatory |
| RFC 9728 | Protected Resource Metadata with WWW-Authenticate | Mandatory |
| RFC 8707 | Resource Indicators for audience-scoped tokens | Recommended |
| DPoP / mTLS | Token binding for enhanced security | Recommended |
| Session Security | Secure random IDs, no session-based auth | Mandatory |
Security Notice: MCP servers MUST NOT use sessions for authentication. Generated session IDs must use secure random number generators and be bound to user-specific information. Never pass through tokens received from MCP clients to upstream APIs - this creates confused deputy vulnerabilities.
Authorization Server Separation
MCP servers now act as OAuth 2.1 resource servers only. Authentication flows through your existing identity provider (Okta, Azure AD, etc.) rather than the MCP server itself.
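In practice, the resource-server role means unauthenticated requests get a 401 that points clients at your metadata document per RFC 9728. A minimal sketch - the framework choice (FastAPI) and metadata URL are illustrative:

```python
# Sketch: RFC 9728-style challenge from an MCP resource server.
# Framework (FastAPI) and metadata URL are illustrative choices.
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()
METADATA_URL = "https://mcp.example.com/.well-known/oauth-protected-resource"

@app.middleware("http")
async def require_bearer_token(request: Request, call_next):
    auth = request.headers.get("authorization", "")
    if not auth.startswith("Bearer "):
        # Point the client at the protected resource metadata (RFC 9728).
        return JSONResponse(
            status_code=401,
            content={"error": "unauthorized"},
            headers={"WWW-Authenticate": f'Bearer resource_metadata="{METADATA_URL}"'},
        )
    # Token validation against the separate authorization server goes here.
    return await call_next(request)
```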
Human-in-the-Loop Controls
Mandate human approval for actions with financial, security, or reputational impact. Automatically flag high-risk tool calls for human review.
When NOT to Use Advanced Tool Use: Honest Guidance
Don't Use Tool Search Tool For
- Small tool sets (under 5 tools) - Overhead exceeds savings
- Latency-critical applications - Adds 50-100ms per search
- Consistent tool usage patterns - Same 2-3 tools every time
- Rapid prototyping - Get working first, optimize later
Use Simpler Approaches When
- Single-function integrations - Function calling is simpler
- One-off scripts - Direct API calls more efficient
- No tool reuse potential - MCP shines with reusability
- Resource-constrained environments - MCP servers have overhead
Quick Decision Framework
- 1-5 tools: Load all tools directly (no Tool Search needed)
- 5-20 tools: Hybrid approach (keep top 3-5 loaded, defer rest)
- 20+ tools: Full Tool Search Tool implementation
- 50+ tools: Tool Search + categorical filtering + usage analytics
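If you want to encode the framework directly, it reduces to a few lines - the thresholds below are the article's rules of thumb, not measured cutoffs:

```python
def loading_strategy(tool_count: int) -> str:
    """Map tool count to the article's rule-of-thumb loading strategy."""
    if tool_count <= 5:
        return "load all tools directly"
    if tool_count <= 20:
        return "hybrid: keep top 3-5 loaded, defer the rest"
    if tool_count <= 50:
        return "full Tool Search implementation"
    return "Tool Search + categorical filtering + usage analytics"
```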
Common Mistakes: Lessons from Production Deployments
Mistake #1: Over-Granular Tool Definitions
The Error: Creating 200+ micro-tools when 30 well-designed tools would suffice. Each tool adds metadata overhead and search noise.
The Impact: Lower tool selection accuracy (more similar tools to distinguish), higher maintenance burden, slower Tool Search responses.
The Fix: Consolidate related operations into logical tool groups. Aim for 20-50 tools maximum with clear, non-overlapping responsibilities.
Mistake #2: Skipping Tool Descriptions
The Error: Using auto-generated or minimal descriptions like "Query the database" instead of comprehensive 3-4 sentence descriptions.
The Impact: 15-20% lower tool selection accuracy. Claude can't distinguish between similar tools or understand edge case behavior.
The Fix: Invest 30 minutes per tool in description writing. Explain WHAT it does, WHEN to use it, WHEN NOT to use it, and all parameters.
Mistake #3: Ignoring the June 2025 Security Spec
The Error: Deploying MCP servers without implementing OAuth 2.1, PKCE, and RFC 9728 requirements introduced in June 2025.
The Impact: Security vulnerabilities, non-compliance with enterprise requirements, potential data exposure through confused deputy attacks.
The Fix: Review and implement June 2025 spec requirements before production deployment. Use authorization servers separate from MCP servers.
Mistake #4: Deploying Without Metrics
The Error: Launching Tool Search Tool without monitoring tool selection accuracy, latency, or usage patterns.
The Impact: Can't identify optimization opportunities, no visibility into the 6% failure cases, no data for improving descriptions.
The Fix: Track from day one: tools searched, tools selected, tool success rate, latency. Use failures to improve tool descriptions and add category filtering.
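A minimal starting point for that instrumentation - the event fields mirror the metrics listed above; the JSONL file is a stand-in for whatever storage backend you use:

```python
# Sketch: log one event per tool-search round trip for later analysis.
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class ToolSearchEvent:
    query: str            # what Claude searched for
    tools_returned: list  # tool names the search surfaced
    tool_selected: str    # tool Claude actually called (or "")
    success: bool         # did the subsequent tool call succeed?
    latency_ms: float     # search round-trip time

def log_event(event: ToolSearchEvent, path: str = "tool_search_metrics.jsonl"):
    with open(path, "a") as f:
        f.write(json.dumps({"ts": time.time(), **asdict(event)}) + "\n")
```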
Mistake #5: Not Testing Edge Cases
The Error: Only testing happy paths where Tool Search returns the exact tool needed. Not testing the 6% failure cases.
The Impact: Production failures when queries don't match any tool, when multiple tools partially match, or when the wrong tool is selected.
The Fix: Test: no matches, multiple partial matches, wrong tool selection, retry behavior, fallback to manual tool specification.
Migration Strategy for Existing Applications
Migrating existing Claude applications to Advanced Tool Use can happen incrementally without disrupting production systems. Start by auditing your current tool usage patterns - identify which tools get used most frequently and which are rarely called.
Week 1-4: Audit & Plan
- Audit current tool usage patterns
- Identify high/low frequency tools
- Implement Tool Search Tool in parallel
- Set up monitoring infrastructure
Week 5-8: Validate
- A/B test with 10% traffic
- Monitor selection accuracy
- Measure latency impact
- Validate token reduction
Week 9-12: Roll Out
- Migrate low-risk tool categories
- Gradually increase traffic
- Migrate business-critical tools
- Sunset legacy approach
Hybrid Approach: Keep your 3-5 most frequently used tools always loaded (without defer_loading), and defer the rest. This provides immediate access for common operations while reducing token usage for the long tail of tools.
Conclusion
Claude Advanced Tool Use with the Tool Search Tool, Programmatic Tool Calling, and code-first MCP patterns represents a fundamental improvement in how AI applications manage tool ecosystems. The 85% token reduction from Tool Search with defer_loading eliminates the linear relationship between tool count and API costs, enabling applications to integrate dozens or hundreds of tools without prohibitive overhead.
Combined with the June 2025 MCP security specification (OAuth 2.1, PKCE, RFC 9728), these patterns provide a production-ready foundation for enterprise AI deployments. For teams building serious AI applications, the investment in Advanced Tool Use pays for itself within 1-2 months through API cost savings alone - while actually improving tool selection accuracy from 49% to 74% on complex MCP evaluations.
Frequently Asked Questions
What is Claude Advanced Tool Use and how does it differ from standard tool use?
Claude Advanced Tool Use, released by Anthropic on November 20, 2025, introduces three features for managing large tool ecosystems: Tool Search Tool (dynamic tool discovery with defer_loading: true), Programmatic Tool Calling (code-based orchestration), and Tool Use Examples (schema-validated inputs). Standard tool use requires sending all tool definitions upfront - with 50 tools consuming ~150K tokens per request. Advanced Tool Use with Tool Search reduces this to ~17K tokens (85% reduction) while maintaining full tool access. Internal testing showed accuracy improvements from 49% to 74% on MCP evaluations when using Tool Search with Opus 4.
How does the Tool Search Tool work with defer_loading?
The Tool Search Tool uses a two-phase execution model. You provide all tool definitions to the API but mark tools with defer_loading: true to make them discoverable on-demand. Claude initially sees only the Tool Search Tool (plus any non-deferred tools). When Claude needs additional tools, it searches using either BM25 (keyword relevance ranking over natural language queries) or regex (exact pattern matching). The API returns the 3-5 most relevant tool definitions, which are automatically expanded for Claude to use. This search step handles query variations - whether users say 'read a file,' 'get file contents,' or 'load a document,' the search returns your file reading tool.
What is Programmatic Tool Calling and when should I use it?
Programmatic Tool Calling enables Claude to orchestrate tools through Python code rather than individual API round-trips. Instead of Claude requesting tools one at a time with each result entering its context, Claude writes code that calls multiple tools, processes their outputs, and controls what information enters the context window. Use it when you need complex data transformations, loops over multiple records, conditional execution paths, or when you want to reduce context pollution from intermediate results. It's particularly valuable for data pipelines where you might process 1,000 records but only want the top 10 results in Claude's context.
How does MCP compare to OpenAI function calling?
Function calling embeds tool definitions directly in API requests - stateless, per-application, and tightly coupled. MCP uses a client-server architecture with separation of concerns - persistent connections, cross-application reuse, and dynamic tool updates. Choose function calling for simple integrations (under 5 tools), quick prototypes, or single-application use cases. Choose MCP when managing 10+ tools, reusing tools across multiple applications, needing dynamic tool updates at runtime, or building enterprise deployments. The smart approach layers both: function calling for prototyping, MCP for production scalability.
What are the cost savings from using Advanced Tool Use?
Advanced Tool Use delivers substantial cost savings for applications with large tool sets. Example: 50 tools x 3K tokens each = 150K tokens per request. At Claude Sonnet 4.5 pricing ($3/million input tokens), this costs $0.45 per request. With Tool Search Tool: Tool Search metadata (2K tokens) + 5 selected tools (15K tokens) = 17K tokens, costing $0.051 per request - an 89% reduction. For an application making 10,000 requests daily, this saves $3,990/day or ~$120,000/month. Additional savings stack: prompt caching (90% discount on cached portions) and batch API (50% discount for non-urgent tasks) can achieve combined reductions up to 97%.
Can I use Advanced Tool Use with existing MCP servers?
Yes, Advanced Tool Use works with existing MCP servers through a progressive enhancement approach. You don't need to rewrite current implementations - add a Tool Search Tool layer that sits between Claude and your existing MCP servers. The Tool Search Tool maintains metadata about all your MCP server tools, and when Claude queries it, you dynamically load tool definitions from your existing servers. Anthropic's TypeScript and Python SDKs include adapter utilities that automatically generate Tool Search Tool metadata from existing MCP server definitions. Migrate gradually, starting with low-risk tool categories before moving business-critical tools.
What is defer_loading and how do I use it?
The defer_loading parameter (set to true) marks tool definitions as discoverable on-demand rather than loaded upfront. Tools with defer_loading: true aren't loaded into Claude's context initially - they're only loaded when Claude discovers them via Tool Search. At least one tool must be non-deferred (typically the Tool Search Tool itself). This is how you achieve the 85% token reduction: instead of 50 tools consuming 150K tokens, you pay only for the Tool Search Tool (~2K tokens) plus the 3-5 tools Claude actually selects (~15K tokens). The feature requires the beta header: anthropic-beta: advanced-tool-use-2025-11-20.
What are the security requirements for MCP in 2025?
The June 2025 MCP specification introduced significant security requirements. OAuth 2.1 is mandatory - MCP servers must act as resource servers only, with authentication handled by separate authorization servers. PKCE (Proof Key for Code Exchange) is required for all public clients. RFC 9728 Protected Resource Metadata is mandatory - servers must return HTTP 401 with WWW-Authenticate headers pointing to metadata documents. RFC 8707 Resource Indicators are recommended for audience-scoped tokens. Additionally, MCP servers MUST NOT use sessions for authentication, and session IDs must use secure random number generators bound to user-specific information.
Which Claude model is best for tool use - Opus 4.5 or Sonnet 4.5?
Both Claude Opus 4.5 and Sonnet 4.5 support all Advanced Tool Use features. Choose Opus 4.5 ($5/$25 per million tokens) when you need maximum tool selection accuracy (the published Tool Search evaluation improved from 49% to 74%, measured on Opus 4), complex multi-step tool orchestration, or when budget allows premium pricing. Choose Sonnet 4.5 ($3/$15 per million tokens) for cost optimization on high-volume usage, when standard accuracy is sufficient, or for production workloads where you've validated tool selection works well. For most production applications, start with Sonnet 4.5 and upgrade to Opus only if tool selection accuracy issues arise.
How do I debug MCP server connection issues?
Launch Claude Code with the --mcp-debug flag to identify configuration issues. Use the /context command to measure how much context each MCP server consumes. Check your MCP configuration scopes: local (~/.claude.json for private servers), project (.mcp.json for shared team servers), or user-level settings. Common issues include incorrect paths, missing environment variables for credentials, and transport protocol mismatches. For the Tool Search Tool specifically, log which queries return which tools to identify selection accuracy problems. Track metrics from day one: tools searched, tools selected, success rates, and latency.
What is prompt caching and how does it reduce costs?
Prompt caching allows Claude to reuse previously processed portions of your prompts, reducing costs by 90% on cached portions. For Tool Search Tool implementations, this means your static tool metadata can be cached across requests. At Sonnet 4.5 pricing, cached input tokens cost $0.30 per million instead of $3.00 - a 90% reduction. Combined with Tool Search (85% reduction) and batch API (50% reduction for non-urgent tasks), you can achieve combined savings up to 97%. Cache read tokens are charged at 0.1x the base input token price. Prompt caching is most effective for repetitive workloads with consistent prompt prefixes.
Can I use MCP with Amazon Bedrock, Vertex AI, or Azure?
Yes, Tool Search Tool is available through multiple providers: Anthropic API (direct), Azure Anthropic, Google Cloud Vertex AI, and Amazon Bedrock (invoke API only, not converse API). Each provider may have slightly different implementation details. On Bedrock, use the invoke API for Tool Search support. All providers support the same defer_loading parameter and beta header requirements. Check each provider's documentation for specific integration patterns and any provider-specific limitations or features.
What are BM25 and regex search variants in Tool Search?
Tool Search supports two search variants. BM25 (Best Match 25) ranks tools by keyword relevance against natural language queries - Claude describes what it needs, and BM25 scores tools by relevance. This is the default and handles query variations well. Regex search uses pattern-based exact matching - Claude constructs regex patterns to find tools by name or description patterns. Use BM25 for most use cases as it handles natural language queries better. Use regex when you need exact tool name lookups or are debugging specific tool selection. LiteLLM and other providers expose both variants with configurable preferences.
How do Tool Use Examples improve tool selection accuracy?
Tool Use Examples (input_examples field) provide schema-validated example inputs that help Claude understand complex tool usage patterns. While JSON schemas define what's structurally valid, they can't express when to include optional parameters, which combinations make sense, or format conventions. input_examples fill this gap with concrete examples. Use them for tools with nested objects, multiple optional parameters, format-sensitive inputs, or non-obvious usage patterns. This is a beta feature requiring the same advanced tool use beta header. Examples should cover common use cases and edge cases to maximize accuracy.