ANKIT AGARWAL

Posted on Mar 21

Agent-Corex: Solving Context Bloat in LLM-Based Systems

#ai #opensource #agents #llm

Agent-Corex: Intelligent Tool Selection for LLM Agents

The Problem

You're building an LLM-powered autonomous agent with access to 100+ tools. What's the best approach?

Option 1: Include all tools in the system prompt
❌ Massive context window (expensive)
❌ Slower inference
❌ Model gets confused with too many options

Option 2: Manually curate tools per use case
❌ Time-consuming
❌ Error-prone
❌ Doesn't scale

There's a better way.

Meet Agent-Corex

Agent-Corex is an open-source tool retrieval engine that intelligently selects the N most relevant tools for any given query using a hybrid ranking system.

How It Works

from agent_core import rank_tools

tools = [
    {"name": "get_weather", "description": "Get current weather for a location"},
    {"name": "send_email", "description": "Send an email to a recipient"},
    {"name": "search_web", "description": "Search the internet for information"},
    # ... 97 more tools
]

query = "What's the weather in San Francisco?"
relevant_tools = rank_tools(tools, query=query, top_k=5)

# Returns at most top_k tools, most relevant first (e.g. get_weather, then search_web)

The Hybrid Ranking Algorithm

Agent-Corex combines two ranking approaches and then blends their scores:

1. Keyword Ranking (BM25)

Speed: <1ms
Perfect for: Quick, deterministic filtering
No external dependencies
Great for resource-constrained environments
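To make the keyword step concrete, here is a minimal, dependency-free sketch of BM25 scoring over tool descriptions. It is an illustration of the general technique, not Agent-Corex's actual implementation: the function names (tokenize, bm25_scores) and the parameter values k1=1.5 and b=0.75 are assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Return one BM25 score per document (hypothetical helper, not the library API)."""
    docs = [tokenize(d) for d in documents]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Length normalization: longer descriptions are penalized slightly.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


tools = [
    {"name": "get_weather", "description": "Get current weather for a location"},
    {"name": "send_email", "description": "Send an email to a recipient"},
    {"name": "search_web", "description": "Search the internet for information"},
]
# Index the tool name together with its description so both can match.
texts = [f"{t['name'].replace('_', ' ')} {t['description']}" for t in tools]
scores = bm25_scores("What's the weather in San Francisco?", texts)
ranked = sorted(zip(tools, scores), key=lambda pair: pair[1], reverse=True)
best_tool = ranked[0][0]["name"]
print(best_tool)
```

Because the score depends only on term counts, the result is deterministic and needs no model download, which is why a keyword pass suits quick filtering.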

2. Semantic Ranking (Embeddings)

Speed: 50-100ms (cached)
Perfect for: Understanding meaning and context
Uses sentence-transformers + FAISS
Highly accurate
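The semantic step boils down to comparing a query vector with one vector per tool. The sketch below shows only that comparison, using cosine similarity in plain Python. The three-dimensional vectors are hand-written toy values, not real embeddings: in a real setup they would come from a sentence-transformers model and the nearest-neighbour lookup would be delegated to a FAISS index. The function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_similarity(query_vec, tool_vecs, top_k=5):
    """Return the top_k (tool name, similarity) pairs, highest similarity first."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in tool_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for embeddings (assumption: real vectors have hundreds of dimensions).
tool_vecs = {
    "get_weather": [0.9, 0.1, 0.0],
    "send_email": [0.0, 0.2, 0.9],
    "search_web": [0.4, 0.8, 0.1],
}
query_vec = [0.8, 0.3, 0.0]  # pretend embedding of a weather question

top = rank_by_similarity(query_vec, tool_vecs, top_k=2)
print([name for name, _ in top])
```

Tool vectors only need to be computed once and can then be cached, so the per-query cost is one query embedding plus the similarity search.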

3. Hybrid Scoring

Combines both (30% keyword + 70% semantic)
Best of both worlds
Customizable weights
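A weighted blend like the 30/70 split above could be sketched as follows. This is an assumption-laden illustration, not the library's code: the helper names are hypothetical, the input scores are made-up example values, and the min-max normalization step is my own choice so that BM25 scores and cosine similarities land on a comparable 0 to 1 scale before weighting.

```python
def min_max(scores):
    """Rescale a {name: score} mapping to the range 0..1."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {name: 0.0 for name in scores}
    return {name: (s - lo) / (hi - lo) for name, s in scores.items()}


def hybrid_rank(keyword_scores, semantic_scores,
                keyword_weight=0.3, semantic_weight=0.7, top_k=5):
    """Blend two score sets with the given weights and return the top_k tools."""
    kw = min_max(keyword_scores)
    sem = min_max(semantic_scores)
    combined = {
        name: keyword_weight * kw[name] + semantic_weight * sem[name]
        for name in kw
    }
    return sorted(combined.items(), key=lambda pair: pair[1], reverse=True)[:top_k]


# Example inputs (illustrative numbers only).
keyword_scores = {"get_weather": 1.4, "send_email": 0.0, "search_web": 1.0}
semantic_scores = {"get_weather": 0.97, "send_email": 0.08, "search_web": 0.73}

ranked = hybrid_rank(keyword_scores, semantic_scores, top_k=2)
print([name for name, _ in ranked])
```

Raising keyword_weight favours exact term matches, while raising semantic_weight favours tools whose descriptions are close in meaning even when no words overlap.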

Real-World Impact

Before Agent-Corex:

System with 200 tools
Average prompt size: 45,000 tokens
Average inference time: 2.3 seconds
Cost per 1M tokens: $15

After Agent-Corex:

System with 200 tools
Average prompt size: 2,000 tokens (95% reduction!)
Average inference time: 0.5 seconds (4.6x faster!)
Cost per 1M tokens: $15 (unchanged), but each request uses about 95% fewer prompt tokens
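The arithmetic behind these figures can be checked with a few lines of Python. The sketch assumes the $15 per 1M tokens price applies to prompt tokens and ignores output tokens; the function name is hypothetical.

```python
PRICE_PER_MILLION_TOKENS = 15.00  # USD, taken from the figures above


def prompt_cost(tokens, price_per_million=PRICE_PER_MILLION_TOKENS):
    """Cost in USD of sending a prompt with the given number of tokens."""
    return tokens / 1_000_000 * price_per_million


cost_before = prompt_cost(45_000)      # all 200 tools in the prompt
cost_after = prompt_cost(2_000)        # only the selected tools
token_reduction = 1 - 2_000 / 45_000   # fraction of prompt tokens saved
speedup = 2.3 / 0.5                    # ratio of average inference times

print(f"Prompt cost per request: ${cost_before:.3f} -> ${cost_after:.3f}")
print(f"Token reduction: {token_reduction:.1%}")
print(f"Speedup: {speedup:.1f}x")
```

Under these assumptions the prompt cost per request drops from about $0.675 to $0.03, which is the same roughly 95% saving as the token reduction, since the per-token price does not change.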

Getting Started

Installation:

pip install agent-corex

Basic Usage:

from agent_core import rank_tools

tools = [
    {"name": "get_weather", "description": "Get current weather"},
    {"name": "send_email", "description": "Send email"},
    {"name": "search_web", "description": "Search the web"},
]

# Simple keyword ranking
results = rank_tools(tools, query="weather", method="keyword")

# Semantic ranking
results = rank_tools(tools, query="weather", method="embedding")

# Hybrid (recommended)
results = rank_tools(tools, query="weather", method="hybrid", top_k=2)

API Server:

agent-corex --host 0.0.0.0 --port 8000

Then POST to /retrieve_tools:

curl -X POST http://localhost:8000/retrieve_tools \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Check the weather",
    "method": "hybrid",
    "top_k": 5
  }'
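The same request can be made from Python using only the standard library. This is a sketch that mirrors the curl call above: the helper name build_retrieve_request is hypothetical, and since the response format is not described here, the example only prints whatever JSON the server returns.

```python
import json
import urllib.request


def build_retrieve_request(query, method="hybrid", top_k=5,
                           base_url="http://localhost:8000"):
    """Build a POST request for /retrieve_tools with the same JSON body as the curl example."""
    payload = {"query": query, "method": method, "top_k": top_k}
    return urllib.request.Request(
        url=f"{base_url}/retrieve_tools",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_retrieve_request("Check the weather")

# Sending needs a running server, so the call is left commented out here.
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(json.loads(response.read()))
```

Uncomment the last lines once the API server is running on port 8000.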

Project Stats

📦 Open Source: MIT License
✅ Test Coverage: 95%+
🐍 Python Support: 3.8 - 3.12
🚀 Production Ready: Used in real deployments
📚 Full Documentation: Installation, API, deployment guides
🐳 Docker Support: Ready to deploy anywhere

Use Cases

Autonomous Agents - Select tools dynamically based on task
Multi-Step Reasoning - Different tools for different steps
Cost Optimization - Reduce token usage, cut LLM costs
Resource-Constrained Environments - Faster inference on local hardware
MCP Server Integration - Works with any MCP-compatible tools

Contributing

We're early stage and looking for:

Users to test and provide feedback
Contributors to improve algorithms
Implementers for new ranking methods
Documentation improvements

What's Next?

We're working on:

Web UI for tool management
Analytics dashboard
Advanced caching strategies
GPU-accelerated embeddings
Multi-language support

Start using Agent-Corex today and let us know what you think!

DEV Community