Why we need to decouple "Capabilities" from "Cognition" to build scalable Agent OS architectures.
If you are building AI agents today, you are probably doing something like this:
- Write a Python function `get_weather(city)`.
- Import it directly into your Agent's main loop.
- Hardcode the function schema into the system prompt or API call.
- Restart the entire application every time you add a new tool.
This is tight coupling. It breaks the Single Responsibility Principle. Your Agent logic shouldn't know how `get_weather` is implemented, and your infrastructure shouldn't need a restart just because you added a calculator.
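For contrast, a minimal sketch of that anti-pattern might look like this (the function, schema, and loop below are illustrative, not taken from any real codebase):

```python
# Hypothetical tightly coupled setup: the tool, its schema, and the agent
# logic all live in one module and change together.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Schema written by hand; must be kept in sync with the function manually.
WEATHER_SCHEMA = {
    "name": "get_weather",
    "description": "Get the weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def agent_loop(user_message: str):
    # The agent imports the concrete function and hardcodes its schema.
    # Adding a new tool means editing this file and restarting the app.
    ...
```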
We need to treat Tools as a Service.
I built the Agent Tool Registry (ATR) to solve this. It is a type-safe, decentralized registry that lets agents discover capabilities at runtime.
Here is why you need it and how to use it.
The Philosophy: Scale by Subtraction
In complex systems, we often try to scale by adding layers. But the best way to scale is often by subtracting dependencies.
ATR subtracts the dependency between the Agent (the Consumer) and the Tool (the Provider).
- The Tool Provider simply registers a capability: "I can scrape websites."
- The Agent simply asks for a capability: "Give me a tool that has the tag `scraping`."
This decoupling allows you to build an Agent OS where tools can be deployed, updated, and versioned independently of the agents that use them.
How It Works
ATR sits in Layer 2 (Infrastructure) of the agent stack. It handles registration, discovery, and schema generation—but crucially, it does not execute code. Execution belongs to the runtime (Control Plane).
1. Installation
```bash
pip install agent-tool-registry
```
2. Registration (The Provider)
You don't need complex configuration files. Just use the `@atr.register` decorator. ATR uses Pydantic under the hood to enforce strict type safety.
```python
import atr

@atr.register(
    name="web_scraper",
    cost="low",
    tags=["web", "scraping"],
    side_effects=["network"]
)
def scrape_website(url: str, timeout: int = 30) -> str:
    """Scrape content from a website.

    Args:
        url: The URL to scrape
        timeout: Request timeout in seconds
    """
    # Your logic here...
    return f"Content from {url}"
```
Notice the type hints (`url: str`). ATR extracts these automatically. If you forget them, it raises an error. No `Any` allowed.
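As a rough illustration of that failure mode (the exact exception class ATR raises is an assumption here, so the catch-all is deliberately broad):

```python
import atr

# Hypothetical: a function without type hints should be rejected at
# registration time. The specific exception type is an assumption.
try:
    @atr.register(name="bad_tool", tags=["demo"])
    def untyped_tool(data):  # no type hint on `data`
        return data
except Exception as exc:
    print(f"Registration rejected: {exc}")
```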
3. Discovery (The Consumer)
The Agent doesn't import `scrape_website`. It asks the registry for it.
```python
# Find tools by tag
web_tools = atr.search_tools(tags=["web"])

# Get the specific tool spec
tool_spec = atr.get_tool("web_scraper")

# Generate the schema for OpenAI/Anthropic
openai_schema = tool_spec.to_openai_function_schema()
```
This `openai_schema` is exactly what the LLM needs to understand how to call the function. You didn't have to write a single line of JSON manually.
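For example, a sketch of feeding that schema to the OpenAI Chat Completions API might look like this. It assumes `openai_schema` is the bare function definition; if ATR already wraps it in `{"type": "function", ...}`, pass it through unchanged.

```python
# Sketch: expose the generated schema to an LLM as a callable tool.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Scrape https://example.com for me."}],
    tools=[{"type": "function", "function": openai_schema}],
)
print(response.choices[0].message.tool_calls)
```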
4. Execution (The Runtime)
This is a key architectural decision: The Registry is not the Runtime.
ATR gives you the callable, but you execute it within your own safety context (e.g., inside a sandbox or wrapped by a Constraint Engine).
```python
# Get the actual function
func = atr.get_callable("web_scraper")

# Execute it (usually inside a try/except block in your runtime)
result = func(url="https://example.com")
```
Why This Matters for Production
- Hot Swapping: You can update the registry implementation without redeploying the agent logic.
- Safety Metadata: We explicitly track `side_effects` (e.g., `filesystem`, `network`, `delete`). Your agent runtime can use this to say, "Deny all tools with `filesystem` side effects" (a sketch follows this list).
- Type Integrity: By enforcing Pydantic validation at the registry level, we prevent "Garbage In" from the LLM.
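Here is a minimal sketch of such a runtime-side policy check. It assumes the tool spec exposes the `side_effects` metadata from registration as an attribute; the attribute name and the `run_tool` wrapper are illustrative, not part of ATR's API.

```python
import atr

# Illustrative runtime wrapper: deny tools whose declared side effects
# intersect a blocklist, then execute the callable defensively.
DENIED_EFFECTS = {"filesystem", "delete"}

def run_tool(name: str, **kwargs):
    spec = atr.get_tool(name)
    # Attribute name is an assumption; adjust to however ATR exposes metadata.
    effects = set(getattr(spec, "side_effects", []) or [])
    blocked = effects & DENIED_EFFECTS
    if blocked:
        raise PermissionError(f"Tool '{name}' denied: side effects {sorted(blocked)}")
    func = atr.get_callable(name)
    try:
        return func(**kwargs)
    except Exception as exc:
        return f"Tool '{name}' failed: {exc}"

result = run_tool("web_scraper", url="https://example.com")
```

The point is that the deny/allow decision lives in the runtime, next to the sandbox or Constraint Engine, while the registry only supplies the metadata needed to make it.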
The Bigger Picture
ATR is just one piece of the Agent OS ecosystem I'm building.
- Layer 1: Context-as-a-Service (CaaS)
- Layer 2: Agent Tool Registry (ATR)
- Layer 3: Agent Control Plane
We are moving away from monolithic "Bots" and toward modular, scalable cognitive architectures.
Stop hardcoding. Start registering.
🔗 GitHub: imran-siddique/atr
📦 PyPI: pip install agent-tool-registry