Mukunda Rao Katta

Posted on May 25

Two Copies of the Tool List, One Agent That Broke: agent-fn-registry

#hermeschallenge #ai #python #agents

The agent called a tool that did not exist.

Not in theory. Not a hypothetical. In production, mid-session, the model emitted a tool call for search_records. The loop looked for search_records in the second tool list. It was not there. The session died with a lookup error.

The second tool list existed because two parts of the loop both needed the tool definitions.

The first part was the request builder. It assembled the tools array that went to the API. I had defined the tool schemas there, inline, as a list of dicts. That was step one of the agent session startup.

The second part was the dispatch handler. When the model returned a tool call, the handler looked up the callable by name and ran it. I had defined that name-to-function mapping there, also inline, also as a dict literal.

Two definitions. The same set of tools, conceptually. The same tool names, supposedly. Except they had drifted.

At some point I had added search_records to the handler dict and forgotten to add it to the schema list. So the model never saw it in the tool list, which meant it should never have called it. Except I later removed a guard that filtered tool calls against the schema list. So the model could now call tools that were not in the schema list. And at some other point, I had added search_records to the schema list but removed it from the handler dict.

The exact sequence does not matter. The pattern does. Two definitions of the same concept, in two places, edited by two different people (sometimes the same person in two different headspaces), drifting apart over time.

The agent tried to call a tool that no longer existed in the handler. The session failed. The fix was to stop having two definitions.
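The drift described above is easy to detect mechanically once both definitions are in hand. A minimal sketch, assuming the schema list and handler dict have the shapes used later in this post. The function name find_drift and the sample data are hypothetical:

```python
def find_drift(tools, handlers):
    """Return tool names present on one side of the split but not the other."""
    schema_names = {tool["name"] for tool in tools}
    handler_names = set(handlers)
    return {
        "schema_only": schema_names - handler_names,   # model can call it, loop cannot run it
        "handler_only": handler_names - schema_names,  # loop can run it, model never sees it
    }


# Hypothetical drifted state: search_records lost its handler.
tools = [{"name": "search_records"}, {"name": "update_status"}]
handlers = {"update_status": lambda **kwargs: "ok"}

drift = find_drift(tools, handlers)
print(drift["schema_only"])  # {'search_records'}
```

A check like this only reports the drift at startup. It does not remove the second definition, which is the actual cause.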

agent-fn-registry is the small Python library I wrote to enforce one definition. On PyPI as agent-fn-registry. 26 tests, zero dependencies.

The Shape of the Fix

Before, the agent loop had this pattern scattered across its startup code:

# In the request builder
tools = [
    {
        "name": "search_records",
        "description": "Search the record database.",
        "input_schema": { ... }
    },
    {
        "name": "update_status",
        "description": "Update a record status.",
        "input_schema": { ... }
    },
]

# Elsewhere, in the dispatch handler
HANDLERS = {
    "search_records": search_records_fn,
    "update_status": update_status_fn,
}

After, it looks like this:

from agent_fn_registry import AgentFnRegistry
from tool_side_effects_tag import SideEffects, Tag

registry = AgentFnRegistry()

registry.register(
    fn=search_records_fn,
    schema={
        "name": "search_records",
        "description": "Search the record database.",
        "input_schema": { ... }
    },
    side_effects=SideEffects({Tag.READ}),
    defaults={"limit": 10},
)

registry.register(
    fn=update_status_fn,
    schema={
        "name": "update_status",
        "description": "Update a record status.",
        "input_schema": { ... }
    },
    side_effects=SideEffects({Tag.WRITE}),
    defaults={},
)

Now the request builder asks the registry:

tools = registry.tool_list()  # Returns the list of schemas

And the dispatch handler asks the registry:

entry = registry.get("search_records")
# validated_args: the model's arguments, already validated elsewhere
result = entry.fn(**validated_args)

One definition. Both parts of the loop read from the same source. They cannot drift apart because there is nothing to drift.
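To make the single-source idea concrete, here is a minimal sketch of a dict-backed registry exposing the same three calls used above (register, get, tool_list). It is an illustration only, not the library's actual implementation. The class names MiniRegistry and MiniEntry and the duplicate-name check are assumptions:

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass(frozen=True)
class MiniEntry:
    fn: Callable[..., Any]
    schema: Dict[str, Any]
    side_effects: Any = None
    defaults: Dict[str, Any] = field(default_factory=dict)


class MiniRegistry:
    def __init__(self) -> None:
        # Python dicts preserve insertion order, so registration order is kept.
        self._entries: Dict[str, MiniEntry] = {}

    def register(self, fn, schema, side_effects=None, defaults=None) -> None:
        name = schema["name"]
        if name in self._entries:
            # Assumption: duplicates are rejected; the real library may differ.
            raise ValueError(f"tool already registered: {name}")
        self._entries[name] = MiniEntry(fn, schema, side_effects, dict(defaults or {}))

    def get(self, name: str) -> Optional[MiniEntry]:
        return self._entries.get(name)

    def tool_list(self) -> List[Dict[str, Any]]:
        return [entry.schema for entry in self._entries.values()]


registry = MiniRegistry()
registry.register(
    fn=lambda query, limit: [query] * limit,  # placeholder tool body
    schema={"name": "search_records", "description": "Search the record database."},
    defaults={"limit": 10},
)
```

The schema list and the callable lookup are both views over the same dict, so a tool cannot appear in one and be missing from the other.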

What It Does NOT Do

It does not generate schemas. That is tool-schema-from-fn. The registry stores whatever schema dict you give it. It does not inspect function signatures or docstrings.
It does not validate arguments. That is agentvet. The registry stores the schema so agentvet can use it, but calling registry.get("name") does not run any validation.
It does not fill in missing arguments. That is tool-arg-defaults. The registry stores defaults so tool-arg-defaults can use them, but the registry itself does not apply them.
It does not run the function. You call entry.fn(**args) yourself. The registry is a lookup table, not an executor.
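Because the registry only looks things up, the dispatch step stays in your own code. A minimal sketch of what that step might look like, assuming get returns None for an unknown name (as the test later in this post suggests) and using a plain dict merge as a stand-in for tool-arg-defaults. The names dispatch and UnknownToolError are hypothetical, and validation is left out:

```python
class UnknownToolError(LookupError):
    """Raised when the model asks for a tool the registry does not hold."""


def dispatch(registry, tool_name, args):
    entry = registry.get(tool_name)
    if entry is None:
        # Fail loudly at one well-defined point instead of deep in the loop.
        raise UnknownToolError(f"model called unregistered tool: {tool_name!r}")
    # Stand-in for tool-arg-defaults: explicit args win over stored defaults.
    merged = {**(entry.defaults or {}), **args}
    return entry.fn(**merged)
```

Here registry can be any object with a get(name) method whose entries expose fn and defaults, which is the shape described in this post.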

Inside the Lib: Explicit Registration, Not Decorators

The main design choice was explicit registration over decorator-based registration.

Most tool libraries in Python reach for the decorator pattern:

@tool(schema=...)
def search_records(...):
    ...

Decorators are convenient but they couple registration to import time. If you want to know what is in the registry, you have to import every module that uses the decorator. In a large codebase, that import chain can be expensive. In tests, it can mean pulling in a large chunk of the application just to check whether a function is registered.

Explicit registration is more verbose. But it is also visible. You can read the startup code and see exactly which functions are in the registry at that point. You can build a test registry with a subset of tools without importing anything you do not need. You can build two different registries in the same process, one for each agent session with different tool sets.

# Full registry for production
prod_registry = AgentFnRegistry()
prod_registry.register(fn=search_records_fn, schema=..., ...)
prod_registry.register(fn=update_status_fn, schema=..., ...)

# Narrow registry for a test
test_registry = AgentFnRegistry()
test_registry.register(fn=search_records_fn, schema=..., ...)
# update_status not registered, intentionally

The registry does not live in a module-level singleton. It is an ordinary object. You instantiate it, you populate it, you pass it around. No global state, no import-time side effects.

This also means you can inspect the registry in tests without simulating a full agent session:

def test_registry_has_expected_tools():
    reg = build_my_registry()
    assert reg.get("search_records") is not None
    assert reg.get("update_status") is not None
    assert reg.get("delete_all") is None  # Not registered

That test does not need the model, the API client, or the session loop. It just checks what is in the registry.

The Entry Object

When you call registry.get("search_records"), you get back an AgentFnEntry:

entry = registry.get("search_records")

entry.fn          # The callable
entry.schema      # The tool schema dict (for the API tool list)
entry.side_effects  # SideEffects instance (for parallel-safety decisions)
entry.defaults    # Dict of default arg values

All four pieces in one place. The request builder uses entry.schema. The parallel scheduler uses entry.side_effects. The argument filler uses entry.defaults. The executor uses entry.fn.
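As one example of how a consumer might use these fields, here is a sketch of a parallel-safety check built on entry.side_effects. The Tag enum below is a local stand-in carrying the four tag names listed for tool-side-effects-tag, and side effects are modeled as a plain set. The real SideEffects API may differ. The rule that only READ-only tools may run concurrently is also an assumption:

```python
from enum import Enum


class Tag(Enum):  # local stand-in, not the tool-side-effects-tag import
    READ = "read"
    WRITE = "write"
    IDEMPOTENT = "idempotent"
    DESTRUCTIVE = "destructive"


def is_parallel_safe(side_effects):
    """Treat a tool as parallel-safe only if it is tagged and every tag is READ."""
    return bool(side_effects) and set(side_effects) <= {Tag.READ}


def split_calls(calls, tags_by_tool):
    """Partition tool names into a concurrent batch and a serial queue."""
    parallel = [name for name in calls if is_parallel_safe(tags_by_tool.get(name))]
    serial = [name for name in calls if name not in parallel]
    return parallel, serial


tags_by_tool = {"search_records": {Tag.READ}, "update_status": {Tag.WRITE}}
print(split_calls(["search_records", "update_status"], tags_by_tool))
# (['search_records'], ['update_status'])
```

Untagged tools (side_effects=None) fall into the serial queue, which is the conservative choice when nothing is known about a tool.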

tool_list() returns a list of all registered schemas, in registration order, ready to pass directly to the API:

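import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,  # required by the Messages API
    tools=registry.tool_list(),
    messages=[...],
)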

When This Is Useful

You have a medium-to-large agent with more than a handful of tools and you have already had at least one incident where two definitions of the same tool diverged.

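You are writing a dispatch loop that needs to look up callables by name at runtime, and you want a missing tool to fail loudly at a single lookup point. Note that get returns None for an unregistered name, so the loop still has to check for None and raise; the registry gives you one place to do that.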

You want to build different tool sets for different agent sessions or different user permission levels, by constructing different registry instances with different subsets of tools.

You are writing tests for the agent loop and you want to assert that the registry contains the right tools without instantiating the full application.
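The per-permission case above can be sketched as a small builder. The role names, the ROLE_TOOLS mapping, and the catalog argument are hypothetical. The only library call assumed is register, with the keyword arguments shown earlier in this post:

```python
# Hypothetical mapping from permission level to allowed tool names.
ROLE_TOOLS = {
    "viewer": ["search_records"],
    "editor": ["search_records", "update_status"],
}


def populate_for_role(registry, role, catalog):
    """Register only the tools allowed for `role` on a fresh registry.

    `catalog` maps tool name -> the keyword arguments for registry.register
    (fn, schema, side_effects, defaults).
    """
    for name in ROLE_TOOLS[role]:
        registry.register(**catalog[name])
    return registry
```

In use, registry would be a new AgentFnRegistry() per session, so a viewer session never holds a handle to update_status at all.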

When This Is NOT What You Want

Your agent has two or three tools and you are comfortable managing the schema list and the handler dict by hand. The registry is a layer of indirection that adds value at the point where maintaining two in-sync dicts becomes error-prone. For a small fixed tool set, that point may never arrive.

Your team already uses a framework that provides a tool registry. LangChain, LlamaIndex, and similar frameworks have their own patterns for this. If you are already in one of those ecosystems, adding a separate registry creates a third definition of the tool set, which is exactly the problem this library exists to prevent.

Your function signatures are the schemas. If you use tool-schema-from-fn to generate schemas at startup and immediately pass them to the API, you may not need a persistent registry. The generation step is the single source of truth. The registry matters more when schema generation happens at build time or is done manually.

Install

pip install agent-fn-registry

Zero dependencies. Python 3.9 and above.

from agent_fn_registry import AgentFnRegistry

registry = AgentFnRegistry()

registry.register(
    fn=my_tool_fn,
    schema={
        "name": "my_tool",
        "description": "Does the thing.",
        "input_schema": {
            "type": "object",
            "properties": {
                "target": {"type": "string", "description": "What to target."}
            },
            "required": ["target"]
        }
    },
    side_effects=None,
    defaults={},
)

entry = registry.get("my_tool")
result = entry.fn(target="example")

Repo: https://github.com/MukundaKatta/agent-fn-registry

Sibling Libraries

These four libraries connect at the same boundary. Each handles a different part of the tool definition and call lifecycle.

| Lib | Boundary | Repo |
| --- | --- | --- |
| `tool-schema-from-fn` | Generate the schema from the function signature | https://github.com/MukundaKatta/tool-schema-from-fn |
| `tool-side-effects-tag` | The tag types (READ, WRITE, IDEMPOTENT, DESTRUCTIVE) stored in the registry | https://github.com/MukundaKatta/tool-side-effects-tag |
| `agentvet` | Validate args using the stored schema before execution | https://github.com/MukundaKatta/agentvet |
| `tool-arg-defaults` | Fill missing kwargs from the stored defaults | https://github.com/MukundaKatta/tool-arg-defaults |

The intended flow: generate a schema with tool-schema-from-fn, tag side effects with tool-side-effects-tag, register both with agent-fn-registry, validate args with agentvet, fill gaps with tool-arg-defaults, then call entry.fn. Each library is independent. You can use any one without the rest.

What's Next

Two things would improve this library.

First, a names() method that returns the registered tool names as a set. Right now you can get individual entries and get the full tool list, but there is no clean way to check "is this tool name in the registry" without calling get and checking for None. A names() set would make permission gating cleaner: "if tool_name in registry.names()" is more readable than "if registry.get(tool_name) is not None".

Second, a merge(other_registry) method that combines two registries. This would be useful for the case where a base set of tools is always registered, and per-session or per-user tools are registered on top. Right now you would have to call register individually for each entry from the other registry. A merge call would make composable registries straightforward.
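Until those methods exist, both can be approximated from the outside with the calls already shown in this post (tool_list, get, register). The helper names below are hypothetical, and the sketch assumes every schema dict carries a "name" key:

```python
def registered_names(registry):
    """Derive the set of tool names from the schema list."""
    return {schema["name"] for schema in registry.tool_list()}


def merge_into(base, other):
    """Copy every entry of `other` into `base`, skipping names base already has."""
    existing = registered_names(base)
    for schema in other.tool_list():
        name = schema["name"]
        if name in existing:
            continue  # Assumption: the base registry wins on conflicts.
        entry = other.get(name)
        base.register(
            fn=entry.fn,
            schema=entry.schema,
            side_effects=entry.side_effects,
            defaults=entry.defaults,
        )
    return base
```

Whether a name conflict should skip, overwrite, or raise is a design decision a built-in merge would have to make explicitly. Skipping is only the choice made for this sketch.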

The root problem this library solves is having two definitions of the same concept. One definition drifts, the other does not, and you find out at runtime when the agent tries to call a tool that is missing from one side of the split. The fix is a single object both sides read from. The library is that object.
