Mukunda Rao Katta

Posted on May 25

The Agent That Retried Five Times Over a String: tool-arg-coerce-py

#hermeschallenge #ai #python #agents

The agent called set_rate_limit with {"limit": "20"}.

The function signature was def set_rate_limit(limit: int). Python does not enforce type hints at call time, so the string reached the function body, which raised a TypeError as soon as it tried integer arithmetic on it. The agent saw the error, decided it must have sent the wrong value, and retried. It sent {"limit": "20"} again. Same error. Five retries. Five identical failures.

The fix was one line: int("20"). That is a string-to-integer coercion. The model understood the tool. It knew the right value. It just serialized it as a JSON string instead of a JSON number. The retry loop was burning tokens and time on a problem that had nothing to do with reasoning.
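
To make the failure concrete, here is a minimal reproduction. The function body is an assumption, since the post only shows the signature; any addition or comparison against an int would fail the same way.

```python
def set_rate_limit(limit: int) -> int:
    # Hypothetical body: only the signature appears in the post.
    # Type hints are not checked at call time, so the string gets in
    # and the failure shows up at the first arithmetic operation.
    return limit + 1


args = {"limit": "20"}  # what the model sent: a JSON string, not a number

try:
    set_rate_limit(**args)
    failed = False
except TypeError:
    failed = True  # str + int is not defined

# The one-line fix: coerce before calling.
fixed = set_rate_limit(int(args["limit"]))
```

Retrying with the same payload can never succeed here, because nothing between the model and the function changes the type.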

That is the problem tool-arg-coerce-py solves.

The shape of the fix

Install it:

pip install tool-arg-coerce-py

Define your tool schema the normal way:

schema = {
    "type": "object",
    "properties": {
        "limit": {"type": "integer"},
        "enabled": {"type": "boolean"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["limit"],
}

The model returns args with type mismatches:

raw_args = {
    "limit": "20",       # string, should be integer
    "enabled": "true",   # string, should be boolean
    "tags": "ops,infra", # string, should be array
}

Run them through the coercer:

from tool_arg_coerce import coerce_args

result = coerce_args(raw_args, schema)

print(result.args)
# {"limit": 20, "enabled": True, "tags": ["ops", "infra"]}

print(result.coercions)
# [
#   Coercion(key="limit", from_type="str", to_type="int", original="20", coerced=20),
#   Coercion(key="enabled", from_type="str", to_type="bool", original="true", coerced=True),
#   Coercion(key="tags", from_type="str", to_type="list", original="ops,infra", coerced=["ops","infra"]),
# ]

Pass result.args to your tool function. No TypeError. No retry loop.
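
If you want a feel for what schema-driven coercion involves, here is a rough sketch. It is not the library's implementation: sketch_coerce is a hypothetical name, it only looks at top-level keys, and the comma-splitting rule for arrays is my assumption based on the example above.

```python
from typing import Any


def sketch_coerce(raw: dict[str, Any], schema: dict[str, Any]) -> dict[str, Any]:
    """Coerce top-level string values toward the schema type of each key."""
    props = schema.get("properties", {})
    out: dict[str, Any] = {}
    for key, value in raw.items():
        target = props.get(key, {}).get("type")
        if not isinstance(value, str):
            out[key] = value  # not a string, leave it alone
        elif target == "integer":
            out[key] = int(value)  # raises ValueError on "abc"
        elif target == "boolean":
            lowered = value.strip().lower()
            if lowered not in ("true", "false"):
                raise ValueError(f"cannot coerce {value!r} to boolean for {key!r}")
            out[key] = lowered == "true"
        elif target == "array":
            # Assumption: a comma-separated string means a list of strings.
            out[key] = [part.strip() for part in value.split(",") if part.strip()]
        else:
            out[key] = value  # string targets, or keys the schema does not describe
    return out


schema = {
    "type": "object",
    "properties": {
        "limit": {"type": "integer"},
        "enabled": {"type": "boolean"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["limit"],
}
raw_args = {"limit": "20", "enabled": "true", "tags": "ops,infra"}

coerced = sketch_coerce(raw_args, schema)
```

The sketch returns only the coerced dict. The library additionally returns a record of every change it made.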

You can also drive coercion from a function signature instead of a schema:

def set_rate_limit(limit: int, enabled: bool = True):
    pass

result = coerce_args(raw_args, fn=set_rate_limit)

The library reads typing.get_type_hints() from the function and builds the coercion rules from that. Same result, one fewer place to keep in sync.
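
For reference, this is what typing.get_type_hints() gives you for that function. The helper hints_to_targets is a hypothetical name for illustration; how the library turns these hints into rules internally is not shown in this post.

```python
import typing


def set_rate_limit(limit: int, enabled: bool = True) -> None:
    pass


def hints_to_targets(fn) -> dict:
    """Map each parameter name to its annotated type, dropping the return type."""
    hints = typing.get_type_hints(fn)
    hints.pop("return", None)  # the return annotation is not an argument
    return hints


targets = hints_to_targets(set_rate_limit)

# With the target type in hand, the integer case is a direct call.
coerced_limit = targets["limit"]("20")
```

Calling the target type directly is only safe for some types. bool("false") is True in Python, so booleans need an explicit parser.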

What it does NOT do

It does not validate that args are correct after coercion. That is a separate job. agentvet handles post-coercion validation.
It does not fill in missing args. If a required key is absent, the coercer raises. Filling missing args is what tool-arg-defaults does.
It does not rename keys. If the model sends rateLimit but your schema says rate_limit, the coercer does not fix that. That is tool-arg-rename.
It does not fuzzy-match enum strings. If your schema has {"enum": ["read", "write"]} and the model sends "READ", the coercer does not lower-case it for you. That is tool-arg-fuzzy.

Coercion is one step in a pipeline. This library does one step cleanly.

Inside the lib: the CoercionResult design

The most useful thing about this library is not the coercion itself. It is the record.

Every coercion produces a Coercion object with five fields: the key name, the original type name, the target type name, the original value, and the coerced value. The CoercionResult wraps the final args dict and the full list of these records.
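
A sketch of that record shape using dataclasses. The field names follow the printed output earlier in the post; the use of dataclasses and the frozen flag are my assumptions, not a description of the library's source.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Coercion:
    """One change made to one argument (sketch, not the library's class)."""
    key: str
    from_type: str
    to_type: str
    original: Any
    coerced: Any


@dataclass
class CoercionResult:
    """The final args plus the list of changes that produced them."""
    args: dict
    coercions: list = field(default_factory=list)


record = Coercion(key="limit", from_type="str", to_type="int", original="20", coerced=20)
result = CoercionResult(args={"limit": 20}, coercions=[record])
```

Keeping both the original and the coerced value on the record means a log line can be replayed later without access to the raw model output.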

In development, you can print result.coercions after every tool call and see exactly what the model got wrong. This is much faster than adding print statements to your tool functions or reading raw LLM logs.

In production, you can log all coercions to a file or metric:

result = coerce_args(raw_args, schema)

for c in result.coercions:
    metrics.increment("tool_arg_coercion", tags={
        "tool": tool_name,
        "key": c.key,
        "from": c.from_type,
        "to": c.to_type,
    })

Now you have a signal. If limit starts coming back as a string 40% of the time instead of 5%, something changed in how the model interprets your tool description. You want to know before users start seeing weird behavior. The coercions that were invisible before are now countable.
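
Here is one way to turn those counts into a drift check. This is a sketch under my own assumptions: the function names, the input shape (one list of coerced keys per call), and the 10-point tolerance are all hypothetical.

```python
from collections import Counter


def coercion_rates(calls: list) -> dict:
    """calls: one entry per tool call, each a list of keys that needed coercion."""
    if not calls:
        return {}
    counts = Counter(key for keys in calls for key in set(keys))
    return {key: n / len(calls) for key, n in counts.items()}


def drifted(rates: dict, baseline: dict, tolerance: float = 0.10) -> list:
    """Return keys whose coercion rate exceeds its baseline by more than tolerance."""
    return sorted(k for k, r in rates.items() if r - baseline.get(k, 0.0) > tolerance)


# Ten calls: "limit" needed coercion in four of them, "enabled" in one.
calls = [["limit"], ["limit"], ["limit", "enabled"], ["limit"], [], [], [], [], [], []]

rates = coercion_rates(calls)
alerts = drifted(rates, baseline={"limit": 0.05, "enabled": 0.05})
```

With a 5% baseline, limit at 40% trips the check and enabled at 10% does not.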

This is the design choice that makes the library worth adding even in cases where a simpler approach would work. Silent coercion is a black box. Recorded coercion is observable.

The library also refuses ambiguous coercions rather than guessing. If the schema says boolean and the model sends "maybe", the library raises CoercionError. It does not return False or None or invent a default. A bad coercion that silently goes through is worse than an error that surfaces the problem.
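
The built-in bool() shows why guessing is dangerous: any non-empty string is truthy. Below is a sketch of a strict parser that refuses instead. The CoercionError class here is a local stand-in so the example runs on its own, and the set of accepted spellings is my assumption.

```python
class CoercionError(ValueError):
    """Local stand-in for the library's error type."""


def strict_bool(value: str) -> bool:
    """Accept only unambiguous spellings; raise on everything else."""
    lowered = value.strip().lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise CoercionError(f"cannot coerce {value!r} to boolean")


# What the built-in does with the same inputs: both are True.
naive_false = bool("false")
naive_maybe = bool("maybe")

try:
    strict_bool("maybe")
    refused = False
except CoercionError:
    refused = True
```

The naive version would have silently enabled a flag the model asked to turn off.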

When this is useful

When your tool functions have typed signatures and the model occasionally returns mismatched types, coercion cuts out the retry loop entirely.

When you are migrating a tool from one schema version to another and cannot update all callers at once, coercion gives you a compatibility shim that also logs what it had to fix.

When you want drift detection without building a separate pipeline, logging result.coercions gives you a lightweight signal that the model's output distribution has shifted.

When you are running fine-tuning or evaluation experiments and want to track how often the base model gets arg types right, the coercion log gives you a per-field type-accuracy rate.

When NOT to use it

If your model is consistently returning the wrong types for the same fields, the root cause is probably the tool description. Fix the description first. Coercion treats a symptom. If the symptom appears 90% of the time, the description is the problem.

If your tool accepts arbitrary input and you are not working from a fixed schema, the library cannot help. It needs a schema or a typed function signature to know what the target types are.

If you need the coercion to handle nested objects recursively, check the current version first. The library handles top-level keys; support for nested objects and arrays of objects is limited. Open an issue if you need it.

Install

pip install tool-arg-coerce-py

GitHub: MukundaKatta/tool-arg-coerce-py

39 tests, zero dependencies.

Siblings

These four libraries solve adjacent problems in the arg-processing pipeline. They compose. The typical order is: rename keys, fill defaults, coerce types, validate.

Lib	Boundary	Repo
tool-arg-rename	Rename arg keys before processing	MukundaKatta/tool-arg-rename
tool-arg-defaults	Fill missing args before coercion	MukundaKatta/tool-arg-defaults
agentvet	Validate args after coercion	MukundaKatta/agentvet
tool-arg-fuzzy	Fuzzy-match enum values, complementary	MukundaKatta/tool-arg-fuzzy
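
The order above (rename, fill defaults, coerce, validate) can be sketched as a list of functions applied in sequence. Every function below is a hypothetical stand-in written for this example. None of them is the real API of the sibling libraries.

```python
from typing import Any, Callable

Step = Callable[[dict], dict]


def run_pipeline(args: dict, steps: list) -> dict:
    """Apply each step to the output of the previous one."""
    for step in steps:
        args = step(args)
    return args


# Hypothetical stand-ins for the four stages.
def rename(args: dict) -> dict:
    return {("rate_limit" if k == "rateLimit" else k): v for k, v in args.items()}


def fill_defaults(args: dict) -> dict:
    return {"enabled": True, **args}  # explicit args win over defaults


def coerce(args: dict) -> dict:
    return {**args, "rate_limit": int(args["rate_limit"])}


def validate(args: dict) -> dict:
    if args["rate_limit"] <= 0:
        raise ValueError("rate_limit must be positive")
    return args


final = run_pipeline({"rateLimit": "20"}, [rename, fill_defaults, coerce, validate])
```

The order matters: coercion needs the final key names, and validation needs the final types.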

There is also a Rust sibling crate tool-arg-coerce that produces identical output. If you are running a mixed Python and Rust stack, both libraries use the same coercion rules and the same CoercionResult shape, so logs from both sides are comparable.

What is next

A few things I want to add:

A pipeline helper that runs rename, defaults, coerce, and validate in the right order with one call. Right now you wire up each step manually. That is fine for control, but annoying for the common case.

Nested object support. Top-level coercion covers most real-world cases. But some tool schemas have nested objects and arrays of objects, and the coercer should handle those the same way.

A coercion rate summary method on CoercionResult. Something that returns {"limit": 0.95, "enabled": 0.12} to show what fraction of calls needed coercion per field. Useful for evaluation reports.

The repo is open. Issues and PRs welcome.
