Mukunda Rao Katta

Posted on May 25

tool-arg-coerce-py: Coerce LLM Tool Args to Expected Types Before They Break Things

#hermeschallenge #ai #python #agents

The Arg That Broke the Tool

The schema said count should be an integer. The LLM sent "5". A string. The tool called range("5"), and the agent crashed.
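The failure is easy to reproduce without any model in the loop. In this minimal sketch, `repeat_greeting` is a hypothetical tool invented for illustration, not part of any library:

```python
def repeat_greeting(count):
    # Hypothetical tool that trusts `count` to be an int, as the schema promises.
    return ["hello" for _ in range(count)]


ok = repeat_greeting(5)  # works with a real integer

try:
    repeat_greeting("5")  # what the model actually sent
    error = None
except TypeError as exc:
    # range() rejects a str argument with a TypeError
    error = str(exc)

print(error)
```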

This is not a rare edge case. It is a regular occurrence. Language models are trained on text. When they generate tool call arguments, they sometimes produce strings where your schema expects numbers, booleans, or arrays. The model is not broken. The contract between model output and your code is just looser than you assumed.

You have a few options. You can add defensive int() casts everywhere in your tool implementations. That works until you forget one, and that one is always the one that runs in production. You can write a custom validator that runs before each tool call. That is the right idea, but now you have custom code to maintain for every tool. You can use Pydantic models with coercion on. That is the best option if you are already using Pydantic, and you should stop reading here if you are.

If you are not using Pydantic, you can use tool-arg-coerce-py. It reads the JSON Schema for a tool, compares it to the actual args the model produced, and coerces mismatches before they reach your code. Every conversion is recorded in a CoercionResult. Nothing happens silently. You always know what was fixed.

Shape of the Fix

from tool_arg_coerce_py import coerce_args, CoercionResult

schema = {
    "type": "object",
    "properties": {
        "count": {"type": "integer"},
        "verbose": {"type": "boolean"},
        "tags": {"type": "array", "items": {"type": "string"}},
        "threshold": {"type": "number"},
    },
    "required": ["count"]
}

raw_args = {
    "count": "5",         # LLM sent a string
    "verbose": "true",    # LLM sent a string
    "tags": "urgent",     # LLM sent a plain string instead of array
    "threshold": "0.75",  # LLM sent a string
}

result: CoercionResult = coerce_args(raw_args, schema)

print(result.coerced)
# {"count": 5, "verbose": True, "tags": ["urgent"], "threshold": 0.75}

print(result.conversions)
# [
#   ("count", "str->int"),
#   ("verbose", "str->bool"),
#   ("tags", "str->array"),
#   ("threshold", "str->float"),
# ]

print(result.unchanged)
# []  -- all args needed coercion this time

Pass result.coerced to your tool instead of the raw args. The conversions list tells you exactly what changed. You can log it, alert on it, or ignore it.
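One way to wire the log step in is a small wrapper around the tool call. This is a sketch only: the `CoercionResult` below is a local stand-in that mirrors the three fields shown above (the real class may carry more), and `run_tool` and `make_range` are hypothetical names.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("tool_args")


@dataclass
class CoercionResult:
    # Stand-in mirroring the fields shown above; the real class may differ.
    coerced: dict
    conversions: list = field(default_factory=list)
    unchanged: list = field(default_factory=list)


def run_tool(tool, result, tool_name):
    """Log every recorded conversion, then call the tool with the coerced args."""
    for arg_name, label in result.conversions:
        logger.warning("tool=%s arg=%s coerced %s", tool_name, arg_name, label)
    return tool(**result.coerced)


def make_range(count):
    # Hypothetical tool used only for this demo.
    return list(range(count))


demo = CoercionResult(coerced={"count": 3}, conversions=[("count", "str->int")])
output = run_tool(make_range, demo, "make_range")
print(output)  # [0, 1, 2]
```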

What It Does NOT Do

It does not validate that required fields are present. Use a schema validator like jsonschema or pydantic for that. The coercion step runs first; validation runs after.
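The ordering matters because a validator that runs first would reject "5" as a non-integer before coercion gets a chance to fix it. The sketch below shows the two-step shape with stand-in functions: `coerce_top_level` handles only str to int, and `missing_required` is a deliberately tiny substitute for a real validator such as jsonschema.

```python
def coerce_top_level(args, schema):
    # Stand-in for the coercion step: only handles str -> int here.
    out = dict(args)
    for name, spec in schema["properties"].items():
        if spec.get("type") == "integer" and isinstance(out.get(name), str):
            out[name] = int(out[name])
    return out


def missing_required(args, schema):
    # Stand-in for a real validator: checks required fields only.
    return [name for name in schema.get("required", []) if name not in args]


schema = {
    "type": "object",
    "properties": {"count": {"type": "integer"}, "limit": {"type": "integer"}},
    "required": ["count", "limit"],
}

coerced = coerce_top_level({"count": "5"}, schema)  # step 1: fix types
missing = missing_required(coerced, schema)         # step 2: validate
print(coerced, missing)  # {'count': 5} ['limit']
```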

It does not coerce between incompatible types. If the schema says integer and the LLM sends "hello", you get a CoercionError, not a silent failure. The library only converts when the conversion is lossless and unambiguous.
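In an agent loop, a failed coercion is usually something to report back to the model rather than a reason to crash. The sketch below assumes that pattern; `CoercionError` here is a local stand-in for the exception described above, and `coerce_int` and `safe_call` are hypothetical helpers.

```python
class CoercionError(Exception):
    """Local stand-in for the exception the article describes."""


def coerce_int(value):
    # Simplified stand-in for the integer branch of the coercion step.
    try:
        return int(value)
    except (TypeError, ValueError):
        raise CoercionError(f"Cannot coerce {value!r} to integer") from None


def safe_call(tool, raw_count):
    """Turn a coercion failure into a message the agent can return to the model."""
    try:
        return {"ok": True, "value": tool(coerce_int(raw_count))}
    except CoercionError as exc:
        return {"ok": False, "error": str(exc)}


good = safe_call(lambda n: n * 2, "5")
bad = safe_call(lambda n: n * 2, "hello")
print(good)  # {'ok': True, 'value': 10}
print(bad)   # {'ok': False, 'error': "Cannot coerce 'hello' to integer"}
```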

It does not recurse into nested object properties in v0.1.0. Top-level properties only. Nested schemas are passed through unchanged. This is a known limitation. If you have deeply nested tool schemas, you will need to flatten them or wait for a future release.
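If you go the flattening route, one possible workaround is to flatten both the args and the schema to dotted keys, coerce at the top level, then rebuild the nested shape. Everything in this sketch is an assumption: the dotted-key convention, the `flatten` and `unflatten` helpers, and the inline comprehension that stands in for the coercion call. It also breaks if your real keys contain dots.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into one level with dotted keys."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{path}."))
        else:
            flat[path] = value
    return flat


def unflatten(flat):
    """Rebuild the nested dict from dotted keys."""
    nested = {}
    for path, value in flat.items():
        *parents, leaf = path.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


nested_args = {"query": "cats", "paging": {"limit": "10", "offset": "0"}}
# Hand-written flat schema; dotted names are a convention of this sketch only.
flat_schema = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "paging.limit": {"type": "integer"},
        "paging.offset": {"type": "integer"},
    },
}

flat_args = flatten(nested_args)
# Stand-in for the top-level coercion step (str -> int only).
coerced_flat = {
    key: int(value)
    if flat_schema["properties"][key]["type"] == "integer" and isinstance(value, str)
    else value
    for key, value in flat_args.items()
}
restored = unflatten(coerced_flat)
print(restored)  # {'query': 'cats', 'paging': {'limit': 10, 'offset': 0}}
```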

It does not modify your JSON Schema. The schema is read-only input. The library never rewrites your contract to match the model's output.

Inside the Lib

The coercion logic for each type is kept small and explicit.

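def _coerce_value(value, target_type):
    """Return (value, label); label is None when no coercion was needed."""
    if target_type == "integer":
        if isinstance(value, str):
            try:
                return int(value), "str->int"
            except ValueError:
                # "hello" must surface as a CoercionError, not a bare ValueError
                raise CoercionError(f"Cannot coerce '{value}' to integer") from None
        if isinstance(value, float) and value.is_integer():
            return int(value), "float->int"

    elif target_type == "boolean":
        if isinstance(value, str):
            # Case-insensitive match, so "True" and "FALSE" both work
            if value.lower() in ("true", "1", "yes"):
                return True, "str->bool"
            if value.lower() in ("false", "0", "no"):
                return False, "str->bool"
            raise CoercionError(f"Cannot coerce '{value}' to boolean")

    elif target_type == "array":
        # A lone scalar is wrapped in a single-element list
        if isinstance(value, str):
            return [value], "str->array"
        if isinstance(value, (int, float, bool)):
            return [value], "scalar->array"

    elif target_type == "number":
        if isinstance(value, str):
            try:
                return float(value), "str->float"
            except ValueError:
                raise CoercionError(f"Cannot coerce '{value}' to number") from None

    return value, None  # no coercion needed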

The function returns the coerced value and a short conversion label. The label goes into CoercionResult.conversions. When conversion is not needed, the label is None and the arg goes into CoercionResult.unchanged.

The outer loop walks the schema properties, calls _coerce_value for each arg, and builds the result object. The whole library is under 200 lines. No external dependencies.
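The outer loop is not shown in this post, so the following is a guess at its shape rather than the library's actual code. The names `SketchResult` and `coerce_args_sketch` are made up to keep that clear, and `_coerce_value_sketch` is a trimmed stand-in that handles str to int only.

```python
from dataclasses import dataclass, field


@dataclass
class SketchResult:
    # Same three fields the article attributes to CoercionResult.
    coerced: dict = field(default_factory=dict)
    conversions: list = field(default_factory=list)
    unchanged: list = field(default_factory=list)


def _coerce_value_sketch(value, target_type):
    # Trimmed stand-in for the per-type function: str -> int only.
    if target_type == "integer" and isinstance(value, str):
        return int(value), "str->int"
    return value, None


def coerce_args_sketch(raw_args, schema):
    """Walk top-level schema properties and record what changed."""
    result = SketchResult(coerced=dict(raw_args))
    for name, spec in schema.get("properties", {}).items():
        if name not in raw_args:
            continue  # missing args are left for the validator
        value, label = _coerce_value_sketch(raw_args[name], spec.get("type"))
        result.coerced[name] = value
        if label is None:
            result.unchanged.append(name)
        else:
            result.conversions.append((name, label))
    return result


demo_schema = {
    "type": "object",
    "properties": {"count": {"type": "integer"}, "name": {"type": "string"}},
}
demo = coerce_args_sketch({"count": "5", "name": "x"}, demo_schema)
print(demo.coerced)      # {'count': 5, 'name': 'x'}
print(demo.conversions)  # [('count', 'str->int')]
print(demo.unchanged)    # ['name']
```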

CoercionResult is a plain dataclass. You can serialize it to JSON for logging.

import json
print(json.dumps({
    "coerced": result.coerced,
    "conversions": result.conversions,
}))

When Useful / When Not

Useful when you have tools with strict parameter types and the LLM producing them is not perfectly consistent. Useful for multi-model setups where some models are more type-strict than others. Useful when you want an audit trail of type corrections for debugging or model evaluation.

Not useful if your tools already use Pydantic models with automatic coercion, since Pydantic handles most of the same conversions. Not useful for deeply nested schemas in v0.1.0. Not useful if you want silent coercion with no record, because this library always records what it changes.

The best fit is a lightweight agent that does not use Pydantic, has a small set of tools with typed parameters, and wants a clean log of every type fix the LLM caused.

There is also a diagnostic use case. Run it alongside your existing tool validation for a while. Check the conversions log. If the model constantly sends "true" instead of True for one tool's boolean arg, that tells you something about how it is reading the tool description. You might improve the description to reduce the frequency, then verify the fix by watching conversions drop.
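A tally per tool and argument is enough to spot the repeat offenders. In this sketch, `call_log` is hypothetical data in the shape you would get by saving each call's conversions list next to the tool name:

```python
from collections import Counter

# Hypothetical log: one (tool_name, conversions) entry per tool call.
call_log = [
    ("search", [("verbose", "str->bool")]),
    ("search", [("verbose", "str->bool"), ("limit", "str->int")]),
    ("fetch", []),
    ("search", [("verbose", "str->bool")]),
]

# Count how often each (tool, arg, conversion) combination shows up.
tally = Counter(
    (tool, arg, label)
    for tool, conversions in call_log
    for arg, label in conversions
)

for (tool, arg, label), n in tally.most_common():
    print(f"{tool}.{arg}: {label} x{n}")
```

Run the same tally before and after a tool description change to see whether the count for that argument actually drops.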

Install

pip install tool-arg-coerce-py

The PyPI release is still pending, so the command above will not work yet. Clone from GitHub in the meantime:

git clone https://github.com/MukundaKatta/tool-arg-coerce-py
cd tool-arg-coerce-py
pip install -e .

No runtime dependencies. Python 3.10 and above.

The test suite has 39 tests. Run them with:

pytest tests/

Siblings

Library	What it does	Language
`agentvet-rs`	Tool arg schema validation	Rust
`tool-arg-defaults`	Fill missing args from schema defaults	Python
`tool-arg-fuzzy`	Fuzzy-match LLM enum args to valid values	Python
`tool-arg-rename`	Convert kwarg case conventions	Python
`tool-schema-from-fn`	Function signature to tool schema	Python
`tool-result-validator`	Validate tool output against schema	Python

These libraries address different parts of the tool call lifecycle. Coerce runs before your tool executes. Defaults fills gaps. Fuzzy fixes near-miss enum values. Rename handles casing. Validator checks the output after execution.

What Is Next

v0.2.0 targets:

Nested object property coercion. The current implementation only handles top-level properties. Recursion into nested schemas is the most-requested missing feature.
Array item coercion. If the schema specifies "items": {"type": "integer"} and the array contains string elements, coerce each item.
Async variant. A non-blocking async_coerce_args for agents that pipeline coercion alongside other async work.
Integration example with tool-schema-from-fn. Generate the schema from a function signature, then coerce incoming args automatically.
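To make the array item target concrete, here is one way per-item coercion could look. This is not the planned v0.2.0 design, only an illustration; `coerce_items` and its index-based conversion labels are assumptions.

```python
def coerce_items(values, item_type):
    """Coerce each string element of a list to the schema's item type."""
    casts = {"integer": int, "number": float}
    cast = casts.get(item_type)
    coerced, conversions = [], []
    for index, item in enumerate(values):
        if cast is not None and isinstance(item, str):
            coerced.append(cast(item))  # raises ValueError on e.g. "abc"
            conversions.append((index, f"str->{cast.__name__}"))
        else:
            coerced.append(item)  # already typed, or no cast for this type
    return coerced, conversions


items, changes = coerce_items(["1", 2, "3"], "integer")
print(items)    # [1, 2, 3]
print(changes)  # [(0, 'str->int'), (2, 'str->int')]
```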

The library exists because type errors from LLM-generated tool args are a persistent low-level friction in agent development. Most teams handle it ad hoc, with scattered int() casts and silent conversions that never get logged. Having a shared, tested, auditable solution removes that friction without adding a heavy dependency.

The 39 tests in the suite are not just coverage theater. Several of them document edge cases that were surprising to find: "True" (capital T) is handled correctly, "FALSE" (all caps) is handled correctly, a string "1" for a boolean maps to True, and "yes" and "no" are also valid boolean coercions because some models use natural-language boolean strings.
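Those boolean cases fit in a short table-driven check. The `str_to_bool` helper below is a stand-in that mirrors the boolean branch shown earlier, not the library's own function, and it raises a plain ValueError to stay self-contained:

```python
def str_to_bool(value):
    # Mirrors the boolean branch shown earlier; case-insensitive matching.
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise ValueError(f"Cannot coerce {value!r} to boolean")


cases = {"True": True, "FALSE": False, "1": True, "0": False, "yes": True, "no": False}
for text, expected in cases.items():
    assert str_to_bool(text) is expected, text
```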

Pull requests welcome at MukundaKatta/tool-arg-coerce-py. Part of the Hermes Agent Challenge sprint.
