Mukunda Rao Katta

Posted on May 25

llm-tool-arg-coerce: Stop LLM String Arguments From Crashing Your Tools

#hermeschallenge #ai #python #agents

The model sent "42". Your tool expected 42. Your app crashed.

You built a tool that queries a database by record ID. The function signature says record_id: int. You pass it to your LLM. The LLM calls the tool. It sends {"record_id": "42"}. Python gets a string. Your database query fails with a type error. The LLM retries. It sends "42" again. You have a loop.

This happens constantly with tool use. LLMs generate JSON as text. JSON itself distinguishes 42 from "42", but models frequently quote numbers and booleans anyway, and the parser faithfully hands you a string. Your tool cares about the difference. The gap between what the LLM sends and what your function expects is real, and it falls on you to bridge it.
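A quick check with the standard library shows where the string comes from: the parser keeps whatever type the model wrote. The two payloads below are made-up examples, not captured model output.

```python
import json

# Two payloads a model might emit for the same tool call (illustrative values).
quoted = json.loads('{"record_id": "42"}')
unquoted = json.loads('{"record_id": 42}')

print(type(quoted["record_id"]).__name__)    # str
print(type(unquoted["record_id"]).__name__)  # int
```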

You could write coercion logic in every tool function. Or you could do it once and call it everywhere.

llm-tool-arg-coerce coerces LLM tool arguments to the types your functions actually expect. It handles str to int, float, bool, list, and dict. It works from a schema you provide or directly from a function signature. It returns a CoercionResult that records every conversion that happened.

The shape of the fix

Without coercion, a tool call might look like this:

def query_records(record_id: int, limit: int = 10) -> list:
    return db.query(record_id=record_id, limit=limit)

# LLM sends this:
args = {"record_id": "42", "limit": "10"}

# Python does not enforce the int hints, so the strings reach db.query and fail there:
query_records(**args)

With llm-tool-arg-coerce:

from llm_tool_arg_coerce import coerce

result = coerce(
    fn=query_records,
    args={"record_id": "42", "limit": "10"},
)

print(result.coerced)      # {"record_id": 42, "limit": 10}
print(result.conversions)  # [Conversion(key="record_id", from_type="str", to_type="int", original="42", coerced=42), ...]
print(result.failed)       # []

Then call the function with the coerced args:

query_records(**result.coerced)

For tools defined by a schema rather than a function, use schema mode:

schema = {
    "record_id": {"type": "integer"},
    "limit": {"type": "integer"},
    "include_deleted": {"type": "boolean"},
}

result = coerce(
    schema=schema,
    args={"record_id": "42", "limit": "10", "include_deleted": "true"},
)

print(result.coerced)
# {"record_id": 42, "limit": 10, "include_deleted": True}

The result.conversions list tells you exactly what was changed:

[
    Conversion(key="record_id", from_type="str", to_type="int", original="42", coerced=42),
    Conversion(key="limit", from_type="str", to_type="int", original="10", coerced=10),
    Conversion(key="include_deleted", from_type="str", to_type="bool", original="true", coerced=True),
]

If a coercion fails (say the model sends "not_a_number" for an int field), it shows up in result.failed with the original value and the error. You decide whether to raise, skip, or return an error message to the LLM.
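One of those options, returning an error message to the LLM, might look like the sketch below. The FailedCoercion class here is a stand-in defined locally so the example runs on its own; its field names (key, original, error) are assumptions, and failures_to_tool_error is a hypothetical helper rather than something the library ships.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FailedCoercion:
    # Stand-in for the library's failure record; these field names are assumptions.
    key: str
    original: Any
    error: str


def failures_to_tool_error(failed: list[FailedCoercion]) -> str:
    """Build a short message the model can act on when it retries."""
    lines = [
        f"- {f.key}: could not use {f.original!r} ({f.error})"
        for f in failed
    ]
    return "Invalid tool arguments:\n" + "\n".join(lines)


failed = [
    FailedCoercion(key="record_id", original="not_a_number", error="expected an integer")
]
message = failures_to_tool_error(failed)
print(message)
```

Naming the field and echoing the rejected value gives the model something concrete to correct, which can help break the retry loop described at the top of the post.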

What it does NOT do

llm-tool-arg-coerce does not validate that the coerced value is in range or matches business rules. Coercing "999999" to 999999 succeeds even if your database has no record with that ID. Validation is a separate step.

It does not fix missing required arguments. If the LLM omits a field your function requires, the coercion of the present fields still succeeds, but calling the function will still raise TypeError. Use tool-arg-defaults for filling missing args.

It does not strip extra arguments the LLM sends. If the LLM sends a field that is not in your function signature or schema, that field passes through unchanged in result.coerced. Filter it yourself before calling the function.
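A minimal sketch of that filtering step, using only inspect from the standard library. keep_known_args is a hypothetical helper, and the query_records body is a stand-in for the real database call.

```python
import inspect


def keep_known_args(fn, args: dict) -> dict:
    """Drop keys that fn does not declare as parameters."""
    params = inspect.signature(fn).parameters
    # If fn accepts **kwargs, every key is legal, so nothing is dropped.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(args)
    return {k: v for k, v in args.items() if k in params}


def query_records(record_id: int, limit: int = 10) -> list:
    return [record_id] * limit  # stand-in for the real database call


# "reasoning" is an extra field the model added on its own.
coerced = {"record_id": 42, "limit": 2, "reasoning": "user asked for it"}
safe = keep_known_args(query_records, coerced)
print(safe)  # {'record_id': 42, 'limit': 2}
```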

It does not recurse into nested objects by default. Coercion is applied to top-level fields only. Pass deep=True for recursive coercion, but be cautious with complex nested types.

Inside the library: design choices

The function-signature mode uses inspect.signature() to extract parameter names and annotations. Only parameters with explicit type annotations are coerced. Parameters annotated as Any or without annotations are passed through unchanged.
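To make the signature-driven approach concrete, here is a minimal sketch of the idea using only the standard library. It is not the library's implementation: coerce_from_signature is a hypothetical name, it handles only int and float, and failures are collected as plain tuples.

```python
import inspect


def coerce_from_signature(fn, args: dict):
    """Convert str values to the annotated int or float type; pass everything else through."""
    params = inspect.signature(fn).parameters
    coerced, failed = {}, []
    for key, value in args.items():
        param = params.get(key)
        # Unknown keys and unannotated parameters have no usable target type.
        target = param.annotation if param is not None else None
        if target in (int, float) and isinstance(value, str):
            try:
                coerced[key] = target(value)
            except ValueError as exc:
                failed.append((key, value, str(exc)))
        else:
            coerced[key] = value
    return coerced, failed


def query_records(record_id: int, limit: int = 10, note=None) -> list:
    return []


coerced, failed = coerce_from_signature(
    query_records, {"record_id": "42", "limit": "ten", "note": "7"}
)
print(coerced)  # {'record_id': 42, 'note': '7'}
print(failed)   # one entry, for 'limit'
```

The unannotated note parameter keeps its string value, mirroring the pass-through rule described above.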

The schema mode reads a dict with keys as field names and values as dicts with a type field. Supported types are integer, number, boolean, array, and object. This is the same shape as the properties object inside the JSON Schema that OpenAI and Anthropic tool definitions use.
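If your tools are already declared in a provider's format, the flat mapping can be pulled out of the definition rather than retyped. The sketch below assumes the schema-mode dict corresponds to the properties object of the tool's JSON Schema. flat_schema is a hypothetical helper, not part of the library, and the two tool dicts follow the OpenAI Chat Completions and Anthropic Messages layouts.

```python
# Field definitions shared by both tool declarations below.
properties = {
    "record_id": {"type": "integer"},
    "include_deleted": {"type": "boolean"},
}

# OpenAI Chat Completions style tool definition.
openai_tool = {
    "type": "function",
    "function": {
        "name": "query_records",
        "parameters": {"type": "object", "properties": properties, "required": ["record_id"]},
    },
}

# Anthropic Messages style tool definition.
anthropic_tool = {
    "name": "query_records",
    "input_schema": {"type": "object", "properties": properties, "required": ["record_id"]},
}


def flat_schema(tool: dict) -> dict:
    """Return the {field: {"type": ...}} mapping from either layout."""
    if "input_schema" in tool:
        return tool["input_schema"]["properties"]
    return tool["function"]["parameters"]["properties"]


print(flat_schema(openai_tool))
print(flat_schema(anthropic_tool))
```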

Bool coercion deserves special mention. The LLM might send "true", "True", "1", "yes", or 1 to mean true. The library handles all common representations. It does not guess: anything not in the known-truthy or known-falsy set goes to result.failed.
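The "known sets, no guessing" rule can be sketched in a few lines. The exact members of TRUTHY and FALSY below are assumptions for illustration (the library's real sets may differ), and parse_bool is a hypothetical name that raises where the library would record a failure.

```python
# Assumed token sets; the library's actual sets may be larger or different.
TRUTHY = {"true", "1", "yes"}
FALSY = {"false", "0", "no"}


def parse_bool(value):
    """Return a bool for known representations; raise ValueError for anything else."""
    if isinstance(value, bool):
        return value
    if isinstance(value, int) and value in (0, 1):
        return bool(value)
    if isinstance(value, str):
        token = value.strip().lower()
        if token in TRUTHY:
            return True
        if token in FALSY:
            return False
    # No fallback to Python truthiness: "maybe" or 7 is an error, not True.
    raise ValueError(f"not a recognised boolean: {value!r}")


print(parse_bool("True"))  # True
print(parse_bool(1))       # True
print(parse_bool("no"))    # False
```

Avoiding bool(value) matters here: in Python, bool("false") is True because any non-empty string is truthy.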

List coercion handles two input shapes: a JSON array string like "[1, 2, 3]" and a comma-separated string like "1, 2, 3". Both are parsed to Python lists. The element type is not coerced unless you specify it in the schema.
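A rough sketch of the two input shapes, again with the standard library only. parse_list is a hypothetical name, and treating an empty string as an empty list is a choice made for this example, not documented library behaviour.

```python
import json


def parse_list(value):
    """Accept a list, a JSON array string, or a comma-separated string."""
    if isinstance(value, list):
        return value
    if not isinstance(value, str):
        raise ValueError(f"cannot convert {type(value).__name__} to list")
    text = value.strip()
    if text.startswith("["):
        # json.JSONDecodeError is a ValueError subclass, so malformed
        # arrays surface as ValueError to the caller.
        return json.loads(text)
    if not text:
        return []
    return [item.strip() for item in text.split(",")]


print(parse_list("[1, 2, 3]"))  # [1, 2, 3]
print(parse_list("1, 2, 3"))    # ['1', '2', '3']
```

In this sketch the JSON path yields whatever element types the JSON contained, while the comma-separated path always yields strings, which is why a declared element type is useful.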

The CoercionResult is a plain dataclass with no methods of its own. It stores coerced (dict), conversions (list of Conversion), failed (list of FailedCoercion), and a success flag that is True when failed is empty. This keeps the result inspectable without calling methods.
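For readers who want to picture the shape, here is a sketch of what such a result could look like. The Conversion fields follow the output shown earlier in this post; the FailedCoercion fields and the make_result helper are assumptions, and none of this is copied from the library's source.

```python
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class Conversion:
    key: str
    from_type: str
    to_type: str
    original: Any
    coerced: Any


@dataclass
class FailedCoercion:
    # Assumed field names for illustration.
    key: str
    original: Any
    error: str


@dataclass
class CoercionResult:
    coerced: dict
    conversions: list = field(default_factory=list)
    failed: list = field(default_factory=list)
    success: bool = True


def make_result(coerced, conversions, failed) -> CoercionResult:
    """Hypothetical constructor helper: success is True only when nothing failed."""
    return CoercionResult(coerced, conversions, failed, success=not failed)


result = make_result(
    {"record_id": 42},
    [Conversion("record_id", "str", "int", "42", 42)],
    [],
)
# Plain data means the whole result can be dumped for logging in one call.
print(asdict(result))
```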

The 38 tests cover every type conversion, both schema and function-sig modes, deeply nested objects with deep=True, all bool input representations, malformed JSON in list fields, and the case where the LLM sends a Python type that already matches (no conversion needed, no entry in conversions).

When this is useful, and when it is not

This is useful when:

You are using LLM tool use with typed Python functions and want a reliable bridge between JSON args and Python types.
You see recurring TypeError or ValueError crashes from tool calls in production.
You want an audit trail of every type conversion the LLM triggered, for debugging or logging.
You are building a generic tool dispatcher that runs any registered function with LLM-supplied args.

This is not the right tool when:

Your tools are already loosely typed and accept strings for everything. No coercion needed.
You want to validate business rules, not just types. Use Pydantic or a custom validator after coercion.
You want to auto-generate tool schemas from your functions. Use tool-schema-from-fn for that.
Your tool args are already properly typed by your LLM client's response parsing. Some clients do type-coerce JSON for you.

Install

pip install git+https://github.com/MukundaKatta/llm-tool-arg-coerce

Minimal example:

from llm_tool_arg_coerce import coerce

def add(a: int, b: int) -> int:
    return a + b

result = coerce(fn=add, args={"a": "3", "b": "7"})
# result.coerced == {"a": 3, "b": 7}
# result.success == True

output = add(**result.coerced)
# output == 10

Siblings in this series

Library	What it does
`tool-schema-from-fn`	Generate tool schema JSON from a Python function signature
`tool-arg-defaults`	Fill missing tool args from schema defaults before calling
`tool-arg-fuzzy`	Fuzzy-match LLM enum args when the value is close but not exact
`tool-arg-rename`	Convert kwarg case conventions (snake_case to camelCase and back)
`tool-result-validator`	Validate the tool's return value against a schema after the call

The typical pipeline in a typed tool dispatcher: rename args, fill defaults, coerce types, call the function, validate the result.
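The order of those steps can be sketched as a single dispatch function. The sibling libraries' APIs are not shown in this post, so every step below is a small stand-in written for this example; only the ordering is taken from the text.

```python
import re


def rename_args(args: dict) -> dict:
    """Stand-in rename step: camelCase keys to snake_case."""
    return {re.sub(r"(?<!^)(?=[A-Z])", "_", k).lower(): v for k, v in args.items()}


def fill_defaults(args: dict, defaults: dict) -> dict:
    """Stand-in defaults step: supplied args win over defaults."""
    return {**defaults, **args}


def coerce_types(args: dict, types: dict) -> dict:
    """Stand-in coercion step: convert str values for keys with a known type."""
    return {
        k: types[k](v) if k in types and isinstance(v, str) else v
        for k, v in args.items()
    }


def dispatch(fn, raw_args, defaults, types, result_type):
    args = rename_args(raw_args)          # 1. rename
    args = fill_defaults(args, defaults)  # 2. fill defaults
    args = coerce_types(args, types)      # 3. coerce types
    result = fn(**args)                   # 4. call
    if not isinstance(result, result_type):  # 5. validate the result
        raise TypeError(f"expected {result_type.__name__}, got {type(result).__name__}")
    return result


def query_records(record_id: int, limit: int = 10) -> list:
    return [record_id] * limit  # stand-in for the real database call


out = dispatch(
    query_records,
    raw_args={"recordId": "42"},
    defaults={"limit": 2},
    types={"record_id": int, "limit": int},
    result_type=list,
)
print(out)  # [42, 42]
```

Renaming comes first so that the later steps can look fields up by the names the function actually declares.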

What is next

The library needs a PyPI release. After that, the roadmap includes:

A decorator @coerced that wraps a function and coerces args on every call, so you do not have to call coerce() manually at each call site.
Support for Literal type annotations, so args constrained to a specific set of values get validated rather than just coerced.
A strict mode that raises on the first failed coercion rather than collecting all failures.
A Pydantic model input mode, so you can pass a Pydantic model class instead of a function or schema dict.
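Since @coerced is only a roadmap item, the following is a guess at how such a decorator could behave, not a preview of the library's API. It uses inspect and functools from the standard library, handles only int and float, and raises immediately on a bad value, which also happens to resemble the proposed strict mode.

```python
import functools
import inspect


def coerced(fn):
    """Hypothetical decorator: convert str arguments to annotated int/float types."""
    sig = inspect.signature(fn)

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        # Iterate over a copy so values can be replaced safely.
        for name, value in list(bound.arguments.items()):
            target = sig.parameters[name].annotation
            if target in (int, float) and isinstance(value, str):
                bound.arguments[name] = target(value)  # ValueError on bad input
        return fn(*bound.args, **bound.kwargs)

    return wrapper


@coerced
def add(a: int, b: int) -> int:
    return a + b


print(add(a="3", b="7"))  # 10
```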

If you have a tool dispatcher running in production, you already know this problem. The LLM sends "true". Your tool wants True. The fix is three lines. llm-tool-arg-coerce makes it one.

Part of the Hermes Agent Challenge series. All libraries are on GitHub under MukundaKatta.
