Mukunda Rao Katta

Posted on May 25

When the Model Says 'ASCENDING' and Your Enum Wants 'asc': tool-arg-fuzzy

#hermeschallenge #ai #python #agents

The model called sort_records with {"sort_order": "ASCENDING"}.

The tool expected one of ["asc", "desc"]. It rejected the call with a validation error.

The model retried with "ascending". Rejected again.

It tried "ascend". Rejected.

It tried "ascending order". Rejected.

Four failed attempts, a full retry loop burned, and the task never completed. The model was not confused about what to do. It knew it wanted ascending sort. It just could not land on the exact string the enum expected.

This is a common failure mode. Enums in tool schemas are case-sensitive and exact. The model often returns a value that is semantically correct but lexically wrong. "ASCENDING", "Asc", "ascending", "asc." are all reasonable renderings of the same intent. None of them passes. Only the exact string "asc" does.

The normal fix is to write validation logic that normalizes the input before comparing. That logic is the same in every project: lowercase, strip, prefix-check, substring-check. I got tired of writing it, so I extracted it into a library.

The shape of the fix

Before calling the tool, run the argument through the fuzzy matcher:

from tool_arg_fuzzy import fuzzy_match_enum

sort_order = fuzzy_match_enum("ASCENDING", ["asc", "desc"])
# Returns "asc"

Whether the model returns "desc", "DESC", "Desc", "descending", or "sort descending", the value resolves to "desc". The only requirement is that the match is unambiguous.

Wrap it in a normalization helper that intercepts the arguments before the function executes:

from tool_arg_fuzzy import fuzzy_match_enum, AmbiguousMatch, NoMatch

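def normalize_sort_args(args: dict) -> dict:
    # Resolve sort_order to a canonical enum value; other args pass through untouched.
    if "sort_order" in args:
        try:
            args["sort_order"] = fuzzy_match_enum(
                args["sort_order"],
                ["asc", "desc"]
            )
        except AmbiguousMatch as e:
            # Chain the original exception so the traceback shows the fuzzy step.
            raise ValueError(f"Ambiguous sort_order: {e}") from e
        except NoMatch as e:
            raise ValueError(f"Unknown sort_order: {e}") from e
    return args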

Or apply it directly inside the tool function before the logic runs:

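from tool_arg_fuzzy import fuzzy_match_enum

def sort_records(table: str, sort_order: str):
    sort_order = fuzzy_match_enum(sort_order, ["asc", "desc"])
    # `db` stands for the application's own database handle.
    return db.query(table, order=sort_order)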

Both approaches work. The key point is that fuzzy resolution happens at the boundary, before the value reaches business logic.
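
The same boundary can also be expressed as an actual decorator. The following is only a sketch: fuzzy_enums and prefix_only are hypothetical names, not part of tool-arg-fuzzy, and the resolver is passed in as a parameter so the example runs on its own. In real use you would presumably pass fuzzy_match_enum instead of the stand-in.

```python
import functools
from typing import Callable


def fuzzy_enums(resolver: Callable[[str, list], str], **enum_fields: list):
    """Hypothetical decorator: resolve the named keyword args before the tool runs."""
    def decorate(tool):
        @functools.wraps(tool)
        def wrapper(**kwargs):
            for name, allowed in enum_fields.items():
                # Only touch string values; wrong types are left for other layers.
                if isinstance(kwargs.get(name), str):
                    kwargs[name] = resolver(kwargs[name], allowed)
            return tool(**kwargs)
        return wrapper
    return decorate


def prefix_only(value: str, allowed: list) -> str:
    """Stand-in resolver (assumption): accepts input that starts with exactly one enum value."""
    key = value.strip().lower()
    hits = [a for a in allowed if key.startswith(a)]
    if len(hits) != 1:
        raise ValueError(f"cannot resolve {value!r} against {allowed}")
    return hits[0]


@fuzzy_enums(prefix_only, sort_order=["asc", "desc"])
def sort_records(table: str, sort_order: str) -> str:
    return f"{table} ordered {sort_order}"


print(sort_records(table="users", sort_order="ASCENDING"))  # users ordered asc
```

The wrapper takes keyword arguments only, which fits the common case where the model's tool call arrives as a dict of named arguments.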

What it does NOT do

It does not fix wrong types. If the model passes 42 for a string enum field, that is a type error. Use tool-arg-coerce-py for that.
It does not validate that required args are present. Use agentvet for full arg validation.
It does not generate the enum list. Use tool-schema-from-fn to extract Literal["asc", "desc"] type hints into a JSON Schema enum automatically.
It does not match on semantic similarity without string overlap. "ascending" fuzzy-matches "asc" because "asc" is a prefix of "ascending". "rising" does not match "asc" because the two strings share no prefix or substring relationship. This is not embedding similarity. It is pure string logic.

Inside the lib: conservative ambiguity

The matching cascade runs in order. Each level only fires if the previous level found no match.

Exact match. "asc" == "asc".
Case-insensitive match. "ASC".lower() == "asc".
Prefix match. The input starts with the enum value, or the enum value starts with the input.
Substring match. The input contains the enum value, or the enum value contains the input.

The first level that produces exactly one match returns it.

The interesting rule is what happens when a level produces more than one match. The library raises AmbiguousMatch and stops.

from tool_arg_fuzzy import fuzzy_match_enum, AmbiguousMatch

try:
    fuzzy_match_enum("cen", ["center", "central"])
except AmbiguousMatch as e:
    print(e)
# AmbiguousMatch: 'cen' matches multiple values: ['center', 'central']

Both "center" and "central" start with "cen". The library refuses to pick one.

This is the design choice that took the most thought. The options were:

Pick the shortest match (most specific).
Pick the first match (deterministic but arbitrary).
Pick the highest-ranked match using edit distance.
Raise and let the caller decide.

The problem with picking silently is that the caller does not know a guess was made. A resolved value looks identical whether it came from a clean match or an ambiguous one. If the guess is wrong, the tool executes with incorrect input and nothing in the call stack shows that fuzzy resolution was even involved.

An AmbiguousMatch exception is catchable. The caller can log it, surface it back to the model, or use a fallback strategy. Silent wrong resolution is not recoverable without replaying the whole call.

So the library raises. An explicit failure is better than a quiet wrong answer.
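
To make the cascade and its ambiguity rule concrete, here is a minimal re-implementation of the four levels as described in this section. It is a sketch, not the library's source: cascade_match is a hypothetical name, the two exception classes are local stand-ins for the library's own, and it assumes the prefix and substring levels compare stripped, lowercased strings.

```python
class NoMatch(ValueError):
    """Stand-in: no enum value is related to the input."""


class AmbiguousMatch(ValueError):
    """Stand-in: several enum values matched at the same level."""


def cascade_match(value: str, choices: list) -> str:
    """Resolve `value` against `choices`: exact, case-insensitive, prefix, substring."""
    needle = value.strip().lower()
    if not needle:
        # An empty string is a prefix of everything; treat it as no match.
        raise NoMatch(f"{value!r} matches none of {choices}")
    levels = [
        lambda c: c == value,                                   # 1. exact
        lambda c: c.lower() == needle,                          # 2. case-insensitive
        lambda c: needle.startswith(c.lower())
        or c.lower().startswith(needle),                        # 3. prefix, either direction
        lambda c: c.lower() in needle or needle in c.lower(),   # 4. substring, either direction
    ]
    for level in levels:
        hits = [c for c in choices if level(c)]
        if len(hits) == 1:
            return hits[0]
        if len(hits) > 1:
            # Refuse to guess: stop at the first level that is ambiguous.
            raise AmbiguousMatch(f"{value!r} matches multiple values: {hits}")
    raise NoMatch(f"{value!r} matches none of {choices}")


print(cascade_match("ASCENDING", ["asc", "desc"]))        # asc
print(cascade_match("sort descending", ["asc", "desc"]))  # desc
```

Note that an ambiguous level ends the search. The sketch never falls through to a looser level in the hope of finding a single hit there.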

When this is useful

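Tools that accept string enums defined by external APIs. The API defines ["us-east-1", "us-west-2", "eu-central-1"] and the model returns "us-east". The prefix level resolves it cleanly to "us-east-1". A reordered value such as "east us" has no prefix or substring relationship with any of the three, so by the cascade as described above it would raise NoMatch rather than resolve.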

Tools with human-readable enum values like ["low", "medium", "high"]. The model often returns "LOW", "Med", or "high priority". All resolvable.

Agents handling natural language commands where the user instruction contains a hint but not the exact token. The user says "sort ascending" and the model passes sort_order="ascending". Prefix match on "asc" resolves it.

Any integration where you do not control the LLM and cannot guarantee it returns exact enum tokens. Third-party models, fine-tuned models, and older models that pre-date your tool schema all fall into this category.

When NOT to use it

When the enum values are short and overlap heavily. An enum like ["a", "ab", "abc"] will generate ambiguous matches on almost any input. In that case, fix the schema.
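
One way to spot such a schema before it causes trouble is a small lint that flags enum values contained in one another. This is a sketch with a hypothetical helper name, overlapping_pairs. It only catches the containment case, not every input that could turn out ambiguous (for example "cen" against "center" and "central").

```python
from itertools import combinations


def overlapping_pairs(choices: list) -> list:
    """Return pairs of enum values where one contains the other, ignoring case."""
    pairs = []
    for a, b in combinations(choices, 2):
        la, lb = a.lower(), b.lower()
        if la in lb or lb in la:
            pairs.append((a, b))
    return pairs


print(overlapping_pairs(["a", "ab", "abc"]))  # [('a', 'ab'), ('a', 'abc'), ('ab', 'abc')]
print(overlapping_pairs(["asc", "desc"]))     # []
```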

When you need semantic matching. If the enum contains domain-specific jargon and the model might return a synonym with no string overlap, string cascades will not help. Use embedding similarity or a lookup table instead.
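
For the lookup-table route, a hand-maintained synonym map is often enough. The sketch below is illustrative only: SORT_SYNONYMS and resolve_synonym are hypothetical names, and the synonyms listed are examples rather than anything the library ships.

```python
from typing import Optional

# Hypothetical synonym table, maintained by hand for one enum.
SORT_SYNONYMS = {
    "rising": "asc",
    "increasing": "asc",
    "falling": "desc",
    "decreasing": "desc",
}


def resolve_synonym(value: str, allowed: list, synonyms: dict) -> Optional[str]:
    """Return the canonical enum value for a known synonym, or None if unknown."""
    key = value.strip().lower()
    if key in allowed:
        return key
    target = synonyms.get(key)
    return target if target in allowed else None


print(resolve_synonym("Rising", ["asc", "desc"], SORT_SYNONYMS))    # asc
print(resolve_synonym("sideways", ["asc", "desc"], SORT_SYNONYMS))  # None
```

Returning None instead of raising lets the caller decide whether to try a string cascade next or reject the call.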

When the model is reliably returning exact tokens. If your system prompt includes the enum values and the model follows instructions, skip the normalization layer. Do not add complexity that is not solving a real problem.

When you want to know about mismatches without resolving them. Logging that the model passed a non-canonical value can be useful telemetry. In that case, collect the mismatch and resolve separately rather than silently normalizing.
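
A detection-only pass could look like the sketch below. Mismatch and find_mismatches are hypothetical names. The function reports non-canonical values and leaves the arguments untouched, so resolution stays a separate and visible step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mismatch:
    field: str
    raw: object
    allowed: tuple


def find_mismatches(args: dict, enums: dict) -> list:
    """Report every enum field whose value is not canonical, without modifying `args`."""
    found = []
    for name, allowed in enums.items():
        if name in args and args[name] not in allowed:
            found.append(Mismatch(name, args[name], tuple(allowed)))
    return found


args = {"table": "users", "sort_order": "ASCENDING"}
report = find_mismatches(args, {"sort_order": ["asc", "desc"]})
for item in report:
    print(f"non-canonical {item.field}: {item.raw!r}")
```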

Install

pip install tool-arg-fuzzy

Zero dependencies. Python 3.9 and up.

from tool_arg_fuzzy import fuzzy_match_enum, AmbiguousMatch, NoMatch

# Resolves case-insensitive
fuzzy_match_enum("DESC", ["asc", "desc"])  # -> "desc"

# Resolves prefix
fuzzy_match_enum("ascending", ["asc", "desc"])  # -> "asc"

# Resolves substring
fuzzy_match_enum("sort asc", ["asc", "desc"])  # -> "asc"

# Raises on ambiguity
fuzzy_match_enum("a", ["asc", "all"])  # raises AmbiguousMatch

# Raises on no match
fuzzy_match_enum("xyz", ["asc", "desc"])  # raises NoMatch

Siblings

Lib	Boundary	Repo
tool-arg-coerce-py	Type coercion (str to int, str to bool, etc.)	MukundaKatta/tool-arg-coerce-py
agentvet	Full arg validation including enum presence and type	MukundaKatta/agentvet
tool-schema-from-fn	Generates the enum list from Python Literal type hints	MukundaKatta/tool-schema-from-fn
tool-arg-fuzzy-rs	The Rust port of this library	MukundaKatta/tool-arg-fuzzy-rs

Together with tool-arg-fuzzy, the three Python siblings cover adjacent territory. tool-schema-from-fn generates the enum list. agentvet validates that the value is in the list. tool-arg-fuzzy resolves near-matches before validation. tool-arg-coerce-py handles wrong types before the value even reaches enum matching.

They compose cleanly. In a strict pipeline you would coerce types first, then fuzzy-match enum values, then validate the full arg set. Each library has a single responsibility and none of them depends on the others.

What's next

The Rust port (tool-arg-fuzzy-rs) is already on crates.io. Same cascade, same ambiguity semantics.

A possible extension is a fuzzy_match_all_enums helper that takes the full args dict and a schema dict and resolves every string enum field in one pass. Right now you have to call fuzzy_match_enum per field. That is fine for small schemas but gets verbose when a tool has five or six string fields.
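
As a rough idea of what such a helper might look like, here is a sketch. Nothing in it exists in the library today: the signature is a guess, the schema is assumed to be JSON Schema style with "properties" and "enum" keys, and demo_resolver is a simplified stand-in for fuzzy_match_enum so the example runs on its own.

```python
from typing import Callable


def fuzzy_match_all_enums(args: dict, schema: dict,
                          resolver: Callable[[str, list], str]) -> dict:
    """Hypothetical helper: return a copy of `args` with every string enum field resolved."""
    resolved = dict(args)
    for name, spec in schema.get("properties", {}).items():
        allowed = spec.get("enum")
        # Skip fields without an enum, missing fields, and non-string values.
        if allowed and isinstance(resolved.get(name), str):
            resolved[name] = resolver(resolved[name], allowed)
    return resolved


def demo_resolver(value: str, allowed: list) -> str:
    """Stand-in resolver (assumption): case-insensitive equality or prefix, unique hit only."""
    key = value.strip().lower()
    hits = [a for a in allowed if key == a or key.startswith(a) or a.startswith(key)]
    if len(hits) != 1:
        raise ValueError(f"cannot resolve {value!r} against {allowed}")
    return hits[0]


schema = {
    "properties": {
        "table": {"type": "string"},
        "sort_order": {"type": "string", "enum": ["asc", "desc"]},
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
    }
}
args = {"table": "users", "sort_order": "ASCENDING", "priority": "Med"}
print(fuzzy_match_all_enums(args, schema, demo_resolver))
# {'table': 'users', 'sort_order': 'asc', 'priority': 'medium'}
```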

Another direction is integration with tool-schema-from-fn. If the schema generator marks fields as "fuzzy": true, the resolution layer could be applied automatically. That would remove the need for per-tool normalization code entirely.

For now, the library does one thing: resolve a single string value against a list of candidates using a conservative string cascade. Twenty tests, zero dependencies, clear failure modes.

That was enough to stop the retry loop.

DEV Community