Stop Branching on Error Strings: A Closed Enum for Tool Exceptions

#hermeschallenge #ai #python #agents

The code looked like this:

except Exception as e:
    if "rate limit" in str(e).lower():
        time.sleep(60)
    elif "unauthorized" in str(e).lower():
        refresh_auth()
    elif "not found" in str(e).lower():
        handle_missing()
    else:
        raise

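Several things are wrong here. String matching is fragile, and the strings differ across providers. The bare raise in the else branch keeps the traceback but logs nothing, so unrecognized errors reach the caller with no record for triage. And when a provider rewords a message or adds a new error type, existing branches quietly stop matching.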

tool-error-classify replaces freeform string matching with a closed ErrorKind enum and a classifier that understands HTTP status codes and the exception shapes of Anthropic, OpenAI, and Google.

The Shape of the Fix

import logging
import time

from tool_error_classify import classify, ErrorKind

logger = logging.getLogger(__name__)

# run_tool is an illustrative wrapper; call_tool and refresh_credentials are your own functions
def run_tool(args):
    try:
        return call_tool(args)
    except Exception as e:
        kind = classify(e)

        if kind == ErrorKind.RATE_LIMIT:
            retry_after = kind.retry_after_seconds(e)
            time.sleep(retry_after or 60)
        elif kind == ErrorKind.AUTH_ERROR:
            refresh_credentials()
        elif kind == ErrorKind.NOT_FOUND:
            return {"error": "resource_not_found"}
        elif kind == ErrorKind.TIMEOUT:
            return {"error": "timed_out"}
        else:
            # ErrorKind.UNKNOWN and any kind not handled above: log the full exception for triage
            logger.error("unclassified tool error", exc_info=True)
            raise

classify() returns an ErrorKind, so the branches compare against a closed enum rather than substrings. An exception type the classifier does not recognize comes back as ErrorKind.UNKNOWN until it is added to the classifier; it does not silently fall into the wrong branch.

What It Does NOT Do

tool-error-classify does not retry for you. It tells you what kind of error happened. llm-retry-py does the retry logic. These are two separate concerns.

It does not parse every possible error format from every possible tool. It covers HTTP status codes (read from status_code on the exception or on its response), exception class names (matching known Anthropic, OpenAI, and Google exception classes), message keywords as a last resort, and Retry-After header parsing.

Custom exceptions from your own tools need custom classifiers. Pass a custom_classifier callable to classify().
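
A custom classifier can be as small as a function that recognizes your own exception types and returns None for everything else, so the built-in chain still gets its turn. The sketch below assumes a hypothetical InventoryToolError raised by an in-house tool, with made-up error codes. The ErrorKind stand-in exists only so the snippet runs without the package installed; in real code you would import ErrorKind and classify from tool_error_classify.

```python
from enum import Enum
from typing import Optional


class ErrorKind(Enum):
    # Stand-in with a subset of the library's members, so this sketch runs on its own.
    NOT_FOUND = "not_found"
    VALIDATION_ERROR = "validation_error"
    UNKNOWN = "unknown"


class InventoryToolError(Exception):
    """Hypothetical exception raised by an in-house tool."""

    def __init__(self, message: str, code: str):
        super().__init__(message)
        self.code = code


# Hypothetical mapping from the tool's own error codes to ErrorKind.
_INVENTORY_CODES = {
    "SKU_MISSING": ErrorKind.NOT_FOUND,
    "BAD_QUANTITY": ErrorKind.VALIDATION_ERROR,
}


def inventory_classifier(exc: Exception) -> Optional[ErrorKind]:
    """Return an ErrorKind for errors we recognize, or None to defer."""
    if isinstance(exc, InventoryToolError):
        return _INVENTORY_CODES.get(exc.code)
    return None


# With the real library this would be wired in as:
#   kind = classify(exc, custom_classifier=inventory_classifier)
```

Returning None for anything unrecognized matters: in the classify() source, a non-None result from the custom classifier is returned immediately and the built-in checks never run.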

Inside the Library

The ErrorKind enum:

class ErrorKind(Enum):
    RATE_LIMIT = "rate_limit"
    AUTH_ERROR = "auth_error"
    NOT_FOUND = "not_found"
    VALIDATION_ERROR = "validation_error"
    TIMEOUT = "timeout"
    SERVER_ERROR = "server_error"
    QUOTA_EXCEEDED = "quota_exceeded"
    CONTENT_FILTER = "content_filter"
    CONTEXT_LENGTH = "context_length"
    UNKNOWN = "unknown"

The classifier chain: a custom classifier if one is passed, then status code, then exception class name, then exception message keywords as a last resort.

def classify(exc: Exception, custom_classifier=None) -> ErrorKind:
    # A custom classifier runs first; returning None defers to the built-in chain
    if custom_classifier:
        result = custom_classifier(exc)
        if result is not None:
            return result

    # Status code: read it from the exception, else from an attached response.
    # Compare against None, not truthiness: requests.Response is falsy on 4xx/5xx.
    code = getattr(exc, "status_code", None)
    if code is None:
        response = getattr(exc, "response", None)
        code = getattr(response, "status_code", None)
    if code == 429:
        return ErrorKind.RATE_LIMIT
    elif code in (401, 403):
        return ErrorKind.AUTH_ERROR
    elif code == 404:
        return ErrorKind.NOT_FOUND
    # ... etc

    # Class name matching
    class_name = type(exc).__name__
    if "RateLimit" in class_name:
        return ErrorKind.RATE_LIMIT
    # ... etc

    # Nothing matched
    return ErrorKind.UNKNOWN

retry_after_seconds() parses the Retry-After header from the exception's response, if present, and returns an int or None.
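
For a sense of what that parsing involves: a Retry-After header may carry either a number of seconds or an HTTP date, so a parser has to handle both. The function below is a minimal sketch of that idea, not the library's implementation; the name parse_retry_after and the now parameter (there to make the date case testable) are assumptions.

```python
import math
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: Optional[str], now: Optional[datetime] = None) -> Optional[int]:
    """Turn a Retry-After header value into whole seconds, or None if unusable."""
    if not value:
        return None
    value = value.strip()

    # Form 1: delta-seconds, e.g. "120"
    if re.fullmatch(r"[0-9]+", value):
        return int(value)

    # Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without zone information as UTC
        when = when.replace(tzinfo=timezone.utc)

    now = now or datetime.now(timezone.utc)
    # A date in the past means "retry now", so clamp at zero
    return max(0, math.ceil((when - now).total_seconds()))
```

Returning None rather than raising on a malformed header keeps the caller's fallback simple, as in time.sleep(retry_after or 60) above.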

The 51 tests cover every ErrorKind via status code, via class name, via keyword (last resort), Retry-After parsing, custom classifier precedence, and UNKNOWN for truly unrecognized errors.

When to Use It

Use it anywhere you catch exceptions from external tool calls and branch on the type: HTTP APIs, LLM providers, database clients. The pattern replaces fragile string matching with a stable enum.

The UNKNOWN case is important. Do not silently swallow it. Log the full exception with traceback. Unknown errors are the ones that will surprise you in production.

CONTENT_FILTER and CONTEXT_LENGTH are especially useful for LLM tool calls. The two require different handling: content filter errors may need prompt modification, while context length errors need message windowing. Having named cases for these makes the agent loop logic readable.
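
Message windowing for the CONTEXT_LENGTH case can be quite simple. The sketch below assumes chat-style messages stored as dicts with a "role" key, and trims by message count rather than by tokens; window_messages and keep_last are hypothetical names, not part of the library.

```python
from typing import Dict, List

Message = Dict[str, str]


def window_messages(messages: List[Message], keep_last: int = 6) -> List[Message]:
    """Keep every system message plus the most recent `keep_last` other messages."""
    system = [m for m in messages if m.get("role") == "system"]
    rest = [m for m in messages if m.get("role") != "system"]
    # Guard keep_last <= 0: rest[-0:] would return the whole list
    recent = rest[-keep_last:] if keep_last > 0 else []
    return system + recent
```

A token-aware version would measure each message with the provider's tokenizer instead of counting messages, but the shape of the handler stays the same: trim, then call again.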

Install

pip install git+https://github.com/MukundaKatta/tool-error-classify

Paired with llm-retry-py, the retry wrapper handles the transient failures and classify() labels whatever is left over:

from tool_error_classify import classify, ErrorKind
from llm_retry_py import LLMRetry

retry = LLMRetry(max_attempts=3, base_delay=1.0)

def call_with_classification(args: dict):
    try:
        return retry.call(
            fn=lambda: actual_call(args),
            classify_error=lambda e: classify(e).value,  # ErrorKind.RATE_LIMIT.value == "rate_limit"
        )
    except Exception as e:
        kind = classify(e)
        if kind == ErrorKind.CONTEXT_LENGTH:
            # Trim context and retry
            return call_with_trimmed_context(args)
        elif kind == ErrorKind.CONTENT_FILTER:
            return {"error": "content_not_allowed"}
        raise

Sibling Libraries

Library	What it solves
`llm-retry-py`	Retry with backoff, uses error codes for retry decisions
`llm-circuit-breaker-py`	Circuit breaker that opens on repeated errors
`agentvet`	Validate tool arguments before the call that might fail
`tool-result-validator`	Validate tool output after the call
`llm-fallback-chain`	Fall through to backup provider on failure

The error handling stack: tool-error-classify identifies the kind, llm-retry-py decides to retry, llm-circuit-breaker-py stops sending to broken providers, llm-fallback-chain routes to a backup.

What's Next

Provider-specific classifiers as plugins. Right now the classifier handles Anthropic, OpenAI, and Google exception shapes. A plugin interface for classify() would let library authors add support for other providers without modifying the core.

An is_retryable() helper on ErrorKind would simplify the common case: if classify(e).is_retryable(): retry() instead of checking specific enum values.
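
Until such a helper exists, the same idea fits in a few lines of user code. This is a sketch under assumptions: the ErrorKind below is a stand-in copying the members listed in "Inside the Library" so the snippet runs on its own, and the choice of which kinds count as transient is a policy guess, not library behavior.

```python
from enum import Enum


class ErrorKind(Enum):
    # Stand-in mirroring the enum shown in the article.
    RATE_LIMIT = "rate_limit"
    AUTH_ERROR = "auth_error"
    NOT_FOUND = "not_found"
    VALIDATION_ERROR = "validation_error"
    TIMEOUT = "timeout"
    SERVER_ERROR = "server_error"
    QUOTA_EXCEEDED = "quota_exceeded"
    CONTENT_FILTER = "content_filter"
    CONTEXT_LENGTH = "context_length"
    UNKNOWN = "unknown"


# Assumption: these kinds are usually transient; adjust to your own policy.
_RETRYABLE = frozenset({
    ErrorKind.RATE_LIMIT,
    ErrorKind.TIMEOUT,
    ErrorKind.SERVER_ERROR,
})


def is_retryable(kind: ErrorKind) -> bool:
    """True when retrying the same call unchanged has a chance of succeeding."""
    return kind in _RETRYABLE
```

UNKNOWN is deliberately left out of the retryable set here: retrying an error you cannot identify tends to hide it rather than fix it.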

Structured logging integration: a log_error(exc, logger) helper that logs the classified kind plus the original exception details in a structured format. It is only a convenience wrapper, but enough people would use it that it belongs in the library.
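
One possible shape for that helper, using only the standard logging module. This is a hypothetical sketch, not a planned API: the extra classify_fn parameter is there so the snippet runs without the package installed (with the library you would pass classify), and the field names in extra are made up.

```python
import logging
from enum import Enum
from typing import Callable


def log_error(
    exc: Exception,
    logger: logging.Logger,
    classify_fn: Callable[[Exception], Enum],
) -> Enum:
    """Log the classified kind plus exception details, then return the kind."""
    kind = classify_fn(exc)
    logger.error(
        "tool error classified as %s",
        kind.value,
        exc_info=exc,  # attaches the traceback of this exception instance
        extra={
            # Hypothetical field names; they become attributes on the LogRecord
            "error_kind": kind.value,
            "exception_type": type(exc).__name__,
            "exception_message": str(exc),
        },
    )
    return kind
```

Because the fields travel in extra, a JSON formatter or log shipper can index on error_kind without parsing the message text.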

Built as part of the agent-stack family: composable Python primitives for production LLM agents.