The code looked like this:
except Exception as e:
if "rate limit" in str(e).lower():
time.sleep(60)
elif "unauthorized" in str(e).lower():
refresh_auth()
elif "not found" in str(e).lower():
handle_missing()
else:
raise
Three things wrong: string matching is fragile, the strings differ across providers, and else: raise silently drops context. New error types added by the provider break existing branches.
tool-error-classify replaces freeform string matching with a closed ErrorKind enum and a classifier that knows about every major provider's error format.
The Shape of the Fix
from tool_error_classify import classify, ErrorKind
try:
result = call_tool(args)
except Exception as e:
kind = classify(e)
if kind == ErrorKind.RATE_LIMIT:
retry_after = kind.retry_after_seconds(e)
time.sleep(retry_after or 60)
elif kind == ErrorKind.AUTH_ERROR:
refresh_credentials()
elif kind == ErrorKind.NOT_FOUND:
return {"error": "resource_not_found"}
elif kind == ErrorKind.TIMEOUT:
return {"error": "timed_out"}
else:
# ErrorKind.UNKNOWN — log the full exception for triage
logger.error("unclassified tool error", exc_info=True)
raise
classify() returns an ErrorKind. The match is on a closed enum, not strings. New exception types from the provider need to be added to the classifier — they do not silently fall into wrong branches.
What It Does NOT Do
tool-error-classify does not retry for you. It tells you what kind of error happened. llm-retry-py does the retry logic. These are two separate concerns.
It does not parse every possible error format from every possible tool. It covers: HTTP status codes (via response.status_code), exception class names (matching Anthropic, OpenAI, Google known exception classes), and Retry-After header parsing.
Custom exceptions from your own tools need custom classifiers. Pass a custom_classifier callable to classify().
Inside the Library
The ErrorKind enum:
class ErrorKind(Enum):
RATE_LIMIT = "rate_limit"
AUTH_ERROR = "auth_error"
NOT_FOUND = "not_found"
VALIDATION_ERROR = "validation_error"
TIMEOUT = "timeout"
SERVER_ERROR = "server_error"
QUOTA_EXCEEDED = "quota_exceeded"
CONTENT_FILTER = "content_filter"
CONTEXT_LENGTH = "context_length"
UNKNOWN = "unknown"
The classifier chain: status code first, then exception class name, then exception message keywords as a last resort.
def classify(exc: Exception, custom_classifier=None) -> ErrorKind:
if custom_classifier:
result = custom_classifier(exc)
if result is not None:
return result
# Status code
status = getattr(exc, "status_code", None) or getattr(exc, "response", None)
if status:
code = status.status_code if hasattr(status, "status_code") else status
if code == 429:
return ErrorKind.RATE_LIMIT
elif code in (401, 403):
return ErrorKind.AUTH_ERROR
elif code == 404:
return ErrorKind.NOT_FOUND
# ... etc
# Class name matching
class_name = type(exc).__name__
if "RateLimit" in class_name:
return ErrorKind.RATE_LIMIT
# ... etc
return ErrorKind.UNKNOWN
retry_after_seconds() parses the Retry-After header from the exception response, if present. Returns an int or None.
The 51 tests cover every ErrorKind via status code, via class name, via keyword (last resort), Retry-After parsing, custom classifier precedence, and UNKNOWN for truly unrecognized errors.
When to Use It
Use it anywhere you catch exceptions from external tool calls and branch on the type. HTTP APIs, LLM providers, database clients. The pattern replaces fragile string matching with a stable enum.
The UNKNOWN case is important. Do not silently swallow it. Log the full exception with traceback. Unknown errors are the ones that will surprise you in production.
CONTENT_FILTER and CONTEXT_LENGTH are especially useful for LLM tool calls. Both require different handling: content filter errors may need prompt modification, context length errors need message windowing. Having named cases for these makes the agent loop logic readable.
Install
pip install git+https://github.com/MukundaKatta/tool-error-classify
from tool_error_classify import classify, ErrorKind
from llm_retry_py import LLMRetry
retry = LLMRetry(max_attempts=3, base_delay=1.0)
def call_with_classification(args: dict):
try:
return retry.call(
fn=lambda: actual_call(args),
classify_error=lambda e: classify(e).value, # ErrorKind.RATE_LIMIT.value == "rate_limit"
)
except Exception as e:
kind = classify(e)
if kind == ErrorKind.CONTEXT_LENGTH:
# Trim context and retry
return call_with_trimmed_context(args)
elif kind == ErrorKind.CONTENT_FILTER:
return {"error": "content_not_allowed"}
raise
Sibling Libraries
| Library | What it solves |
|---|---|
llm-retry-py |
Retry with backoff, uses error codes for retry decisions |
llm-circuit-breaker-py |
Circuit breaker that opens on repeated errors |
agentvet |
Validate tool arguments before the call that might fail |
tool-result-validator |
Validate tool output after the call |
llm-fallback-chain |
Fall through to backup provider on failure |
The error handling stack: tool-error-classify identifies the kind, llm-retry-py decides to retry, llm-circuit-breaker-py stops sending to broken providers, llm-fallback-chain routes to a backup.
What's Next
Provider-specific classifiers as plugins. Right now the classifier handles Anthropic, OpenAI, and Google exception shapes. A plugin interface for classify() would let library authors add support for other providers without modifying the core.
An is_retryable() helper on ErrorKind would simplify the common case: if classify(e).is_retryable(): retry() instead of checking specific enum values.
Structured logging integration: a log_error(exc, logger) helper that logs the classified kind plus original exception details in a structured format. This is a convenience wrapper but one that enough people would use that it belongs in the library.
Built as part of the agent-stack family: composable Python primitives for production LLM agents.
Top comments (0)