
Hedi Manai
One Decorator Away From Production-Ready AI Agents

Every agent developer hits the same wall.

The demo works. Then it goes to production — and the cracks show up fast. No retry logic when APIs fail. Identical queries hammering your LLM endpoint over and over. No visibility into what's actually happening. And before long, you're writing the same cache managers, retry decorators, and circuit-breaker wrappers you wrote on the last project.

ToolOps is built to make that boilerplate disappear.


What It Is:
ToolOps is a framework-agnostic Python SDK that wraps any async function in a single decorator, adding caching, retries, circuit breakers, request coalescing, and observability — with zero changes to your business logic.

Think of it the way a service mesh works for microservices: the infrastructure wraps around your code without touching it.
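To make the service-mesh analogy concrete, here is a toy version of the pattern: one decorator that adds in-memory caching and exponential-backoff retries around any async function, without touching its body. This is a sketch of the idea, not the ToolOps API (decorator name, parameters, and cache policy here are all invented for illustration).

```python
import asyncio
import functools

def resilient(retries: int = 3, backoff: float = 0.1):
    """Toy decorator: in-memory caching plus retries around any async function.
    Illustrates wrapping infrastructure around code without changing it."""
    def decorator(fn):
        cache = {}
        @functools.wraps(fn)
        async def wrapper(*args):
            if args in cache:
                return cache[args]          # cache hit: skip the real call
            for attempt in range(retries):
                try:
                    result = await fn(*args)
                    cache[args] = result
                    return result
                except Exception:
                    if attempt == retries - 1:
                        raise               # out of retries: propagate
                    await asyncio.sleep(backoff * 2 ** attempt)
        return wrapper
    return decorator

@resilient(retries=3)
async def get_rate(pair: str) -> float:
    # stand-in for a flaky upstream API call
    return 1.09 if pair == "EUR/USD" else 0.0

print(asyncio.run(get_rate("EUR/USD")))  # hits the "API"
print(asyncio.run(get_rate("EUR/USD")))  # served from cache
```

The business logic of `get_rate` never changes; only the decorator line does.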


What It Does:
Caching that actually fits production. ToolOps supports in-memory caching for speed, file-based for lightweight persistence, and PostgreSQL for durable, distributed caching shared across processes. Pick the backend that fits the function.
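Swappable backends usually mean the decorator talks to one small interface and each store implements it. A sketch of what such an interface could look like — these class names and method signatures are assumptions for illustration, not ToolOps's actual classes:

```python
from typing import Any, Optional, Protocol

class CacheBackend(Protocol):
    """What a pluggable cache backend must provide (illustrative interface)."""
    async def get(self, key: str) -> Optional[Any]: ...
    async def set(self, key: str, value: Any, ttl: float) -> None: ...

class MemoryBackend:
    """In-process dict: fastest option, but lost on restart and per-process only.
    A file or PostgreSQL backend would implement the same two methods."""
    def __init__(self) -> None:
        self._store: dict = {}
    async def get(self, key: str) -> Optional[Any]:
        return self._store.get(key)
    async def set(self, key: str, value: Any, ttl: float) -> None:
        self._store[key] = value  # TTL handling omitted for brevity
```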

Semantic caching for LLM calls. Standard caches match on exact strings — so "weather in Paris" and "Paris weather" hit the LLM twice. ToolOps uses vector embeddings to match by meaning, collapsing semantically similar queries into a single cached result. For agents handling natural language, this can cut LLM calls by up to 90%.
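The mechanics of a semantic cache can be shown in miniature. The sketch below uses word-count vectors in place of real embeddings (production systems use an embedding model) and a cosine-similarity threshold to decide when two queries mean the same thing; it is an illustration of the technique, not ToolOps's implementation.

```python
import math

def embed(text: str) -> dict:
    # toy "embedding": word counts; real systems use a vector model
    vec = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, threshold: float = 0.6):
        self.entries = []          # (embedding, value) pairs
        self.threshold = threshold
    def get(self, query: str):
        q = embed(query)
        for emb, value in self.entries:
            if cosine(q, emb) >= self.threshold:
                return value       # close enough in meaning: cache hit
        return None
    def put(self, query: str, value):
        self.entries.append((embed(query), value))

cache = SemanticCache()
cache.put("weather in paris", "18°C, cloudy")
print(cache.get("paris weather"))      # hit despite different wording
print(cache.get("stock prices today")) # unrelated query: miss
```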

Request coalescing. When dozens of agents request the same data at once during a cache miss, ToolOps fires one real API call and returns the result to all of them. The thundering herd problem, solved automatically.
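The standard asyncio recipe for coalescing is to let the first caller create a shared future and do the real work while every concurrent caller awaits that same future. A minimal sketch (not the ToolOps internals):

```python
import asyncio

calls = 0
inflight = {}  # key -> asyncio.Future shared by all concurrent callers

async def fetch_price(symbol: str) -> float:
    global calls
    calls += 1                      # count real upstream calls
    await asyncio.sleep(0.05)       # simulate network latency
    return 101.5

async def coalesced(symbol: str) -> float:
    if symbol in inflight:
        return await inflight[symbol]   # piggyback on the in-flight call
    fut = asyncio.get_running_loop().create_future()
    inflight[symbol] = fut
    try:
        result = await fetch_price(symbol)
        fut.set_result(result)
        return result
    except Exception as exc:
        fut.set_exception(exc)          # waiters see the failure too
        raise
    finally:
        del inflight[symbol]

async def main():
    results = await asyncio.gather(*(coalesced("ACME") for _ in range(50)))
    print(calls, results[0])  # one real call, fifty identical answers

asyncio.run(main())
```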

Stale-if-error fallback. When an upstream service goes down, ToolOps can serve the last known good value instead of crashing your agent — exactly what you want for slowly changing data like exchange rates or configuration.
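The fallback logic is simple to express: remember the last successful result per argument set, and serve it when the wrapped call raises. An illustrative sketch of the pattern, with names invented for the example:

```python
import asyncio
import functools

def stale_if_error(fn):
    """If the wrapped call fails, serve the last good value instead of raising."""
    last_good = {}
    @functools.wraps(fn)
    async def wrapper(*args):
        try:
            result = await fn(*args)
            last_good[args] = result
            return result
        except Exception:
            if args in last_good:
                return last_good[args]  # stale data beats a crashed agent
            raise                       # no fallback available: propagate
    return wrapper

upstream_down = False

@stale_if_error
async def get_fx_rate(pair: str) -> float:
    if upstream_down:
        raise ConnectionError("upstream unavailable")
    return 1.09

print(asyncio.run(get_fx_rate("EUR/USD")))   # 1.09, fresh
upstream_down = True
print(asyncio.run(get_fx_rate("EUR/USD")))   # 1.09, stale but served
```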

Observability out of the box. Every cache hit, miss, retry, and circuit-breaker event is logged as structured JSON. Add the optional OpenTelemetry extra and you get full distributed tracing and Prometheus metrics — production-grade visibility in a few lines of setup.
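Structured JSON logging in this style means one machine-parseable line per event. The field names below are illustrative, not ToolOps's actual log schema:

```python
import json
import time

def log_event(event: str, **fields):
    # one structured JSON line per cache/retry/circuit-breaker event,
    # ready for ingestion by any log pipeline
    record = {"ts": time.time(), "event": event, **fields}
    print(json.dumps(record))

log_event("cache_hit", fn="get_weather", key="paris", latency_ms=0.4)
log_event("retry", fn="get_weather", attempt=2, error="TimeoutError")
```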


Works With Your Stack:
ToolOps wraps plain Python async functions, so it drops into any framework without friction: LangChain, LangGraph, CrewAI, LlamaIndex, PydanticAI, Agno, and more. It also has native MCP support, letting you expose any decorated function as a typed MCP tool definition compatible with Claude Desktop, Cursor, and any MCP-compatible host.

When you migrate frameworks — and most teams eventually do — your infrastructure layer doesn't budge.


The Practical Upside:
The core package installs with a single pip command and has zero external dependencies. Postgres, semantic caching, and OpenTelemetry support are optional extras you add only when you need them. A built-in CLI gives you live hit rates, latency stats, and backend health checks without touching your application code.

Two decorators cover every case: @readonly for functions that read data, @sideeffect for functions that act on the world. That's the entire model.
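The split exists because reads and writes deserve different treatment: a read can be cached and coalesced freely, while a side effect must never be replayed from a cache. A toy re-implementation of the model — the decorator names come from the post, but everything else here is an invented sketch, not the ToolOps API:

```python
import asyncio
import functools

def readonly(fn):
    # reads are safe to cache: same args, same answer
    cache = {}
    @functools.wraps(fn)
    async def wrapper(*args):
        if args not in cache:
            cache[args] = await fn(*args)
        return cache[args]
    return wrapper

def sideeffect(fn):
    # writes are never cached; every call reaches the real world
    @functools.wraps(fn)
    async def wrapper(*args):
        return await fn(*args)
    return wrapper

sent = []

@readonly
async def get_balance(user: str) -> int:
    return 100

@sideeffect
async def send_email(to: str) -> None:
    sent.append(to)

async def main():
    await get_balance("alice")
    await get_balance("alice")   # second read served from cache
    await send_email("alice")
    await send_email("alice")    # both sends actually happen
    print(len(sent))             # 2

asyncio.run(main())
```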


Try It:
ToolOps is open source, Apache 2.0 licensed, and actively maintained.
If you're building agents that need to survive real traffic, real API failures, and real costs, it's worth your time.

GitHub: https://github.com/hedimanai-pro/toolops
PyPI: https://pypi.org/project/toolops/
Documentation: https://hedimanai.vercel.app/projects/toolops.html
