DEV Community

ClevAgent

How to Monitor LangChain Agents in Production

Your LangChain agent works in development. Chains resolve, tools return, the ReAct loop converges. Ship it. Day one — fine. Day two — 200 requests, zero errors. Day three — your OpenAI bill says $340.

The agent got stuck in a tool-retry loop at 2 AM. It kept calling a search tool that returned empty results, parsing the response, deciding to search again, and repeating. No exceptions, no crashes, every health check returned 200 OK.

A tracing tool like LangSmith would show you the full trace after the fact. But nothing would have woken you up at 2 AM when it started.

If you're running LangChain or LangGraph agents in production, this is the gap between observability and runtime monitoring. Here's how to close it.

Why LangChain agents need runtime monitoring

LangChain agents fail differently from web services:

  • Stuck chains: An HTTP tool call hangs indefinitely. The chain never completes. The process is alive, the health endpoint responds, but no work is happening.
  • Infinite ReAct loops: The agent keeps calling tools without converging. max_iterations helps, but only caps iteration count — not cost.
  • Silent cost spikes: A loop making 50 LLM calls in 30 seconds doesn't spike CPU. It spikes your API bill. By the time you see the invoice, the damage is done.
  • Zombie agents: The callback thread is alive, traces are flowing to LangSmith, but the actual work loop is stuck on a deadlocked resource.

LangSmith and Langfuse are excellent for tracing — understanding what happened after the fact. But they don't answer the real-time question: is this agent alive and making progress right now?
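The gap is between liveness and progress: a health endpoint only proves the process is alive, while a progress check proves work is actually happening. A minimal sketch of the distinction (illustrative, not ClevAgent's implementation):

```python
import time

class ProgressTracker:
    """Tracks when the agent last did real work, not just whether it's alive."""

    def __init__(self, stall_threshold_s: float = 120.0):
        self.stall_threshold_s = stall_threshold_s
        self.last_progress = time.monotonic()

    def record_progress(self) -> None:
        """Call this on every meaningful step (LLM call, tool result, node run)."""
        self.last_progress = time.monotonic()

    def is_stalled(self) -> bool:
        """True when no progress has been recorded within the threshold."""
        return time.monotonic() - self.last_progress > self.stall_threshold_s
```

A stuck chain, a hung tool, and a zombie agent all look identical through this lens: `is_stalled()` flips to true while the process itself keeps answering health checks.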


Add ClevAgent to your LangChain agent in 3 lines

Step 1. Install the SDK.

pip install clevagent

Step 2. Initialize ClevAgent with your API key.

import clevagent

clevagent.init(
    api_key="your-key-here",
    agent="langchain-research-agent",
)

Step 3. Add the callback handler to your LLM or chain.

from clevagent.integrations.langchain import ClevAgentCallbackHandler

handler = ClevAgentCallbackHandler()
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])

Every LLM call now sends a heartbeat with token usage. If the agent stops calling the LLM — because a chain hung, a tool timed out, or the process crashed — ClevAgent detects the silence and alerts you.
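Under the hood, the pattern is simple: every LLM-end callback records a timestamp and token count, and a watcher alerts when the timestamp goes stale. A stripped-down sketch of that pattern (illustrative only; a real LangChain handler would subclass `BaseCallbackHandler`, and ClevAgent's internals may differ):

```python
import time

class HeartbeatCallback:
    """Sketch of the heartbeat pattern a LangChain callback handler can implement."""

    def __init__(self):
        self.last_heartbeat = None
        self.total_tokens = 0

    def on_llm_end(self, response, **kwargs) -> None:
        # LangChain passes an LLMResult; token usage lives in llm_output["token_usage"]
        self.last_heartbeat = time.monotonic()
        usage = (getattr(response, "llm_output", None) or {}).get("token_usage", {})
        self.total_tokens += usage.get("total_tokens", 0)

    def seconds_since_heartbeat(self):
        """None before the first LLM call; otherwise seconds of silence."""
        if self.last_heartbeat is None:
            return None
        return time.monotonic() - self.last_heartbeat
```

The key property: the heartbeat is tied to the work itself. No LLM call means no heartbeat, so a hung tool or a deadlocked chain becomes visible as silence rather than hiding behind a green health check.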

Complete example

import clevagent
from clevagent.integrations.langchain import ClevAgentCallbackHandler
from langchain import hub
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import Tool

clevagent.init(api_key="your-key-here", agent="research-agent")
handler = ClevAgentCallbackHandler()

llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])

def search_web(query: str) -> str:
    ...  # your search implementation

def calculator(expression: str) -> str:
    ...  # your math implementation

tools = [
    Tool(name="search", func=search_web, description="Search the web"),
    Tool(name="calculate", func=calculator, description="Do math"),
]

# The standard ReAct prompt from LangChain Hub
prompt = hub.pull("hwchase17/react")

agent = create_react_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, max_iterations=15)

# Every LLM call and tool use is now monitored
result = executor.invoke({"input": "Research the latest AI agent frameworks"})

LangGraph agents: use the node decorator

For LangGraph's graph-based agents, ClevAgent provides a @monitored_node decorator that wraps each node with automatic heartbeat monitoring:

from typing import TypedDict

from clevagent.integrations.langgraph import monitored_node
from langgraph.graph import StateGraph, START, END

class AgentState(TypedDict):
    messages: list

# `llm` is the callback-instrumented ChatOpenAI from the previous example

@monitored_node("research")
def research_node(state):
    result = llm.invoke(state["messages"])
    return {"messages": [result]}

@monitored_node("summarize")
def summarize_node(state):
    summary = llm.invoke(f"Summarize: {state['messages'][-1].content}")
    return {"messages": [summary]}

graph = StateGraph(AgentState)
graph.add_node("research", research_node)
graph.add_node("summarize", summarize_node)
graph.add_edge(START, "research")
graph.add_edge("research", "summarize")
graph.add_edge("summarize", END)
app = graph.compile()

Each node execution sends a heartbeat. If a node hangs — because an API call never returns or an LLM request times out — ClevAgent detects the gap and alerts you.
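The wrapping pattern itself is straightforward: decorate the node function so a heartbeat fires around each execution. A hypothetical sketch of what such a wrapper looks like (not ClevAgent's actual source; `emit` here is a stand-in for whatever transport sends the heartbeat):

```python
import functools
import time

def heartbeat_node(name, emit):
    """Wrap a graph node so emit(name, timestamp) fires around each run."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(state):
            emit(name, time.monotonic())  # heartbeat: node started
            result = fn(state)
            emit(name, time.monotonic())  # heartbeat: node finished
            return result
        return wrapper
    return decorate
```

Because the "started" heartbeat arrives without a matching "finished" one, a watcher can distinguish a node that is hanging mid-execution from a graph that simply never reached that node.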

What ClevAgent catches

Stuck chains and hung tools

Your agent calls an external API inside a tool. The API hangs. The chain never completes. systemctl status says "running" — but no heartbeats are arriving.

ClevAgent detects the silence within your configured threshold (default: 120 seconds) and sends an alert.

Infinite ReAct loops

The agent enters a loop: call tool → parse result → decide to call tool again → repeat. An agent making 15 iterations of GPT-4o calls in 30 seconds burns through tokens fast.
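The cost also grows faster than linearly, because a ReAct loop re-sends the entire growing scratchpad on every iteration. A back-of-the-envelope estimate (the per-token prices below are placeholders; check current OpenAI rates):

```python
def react_loop_cost(iterations, base_prompt_tokens, tokens_per_step, output_tokens,
                    in_per_1k=0.0025, out_per_1k=0.01):
    """Cumulative cost of a loop that re-sends a growing prompt each iteration."""
    cost, prompt = 0.0, base_prompt_tokens
    for _ in range(iterations):
        cost += (prompt / 1000) * in_per_1k + (output_tokens / 1000) * out_per_1k
        prompt += tokens_per_step  # each tool result and thought gets appended
    return cost

# 15 iterations, 2k-token prompt growing by 500 tokens/step, 300 output tokens each
print(round(react_loop_cost(15, 2000, 500, 300), 2))  # → 0.25
```

A quarter-dollar per run sounds harmless until a stuck agent reruns the loop hundreds of times an hour overnight.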

ClevAgent tracks cumulative token usage per heartbeat cycle. If tokens spike 10-100x above your agent's baseline, you get a cost alert — while the loop is still running, not after.
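A rolling-baseline comparison is enough to catch this class of spike. A minimal sketch of the idea (illustrative; ClevAgent's actual detection logic may differ):

```python
from collections import deque

class TokenSpikeDetector:
    """Flags heartbeat cycles whose token usage far exceeds a rolling baseline."""

    def __init__(self, window: int = 20, spike_factor: float = 10.0):
        self.history = deque(maxlen=window)  # recent per-cycle token counts
        self.spike_factor = spike_factor

    def observe(self, tokens: int) -> bool:
        """Record one cycle's token count; return True if it is a spike."""
        baseline = sum(self.history) / len(self.history) if self.history else None
        self.history.append(tokens)
        return baseline is not None and tokens > baseline * self.spike_factor
```

A bounded window keeps the baseline adaptive: if the agent's workload legitimately grows, the baseline follows it, so alerts fire on sudden deviations rather than gradual drift.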

Silent exits

The process gets OOM-killed at 3 AM. No traceback, no error log, no alert. ClevAgent expects a heartbeat every N seconds. When it stops arriving, you get an alert within one missed interval. Optional auto-restart brings the agent back without manual intervention.

Getting started

  1. pip install clevagent
  2. Get your API key from clevagent.io/signup
  3. Add clevagent.init() and the callback handler
  4. Deploy — ClevAgent starts monitoring immediately
  5. Configure alerts in the dashboard: Telegram, Slack, Discord, or email

Free for 3 agents. No credit card required.

ClevAgent monitors LangChain and LangGraph agents with heartbeat detection, cost tracking, and auto-restart. Free for up to 3 agents — start monitoring →
