
Kshitij Gupta


tracesage: See Inside Your LangGraph Agents


Open-source LangChain/LangGraph tracing — drop in two lines, watch your agents run live in your browser.


The problem: agents are black boxes

If you've built anything non-trivial with LangChain or LangGraph — a multi-agent supervisor, a RAG pipeline, a tool-using ReAct loop — you know the feeling. It works on the happy path, then a real query comes in and… something goes wrong. But what?

  • Which agent actually ran? In what order?

  • Did the model call the tool you expected, or hallucinate a different one?

  • How many tokens did that one request burn?

  • Where did the error come from — your tool, the model, or the orchestration?

The usual answer is a wall of print() statements and verbose=True logs you scroll through at 2 a.m. There are great hosted tracing platforms, but they mean signing up, shipping your prompts to a third party, and wiring up an SDK.

I wanted something I could pip install and have running in ten seconds, entirely on my laptop. So I built tracesage.

What is tracesage?

tracesage is a local-first observability tool for LangChain & LangGraph agents. It hooks into LangChain's callback stream, captures every chain / tool / LLM / retriever event, stores it locally (SQLite + gzipped blobs), and renders it as an interactive graph + timeline UI in your browser — in real time.
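To make "SQLite + gzipped blobs" concrete, here is a minimal sketch of that storage pattern using only the standard library. The table layout and the names `open_store`, `record_event`, and `load_payload` are assumptions made for illustration; they are not tracesage's actual schema or API.

```python
import gzip
import json
import sqlite3
import time
import uuid


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a small event store. Hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS events (
            id      TEXT PRIMARY KEY,
            run_id  TEXT NOT NULL,
            kind    TEXT NOT NULL,   -- e.g. "llm", "tool", "chain", "retriever"
            name    TEXT NOT NULL,
            ts      REAL NOT NULL,
            payload BLOB NOT NULL    -- gzip-compressed JSON
        )
        """
    )
    return conn


def record_event(conn: sqlite3.Connection, run_id: str, kind: str,
                 name: str, payload: dict) -> str:
    """Compress the payload and store one event row; returns the event id."""
    event_id = uuid.uuid4().hex
    blob = gzip.compress(json.dumps(payload).encode("utf-8"))
    conn.execute(
        "INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)",
        (event_id, run_id, kind, name, time.time(), blob),
    )
    conn.commit()
    return event_id


def load_payload(conn: sqlite3.Connection, event_id: str) -> dict:
    """Read one event back and decompress its payload."""
    row = conn.execute(
        "SELECT payload FROM events WHERE id = ?", (event_id,)
    ).fetchone()
    return json.loads(gzip.decompress(row[0]).decode("utf-8"))
```

Keeping the small, queryable fields as columns and the bulky prompt or tool payloads as compressed blobs is one plausible reason for this split: listing runs stays fast, and large payloads are only decompressed when you open a step.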

  • 🚀 Two-line integration. One callback added to your existing invoke/ainvoke.

  • 🧰 Zero infrastructure. No Docker, no Postgres, no external service. Just pip install.

  • 🔒 Never crashes your app. The callback handler is wrapped to never raise — tracing can fail, your agent keeps running.

  • 🗺️ MCP-aware. Tools loaded from MCP servers are attributed back to their server, so you can see which tools came from where.

  • 🧪 Testable. A pytest fixture lets you assert "did my agent call search?" in CI.

  • 📦 MIT licensed, runs in a single Python process.
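The "never crashes your app" guarantee can be pictured as a guard around every callback method. The sketch below shows one way to do that in plain Python; `never_raise` and `SketchHandler` are hypothetical names for illustration and are not taken from tracesage's source.

```python
import functools
import logging

logger = logging.getLogger("tracing_sketch")


def never_raise(func):
    """Wrap a callback so any exception is logged and swallowed."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # Tracing failed; log it and let the caller carry on.
            logger.exception("tracing callback %s failed; ignoring", func.__name__)
            return None
    return wrapper


class SketchHandler:
    """Duck-typed stand-in for a callback handler (not a real LangChain class)."""

    def __init__(self) -> None:
        self.events: list[tuple] = []

    @never_raise
    def on_tool_start(self, name: str, tool_input: dict) -> None:
        if not isinstance(tool_input, dict):
            raise TypeError("unexpected tool input")
        self.events.append(("tool_start", name, tool_input))
```

With this guard, a malformed event costs you one missing trace entry rather than a failed agent run.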



A 30-second taste

Before we write any code, see what we're aiming for:

pip install "tracesage[langchain]"
tracesage demo            # seeds a sample trace and opens the UI

Your browser opens to http://localhost:7842/ui and you're looking at a live agent topology.


Tutorial: trace a real agent

Let's build a tiny but real LangGraph agent and wire tracesage into it. You'll need Python 3.11+ and an LLM provider key (we'll use OpenAI; Anthropic or any LangChain model works identically — tracesage is provider-agnostic).

pip install "tracesage[langchain]" langgraph langchain-openai
export OPENAI_API_KEY=sk-...

Step 1 — the agent (no tracing yet)

before.py — a standard LangGraph ReAct agent with two tools:

import asyncio
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent


@tool
def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"It's 22°C and sunny in {city}."


@tool
def to_fahrenheit(celsius: float) -> float:
    """Convert a temperature in Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32


agent = create_react_agent(
    ChatOpenAI(model="gpt-4o-mini", temperature=0),
    tools=[get_weather, to_fahrenheit],
)


async def main() -> None:
    result = await agent.ainvoke(
        {"messages": [{"role": "user",
                       "content": "What's the weather in Paris, in Fahrenheit?"}]}
    )
    print(result["messages"][-1].content)


asyncio.run(main())

Run it and you get an answer. But you have no idea how it got there — did it call get_weather then to_fahrenheit? Did it loop? How many model calls?
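Without a tracer, the most you can do is dig through the returned message list by hand. A minimal sketch of that, assuming the usual LangChain convention that AI messages carry a `tool_calls` list of dicts with a `"name"` key (`tool_call_sequence` is a hypothetical helper, not part of any library):

```python
def tool_call_sequence(messages) -> list[str]:
    """Return the names of requested tool calls, in the order they appear."""
    names: list[str] = []
    for message in messages:
        # Messages without tool calls (human, tool, plain AI) are skipped.
        for call in getattr(message, "tool_calls", None) or []:
            names.append(call["name"])
    return names
```

Calling `tool_call_sequence(result["messages"])` in before.py would tell you which tools the model asked for, but not durations, token usage, or where an error originated, which is the gap the next step fills.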

Step 2 — add tracesage (the two lines)

after.py adds one import, then the two lines that matter: creating a tracer and passing its handler to ainvoke:

import asyncio
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

from tracesage import TraceSage          # 1️⃣ import


@tool
def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"It's 22°C and sunny in {city}."


@tool
def to_fahrenheit(celsius: float) -> float:
    """Convert a temperature in Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32


agent = create_react_agent(
    ChatOpenAI(model="gpt-4o-mini", temperature=0),
    tools=[get_weather, to_fahrenheit],
)


async def main() -> None:
    tracer = await TraceSage.create()    # 2️⃣ start tracesage (UI on :7842)

    result = await agent.ainvoke(
        {"messages": [{"role": "user",
                       "content": "What's the weather in Paris, in Fahrenheit?"}]},
        config={"callbacks": [tracer.handler]},   # 3️⃣ the one line you add
    )
    print(result["messages"][-1].content)

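    # Keep the process alive so you can explore the UI. input() runs in a worker
    # thread so the event loop stays free, in case the UI is served from it.
    await asyncio.to_thread(
        input, "Trace ready at http://localhost:7842/ui (press Enter to exit)."
    )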
    await tracer.stop()


asyncio.run(main())

That's it. Run python after.py, open http://localhost:7842/ui, and your run is there.

Even less code: the with block

For scripts and notebooks, there's a context manager that starts the UI and installs a global handler — so you don't even pass callbacks=:

import tracesage

with tracesage.trace() as tl:                 # starts UI + global capture
    result = agent.invoke({"messages": [...]})    # 🔍 tracesage: http://127.0.0.1:7842/ui/#run=...
    input("Trace ready — open the printed link, then Enter to exit.")

Every new run prints a clickable deep link to that exact trace.
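If you're curious how a with block can capture events without anything being passed explicitly, one common pattern is a context variable that emitters consult. This is a generic sketch of that pattern, not tracesage's implementation; `capture` and `emit` are hypothetical names.

```python
import contextlib
import contextvars

# Holds the list that events should be appended to, or None when nothing is capturing.
_active = contextvars.ContextVar("active_collector", default=None)


@contextlib.contextmanager
def capture():
    """Collect every event emitted while the block is active."""
    events: list[dict] = []
    token = _active.set(events)
    try:
        yield events
    finally:
        # Restore whatever was active before, even if the block raised.
        _active.reset(token)


def emit(event: dict) -> None:
    """Record an event if a capture block is active; otherwise do nothing."""
    collector = _active.get()
    if collector is not None:
        collector.append(event)
```

Code running inside `with capture() as events:` never has to know about the collector, which is the same ergonomic win as not passing callbacks=.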


Reading the UI

Here's where tracesage earns its keep. Open a run and you get a topology graph of everything that happened.

Every node is one of six kinds, colour-coded in the legend (bottom-left):

| Kind      | What it is                                                                |
|-----------|---------------------------------------------------------------------------|
| agent     | a function you registered as a node, that calls other things              |
| tool      | a @tool side-effect function (DB, API, calculation)                       |
| llm       | a language-model call (what you count, cost, and cache)                   |
| retriever | a BaseRetriever, the "R" in RAG                                           |
| chain     | plumbing: LCEL pipes, the LangGraph state machine, routing functions      |
| mcp       | a synthesized node grouping the tools loaded from one MCP server          |

Click any node to open its inspector: call counts, durations, errors, and the tools it provides or uses.

The timeline on the right replays the run step-by-step; click a step to expand the full payload (prompts, tool inputs/outputs, token usage, and — on errors — the exception type and traceback).
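The inspector's numbers are the kind of thing you could compute yourself from a flat list of events. A rough sketch, assuming a hypothetical event shape with `node`, `duration_ms`, and an optional `error` field (this is not tracesage's real event format):

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class NodeStats:
    calls: int = 0
    total_ms: float = 0.0
    errors: int = 0


def summarize(events: list[dict]) -> dict[str, NodeStats]:
    """Aggregate call count, total duration, and error count per node."""
    stats: dict[str, NodeStats] = defaultdict(NodeStats)
    for event in events:
        node = stats[event["node"]]
        node.calls += 1
        node.total_ms += event["duration_ms"]
        if event.get("error"):
            node.errors += 1
    return dict(stats)
```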


The killer feature: MCP tool-source attribution

If your agent loads tools from MCP servers (via langchain-mcp-adapters), you usually lose track of where each tool came from — they all look like generic LangChain tools at runtime. tracesage fixes that.

Install the extra and register your MCP client:

pip install "tracesage[mcp]"

Then, in Python:

from langchain_mcp_adapters.client import MultiServerMCPClient
from tracesage import TraceSage
from tracesage.adapters.mcp import register_mcp_client

tracer = await TraceSage.create()

client = MultiServerMCPClient({
    "weather": {"command": "python", "args": ["weather_server.py"], "transport": "stdio"},
    "math":    {"command": "python", "args": ["math_server.py"],    "transport": "stdio"},
})

# Loads every server's tools AND records tool → server provenance.
tools = await register_mcp_client(tracer, client)
# Your own @tool functions stay "local" (unattributed) automatically.

Now the UI shows a "Tools by source" panel and dedicated mcp: nodes, with every tool grouped by where it came from.

Click an MCP server node and you see exactly what it provides, how often it was called, and which agents used it.
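Conceptually, tool-source attribution comes down to a tool-name-to-server map that is built when the tools are loaded and consulted when they are called. A minimal sketch of that idea; `build_provenance`, `group_by_source`, and the tool names in the usage note are made up for illustration and say nothing about how tracesage stores this internally.

```python
def build_provenance(server_tools: dict[str, list[str]]) -> dict[str, str]:
    """Invert {server: [tool names]} into {tool name: server}."""
    provenance: dict[str, str] = {}
    for server, tool_names in server_tools.items():
        for tool_name in tool_names:
            # If two servers expose the same name, the later one wins here.
            provenance[tool_name] = server
    return provenance


def group_by_source(called_tools: list[str],
                    provenance: dict[str, str]) -> dict[str, list[str]]:
    """Group called tool names by source; unknown tools count as "local"."""
    groups: dict[str, list[str]] = {}
    for name in called_tools:
        server = provenance.get(name)
        source = f"mcp:{server}" if server is not None else "local"
        groups.setdefault(source, []).append(name)
    return groups
```

For example, with `{"weather": ["get_forecast"], "math": ["add"]}` as input, a run that called `get_forecast`, `add`, and a local `to_fahrenheit` would be grouped under `mcp:weather`, `mcp:math`, and `local`.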

A complete, runnable MCP example (two local stdio servers + hardcoded tools, no API key needed) lives in the repo at examples/mcp/:

pip install "tracesage[mcp]"
python examples/mcp/main.py     # then open http://localhost:7842/ui

Testing your agents in CI

Tracing isn't just for eyeballing. tracesage ships a pytest fixture (tracesage_capture, auto-registered) so you can assert behaviour:

def test_agent_uses_search(tracesage_capture):
    # `agent` is assumed to be built elsewhere and to expose a tool named "search".
    agent.invoke("find me a hotel in Paris")
    tracesage_capture.assert_tool_called("search")   # matches the test name and prompt
    tracesage_capture.assert_no_errors()
    assert tracesage_capture.total_tokens()[0] < 5000   # input-token budget

No setup, no server — the fixture captures the run in-process and gives you assertions like assert_tool_called, assert_no_errors, and total_tokens.
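To show what those three assertions check, here is a toy stand-in that mimics their semantics. `ToyCapture` is purely illustrative and is not the real fixture; in particular, the `(input, output)` ordering of `total_tokens()` is an assumption extrapolated from the `[0]` input-token index used above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCapture:
    """Illustrative stand-in for a capture object (not tracesage's class)."""
    tool_calls: list[str] = field(default_factory=list)
    errors: list[str] = field(default_factory=list)
    input_tokens: int = 0
    output_tokens: int = 0

    def assert_tool_called(self, name: str) -> None:
        assert name in self.tool_calls, (
            f"expected tool {name!r} to be called, saw {self.tool_calls}"
        )

    def assert_no_errors(self) -> None:
        assert not self.errors, f"run recorded errors: {self.errors}"

    def total_tokens(self) -> tuple[int, int]:
        # Assumed ordering: (input tokens, output tokens).
        return (self.input_tokens, self.output_tokens)
```

The useful property is the failure message: when a test fails, it tells you which tools actually ran, which is usually the first thing you want to know.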


Shipping to production (and turning it off)

tracesage is built so you can wire it in once and control it per-environment:

  • Kill switch: set TRACESAGE_ENABLED=false (or enabled=False) and TraceSage returns an inert tracer — no server, no DB, a no-op handler, near-zero overhead. Same code ships to prod; tracing just turns off.

  • Capture without the UI: TRACESAGE_START_SERVER=false records traces to disk in prod without binding the in-process UI; view them later with tracesage serve.

  • Safety rails: bearer-token auth, root-level sampling (sample_rate), a per-run event cap, and a hard fail-stop if you bind a non-loopback address without an auth token.

# In prod: capture is off, zero overhead, no code change.
#   TRACESAGE_ENABLED=false
# Or capture quietly, no UI:
#   TRACESAGE_START_SERVER=false   → then `tracesage serve` to look later

Try it yourself

pip install "tracesage[langchain]"
tracesage demo

If you build agents with LangChain or LangGraph and you're tired of print-debugging your way through a run, give it a spin. Two lines, one browser tab, and your agent stops being a black box.

tracesage is MIT-licensed and open source.

