Full Observability in AI Agents: What We Added to the pydantic-deepagents TUI
This is a cross-post. Canonical version: oss.vstorm.co/blog/ai-agent-tui-observability-pydantic-deep/
Earlier this week I covered how pydantic-deepagents handles stuck loops, context-window blindness, and frictionless installation. Today: what you actually see when all of that runs.
The problem with invisible agents
When an AI agent runs, a lot happens between your prompt and the response. The model reasons. It calls tools. Each action burns tokens and costs money. Without observability, you're flying blind — you can't debug, optimize, or trust what's happening.
pydantic-deepagents v0.3.5 — the modular agent runtime for Python — reworks the TUI to surface everything.
What changed
Per-turn token usage
Below every assistant response:
in:2.1K · out:412 · total:2.5K · reqs:3
in = input tokens, out = output tokens, total = turn total, reqs = API calls in this turn.
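The abbreviated counts in that footer are straightforward to reproduce. A minimal sketch — `abbrev` and `turn_usage_line` are hypothetical helper names for illustration, not the project's actual API:

```python
def abbrev(n: int) -> str:
    """Abbreviate a token count the way the footer displays it (2100 -> '2.1K')."""
    if n < 1000:
        return str(n)
    return f"{n / 1000:.1f}K".replace(".0K", "K")

def turn_usage_line(input_tokens: int, output_tokens: int, requests: int) -> str:
    """Render the per-turn usage footer shown below each assistant response."""
    total = input_tokens + output_tokens
    return (
        f"in:{abbrev(input_tokens)} · out:{abbrev(output_tokens)}"
        f" · total:{abbrev(total)} · reqs:{requests}"
    )

print(turn_usage_line(2100, 412, 3))
# → in:2.1K · out:412 · total:2.5K · reqs:3
```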
Cumulative cost in the header
pydantic-deepagents in:45K out:3K · $0.12
Updates after each response. You always know the running total.
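A running total like this only needs per-turn deltas plus a price table. A sketch, assuming illustrative per-million-token prices — the real TUI's pricing source and class names are not documented here:

```python
from dataclasses import dataclass

@dataclass
class UsageTotals:
    """Running totals for the header line; prices are illustrative, not the TUI's."""
    input_tokens: int = 0
    output_tokens: int = 0
    cost_usd: float = 0.0

    def record_turn(self, inp: int, out: int,
                    in_price: float = 3.0, out_price: float = 15.0) -> None:
        """Accumulate one turn's usage; prices are USD per 1M tokens (assumed)."""
        self.input_tokens += inp
        self.output_tokens += out
        self.cost_usd += inp / 1e6 * in_price + out / 1e6 * out_price

    def header(self) -> str:
        """Render the running-total header shown at the top of the TUI."""
        return (f"pydantic-deepagents in:{self.input_tokens // 1000}K"
                f" out:{self.output_tokens // 1000}K · ${self.cost_usd:.2f}")
```

With the assumed prices, 45K input and 3K output tokens render as `pydantic-deepagents in:45K out:3K · $0.18`; the actual dollar figure depends on the model's real rates.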
Thinking streamed live → collapsed
Model reasoning appears as dimmed text while it streams, then collapses to a one-line summary when the turn finishes. You can watch the agent reason without drowning in it.
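The collapse step can be sketched as a pure function over the finished reasoning text — `collapse_thinking` is a hypothetical helper, not the TUI's actual implementation:

```python
def collapse_thinking(reasoning: str, width: int = 60) -> str:
    """Collapse a finished reasoning block to a one-line summary:
    the first line (truncated to `width`) plus the total size."""
    stripped = reasoning.strip()
    first = stripped.splitlines()[0] if stripped else ""
    if len(first) > width:
        first = first[: width - 3] + "..."
    return f"[thinking · {len(reasoning)} chars] {first}"
```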
Side panel on startup
Opens automatically when the terminal is ≥100 characters wide, showing subagents before any task starts:
Subagents:
• planner (idle)
• research (idle)
Status updates as agents are delegated work.
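The width gate and the panel body are both simple. A sketch using the stdlib's terminal-size query — the function and constant names are assumptions, and the real panel is presumably a TUI widget rather than a string:

```python
import shutil

PANEL_MIN_WIDTH = 100  # columns required before the side panel auto-opens

def should_open_panel() -> bool:
    """Open the subagent panel only when the terminal is wide enough."""
    return shutil.get_terminal_size().columns >= PANEL_MIN_WIDTH

def render_panel(subagents: dict[str, str]) -> str:
    """Render the panel body from a name -> status mapping."""
    lines = ["Subagents:"]
    lines += [f"• {name} ({status})" for name, status in subagents.items()]
    return "\n".join(lines)
```

`shutil.get_terminal_size()` falls back to the `COLUMNS` environment variable and then to 80×24 when no terminal is attached, which is why a width gate like this degrades gracefully in pipes and CI.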
All tool calls visible
Todo tools (read_todos, write_todos, add_todo, update_todo_status, remove_todo) were previously hidden; now they're surfaced, so every agent action is visible.
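One way to picture the change is a hidden-tools filter that is now empty — this is a guess at the mechanism, not the project's actual code:

```python
# Before, a set like {"read_todos", "write_todos", "add_todo",
# "update_todo_status", "remove_todo"} was filtered out of the transcript.
# Now nothing is hidden, so every call is rendered (hypothetical sketch).
HIDDEN_TOOLS: set[str] = set()

def visible_tool_calls(calls: list[dict]) -> list[dict]:
    """Keep only the tool calls that should appear in the transcript."""
    return [c for c in calls if c["tool"] not in HIDDEN_TOOLS]
```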
Session saved on crash
_save_session() now runs in a finally block. Whether the run crashes, raises an exception, or is killed by a keyboard interrupt, messages.json is always written. No more lost sessions.
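The pattern is plain try/finally: the finally clause executes on normal exit, on any exception, and on KeyboardInterrupt alike. A sketch — `agent.run` and the event shape are hypothetical stand-ins for the real streaming API:

```python
import json
from pathlib import Path

def run_session(agent, prompt: str, session_dir: Path) -> None:
    """Crash-safe session loop: whatever happens inside the try,
    messages.json is written before control leaves this function."""
    messages: list[dict] = []
    try:
        for event in agent.run(prompt):  # hypothetical streaming API
            messages.append(event)
    finally:
        (session_dir / "messages.json").write_text(json.dumps(messages))
```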
Subagent logs: 20K chars (was 2K)
tool_log.jsonl now stores full subagent output. Critical for /improve — the pipeline that extracts learnings from sessions (more on that tomorrow).
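Raising a truncation limit is a one-constant change; the interesting part is keeping a marker for whatever was dropped. A sketch with assumed names — the real project's limit constant and log writer are not shown here:

```python
SUBAGENT_LOG_LIMIT = 20_000  # characters kept per log entry (was 2_000)

def truncate_for_log(output: str, limit: int = SUBAGENT_LOG_LIMIT) -> str:
    """Trim subagent output before appending it to tool_log.jsonl,
    recording how many characters were dropped (hypothetical helper)."""
    if len(output) <= limit:
        return output
    return output[:limit] + f"… [{len(output) - limit} chars truncated]"
```

At 2K characters, most multi-step subagent transcripts were cut mid-thought; 20K keeps enough context for a downstream pipeline to extract learnings from the full run.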
The full layout
┌──────────────────────────────────────────────────────────┐
│ pydantic-deepagents  in:45K out:3K · $0.12               │
├──────────────────────────────────┬───────────────────────┤
│ [thinking... dimmed text]        │ Subagents:            │
│ [collapsed to summary]           │ • planner (idle)      │
│                                  │ • research (idle)     │
│ Agent response here...           │                       │
│ in:2.1K · out:412 · $0.04        │                       │
└──────────────────────────────────┴───────────────────────┘
Try it
curl -fsSL https://oss.vstorm.co/install.sh | bash
GitHub: github.com/vstorm-co/pydantic-deep
Observability is how you debug, optimize, and trust your agent. A black box is a liability.
What's the first metric you check when debugging an agent run?