LangSmith gives us incredible visibility into LLM applications: full traces, datasets, prompt versioning, evaluations — everything we need to build reliable AI systems.
But actually using LangSmith day-to-day has always felt clunky:
- Constantly refreshing the web UI
- Writing custom API scripts
- Or using MCP servers that quietly eat 16,100 tokens of context — permanently.
I built langsmith-cli to solve this properly.
It's not only dramatically more efficient (177× less context overhead).
It is fundamentally better for real debugging, analysis, and production monitoring workflows.
Here’s why — with real measurements and concrete examples.
1. Context Is Precious — 177× Less Waste
Straight from /context in Claude Code:
- MCP LangSmith tools → 16,100 tokens always loaded (~8% of 200k context)
- langsmith-cli as Skill → 91 tokens only when activated, 0 when idle
→ 177× difference in context overhead.
This is not theoretical.
Every extra 10–20k tokens of tool definitions means less room for:
- conversation history
- source code
- documentation
- actual reasoning
Add 2–3 more MCP servers → 20–30% of your context disappears before you start working.
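A quick back-of-envelope check on those numbers, straight in the shell:
# 16,100 / 91 ≈ 177× (the overhead ratio above)
# Three always-loaded servers of similar size:
echo $(( 3 * 16100 ))                      # 48300 tokens
echo "scale=1; 48300 * 100 / 200000" | bc  # 24.1 → ~24% of a 200k window, gone up front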
2. Real-time Production Monitoring — runs watch
The single feature that made me never want to go back:
langsmith-cli runs watch --project production
You get an auto-refreshing, color-coded terminal dashboard:
- Live status (🟢 / 🔴)
- Latency, token usage, relative time
- Instant visibility into error rate and average performance
- Filter on the fly: --failed, --slow, --model gpt-4, --tag customer-facing
No browser refresh. No delay.
You literally see production break (or recover) in real time.
MCP + web UI simply cannot match this immediacy.
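During an incident you can narrow the dashboard on the spot, for example:
# Watch only failed, customer-facing runs as they happen
langsmith-cli runs watch --project production --failed --tag customer-facing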
3. Powerful, Developer-first Filtering
Finding the right runs should not require writing custom code every time.
Examples that MCP and the web UI can't easily match:
# Regex on run names
langsmith-cli runs list --name-regex "^api-v[0-9]+\.[0-9]+"
# Wildcard + smart presets
langsmith-cli runs list --name-pattern "*auth*" --failed --today
# Time ranges (very natural syntax)
langsmith-cli runs list --since "1 hour ago"
langsmith-cli runs list --last 24h
langsmith-cli runs list --since "2025-12-01" --until "2025-12-02"
# Expensive / slow runs
langsmith-cli runs list --min-tokens 8000 --slow --today
These filters are fast, composable, and — most importantly — stay in your terminal flow.
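They also combine freely. For example, isolating an expensive workflow over the last day (*checkout* is a made-up workflow name; every flag appears above):
# Expensive runs of one workflow in the last 24h
langsmith-cli runs list --name-pattern "*checkout*" --min-tokens 8000 --last 24h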
4. Field Pruning: 95% Token Reduction on Responses
A complex multi-agent trace can easily be ~4,200 tokens.
Fetching 10 failed runs in full → ~42k tokens just for data.
With --fields:
langsmith-cli --json runs list --failed --limit 10 --fields name,error,latency,status
→ ~214 tokens per run instead of 4,210
→ ~95% reduction
You only pay for the information you actually need.
MCP always returns the complete object. Every time.
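For a sense of what the pruned payload looks like, one record from the command above collapses to roughly this (values are illustrative; the real output contains exactly the fields you request):
{"name": "checkout-agent", "error": "TimeoutError: LLM call exceeded 30s", "latency": 31.2, "status": "error"}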
5. Dual UX: Excellent for Humans and Agents
# Human mode (beautiful rich table)
langsmith-cli runs list --project production --limit 8
→ Color-coded, aggregates, relative times, clean formatting
# Agent / script mode (strict, minimal JSON)
langsmith-cli --json runs list --failed --fields name,error,latency --limit 20
One tool. Two perfect interfaces.
No compromises.
6. Export Formats That Actually Help Teams
- --format csv → Excel, pivot tables, stakeholder reports
- --format yaml → configs, reproducible environments
- --json → agents, automation, monitoring pipelines
langsmith-cli runs list --failed --today --format csv > failed-runs-today.csv
Open → analyze → share. Done.
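The other formats compose the same way, for example:
# Reproducible YAML snapshot of today's slow runs
langsmith-cli runs list --slow --today --format yaml > slow-runs-today.yaml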
7. Unix Philosophy — Full Composability
# How many timeout errors today?
# (-c emits one JSON object per line, so wc -l counts matches, not pretty-printed lines)
langsmith-cli --json runs list --failed --today \
| jq -c '.[] | select(.error | contains("timeout"))' \
| wc -l
# Top 5 most common errors
langsmith-cli --json runs list --failed --limit 200 \
| jq -r '.[] | .error' \
| sort | uniq -c | sort -rn | head -5
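Aggregates are just as quick. A sketch, assuming .latency comes back numeric (as in the --fields examples above):
# Average latency across the last 200 runs
langsmith-cli --json runs list --limit 200 --fields latency \
| jq '[.[].latency] | add / length'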
This is where CLI completely outclasses MCP + web.
You already know these tools.
You already have the scripts.
Now they work with LangSmith too.
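For instance, a minimal cron-able health check built only from flags shown above (the echo is a stand-in for your real alerting):
# Alert if anything failed in production in the last hour
count=$(langsmith-cli --json runs list --project production --failed --since "1 hour ago" | jq 'length')
if [ "$count" -gt 0 ]; then
  echo "ALERT: $count failed production runs in the last hour"  # wire up Slack/PagerDuty here
fi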
Quick Start (Really 30–60 Seconds)
# Install (isolated, safe, works everywhere)
curl -sSL https://raw.githubusercontent.com/gigaverse-app/langsmith-cli/main/scripts/install.sh | sh
# Or faster with uv:
uv tool install langsmith-cli
# Add as a skill in Claude Code
/plugin marketplace add gigaverse-app/langsmith-cli
# First login
langsmith-cli auth login
Then try:
langsmith-cli runs watch --project production
# or
langsmith-cli runs list --failed --today
Final Verdict
langsmith-cli is not just "lighter" than MCP.
It is objectively better at the things that matter most when debugging and operating LLM systems in production:
- Real-time visibility
- Powerful filtering without code
- Massive context & token savings
- Beautiful human UX + perfect machine UX
- Export formats teams actually use
- Full Unix-style composability
177× less context overhead is nice.
But being able to watch production live, find problems in seconds, and export meaningful data instantly — that's why I built it, and why I never want to go back.
Give it 60 seconds.
Run /context before and after.
The numbers don't lie.
Repo → https://github.com/gigaverse-app/langsmith-cli (MIT)
Happy (much faster) debugging!
Aviad