LangSmith CLI: Not Just Efficient — Actually Better Than MCP

LangSmith gives us incredible visibility into LLM applications: full traces, datasets, prompt versioning, evaluations — everything we need to build reliable AI systems.

But actually using LangSmith day-to-day has always felt clunky:

  • Constantly refreshing the web UI
  • Writing custom API scripts
  • Or using MCP servers that quietly eat 16,100 tokens of context — permanently.

I built langsmith-cli to solve this properly.

It's not only dramatically more efficient (177× less context overhead).

It is fundamentally better for real debugging, analysis, and production monitoring workflows.

Here’s why — with real measurements and concrete examples.

1. Context Is Precious — 177× Less Waste

Straight from /context in Claude Code:

  • MCP LangSmith tools: 16,100 tokens, always loaded (~8% of 200k context)
  • langsmith-cli as Skill: 91 tokens, only when activated; 0 when idle

177× difference in context overhead.

This is not theoretical.

Every extra 10–20k tokens of tool definitions means less room for:

  • conversation history
  • source code
  • documentation
  • actual reasoning

Add 2–3 more MCP servers → 20–30% of your context disappears before you start working.

2. Real-time Production Monitoring — runs watch

The single feature that made me never want to go back:

langsmith-cli runs watch --project production

You get an auto-refreshing, color-coded terminal dashboard:

  • Live status (🟢 / 🔴)
  • Latency, token usage, relative time
  • Instant visibility into error rate and average performance
  • Filter on the fly (example below): --failed, --slow, --model gpt-4, --tag customer-facing
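
For instance, to watch only failing gpt-4 runs (combining the filters listed above — the exact combination is my assumption):

langsmith-cli runs watch --project production --failed --model gpt-4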

No browser refresh. No delay.

You literally see production break (or recover) in real time.

MCP + web UI simply cannot match this immediacy.

3. Powerful, Developer-first Filtering

Finding the right runs should not require writing custom code every time.

Examples that are awkward at best through MCP or the web UI:

# Regex on run names
langsmith-cli runs list --name-regex "^api-v[0-9]+\.[0-9]+"

# Wildcard + smart presets
langsmith-cli runs list --name-pattern "*auth*" --failed --today

# Time ranges (very natural syntax)
langsmith-cli runs list --since "1 hour ago"
langsmith-cli runs list --last 24h
langsmith-cli runs list --since "2025-12-01" --until "2025-12-02"

# Expensive / slow runs
langsmith-cli runs list --min-tokens 8000 --slow --today

These filters are fast, composable, and — most importantly — stay in your terminal flow.

4. Field Pruning: 95% Token Reduction on Responses

A complex multi-agent trace can easily be ~4,200 tokens.

Fetching 10 failed runs in full → ~42k tokens just for the data.

With --fields:

langsmith-cli --json runs list --failed --limit 10 --fields name,error,latency,status

→ ~214 tokens per run instead of 4,210

~95% reduction

You only pay for the information you actually need.

MCP always returns the complete object. Every time.
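
You can sanity-check the savings yourself with a rough byte count (~4 characters per token is a common rule of thumb):

langsmith-cli --json runs list --failed --limit 10 --fields name,error,latency,status | wc -c
# divide the byte count by ~4 for an approximate token count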

5. Dual Excellent UX — Humans + Agents

# Human mode (beautiful rich table)
langsmith-cli runs list --project production --limit 8

→ Color-coded, aggregates, relative times, clean formatting

# Agent / script mode (strict, minimal JSON)
langsmith-cli --json runs list --failed --fields name,error,latency --limit 20

One tool. Two perfect interfaces.

No compromises.
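
And the JSON mode drops straight into scripts. A quick sketch (jq assumed installed; the flags are the same ones used above):

# Print one "name: error" line per failed run for quick triage
langsmith-cli --json runs list --failed --fields name,error --limit 20 | jq -r '.[] | "\(.name): \(.error)"'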

6. Export Formats That Actually Help Teams

  • --format csv → Excel, pivot tables, stakeholder reports
  • --format yaml → configs, reproducible environments
  • --json → agents, automation, monitoring pipelines

langsmith-cli runs list --failed --today --format csv > failed-runs-today.csv

Open → analyze → share. Done.
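
For example, a YAML snapshot (mirroring the CSV line above; the filename is just an illustration):

langsmith-cli runs list --project production --limit 50 --format yaml > runs-snapshot.yaml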

7. Unix Philosophy — Full Composability

# How many timeout errors today?
langsmith-cli --json runs list --failed --today \
  | jq -c '.[] | select(.error | contains("timeout"))' \
  | wc -l

# Top 5 most common errors
langsmith-cli --json runs list --failed --limit 200 \
  | jq -r '.[] | .error' \
  | sort | uniq -c | sort -rn | head -5

This is where the CLI completely outclasses MCP + web.

You already know these tools.

You already have the scripts.

Now they work with LangSmith too.
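
And because it's all standard shell, it wires into cron or CI with no glue code. A rough sketch (SLACK_WEBHOOK_URL is your own; I'm assuming --last 1h works like the --last 24h shown earlier):

count=$(langsmith-cli --json runs list --failed --last 1h | jq 'length')
if [ "$count" -gt 10 ]; then
  curl -s -X POST -H 'Content-Type: application/json' \
    -d "{\"text\": \"LangSmith: $count failed runs in the last hour\"}" \
    "$SLACK_WEBHOOK_URL"
fi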

Quick Start (Really 30–60 Seconds)

# Install (isolated, safe, works everywhere)
curl -sSL https://raw.githubusercontent.com/gigaverse-app/langsmith-cli/main/scripts/install.sh | sh

# Or faster with uv:
uv tool install langsmith-cli

# Add as skill in Claude Code
/plugin marketplace add gigaverse-app/langsmith-cli

# First login
langsmith-cli auth login

Then try:

langsmith-cli runs watch --project production
# or
langsmith-cli runs list --failed --today

Final Verdict

langsmith-cli is not just "lighter" than MCP.

It is objectively better at the things that matter most when debugging and operating LLM systems in production:

  • Real-time visibility
  • Powerful filtering without code
  • Massive context & token savings
  • Beautiful human UX + perfect machine UX
  • Export formats teams actually use
  • Full Unix-style composability

177× less context overhead is nice.

But being able to watch production live, find problems in seconds, and export meaningful data instantly — that's why I built it, and why I never want to go back.

Give it 60 seconds.

Run /context before and after.

The numbers don't lie.

Repo → https://github.com/gigaverse-app/langsmith-cli (MIT)

Happy (much faster) debugging!

Aviad

#LangSmith #LLM #Observability #AIDevTools #ClaudeCode
