
Aviad Rozenhek

LangSmith CLI: Why Lightweight Skills Crush Heavy MCP Servers (Context Is All You Need)

TL;DR

Measured reality in Claude Code sessions:

  • MCP LangSmith tools: 16,100 tokens always loaded (≈8% of the 200k context window)
  • langsmith-cli as a Skill: 91 tokens when activated, 0 tokens when idle
  • Difference: 177× less context overhead

Installation: 30 seconds vs typical 15+ minutes

Field pruning: up to 95% token reduction on responses

Startup: 43–87 ms cold/warm

Skills win for the majority of stateless AI tooling operations.


The Context Tax – Measured Reality

Right now, in my Claude Code session, the LangSmith MCP tools are consuming:

```
MCP tools · /mcp
├ mcp__langsmith__run_experiment        3.2k tokens
├ mcp__langsmith__push_prompt           2.8k tokens
├ mcp__langsmith__fetch_runs            2.2k tokens
...
└ mcp__langsmith__get_prompt_by_name    146 tokens
TOTAL: 16,100 tokens   (≈8% of 200k context window)
```

These definitions are permanently loaded — even if I never touch LangSmith during the entire conversation.

The same functionality implemented as a Skill (subprocess-based CLI):

```
Skills · /skills
├ commit-commands:clean_gone    46 tokens
├ agent-sdk-dev:new-sdk-app     19 tokens
...
TOTAL when activated: 91 tokens   (0.045% of context)
Inactive: 0 tokens
```

177× difference.

Not an estimate: these are actual numbers from the /context command.


Why Does This Matter? Context Economics

| Item | Cost / size | Impact |
|------|-------------|--------|
| Input tokens | $15 / million (Claude Opus 4.5) | 16k overhead tokens ≈ $0.24 per query before any real work |
| 200k context window | shared resource | 8% permanently occupied |
| 3 typical MCP servers | ~36–48k tokens | 18–24% of context gone |
| Freed context | 35k+ tokens | ≈30 pages of docs / 500+ LOC / long conversation history |

The more MCP servers you add, the faster your effective context window shrinks — before any real work begins.
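The per-query figure is plain arithmetic on the numbers above, if you want to sanity-check it:

```sh
# 16,100 always-loaded tokens at $15 per million input tokens
awk 'BEGIN { printf "$%.4f per query\n", 16100 * 15 / 1e6 }'   # prints $0.2415
```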


Architectural Comparison: Persistent vs On-demand

| Aspect | MCP Servers | Skills (subprocess CLI) |
|--------|-------------|-------------------------|
| Loading moment | At application start | Only when explicitly activated |
| Context occupation | Permanent | Temporary + very small |
| Startup time (measured) | Usually 1–3+ seconds | 43–87 ms |
| Resource consumption | Persistent process | Starts → works → exits |
| Lifecycle management | Required (start/stop/restart/debug) | None |
| Installation complexity | Medium–high (config, env vars, debugging) | Very low (curl / uv tool) |
| Composability | Limited (JSON only) | Excellent (Unix-pipe friendly) |
| Output control | Full objects, always | Field pruning + multiple formats |

Most AI tooling operations are stateless queries: list, get, create, update, export. They don't need persistent connections, connection pools, file watchers, or bidirectional streaming.
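The startup claim is easy to check on your own machine. A minimal sketch, assuming the binary is on your PATH and supports the conventional --help flag:

```sh
# Rough startup-cost check: spawn the CLI, discard output, time it.
# The post reports 43–87 ms cold/warm.
time langsmith-cli --help > /dev/null
```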


Added Value of langsmith-cli (Beyond Context Efficiency)

1. Aggressive field pruning
   - Full Run object ≈ 4.2k tokens
   - Pruned (name, error, latency, etc.) ≈ 200–300 tokens
   - ~90–95% reduction
2. Multiple output formats (see the sketch after this list)
   - `--json`, `--format csv`, `--format yaml`
3. Human-friendly + agent-friendly dual UX
   - Rich tables when interactive, clean JSON when piped
4. Advanced filtering presets
   - `--failed`, `--slow`, `--today`, regex/wildcard on names, etc.
5. Live watching TUI
   - `langsmith-cli runs watch --project production`
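As a sketch of how the pruning, formats, and presets combine in practice: the flags below are the documented ones, but the exact listing subcommand (`runs list`, guessed from the `runs watch` example above) is an assumption, not confirmed CLI syntax:

```sh
# Today's slow runs, human-skimmable (subcommand name is assumed)
langsmith-cli runs list --slow --today --format yaml

# Same query as CSV, composable with ordinary Unix tools
langsmith-cli runs list --slow --today --format csv | head -5
```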


Real Numbers from a Real Session (Debug Example)

Task: find failed runs from the last hour and show their error messages.

Skills version:

- Context cost: 91 tokens (skill definition)
- Response: ≈500 tokens (pruned fields, 5 runs)
- Total: ≈591 tokens

MCP version:

- Context cost: 16,100 tokens (always loaded)
- Response: ≈2,000 tokens (full objects)
- Total: ≈18,100 tokens

30.6× more context for the same information.
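For reference, the Skills-side invocation for this task would look roughly like the following. `--failed` and `--json` are documented; the subcommand name and the jq projection are illustrative, and `--today` stands in for "last hour" since no hour-level preset is documented:

```sh
# Failed runs pruned to just the fields the model needs (names assumed)
langsmith-cli runs list --failed --today --json | jq '.[] | {name, error}'
```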


Installation – 30 Seconds vs 15+ Minutes

Recommended (Skills):

```sh
# One-liner (creates isolated venv, adds to PATH)
curl -sSL https://raw.githubusercontent.com/gigaverse-app/langsmith-cli/main/scripts/install.sh | sh

# Then in Claude Code
/plugin marketplace add gigaverse-app/langsmith-cli
```

Typical MCP path:

  • pip install langsmith-mcp-server
  • manual editing of config.json
  • setting env variables
  • debugging python path / permissions / port conflicts
  • restart client
  • check logs...

In practice this often takes 15–40 minutes.
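For contrast, here is roughly what the registration step looks like even when scripted through Claude Code's claude mcp add; the server module name is a placeholder and flags vary across client versions:

```sh
# Hypothetical MCP registration (module name is a placeholder)
claude mcp add langsmith -e LANGSMITH_API_KEY=lsv2_... -- \
  python -m langsmith_mcp_server
# ...then restart the session and verify with /mcp
```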

When MCP Still Makes Sense (Fair Comparison)

Use MCP servers when you really need:

  • persistent expensive state (connection pools, large in-memory caches)
  • background processing (file watchers, long-polling)
  • bidirectional streaming
  • very heavy initialization (5GB+ ML models)

For 90–95% of current LangSmith / tracing / evaluation use cases, Skills are superior.


Quick Start – Measure It Yourself

```sh
# Install CLI
curl -sSL https://raw.githubusercontent.com/gigaverse-app/langsmith-cli/main/scripts/install.sh | sh

# Add as skill in Claude Code
/plugin marketplace add gigaverse-app/langsmith-cli

# See the dramatic difference
/context
```

Repo: https://github.com/gigaverse-app/langsmith-cli

(MIT license – contributions welcome)


Context is the most precious resource in long-context LLMs.

Don't waste it on infrastructure that can be replaced with an 80-millisecond subprocess call.

Try the skills approach.

The numbers don't lie.

Happy (much lighter) hacking!

Aviad
