DEV Community

Cover image for MCP Spine v0.2.5: I Built a Full Middleware Stack for MCP Tool Calls
Donnyb369
Donnyb369

Posted on

MCP Spine v0.2.5: I Built a Full Middleware Stack for MCP Tool Calls

Last month I shipped MCP Spine v0.1 — a basic proxy that sat between Claude Desktop and MCP servers. It did schema minification and security basics.

Since then, it's grown into a full middleware stack. Here's everything in v0.2.5 and why each piece exists.

The Starting Point

57 tools. 5 servers. Claude Desktop config file with one entry pointing to Spine. Everything routes through the proxy.

pip install mcp-spine
mcp-spine init
Enter fullscreen mode Exit fullscreen mode

The setup wizard detects your installed servers (npx, node, Python), asks what features you want, and writes a tailored config.

Schema Minification: 61% Fewer Tokens

Every tool call starts with the LLM reading tool schemas. With 57 tools, that's thousands of tokens before the conversation even begins.

Spine's minifier strips $schema, additionalProperties, parameter descriptions, titles, and defaults — keeping only what the LLM actually needs. Level 2 cuts 61% of schema tokens with zero information loss.

The web dashboard shows real-time savings:

Dashboard

State Guard: No More Stale Edits

In long coding sessions, Claude memorizes file contents from earlier in the conversation. Then it "edits" the old version — silently overwriting your current code.

State Guard watches your project files, computes SHA-256 hashes, and injects compact version pins into every tool response. When Claude's cached version doesn't match, it knows to re-read.

Prompt Injection Detection

This one surprised me. Tool responses can contain text that looks like instructions to the LLM — "ignore previous instructions", "[SYSTEM]", or encoded payloads.

Spine now scans every tool response for 8 categories of injection patterns before it reaches the model. Detections are logged as security events and can trigger webhook alerts to Slack or Discord.

# spine/injection.py detects:
# - System prompt overrides
# - Role injection ("you are now a...")
# - Instruction hijacking
# - Jailbreak attempts (DAN, developer mode)
# - Data exfiltration URLs
# - Base64-encoded payloads
Enter fullscreen mode Exit fullscreen mode

Plugin System: The Compliance Layer

This is the feature I'm most excited about. Spine plugins are Python files that hook into the tool call pipeline:

from spine.plugins import SpinePlugin

class SlackFilter(SpinePlugin):
    name = "slack-filter"
    deny_channels = ["hr-private", "exec-salary"]

    def on_tool_response(self, tool_name, arguments, response):
        if "slack" not in tool_name:
            return response
        # Filter messages from denied channels
        content = response.get("content", [])
        filtered = [b for b in content
                    if not any(ch in b.get("text", "").lower()
                              for ch in self.deny_channels)]
        return {**response, "content": filtered}
Enter fullscreen mode Exit fullscreen mode

Drop it in your plugins/ directory, enable in config, done. The LLM never sees messages from those channels.

Four hook points: on_tool_call (transform args or block calls), on_tool_response (filter responses), on_tool_list (hide tools), and lifecycle hooks.

Web Dashboard

Zero-dependency browser dashboard at localhost:8777:

mcp-spine web --db spine_audit.db
Enter fullscreen mode Exit fullscreen mode

Shows tool calls, security events, token budget usage, schema token savings, server latency, request log, and client sessions. Auto-refreshes every 3 seconds.

Tool Response Caching

Read-only tools like read_file and list_directory often get called with the same arguments multiple times in a conversation. Spine now caches these responses:

[tool_cache]
enabled = true
cacheable_tools = ["read_file", "read_query", "list_directory"]
ttl_seconds = 300
Enter fullscreen mode Exit fullscreen mode

Cache hits skip the downstream server call entirely. LRU eviction with TTL expiration.

Everything Else in v0.2.5

  • Token budget: daily limits, per-server limits, warn/block actions, persistent tracking, spine_budget meta-tool
  • Tool aliasing: create_or_update_fileedit_github_file
  • Config hot-reload: edit config while running, changes apply in seconds
  • Webhook notifications: Slack/Discord/JSON alerts on security events
  • Multi-user audit: session-tagged entries, mcp-spine audit --sessions
  • Analytics export: CSV/JSON with time and event filtering
  • Streamable HTTP: MCP 2025-03-26 transport support
  • Interactive wizard: mcp-spine init detects your setup
  • Latency monitoring: per-server tracking with degradation alerts

The Numbers

  • 20 source files
  • 190+ tests
  • CI on Windows + Linux, Python 3.11-3.13
  • AAA score on Glama
  • Approved on mcpservers.org
  • MIT licensed

Try It

pip install mcp-spine
mcp-spine init
mcp-spine doctor --config spine.toml
mcp-spine serve --config spine.toml
mcp-spine web --db spine_audit.db
Enter fullscreen mode Exit fullscreen mode

GitHub: https://github.com/Donnyb369/mcp-spine

What would you build with a plugin system for MCP tool calls?

Top comments (0)