Donnyb369

Posted on Apr 19

I Built the Middleware Layer MCP is Missing

#ai #security #mcp #python

Every MCP tutorial shows the same thing: connect Claude to your filesystem, your database, your GitHub. Five servers, 57 tools, infinite power.

Nobody talks about what happens next.

The Problems Nobody Mentions

Token waste. With 40+ tools loaded, every turn spends thousands of tokens on JSON schemas. A large share of the context window can go to tool definitions before Claude even reads your question.

Context rot. In long coding sessions, Claude keeps working from file contents it read earlier in the conversation. It then edits that old version and silently overwrites your latest changes. You don't notice until the code breaks.

Zero security boundary. MCP servers run with full access. No audit trail. No rate limits. No secret scrubbing. Your GitHub token shows up in logs. There's nothing between the LLM and your tools.

No compliance layer. Claude wants to read Slack? Hope you're okay with it seeing your DMs with your boss. There's no way to filter what reaches the model.

MCP Spine: One Proxy, Full Control

I built MCP Spine — a local-first middleware proxy that sits between your LLM client and your MCP servers. One config file, one entry point in claude_desktop_config.json, and everything routes through it.

Here's what it does:

61% Token Savings

The schema minifier strips fields the model rarely needs from tool definitions: $schema, additionalProperties, verbose descriptions, defaults. Level 2 cuts token usage by 61% without dropping anything needed to call a tool (names, parameters, types).
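To make the idea concrete, here is a minimal sketch of this kind of minification. It is not Spine's actual code: the key list, the 80-character description cutoff, and the function name minify_schema are assumptions chosen for illustration.

```python
# Hypothetical sketch of schema minification; not Spine's implementation.
STRIP_KEYS = {"$schema", "additionalProperties", "default"}  # assumed field list
MAX_DESCRIPTION_CHARS = 80  # assumed cutoff for "verbose" descriptions


def minify_schema(node, max_desc=MAX_DESCRIPTION_CHARS):
    """Return a copy of a tool definition or JSON Schema fragment with noisy fields removed."""
    if isinstance(node, list):
        return [minify_schema(item, max_desc) for item in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        if key == "properties" and isinstance(value, dict):
            # Keys under "properties" are parameter names, not schema keywords,
            # so keep every name and only minify the sub-schemas.
            out[key] = {name: minify_schema(sub, max_desc) for name, sub in value.items()}
        elif key in STRIP_KEYS:
            continue
        elif key == "description" and isinstance(value, str):
            out[key] = value[:max_desc]
        else:
            out[key] = minify_schema(value, max_desc)
    return out
```

Comparing len(json.dumps(tool)) before and after gives a rough feel for the savings on your own tool list. Truncating descriptions does discard text, so the cutoff is a trade-off rather than a free win.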

State Guard Stops Context Rot

Spine watches your project files, tracks SHA-256 hashes, and injects version pins into every tool response. When Claude is working from a stale version of a file, the pin tells it to re-read the file before editing.
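The mechanism can be sketched in a few lines. This is an illustration under my own assumptions, not Spine's code: the StateGuard class, the 12-character pin length, and the "[file versions]" footer format are all hypothetical.

```python
import hashlib
from pathlib import Path


def file_pin(path):
    """Return a short version pin: the first 12 hex chars of the file's SHA-256."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()[:12]


class StateGuard:
    """Hypothetical tracker that remembers the last hash seen for each file."""

    def __init__(self):
        self._pins = {}  # path string -> last pin seen

    def refresh(self, path):
        """Re-hash a file; return True if its content changed since the last refresh."""
        key = str(Path(path))
        new_pin = file_pin(path)
        changed = self._pins.get(key) not in (None, new_pin)
        self._pins[key] = new_pin
        return changed

    def annotate(self, response_text):
        """Append current pins to a tool response so stale copies are easy to spot."""
        if not self._pins:
            return response_text
        lines = [f"{path} sha256:{pin}" for path, pin in sorted(self._pins.items())]
        return response_text + "\n[file versions]\n" + "\n".join(lines)
```

A real watcher would call refresh from a filesystem event rather than on demand; the point here is only that a changed hash produces a different pin in the next response.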

Security That Actually Works

Rate limiting (per-tool and global), path traversal jails, secret scrubbing (AWS keys, GitHub tokens, private keys), HMAC-fingerprinted audit trails, and circuit breakers on failing servers. Defense-in-depth — every layer assumes the others might fail.
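Three of these layers are small enough to sketch with the standard library. The code below is an illustration, not Spine's implementation: the regex patterns are a deliberately small sample, and the function names scrub, inside_jail, audit_entry, and verify_entry are hypothetical.

```python
import hashlib
import hmac
import json
import re
from pathlib import Path

# Illustrative patterns only; real scrubbing needs a broader, tested set.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),            # AWS access key ID
    re.compile(r"gh[pousr]_[A-Za-z0-9]{36,}"),  # GitHub token prefixes
    re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
]


def scrub(text):
    """Replace anything that looks like a secret with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def inside_jail(root, candidate):
    """Return True only if candidate resolves to a path at or below root."""
    root = Path(root).resolve()
    target = (root / candidate).resolve()
    return target == root or root in target.parents


def audit_entry(key, event):
    """Serialize an event and attach an HMAC-SHA256 fingerprint."""
    payload = json.dumps(event, sort_keys=True)
    mac = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return {"event": payload, "hmac": mac}


def verify_entry(key, entry):
    """Recompute the fingerprint and compare in constant time."""
    expected = hmac.new(key, entry["event"].encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, entry["hmac"])
```

The fingerprint only proves an entry was written by someone holding the key, so the key has to live somewhere the LLM and its tools cannot read.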

Plugin System for Compliance

Drop-in Python plugins hook into the tool call pipeline. The included Slack filter example strips messages from sensitive channels before the LLM ever sees them:

from spine.plugins import SpinePlugin

class SlackFilter(SpinePlugin):
    name = "slack-filter"
    deny_channels = ["hr-private", "exec-salary"]

    def on_tool_response(self, tool_name, arguments, response):
        if "slack" not in tool_name:
            return response
        # Filter out messages from denied channels
        ...
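The snippet above elides the filtering itself. One way it could look, as a self-contained sketch: the SpinePlugin class here is a local stand-in so the example runs without the package installed, and the response shape (a dict with a "messages" list whose items carry a "channel" key) is an assumption, not something the Slack server is guaranteed to return.

```python
class SpinePlugin:
    """Local stand-in for spine.plugins.SpinePlugin, for illustration only."""

    name = "base"

    def on_tool_response(self, tool_name, arguments, response):
        return response


class SlackFilter(SpinePlugin):
    name = "slack-filter"
    deny_channels = ["hr-private", "exec-salary"]

    def on_tool_response(self, tool_name, arguments, response):
        if "slack" not in tool_name:
            return response
        # ASSUMPTION: response looks like {"messages": [{"channel": ..., "text": ...}]}
        messages = response.get("messages", [])
        kept = [m for m in messages if m.get("channel") not in self.deny_channels]
        # Return a new dict so the original response is left untouched.
        return {**response, "messages": kept}
```

Filtering on the way back means the denied messages never enter the model's context, regardless of what the tool call asked for.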

Everything Else

Semantic routing with local embeddings (no API calls) — only relevant tools reach the LLM
Human-in-the-loop confirmation for destructive tools
Token budget tracking with daily limits and warn/block enforcement
Config hot-reload — edit your config while Spine is running
Multi-user audit with session-tagged entries
Three transports: stdio, SSE, and Streamable HTTP (MCP 2025-03-26)
Interactive setup wizard (mcp-spine init)
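As one example from this list, token budget tracking with warn/block enforcement can be sketched as a small counter that resets each day. The TokenBudget class, its field names, and the 80% warning threshold are hypothetical choices for illustration, not Spine's configuration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class TokenBudget:
    """Hypothetical daily token counter with warn or block enforcement."""

    daily_limit: int
    mode: str = "warn"       # "warn" only reports; "block" refuses over-limit calls
    warn_ratio: float = 0.8  # assumed threshold for the first warning
    used: int = 0
    day: date = field(default_factory=date.today)

    def record(self, tokens, today=None):
        """Account for a call and return "ok", "warn", or "block"."""
        today = today or date.today()
        if today != self.day:
            # New day: start counting from zero again.
            self.day, self.used = today, 0
        if self.mode == "block" and self.used + tokens > self.daily_limit:
            return "block"  # refused calls are not counted
        self.used += tokens
        if self.used >= self.warn_ratio * self.daily_limit:
            return "warn"
        return "ok"
```

A proxy would call record before forwarding a tool call and drop the call when the result is "block".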

Quick Start

pip install mcp-spine
mcp-spine init
mcp-spine doctor --config spine.toml

Then add one entry to your claude_desktop_config.json:

{
  "mcpServers": {
    "spine": {
      "command": "python",
      "args": ["-m", "spine.cli", "serve", "--config", "/path/to/spine.toml"]
    }
  }
}

Battle-Tested on Windows

Most MCP tooling assumes macOS. Spine is tested on Windows against MSIX sandbox paths, npx.cmd resolution, paths with spaces and parentheses, and environment variable merging, and it uses unbuffered stdout to prevent pipe hangs. It also runs on macOS and Linux.
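For readers hitting the same Windows issues in their own tooling, here is a rough sketch of the general techniques using only the standard library. It is not Spine's code, and the helper names resolve_command, build_env, and spawn are hypothetical. On Windows, shutil.which consults PATHEXT, so a bare "npx" can resolve to npx.cmd.

```python
import os
import shutil
import subprocess
import sys


def resolve_command(name):
    """Return the full path of an executable, or the name unchanged if not found."""
    return shutil.which(name) or name


def build_env(overrides):
    """Merge per-server variables over the parent environment."""
    env = dict(os.environ)
    env.update(overrides)
    # Only affects Python child processes; other runtimes need their own setting.
    env.setdefault("PYTHONUNBUFFERED", "1")
    return env


def spawn(command, args, overrides=None):
    """Start a child server with piped stdio.

    Passing the command as a list lets subprocess quote paths that contain
    spaces or parentheses instead of relying on a shell string.
    """
    return subprocess.Popen(
        [resolve_command(command), *args],
        env=build_env(overrides or {}),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
```

For example, spawn(sys.executable, ["-c", "print('hi')"]) starts a child interpreter whose output can be read from proc.stdout without waiting for a buffer to fill.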

190+ tests, CI on Windows + Linux across Python 3.11-3.13.

DEV Community