<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mitiku Yohannes</title>
    <description>The latest articles on DEV Community by Mitiku Yohannes (@mitiku1).</description>
    <link>https://dev.to/mitiku1</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F136452%2F381b1e91-c5a6-49d1-b409-868cceba913d.png</url>
      <title>DEV Community: Mitiku Yohannes</title>
      <link>https://dev.to/mitiku1</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mitiku1"/>
    <language>en</language>
    <item>
      <title>mcp-pvp — Privacy Vault Protocol for MCP</title>
      <dc:creator>Mitiku Yohannes</dc:creator>
      <pubDate>Sat, 14 Mar 2026 23:54:23 +0000</pubDate>
      <link>https://dev.to/mitiku1/mcp-pvp-privacy-vault-protocol-for-mcp-2b80</link>
      <guid>https://dev.to/mitiku1/mcp-pvp-privacy-vault-protocol-for-mcp-2b80</guid>
      <description>&lt;h2&gt;Inspiration&lt;/h2&gt;

&lt;p&gt;We were building an AI agent that handled customer support emails. The agent used MCP to call tools like &lt;code&gt;send_email&lt;/code&gt;, &lt;code&gt;lookup_account&lt;/code&gt;, and &lt;code&gt;create_ticket&lt;/code&gt;. It worked beautifully — until we looked at the logs.&lt;/p&gt;

&lt;p&gt;Every customer's email address, phone number, and account ID was sitting in plain text across four different places: the LLM prompt, the debug logs, the agent's conversation history, and the tool responses being fed back into the context window. One user request created four copies of their PII that we had no control over.&lt;/p&gt;

&lt;p&gt;The obvious fix — redacting PII before the LLM sees it — breaks everything. If you replace &lt;code&gt;alice@example.com&lt;/code&gt; with &lt;code&gt;[REDACTED]&lt;/code&gt;, the agent tries to call &lt;code&gt;send_email(to="[REDACTED]")&lt;/code&gt; and the workflow fails. You're trading a privacy problem for a functionality problem.&lt;/p&gt;

&lt;p&gt;We needed a way to keep sensitive data hidden from the LLM while still letting tools use it. That's what mcp-pvp does.&lt;/p&gt;

&lt;h2&gt;What it does&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;mcp-pvp&lt;/strong&gt; is an open-source Python library that adds automatic PII protection to any MCP server. Instead of redacting sensitive data, it &lt;strong&gt;tokenizes&lt;/strong&gt; it — replacing real values with typed, opaque references while storing the originals in a local vault.&lt;/p&gt;

&lt;p&gt;Here's what happens:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User says:  "Send the report to alice@example.com"
LLM sees:   "Send the report to [[PII:EMAIL:tkn_x7k9m2]]"
Tool gets:  to="alice@example.com"  (resolved at execution time)
LLM gets:   to="[[PII:EMAIL:tkn_n3w]]"  (re-tokenized result)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM can still reason about the request: it knows there is an email address and that it should pass it to the email tool. But it never possesses the actual value. The raw email stays in the session vault and is handed to tool code only for the brief moment the tool function executes (~50 ms).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key features:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Drop-in integration&lt;/strong&gt; — &lt;code&gt;FastPvpMCP&lt;/code&gt; subclasses FastMCP. Wrap your existing server in 3 lines.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Default-deny policy&lt;/strong&gt; — You declare exactly which PII types each tool can access, at which argument paths. Everything else is blocked.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HMAC-signed capabilities&lt;/strong&gt; — Every disclosure is authorized by a cryptographic token bound to the session, PII reference, and target tool.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Session-scoped storage&lt;/strong&gt; — Each MCP connection gets an isolated vault. Tokens can't leak across sessions. Everything is cleaned up on disconnect.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic re-tokenization&lt;/strong&gt; — Tool results are scanned for PII and re-tokenized before they reach the agent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pluggable detection&lt;/strong&gt; — Built-in regex detector for zero-dependency use, optional Microsoft Presidio integration for production NLP-based detection.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Full audit trail&lt;/strong&gt; — Every tokenization, resolution, and policy denial is logged (never with raw PII values).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;How we built it&lt;/h2&gt;

&lt;h3&gt;Architecture&lt;/h3&gt;

&lt;p&gt;The system has six core components:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Client ──► FastPvpMCP Server
               │
               ├── Detector Pipeline (Regex / Presidio)
               │     Finds PII spans in text
               │
               ├── Vault (central coordinator)
               │     Tokenizes content, resolves tokens, manages lifecycle
               │
               ├── Session Store (in-memory, TTL-bounded)
               │     Isolated per-connection storage for PII values
               │
               ├── Policy Engine (declarative, default-deny)
               │     Controls which tools see which PII types
               │
               ├── Capability Manager (HMAC-SHA256)
               │     Cryptographic authorization for each disclosure
               │
               └── Audit Logger (structured, PII-free)
                     Records every operation for compliance
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Tech stack&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Python 3.11+&lt;/strong&gt; with full type annotations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pydantic v2&lt;/strong&gt; for all data models&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP SDK 1.26+&lt;/strong&gt; — standard Model Context Protocol&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Microsoft Presidio&lt;/strong&gt; (optional) for NLP-based PII detection with spaCy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;structlog&lt;/strong&gt; for structured audit logging&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HMAC-SHA256&lt;/strong&gt; for capability signing with constant-time verification&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;The tokenization flow&lt;/h3&gt;

&lt;p&gt;When content arrives, the vault:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Serializes&lt;/strong&gt; — Recursively flattens nested structures (dicts, lists, Pydantic models) into scannable text&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Detects&lt;/strong&gt; — Runs the detector pipeline to find PII spans with type and confidence&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stores&lt;/strong&gt; — Generates unique references (&lt;code&gt;tkn_&amp;lt;random&amp;gt;&lt;/code&gt;), stores raw values in the session&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Replaces&lt;/strong&gt; — Swaps PII spans with typed tokens, working right-to-left to preserve character positions&lt;/li&gt;
&lt;/ol&gt;
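
&lt;p&gt;The four tokenization steps can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the &lt;code&gt;tokenize&lt;/code&gt; function, the single email regex, and the plain dict used as a store are all assumptions made for the example.&lt;/p&gt;

```python
import re
import secrets

# Hypothetical single-type detector; a real pipeline would cover more PII types.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def tokenize(text, store):
    """Replace each detected email with a typed token; keep raw values in store."""
    spans = [(m.start(), m.end(), "EMAIL") for m in EMAIL_RE.finditer(text)]
    # Replace from the last span to the first so earlier offsets stay valid.
    for start, end, pii_type in sorted(spans, reverse=True):
        ref = "tkn_" + secrets.token_hex(4)
        store[ref] = text[start:end]
        text = text[:start] + f"[[PII:{pii_type}:{ref}]]" + text[end:]
    return text


store = {}
redacted = tokenize("Send the report to alice@example.com and bob@example.org", store)
print(redacted)
```

&lt;p&gt;Working from the rightmost span backwards means a replacement never shifts the offsets of spans that have not been processed yet.&lt;/p&gt;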

&lt;p&gt;When a tool is called with tokenized arguments:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Scans&lt;/strong&gt; — An O(n) state machine extracts token references (no regex backtracking)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Evaluates policy&lt;/strong&gt; — Checks if this PII type is allowed for this tool at this argument path&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Issues capability&lt;/strong&gt; — Creates an HMAC-signed, time-limited authorization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resolves&lt;/strong&gt; — Retrieves the real value from the session store&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Executes&lt;/strong&gt; — Runs the tool with real values (the only moment raw PII leaves the vault)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Re-tokenizes&lt;/strong&gt; — Scans the result for any PII and replaces it with fresh tokens&lt;/li&gt;
&lt;/ol&gt;
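
&lt;p&gt;A compact sketch of the tool-call path, under simplifying assumptions: each argument is either plain text or exactly one token, the policy is a set of (tool, argument, type) triples, the capability step is omitted, and re-tokenization only looks for values the store already knows. The names &lt;code&gt;parse_token&lt;/code&gt; and &lt;code&gt;call_tool&lt;/code&gt; are hypothetical and are not part of the mcp-pvp API.&lt;/p&gt;

```python
import secrets


def parse_token(value):
    """Return (pii_type, ref) when value is exactly one token, else None."""
    if not (value.startswith("[[PII:") and value.endswith("]]")):
        return None
    pii_type, _, ref = value[6:-2].partition(":")
    return (pii_type, ref) if pii_type and ref else None


def call_tool(tool_name, func, args, store, allowed):
    """Resolve token arguments, run the tool, then re-tokenize a flat dict result."""
    resolved = {}
    for name, value in args.items():
        parsed = parse_token(value) if isinstance(value, str) else None
        if parsed is None:
            resolved[name] = value
            continue
        pii_type, ref = parsed
        # Default-deny: the (tool, argument, type) triple must be listed.
        if (tool_name, name, pii_type) not in allowed:
            raise PermissionError(f"{pii_type} not allowed at {tool_name}.{name}")
        resolved[name] = store[ref][1]
    result = func(**resolved)
    # Simplification: only values already known to the store are re-tokenized.
    known = {raw: pii_type for pii_type, raw in store.values()}
    clean = {}
    for key, value in result.items():
        if isinstance(value, str) and value in known:
            ref = "tkn_" + secrets.token_hex(4)
            store[ref] = (known[value], value)
            value = f"[[PII:{known[value]}:{ref}]]"
        clean[key] = value
    return clean


def send_email(to, subject):
    return {"status": "sent", "recipient": to}


store = {"tkn_x7k9m2": ("EMAIL", "alice@example.com")}
allowed = {("send_email", "to", "EMAIL")}
out = call_tool(
    "send_email",
    send_email,
    {"to": "[[PII:EMAIL:tkn_x7k9m2]]", "subject": "Report"},
    store,
    allowed,
)
print(out)
```

&lt;p&gt;The tool body receives the real address, while the caller only ever sees a fresh token in &lt;code&gt;recipient&lt;/code&gt;.&lt;/p&gt;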

&lt;h3&gt;Server code&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp_pvp.bindings.mcp.server&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastPvpMCP&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp_pvp.models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PIIType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Policy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PolicyAllow&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SinkPolicy&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp_pvp.vault&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Vault&lt;/span&gt;

&lt;span class="c1"&gt;# Declare what each tool is allowed to see
&lt;/span&gt;&lt;span class="n"&gt;policy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Policy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sinks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool:send_email&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;SinkPolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;allow&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;PolicyAllow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PIIType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EMAIL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arg_paths&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])]&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool:lookup_user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;SinkPolicy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;allow&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="nc"&gt;PolicyAllow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PIIType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EMAIL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arg_paths&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;email&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
            &lt;span class="nc"&gt;PolicyAllow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PIIType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PHONE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arg_paths&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;phone&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;mcp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastPvpMCP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-app&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vault&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Vault&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send_email&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;subject&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# 'to' is the real email — resolved automatically by the vault
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;recipient&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lookup_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Alice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;email&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;plan&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pro&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;mcp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transport&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stdio&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Every &lt;code&gt;@mcp.tool()&lt;/code&gt; gets automatic PII interception. The &lt;code&gt;pvp_tokenize&lt;/code&gt; tool and &lt;code&gt;pvp://session&lt;/code&gt; resource are registered automatically.&lt;/p&gt;

&lt;h3&gt;Client code&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ClientSession&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Tokenize — raw PII stays server-side
&lt;/span&gt;    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pvp_tokenize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Contact alice@example.com or call 555-0123&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# data["redacted"] = "Contact [[PII:EMAIL:tkn_abc]] or call [[PII:PHONE:tkn_def]]"
&lt;/span&gt;
    &lt;span class="c1"&gt;# Use tokens in tool calls — vault resolves them
&lt;/span&gt;    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;send_email&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;subject&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Monthly update&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="c1"&gt;# Result contains re-tokenized PII, safe for LLM context
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Challenges we ran into&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Token scanning without regex backtracking.&lt;/strong&gt; Our first implementation used a regex pattern for &lt;code&gt;[[PII:TYPE:REF]]&lt;/code&gt; extraction. It worked — until we ran it against adversarial input with thousands of nested brackets. The regex engine would backtrack exponentially. We replaced it with a hand-written state machine (&lt;code&gt;TokenScanner&lt;/code&gt;) that processes each character exactly once in O(n) time, regardless of input.&lt;/p&gt;
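
&lt;p&gt;To illustrate the idea, here is a small single-pass scanner for the &lt;code&gt;[[PII:TYPE:REF]]&lt;/code&gt; format. It is a sketch written for this post, not the library's &lt;code&gt;TokenScanner&lt;/code&gt;: the state names and the allowed character sets for the type and reference parts are assumptions.&lt;/p&gt;

```python
import string

PREFIX = "[[PII:"
TYPE_CHARS = set(string.ascii_uppercase + "_")                 # assumed alphabet
REF_CHARS = set(string.ascii_letters + string.digits + "_")    # assumed alphabet


def scan_tokens(text):
    """Return (pii_type, ref) pairs, looking at each character exactly once."""
    found = []
    state, matched = "prefix", 0  # matched = number of PREFIX chars seen so far
    pii_type, ref = [], []
    for ch in text:
        if state == "prefix":
            if ch == PREFIX[matched]:
                matched += 1
                if matched == len(PREFIX):
                    state, pii_type, ref = "type", [], []
            elif ch == "[":
                # In a run of brackets the last two still form a valid start.
                matched = 2 if matched == 2 else 1
            else:
                matched = 0
            continue
        if state == "type" and ch in TYPE_CHARS:
            pii_type.append(ch)
        elif state == "type" and ch == ":" and pii_type:
            state = "ref"
        elif state == "ref" and ch in REF_CHARS:
            ref.append(ch)
        elif state == "ref" and ch == "]" and ref:
            state = "close"
        elif state == "close" and ch == "]":
            found.append(("".join(pii_type), "".join(ref)))
            state, matched = "prefix", 0
        else:
            # Malformed token: drop it and let this character start a new prefix.
            state, matched = "prefix", (1 if ch == "[" else 0)
    return found


print(scan_tokens("Mail [[PII:EMAIL:tkn_x7k9m2]] or call [[PII:PHONE:tkn_def]]"))
```

&lt;p&gt;There is no backtracking: a failed match only resets the state, so a long run of brackets costs one step per character.&lt;/p&gt;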

&lt;p&gt;&lt;strong&gt;Recursive PII in nested structures.&lt;/strong&gt; Tool arguments and results aren't always flat strings. They're nested dicts, lists, Pydantic models, and sometimes exception objects. We had to build a recursive serializer (&lt;code&gt;serialize_for_pii_detection&lt;/code&gt;) that could flatten any Python structure for PII scanning and then map detected spans back to the original structure for replacement.&lt;/p&gt;
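
&lt;p&gt;A simplified way to picture the recursion is a walker that applies a string transform to every leaf and rebuilds the same shape. This sketch transforms leaf by leaf instead of flattening and mapping spans back, and &lt;code&gt;map_strings&lt;/code&gt; is a hypothetical helper, not &lt;code&gt;serialize_for_pii_detection&lt;/code&gt;. A Pydantic model could be converted with &lt;code&gt;model_dump()&lt;/code&gt; before being passed in.&lt;/p&gt;

```python
def map_strings(value, transform):
    """Rebuild value with transform applied to every string leaf."""
    if isinstance(value, str):
        return transform(value)
    if isinstance(value, dict):
        return {key: map_strings(item, transform) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(map_strings(item, transform) for item in value)
    if isinstance(value, BaseException):
        # Exception messages can carry PII too, so scrub their text form.
        return transform(str(value))
    return value  # numbers, booleans, None pass through unchanged


def mask(text):
    # Stand-in for detection plus tokenization of one known address.
    return text.replace("alice@example.com", "[[PII:EMAIL:tkn_demo]]")


args = {"to": ["alice@example.com"], "meta": {"note": "cc alice@example.com", "retries": 3}}
masked = map_strings(args, mask)
print(masked)
```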

&lt;p&gt;&lt;strong&gt;Result re-tokenization.&lt;/strong&gt; Tools frequently echo their input: &lt;code&gt;send_email&lt;/code&gt; might return &lt;code&gt;{"to": "alice@example.com"}&lt;/code&gt;. If we didn't scan results, that raw email would flow right back into the LLM context. We added automatic result re-tokenization as the final step of every tool execution — it generates fresh token references, maintaining a clean audit trail that separates input tokens from output tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Policy granularity.&lt;/strong&gt; Early versions had tool-level policies ("this tool can see emails"). But that's too coarse — &lt;code&gt;send_email&lt;/code&gt; should see an email in its &lt;code&gt;to&lt;/code&gt; argument, but not if someone passes one in the &lt;code&gt;body&lt;/code&gt; argument. We added &lt;strong&gt;argument path&lt;/strong&gt; specificity so policies control exactly where within a tool's arguments PII can appear.&lt;/p&gt;
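
&lt;p&gt;The difference between tool-level and argument-level rules is easy to see in a toy model. The classes below (&lt;code&gt;AllowRule&lt;/code&gt;, &lt;code&gt;ToolPolicy&lt;/code&gt;) and the &lt;code&gt;is_allowed&lt;/code&gt; function are hypothetical stand-ins for illustration; they do not mirror the library's &lt;code&gt;Policy&lt;/code&gt; models.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AllowRule:
    pii_type: str
    arg_paths: tuple


@dataclass
class ToolPolicy:
    allow: list = field(default_factory=list)


def is_allowed(policies, tool, pii_type, arg_path):
    """Default-deny: True only when an explicit rule matches type and path."""
    policy = policies.get(tool)
    if policy is None:
        return False
    return any(
        rule.pii_type == pii_type and arg_path in rule.arg_paths
        for rule in policy.allow
    )


policies = {"send_email": ToolPolicy(allow=[AllowRule("EMAIL", ("to",))])}

print(is_allowed(policies, "send_email", "EMAIL", "to"))    # listed path
print(is_allowed(policies, "send_email", "EMAIL", "body"))  # same type, wrong path
print(is_allowed(policies, "create_ticket", "EMAIL", "to")) # tool has no policy
```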

&lt;p&gt;&lt;strong&gt;Session isolation under concurrent connections.&lt;/strong&gt; MCP servers handle multiple simultaneous clients. Each client's PII must be completely isolated. We built session-scoped storage with ownership tagging — every stored PII value records which session created it, and cross-session access is rejected even if an attacker guesses a valid token reference.&lt;/p&gt;
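
&lt;p&gt;Ownership tagging can be sketched as a store that records the creating session next to each value and checks it on every read. This is an assumed, simplified design (no TTL handling, no locking), and the &lt;code&gt;SessionStore&lt;/code&gt; class here is written for the example only.&lt;/p&gt;

```python
import secrets


class SessionStore:
    """Toy in-memory store: every value remembers which session created it."""

    def __init__(self):
        self._values = {}  # ref -> (owner_session_id, raw_value)

    def put(self, session_id, raw_value):
        ref = "tkn_" + secrets.token_hex(8)
        self._values[ref] = (session_id, raw_value)
        return ref

    def get(self, session_id, ref):
        entry = self._values.get(ref)
        # Same error for "missing" and "owned by someone else": no oracle.
        if entry is None or entry[0] != session_id:
            raise PermissionError("unknown reference for this session")
        return entry[1]

    def close_session(self, session_id):
        # Drop everything the session created when its connection ends.
        self._values = {
            ref: entry for ref, entry in self._values.items() if entry[0] != session_id
        }


vault = SessionStore()
ref = vault.put("session-a", "alice@example.com")
print(vault.get("session-a", ref))
```

&lt;p&gt;A second session that somehow learns &lt;code&gt;ref&lt;/code&gt; still gets a &lt;code&gt;PermissionError&lt;/code&gt;, and after &lt;code&gt;close_session&lt;/code&gt; the owner does too.&lt;/p&gt;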

&lt;h2&gt;Accomplishments that we're proud of&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Zero protocol changes&lt;/strong&gt; — Works with standard MCP. No custom transport, no protocol extensions. Drop it into any existing MCP server.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;3-line integration&lt;/strong&gt; — Import &lt;code&gt;FastPvpMCP&lt;/code&gt;, define a policy, and replace &lt;code&gt;FastMCP&lt;/code&gt; with &lt;code&gt;FastPvpMCP&lt;/code&gt;. All existing &lt;code&gt;@mcp.tool()&lt;/code&gt; decorators keep working unchanged.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;257 passing tests&lt;/strong&gt; covering adversarial inputs, session integrity, recursive scrubbing, policy edge cases, and audit coherence.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;O(n) guaranteed scanning&lt;/strong&gt; — The state machine scanner handles pathological input where regex would hang.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complete audit trail&lt;/strong&gt; with zero raw PII in any log record — every disclosure is traceable from tokenization through resolution to tool execution.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Default-deny by design&lt;/strong&gt; — It's impossible to accidentally over-permit. There's no wildcard, no "allow all," and LLM/engine sinks are permanently blocked.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;What we learned&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The redaction vs. functionality tradeoff is a false dilemma.&lt;/strong&gt; Tokenization resolves it elegantly — the LLM gets enough semantic information to reason correctly ("this is an email address, I should pass it to the email tool") without ever possessing the raw value.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Capability-based security &amp;gt; access control lists.&lt;/strong&gt; Instead of asking "is this caller allowed?", capabilities encode "this specific operation is authorized." They're unforgeable without the signing key (HMAC-signed), time-limited, and bound to a specific session + PII reference + target tool. Because a capability issued for one tool cannot be replayed against another, this guards against confused deputy attacks.&lt;/p&gt;
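
&lt;p&gt;A capability of this kind can be sketched with Python's standard &lt;code&gt;hmac&lt;/code&gt; module. The claim layout, the &lt;code&gt;issue&lt;/code&gt; and &lt;code&gt;verify&lt;/code&gt; functions, and the hard-coded demo key are assumptions for illustration; only the general technique (HMAC-SHA256 over the bound fields, constant-time comparison, expiry check) comes from the description above.&lt;/p&gt;

```python
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret"  # hypothetical; a real key would be random and private


def issue(session_id, ref, tool, ttl=60, now=None):
    """Sign the (session, ref, tool, expiry) claims with HMAC-SHA256."""
    issued_at = time.time() if now is None else now
    claims = {"session": session_id, "ref": ref, "tool": tool, "exp": issued_at + ttl}
    payload = json.dumps(claims, sort_keys=True).encode()
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload, signature


def verify(payload, signature, session_id, ref, tool, now=None):
    """Accept only an untampered, unexpired capability for this exact use."""
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):  # constant-time comparison
        return False
    claims = json.loads(payload)
    current = time.time() if now is None else now
    bound = (claims["session"], claims["ref"], claims["tool"])
    return bound == (session_id, ref, tool) and claims["exp"] > current


payload, signature = issue("session-a", "tkn_x7k9m2", "send_email")
print(verify(payload, signature, "session-a", "tkn_x7k9m2", "send_email"))
print(verify(payload, signature, "session-a", "tkn_x7k9m2", "create_ticket"))
```

&lt;p&gt;The second check fails because the capability names a different tool, which is the property that blocks reuse of an authorization outside its intended target.&lt;/p&gt;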

&lt;p&gt;&lt;strong&gt;Privacy needs to be a middleware concern, not an application concern.&lt;/strong&gt; If every tool author has to remember to scrub PII from results, someone will forget. Making it automatic and transparent — at the framework level — is the only reliable approach.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Session lifecycle is the natural privacy boundary.&lt;/strong&gt; MCP connections already have a defined lifecycle (connect → use → disconnect). Binding vault sessions to MCP connections means cleanup is automatic and there's no long-lived token database to worry about.&lt;/p&gt;

&lt;h2&gt;What's next&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;More detector backends&lt;/strong&gt; — Support for AWS Comprehend, Google DLP, and Azure AI Language PII detection in addition to Presidio&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streaming token resolution&lt;/strong&gt; — Resolve tokens in streamed responses as chunks arrive, not just batch results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Policy-as-code&lt;/strong&gt; — Load policies from YAML/TOML config files for easier management across environments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metrics and dashboards&lt;/strong&gt; — Export tokenization/resolution counts, policy denial rates, and session statistics via OpenTelemetry&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-language SDKs&lt;/strong&gt; — TypeScript and Go implementations for non-Python MCP servers&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Built with&lt;/h2&gt;

&lt;p&gt;Python, Pydantic, MCP SDK, Microsoft Presidio, spaCy, HMAC-SHA256, structlog&lt;/p&gt;

&lt;h2&gt;Try it&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;mcp-pvp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/Hidet-io/mcp-pvp" rel="noopener noreferrer"&gt;github.com/Hidet-io/mcp-pvp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PyPI:&lt;/strong&gt; &lt;a href="https://pypi.org/project/mcp-pvp" rel="noopener noreferrer"&gt;pypi.org/project/mcp-pvp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docs:&lt;/strong&gt; &lt;a href="https://hidet-io.github.io/mcp-pvp" rel="noopener noreferrer"&gt;hidet-io.github.io/mcp-pvp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;License:&lt;/strong&gt; Apache-2.0&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>privacy</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
