<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Shashikiran ML</title>
    <description>The latest articles on DEV Community by Shashikiran ML (@shashikiran_ml).</description>
    <link>https://dev.to/shashikiran_ml</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3616793%2F18c01036-dda9-466f-92cb-6d259022b78d.png</url>
      <title>DEV Community: Shashikiran ML</title>
      <link>https://dev.to/shashikiran_ml</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/shashikiran_ml"/>
    <language>en</language>
    <item>
      <title>How We Stop PII From Leaking Through AI Pipelines (Without Breaking the LLM)</title>
      <dc:creator>Shashikiran ML</dc:creator>
      <pubDate>Tue, 24 Mar 2026 15:41:03 +0000</pubDate>
      <link>https://dev.to/shashikiran_ml/how-we-stop-pii-from-leaking-through-ai-pipelines-without-breaking-the-llm-36ib</link>
      <guid>https://dev.to/shashikiran_ml/how-we-stop-pii-from-leaking-through-ai-pipelines-without-breaking-the-llm-36ib</guid>
      <description>&lt;p&gt;Every AI pipeline tutorial shows you the happy path. Chunk your documents, embed them, stuff them into a context window, call your LLM, get a great answer.&lt;/p&gt;

&lt;p&gt;None of them show what happens when your documents contain patient names, account numbers, or SSNs. At that point the demo breaks — and so does your compliance posture.&lt;/p&gt;

&lt;p&gt;I've watched this play out repeatedly. Teams build a RAG pipeline, get it working, then legal or security asks a simple question: &lt;strong&gt;where exactly does the PII go?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The answer is usually &lt;em&gt;everywhere.&lt;/em&gt; Into the embedding. Into the prompt. Logged by the orchestration framework. Possibly cached. Definitely inside the LLM API call leaving your network.&lt;/p&gt;

&lt;p&gt;Here's how we think about the problem, and how we solve it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The naive fix that makes things worse
&lt;/h2&gt;

&lt;p&gt;First instinct: redaction. Find the PII, replace with &lt;code&gt;[REDACTED]&lt;/code&gt;, move on.&lt;/p&gt;

&lt;p&gt;This breaks LLMs in a specific way that's easy to miss in testing and obvious in production.&lt;/p&gt;

&lt;p&gt;Take this sentence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Patient John Doe, DOB 04/12/1978, residing at 123 Main Street.
His email is john.doe@example.com and he was prescribed metformin.
Follow-up with Dr. Sarah Connors on 03/15/2025.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After naive redaction:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Patient [REDACTED], DOB [REDACTED], residing at [REDACTED].
His email is [REDACTED] and he was prescribed metformin.
Follow-up with [REDACTED] on [REDACTED].
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ask an LLM to summarize this. The model either hallucinates to fill the gaps or produces a hedged non-answer. Worse, if this record appears in two different chunks, &lt;code&gt;[REDACTED]&lt;/code&gt; gives the model no way to know both refer to the same person.&lt;/p&gt;

&lt;p&gt;Referential integrity is gone. The model can't reason across the context correctly.&lt;/p&gt;




&lt;h2&gt;
  
  
  What format-preserving tokenization does differently
&lt;/h2&gt;

&lt;p&gt;Instead of blanking sensitive values, Protecto replaces them with &lt;strong&gt;typed, entity-scoped tokens&lt;/strong&gt; that preserve meaning and referential integrity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Patient &amp;lt;PER&amp;gt;005O 0BY&amp;lt;/PER&amp;gt;, DOB &amp;lt;DATE_TIME&amp;gt;06N 00E1&amp;lt;/DATE_TIME&amp;gt;,
residing at &amp;lt;ADDRESS&amp;gt;06N 00E1 00003b&amp;lt;/ADDRESS&amp;gt;.
His email is &amp;lt;EMAIL&amp;gt;3&amp;lt;/EMAIL&amp;gt; and he was prescribed metformin.
Follow-up with &amp;lt;PER&amp;gt;7H2K 9QR&amp;lt;/PER&amp;gt; on &amp;lt;DATE_TIME&amp;gt;14P 88X2&amp;lt;/DATE_TIME&amp;gt;.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM sees a coherent record. It knows &lt;code&gt;&amp;lt;PER&amp;gt;005O 0BY&amp;lt;/PER&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;PER&amp;gt;7H2K 9QR&amp;lt;/PER&amp;gt;&lt;/code&gt; are two distinct people. If the same patient appears in five chunks, their token is consistent across all of them.&lt;/p&gt;

&lt;p&gt;Output quality holds. We track this with something we call &lt;strong&gt;RARI&lt;/strong&gt; (Response Accuracy Retention Index): does the LLM still give accurate answers after masking? With typed tokens, yes. With &lt;code&gt;[REDACTED]&lt;/code&gt;, often no.&lt;/p&gt;
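
&lt;p&gt;To make the referential-integrity point concrete, here is a minimal sketch. It is not Protecto's algorithm: the &lt;code&gt;TokenVault&lt;/code&gt; class, the &lt;code&gt;mask_chunk&lt;/code&gt; helper, and the bracketed &lt;code&gt;[PER_1]&lt;/code&gt; token format are all assumptions made for illustration, and the entity list stands in for a real detector. It only shows why one stable token per entity lets a model link the same person across chunks.&lt;/p&gt;

```python
class TokenVault:
    """Toy in-memory vault: one stable token per (entity type, value) pair."""

    def __init__(self):
        self._forward = {}  # (entity_type, value) to token
        self._reverse = {}  # token to original value

    def tokenize(self, entity_type, value):
        key = (entity_type, value)
        if key not in self._forward:
            # Number tokens per entity type in order of first appearance.
            count = sum(1 for etype, _ in self._forward if etype == entity_type)
            token = f"[{entity_type}_{count + 1}]"
            self._forward[key] = token
            self._reverse[token] = value
        return self._forward[key]

    def detokenize(self, token):
        return self._reverse[token]


def mask_chunk(text, entities, vault):
    """Replace each known (entity_type, value) pair that occurs in the text."""
    for entity_type, value in entities:
        if value in text:
            text = text.replace(value, vault.tokenize(entity_type, value))
    return text


vault = TokenVault()
# Hypothetical detector output, hard-coded for the example.
entities = [("PER", "John Doe"), ("PER", "Dr. Sarah Connors")]
chunk_a = mask_chunk("Patient John Doe was prescribed metformin.", entities, vault)
chunk_b = mask_chunk("John Doe has a follow-up with Dr. Sarah Connors.", entities, vault)
print(chunk_a)
print(chunk_b)
```

&lt;p&gt;Both chunks refer to the patient with the same token, while the doctor gets a different one. A flat &lt;code&gt;[REDACTED]&lt;/code&gt; placeholder would collapse all three mentions into the same string.&lt;/p&gt;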


&lt;h2&gt;
  
  
  The masking API in practice
&lt;/h2&gt;

&lt;p&gt;Here's a basic scan-and-mask call using auto-detection:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mask&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;John Doe lives at 123 Main Street. His email is john.doe@example.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://protecto-trial.protecto.ai/api/vault/mask&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &amp;lt;AUTH_TOKEN&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

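&lt;span class="c1"&gt;# Fail fast on HTTP errors before parsing the body&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;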
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response returns the original value alongside its masked token:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"John Doe lives at 123 Main Street. His email is john.doe@example.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"token_value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;PER&amp;gt;005O 0BY&amp;lt;/PER&amp;gt; lives at &amp;lt;ADDRESS&amp;gt;06N 00E1 00003b&amp;lt;/ADDRESS&amp;gt;. His email is &amp;lt;EMAIL&amp;gt;3&amp;lt;/EMAIL&amp;gt;"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"success"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
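
&lt;p&gt;A small helper can pull the masked text out of that response and surface failures early. This is a sketch that assumes only the response shape shown above (a &lt;code&gt;data&lt;/code&gt; list, a &lt;code&gt;success&lt;/code&gt; flag, and an &lt;code&gt;error.message&lt;/code&gt; field); the function name &lt;code&gt;extract_masked_values&lt;/code&gt; and the sample dictionary are hypothetical.&lt;/p&gt;

```python
def extract_masked_values(body):
    """Return the token_value of every item, or raise if the call failed.

    Assumes a response with a "data" list, a "success" flag, and an
    "error" object carrying a "message" field.
    """
    if not body.get("success"):
        message = body.get("error", {}).get("message") or "unknown error"
        raise RuntimeError(f"mask call failed: {message}")
    return [item["token_value"] for item in body.get("data", [])]


# Hand-written sample in the same shape; the values are placeholders.
sample = {
    "data": [{"value": "John Doe", "token_value": "[masked name]"}],
    "success": True,
    "error": {"message": ""},
}
print(extract_masked_values(sample))
```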



&lt;p&gt;To unmask, pass the &lt;code&gt;token_value&lt;/code&gt; back to the unmask endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;unmask_payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;unmask&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;token_value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;PER&amp;gt;005O 0BY&amp;lt;/PER&amp;gt; lives at &amp;lt;ADDRESS&amp;gt;06N 00E1 00003b&amp;lt;/ADDRESS&amp;gt;. His email is &amp;lt;EMAIL&amp;gt;3&amp;lt;/EMAIL&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://protecto-trial.protecto.ai/api/vault/unmask&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &amp;lt;AUTH_TOKEN&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;unmask_payload&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Unmask is role-gated. Only users or agents with the right access policy can retrieve the original values. Every call is logged for audit.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where this fits in a RAG pipeline
&lt;/h2&gt;

&lt;p&gt;There are two integration points: before data hits the vector store, and again before you send retrieved context to the LLM:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Raw documents
      |
   [MASK]           &amp;lt;- Protecto API, at ingestion
      |
Vector store (masked embeddings)
      |
Retrieval
      |
   [MASK]           &amp;lt;- Protecto API, on retrieved chunks at query time
      |
LLM prompt (masked)
      |
LLM response (masked)
      |
   [UNMASK]         &amp;lt;- Protecto API, policy-checked, role-gated
      |
Final response to authorized user
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM never sees real PII. Your vector store doesn't contain it. Your logs don't capture it. An authorized user gets the real data unmasked in the final response.&lt;br&gt;
Everyone else gets tokens.&lt;/p&gt;
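
&lt;p&gt;The flow above can be sketched end to end with stand-ins. Everything here is an assumption for illustration: &lt;code&gt;StubVault&lt;/code&gt; replaces the real masking service, &lt;code&gt;fake_llm&lt;/code&gt; replaces the model call, retrieval is a naive keyword match instead of a vector search, and the &lt;code&gt;clinician&lt;/code&gt; role is a made-up policy.&lt;/p&gt;

```python
class StubVault:
    """Stand-in for the masking service; swaps known values for tokens."""

    def __init__(self, known_values):
        self._to_token = {v: f"[TOKEN_{i}]" for i, v in enumerate(known_values, 1)}

    def mask(self, text):
        for value, token in self._to_token.items():
            text = text.replace(value, token)
        return text

    def unmask(self, text, role):
        # Role gate: only an authorized role gets original values back.
        if role != "clinician":
            return text
        for value, token in self._to_token.items():
            text = text.replace(token, value)
        return text


def fake_llm(prompt):
    """Placeholder for the LLM call; echoes the first context line."""
    lines = prompt.splitlines()
    return "Summary: " + (lines[0] if lines else "")


def answer(query, documents, vault, role):
    # Mask at ingestion, so the store never holds raw values.
    store = [vault.mask(doc) for doc in documents]
    # Naive keyword retrieval in place of a vector search.
    words = query.lower().split()
    retrieved = [d for d in store if any(w in d.lower() for w in words)]
    # Mask again at query time, then call the model on tokens only.
    context = "\n".join(vault.mask(d) for d in retrieved)
    response = fake_llm(context)
    # Unmask only after the role check.
    return vault.unmask(response, role)


vault = StubVault(["John Doe"])
docs = ["John Doe was prescribed metformin.", "Clinic hours are 9 to 5."]
print(answer("metformin prescription", docs, vault, role="clinician"))
print(answer("metformin prescription", docs, vault, role="analyst"))
```

&lt;p&gt;The second mask pass is a no-op in this toy because the store is already masked; in a real pipeline it covers content that reached the store by another path.&lt;/p&gt;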


&lt;h2&gt;
  
  
  The async path for batch workloads
&lt;/h2&gt;

&lt;p&gt;Real-time masking is fine for live agent calls. For bulk jobs, such as generating embeddings from 50M records or running ETL pipelines, use the async endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;batch_payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mask&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;He lives in the U.S.A&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Ram lives in the U.S.A&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://protecto-trial.protecto.ai/api/vault/mask/async&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &amp;lt;AUTH_TOKEN&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;batch_payload&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;tracking_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tracking_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The job runs with autoscaling, Kafka integration for streaming pipelines, and built-in caching for repeated text patterns (useful when processing logs with the same patient ID across thousands of entries).&lt;/p&gt;
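
&lt;p&gt;For bulk jobs you will usually split the source into several request bodies. The sketch below only builds payloads in the &lt;code&gt;{"mask": [{"value": ...}]}&lt;/code&gt; shape used above; the helper name &lt;code&gt;build_mask_payloads&lt;/code&gt; and the default batch size of 500 are assumptions, so check the service's documented limits before choosing a size.&lt;/p&gt;

```python
from itertools import islice


def build_mask_payloads(records, batch_size=500):
    """Yield request bodies shaped like {"mask": [{"value": ...}, ...]}.

    Accepts any iterable, so a large source can be streamed without
    loading every record into memory.
    """
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield {"mask": [{"value": text} for text in batch]}


records = ["He lives in the U.S.A", "Ram lives in the U.S.A", "Call Ram tomorrow"]
payloads = list(build_mask_payloads(records, batch_size=2))
print(len(payloads))
print(payloads[1])
```

&lt;p&gt;Each yielded dictionary can be passed as the &lt;code&gt;json&lt;/code&gt; argument of the async call shown earlier.&lt;/p&gt;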




&lt;h2&gt;
  
  
  The edge case that surprised us most
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Arabic-script numerals.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A Middle Eastern financial institution we worked with uses GPT-4o for financial summarization. Their input is mixed Arabic and English, and phone numbers often appear in Eastern Arabic digits (٠١٢٣٤٥٦٧٨٩ rather than 0123456789). Standard NER models, including AWS Comprehend, missed these almost entirely.&lt;/p&gt;

&lt;p&gt;Building character-level patterns on top of the base model to handle this got them to &lt;strong&gt;99% recall, 96% precision.&lt;/strong&gt; It required treating the problem as multilingual NER, not just English PII detection.&lt;/p&gt;

&lt;p&gt;If you're building for international deployments, this class of problem is worth solving before you hit production.&lt;/p&gt;
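
&lt;p&gt;One character-level building block for this class of problem is digit normalization. The sketch below is not the production approach described above, only an illustration: it maps Eastern Arabic digits (U+0660 to U+0669) to ASCII before running an ASCII-only pattern. The loose &lt;code&gt;PHONE_PATTERN&lt;/code&gt; and the function name are assumptions, and Persian digits (U+06F0 to U+06F9) would need their own mapping.&lt;/p&gt;

```python
import re

# Map Eastern Arabic digits (U+0660 to U+0669) to their ASCII equivalents.
EASTERN_ARABIC_TO_ASCII = {0x0660 + i: str(i) for i in range(10)}

# Deliberately loose, ASCII-only pattern: an optional plus sign, then
# 8 to 15 digits that may be separated by single spaces or hyphens.
PHONE_PATTERN = re.compile(r"\+?[0-9](?:[ -]?[0-9]){7,14}")


def find_phone_numbers(text):
    """Normalize digits to ASCII, then return phone-like matches."""
    normalized = text.translate(EASTERN_ARABIC_TO_ASCII)
    return PHONE_PATTERN.findall(normalized)


# The escapes spell 0551234567 in Eastern Arabic digits.
sample = "Call \u0660\u0665\u0665\u0661\u0662\u0663\u0664\u0665\u0666\u0667 tomorrow"
print(PHONE_PATTERN.findall(sample))   # ASCII-only pattern on raw text
print(find_phone_numbers(sample))      # same pattern after normalization
```

&lt;p&gt;Note that Python's &lt;code&gt;\d&lt;/code&gt; already matches any Unicode decimal digit in string patterns, which is why the pattern spells out &lt;code&gt;[0-9]&lt;/code&gt; to mimic a detector that only knows ASCII digits.&lt;/p&gt;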




&lt;h2&gt;
  
  
  On GCP Marketplace
&lt;/h2&gt;

&lt;p&gt;We just listed on &lt;a href="https://console.cloud.google.com/marketplace/product/protecto-public/protecto-data-privacy-vault" rel="noopener noreferrer"&gt;Google Cloud Marketplace&lt;/a&gt;, so if you're on GCP you can deploy Protecto Vault directly from your account without a separate procurement track. The APIs work with LangChain, Databricks, Snowflake, n8n, crewAI, and more.&lt;/p&gt;

&lt;p&gt;Docs: &lt;a href="https://docs.protecto.ai" rel="noopener noreferrer"&gt;docs.protecto.ai&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Happy to answer questions in the comments about the architecture, tokenization approach, or how we handle semantic drift at scale.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>llm</category>
      <category>privacy</category>
    </item>
    <item>
      <title>Building Multi-Tenant AI SaaS Without the Data Privacy Nightmares</title>
      <dc:creator>Shashikiran ML</dc:creator>
      <pubDate>Tue, 09 Dec 2025 16:21:19 +0000</pubDate>
      <link>https://dev.to/shashikiran_ml/building-multi-tenant-ai-saas-without-the-data-privacy-nightmares-2ig4</link>
      <guid>https://dev.to/shashikiran_ml/building-multi-tenant-ai-saas-without-the-data-privacy-nightmares-2ig4</guid>
      <description>&lt;p&gt;You've built something cool. An AI agent that answers customer questions. A RAG system that extracts insights from documents. An LLM endpoint that your users love.&lt;/p&gt;

&lt;p&gt;Then your CISO asks: "Where's the data protection?"&lt;/p&gt;

&lt;p&gt;And you realize: You're shipping customer data through your system completely unmasked. It's in your logs. Your vector database. Your fine-tuning pipeline. Nowhere is it safe.&lt;/p&gt;

&lt;p&gt;Now you have three options:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Buy an enterprise tool&lt;/strong&gt; ($50K+/month, 3-month sales cycle) - Too expensive, too slow&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build your own masking solution&lt;/strong&gt; (6+ months of engineering) - Too complex, too much maintenance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Find something built for developers&lt;/strong&gt; (this is where Protecto SaaS comes in) - Fast, affordable, easy&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This article walks through option 3: how to add production-grade PII masking to your AI stack in an afternoon.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why is PII masking hard in AI?
&lt;/h2&gt;

&lt;p&gt;Most data masking tools were built in the 1990s for enterprise data warehouses. They're designed for database admins and compliance officers. They require:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Infrastructure setup and management&lt;/li&gt;
&lt;li&gt;Custom rule definition&lt;/li&gt;
&lt;li&gt;Manual testing and validation&lt;/li&gt;
&lt;li&gt;Vendor negotiations and contracts&lt;/li&gt;
&lt;li&gt;3-month minimum commitments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Meanwhile, your AI stack moves at a different pace. You need to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add privacy in hours, not months&lt;/li&gt;
&lt;li&gt;Integrate via API, not database connections&lt;/li&gt;
&lt;li&gt;Pay for what you use, not reserved capacity&lt;/li&gt;
&lt;li&gt;Use tools that understand your workflow (LangChain, LlamaIndex, Databricks, etc.)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The specific problem:
&lt;/h3&gt;

&lt;p&gt;When you process customer data through an AI agent, that data needs to flow through multiple layers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Input layer: Customer query with PII&lt;/li&gt;
&lt;li&gt;Logging layer: Everything your agent does gets logged&lt;/li&gt;
&lt;li&gt;Vector DB layer: Embeddings created from customer data&lt;/li&gt;
&lt;li&gt;Fine-tuning layer: Training data with real customer information&lt;/li&gt;
&lt;li&gt;Evaluation layer: Test sets with unmasked examples&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Traditional masking tools can protect one or two layers. But they struggle with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Unstructured text: Customer conversations, documents, support tickets&lt;/li&gt;
&lt;li&gt;Context preservation: When you mask everything, you destroy data utility&lt;/li&gt;
&lt;li&gt;Edge cases: Names hidden in unstructured data, informal identifiers&lt;/li&gt;
&lt;li&gt;Performance: Traditional masking is slow (milliseconds matter in real-time)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result: Most AI teams either ship unprotected (risky) or build custom masking (expensive).&lt;/p&gt;

&lt;h2&gt;
  
  
  Solution: How LLM-Based Detection Changes Everything
&lt;/h2&gt;

&lt;p&gt;Here's the architecture we built at Protecto to solve this:&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer 1: Intelligent PII Detection
&lt;/h3&gt;

&lt;p&gt;Traditional approach: Regex patterns. Simple, fast, but misses 15-30% of actual PII.&lt;/p&gt;

&lt;p&gt;Better approach: Combine LLMs + statistical validation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Raw text&lt;/strong&gt;: "John Smith from Acme Corp called about his account 123-45-6789"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Regex approach finds&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"123-45-6789" → SSN&lt;/li&gt;
&lt;li&gt;Misses: "Acme Corp" (organization), "John Smith" (name, sometimes)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;LLM approach finds&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"John Smith" → PERSON (98% confidence)&lt;/li&gt;
&lt;li&gt;"Acme Corp" → ORG (99% confidence)&lt;/li&gt;
&lt;li&gt;"123-45-6789" → SSN (99% confidence)&lt;/li&gt;
&lt;li&gt;Validates each finding with statistical model&lt;/li&gt;
&lt;li&gt;Result: 99%+ accuracy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why this matters: You catch edge cases that regex misses. You get high confidence scores. You reduce false positives.&lt;/p&gt;
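
&lt;p&gt;The regex baseline from the example above can be reproduced in a few lines. This sketch covers only the pattern-matching half of the comparison; the pattern and the &lt;code&gt;regex_detect&lt;/code&gt; name are assumptions, and no LLM detector is implemented here.&lt;/p&gt;

```python
import re

# Matches the common NNN-NN-NNNN layout only; it cannot tell an SSN from
# any other number written the same way.
SSN_PATTERN = re.compile(r"\b[0-9]{3}-[0-9]{2}-[0-9]{4}\b")


def regex_detect(text):
    """Return (label, matched_text, start, end) for every pattern hit."""
    return [("SSN", m.group(), m.start(), m.end()) for m in SSN_PATTERN.finditer(text)]


text = "John Smith from Acme Corp called about his account 123-45-6789"
findings = regex_detect(text)
print(findings)
```

&lt;p&gt;The only finding is the number. The name and the organization have no fixed shape for a pattern to latch onto, which is the gap a model-based detector is meant to close.&lt;/p&gt;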

&lt;h3&gt;
  
  
  Layer 2: Context-Aware Masking
&lt;/h3&gt;

&lt;p&gt;Here's where most tools fail. They mask aggressively.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before&lt;/strong&gt;: "Patient John Smith has diabetes diagnosed in 2019 and takes metformin daily."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Traditional masking&lt;/strong&gt;:&lt;br&gt;
"Patient [PII] has [PII] diagnosed in [PII] and takes [PII] daily."&lt;br&gt;
→ Completely useless for AI&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Intelligent masking&lt;/strong&gt;:&lt;br&gt;
"Patient [PERSON] has diabetes diagnosed in 2019 and takes metformin daily."&lt;br&gt;
→ AI still understands the context&lt;/p&gt;

&lt;p&gt;The difference: Your LLM can work with masked data. It understands the structure. It knows there's a patient with a condition and a medication. Only the identifying detail (the patient's name) is masked, so the clinical content and the semantic meaning are preserved.&lt;/p&gt;
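
&lt;p&gt;Mechanically, typed masking comes down to replacing detected spans with a label and leaving everything else alone. A minimal sketch, assuming a detector that returns &lt;code&gt;(start, end, label)&lt;/code&gt; offsets; the &lt;code&gt;mask_spans&lt;/code&gt; helper is hypothetical and the span is computed by hand rather than by a real detector.&lt;/p&gt;

```python
def mask_spans(text, spans):
    """Replace each (start, end, label) span with a [LABEL] placeholder.

    Spans are applied right to left so earlier offsets stay valid.
    """
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


sentence = "Patient John Smith has diabetes diagnosed in 2019 and takes metformin daily."
name = "John Smith"
start = sentence.index(name)
spans = [(start, start + len(name), "PERSON")]  # stand-in for detector output
print(mask_spans(sentence, spans))
```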

&lt;h3&gt;
  
  
  Layer 3: Compliance &amp;amp; Control
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Audit logging: Every operation tracked&lt;/li&gt;
&lt;li&gt;Policy management: Define exactly what gets masked and how&lt;/li&gt;
&lt;li&gt;Unmasking controls: Only authorized users can unmask specific records&lt;/li&gt;
&lt;li&gt;Multi-tenancy: Customer data completely isolated&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Real Numbers From Production
&lt;/h2&gt;

&lt;p&gt;We've been running this with customers since June 2024:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Processing 50+ million API calls per month&lt;/li&gt;
&lt;li&gt;99%+ accuracy on PII detection&lt;/li&gt;
&lt;li&gt;Average latency: 12 ms for real-time calls; about 30 seconds per 1M documents for async jobs&lt;/li&gt;
&lt;li&gt;Cost per million API calls: $15-50 depending on data complexity&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Customer Results:
&lt;/h3&gt;

&lt;p&gt;Series A fintech startup: Went from "we can't process customer data" to "training models on real masked data" in 48 hours.&lt;/p&gt;

&lt;p&gt;Healthcare startup: Previously couldn't meet HIPAA requirements for unstructured text. Now processes patient notes with zero compliance risk.&lt;/p&gt;

&lt;p&gt;Enterprise SaaS: Reduced privacy implementation time from 3 months (estimated) to 2 weeks.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Get Started
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Visit &lt;a href="https://portal.protecto.ai/" rel="noopener noreferrer"&gt;https://portal.protecto.ai/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Sign up for a free account&lt;/li&gt;
&lt;li&gt;Activate the account by email verification&lt;/li&gt;
&lt;li&gt;Start using our API (it’s that simple)&lt;/li&gt;
&lt;li&gt;No credit card for free tier. No long-term commitments.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Privacy doesn't have to slow you down. It can be as fast as the code you write.&lt;/p&gt;

&lt;p&gt;The companies winning in 2026 will be the ones that built privacy in from day one, not as an afterthought.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.protecto.ai/saas/" rel="noopener noreferrer"&gt;Try Protecto SaaS&lt;/a&gt; free. See how fast you can add privacy to your AI.&lt;/p&gt;

</description>
      <category>security</category>
      <category>privacy</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
