<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jay</title>
    <description>The latest articles on DEV Community by Jay (@jay_f97d46e0c14e668895cc4).</description>
    <link>https://dev.to/jay_f97d46e0c14e668895cc4</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3893648%2F1ae2803f-c01d-4239-8aef-4bd31831ed6c.png</url>
      <title>DEV Community: Jay</title>
      <link>https://dev.to/jay_f97d46e0c14e668895cc4</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jay_f97d46e0c14e668895cc4"/>
    <language>en</language>
    <item>
      <title>OpenAI Just Released a Privacy Filter. Here's What It Can't Do.</title>
      <dc:creator>Jay</dc:creator>
      <pubDate>Thu, 23 Apr 2026 06:57:28 +0000</pubDate>
      <link>https://dev.to/jay_f97d46e0c14e668895cc4/openai-just-released-a-privacy-filter-heres-what-it-cant-do-38l</link>
      <guid>https://dev.to/jay_f97d46e0c14e668895cc4/openai-just-released-a-privacy-filter-heres-what-it-cant-do-38l</guid>
      <description>&lt;p&gt;OpenAI released their Privacy Filter this week: a 1.5 billion parameter open-source model that detects and redacts PII from text before it reaches a language model. It runs locally, it's Apache 2.0, and it scores 96% F1 on the PII-Masking-300k benchmark.&lt;/p&gt;

&lt;p&gt;It's genuinely good work, and the timing is notable. The company that builds the models developers are sending sensitive data to is now shipping a tool to help them stop doing that. That's a signal.&lt;/p&gt;

&lt;p&gt;But after reading through the release and running the model, I think there's an important gap between what the Privacy Filter &lt;em&gt;does&lt;/em&gt; and what a real production deployment actually needs. If you're evaluating privacy infrastructure for an LLM pipeline, this gap matters.&lt;/p&gt;

&lt;h2&gt;What the OpenAI Privacy Filter does well&lt;/h2&gt;

&lt;p&gt;The model detects eight entity types: names, addresses, emails, phone numbers, URLs, dates, account numbers, and secrets (passwords, API keys). It runs locally, so no data leaves your machine. At int8 quantization it's small enough to run on a standard laptop or even in the browser. Recall is high (98%), which means it misses very little.&lt;/p&gt;

&lt;p&gt;For a quick personal project or a developer who wants a sanity check on a dataset, this is genuinely useful. Running entirely on-device is meaningful. You're not trading one data exposure for another.&lt;/p&gt;

&lt;h2&gt;Where it stops&lt;/h2&gt;

&lt;p&gt;Here's what the filter does: it &lt;strong&gt;redacts&lt;/strong&gt;. It replaces detected PII with a blank. What it does not do is give you any way to get that information back.&lt;/p&gt;

&lt;p&gt;That's fine if you're cleaning a dataset and you never need the original values again. But most LLM pipelines are not dataset cleaning jobs. They look like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User submits a query containing their name, account number, date of birth&lt;/li&gt;
&lt;li&gt;You need the LLM to reason over that information&lt;/li&gt;
&lt;li&gt;The LLM responds&lt;/li&gt;
&lt;li&gt;You need the response to reference the user's actual data, not &lt;code&gt;[REDACTED]&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;One-way redaction breaks this. The LLM gets &lt;code&gt;[REDACTED] opened their account on [REDACTED] and their balance is [REDACTED]&lt;/code&gt;. It either produces a meaningless response or hallucinates values to fill the gaps. Neither is acceptable in production.&lt;/p&gt;

&lt;p&gt;OpenAI's own documentation acknowledges this directly: &lt;em&gt;"This model is designed to be a redaction aid and should not be considered a safety guarantee."&lt;/em&gt; It's a tool for a specific, limited use case. The documentation doesn't position it as infrastructure for a live application.&lt;/p&gt;

&lt;h2&gt;The three problems redaction can't solve&lt;/h2&gt;

&lt;h3&gt;1. LLM coherence&lt;/h3&gt;

&lt;p&gt;When you send &lt;code&gt;[REDACTED]&lt;/code&gt; to a language model, you're not sending anonymized data. You're sending noise. The model has no signal to reason over.&lt;/p&gt;

&lt;p&gt;Fake substitution solves this: replace "John Smith" with "David Park", "555-392-7810" with "555-213-4891". The model reasons naturally over realistic values and produces coherent output. You restore the original values in the response.&lt;/p&gt;

&lt;p&gt;This is the difference between a healthcare AI that says &lt;em&gt;"The patient's condition..."&lt;/em&gt; and one that says &lt;em&gt;"[REDACTED]'s condition..."&lt;/em&gt;&lt;/p&gt;
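&lt;p&gt;A minimal sketch of the substitution-and-restore idea. The helper names and fake values here are illustrative assumptions, not part of any library:&lt;/p&gt;

```python
# Fake substitution: swap real PII for realistic stand-ins before the
# LLM call, then put the originals back afterwards. The specific fake
# values below are illustrative placeholders.

def substitute(text, mapping):
    # mapping pairs each real value with a fake stand-in,
    # e.g. "John Smith" becomes "David Park".
    for real, fake in mapping.items():
        text = text.replace(real, fake)
    return text

def restore(text, mapping):
    # Invert the mapping to recover the original values.
    for real, fake in mapping.items():
        text = text.replace(fake, real)
    return text

mapping = {"John Smith": "David Park", "555-392-7810": "555-213-4891"}
prompt = substitute("Call John Smith at 555-392-7810.", mapping)
# prompt == "Call David Park at 555-213-4891."
response = restore("David Park confirmed the number 555-213-4891.", mapping)
# response == "John Smith confirmed the number 555-392-7810."
```

&lt;p&gt;The LLM only ever sees the fake values, but its output still reads naturally, and the restore pass makes the response reference the real user.&lt;/p&gt;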

&lt;h3&gt;2. Reversibility&lt;/h3&gt;

&lt;p&gt;Every production system that sanitizes inputs needs to restore outputs. Clinical documentation, financial summaries, and legal drafting all share the same requirement: the LLM's response needs to reference real entities. If you can't map sanitized text back to originals, the pipeline isn't useful.&lt;/p&gt;

&lt;p&gt;This requires a session model: map each sanitization operation to a session ID, store the token-to-original mapping, and restore on the way back out. A standalone detection model doesn't include any of this. You'd have to build it yourself.&lt;/p&gt;
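&lt;p&gt;A rough sketch of what building that session model yourself might look like. The storage shape and names are assumptions for illustration, and the detected entities are assumed to come from whatever detector you run upstream:&lt;/p&gt;

```python
import uuid

# Session model sketch: each sanitization operation gets a session ID
# that keys the token-to-original mapping, used later to restore output.
SESSIONS = {}  # session_id mapped to {token: original}

def sanitize(text, detected):
    # detected: list of (original_value, entity_type) pairs,
    # e.g. produced by a detector such as the Privacy Filter.
    session_id = str(uuid.uuid4())
    tokens = {}
    for i, (original, entity_type) in enumerate(detected, start=1):
        token = "[{}_{}]".format(entity_type, i)
        text = text.replace(original, token)
        tokens[token] = original
    SESSIONS[session_id] = tokens
    return session_id, text

def restore(session_id, text):
    # Map sanitized tokens in the LLM response back to originals.
    for token, original in SESSIONS[session_id].items():
        text = text.replace(token, original)
    return text

sid, clean = sanitize("John Smith lives at 12 Elm St.",
                      [("John Smith", "NAME"), ("12 Elm St", "ADDRESS")])
# clean == "[NAME_1] lives at [ADDRESS_2]."
```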

&lt;h3&gt;3. Infrastructure burden&lt;/h3&gt;

&lt;p&gt;The Privacy Filter is 1.5 billion parameters. At float32 that's roughly 6GB. At int8 quantization, it's around 1.5GB. If you want to run this in a serverless environment (Lambda, Cloud Run, or any auto-scaling compute), you're looking at cold start times of 15-20 seconds and significant memory costs. You need to host it, scale it, version it, and monitor it.&lt;/p&gt;
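&lt;p&gt;The back-of-envelope math behind those numbers:&lt;/p&gt;

```python
# Weight memory for a 1.5B-parameter model at two precisions.
params = 1.5e9
fp32_gb = params * 4 / 1e9  # 4 bytes per parameter at float32
int8_gb = params * 1 / 1e9  # 1 byte per parameter at int8
print(fp32_gb, int8_gb)     # 6.0 1.5 -- weights alone, before
                            # activations and runtime overhead
```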

&lt;p&gt;This is not a knock on the model. It's the inherent tradeoff of running local ML inference at scale. But "runs on your laptop" and "runs reliably in production at scale" are different problems.&lt;/p&gt;

&lt;h2&gt;What actually needs to happen before a prompt reaches an LLM&lt;/h2&gt;

&lt;p&gt;A production privacy layer needs to do five things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Detect:&lt;/strong&gt; identify PII across multiple entity types with high recall&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Replace:&lt;/strong&gt; substitute with either tokens or realistic synthetic values (not blanks)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Store:&lt;/strong&gt; persist the mapping between replacement and original&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Forward:&lt;/strong&gt; send the sanitized prompt to the LLM&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Restore:&lt;/strong&gt; replace synthetic values in the response with originals&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The OpenAI Privacy Filter handles step 1 partially, and step 2 only in the redaction sense. It doesn't touch steps 3, 4, or 5.&lt;/p&gt;
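&lt;p&gt;The five steps can be sketched end to end. The regex detector and stubbed LLM call below are toy placeholders standing in for a real detection model and a real API call, not an implementation:&lt;/p&gt;

```python
import re

STORE = {}  # step 3: the replacement-to-original mapping

def detect(text):
    # Step 1 (toy): emails only; a real detector covers many entity types.
    return re.findall(r"[\w.]+@[\w.]+", text)

def replace(text, found):
    # Step 2: substitute tokens, not blanks.
    for i, original in enumerate(found, start=1):
        token = "[EMAIL_{}]".format(i)
        text = text.replace(original, token)
        STORE[token] = original  # step 3: persist the mapping
    return text

def forward(prompt):
    # Step 4 (stub): stands in for the real LLM call.
    return "I emailed " + prompt.split()[-1]

def restore(response):
    # Step 5: put originals back into the response.
    for token, original in STORE.items():
        response = response.replace(token, original)
    return response

sanitized = replace("Contact alice@example.com",
                    detect("Contact alice@example.com"))
# sanitized == "Contact [EMAIL_1]"
print(restore(forward(sanitized)))  # I emailed alice@example.com
```

&lt;p&gt;The LLM never sees the real address, and the caller never sees the token.&lt;/p&gt;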

&lt;h2&gt;The detection layer is actually the easy part&lt;/h2&gt;

&lt;p&gt;Here's something counterintuitive: building a detector with high F1 score is not the hardest problem in PII-safe LLM pipelines. It's an important problem, but it's tractable.&lt;/p&gt;

&lt;p&gt;The harder problems are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-turn conversations:&lt;/strong&gt; when a user mentions their name in turn 1 and you need to track it through turn 7, you need a session model that accumulates entity mappings across turns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliance guarantees:&lt;/strong&gt; healthcare deployments need a BAA, not just a model with high recall. HIPAA/HITECH doesn't care what your F1 score is.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HIPAA mode:&lt;/strong&gt; under HIPAA, you can't send data to a third-party NLP service, even for analysis. All inference has to stay local. This is an architectural requirement, not a configuration toggle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit trails:&lt;/strong&gt; regulated industries need logs of every sanitization operation: what was detected, what was replaced, when, by which customer, with what configuration. Detection alone produces none of this.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Selective purge:&lt;/strong&gt; once data is in your session store, GDPR Article 17 means you need to be able to delete specific values from all historical sessions on request. A detector doesn't touch your storage layer.&lt;/li&gt;
&lt;/ul&gt;
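&lt;p&gt;A toy sketch of the first and last of these, multi-turn session tracking and selective purge. All names and shapes here are illustrative assumptions:&lt;/p&gt;

```python
class ConversationSession:
    # Accumulates token-to-original mappings across turns, so an entity
    # mentioned in turn 1 keeps the same token in turn 7.
    def __init__(self):
        self.mapping = {}  # token mapped to original value
        self.counter = 0

    def add_turn(self, text, detected):
        # detected: original values found in this turn's text.
        for original in detected:
            if original not in self.mapping.values():
                self.counter += 1
                self.mapping["[PII_{}]".format(self.counter)] = original
        for token, original in self.mapping.items():
            text = text.replace(original, token)
        return text

def purge(sessions, value):
    # GDPR Article 17: delete one value from all historical sessions.
    for session in sessions:
        session.mapping = {t: o for t, o in session.mapping.items()
                           if o != value}

s = ConversationSession()
turn1 = s.add_turn("I'm John Smith", ["John Smith"])
# turn1 == "I'm [PII_1]"
turn7 = s.add_turn("Does John Smith qualify?", ["John Smith"])
# turn7 == "Does [PII_1] qualify?" -- same token, no new entry
purge([s], "John Smith")
# s.mapping == {}
```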

&lt;p&gt;These are the problems that make PII-safe LLM pipelines hard in practice. A high-accuracy detector is the entry ticket, not the finish line.&lt;/p&gt;

&lt;h2&gt;The right way to think about it&lt;/h2&gt;

&lt;p&gt;The OpenAI Privacy Filter is a signal that the industry has finally acknowledged that sending raw user data to LLMs is a problem. That acknowledgment matters. The tool is genuinely useful for dataset cleaning, offline processing, and low-stakes applications where one-way redaction is acceptable.&lt;/p&gt;

&lt;p&gt;For production LLM pipelines that need to stay coherent, reversible, and compliant, it's a detection component. One layer of a larger system.&lt;/p&gt;

&lt;p&gt;We built &lt;a href="https://raipii.com" rel="noopener noreferrer"&gt;raipii&lt;/a&gt; because we needed the full system, not just the detector. Three detection layers (regex at confidence 1.0, local Presidio/SpaCy NER, and AWS Comprehend for paid tiers). Three replacement modes (token, redact, and fake substitution). A session model for multi-turn conversations. HIPAA mode with local-only inference. Audit logs. A purge API for GDPR compliance.&lt;/p&gt;

&lt;p&gt;The OpenAI Privacy Filter will likely become one of those detection layers. But the layer is not the pipeline.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>privacy</category>
      <category>llm</category>
      <category>security</category>
    </item>
  </channel>
</rss>
