<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Binu George</title>
    <description>The latest articles on DEV Community by Binu George (@bgp).</description>
    <link>https://dev.to/bgp</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3918991%2Ff18ebbbb-d88f-4735-ae29-e6928b36b858.jpg</url>
      <title>DEV Community: Binu George</title>
      <link>https://dev.to/bgp</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/bgp"/>
    <language>en</language>
    <item>
      <title>I Built an Open-Source AI Firewall Because Every LLM App Leaks Data</title>
      <dc:creator>Binu George</dc:creator>
      <pubDate>Fri, 08 May 2026 02:39:23 +0000</pubDate>
      <link>https://dev.to/bgp/i-built-an-open-source-ai-firewall-because-every-llm-app-leaks-data-5468</link>
      <guid>https://dev.to/bgp/i-built-an-open-source-ai-firewall-because-every-llm-app-leaks-data-5468</guid>
      <description>&lt;p&gt;Every LLM app I audited had the same problem.&lt;/p&gt;

&lt;p&gt;Users type real data into AI features. Names, emails, social security numbers, credit card numbers, medical details. The app takes that input, wraps it in a prompt, and sends it straight to OpenAI or Anthropic. No filtering. No redaction. Nothing.&lt;/p&gt;

&lt;p&gt;The developer didn't plan for it. The product manager didn't think about it. The compliance team doesn't even know AI features exist yet.&lt;/p&gt;

&lt;p&gt;I built &lt;a href="https://aisecuritygateway.ai" rel="noopener noreferrer"&gt;AI Security Gateway&lt;/a&gt; to fix this. It's an open-source proxy that sits between your app and any LLM provider. Every prompt passes through a security layer before it reaches the model.&lt;/p&gt;

&lt;h2&gt;
  
  
  What It Does
&lt;/h2&gt;

&lt;p&gt;The proxy inspects every request in real time and applies four layers of governance:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. PII Redaction
&lt;/h3&gt;

&lt;p&gt;Before your prompt reaches OpenAI, Anthropic, Google, or anyone else, the proxy detects and redacts 28+ PII entity types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Personal identifiers&lt;/strong&gt; — names, emails, phone numbers, dates of birth&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Financial data&lt;/strong&gt; — credit card numbers, IBANs, bank accounts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Government IDs&lt;/strong&gt; — SSNs, passport numbers, driver's licenses&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Medical identifiers&lt;/strong&gt; — medical record numbers, NPI numbers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Locations&lt;/strong&gt; — physical addresses, IP addresses&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom patterns&lt;/strong&gt; — your own regex for internal codes, customer IDs, etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It also handles images. If a user uploads a screenshot to a vision model (GPT-4o, Claude, Gemini), our OCR pipeline extracts text from the image and scans it for PII before the image reaches the provider.&lt;/p&gt;
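
&lt;p&gt;To make the text side of this concrete, here is a minimal sketch of pattern-based redaction. It is not the gateway's actual pipeline: the three patterns, the &lt;code&gt;CUST-&lt;/code&gt; ID format, and the &lt;code&gt;redact&lt;/code&gt; function are assumptions made up for illustration, and a real detector would layer NER on top.&lt;/p&gt;

```python
import re

# Hypothetical patterns for illustration only; real detectors need far more care.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CUSTOMER_ID": re.compile(r"\bCUST-\d{6}\b"),  # example of a custom internal pattern
}


def redact(text: str) -> tuple[str, dict[str, int]]:
    """Replace each match with a typed placeholder and count matches per entity type."""
    counts: dict[str, int] = {}
    for label, pattern in PATTERNS.items():
        text, n = pattern.subn(f"[{label}]", text)
        if n:
            counts[label] = n
    return text, counts
```

&lt;p&gt;Returning the counts alongside the redacted text means the caller can log how many entities of each type were found without ever logging the values themselves.&lt;/p&gt;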

&lt;h3&gt;
  
  
  2. Prompt Injection Blocking
&lt;/h3&gt;

&lt;p&gt;Heuristic detection catches jailbreak attempts, role override attacks, and instruction extraction — combined with custom regex rules for your specific application patterns.&lt;/p&gt;
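
&lt;p&gt;A rough sketch of what heuristic detection can look like follows. The rule names and phrases below are hypothetical examples, not the rules the gateway ships with, and a real rule set would be larger and tuned per application.&lt;/p&gt;

```python
import re

# Hypothetical phrase heuristics, one per attack family named above.
INJECTION_RULES = {
    "role_override": re.compile(
        r"\b(ignore|disregard)\b.{0,40}\b(previous|prior|above)\b.{0,20}\binstructions\b",
        re.I,
    ),
    "instruction_extraction": re.compile(
        r"\b(reveal|print|show|repeat)\b.{0,40}\bsystem prompt\b", re.I
    ),
    "jailbreak": re.compile(r"\byou are now\b.{0,40}\b(unrestricted|DAN)\b", re.I),
}


def find_violations(prompt: str) -> list[str]:
    """Return the names of every rule the prompt trips; an empty list means it passes."""
    return [name for name, rule in INJECTION_RULES.items() if rule.search(prompt)]
```

&lt;p&gt;Custom application rules would simply be extra entries in the same dictionary.&lt;/p&gt;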

&lt;h3&gt;
  
  
  3. Budget Enforcement
&lt;/h3&gt;

&lt;p&gt;Set hard spend caps per API key. When a key hits its limit, the proxy returns &lt;code&gt;HTTP 402&lt;/code&gt;. Not a warning — a hard stop.&lt;/p&gt;

&lt;p&gt;This exists because I watched an agent loop burn through $3,000 in a single night during testing.&lt;/p&gt;
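
&lt;p&gt;The hard-stop behavior can be sketched with a small in-memory ledger. This is an illustration under assumptions: &lt;code&gt;BudgetLedger&lt;/code&gt;, its method names, and the in-memory storage are invented here, and only the &lt;code&gt;402&lt;/code&gt; status comes from the description above.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class BudgetLedger:
    """In-memory spend tracker; the caps and the storage here are illustrative assumptions."""

    caps_usd: dict[str, float]
    spent_usd: dict[str, float] = field(default_factory=dict)

    def status_for(self, api_key: str) -> int:
        """Return 402 once a key has used up its cap, otherwise 200 (forward the request)."""
        cap = self.caps_usd.get(api_key, 0.0)  # unknown keys get no budget
        return 402 if self.spent_usd.get(api_key, 0.0) >= cap else 200

    def record(self, api_key: str, cost_usd: float) -> None:
        """Add the cost of a completed request to the key's running total."""
        self.spent_usd[api_key] = self.spent_usd.get(api_key, 0.0) + cost_usd
```

&lt;p&gt;Checking the ledger before forwarding, rather than after, is what stops a runaway loop: once the cap is reached, no further request reaches the provider.&lt;/p&gt;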

&lt;h3&gt;
  
  
  4. Smart Cost Routing
&lt;/h3&gt;

&lt;p&gt;Configure multiple providers and the proxy automatically routes each request to the cheapest available model. We track live pricing across 600+ models and 8+ providers. Teams typically see 30-60% cost reduction from routing alone.&lt;/p&gt;
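
&lt;p&gt;At its simplest, cheapest-model routing is a minimum over estimated request cost. The sketch below assumes a price table keyed by provider and model. The &lt;code&gt;ModelPrice&lt;/code&gt; fields and the per-1k-token pricing shape are assumptions for illustration, not the gateway's schema.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """One row of a hypothetical price table (USD per 1,000 tokens)."""

    provider: str
    model: str
    usd_per_1k_input: float
    usd_per_1k_output: float
    available: bool = True


def estimate_cost(price: ModelPrice, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the given token counts."""
    return (
        input_tokens * price.usd_per_1k_input + output_tokens * price.usd_per_1k_output
    ) / 1000


def cheapest(candidates: list[ModelPrice], input_tokens: int, output_tokens: int) -> ModelPrice:
    """Pick the available model with the lowest estimated cost for this request."""
    live = [p for p in candidates if p.available]
    if not live:
        raise LookupError("no available model to route to")
    return min(live, key=lambda p: estimate_cost(p, input_tokens, output_tokens))
```

&lt;p&gt;Because input and output tokens are priced separately, the cheapest model can differ between a long-prompt request and a long-completion one, which is why the estimate takes both counts.&lt;/p&gt;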

&lt;h2&gt;
  
  
  The Architecture Decision That Matters Most
&lt;/h2&gt;

&lt;p&gt;AISG is fully stateless. This isn't a feature toggle — it's the architecture.&lt;/p&gt;

&lt;p&gt;Prompts pass through memory and are discarded. Only metadata survives: cost, latency, token counts, PII entity counts, policy violations. The proxy has nowhere to retain prompt content: the database holds metadata only, no log records prompts, and no queue buffers them.&lt;/p&gt;

&lt;p&gt;I made this decision early because the alternative — a proxy that logs everything "for observability" — creates exactly the problem it claims to solve. You're trying to prevent data leaking to third parties, so you route it through a proxy that... stores all the data? That never made sense to me.&lt;/p&gt;
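
&lt;p&gt;One way to picture the metadata-only rule is a record type with no field that could hold prompt text. The sketch below is an assumption about shape, not the project's schema: &lt;code&gt;RequestMetadata&lt;/code&gt;, &lt;code&gt;to_metadata&lt;/code&gt;, and the &lt;code&gt;(entity_type, matched_text)&lt;/code&gt; detection format are hypothetical.&lt;/p&gt;

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestMetadata:
    """Everything persisted for one request; note there is no field for prompt text."""

    model: str
    latency_ms: float
    total_tokens: int
    cost_usd: float
    pii_entity_counts: dict[str, int]
    policy_violations: tuple[str, ...]


def to_metadata(detections, violations, *, model, latency_ms, total_tokens, cost_usd):
    """Reduce (entity_type, matched_text) detections to per-type counts and drop the text."""
    counts = Counter(entity_type for entity_type, _matched_text in detections)
    return RequestMetadata(
        model, latency_ms, total_tokens, cost_usd, dict(counts), tuple(violations)
    )
```

&lt;p&gt;The matched values exist only inside &lt;code&gt;to_metadata&lt;/code&gt; and are gone once it returns, so whatever is written downstream can say "two emails were found" but not which ones.&lt;/p&gt;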

&lt;p&gt;This matters for compliance:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Standard&lt;/th&gt;
&lt;th&gt;What it means with AISG&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;HIPAA&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Patient data in prompts never persists outside your app&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PCI DSS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Credit card numbers redacted before any third-party API call&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GDPR&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No personal data stored by the proxy layer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SOC 2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Audit logs capture what happened without capturing what was said&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  The Tech Stack
&lt;/h2&gt;

&lt;p&gt;For anyone interested in what's under the hood:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Python + FastAPI&lt;/strong&gt; — async proxy layer, handles streaming responses&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Presidio + custom NER&lt;/strong&gt; — multi-layered PII detection pipeline&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database&lt;/strong&gt; — metadata only (costs, violations, never prompts)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker Compose&lt;/strong&gt; — single command self-hosting&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS&lt;/strong&gt; — managed cloud version&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Integration
&lt;/h2&gt;

&lt;p&gt;If you're using the OpenAI SDK, the change is two lines: point &lt;code&gt;base_url&lt;/code&gt; at the gateway and swap in your AISG key.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.aisecuritygateway.ai/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-aisg-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Your existing code stays exactly the same
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Summarize this contract...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No new SDK.&lt;/p&gt;

&lt;p&gt;No wrapper library.&lt;/p&gt;

&lt;p&gt;Your existing OpenAI calls now go through:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PII redaction&lt;/li&gt;
&lt;li&gt;Injection blocking&lt;/li&gt;
&lt;li&gt;Budget enforcement&lt;/li&gt;
&lt;li&gt;Smart routing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All transparent to your application.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Learned Building This
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. PII Detection Is Harder Than You Think
&lt;/h3&gt;

&lt;p&gt;"John Smith" is a name. "Smith &amp;amp; Wesson" is not. "Call me at 555-1234" contains a phone number. "Error code 555-1234" does not. Context matters enormously. Regex alone gets you maybe 60% accuracy. You need NER models layered on top.&lt;/p&gt;
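
&lt;p&gt;The phone-number example can be reproduced with a toy context check. To be clear, this is a keyword-window heuristic standing in for NER, not an NER model. The cue-word list and the &lt;code&gt;phone_numbers&lt;/code&gt; function are made up for illustration.&lt;/p&gt;

```python
import re

PHONE = re.compile(r"\b\d{3}-\d{4}\b")
# Hypothetical cue words; a production pipeline would use an NER model instead of a word list.
PHONE_CUES = {"call", "phone", "tel", "mobile", "text", "reach"}


def phone_numbers(text: str, window: int = 3) -> list[str]:
    """Keep a regex match only when a cue word appears in the few words before it."""
    hits = []
    for match in PHONE.finditer(text):
        preceding = re.findall(r"[a-z]+", text[: match.start()].lower())[-window:]
        if PHONE_CUES.intersection(preceding):
            hits.append(match.group())
    return hits
```

&lt;p&gt;The regex finds the same digits in both sentences. Only the surrounding words separate the phone number from the error code, and a fixed word list will miss plenty of real phrasings, which is the gap NER models fill.&lt;/p&gt;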

&lt;h3&gt;
  
  
  2. Latency Budgets Are Brutal
&lt;/h3&gt;

&lt;p&gt;Every millisecond of proxy overhead is overhead users feel. We got text inspection down to ~50ms. Image OCR still costs ~0.5–1 second. That's the trade-off, and for images containing PII, it's worth it.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Budget Enforcement Became the Killer Feature
&lt;/h3&gt;

&lt;p&gt;I originally built this for PII redaction. But the feature people ask about most is budget caps. Turns out "My agent loop burned $2,000 overnight" is a more common pain point than "My prompts contain SSNs."&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Self-Hosting Is a Trust Multiplier
&lt;/h3&gt;

&lt;p&gt;Making the entire stack open-source under Apache 2.0 was the best decision I made. Enterprise security teams don't trust a proxy they can't inspect. Open source removes that objection immediately.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Managed Cloud
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Website:&lt;/strong&gt; &lt;a href="https://aisecuritygateway.ai" rel="noopener noreferrer"&gt;https://aisecuritygateway.ai&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Free credits:&lt;/strong&gt; 1M credits&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Credit card required:&lt;/strong&gt; No&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Self-Host
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose up
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/aisecuritygateway/aisecuritygateway" rel="noopener noreferrer"&gt;https://github.com/aisecuritygateway/aisecuritygateway&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Documentation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://aisecuritygateway.ai/docs" rel="noopener noreferrer"&gt;https://aisecuritygateway.ai/docs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The project is Apache 2.0 licensed. Stars, issues, and PRs are all welcome.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;I'd love to hear from anyone dealing with PII in LLM prompts.&lt;/p&gt;

&lt;p&gt;What's your current approach?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Filtering at the application layer?&lt;/li&gt;
&lt;li&gt;Using a proxy?&lt;/li&gt;
&lt;li&gt;Ignoring it and hoping for the best?&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>opensource</category>
      <category>python</category>
    </item>
  </channel>
</rss>
