<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Tharuka Madhuwantha Threemavithana</title>
    <description>The latest articles on DEV Community by Tharuka Madhuwantha Threemavithana (@tharuka_madhuwantha).</description>
    <link>https://dev.to/tharuka_madhuwantha</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4004317%2F86a327ff-c000-4bed-8d77-639f35d1775d.jpg</url>
      <title>DEV Community: Tharuka Madhuwantha Threemavithana</title>
      <link>https://dev.to/tharuka_madhuwantha</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tharuka_madhuwantha"/>
    <language>en</language>
    <item>
      <title>Why Regex Can’t Stop AI SQL Injection (And How AST Parsing Can)</title>
      <dc:creator>Tharuka Madhuwantha Threemavithana</dc:creator>
      <pubDate>Fri, 26 Jun 2026 17:15:52 +0000</pubDate>
      <link>https://dev.to/tharuka_madhuwantha/why-regex-cant-stop-ai-sql-injection-and-how-ast-parsing-can-4d05</link>
      <guid>https://dev.to/tharuka_madhuwantha/why-regex-cant-stop-ai-sql-injection-and-how-ast-parsing-can-4d05</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;The rise of "Text-to-SQL" AI agents has created a massive security headache. Giving an LLM access to your database is incredibly powerful, but fundamentally unsafe. Prompt engineering is easily bypassed, and while read-only database credentials stop destructive writes, they do nothing to stop an AI from executing an unbounded &lt;code&gt;SELECT&lt;/code&gt; query that brings your production database to its knees via Denial of Service (DoS).&lt;/p&gt;

&lt;h2&gt;
  
  
  The Regex Fallacy
&lt;/h2&gt;

&lt;p&gt;To solve this, many teams implement middleware that uses regular expressions to scan incoming SQL for keywords like &lt;code&gt;DELETE&lt;/code&gt;, &lt;code&gt;DROP&lt;/code&gt;, or &lt;code&gt;TRUNCATE&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;This is a losing battle. SQL is a complex, declarative language. A clever attacker (or a hallucinating LLM) can easily bypass a regex by wrapping a data-modifying statement inside a Common Table Expression (CTE), which PostgreSQL allows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;recursive_trap&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;RETURNING&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;recursive_trap&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


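&lt;p&gt;A regex anchored at the start of the statement, such as &lt;code&gt;^DELETE&lt;/code&gt;, completely misses this. Matching the keyword anywhere in the text trades the miss for false positives, because a regex cannot tell a statement from a string literal or an identifier. SQL's nested grammar is not a regular language, so no true regular expression can parse it in full.&lt;/p&gt;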
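&lt;p&gt;To make the failure concrete, here is a minimal Go sketch using only the standard &lt;code&gt;regexp&lt;/code&gt; package. The two patterns and the &lt;code&gt;audit_log&lt;/code&gt; query are my own illustrations of typical firewall rules, not rules taken from any specific product.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"regexp"
)

// anchoredDelete mimics a naive rule: reject statements that begin with DELETE.
var anchoredDelete = regexp.MustCompile(`(?i)^\s*DELETE\b`)

// anyDelete is the usual follow-up rule: reject the keyword anywhere in the text.
var anyDelete = regexp.MustCompile(`(?i)\bDELETE\b`)

func main() {
	// The CTE from the example above, on one line.
	cte := "WITH recursive_trap AS (DELETE FROM users RETURNING *) SELECT * FROM recursive_trap;"
	// A read-only query (illustrative) that merely mentions the word in a string literal.
	harmless := "SELECT id FROM audit_log WHERE action = 'delete';"

	fmt.Println(anchoredDelete.MatchString(cte)) // false: the destructive CTE slips through
	fmt.Println(anyDelete.MatchString(cte))      // true: caught
	fmt.Println(anyDelete.MatchString(harmless)) // true: a harmless read is blocked as well
}
```

&lt;p&gt;The anchored rule lets the destructive query through, and the broader rule blocks legitimate reads. Neither pattern knows which token is actually a statement.&lt;/p&gt;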
&lt;h2&gt;
  
  
  The Solution: Abstract Syntax Trees (AST)
&lt;/h2&gt;

&lt;p&gt;The only reliable way to know what a query will do is to parse it with the same grammar the database uses. &lt;/p&gt;

&lt;p&gt;I built &lt;strong&gt;AgentIAM&lt;/strong&gt;, an open-source Postgres proxy written in Go. Instead of reading strings, AgentIAM intercepts the Postgres wire protocol and uses &lt;code&gt;pg_query_go&lt;/code&gt; (a wrapper around PostgreSQL's native C parser) to convert the incoming SQL into an AST.&lt;/p&gt;

&lt;p&gt;By representing the query as a tree structure, we can use a recursive Visitor pattern to walk through every node. If we find a &lt;code&gt;DeleteStmt&lt;/code&gt; node—no matter how deeply it is buried inside a subquery or CTE—we instantly reject the query.&lt;/p&gt;
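&lt;p&gt;The recursive walk can be sketched in a few lines of Go. The &lt;code&gt;Node&lt;/code&gt; struct and &lt;code&gt;findForbidden&lt;/code&gt; function below are hypothetical, simplified stand-ins that I use for illustration. They are not the &lt;code&gt;pg_query_go&lt;/code&gt; types and not the actual AgentIAM code. Only the kind names are borrowed from the PostgreSQL parser.&lt;/p&gt;

```go
package main

import "fmt"

// Node is a hypothetical, simplified AST node (not a pg_query_go type).
type Node struct {
	Kind     string // e.g. "SelectStmt", "CommonTableExpr", "DeleteStmt"
	Children []Node
}

// forbidden lists the statement kinds this sketch rejects.
var forbidden = map[string]bool{
	"DeleteStmt":   true,
	"DropStmt":     true,
	"TruncateStmt": true,
}

// findForbidden walks the tree depth-first and returns the first forbidden
// kind it meets, or "" when the tree contains none.
func findForbidden(n Node) string {
	if forbidden[n.Kind] {
		return n.Kind
	}
	for _, child := range n.Children {
		if kind := findForbidden(child); kind != "" {
			return kind
		}
	}
	return ""
}

func main() {
	// Rough shape of: WITH recursive_trap AS (DELETE ...) SELECT * FROM recursive_trap;
	tree := Node{Kind: "SelectStmt", Children: []Node{
		{Kind: "WithClause", Children: []Node{
			{Kind: "CommonTableExpr", Children: []Node{
				{Kind: "DeleteStmt"},
			}},
		}},
		{Kind: "RangeVar"},
	}}

	fmt.Println(findForbidden(tree)) // DeleteStmt
}
```

&lt;p&gt;Because the check runs on node kinds rather than on text, nesting depth and formatting tricks make no difference to the result.&lt;/p&gt;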
&lt;h2&gt;
  
  
  Beyond Blocking: AST Rewriting
&lt;/h2&gt;

&lt;p&gt;Blocking bad queries is only half the battle. To prevent DoS attacks from unbounded &lt;code&gt;SELECT&lt;/code&gt; statements, AgentIAM goes a step further: it mutates the AST on the fly.&lt;/p&gt;

&lt;p&gt;If the proxy traverses the tree and finds a &lt;code&gt;SelectStmt&lt;/code&gt; that lacks a &lt;code&gt;Limit&lt;/code&gt; node, it dynamically injects a &lt;code&gt;LIMIT 100&lt;/code&gt; node into the tree, deparses the AST back into a SQL string, and forwards it to the database. The AI agent gets its data, and your database survives.&lt;/p&gt;
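&lt;p&gt;Here is a minimal sketch of that check, rewrite, and deparse cycle. The &lt;code&gt;SelectStmt&lt;/code&gt; struct, &lt;code&gt;enforceLimit&lt;/code&gt;, and &lt;code&gt;deparse&lt;/code&gt; are hypothetical names of my own, reduced to the bare minimum. The real tree produced by the parser is far richer, and this is not the AgentIAM implementation.&lt;/p&gt;

```go
package main

import "fmt"

// SelectStmt is a hypothetical, simplified stand-in for a parsed SELECT.
type SelectStmt struct {
	Table    string
	HasLimit bool
	Limit    int
}

// defaultLimit is the cap injected when a query carries no LIMIT.
const defaultLimit = 100

// enforceLimit returns a copy of s with a LIMIT added when none is present.
// An explicit LIMIT is left untouched in this sketch.
func enforceLimit(s SelectStmt) SelectStmt {
	if !s.HasLimit {
		s.HasLimit = true
		s.Limit = defaultLimit
	}
	return s
}

// deparse renders the simplified statement back into SQL text.
func deparse(s SelectStmt) string {
	sql := "SELECT * FROM " + s.Table
	if s.HasLimit {
		sql += fmt.Sprintf(" LIMIT %d", s.Limit)
	}
	return sql
}

func main() {
	unbounded := SelectStmt{Table: "massive_table"}
	bounded := SelectStmt{Table: "massive_table", HasLimit: true, Limit: 10}

	fmt.Println(deparse(enforceLimit(unbounded))) // SELECT * FROM massive_table LIMIT 100
	fmt.Println(deparse(enforceLimit(bounded)))   // SELECT * FROM massive_table LIMIT 10
}
```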
&lt;h2&gt;
  
  
  The Challenges of Wire Proxies
&lt;/h2&gt;

&lt;p&gt;Building this required a deep understanding of the Postgres wire protocol and of &lt;code&gt;pgproto3&lt;/code&gt;, the Go package that encodes and decodes its messages. Intercepting the &lt;code&gt;Parse&lt;/code&gt;, &lt;code&gt;Bind&lt;/code&gt;, and &lt;code&gt;Execute&lt;/code&gt; messages requires careful state management so that the client driver (like &lt;code&gt;psycopg2&lt;/code&gt;) doesn't desync and crash. &lt;/p&gt;
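&lt;p&gt;To give a feel for what sits on the wire, the sketch below builds and decodes a single &lt;code&gt;Parse&lt;/code&gt; frame by hand with the Go standard library. The layout (a type byte, a big-endian Int32 length that counts itself but not the type byte, then null-terminated strings) follows the PostgreSQL protocol documentation. The helper names &lt;code&gt;encodeParse&lt;/code&gt; and &lt;code&gt;decodeParse&lt;/code&gt; are hypothetical, and a real proxy would more likely lean on &lt;code&gt;pgproto3&lt;/code&gt; than hand-roll this.&lt;/p&gt;

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// encodeParse builds a Parse ('P') message with no parameter type hints.
func encodeParse(stmtName, query string) []byte {
	var body bytes.Buffer
	body.WriteString(stmtName)
	body.WriteByte(0) // string terminator
	body.WriteString(query)
	body.WriteByte(0)        // string terminator
	body.Write([]byte{0, 0}) // Int16: zero parameter type OIDs follow

	msg := []byte{'P', 0, 0, 0, 0}
	// The length field counts its own four bytes but not the type byte.
	binary.BigEndian.PutUint32(msg[1:5], uint32(body.Len()+4))
	return append(msg, body.Bytes()...)
}

// decodeParse extracts the statement name and query text from one Parse frame.
func decodeParse(msg []byte) (stmtName, query string, err error) {
	if 5 > len(msg) || msg[0] != 'P' {
		return "", "", errors.New("not a Parse message")
	}
	length := int(binary.BigEndian.Uint32(msg[1:5]))
	if length+1 != len(msg) {
		return "", "", errors.New("length field does not match frame size")
	}
	// Split off the two null-terminated strings; the rest is the OID section.
	fields := bytes.SplitN(msg[5:], []byte{0}, 3)
	if len(fields) != 3 {
		return "", "", errors.New("missing string terminator")
	}
	return string(fields[0]), string(fields[1]), nil
}

func main() {
	frame := encodeParse("", "SELECT * FROM massive_table")
	name, query, err := decodeParse(frame)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("statement=%q query=%q\n", name, query)
}
```

&lt;p&gt;If the proxy rewrites the query text, the length field has to be recomputed before the frame is forwarded, which is one reason the state management needs care.&lt;/p&gt;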

&lt;p&gt;If you're building AI data integrations, stop relying on regex. Check out the open-source implementation of AgentIAM on GitHub and see how AST manipulation can secure your agents.&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/TM-threemavithana" rel="noopener noreferrer"&gt;
        TM-threemavithana
      &lt;/a&gt; / &lt;a href="https://github.com/TM-threemavithana/agentiam" rel="noopener noreferrer"&gt;
        agentiam
      &lt;/a&gt;
    &lt;/h2&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;AgentIAM&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;A Postgres wire proxy that blocks SQL injection from AI agents at the AST level.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Connecting Large Language Models (LLMs) directly to your database for "Text-to-SQL" functionality is incredibly dangerous. AgentIAM sits between your LangChain/LlamaIndex agent and your database, intercepting Postgres wire protocol traffic to parse and block destructive queries before they can execute.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/tm-threemavithana/agentiam/actions/workflows/ci.yml" rel="noopener noreferrer"&gt;&lt;img src="https://github.com/tm-threemavithana/agentiam/actions/workflows/ci.yml/badge.svg" alt="CI"&gt;&lt;/a&gt;
&lt;a href="https://goreportcard.com/report/github.com/tm-threemavithana/agentiam" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/ae6885ac93e8ac0bdc5a474445764f6fc676868a9ff2912f453a892f3b1fabd3/68747470733a2f2f676f7265706f7274636172642e636f6d2f62616467652f6769746875622e636f6d2f746d2d74687265656d6176697468616e612f6167656e7469616d" alt="Go Report Card"&gt;&lt;/a&gt;
&lt;a href="https://www.gnu.org/licenses/agpl-3.0" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/a3e79d899e42547e60b531886472b3cb818e2a42dac09cc16a8d87105c077ca8/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4c6963656e73652d4147504c5f76332d626c75652e737667" alt="License: AGPL v3"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;div&gt;
  &lt;a rel="noopener noreferrer" href="https://github.com/TM-threemavithana/agentiam/assets/demo_screenshot.png"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2FTM-threemavithana%2Fagentiam%2FHEAD%2Fassets%2Fdemo_screenshot.png" alt="AgentIAM Demo" width="800"&gt;&lt;/a&gt;
&lt;/div&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;🛑 The Problem&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;If you give an AI Agent a database connection, it &lt;em&gt;will&lt;/em&gt; eventually try to delete data or overwhelm your database.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Prompt Injection:&lt;/strong&gt; An attacker can easily trick the LLM into generating &lt;code&gt;DELETE FROM users;&lt;/code&gt; instead of a harmless query.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Denial of Service (DoS):&lt;/strong&gt; An LLM might accidentally run &lt;code&gt;SELECT * FROM massive_table;&lt;/code&gt;, attempting to fetch millions of rows and crashing your database server.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Regex Evasion:&lt;/strong&gt; Standard SQL firewalls that use regular expressions can be easily bypassed using nested Common Table Expressions (CTEs), subqueries, or obscure formatting.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Relying on "prompt engineering" or…&lt;/p&gt;&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/TM-threemavithana/agentiam" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1781cp14ca62q4bk0z9c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1781cp14ca62q4bk0z9c.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>postgres</category>
      <category>ai</category>
      <category>security</category>
    </item>
  </channel>
</rss>
