<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Genevieve Breton</title>
    <description>The latest articles on DEV Community by Genevieve Breton (@genevieve_breton_cb795f52).</description>
    <link>https://dev.to/genevieve_breton_cb795f52</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3871062%2Fc7d07369-125b-45ef-aa07-49eb6d9ee21d.png</url>
      <title>DEV Community: Genevieve Breton</title>
      <link>https://dev.to/genevieve_breton_cb795f52</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/genevieve_breton_cb795f52"/>
    <language>en</language>
    <item>
      <title>Pandas pipelines through AI without leaking your column names</title>
      <dc:creator>Genevieve Breton</dc:creator>
      <pubDate>Fri, 12 Jun 2026 09:36:56 +0000</pubDate>
      <link>https://dev.to/genevieve_breton_cb795f52/pandas-pipelines-through-ai-without-leaking-your-column-names-3fie</link>
      <guid>https://dev.to/genevieve_breton_cb795f52/pandas-pipelines-through-ai-without-leaking-your-column-names-3fie</guid>
      <description>&lt;p&gt;&lt;em&gt;Every other framework in this series leaked through identifiers. Pandas leaks through strings — and "never rewrite strings" was the rule that kept the whole pipeline safe.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The rule pandas breaks
&lt;/h2&gt;

&lt;p&gt;Across the &lt;a href="https://dev.to/genevieve_breton_cb795f52/python-obfuscation-for-ai-assistants-runnable-workspaces-and-off-disk-secrets-172i"&gt;Python&lt;/a&gt; and &lt;a href="https://dev.to/genevieve_breton_cb795f52/django-obfuscation-for-ai-assistants-6-invisible-contracts-we-found-the-hard-way-1d62"&gt;Django&lt;/a&gt; articles, one rule held without exception: &lt;strong&gt;PromptCape never rewrites string literals.&lt;/strong&gt; Strings are user-visible labels, error messages, template paths, MIME types, SQL fragments, data. Rewriting them is how you turn &lt;code&gt;"Download CSV"&lt;/code&gt; into garbage on a button, or &lt;code&gt;mime="text/csv"&lt;/code&gt; into a broken response. The whole reverse-apply story leans on it — strings are inert, only identifiers carry the rename.&lt;/p&gt;

&lt;p&gt;Pandas is the framework where that rule fails. Consider one line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;churn_probability&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There is no identifier here worth protecting. &lt;code&gt;df&lt;/code&gt; is a throwaway local. &lt;code&gt;0.5&lt;/code&gt; is a number. The business secret — the fact that this company scores customers on a &lt;em&gt;churn probability&lt;/em&gt; model — lives entirely inside the string literal &lt;code&gt;"churn_probability"&lt;/code&gt;. And in any real pandas pipeline there are dozens of them: &lt;code&gt;"annual_salary"&lt;/code&gt;, &lt;code&gt;"patient_diagnosis_code"&lt;/code&gt;, &lt;code&gt;"ltv_segment"&lt;/code&gt;, &lt;code&gt;"fraud_score"&lt;/code&gt;. The column names &lt;em&gt;are&lt;/em&gt; the schema, and the schema &lt;em&gt;is&lt;/em&gt; the thing you don't want sitting in an AI provider's logs.&lt;/p&gt;

&lt;p&gt;So pandas forces the uncomfortable thing: a &lt;strong&gt;scoped, deliberate exception&lt;/strong&gt; to the never-touch-strings rule. Not "rewrite all strings" — that re-breaks everything the rule protected. Rewrite &lt;em&gt;only&lt;/em&gt; strings the engine can prove are column names. The entire difficulty of pandas obfuscation is in the word &lt;em&gt;prove&lt;/em&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where column names hide
&lt;/h2&gt;

&lt;p&gt;If column names only ever appeared as &lt;code&gt;df["name"]&lt;/code&gt;, this would be a one-line regex. They don't. A column name is any string sitting in a "column position", and pandas has a sprawling vocabulary of column positions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Access pattern&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;th&gt;Column strings&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Single subscript&lt;/td&gt;
&lt;td&gt;&lt;code&gt;df["annual_salary"]&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;annual_salary&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;List subscript&lt;/td&gt;
&lt;td&gt;&lt;code&gt;df[["dept", "salary"]]&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;dept&lt;/code&gt;, &lt;code&gt;salary&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Attribute access&lt;/td&gt;
&lt;td&gt;&lt;code&gt;df.annual_salary&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;annual_salary&lt;/code&gt; (as an &lt;em&gt;identifier&lt;/em&gt;, not a string)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;.loc&lt;/code&gt; / &lt;code&gt;.at&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;df.loc[:, "salary"]&lt;/code&gt;, &lt;code&gt;df.at[0, "salary"]&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;&lt;code&gt;salary&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Grouping&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;df.groupby("department")&lt;/code&gt;, &lt;code&gt;groupby(by=["a", "b"])&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;grouping keys&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Joining&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;df.merge(other, on="employee_id")&lt;/code&gt;, &lt;code&gt;left_on=&lt;/code&gt;, &lt;code&gt;right_on=&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;join keys&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sorting / indexing&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;df.sort_values("hire_date")&lt;/code&gt;, &lt;code&gt;df.set_index("id")&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;sort/index keys&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reshaping&lt;/td&gt;
&lt;td&gt;&lt;code&gt;df.pivot_table(index="dept", columns="year", values="salary")&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;three sets of columns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Melting&lt;/td&gt;
&lt;td&gt;&lt;code&gt;df.melt(id_vars="id", value_vars=["q1", "q2"])&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;id + value columns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Renaming&lt;/td&gt;
&lt;td&gt;&lt;code&gt;df.rename(columns={"old": "new"})&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;keys and values both&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Assignment&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;df.assign(margin=...)&lt;/code&gt;, &lt;code&gt;df["margin"] = ...&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;new column names (identifier &lt;em&gt;and&lt;/em&gt; string)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Named aggregation&lt;/td&gt;
&lt;td&gt;&lt;code&gt;df.agg(avg_pay=("salary", "mean"))&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;output kwarg &lt;code&gt;avg_pay&lt;/code&gt; &lt;strong&gt;and&lt;/strong&gt; source string &lt;code&gt;salary&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Typing / IO&lt;/td&gt;
&lt;td&gt;&lt;code&gt;read_csv(usecols=[...], dtype={...}, names=[...], parse_dates=[...])&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;every listed column&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query strings&lt;/td&gt;
&lt;td&gt;&lt;code&gt;df.query("churn_probability &amp;gt; 0.5")&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;column names &lt;strong&gt;inside an expression string&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Three of these are genuinely nasty:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;rename(columns={...})&lt;/code&gt;&lt;/strong&gt; carries column names in &lt;em&gt;both&lt;/em&gt; the keys and the values of a dict literal. Miss the values and you leak every renamed column.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Named aggregation&lt;/strong&gt; (&lt;code&gt;df.agg(avg_pay=("salary", "mean"))&lt;/code&gt;) puts an output column name in a &lt;strong&gt;keyword-argument position&lt;/strong&gt; — a Python identifier — while the source column sits in a string in the same call. One line, two different obfuscation mechanisms.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;df.query("...")&lt;/code&gt;&lt;/strong&gt; and &lt;code&gt;df.eval("...")&lt;/code&gt; embed column names inside a mini-language that pandas parses at runtime. You can't treat the argument as an opaque string; you have to parse the expression, find the names, rewrite them, and re-serialize — or refuse and warn.&lt;/li&gt;
&lt;/ul&gt;
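
&lt;p&gt;The &lt;code&gt;query&lt;/code&gt; case can be sketched with the standard-library &lt;code&gt;ast&lt;/code&gt; module. This is a minimal illustration under stated assumptions, not PromptCape's implementation: &lt;code&gt;rewrite_query_expr&lt;/code&gt; and the plain-dict registry are hypothetical names, Python 3.9+ is assumed for &lt;code&gt;ast.unparse&lt;/code&gt;, and the expression is assumed to be plain Python syntax. Expressions that use pandas-only syntax (&lt;code&gt;@local&lt;/code&gt; references, backtick-quoted names) fail to parse and are refused rather than guessed at.&lt;/p&gt;

```python
import ast
from typing import Dict, Optional


def rewrite_query_expr(expr: str, registry: Dict[str, str]) -> Optional[str]:
    """Rename known column names inside a query/eval expression string.

    Returns the rewritten expression, or None when the expression is not
    plain Python syntax (the caller should then warn and leave it alone).
    """
    try:
        tree = ast.parse(expr, mode="eval")
    except SyntaxError:
        return None  # e.g. "@threshold" or backtick-quoted names: refuse

    for node in ast.walk(tree):
        # Only names already present in the registry are renamed; anything
        # else (index levels, True/False, unknown names) is left untouched.
        if isinstance(node, ast.Name) and node.id in registry:
            node.id = registry[node.id]
    return ast.unparse(tree)


# Hypothetical registry entry, reusing the alias style from the article.
registry = {"churn_probability": "col_aa1f2b30"}
rewritten = rewrite_query_expr("churn_probability > 0.5", registry)
refused = rewrite_query_expr("score > @threshold", registry)
```

&lt;p&gt;Here &lt;code&gt;rewritten&lt;/code&gt; is the expression with the column renamed, and &lt;code&gt;refused&lt;/code&gt; is &lt;code&gt;None&lt;/code&gt;, which is the "refuse and warn" branch: the original string would be left as it is and flagged for the user.&lt;/p&gt;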




&lt;h2&gt;
  
  
  The detection problem: the schema isn't in the code
&lt;/h2&gt;

&lt;p&gt;Every other detector in this series had it easy: the names it needed to find were &lt;em&gt;declared&lt;/em&gt; somewhere. Pydantic fields are in the class body. Django models declare their fields. SQLAlchemy columns are &lt;code&gt;Column(...)&lt;/code&gt; assignments. An AST scan finds the declaration site and you're done.&lt;/p&gt;

&lt;p&gt;Pandas has no declaration site. A DataFrame's columns come from the &lt;strong&gt;data&lt;/strong&gt; — the header row of a CSV, a SQL result set, a Parquet schema — none of which is in the source code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3://acme-hr/attrition_2026.csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# df now has 40 columns. None of their names appear anywhere in this file.
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;col&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;col&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;fillna&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;        &lt;span class="c1"&gt;# every column touched, zero literals
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This splits column names into two populations, and only one is reachable:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Population&lt;/th&gt;
&lt;th&gt;Where it appears&lt;/th&gt;
&lt;th&gt;Obfuscatable statically?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Referenced columns&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;As literals in code: &lt;code&gt;df["churn_probability"]&lt;/code&gt;, &lt;code&gt;groupby("dept")&lt;/code&gt;, &lt;code&gt;rename(columns=...)&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Yes&lt;/strong&gt; — they're in the AST&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dynamic-only columns&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Only in the data, touched via &lt;code&gt;df.columns&lt;/code&gt;, &lt;code&gt;for col in df.columns&lt;/code&gt;, &lt;code&gt;df.select_dtypes(...)&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;No&lt;/strong&gt; — the name never appears as a literal to rewrite&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The honest framing PromptCape ships with: &lt;strong&gt;it obfuscates the column names that appear as literals in the code, because those are exactly the ones that would otherwise reach the AI.&lt;/strong&gt; A column that's never named in the source can't leak through the source; it only leaks if the AI reads the &lt;em&gt;data file&lt;/em&gt;, which is a separate problem (covered below). The &lt;code&gt;PandasColumnDetector&lt;/code&gt; does an AST/LibCST scan of column-position contexts and collects every literal it finds into a project-wide &lt;strong&gt;column registry&lt;/strong&gt;, hashed the same way identifiers are (&lt;code&gt;churn_probability&lt;/code&gt; → &lt;code&gt;col_aa1f2b30&lt;/code&gt;), so the mapping round-trips through reverse-apply exactly like &lt;code&gt;fld_&lt;/code&gt; and &lt;code&gt;mtd_&lt;/code&gt; names.&lt;/p&gt;
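
&lt;p&gt;A registry of this kind could look like the sketch below. The hashing scheme is an assumption made for illustration (SHA-256 over a per-project salt plus the name, truncated to 8 hex digits); the article does not say how PromptCape derives its suffixes. &lt;code&gt;ColumnRegistry&lt;/code&gt;, &lt;code&gt;alias&lt;/code&gt;, and &lt;code&gt;restore&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```python
import hashlib


class ColumnRegistry:
    """Two-way map between real column names and col_ aliases (illustrative)."""

    def __init__(self, salt: str) -> None:
        self._salt = salt
        self._forward = {}  # real name -> alias
        self._reverse = {}  # alias -> real name

    def alias(self, real_name: str) -> str:
        """Return the stable alias for a column, creating it on first use."""
        if real_name not in self._forward:
            digest = hashlib.sha256((self._salt + real_name).encode()).hexdigest()
            alias = "col_" + digest[:8]
            # An 8-hex-digit prefix can collide; fail loudly rather than
            # map two different columns to the same alias.
            if alias in self._reverse:
                raise ValueError(f"alias collision for {real_name!r}")
            self._forward[real_name] = alias
            self._reverse[alias] = real_name
        return self._forward[real_name]

    def restore(self, alias: str) -> str:
        """Reverse-apply: map an alias back to the real column name."""
        return self._reverse[alias]


registry = ColumnRegistry(salt="per-project-secret")
first = registry.alias("churn_probability")
second = registry.alias("churn_probability")
```

&lt;p&gt;Because the map is keyed by the real name, &lt;code&gt;first&lt;/code&gt; and &lt;code&gt;second&lt;/code&gt; are the same alias no matter which call site asked for it, and &lt;code&gt;restore&lt;/code&gt; is all that reverse-apply needs for column strings.&lt;/p&gt;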




&lt;h2&gt;
  
  
  The type-inference trap: not every &lt;code&gt;x["string"]&lt;/code&gt; is a column
&lt;/h2&gt;

&lt;p&gt;Here is the bug that defined the whole subsystem.&lt;/p&gt;

&lt;p&gt;The first version of the column detector treated &lt;em&gt;every&lt;/em&gt; string subscript as a column name: any &lt;code&gt;x["..."]&lt;/code&gt; got rewritten. It worked beautifully on pandas code and corrupted everything else:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Before obfuscation
&lt;/span&gt;&lt;span class="n"&gt;db_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DATABASE_URL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;feature_flags&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;new_dashboard&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;annual_salary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# After the naive obfuscator (three bugs and one correct rewrite)
&lt;/span&gt;&lt;span class="n"&gt;db_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;col_a1b2c3d4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;          &lt;span class="c1"&gt;# ← BUG: env var lookup now fails
&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;col_5e6f7a8b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;col_...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# ← BUG: dict keys mangled
&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;col_...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;     &lt;span class="c1"&gt;# ← BUG: HTTP header broken
&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;col_99c1d2e3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;                       &lt;span class="c1"&gt;# ← correct
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;os.environ["DATABASE_URL"]&lt;/code&gt;, dict subscripts, JSON payloads — they all use the exact same syntax as a DataFrame column access. Subscript-with-a-string is not a pandas signal; it's a Python idiom. Rewriting it blindly turns a privacy tool into a code corrupter.&lt;/p&gt;

&lt;p&gt;The fix is &lt;strong&gt;DataFrame-variable inference&lt;/strong&gt;: only rewrite subscript strings on a variable the engine can show is a DataFrame. The sidecar tracks, per scope, which names are bound to a DataFrame, and treats a string as a column only when one of these signals holds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Assigned from a constructor or IO call: &lt;code&gt;pd.read_csv(...)&lt;/code&gt;, &lt;code&gt;pd.read_sql(...)&lt;/code&gt;, &lt;code&gt;pd.DataFrame(...)&lt;/code&gt;, &lt;code&gt;pd.read_parquet(...)&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Assigned from a DataFrame-returning operation on a known DataFrame: &lt;code&gt;df.copy()&lt;/code&gt;, &lt;code&gt;df[mask]&lt;/code&gt;, &lt;code&gt;df.groupby(...).agg(...)&lt;/code&gt;, &lt;code&gt;df.merge(...)&lt;/code&gt;, &lt;code&gt;df.rename(...)&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Annotated &lt;code&gt;: pd.DataFrame&lt;/code&gt; in a parameter or variable.&lt;/li&gt;
&lt;li&gt;A column-position keyword on a &lt;em&gt;known&lt;/em&gt; pandas method (&lt;code&gt;groupby(by=...)&lt;/code&gt;, &lt;code&gt;merge(on=...)&lt;/code&gt;) — here the method itself proves the argument is a column, even without var inference.&lt;/li&gt;
&lt;/ul&gt;
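
&lt;p&gt;The first and third signals can be sketched in two passes over the AST. This is a simplified illustration, not the sidecar's actual code: it assumes pandas is imported as &lt;code&gt;pd&lt;/code&gt;, treats the whole file as one scope, skips propagation through calls such as &lt;code&gt;df.copy()&lt;/code&gt;, and needs Python 3.9+. &lt;code&gt;DATAFRAME_FACTORIES&lt;/code&gt;, &lt;code&gt;is_pd_factory_call&lt;/code&gt;, and &lt;code&gt;column_literals&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```python
import ast

# Illustrative subset of pd.* calls assumed to return a DataFrame.
DATAFRAME_FACTORIES = {"read_csv", "read_sql", "read_parquet", "DataFrame"}


def is_pd_factory_call(node: ast.AST) -> bool:
    """True for calls shaped like pd.read_csv(...) or pd.DataFrame(...)."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and isinstance(node.func.value, ast.Name)
        and node.func.value.id == "pd"
        and node.func.attr in DATAFRAME_FACTORIES
    )


def column_literals(source: str) -> list:
    """Return string subscripts found on names inferred to be DataFrames."""
    tree = ast.parse(source)

    # Pass 1: names assigned from a factory call, or parameters
    # annotated as pd.DataFrame.
    dataframes = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign) and is_pd_factory_call(node.value):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    dataframes.add(target.id)
        elif isinstance(node, ast.arg) and node.annotation is not None:
            if ast.unparse(node.annotation) == "pd.DataFrame":
                dataframes.add(node.arg)

    # Pass 2: x["..."] counts only when x is one of those names.
    found = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Subscript)
            and isinstance(node.value, ast.Name)
            and node.value.id in dataframes
            and isinstance(node.slice, ast.Constant)
            and isinstance(node.slice.value, str)
        ):
            found.append(node.slice.value)
    return found


SAMPLE = '''
db_url = os.environ["DATABASE_URL"]
flag = settings["feature_flags"]["new_dashboard"]
df = pd.read_csv("data.csv")
row = df["annual_salary"]
'''
columns = column_literals(SAMPLE)
```

&lt;p&gt;On this sample only &lt;code&gt;annual_salary&lt;/code&gt; is collected. The &lt;code&gt;os.environ&lt;/code&gt; and &lt;code&gt;settings&lt;/code&gt; subscripts are never candidates, because neither name was inferred to be a DataFrame.&lt;/p&gt;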

&lt;p&gt;Strings subscripted on anything the engine &lt;em&gt;can't&lt;/em&gt; prove is a DataFrame are left untouched. The trade-off is deliberate and stated: a DataFrame that arrives through an un-inferrable path (returned from a third-party function with no annotation, stored in a list, pulled from a dict) won't have its columns obfuscated. PromptCape chooses &lt;strong&gt;silent under-obfuscation over silent corruption&lt;/strong&gt; — a leaked column name is a privacy miss; a rewritten &lt;code&gt;os.environ&lt;/code&gt; key is a broken app. The first you can catch in review; the second wastes an afternoon.&lt;/p&gt;




&lt;h2&gt;
  
  
  Before / after
&lt;/h2&gt;

&lt;p&gt;A real-shaped HR attrition pipeline. Watch the columns, the named aggregations, and notice what is &lt;em&gt;kept&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Source the AI must never see:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_attrition_report&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;employment_status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;active&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tenure_years&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tenure_months&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;

    &lt;span class="n"&gt;summary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;department&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;agg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
              &lt;span class="n"&gt;headcount&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;employee_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
              &lt;span class="n"&gt;avg_salary&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;annual_salary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mean&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
              &lt;span class="n"&gt;attrition_risk&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;churn_probability&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mean&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
          &lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reset_index&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
          &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort_values&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;attrition_risk&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ascending&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;summary&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What the AI actually receives:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mtd_3f2a1b0c&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;fld_9c8d7e6f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;fld_9c8d7e6f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fld_9c8d7e6f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;fld_9c8d7e6f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;col_4a1f8b2e&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;active&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;fld_9c8d7e6f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;col_7d3e9a14&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fld_9c8d7e6f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;col_b6c2f085&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;

    &lt;span class="n"&gt;fld_2e5a8c91&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;fld_9c8d7e6f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;groupby&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;col_1f7b3d6a&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;agg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
              &lt;span class="n"&gt;col_c5e90a2b&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;col_d40a1e77&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
              &lt;span class="n"&gt;col_88a1d3f4&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;col_55b3e9c0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mean&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
              &lt;span class="n"&gt;col_e2d4b7c9&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;col_aa1f2b30&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mean&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
          &lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reset_index&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
          &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort_values&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;col_e2d4b7c9&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ascending&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fld_2e5a8c91&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The provider sees a function that filters, divides, groups, and aggregates. It does &lt;strong&gt;not&lt;/strong&gt; see that this is attrition modelling, that employees have a &lt;code&gt;churn_probability&lt;/code&gt;, or that the company tracks &lt;code&gt;annual_salary&lt;/code&gt;. The shape of the analysis is intact; the meaning is gone.&lt;/p&gt;

&lt;p&gt;Three things to notice in the diff:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;attrition_risk&lt;/code&gt; round-trips as one name.&lt;/strong&gt; It's &lt;em&gt;created&lt;/em&gt; as a named-aggregation kwarg (&lt;code&gt;attrition_risk=&lt;/code&gt;) and &lt;em&gt;referenced&lt;/em&gt; three lines later in &lt;code&gt;sort_values("attrition_risk")&lt;/code&gt;. Both became &lt;code&gt;col_e2d4b7c9&lt;/code&gt; — the registry is keyed by the real name, so a column that's born in &lt;code&gt;.agg()&lt;/code&gt; and used in &lt;code&gt;.sort_values()&lt;/code&gt; stays consistent. Get this wrong and the workspace raises &lt;code&gt;KeyError: 'col_e2d4b7c9'&lt;/code&gt; because the sort references a column the agg never produced under that name.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;"active"&lt;/code&gt; is untouched.&lt;/strong&gt; It's a &lt;em&gt;value&lt;/em&gt;, not a column. It stays readable — which is the runtime-data boundary discussed below, not an oversight.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;pd&lt;/code&gt;, &lt;code&gt;read_csv&lt;/code&gt;, &lt;code&gt;groupby&lt;/code&gt;, &lt;code&gt;agg&lt;/code&gt;, &lt;code&gt;mean&lt;/code&gt;, &lt;code&gt;count&lt;/code&gt;, &lt;code&gt;reset_index&lt;/code&gt;, &lt;code&gt;sort_values&lt;/code&gt;, &lt;code&gt;ascending&lt;/code&gt;&lt;/strong&gt; all survive. The API names are the &lt;code&gt;PandasDetector&lt;/code&gt;'s job, unchanged from earlier articles; the &lt;code&gt;"mean"&lt;/code&gt; and &lt;code&gt;"count"&lt;/code&gt; strings are kept by the column detector's aggregation allow-list.&lt;/li&gt;
&lt;/ol&gt;
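
&lt;p&gt;The first point, one registry entry serving both the keyword and the string form of a column, can be illustrated with a small &lt;code&gt;ast.NodeTransformer&lt;/code&gt;. This is a sketch under assumptions, not PromptCape's rewriter: it skips DataFrame inference entirely, handles only &lt;code&gt;agg&lt;/code&gt;, &lt;code&gt;groupby&lt;/code&gt;, and &lt;code&gt;sort_values&lt;/code&gt;, and needs Python 3.9+. &lt;code&gt;ColumnRenamer&lt;/code&gt; and &lt;code&gt;COLUMN_ARG_METHODS&lt;/code&gt; are hypothetical names; the two aliases are the ones shown in the output above.&lt;/p&gt;

```python
import ast

# Keyed by the real column name, so every position shares one alias.
REGISTRY = {
    "attrition_risk": "col_e2d4b7c9",
    "churn_probability": "col_aa1f2b30",
}
# Methods whose first positional argument is treated as a column here.
COLUMN_ARG_METHODS = {"sort_values", "groupby"}


class ColumnRenamer(ast.NodeTransformer):
    """Rename columns in keyword position and in string position alike."""

    def visit_Call(self, node: ast.Call) -> ast.Call:
        self.generic_visit(node)  # rewrite earlier calls in the chain first
        method = node.func.attr if isinstance(node.func, ast.Attribute) else None
        if method == "agg":
            for kw in node.keywords:
                # Output column: a keyword name, i.e. a Python identifier.
                if kw.arg in REGISTRY:
                    kw.arg = REGISTRY[kw.arg]
                # Source column: first element of the (column, func) tuple.
                if isinstance(kw.value, ast.Tuple) and kw.value.elts:
                    self._rename(kw.value.elts[0])
        elif method in COLUMN_ARG_METHODS and node.args:
            self._rename(node.args[0])
        return node

    @staticmethod
    def _rename(node: ast.AST) -> None:
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, str)
            and node.value in REGISTRY
        ):
            node.value = REGISTRY[node.value]


SOURCE = (
    'out = df.groupby("department")'
    '.agg(attrition_risk=("churn_probability", "mean"))'
    '.sort_values("attrition_risk")'
)
result = ast.unparse(ColumnRenamer().visit(ast.parse(SOURCE)))
```

&lt;p&gt;In &lt;code&gt;result&lt;/code&gt;, the keyword in &lt;code&gt;agg&lt;/code&gt; and the string in &lt;code&gt;sort_values&lt;/code&gt; carry the same alias, while &lt;code&gt;"mean"&lt;/code&gt; is untouched because only the first tuple element is treated as a column. &lt;code&gt;"department"&lt;/code&gt; stays readable only because this toy registry has no entry for it.&lt;/p&gt;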




&lt;h2&gt;
  
  
  What does NOT change (and why)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Preserved&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Pandas API names — &lt;code&gt;read_csv&lt;/code&gt;, &lt;code&gt;groupby&lt;/code&gt;, &lt;code&gt;agg&lt;/code&gt;, &lt;code&gt;merge&lt;/code&gt;, &lt;code&gt;pivot_table&lt;/code&gt;, &lt;code&gt;to_csv&lt;/code&gt;, &lt;code&gt;value_counts&lt;/code&gt;, &lt;code&gt;reset_index&lt;/code&gt;, &lt;code&gt;fillna&lt;/code&gt;, &lt;code&gt;astype&lt;/code&gt;, …&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;PandasDetector&lt;/code&gt;: a fixed list of ~200 DataFrame/Series/IO method and attribute names. Renaming &lt;code&gt;to_csv&lt;/code&gt; → &lt;code&gt;mtd_xxx&lt;/code&gt; raises &lt;code&gt;AttributeError&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Aggregation function strings — &lt;code&gt;"mean"&lt;/code&gt;, &lt;code&gt;"sum"&lt;/code&gt;, &lt;code&gt;"count"&lt;/code&gt;, &lt;code&gt;"first"&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;These are pandas reduction names, not columns. They're a small fixed allow-list inside the column detector&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Value literals — &lt;code&gt;"active"&lt;/code&gt;, &lt;code&gt;"2026-01-01"&lt;/code&gt;, &lt;code&gt;"USD"&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Data, not schema. PromptCape never rewrites values (see boundary section)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File paths — &lt;code&gt;pd.read_csv("attrition_2026.csv")&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;A path, not a column. Subscript/kwarg position is what marks a column; a positional path argument to &lt;code&gt;read_csv&lt;/code&gt; is not a column position&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dict / env / JSON subscripts — &lt;code&gt;os.environ["X"]&lt;/code&gt;, &lt;code&gt;config["y"]&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;The variable isn't an inferred DataFrame, so the string is left alone&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;df&lt;/code&gt; and other local variable names visible above&lt;/td&gt;
&lt;td&gt;Not preserved: they &lt;em&gt;are&lt;/em&gt; renamed (&lt;code&gt;fld_…&lt;/code&gt;) by the ordinary identifier pass. Listed here only to contrast with the columns&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
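
&lt;p&gt;The distinction the table draws between a column subscript and a dict or env subscript can be approximated with Python's standard &lt;code&gt;ast&lt;/code&gt; module. The sketch below is an assumption-laden illustration, not the actual detector: it only treats a name as a DataFrame when it is assigned directly from a &lt;code&gt;pd.read_*()&lt;/code&gt; call, and the helper names and the &lt;code&gt;READERS&lt;/code&gt; set are hypothetical.&lt;/p&gt;

```python
import ast

SOURCE = '''
import os
import pandas as pd

df = pd.read_csv("attrition_2026.csv")
active = df[df["employment_status"] == "active"]
token = os.environ["API_TOKEN"]
'''

READERS = {"read_csv", "read_parquet", "read_excel"}  # illustrative subset


def inferred_frames(tree):
    """Names assigned directly from a pd.read_*() call."""
    frames = set()
    for node in ast.walk(tree):
        if not isinstance(node, ast.Assign) or not isinstance(node.value, ast.Call):
            continue
        func = node.value.func
        if (isinstance(func, ast.Attribute) and func.attr in READERS
                and isinstance(func.value, ast.Name) and func.value.id == "pd"):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    frames.add(target.id)
    return frames


def column_literals(tree):
    """String subscripts whose receiver is an inferred DataFrame (Python 3.9+ AST)."""
    frames = inferred_frames(tree)
    columns = set()
    for node in ast.walk(tree):
        if (isinstance(node, ast.Subscript)
                and isinstance(node.value, ast.Name) and node.value.id in frames
                and isinstance(node.slice, ast.Constant)
                and isinstance(node.slice.value, str)):
            columns.add(node.slice.value)
    return columns


columns = column_literals(ast.parse(SOURCE))
```

&lt;p&gt;Only &lt;code&gt;"employment_status"&lt;/code&gt; is collected. The file path is a call argument, &lt;code&gt;"active"&lt;/code&gt; is a comparison operand, and &lt;code&gt;os.environ["API_TOKEN"]&lt;/code&gt; has a receiver that was never inferred to be a DataFrame, so all three are left alone.&lt;/p&gt;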




&lt;h2&gt;
  
  
  The data-file leak (the other half)
&lt;/h2&gt;

&lt;p&gt;Obfuscating &lt;code&gt;df["churn_probability"]&lt;/code&gt; to &lt;code&gt;df["col_e2d4b7c9"]&lt;/code&gt; in the code accomplishes nothing if the AI can open &lt;code&gt;attrition_2026.csv&lt;/code&gt; and read this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;employee_id,department,annual_salary,churn_probability,employment_status
4471,Engineering,142000,0.12,active
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Worse, there's a runtime conflict: the obfuscated code asks for a column &lt;code&gt;col_e2d4b7c9&lt;/code&gt; that the real CSV header calls &lt;code&gt;churn_probability&lt;/code&gt;. The workspace won't even run — &lt;code&gt;KeyError: 'col_e2d4b7c9'&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This is the same split the &lt;a href="https://dev.to/genevieve_breton_cb795f52/python-obfuscation-for-ai-assistants-runnable-workspaces-and-off-disk-secrets-172i"&gt;Python article&lt;/a&gt; drew for &lt;code&gt;.env&lt;/code&gt;: a value that must exist at runtime but must not reach the AI's view of the workspace. PromptCape resolves pandas data files two ways depending on whether they're fixtures or real data:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bundled sample/fixture data&lt;/strong&gt; (small CSVs the repo ships for tests/demos) gets its &lt;strong&gt;header row rewritten&lt;/strong&gt; with the same column registry: &lt;code&gt;employee_id,department,annual_salary,...&lt;/code&gt; → &lt;code&gt;col_d40a1e77,col_1f7b3d6a,col_55b3e9c0,...&lt;/code&gt;. The workspace runs on the fixture, the obfuscated code matches the obfuscated header, and the AI sees the values but not the names that give them meaning (a &lt;code&gt;0.12&lt;/code&gt; under an opaque &lt;code&gt;col_…&lt;/code&gt; header leaks little).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real production data&lt;/strong&gt; stays in the source project and never enters the workspace, exactly like &lt;code&gt;.env&lt;/code&gt;. &lt;code&gt;promptcape run&lt;/code&gt; resolves the data path at launch and a thin pandas IO shim applies &lt;code&gt;rename(columns=registry)&lt;/code&gt; immediately after each read, so the code's &lt;code&gt;col_…&lt;/code&gt; names line up with the freshly-renamed frame. The real headers live only in the developer's source tree, never in &lt;code&gt;~/.promptcape/cache/&amp;lt;hash&amp;gt;/&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
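
&lt;p&gt;The fixture path can be sketched with the standard library alone. The function name &lt;code&gt;rewrite_header&lt;/code&gt; and the pass-through behaviour for unknown headers are assumptions for illustration; the three tokens are the ones quoted above.&lt;/p&gt;

```python
import csv
import io

# Tokens taken from the example above; a real registry would hold every column.
REGISTRY = {
    "employee_id": "col_d40a1e77",
    "department": "col_1f7b3d6a",
    "annual_salary": "col_55b3e9c0",
}


def rewrite_header(csv_text: str, registry: dict) -> str:
    """Return csv_text with its first row mapped through the registry."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    # Assumption: unknown headers pass through unchanged rather than failing.
    rows[0] = [registry.get(name, name) for name in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


fixture = "employee_id,department,annual_salary\n4471,Engineering,142000\n"
rewritten = rewrite_header(fixture, REGISTRY)
```

&lt;p&gt;For real data, the same dictionary is what a load-time shim would hand to pandas' &lt;code&gt;DataFrame.rename(columns=...)&lt;/code&gt; immediately after the read, so both paths share one source of truth.&lt;/p&gt;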

&lt;p&gt;The principle is identical to the secrets story: &lt;strong&gt;the AI-visible workspace directory contains structure and opaque names, never the real vocabulary or production rows.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Reverse-apply and AI-invented columns
&lt;/h2&gt;

&lt;p&gt;When the AI adds a feature — &lt;em&gt;"add a column flagging anyone above the 90th salary percentile"&lt;/em&gt; — it writes against the obfuscated frame:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;fld_9c8d7e6f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;col_new_flag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;fld_9c8d7e6f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;col_55b3e9c0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;fld_9c8d7e6f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;col_55b3e9c0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;quantile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two cases on the way back:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;References to existing columns&lt;/strong&gt; (&lt;code&gt;col_55b3e9c0&lt;/code&gt;) hit the registry and reverse-map cleanly to &lt;code&gt;annual_salary&lt;/code&gt;. Same hash-resolver as identifiers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The AI's &lt;em&gt;new&lt;/em&gt; column&lt;/strong&gt; is a name the AI invented. It's not in the registry, and that's fine — it's not a leak (the AI chose it, the provider never saw a real name). It comes back verbatim as whatever the AI typed (&lt;code&gt;col_new_flag&lt;/code&gt;, or &lt;code&gt;high_earner&lt;/code&gt; if it wrote a readable name). The developer reviews and renames to taste, identical to how AI-invented variable names are handled in the Java pipeline.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The one failure mode worth a guard: if the AI writes a &lt;code&gt;col_xxxxxxxx&lt;/code&gt;-shaped string that &lt;em&gt;collides&lt;/em&gt; with a registry hash it shouldn't (extraordinarily unlikely with 8 hex digits, but the resolver is strict), the pre-apply gate flags any &lt;code&gt;col_&lt;/code&gt; literal in the diff that maps to a column the surrounding code never read. In practice this never fires; it exists so a collision is loud rather than silent.&lt;/p&gt;
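
&lt;p&gt;The two reverse-apply cases can be illustrated with a small string-level sketch. It is deliberately simplified: the regex, the &lt;code&gt;reverse_apply&lt;/code&gt; helper, and the stand-in variable &lt;code&gt;frame&lt;/code&gt; are assumptions, and the real resolver presumably works on parsed code rather than raw text.&lt;/p&gt;

```python
import re

# Token as it appears in this article's examples.
REVERSE = {
    "col_55b3e9c0": "annual_salary",
}

# A single- or double-quoted literal that starts with col_.
COL_LITERAL = re.compile(r"""(["'])(col_\w+)\1""")


def reverse_apply(code: str, reverse: dict) -> str:
    def swap(match):
        quote, token = match.group(1), match.group(2)
        # Known token: restore the real name. Unknown: keep what the AI wrote.
        return quote + reverse.get(token, token) + quote
    return COL_LITERAL.sub(swap, code)


ai_code = 'frame["col_new_flag"] = frame["col_55b3e9c0"] > frame["col_55b3e9c0"].quantile(0.9)'
restored = reverse_apply(ai_code, REVERSE)
```

&lt;p&gt;The registered token maps back to &lt;code&gt;annual_salary&lt;/code&gt;, while &lt;code&gt;col_new_flag&lt;/code&gt; has no registry entry and comes back verbatim for the developer to rename.&lt;/p&gt;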




&lt;h2&gt;
  
  
  What this does NOT protect
&lt;/h2&gt;

&lt;p&gt;The threat boundary is the same shape as the rest of the series, with one pandas-specific edge.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Threat&lt;/th&gt;
&lt;th&gt;Protected?&lt;/th&gt;
&lt;th&gt;By what&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Column &lt;strong&gt;names&lt;/strong&gt; reaching the AI provider&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Column registry + scoped string rewriting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Business &lt;strong&gt;logic / pipeline shape&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Partially&lt;/td&gt;
&lt;td&gt;The operations are visible (a &lt;code&gt;groupby().agg()&lt;/code&gt; is still a &lt;code&gt;groupby().agg()&lt;/code&gt;); only the vocabulary is gone&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Runtime &lt;strong&gt;data values&lt;/strong&gt; in the AI's view&lt;/td&gt;
&lt;td&gt;Only if data is kept out / fixtures are header-rewritten&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;promptcape run&lt;/code&gt; data indirection; values themselves are never rewritten&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dynamic-only columns (never named in code)&lt;/td&gt;
&lt;td&gt;Not via code obfuscation&lt;/td&gt;
&lt;td&gt;They don't appear in source; they only leak if the data file does, which the data-file handling addresses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Columns inside un-inferrable DataFrames&lt;/td&gt;
&lt;td&gt;No (deliberate)&lt;/td&gt;
&lt;td&gt;Under-obfuscate rather than corrupt non-pandas subscripts&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The honest line: PromptCape removes the &lt;strong&gt;schema vocabulary&lt;/strong&gt; from what the AI sees. It does not pretend to hide that you're doing data analysis, nor to encrypt the numbers. For a column called &lt;code&gt;churn_probability&lt;/code&gt;, the name is the secret — and that's what's gone.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Pandas is the framework that inverts the series' central rule. Everywhere else, the leak was in identifiers and strings were safe to ignore. In pandas the leak &lt;em&gt;is&lt;/em&gt; the strings, because column names are the business vocabulary and they live as literals.&lt;/p&gt;

&lt;p&gt;The three load-bearing ideas:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Scoped string rewriting, gated on proof.&lt;/strong&gt; The fix isn't "rewrite strings now" — that re-breaks labels, paths, and dict keys. It's a column registry fed only by strings in proven column positions, hashed and reverse-mapped exactly like identifiers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DataFrame-variable inference is the whole ballgame.&lt;/strong&gt; &lt;code&gt;x["str"]&lt;/code&gt; is the most overloaded syntax in Python. Without knowing &lt;code&gt;x&lt;/code&gt; is a DataFrame, you either leak columns (too cautious) or corrupt &lt;code&gt;os.environ&lt;/code&gt; (too eager). The detector errs toward under-obfuscation because a broken app is worse than a missed name you can catch in review.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The data file is half the problem.&lt;/strong&gt; Obfuscating code while the AI can read &lt;code&gt;attrition_2026.csv&lt;/code&gt; is theatre. Fixtures get header-rewritten; real data stays in the source and is renamed on load — the same secrets-never-touch-the-workspace principle as &lt;code&gt;.env&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;PromptCape is available to try at &lt;a href="https://promptcape.com/" rel="noopener noreferrer"&gt;https://promptcape.com/&lt;/a&gt;, free for 3 months, no credit card required. The &lt;code&gt;PandasColumnDetector&lt;/code&gt;, the DataFrame-inference sidecar, and the data-file handling ship in the same JAR as the rest of the Python pipeline; the language is auto-detected from the source tree.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>pandas</category>
      <category>python</category>
      <category>privacy</category>
    </item>
    <item>
      <title>Django obfuscation for AI assistants: 6 invisible contracts we found the hard way</title>
      <dc:creator>Genevieve Breton</dc:creator>
      <pubDate>Mon, 08 Jun 2026 09:47:44 +0000</pubDate>
      <link>https://dev.to/genevieve_breton_cb795f52/django-obfuscation-for-ai-assistants-6-invisible-contracts-we-found-the-hard-way-1d62</link>
      <guid>https://dev.to/genevieve_breton_cb795f52/django-obfuscation-for-ai-assistants-6-invisible-contracts-we-found-the-hard-way-1d62</guid>
      <description>&lt;p&gt;&lt;em&gt;A walkthrough of what broke each time we re-ran a Django test suite against an obfuscated workspace — and what we had to add to the detector to make the next round green.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The setup
&lt;/h2&gt;

&lt;p&gt;PromptCape obfuscates source code before it reaches an AI assistant (Claude Code, Cursor, etc.) so the AI works on renamed identifiers rather than your real class, method, and field names. We've covered the &lt;a href="https://dev.to/"&gt;Java pipeline&lt;/a&gt; and the &lt;a href="https://dev.to/"&gt;general Python flow&lt;/a&gt; in earlier posts.&lt;/p&gt;

&lt;p&gt;This post is about Django specifically. Django has more &lt;strong&gt;invisible name contracts&lt;/strong&gt; than any framework we've integrated so far: strings inside Python code, in templates, in migration files, in URL configurations, in admin registrations, all referencing identifier names. No compiler catches them, because Python resolves names at runtime rather than in a compile-time check. The static import verifier can't catch them because they aren't imports. They surface only at the moment Django's introspection layer reaches for &lt;code&gt;getattr(form, 'clean_&amp;lt;field&amp;gt;', None)&lt;/code&gt; and finds nothing.&lt;/p&gt;

&lt;p&gt;The story below is the literal sequence of bugs that surfaced when we ran an obfuscated &lt;code&gt;django-blog&lt;/code&gt; test app (Post / Comment / Tag models, CBVs + FBVs, ModelForm + standalone Form, admin registrations, pytest-django). Each bug exposed a contract; each contract drove a change to the &lt;code&gt;DjangoDetector&lt;/code&gt;. Sixteen tests, six iterations to green.&lt;/p&gt;




&lt;h2&gt;
  
  
  Iteration 1 — &lt;code&gt;'django.db.migrations' has no attribute 'Cls_270e5090'&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;First obfuscation. Tests fail immediately on collection:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ERROR tests/test_blog.py::test_post_creation_persists_all_fields
E   AttributeError: module 'django.db.migrations' has no attribute 'Cls_270e5090'
    blog/migrations/0001_initial.py:7: AttributeError
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Line 7 of this file, as in most generated Django migrations, is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Migration&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;migrations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Migration&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The obfuscator picked up &lt;code&gt;Migration&lt;/code&gt; from the class definition, registered it, and rewrote every subsequent occurrence — including &lt;code&gt;migrations.Migration&lt;/code&gt; (the base class reference). Django's &lt;code&gt;migrations&lt;/code&gt; module doesn't have an attribute &lt;code&gt;Cls_270e5090&lt;/code&gt;, so the class can't be loaded.&lt;/p&gt;

&lt;p&gt;The temptation: add &lt;code&gt;Migration&lt;/code&gt; to a list of protected names. That handles one occurrence; the next line down is &lt;code&gt;migrations.CreateModel(...)&lt;/code&gt;, then &lt;code&gt;migrations.AddField(...)&lt;/code&gt;, then &lt;code&gt;migrations.RunPython(...)&lt;/code&gt;. There are roughly twenty operation classes and a handful of class-level attributes (&lt;code&gt;initial&lt;/code&gt;, &lt;code&gt;dependencies&lt;/code&gt;, &lt;code&gt;operations&lt;/code&gt;, &lt;code&gt;replaces&lt;/code&gt;, &lt;code&gt;atomic&lt;/code&gt;). Worse, each &lt;code&gt;CreateModel(name=..., fields=..., options=..., bases=..., managers=...)&lt;/code&gt; call uses kwargs that are extremely generic — &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;fields&lt;/code&gt;, &lt;code&gt;options&lt;/code&gt;, &lt;code&gt;bases&lt;/code&gt; collide with every second user identifier in a real codebase. Adding them all to a project-wide exclusion list would gut obfuscation.&lt;/p&gt;

&lt;p&gt;The right fix recognises that &lt;strong&gt;migration files are machine-generated&lt;/strong&gt;. Django writes them via &lt;code&gt;python manage.py makemigrations&lt;/code&gt;. Users rarely edit them by hand. Beyond model and field names, they reference only framework internals. There is nothing in them that needs to be renamed.&lt;/p&gt;

&lt;p&gt;So: skip &lt;code&gt;migrations/&lt;/code&gt;, &lt;code&gt;alembic/&lt;/code&gt;, &lt;code&gt;versions/&lt;/code&gt; directories entirely. The obfuscator's collection pass doesn't visit them; the obfuscation pass doesn't rewrite them; they're copied to the workspace verbatim. Django's &lt;code&gt;migrate&lt;/code&gt; command sees its expected files unchanged.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ObfuscationEngine.java&lt;/span&gt;
&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="no"&gt;MACHINE_GENERATED_DIR_NAMES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
        &lt;span class="s"&gt;"migrations"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"alembic"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"versions"&lt;/span&gt;
&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="nf"&gt;isInMachineGeneratedDir&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Path&lt;/span&gt; &lt;span class="n"&gt;relative&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;relative&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getNameCount&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;MACHINE_GENERATED_DIR_NAMES&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;contains&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;relative&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;()))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There's no additional leak: the migration files reference model class names and field names that stay readable in &lt;code&gt;models.py&lt;/code&gt; anyway, because the detector protects them (field names already, class names as of Iteration 4 below). Migrations contain the same names the AI already has, in machine-generated form. Skipping them costs nothing.&lt;/p&gt;




&lt;h2&gt;
  
  
  Iteration 2 — &lt;code&gt;urlpatterns&lt;/code&gt; doesn't appear to have any patterns
&lt;/h2&gt;

&lt;p&gt;The same test, re-run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;django.core.exceptions.ImproperlyConfigured: The included URLconf 'mysite.urls'
does not appear to have any patterns in it. If you see the 'urlpatterns'
variable with valid patterns in the file then the issue is probably caused
by a circular import.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;mysite/urls.py&lt;/code&gt; defines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;urlpatterns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nf"&gt;path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;admin/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;admin&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;site&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;include&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;blog.urls&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The obfuscator picked up &lt;code&gt;urlpatterns&lt;/code&gt; as a module-level assignment target and rewrote it. Django's URL resolver looks for the literal name &lt;code&gt;urlpatterns&lt;/code&gt; in the imported URLconf module via &lt;code&gt;getattr(module, "urlpatterns")&lt;/code&gt;. When the name is missing, it raises the &lt;code&gt;ImproperlyConfigured&lt;/code&gt; error shown above.&lt;/p&gt;

&lt;p&gt;This is the &lt;strong&gt;module-level convention&lt;/strong&gt; pattern. Django reads a handful of named constants from urls.py files by exact name:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Name&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;urlpatterns&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The list of URL routes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;app_name&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Application namespace for &lt;code&gt;reverse('app_name:view_name')&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;handler400&lt;/code&gt;, &lt;code&gt;handler403&lt;/code&gt;, &lt;code&gt;handler404&lt;/code&gt;, &lt;code&gt;handler500&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Custom error-view callables&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These all need protection. They go into the project-wide Django API name list, applied unconditionally as soon as &lt;code&gt;from django&lt;/code&gt; or &lt;code&gt;import django&lt;/code&gt; appears anywhere in the project.&lt;/p&gt;

&lt;p&gt;A few names in the same category that surfaced separately:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;INSTALLED_APPS&lt;/code&gt;, &lt;code&gt;MIDDLEWARE&lt;/code&gt;, &lt;code&gt;DATABASES&lt;/code&gt;, &lt;code&gt;TEMPLATES&lt;/code&gt;, &lt;code&gt;ROOT_URLCONF&lt;/code&gt;, &lt;code&gt;STATIC_URL&lt;/code&gt;, &lt;code&gt;MEDIA_URL&lt;/code&gt;, &lt;code&gt;LANGUAGE_CODE&lt;/code&gt;, &lt;code&gt;SECRET_KEY&lt;/code&gt;, &lt;code&gt;DEBUG&lt;/code&gt;, &lt;code&gt;ALLOWED_HOSTS&lt;/code&gt;, &lt;code&gt;AUTH_USER_MODEL&lt;/code&gt;, &lt;code&gt;DEFAULT_AUTO_FIELD&lt;/code&gt;, etc. — &lt;code&gt;settings.py&lt;/code&gt; module-level constants, all introspected by name.&lt;/li&gt;
&lt;/ul&gt;
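
&lt;p&gt;The "switch the list on as soon as Django is imported anywhere" rule can be sketched with the standard &lt;code&gt;ast&lt;/code&gt; module. The actual detector lives in the Java engine and its sidecar; the function names and the shortened name list below are assumptions made for illustration.&lt;/p&gt;

```python
import ast

# Illustrative subset of the names discussed above.
DJANGO_MODULE_LEVEL_NAMES = frozenset({
    "urlpatterns", "app_name",
    "handler400", "handler403", "handler404", "handler500",
    "INSTALLED_APPS", "MIDDLEWARE", "DATABASES", "ROOT_URLCONF",
})


def imports_django(source: str) -> bool:
    """True if the source contains `import django...` or `from django... import`."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # module is None for `from . import x`
        else:
            continue
        if any(n == "django" or n.startswith("django.") for n in names):
            return True
    return False


def protected_names(project_sources: list) -> frozenset:
    # One Django import anywhere switches the list on for the whole project.
    if any(imports_django(src) for src in project_sources):
        return DJANGO_MODULE_LEVEL_NAMES
    return frozenset()


urls_py = "from django.urls import path\nurlpatterns = []\n"
utils_py = "import json\n"
```

&lt;p&gt;Calling &lt;code&gt;protected_names([utils_py, urls_py])&lt;/code&gt; returns the full set, so &lt;code&gt;urlpatterns&lt;/code&gt; is protected even in files that never import Django themselves. That project-wide scope is the trade-off: simpler and safer, at the cost of some over-protection.&lt;/p&gt;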




&lt;h2&gt;
  
  
  Iteration 3 — &lt;code&gt;ModelForm has no model class specified&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Forms next:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ValueError: ModelForm has no model class specified.
  File ".../django/forms/models.py", line 362, in __init__
    if opts.model is None:
        raise ValueError("ModelForm has no model class specified.")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The user's &lt;code&gt;PostForm&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PostForm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;forms&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ModelForm&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Meta&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Post&lt;/span&gt;
        &lt;span class="n"&gt;fields&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;slug&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;author&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;published&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clean&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After obfuscation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Cls_d367f020&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;forms&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ModelForm&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Cls_7f994c64&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;               &lt;span class="c1"&gt;# &amp;lt;- was Meta
&lt;/span&gt;        &lt;span class="n"&gt;fld_08db7fa3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Cls_b8106b41&lt;/span&gt;   &lt;span class="c1"&gt;# &amp;lt;- was model = Post
&lt;/span&gt;        &lt;span class="n"&gt;fld_b4a3ded4&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...]&lt;/span&gt; &lt;span class="c1"&gt;# &amp;lt;- was fields = [...]
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three layers of damage:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;Meta&lt;/code&gt; is the magic inner-class name Django introspects on every Form/ModelForm/Serializer/Model subclass. Renaming it makes the inner class invisible.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;model&lt;/code&gt; is the attribute inside &lt;code&gt;Meta&lt;/code&gt; that tells &lt;code&gt;ModelForm&lt;/code&gt; which model to bind to.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;fields&lt;/code&gt; is the attribute that lists which model fields to expose.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The fix: add &lt;code&gt;Meta&lt;/code&gt;, &lt;code&gt;model&lt;/code&gt;, &lt;code&gt;fields&lt;/code&gt; (and the dozen other Meta attributes — &lt;code&gt;widgets&lt;/code&gt;, &lt;code&gt;error_messages&lt;/code&gt;, &lt;code&gt;field_classes&lt;/code&gt;, &lt;code&gt;localized_fields&lt;/code&gt;, &lt;code&gt;help_texts&lt;/code&gt;, &lt;code&gt;labels&lt;/code&gt;, &lt;code&gt;read_only_fields&lt;/code&gt;, &lt;code&gt;extra_kwargs&lt;/code&gt;, &lt;code&gt;abstract&lt;/code&gt;, &lt;code&gt;proxy&lt;/code&gt;, &lt;code&gt;managed&lt;/code&gt;, &lt;code&gt;app_label&lt;/code&gt;, &lt;code&gt;indexes&lt;/code&gt;, &lt;code&gt;constraints&lt;/code&gt;, &lt;code&gt;get_latest_by&lt;/code&gt;, &lt;code&gt;default_manager_name&lt;/code&gt;, &lt;code&gt;default_related_name&lt;/code&gt;, ...) to the protected list.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;model&lt;/code&gt; and &lt;code&gt;fields&lt;/code&gt; are uncomfortably generic — they're going to over-protect plenty of unrelated user code. But the alternative is Django ModelForms not working at all. We took the trade.&lt;/p&gt;

&lt;p&gt;This is the &lt;strong&gt;inner-class convention&lt;/strong&gt; pattern: Django uses &lt;code&gt;class Meta:&lt;/code&gt; inside another class to attach metadata. The outer class's behavior depends on Django finding &lt;code&gt;Meta&lt;/code&gt; by literal name and reading its attributes by literal name.&lt;/p&gt;
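
&lt;p&gt;A framework-free sketch shows why the rename is fatal. The &lt;code&gt;bound_model&lt;/code&gt; helper is hypothetical and only loosely mirrors how a form framework reads its options: it looks up &lt;code&gt;Meta&lt;/code&gt; by literal name, then &lt;code&gt;model&lt;/code&gt; by literal name.&lt;/p&gt;

```python
def bound_model(form_cls):
    """Mimic the convention: find `Meta` by name, then read `model` by name."""
    meta = getattr(form_cls, "Meta", None)
    return getattr(meta, "model", None)


class Post:
    pass


class PostForm:
    class Meta:
        model = Post
        fields = ["title", "slug"]


# Same structure after a naive rename of Meta and model.
class Cls_d367f020:
    class Cls_7f994c64:
        fld_08db7fa3 = Post


original = bound_model(PostForm)
renamed = bound_model(Cls_d367f020)
```

&lt;p&gt;&lt;code&gt;original&lt;/code&gt; is the &lt;code&gt;Post&lt;/code&gt; class, while &lt;code&gt;renamed&lt;/code&gt; is &lt;code&gt;None&lt;/code&gt;. The renamed class is still valid Python, which is why nothing fails until the framework goes looking for the name.&lt;/p&gt;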




&lt;h2&gt;
  
  
  Iteration 4 — &lt;code&gt;no such table: blog_cls_b8106b41&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Tests progressed to the DB layer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;django.db.utils.OperationalError: no such table: blog_cls_b8106b41
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Django's ORM derives the default DB table name from the app label and the lowercased model class name, joined by an underscore (roughly &lt;code&gt;f"{app_label}_{class_name.lower()}"&lt;/code&gt;). With:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CharField&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Django creates table &lt;code&gt;blog_post&lt;/code&gt;. The migration file (copied verbatim, see Iteration 1) declares that table. The obfuscated &lt;code&gt;models.py&lt;/code&gt; declares:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Cls_b8106b41&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CharField&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Django's ORM now expects table &lt;code&gt;blog_cls_b8106b41&lt;/code&gt;. The migration created &lt;code&gt;blog_post&lt;/code&gt;. The two never reconcile.&lt;/p&gt;
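
&lt;p&gt;The mismatch is easy to reproduce outside Django. The helper below is a simplified stand-in for the default naming rule (it ignores explicit &lt;code&gt;db_table&lt;/code&gt; overrides and length limits), and its name is an assumption for this sketch.&lt;/p&gt;

```python
def default_table_name(app_label: str, class_name: str) -> str:
    # Simplified rule: app label, underscore, lowercased class name.
    return f"{app_label}_{class_name.lower()}"


migrated = default_table_name("blog", "Post")                  # what the migration created
expected_by_orm = default_table_name("blog", "Cls_b8106b41")   # what the renamed model asks for
```

&lt;p&gt;The first call yields &lt;code&gt;blog_post&lt;/code&gt; and the second yields &lt;code&gt;blog_cls_b8106b41&lt;/code&gt;, which is exactly the table named in the &lt;code&gt;OperationalError&lt;/code&gt;.&lt;/p&gt;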

&lt;p&gt;The fix is a small extension to the AST scan that detects models. The detector was already scanning classes that inherit from &lt;code&gt;models.Model&lt;/code&gt; to extract field names — the same scan can emit the &lt;strong&gt;class name&lt;/strong&gt; alongside the fields, with the reason &lt;code&gt;"Django model class (drives db_table + migration refs)"&lt;/code&gt;. Once the class name is protected, both the model declaration and the migration's &lt;code&gt;CreateModel(name='Post', ...)&lt;/code&gt; references stay consistent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# sidecar's detect_django_models command
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;stmt&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stmt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;libcst&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClassDef&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;continue&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;_class_inherits_from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stmt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_DJANGO_MODEL_BASE_CLASSES&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;libcst&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;continue&lt;/span&gt;
    &lt;span class="n"&gt;class_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stmt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;
    &lt;span class="n"&gt;fields&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_scan_django_model_fields&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stmt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;libcst&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;continue&lt;/span&gt;
    &lt;span class="n"&gt;models_found&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="c1"&gt;# Class name protects the DB table name and the migration's name='X' refs
&lt;/span&gt;    &lt;span class="n"&gt;identifiers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;class_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kind&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;class&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...})&lt;/span&gt;
    &lt;span class="c1"&gt;# Plus each field
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;field_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;factory&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;identifiers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;field_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kind&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;field&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the &lt;strong&gt;string-derived name&lt;/strong&gt; pattern: Django uses a Python identifier as the source for a string that gets stored in another system (the DB schema, the migration files, etc.). The class name has to be preserved end-to-end across the layers that aren't directly visible to the obfuscator.&lt;/p&gt;
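&lt;p&gt;For readers who want to try the idea without the sidecar, here is a rough standalone sketch of the same scan. It assumes the standard-library &lt;code&gt;ast&lt;/code&gt; module instead of libcst, only recognises bases spelled &lt;code&gt;models.Model&lt;/code&gt; or &lt;code&gt;Model&lt;/code&gt;, and uses hypothetical helper names (&lt;code&gt;is_model_base&lt;/code&gt;, &lt;code&gt;protected_names&lt;/code&gt;); it is not PromptCape's implementation.&lt;/p&gt;

```python
import ast

SOURCE = '''
class Post(models.Model):
    title = models.CharField(max_length=200)
    published = models.BooleanField(default=False)

class Helper:
    value = 1
'''


def is_model_base(base):
    # Matches only the spellings `models.Model` and `Model`; import aliases are not resolved.
    if isinstance(base, ast.Attribute):
        return base.attr == "Model"
    return isinstance(base, ast.Name) and base.id == "Model"


def protected_names(source):
    found = []
    for node in ast.parse(source).body:
        if not isinstance(node, ast.ClassDef):
            continue
        if not any(is_model_base(base) for base in node.bases):
            continue
        # The class name drives the table name and the migration references.
        found.append((node.name, "class"))
        for stmt in node.body:
            # Only simple `name = models.XField(...)` assignments are collected.
            if isinstance(stmt, ast.Assign) and isinstance(stmt.value, ast.Call):
                for target in stmt.targets:
                    if isinstance(target, ast.Name):
                        found.append((target.id, "field"))
    return found


print(protected_names(SOURCE))
```

&lt;p&gt;The plain &lt;code&gt;Helper&lt;/code&gt; class is skipped because it has no model base, so its names stay eligible for renaming.&lt;/p&gt;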




&lt;h2&gt;
  
  
  Iteration 5 — empty post list when the database has posts
&lt;/h2&gt;

&lt;p&gt;Render tests next:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_post_list_view_renders_published_posts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;draft_post&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;reverse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;blog:post_list&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fixture &lt;code&gt;post&lt;/code&gt; creates a Post with &lt;code&gt;title="Hello"&lt;/code&gt; and &lt;code&gt;published=True&lt;/code&gt;. The view should render it. The assertion fails — the page contains "No posts yet" instead.&lt;/p&gt;

&lt;p&gt;The obfuscated &lt;code&gt;PostListView&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PostListView&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ListView&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Post&lt;/span&gt;
    &lt;span class="n"&gt;fld_ede69551&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;blog/post_list.html&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;     &lt;span class="c1"&gt;# was template_name
&lt;/span&gt;    &lt;span class="n"&gt;fld_475e135f&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;posts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;                   &lt;span class="c1"&gt;# was context_object_name
&lt;/span&gt;    &lt;span class="n"&gt;fld_6b917593&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;                        &lt;span class="c1"&gt;# was paginate_by
&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mtd_0729535b&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;                  &lt;span class="c1"&gt;# was get_queryset
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;objects&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;published&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Django's CBV machinery looks up &lt;code&gt;template_name&lt;/code&gt;, &lt;code&gt;context_object_name&lt;/code&gt;, &lt;code&gt;paginate_by&lt;/code&gt;, and &lt;code&gt;get_queryset&lt;/code&gt; by their &lt;strong&gt;literal attribute names&lt;/strong&gt;. When they're renamed, the base &lt;code&gt;ListView&lt;/code&gt; picks up its defaults instead: no explicit template (it falls back to a name synthesised from the model), the default context name (&lt;code&gt;object_list&lt;/code&gt;), the default queryset (&lt;code&gt;Post.objects.all()&lt;/code&gt;, which drops the published filter), and no pagination. The page loaded successfully, but the user's overrides were effectively gone: a template that loops over &lt;code&gt;posts&lt;/code&gt; finds nothing under that name, so it renders the empty state.&lt;/p&gt;
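&lt;p&gt;The silent fallback is easy to reproduce without Django. The sketch below uses a toy base class, &lt;code&gt;MiniListView&lt;/code&gt;, which is a hypothetical stand-in and deliberately much simpler than Django's real &lt;code&gt;ListView&lt;/code&gt;; it only illustrates how a renamed attribute and a renamed method stop overriding anything.&lt;/p&gt;

```python
class MiniListView:
    # Hypothetical stand-in for a framework base class, not Django's ListView.
    context_object_name = None

    def get_queryset(self):
        return ["draft", "published"]

    def render(self):
        # The base class reads subclass overrides purely by attribute name.
        name = self.context_object_name or "object_list"
        return {name: self.get_queryset()}


class Original(MiniListView):
    context_object_name = "posts"

    def get_queryset(self):
        return ["published"]


class Renamed(MiniListView):
    fld_475e135f = "posts"          # was context_object_name

    def mtd_0729535b(self):         # was get_queryset
        return ["published"]


print(Original().render())
print(Renamed().render())
```

&lt;p&gt;Neither call raises. The renamed class simply behaves like the unmodified base class, which is what makes this class of bug hard to spot.&lt;/p&gt;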

&lt;p&gt;The fix is to add the CBV introspection points to the API name list:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CBV class attributes:&lt;/strong&gt; &lt;code&gt;template_name&lt;/code&gt;, &lt;code&gt;template_name_field&lt;/code&gt;, &lt;code&gt;template_name_suffix&lt;/code&gt;, &lt;code&gt;context_object_name&lt;/code&gt;, &lt;code&gt;paginate_by&lt;/code&gt;, &lt;code&gt;paginator_class&lt;/code&gt;, &lt;code&gt;page_kwarg&lt;/code&gt;, &lt;code&gt;queryset&lt;/code&gt;, &lt;code&gt;form_class&lt;/code&gt;, &lt;code&gt;form_kwargs&lt;/code&gt;, &lt;code&gt;success_url&lt;/code&gt;, &lt;code&gt;initial&lt;/code&gt;, &lt;code&gt;slug_url_kwarg&lt;/code&gt;, &lt;code&gt;pk_url_kwarg&lt;/code&gt;, &lt;code&gt;slug_field&lt;/code&gt;, &lt;code&gt;raise_exception&lt;/code&gt;, &lt;code&gt;redirect_field_name&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CBV methods:&lt;/strong&gt; &lt;code&gt;get_queryset&lt;/code&gt;, &lt;code&gt;get_context_data&lt;/code&gt;, &lt;code&gt;get_object&lt;/code&gt;, &lt;code&gt;get_form_class&lt;/code&gt;, &lt;code&gt;get_form_kwargs&lt;/code&gt;, &lt;code&gt;get_form&lt;/code&gt;, &lt;code&gt;get_initial&lt;/code&gt;, &lt;code&gt;get_template_names&lt;/code&gt;, &lt;code&gt;get_success_url&lt;/code&gt;, &lt;code&gt;get_absolute_url&lt;/code&gt;, &lt;code&gt;form_valid&lt;/code&gt;, &lt;code&gt;form_invalid&lt;/code&gt;, &lt;code&gt;dispatch&lt;/code&gt;, &lt;code&gt;setup&lt;/code&gt;, &lt;code&gt;as_view&lt;/code&gt;, &lt;code&gt;http_method_not_allowed&lt;/code&gt;, &lt;code&gt;http_method_names&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Implicit hooks:&lt;/strong&gt; &lt;code&gt;post_save&lt;/code&gt;, &lt;code&gt;pre_save&lt;/code&gt;, &lt;code&gt;post_delete&lt;/code&gt;, &lt;code&gt;pre_delete&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;CBV class-attribute names are the &lt;strong&gt;declarative-attribute convention&lt;/strong&gt; pattern: you declare class-level attributes whose names match what the framework's base class expects to read, and you override methods whose names match what the base class's dispatcher calls. Same shape as Spring's &lt;code&gt;@Bean&lt;/code&gt;-on-method or JPA's &lt;code&gt;@Entity&lt;/code&gt;-on-class — declarative metadata, but expressed in Python via name conventions instead of decorators.&lt;/p&gt;
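&lt;p&gt;As a rough illustration of how such a name list is consumed, the sketch below skips protected identifiers and renames everything else. The set is a small excerpt of the names listed above, and the digest-based naming scheme is an assumption made for this example; the article does not document how PromptCape derives its replacement names.&lt;/p&gt;

```python
import hashlib

# A small excerpt of the protected names listed above.
PROTECTED = {"template_name", "context_object_name", "paginate_by", "get_queryset"}


def rename(identifier, prefix):
    # Protected names pass through unchanged so the framework can still find them.
    if identifier in PROTECTED:
        return identifier
    # Assumption: an 8-hex-digit SHA-1 prefix; the real naming scheme may differ.
    digest = hashlib.sha1(identifier.encode("utf-8")).hexdigest()[:8]
    return f"{prefix}_{digest}"


print(rename("get_queryset", "mtd"))
print(rename("compute_discount", "mtd"))
```

&lt;p&gt;A deterministic mapping like this keeps repeated runs stable, which matters when renamed code has to be mapped back to the original names.&lt;/p&gt;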




&lt;h2&gt;
  
  
  Iteration 6 — &lt;code&gt;clean_body&lt;/code&gt; silently disappears
&lt;/h2&gt;

&lt;p&gt;Last failure, on the standalone form:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CommentForm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;forms&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Form&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;author_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;forms&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CharField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;forms&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CharField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;widget&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;forms&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Textarea&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clean_body&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cleaned_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="n"&gt;forms&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ValidationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Comment body too short&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The test expects a body shorter than the 5-character minimum to fail validation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_comment_form_requires_minimum_body_length&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;form&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CommentForm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;author_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tiny&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;form&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_valid&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;      &lt;span class="c1"&gt;# AssertionError: form IS valid
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The obfuscated form:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Cls_xxx&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;forms&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Form&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;author_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;forms&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CharField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;forms&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CharField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;widget&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;forms&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Textarea&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mtd_f10d53d5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;                          &lt;span class="c1"&gt;# was clean_body
&lt;/span&gt;        &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cleaned_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="n"&gt;forms&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ValidationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Comment body too short&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The method body is identical. The method NAME changed. Django's &lt;code&gt;BaseForm._clean_fields()&lt;/code&gt; iterates over declared fields and, in effect, does:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;clean_method&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clean_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;clean_method&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;clean_method&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;getattr(self, "clean_body", None)&lt;/code&gt; returns &lt;code&gt;None&lt;/code&gt;. The user's validator silently never runs. The form accepts &lt;code&gt;"tiny"&lt;/code&gt; and the test fails — not with a clear &lt;code&gt;AttributeError&lt;/code&gt;, but with &lt;code&gt;not form.is_valid()&lt;/code&gt; being False instead of True.&lt;/p&gt;

&lt;p&gt;This is the &lt;strong&gt;discover-by-name&lt;/strong&gt; pattern, analogous to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;pytest's &lt;code&gt;def test_*&lt;/code&gt; discovery&lt;/li&gt;
&lt;li&gt;Spring Data's &lt;code&gt;findByX&lt;/code&gt; derived queries&lt;/li&gt;
&lt;li&gt;Lombok's &lt;code&gt;getX()&lt;/code&gt; from a field &lt;code&gt;x&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The fix extends the form AST scan: in addition to declared fields, walk the class body's &lt;code&gt;FunctionDef&lt;/code&gt; children and collect any whose name matches &lt;code&gt;clean_&amp;lt;field&amp;gt;&lt;/code&gt; or &lt;code&gt;validate_&amp;lt;field&amp;gt;&lt;/code&gt; (DRF uses the latter prefix on serializers). Add them to the exclusion list with the reason &lt;code&gt;"Django form clean_X method (Django discovers by name)"&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# sidecar's detect_django_forms command
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_scan_django_form_clean_methods&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;class_def&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;libcst&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;methods&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;member&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;class_def&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;member&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;libcst&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FunctionDef&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;member&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clean_&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;validate_&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;methods&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;methods&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The trade-off is mild: a user method called &lt;code&gt;clean_audit_log&lt;/code&gt; (not actually a Django form clean method) won't be obfuscated. But the prefix is specific enough that the false-positive rate is low and the trade is worth it.&lt;/p&gt;
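&lt;p&gt;If the false positives ever matter, one possible tightening is to keep a method only when the part after the prefix names a field declared on the same class. The sketch below is a variant under that assumption, written against the standard-library &lt;code&gt;ast&lt;/code&gt; module rather than libcst; it is not what the detector described above does, and &lt;code&gt;field_bound_validators&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```python
import ast

SOURCE = '''
class CommentForm(forms.Form):
    author_name = forms.CharField(max_length=100)
    body = forms.CharField()

    def clean_body(self):
        return self.cleaned_data["body"].strip()

    def clean_audit_log(self):
        return None
'''


def field_bound_validators(class_def):
    declared = set()
    methods = []
    for stmt in class_def.body:
        # Collect `name = forms.XField(...)` style declarations.
        if isinstance(stmt, ast.Assign) and isinstance(stmt.value, ast.Call):
            declared.update(t.id for t in stmt.targets if isinstance(t, ast.Name))
        elif isinstance(stmt, ast.FunctionDef):
            methods.append(stmt.name)
    keep = []
    for name in methods:
        for prefix in ("clean_", "validate_"):
            # Keep the method only if the suffix is a field declared on this class.
            if name.startswith(prefix) and name[len(prefix):] in declared:
                keep.append(name)
    return keep


form_class = ast.parse(SOURCE).body[0]
print(field_bound_validators(form_class))
```

&lt;p&gt;The cost of this variant is the opposite risk: fields inherited from a base class or generated from a &lt;code&gt;ModelForm&lt;/code&gt;'s &lt;code&gt;Meta&lt;/code&gt; are not visible in the class body, so their validators would be renamed. The plain prefix match is the safer default.&lt;/p&gt;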




&lt;h2&gt;
  
  
  What the detector looks like at the end
&lt;/h2&gt;

&lt;p&gt;After iterations 1-6, the &lt;code&gt;DjangoDetector&lt;/code&gt; is the largest Python detector in PromptCape, with four complementary passes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Fixed list, ~360 names, applied project-wide on any &lt;code&gt;from django&lt;/code&gt; / &lt;code&gt;import django&lt;/code&gt;.&lt;/strong&gt; Includes the Field types, Manager/QuerySet methods, ORM aggregates and expressions (&lt;code&gt;Count&lt;/code&gt;/&lt;code&gt;F&lt;/code&gt;/&lt;code&gt;Q&lt;/code&gt;/&lt;code&gt;OuterRef&lt;/code&gt;/...), CBV bases + class attribute names + method names, FBV decorators (&lt;code&gt;@login_required&lt;/code&gt;, &lt;code&gt;@require_POST&lt;/code&gt;, &lt;code&gt;@api_view&lt;/code&gt;, ...), HTTP responses + shortcuts, URL routing (&lt;code&gt;path&lt;/code&gt;, &lt;code&gt;re_path&lt;/code&gt;, &lt;code&gt;include&lt;/code&gt;, &lt;code&gt;urlpatterns&lt;/code&gt;, &lt;code&gt;app_name&lt;/code&gt;, &lt;code&gt;handler400&lt;/code&gt;–&lt;code&gt;500&lt;/code&gt;), Forms/ModelForms/Serializers, Admin (&lt;code&gt;ModelAdmin&lt;/code&gt;, &lt;code&gt;list_display&lt;/code&gt;, ...), Auth shortcuts, settings keys (&lt;code&gt;INSTALLED_APPS&lt;/code&gt;, &lt;code&gt;MIDDLEWARE&lt;/code&gt;, ...), and the &lt;code&gt;Meta&lt;/code&gt; inner-class convention + its 25+ attribute names.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AST scan of models&lt;/strong&gt; via &lt;code&gt;detect_django_models&lt;/code&gt;: emits the model class name AND every &lt;code&gt;name = models.XField(...)&lt;/code&gt; field with the factory name. Skips &lt;code&gt;Meta&lt;/code&gt;, &lt;code&gt;objects&lt;/code&gt;, &lt;code&gt;_*&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AST scan of views&lt;/strong&gt; via &lt;code&gt;detect_django_views&lt;/code&gt;: emits CBV class names (subclasses of &lt;code&gt;View&lt;/code&gt;/&lt;code&gt;ListView&lt;/code&gt;/..., incl. DRF &lt;code&gt;APIView&lt;/code&gt;/&lt;code&gt;ViewSet&lt;/code&gt;) AND FBV function names, detected either by a &lt;code&gt;request&lt;/code&gt; first parameter in a file named &lt;code&gt;views.py&lt;/code&gt; or by a known Django/DRF view decorator.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AST scan of forms&lt;/strong&gt; via &lt;code&gt;detect_django_forms&lt;/code&gt;: emits form/serializer class field names AND &lt;code&gt;clean_&amp;lt;field&amp;gt;&lt;/code&gt; / &lt;code&gt;validate_&amp;lt;field&amp;gt;&lt;/code&gt; method names.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Plus the structural rule: &lt;code&gt;migrations/&lt;/code&gt;, &lt;code&gt;alembic/&lt;/code&gt;, &lt;code&gt;versions/&lt;/code&gt; directories are copied verbatim, never obfuscated, never collected.&lt;/p&gt;
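&lt;p&gt;A minimal sketch of how such a structural rule could be checked per file. It assumes the decision is made purely on directory names in the relative path, which is an illustrative simplification; &lt;code&gt;copy_verbatim&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
from pathlib import PurePosixPath

VERBATIM_DIRS = {"migrations", "alembic", "versions"}


def copy_verbatim(relative_path):
    # True when any parent directory of the file is a machine-generated directory.
    # The file name itself is excluded, so `blog/versions.py` is still obfuscated.
    parts = PurePosixPath(relative_path).parts[:-1]
    return any(part in VERBATIM_DIRS for part in parts)


print(copy_verbatim("blog/migrations/0001_initial.py"))
print(copy_verbatim("blog/models.py"))
print(copy_verbatim("blog/versions.py"))
```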

&lt;p&gt;On the test app it produces roughly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Framework detection [Django]: 360 API + 12 model fields + 7 views + 6 form names = 385 rules in 1.4s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For comparison, the other Python detectors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;stdlib-common-attrs:    230 names (always on)
python-dotenv:            3 keys
pytest:                  78 fixtures + 16 test names
PydanticDetector:       102 API + AST-scanned BaseModel fields
FastApiDetector:         76 names
FlaskDetector:           82 names + view function AST scan
CeleryDetector:         ~130 names + AST-scanned task function + every parameter
ClickTyperDetector:      ~80 names + AST-scanned command function + every parameter
RequestsHttpxDetector:  ~110 names (Response attrs, request kwargs, exceptions; no AST scan)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



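&lt;p&gt;Going by the counts above, Django's fixed list alone is nearly three times the size of the next-largest framework detector (Celery, at about 130 names). The structural complexity scales the same way: three AST scans, against at most one for each detector listed.&lt;/p&gt;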




&lt;h2&gt;
  
  
  Why discover-by-name is the recurring shape
&lt;/h2&gt;

&lt;p&gt;Six iterations, six different name contracts, but the same underlying shape every time:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A Python framework asks "does this object have an attribute whose name is exactly X?" by string at runtime, where X is derived from the user's identifier names.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Iteration&lt;/th&gt;
&lt;th&gt;Where the lookup happens&lt;/th&gt;
&lt;th&gt;Form of the string&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1 (migrations)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;getattr(migrations_module, 'Migration')&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Imported class name&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2 (urlpatterns)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;getattr(urlconf, 'urlpatterns')&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Module-level constant name&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3 (Meta)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;getattr(form_class, 'Meta')&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Inner class name&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4 (model class name)&lt;/td&gt;
&lt;td&gt;DB schema generation + migration &lt;code&gt;name='Post'&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Class name → string in another system&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5 (CBV attrs)&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;getattr(self, 'template_name')&lt;/code&gt; etc.&lt;/td&gt;
&lt;td&gt;Class attribute / method name&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6 (clean_body)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;getattr(self, 'clean_body')&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Method name following a prefix convention&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;There is no compile-time check for any of these in Python. The framework's runtime does the lookup. If the lookup fails, the framework's fallback path runs — and the fallback is &lt;em&gt;almost always silently wrong&lt;/em&gt; (default behavior instead of the user's override).&lt;/p&gt;

&lt;p&gt;This is what makes proactive detection mandatory for Python frameworks. The reactive approach — obfuscate, run, see what breaks — doesn't even reliably surface the bugs. Django's CBV fallback to defaults (Iteration 5) didn't crash; it rendered an empty list. The form's silent validation skip (Iteration 6) didn't crash; the form accepted invalid data. A test suite is the only thing that catches them — and only if the test suite exercises the exact path that depends on the override.&lt;/p&gt;

&lt;p&gt;The fix for each iteration is mechanical. The discipline is to enumerate every contract the framework has, in advance, and bake them into the detector before the first user reports a bug. The &lt;code&gt;django-blog&lt;/code&gt; test app in PromptCape's repo exists specifically to surface these regressions on every release. Sixteen tests, one of which (&lt;code&gt;test_comment_form_requires_minimum_body_length&lt;/code&gt;) exists because Iteration 6 silently happened to a real codebase first.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Django obfuscation for AI assistants isn't fundamentally different from any other Python framework integration: detect the framework, scan the project, build an exclusion list. What's different is the &lt;strong&gt;density of contracts&lt;/strong&gt;. Models, views, forms, admin, settings, migrations, templates — every layer has names that double as strings, and the strings live in different places (other Python files, database schemas, generated migrations, HTML templates, URL configurations).&lt;/p&gt;

&lt;p&gt;The four-pass design (fixed list + 3 AST scans) plus the verbatim-copy rule for machine-generated directories cover the surface as we currently understand it. New regressions will surface; each one will be one more entry in the detector. The pattern doesn't change — only the list grows.&lt;/p&gt;

&lt;p&gt;Three takeaways for anyone integrating with a similar framework:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Enumerate name contracts upfront.&lt;/strong&gt; Every place the framework does &lt;code&gt;getattr(obj, 'string')&lt;/code&gt; at runtime is a contract. Find them in the framework's source code if you have to; don't wait for production breakage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AST scans beat fixed lists when the contract is structural.&lt;/strong&gt; "Every field of every &lt;code&gt;BaseModel&lt;/code&gt;/&lt;code&gt;models.Model&lt;/code&gt; subclass" is shorter to express as a scanner than as a hand-maintained list of every field name ever used.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Machine-generated files don't belong in the obfuscation pipeline.&lt;/strong&gt; Migrations, Alembic, anything you can regenerate from a command: copy verbatim, never rewrite. Whatever they expose to the AI is already exposed via the human-written source code, and rewriting them creates more failure modes than the protection is worth.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;PromptCape is open for trial at &lt;a href="https://promptcape.com/" rel="noopener noreferrer"&gt;https://promptcape.com/&lt;/a&gt; — free for 3 months, no credit card required. The Django detector + the 15 other Python framework detectors ship in the same JAR; the language and framework set are auto-detected from the source tree. The &lt;code&gt;django-blog&lt;/code&gt; test app that drove the iterations in this post is in the public docs repo at &lt;a href="https://gitlab.com/gbreton7/promptcape-docs/-/tree/main/applications/django-blog" rel="noopener noreferrer"&gt;gitlab.com/gbreton7/promptcape-docs/-/tree/main/applications/django-blog&lt;/a&gt; for anyone who wants to reproduce the cycle.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>django</category>
      <category>security</category>
    </item>
    <item>
      <title>Python obfuscation for AI assistants: runnable workspaces and off-disk secrets</title>
      <dc:creator>Genevieve Breton</dc:creator>
      <pubDate>Wed, 03 Jun 2026 08:21:06 +0000</pubDate>
      <link>https://dev.to/genevieve_breton_cb795f52/python-obfuscation-for-ai-assistants-runnable-workspaces-and-off-disk-secrets-172i</link>
      <guid>https://dev.to/genevieve_breton_cb795f52/python-obfuscation-for-ai-assistants-runnable-workspaces-and-off-disk-secrets-172i</guid>
      <description>&lt;p&gt;&lt;em&gt;Why obfuscating Python for AI tools requires a different mental model than Java — and how &lt;code&gt;.env&lt;/code&gt; handling becomes the load-bearing question.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Java vs Python: a different relationship with the workspace
&lt;/h2&gt;

&lt;p&gt;Obfuscating Java for an AI assistant is — at heart — about &lt;strong&gt;producing a workspace that still compiles&lt;/strong&gt;. The developer rarely runs the obfuscated workspace directly; they let the AI work in it, apply the changes back to source, and run the app from there. Compilation is the contract. If &lt;code&gt;mvn test-compile&lt;/code&gt; passes after obfuscation, you're 95% done.&lt;/p&gt;

&lt;p&gt;Python is a fundamentally different game. There is no compile step. The workspace's "validation" happens &lt;strong&gt;at runtime&lt;/strong&gt;, when the developer fires up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;streamlit run dashboard.py
pytest &lt;span class="nt"&gt;-v&lt;/span&gt;
python main.py
uvicorn main:app &lt;span class="nt"&gt;--reload&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If a framework introspects a class name, a function name, a string in a URL pattern, or a Pydantic field — and that name was rewritten by the obfuscator — the error surfaces only when Python tries to call it. There is no compiler to catch it for you.&lt;/p&gt;

&lt;p&gt;That changes the obfuscator's job in three concrete ways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;What you protect changes.&lt;/strong&gt; Identifier names that double as string identifiers (template references, JSON keys, test discovery names) become the primary battle, not the secondary one.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What you check after obfuscation changes.&lt;/strong&gt; A Python &lt;code&gt;--verify&lt;/code&gt; step can't compile; it has to do static import resolution and let runtime catch the rest.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What you do with secrets changes.&lt;/strong&gt; A Java workspace is read-only for the AI. A Python workspace is &lt;em&gt;run&lt;/em&gt; by the developer, which means real values have to be available somewhere — and the naive choice (copy &lt;code&gt;.env&lt;/code&gt; to the workspace) instantly defeats the obfuscation's purpose.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This article walks through each of those, then explains the &lt;code&gt;promptcape run&lt;/code&gt; pattern: how to give the Python workspace the env vars it needs at launch time without ever writing them to disk in the AI-visible location.&lt;/p&gt;




&lt;h2&gt;
  
  
  Names that double as string contracts
&lt;/h2&gt;

&lt;p&gt;In Java, framework conventions usually leave a compile-time trace: a &lt;code&gt;getName()&lt;/code&gt; call left dangling by a Lombok-renamed field fails with &lt;code&gt;cannot find symbol&lt;/code&gt; at &lt;code&gt;javac&lt;/code&gt; time. You can detect it, and you can auto-fix it. Spring Data derived queries (&lt;code&gt;findByActiveTrue&lt;/code&gt;) are the rare exception that bites at startup rather than at compile time, and that's already documented as a hard case.&lt;/p&gt;

&lt;p&gt;Python frameworks are full of conventions like Spring Data. Names are silent contracts:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Framework&lt;/th&gt;
&lt;th&gt;Identifier&lt;/th&gt;
&lt;th&gt;Contract&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pydantic v2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;class User(BaseModel): email: str&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;email&lt;/code&gt; is the key in every &lt;code&gt;user.model_dump()&lt;/code&gt; dict and every &lt;code&gt;user.model_dump_json()&lt;/code&gt; payload. Rename it and every API consumer breaks silently.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Flask&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;def index(): ...&lt;/code&gt; decorated with &lt;code&gt;@bp.route("/")&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;The function name becomes the &lt;strong&gt;default endpoint string&lt;/strong&gt; for &lt;code&gt;url_for("blog.index")&lt;/code&gt; and &lt;code&gt;{{ url_for('blog.index') }}&lt;/code&gt;. Rename it, every redirect and template link 500s with &lt;code&gt;werkzeug.routing.BuildError&lt;/code&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Django&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;class Post(models.Model)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The class name drives the DB table (&lt;code&gt;app_label_post&lt;/code&gt;) AND every migration reference. Rename it, your &lt;code&gt;INSERT&lt;/code&gt; query targets a table that doesn't exist.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SQLAlchemy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;id = Column(Integer, primary_key=True)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;id&lt;/code&gt; is the column name on the table. Plus it's accessed as &lt;code&gt;instance.id&lt;/code&gt; everywhere.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;pytest&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;def test_login_succeeds(...)&lt;/code&gt;&lt;/td&gt;
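&lt;td&gt;pytest discovers tests by the &lt;code&gt;test_&lt;/code&gt; prefix. Rename a test to &lt;code&gt;mtd_xxx&lt;/code&gt; and pytest stops collecting it without any error, so CI stays green with less coverage. Only when every test disappears does pytest complain, exiting with code 5 (no tests collected).&lt;/td&gt;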
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;dataclass / attrs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;@dataclass class Post: title: str&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Field names are accessed as &lt;code&gt;obj.title&lt;/code&gt;, dumped via &lt;code&gt;asdict()&lt;/code&gt; and rendered in Jinja templates as &lt;code&gt;{{ post.title }}&lt;/code&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Django forms / DRF serializers&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;def clean_email(self): ...&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Django forms look up field-level validators by the literal &lt;code&gt;clean_&amp;lt;field&amp;gt;&lt;/code&gt; name; DRF serializers do the same with &lt;code&gt;validate_&amp;lt;field&amp;gt;&lt;/code&gt;. Rename the method and your validation silently disappears.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Celery&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;@shared_task def send_email(recipient, subject)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;send_email.delay(recipient="alice@…")&lt;/code&gt; serialises the kwarg name through the broker (Redis, RabbitMQ). The worker reconstructs the call as &lt;code&gt;send_email(recipient=…)&lt;/code&gt;; rename the parameter to &lt;code&gt;p_xxx&lt;/code&gt; and the worker raises &lt;code&gt;TypeError: got an unexpected keyword argument 'recipient'&lt;/code&gt;. Affects function name AND every parameter name.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Click / Typer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;@click.option("--config") def run(config)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Click maps the CLI option &lt;code&gt;--config&lt;/code&gt; to the Python kwarg &lt;code&gt;config&lt;/code&gt; &lt;em&gt;by string&lt;/em&gt;. Rename the parameter and the CLI call &lt;code&gt;run --config foo.yaml&lt;/code&gt; raises &lt;code&gt;TypeError: got an unexpected keyword argument 'config'&lt;/code&gt;. Affects every option/argument parameter.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All of these are invisible at "compile" time (which doesn't exist anyway). They fail at runtime, often in the form of a 500 in the second route the AI touches.&lt;/p&gt;
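
&lt;p&gt;To make the Celery and Click rows concrete, here is a toy model of a keyword-argument name crossing a process boundary as a string. It uses plain JSON instead of a real broker and is not Celery's API; &lt;code&gt;p_renamed_send_email&lt;/code&gt; and its parameter names are made up to stand in for obfuscated identifiers.&lt;/p&gt;

```python
import json


def send_email(recipient, subject):
    return "to=" + recipient + " subject=" + subject


def p_renamed_send_email(p_1a2b, p_3c4d):
    # The same task after a naive rename of its parameters.
    return "to=" + p_1a2b + " subject=" + p_3c4d


# Producer side: kwarg names are serialised as plain strings.
message = json.dumps({"kwargs": {"recipient": "alice@example.com", "subject": "Hi"}})

# Worker side: the call is rebuilt from those strings.
kwargs = json.loads(message)["kwargs"]
print(send_email(**kwargs))

try:
    p_renamed_send_email(**kwargs)
except TypeError as exc:
    # The caller still says "recipient"; the renamed signature no longer accepts it.
    print("worker failed:", exc)
```

&lt;p&gt;The producer and the worker agree on nothing but the string &lt;code&gt;"recipient"&lt;/code&gt;, so the parameter name has to survive obfuscation on both sides.&lt;/p&gt;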

&lt;p&gt;The fix has to be &lt;strong&gt;proactive detection&lt;/strong&gt;, not reactive. For each framework, scan the project for the relevant declarations and add the discovered names to a project-wide exclusion list &lt;strong&gt;before&lt;/strong&gt; identifier collection. The PromptCape codebase has 16 Python detectors today, 11 of which run an AST scan (the rest are pure import-check + fixed name lists):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PydanticDetector       AST scan: every BaseModel/RootModel field name
SqlalchemyDetector     AST scan: every declarative-model column / relationship
StreamlitDetector      AST scan: every top-level callable in streamlit scripts
FlaskDetector          AST scan: every @bp.route / @bp.get / @bp.errorhandler view
DataclassDetector      AST scan: every @dataclass / @attrs.define field
PytestDetector         AST scan: every def test_* / class Test* in the project
DjangoDetector         AST scan: model class+fields, CBV+FBV view names, form fields + clean_X methods
CeleryDetector         AST scan: every @app.task / @shared_task — function name + every parameter
ClickTyperDetector     AST scan: every @click.command / @app.command — function name + every parameter
RequestsHttpxDetector  Fixed list: ~110 names (Response attrs, request kwargs, exceptions); no AST scan
StdlibCommonAttrsDetector  Fixed list: ~230 stdlib method names (close, year, keys, items, split, …); no AST scan
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AST scans run via a bundled Python sidecar that parses each candidate file with LibCST and emits the discovered names back to the Java engine as JSON. The engine merges every detector's output into a single exclusion set before the obfuscation pass starts.&lt;/p&gt;




&lt;h2&gt;
  
  
  --verify for Python: import resolution, not compilation
&lt;/h2&gt;

&lt;p&gt;Java's &lt;code&gt;--verify&lt;/code&gt; runs &lt;code&gt;mvn test-compile&lt;/code&gt; and reads &lt;code&gt;javac&lt;/code&gt; output. The Python equivalent fits in one line: there is none.&lt;/p&gt;

&lt;p&gt;Python's closest analogue is &lt;code&gt;importlib.util.find_spec(...)&lt;/code&gt;. Given a dotted name like &lt;code&gt;staffing.database&lt;/code&gt;, it returns &lt;code&gt;None&lt;/code&gt; if the module can't be located, or a &lt;code&gt;ModuleSpec&lt;/code&gt; if it can. The catch: it &lt;strong&gt;executes&lt;/strong&gt; the parent package's &lt;code&gt;__init__.py&lt;/code&gt; while looking. If &lt;code&gt;staffing/__init__.py&lt;/code&gt; does &lt;code&gt;from .database import sqlalchemy_stuff&lt;/code&gt;, then &lt;code&gt;find_spec&lt;/code&gt; transitively imports SQLAlchemy, your DB driver, and probably half your app.&lt;/p&gt;

&lt;p&gt;That's a non-starter for an obfuscation verification step: you don't want to import the user's code, you don't want the user's third-party dependencies installed in the sidecar's Python interpreter, and you definitely don't want side effects (database connections opened at module import time — a real Python anti-pattern, but common).&lt;/p&gt;

&lt;p&gt;The strategy PromptCape ended up with classifies each import statement at the AST level and routes it to a different check:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Import shape&lt;/th&gt;
&lt;th&gt;Check&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;import xmlrpc.client&lt;/code&gt; (top-level is a stdlib name)&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;importlib.util.find_spec("xmlrpc.client")&lt;/code&gt;: safe in practice, because the only code it executes is the stdlib parent package's &lt;code&gt;__init__.py&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;from staffing.database import X&lt;/code&gt; (top-level is a workspace-local directory)&lt;/td&gt;
&lt;td&gt;Check that &lt;code&gt;workspace/staffing/database.py&lt;/code&gt; or &lt;code&gt;workspace/staffing/database/__init__.py&lt;/code&gt; exists on disk. &lt;strong&gt;Never imports the file.&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;import sqlalchemy&lt;/code&gt; (third-party)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Skipped.&lt;/strong&gt; Can't verify without the project's virtualenv installed alongside the sidecar — too much false-positive noise. Trust it.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This catches the canonical bug — &lt;code&gt;import xmlrpc.client&lt;/code&gt; rewritten to &lt;code&gt;import xmlrpc.fld_b8460726&lt;/code&gt; when a user identifier &lt;code&gt;client&lt;/code&gt; lands in the registry — without needing any project dependencies to be installed where the obfuscator runs.&lt;/p&gt;

&lt;p&gt;It does NOT catch runtime &lt;code&gt;AttributeError&lt;/code&gt; on stdlib instances (e.g. &lt;code&gt;today.year&lt;/code&gt; where &lt;code&gt;year&lt;/code&gt; was renamed because the user has a function called &lt;code&gt;year&lt;/code&gt;). For those, the proactive detector pattern is the only option: a &lt;code&gt;StdlibCommonAttrsDetector&lt;/code&gt; holding a fixed list of a couple hundred of the most commonly accessed stdlib attribute names, applied unconditionally. The trade-off is real (user methods literally called &lt;code&gt;year&lt;/code&gt; won't be obfuscated either) but the alternative is a workspace that crashes on the first date in the codebase.&lt;/p&gt;




&lt;h2&gt;
  
  
  Comments and docstrings: line count is the load-bearing property
&lt;/h2&gt;

&lt;p&gt;Java obfuscation strips comments to &lt;code&gt;// Processed.&lt;/code&gt; while preserving line count, because the reverse-apply 3-way merge needs 1:1 line correspondence between the source and the obfuscated cache.&lt;/p&gt;

&lt;p&gt;Python has the same requirement but two distinct constructs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Line comments&lt;/strong&gt; (&lt;code&gt;# something&lt;/code&gt;) — analogous to Java's &lt;code&gt;// something&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docstrings&lt;/strong&gt; (&lt;code&gt;"""multi-line"""&lt;/code&gt;) — strings that are the first statement of a &lt;code&gt;Module&lt;/code&gt; / &lt;code&gt;FunctionDef&lt;/code&gt; / &lt;code&gt;ClassDef&lt;/code&gt; body.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Stripping both is straightforward. The line-count preservation is what takes care:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Original                          # After obfuscation
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Module docstring                 &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="n"&gt;Processed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;span class="n"&gt;spanning&lt;/span&gt; &lt;span class="n"&gt;four&lt;/span&gt;
&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;                                 &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
                                    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="n"&gt;newlines&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;same&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;                          &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mtd_xxx&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Function docstring.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;           &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Processed.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;                           &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;

&lt;span class="c1"&gt;# A line comment                    # Processed.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For multi-line docstrings the rule is: count the &lt;code&gt;\n&lt;/code&gt; characters in the original string value, then emit &lt;code&gt;"""Processed.&lt;/code&gt; + N newlines + &lt;code&gt;"""&lt;/code&gt;. The replacement occupies the same number of source lines, so any &lt;code&gt;File "...", line 243&lt;/code&gt; in a traceback still points at the same line in both versions.&lt;/p&gt;
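
&lt;p&gt;The rule is small enough to sketch in a few lines. &lt;code&gt;processed_docstring&lt;/code&gt; is a hypothetical helper, and it counts newlines in the literal's source text rather than in the evaluated string, which is an assumption of this sketch: it keeps an escaped &lt;code&gt;\n&lt;/code&gt; inside a one-line literal from being counted as a line break.&lt;/p&gt;

```python
def processed_docstring(original_literal):
    """Build a replacement literal that spans as many source lines as the original."""
    newline_count = original_literal.count("\n")
    return '"""Processed.' + "\n" * newline_count + '"""'


original = '"""Module docstring\nspanning four\nlines.\n"""'
replacement = processed_docstring(original)

# Both literals occupy the same number of source lines.
print(len(original.splitlines()), len(replacement.splitlines()))
```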

&lt;p&gt;The first version of the docstring stripper had a subtle bug: it assumed &lt;code&gt;FunctionDef.body&lt;/code&gt; was always an &lt;code&gt;IndentedBlock&lt;/code&gt; (the multi-line form, &lt;code&gt;def foo():\n    body&lt;/code&gt;). One-liner functions like &lt;code&gt;def foo(): return 1&lt;/code&gt; use a &lt;code&gt;SimpleStatementSuite&lt;/code&gt; body — a totally different LibCST node type — and the stripper crashed with &lt;code&gt;'SimpleStatementSuite' object is not subscriptable&lt;/code&gt;. The exception was caught and the whole file was silently copied verbatim to the workspace, which manifested days later as &lt;code&gt;ImportError: cannot import name 'OdooClient'&lt;/code&gt; (the import line was preserved as-is in the verbatim copy while the class definition was renamed in the obfuscated &lt;code&gt;odoo_client.py&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;The fix is mechanical (handle both body shapes), but the lesson is general: in Python obfuscation, the &lt;strong&gt;silent verbatim fallback&lt;/strong&gt; is a foot-gun. The diagnostic command is worth memorising:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Lists every .py in the workspace that has zero obfuscation markers&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;f &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;find ~/.promptcape/cache/&amp;lt;&lt;span class="nb"&gt;hash&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.py"&lt;/span&gt; &lt;span class="nt"&gt;-size&lt;/span&gt; +10c&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nv"&gt;count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"fld_&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;mtd_&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;Cls_&lt;/span&gt;&lt;span class="se"&gt;\|&lt;/span&gt;&lt;span class="s2"&gt;Processed"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
  &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$count&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"0"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"VERBATIM: &lt;/span&gt;&lt;span class="nv"&gt;$f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Files the loop lists are either near-empty placeholders (fine: &lt;code&gt;conftest.py&lt;/code&gt; often holds next to nothing in test suites) or files that fell through to the verbatim fallback (file a bug).&lt;/p&gt;




&lt;h2&gt;
  
  
  The .env problem
&lt;/h2&gt;

&lt;p&gt;This is the question that splits Python obfuscation from Java obfuscation more than anything else.&lt;/p&gt;

&lt;p&gt;A Java workspace is typically &lt;strong&gt;read-only for the AI&lt;/strong&gt;. The developer obfuscates, the AI works in the obfuscated copy, the developer applies changes back to source, and the app runs from the source project (with the real &lt;code&gt;.env&lt;/code&gt;, the real &lt;code&gt;application.properties&lt;/code&gt;, the real DB). The obfuscated workspace's job is to be readable, not runnable.&lt;/p&gt;

&lt;p&gt;A Python workspace gets &lt;strong&gt;run&lt;/strong&gt; by the developer. They iterate. They open Streamlit. They run pytest. They start the dev server. That requires real config values at runtime — but &lt;code&gt;.env&lt;/code&gt; files are pure secrets: API keys, database URLs, OAuth client secrets. There is no "structure" to preserve in a &lt;code&gt;.env&lt;/code&gt; file the way there is in &lt;code&gt;application.properties&lt;/code&gt; (where keys are part of the architecture and values are leaf secrets). It's secrets all the way down.&lt;/p&gt;

&lt;p&gt;The first iteration of the Python pipeline ran the existing Java sanitizer on &lt;code&gt;.env&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="c"&gt;# Original .env
&lt;/span&gt;&lt;span class="py"&gt;DATABASE_URL&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;postgres://prod-db.acme.com:5432/myapp&lt;/span&gt;
&lt;span class="py"&gt;SECRET_KEY&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;hunter2&lt;/span&gt;
&lt;span class="py"&gt;ACTIVITY_MONTHS&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;6&lt;/span&gt;

&lt;span class="c"&gt;# Sanitized .env (copied to workspace)
&lt;/span&gt;&lt;span class="py"&gt;DATABASE_URL&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;REDACTED&lt;/span&gt;
&lt;span class="py"&gt;SECRET_KEY&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;REDACTED&lt;/span&gt;
&lt;span class="py"&gt;ACTIVITY_MONTHS&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;REDACTED&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first time the developer ran &lt;code&gt;streamlit run&lt;/code&gt; from the workspace, it crashed instantly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;ValueError: invalid literal for int() with base 10: 'REDACTED'
&lt;/span&gt;&lt;span class="gp"&gt;  File ".../dashboard.py", line 243, in &amp;lt;module&amp;gt;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="go"&gt;    ACTIVITY_MONTHS = int(os.getenv('ACTIVITY_MONTHS', '6'))
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;ACTIVITY_MONTHS=6&lt;/code&gt; is not a secret. It's a config knob. But the sanitizer was uniform: redact everything because some entries are sensitive. That works for Java where the workspace doesn't run, but it instantly bricks the Python use case.&lt;/p&gt;

&lt;p&gt;Three options surfaced:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Option&lt;/th&gt;
&lt;th&gt;Workspace runs?&lt;/th&gt;
&lt;th&gt;AI sees secrets?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;A.&lt;/strong&gt; Copy &lt;code&gt;.env&lt;/code&gt; verbatim&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes (any tool that reads files sees them)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;B.&lt;/strong&gt; Sanitize all values&lt;/td&gt;
&lt;td&gt;No (crashes on first int/bool/URL parse)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;C.&lt;/strong&gt; Sanitize selectively (heuristics for "looks like a secret")&lt;/td&gt;
&lt;td&gt;Maybe (depends on heuristic quality)&lt;/td&gt;
&lt;td&gt;Mostly no&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A and B are bad in different ways. C is fragile — every secret format you don't think of becomes a leak, and every config value that happens to match the heuristic becomes a crash.&lt;/p&gt;

&lt;p&gt;The fix that actually worked is to recognise that the workspace &lt;strong&gt;doesn't need &lt;code&gt;.env&lt;/code&gt; on disk at all&lt;/strong&gt;. It needs the env vars at the moment a child process starts. There's a layer between "secrets at rest" and "secrets in the running app's environment" that PromptCape can sit on.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;code&gt;promptcape run&lt;/code&gt;: inject &lt;code&gt;.env&lt;/code&gt; at subprocess launch, never on disk
&lt;/h2&gt;

&lt;p&gt;The pattern is borrowed from how 12-factor apps deploy in containers: the orchestrator reads the secret store at container start time and exports keys into the process environment. The container image itself contains no secrets.&lt;/p&gt;

&lt;p&gt;Translated to PromptCape:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;promptcape obfuscate&lt;/code&gt; writes the workspace &lt;strong&gt;without&lt;/strong&gt; &lt;code&gt;.env&lt;/code&gt;. A small file &lt;code&gt;.env.promptcape-pointer&lt;/code&gt; is written instead, with the absolute path to the source &lt;code&gt;.env&lt;/code&gt; and instructions to use &lt;code&gt;promptcape run&lt;/code&gt;. The AI sees the pointer if it opens it — that's intentional; we want the indirection documented.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;promptcape run &amp;lt;command&amp;gt;&lt;/code&gt; is a wrapper that:

&lt;ul&gt;
&lt;li&gt;Resolves the source project from the current working directory (same mechanism as &lt;code&gt;promptcape apply&lt;/code&gt; / &lt;code&gt;promptcape status&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Parses &lt;code&gt;&amp;lt;source&amp;gt;/.env&lt;/code&gt; and &lt;code&gt;&amp;lt;source&amp;gt;/.env.local&lt;/code&gt; with a minimal python-dotenv-compatible parser.&lt;/li&gt;
&lt;li&gt;Spawns &lt;code&gt;&amp;lt;command&amp;gt;&lt;/code&gt; with &lt;code&gt;cwd = workspace&lt;/code&gt;, the child's environment populated from the current OS env layered with the parsed &lt;code&gt;.env&lt;/code&gt; entries.&lt;/li&gt;
&lt;li&gt;Inherits stdin/stdout/stderr so the child has a real TTY (colors, prompts, progress bars all work).&lt;/li&gt;
&lt;li&gt;Propagates the child's exit code.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Source project: ~/projects/my-streamlit-app/.env&lt;/span&gt;
&lt;span class="c"&gt;# DATABASE_URL=postgres://prod-db.acme.com:5432/myapp&lt;/span&gt;
&lt;span class="c"&gt;# SECRET_KEY=hunter2&lt;/span&gt;
&lt;span class="c"&gt;# ACTIVITY_MONTHS=6&lt;/span&gt;

&lt;span class="nb"&gt;cd&lt;/span&gt; ~/projects/my-streamlit-app
promptcape obfuscate &lt;span class="nt"&gt;--language&lt;/span&gt; python &lt;span class="nt"&gt;--verify&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;span class="c"&gt;# -&amp;gt; ~/.promptcape/cache/a1b2c3d4/&lt;/span&gt;
&lt;span class="c"&gt;#    ├── (the obfuscated code)&lt;/span&gt;
&lt;span class="c"&gt;#    └── .env.promptcape-pointer    (text file, no values)&lt;/span&gt;

&lt;span class="nb"&gt;cd&lt;/span&gt; ~/.promptcape/cache/a1b2c3d4
promptcape run streamlit run dashboard.py
&lt;span class="c"&gt;# 1. reads ~/projects/my-streamlit-app/.env&lt;/span&gt;
&lt;span class="c"&gt;# 2. spawns `streamlit run dashboard.py` in cwd=workspace&lt;/span&gt;
&lt;span class="c"&gt;# 3. child environment: OS env + DATABASE_URL=postgres://... + SECRET_KEY=hunter2 + ACTIVITY_MONTHS=6&lt;/span&gt;
&lt;span class="c"&gt;# 4. streamlit starts normally; os.getenv('DATABASE_URL') returns the real value&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
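
&lt;p&gt;The core of such a wrapper fits in a short Python sketch. It is not PromptCape's code: &lt;code&gt;parse_dotenv&lt;/code&gt; and &lt;code&gt;run_with_env&lt;/code&gt; are hypothetical names, the parser handles only a small subset of the dotenv format, and letting &lt;code&gt;.env&lt;/code&gt; entries win over existing OS variables is an assumption.&lt;/p&gt;

```python
import os
import subprocess
import sys


def parse_dotenv(text):
    """Parse KEY=VALUE lines; a deliberately small subset of the dotenv format."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        value = value.strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def run_with_env(command, workspace, dotenv_text):
    """Spawn command in workspace with OS env plus parsed entries; return its exit code."""
    child_env = dict(os.environ)
    child_env.update(parse_dotenv(dotenv_text))  # assumption: .env entries win
    # No stdin/stdout/stderr arguments: the child inherits the parent's streams.
    completed = subprocess.run(command, cwd=workspace, env=child_env)
    return completed.returncode


# In a real wrapper this text would be read from the source project's .env,
# which lives outside the workspace directory.
demo_env = "ACTIVITY_MONTHS=6\nSECRET_KEY='hunter2'\n"
child = [sys.executable, "-c", "import os; print(int(os.environ['ACTIVITY_MONTHS']))"]
exit_code = run_with_env(child, os.getcwd(), demo_env)
print("child exit code:", exit_code)
```

&lt;p&gt;The values exist only in the parent's memory and in the child's environment; nothing is written into the directory the command runs in.&lt;/p&gt;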



&lt;p&gt;For pytest, the same shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;promptcape run pytest &lt;span class="nt"&gt;-v&lt;/span&gt;
&lt;span class="c"&gt;# tests run against the obfuscated source, with real env vars injected at child launch&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Java apps using Spring Boot's relaxed binding (&lt;code&gt;DATABASE_PASSWORD&lt;/code&gt; env var overrides &lt;code&gt;database.password&lt;/code&gt; property), the SAME command works without any extra plumbing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;promptcape run mvn spring-boot:run
&lt;span class="c"&gt;# Spring Boot gives OS env vars higher precedence than application.properties.&lt;/span&gt;
&lt;span class="c"&gt;# The sanitized application.properties in the workspace has database.password=REDACTED.&lt;/span&gt;
&lt;span class="c"&gt;# The OS env var DATABASE_PASSWORD=real overrides it. App starts with real credentials.&lt;/span&gt;
&lt;span class="c"&gt;# No .env ever copied to the workspace.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three properties this gets right:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Secrets never touch the AI-visible workspace directory.&lt;/strong&gt; An AI tool with file-read access can grep &lt;code&gt;~/.promptcape/cache/&amp;lt;hash&amp;gt;&lt;/code&gt; all it wants — there are no values to find.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Explicit failure mode.&lt;/strong&gt; If the developer runs &lt;code&gt;pytest&lt;/code&gt; directly (without &lt;code&gt;promptcape run&lt;/code&gt;), the app starts without the project's env vars and fails at the first place a required value is used: &lt;code&gt;os.getenv('REQUIRED_KEY')&lt;/code&gt; returns &lt;code&gt;None&lt;/code&gt;, and &lt;code&gt;os.environ['REQUIRED_KEY']&lt;/code&gt; raises &lt;code&gt;KeyError&lt;/code&gt;. That's loud, it's traceable, and it's correct: they're missing the wrapper.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No code change in the user's project.&lt;/strong&gt; &lt;code&gt;load_dotenv()&lt;/code&gt; calls in the user's code become a graceful no-op (no &lt;code&gt;.env&lt;/code&gt; to find), but &lt;code&gt;os.getenv('KEY')&lt;/code&gt; finds the value in the child environment. The framework's startup path is unchanged.&lt;/li&gt;
&lt;/ul&gt;
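&lt;p&gt;The injection pattern itself is small. The following is a minimal Python sketch of the idea, not PromptCape's implementation: the &lt;code&gt;.env&lt;/code&gt; parser is deliberately naive, and the names &lt;code&gt;parse_env_file&lt;/code&gt; and &lt;code&gt;run_with_injected_env&lt;/code&gt; are hypothetical. It reads values from the source project, merges them into the child's environment only, and starts the command with the workspace as its working directory.&lt;/p&gt;

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def parse_env_file(path):
    """Parse simple KEY=VALUE lines; skips blanks and # comments (naive on purpose)."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def run_with_injected_env(command, env_file, workspace):
    """Run command inside workspace; env_file values exist only in the child env."""
    child_env = {**os.environ, **parse_env_file(env_file)}
    return subprocess.run(command, cwd=workspace, env=child_env,
                          capture_output=True, text=True)


def demo():
    """Hypothetical round trip: secrets live in source, the child runs in workspace."""
    with tempfile.TemporaryDirectory() as source, tempfile.TemporaryDirectory() as workspace:
        env_file = Path(source) / ".env"
        env_file.write_text("# stays in the source project\nACTIVITY_MONTHS=6\n",
                            encoding="utf-8")
        result = run_with_injected_env(
            [sys.executable, "-c", "import os; print(os.getenv('ACTIVITY_MONTHS'))"],
            env_file, workspace)
        leaked = any(Path(workspace).iterdir())  # True if anything was written there
        return result.stdout.strip(), leaked
```

&lt;p&gt;In this sketch &lt;code&gt;demo()&lt;/code&gt; returns the value as the child process saw it, plus a flag showing whether anything was written into the workspace directory.&lt;/p&gt;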

&lt;p&gt;The downside: developers have to remember to use &lt;code&gt;promptcape run&lt;/code&gt;. The mitigation is documentation (&lt;code&gt;.env.promptcape-pointer&lt;/code&gt; is the first place they look when something doesn't read env vars), the proxy/Cursor-terminal integration (which can wrap launches automatically), and a clear failure message when the wrapper is forgotten.&lt;/p&gt;




&lt;h2&gt;
  
  
  The complete Python cycle
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. pytest                              -&amp;gt; GREEN (source is healthy)
2. promptcape obfuscate --verify       -&amp;gt; Obfuscated workspace created
                                          .env NOT copied; pointer file written
3. promptcape run pytest               -&amp;gt; GREEN (workspace runs with real env vars
                                          injected at subprocess launch)
4. AI modifies obfuscated code
5. promptcape run pytest               -&amp;gt; GREEN (AI changes work in the runtime)
6. promptcape apply                    -&amp;gt; Changes applied to source
7. pytest                              -&amp;gt; GREEN (de-obfuscated changes work)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each step has a specific failure mode:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Step 2 → 3:&lt;/strong&gt; if the workspace fails to run, it's almost always a framework-name collision the detectors missed. The fix is to grep the obfuscated workspace for the obfuscated identifier (&lt;code&gt;grep -rn "mtd_098fd2b6" .&lt;/code&gt;) to see what real name the AI sees in context, then add it to the relevant detector. Real-world examples that surfaced this way: &lt;code&gt;cursor.close()&lt;/code&gt; (sqlite3 Cursor method), &lt;code&gt;today.year&lt;/code&gt; (datetime.date attribute), &lt;code&gt;df.value_counts().to_dict()&lt;/code&gt; (pandas chain), &lt;code&gt;engine.connect()&lt;/code&gt; (SQLAlchemy lifecycle). Each got added to the protected list once.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Step 5 → 6:&lt;/strong&gt; the AI invented an obfuscated name that's not in the registry. The reverse-apply step has a hash-resolver that maps &lt;code&gt;Cls_e5f6a7b8&lt;/code&gt; patterns back to known real names. Same mechanism as Java.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Step 6 → 7:&lt;/strong&gt; rare — usually means the AI introduced a syntax error. The pre-apply &lt;code&gt;--compile-gate&lt;/code&gt; check (which for Python is the static import verifier) catches most of these.&lt;/li&gt;
&lt;/ul&gt;
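&lt;p&gt;The Step 5 → 6 check can be pictured as a lookup with a loud miss. This is a hedged sketch only: the registry is assumed to be a plain dict, the prefix set (&lt;code&gt;Cls&lt;/code&gt;, &lt;code&gt;mtd&lt;/code&gt;) and the eight-hex-digit shape are inferred from the identifiers quoted in this article, and the real names are made up for illustration.&lt;/p&gt;

```python
import re

# Assumed identifier shape: a short prefix, an underscore, then 8 hex digits.
OBFUSCATED = re.compile(r"\b(?:Cls|mtd)_[0-9a-f]{8}\b")


def resolve_names(text, registry):
    """Replace known obfuscated identifiers; collect the ones the registry lacks."""
    unknown = set()

    def swap(match):
        token = match.group(0)
        if token in registry:
            return registry[token]
        unknown.add(token)
        return token  # left in place so the failure stays visible

    return OBFUSCATED.sub(swap, text), sorted(unknown)


# Hypothetical registry and AI-edited line.
registry = {"Cls_e5f6a7b8": "InvoiceService", "mtd_098fd2b6": "compute_total"}
patched = "Cls_e5f6a7b8().mtd_098fd2b6(rows) + Cls_00000000.helper()"
restored, missing = resolve_names(patched, registry)
```

&lt;p&gt;Here &lt;code&gt;missing&lt;/code&gt; holds the invented identifier, which is the signal to stop before applying anything to the source tree.&lt;/p&gt;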




&lt;h2&gt;
  
  
  A note on what this does NOT protect
&lt;/h2&gt;

&lt;p&gt;It is worth being explicit about the threat boundary, because Python usually ships as readable source and the question comes up naturally: &lt;em&gt;if my distributed Python app ships as &lt;code&gt;.py&lt;/code&gt; files anyone can read, why bother obfuscating it for the AI in the first place?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The answer is that those are two different threats living in two different lifecycle stages.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Threat&lt;/th&gt;
&lt;th&gt;When&lt;/th&gt;
&lt;th&gt;Who reads the source&lt;/th&gt;
&lt;th&gt;What protects&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AI-provider transit&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Development sessions (Claude Code, Cursor, Aider…)&lt;/td&gt;
&lt;td&gt;Anthropic / OpenAI / Mistral on their servers&lt;/td&gt;
&lt;td&gt;PromptCape — obfuscate before sending, reverse-map the reply&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;End-user inspection&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;After product release&lt;/td&gt;
&lt;td&gt;Anyone who installs the &lt;code&gt;.py&lt;/code&gt;, &lt;code&gt;.pyc&lt;/code&gt;, or PyInstaller bundle&lt;/td&gt;
&lt;td&gt;Native compilation (Nuitka, Cython), commercial obfuscators (PyArmor), or SaaS-only deployment&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;PromptCape's obfuscated workspace lives in &lt;code&gt;~/.promptcape/cache/&amp;lt;hash&amp;gt;/&lt;/code&gt; on the developer's own machine, only during AI sessions. It never ships with the product. After &lt;code&gt;promptcape apply&lt;/code&gt;, the developer's source tree is back to real names. Whatever the developer builds and distributes is independent of whether they used PromptCape that day or not.&lt;/p&gt;

&lt;p&gt;The two layers are also independent in the opposite direction: a Nuitka-compiled binary doesn't help the developer at all while they're prompting Claude with their real source code — that's not when end users are looking, that's when the AI provider's logs are being written. A developer who needs both protections uses both: PromptCape during development, Nuitka at release. The combination covers the full lifecycle.&lt;/p&gt;

&lt;p&gt;Specifically on Python distribution effort levels:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;.py&lt;/code&gt; files&lt;/strong&gt;: readable as-is. Zero effort.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;.pyc&lt;/code&gt;-only&lt;/strong&gt; (&lt;code&gt;python -m compileall&lt;/code&gt;): decompiles cleanly in seconds with &lt;code&gt;decompyle3&lt;/code&gt; or &lt;code&gt;uncompyle6&lt;/code&gt; on the Python versions those tools support.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PyInstaller / cx_Freeze / py2exe&lt;/strong&gt;: embed &lt;code&gt;.pyc&lt;/code&gt; inside a bundle that &lt;code&gt;pyinstxtractor&lt;/code&gt; cracks open in 5–10 minutes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PyArmor&lt;/strong&gt; (commercial): custom encrypted loader. Hours-to-days of reverse-engineering effort depending on the obfuscation level chosen.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Nuitka&lt;/strong&gt; or &lt;strong&gt;Cython&lt;/strong&gt;: compile Python through C to a real native binary. Days-to-weeks of effort for a determined reverser. The strongest open-source option.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SaaS / cloud-only&lt;/strong&gt;: the only airtight answer: if the code never leaves your servers, no one can read it on their disk.&lt;/li&gt;
&lt;/ul&gt;
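&lt;p&gt;To see why bytecode-only distribution sits so low on that list, it helps to look at what a compiled code object still carries. The sketch below uses only the standard library (&lt;code&gt;compile&lt;/code&gt; and &lt;code&gt;marshal&lt;/code&gt;); the file name &lt;code&gt;pricing.py&lt;/code&gt; and the function in it are invented for the example.&lt;/p&gt;

```python
import marshal

# Hypothetical proprietary source.
source = '''
def compute_margin(revenue, cost):
    internal_rate = 0.2
    return (revenue - cost) * internal_rate
'''

module_code = compile(source, "pricing.py", "exec")

# A .pyc file is a small header followed by this marshalled code object.
payload = marshal.dumps(module_code)
restored = marshal.loads(payload)

# The nested code object for the function is stored among the module constants.
function_code = next(c for c in restored.co_consts if hasattr(c, "co_varnames"))

# Function name, argument names, and local variable names all survive compilation.
recovered = (function_code.co_name, function_code.co_varnames)
```

&lt;p&gt;No decompiler is involved here: &lt;code&gt;recovered&lt;/code&gt; already contains the function name and every argument and local variable name, which is exactly the material an obfuscator renames.&lt;/p&gt;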

&lt;p&gt;This isn't a Python-specific issue. Java has the same shape — &lt;code&gt;.class&lt;/code&gt; files in a &lt;code&gt;.jar&lt;/code&gt; decompile cleanly with &lt;code&gt;jd-gui&lt;/code&gt; / CFR / Procyon, and the traditional answer is ProGuard or R8 name-mangling at release-build time, which is conceptually identical to what PromptCape does at AI-session time but applied at a different lifecycle point. The two layers don't replace each other; a Java product that ships obfuscated bytecode AND uses PromptCape during development covers both transit and distribution leaks.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Python obfuscation for AI assistants is not a port of the Java pipeline. The fundamental shift — the developer runs the workspace, not just reads it — changes every layer: what you protect (name contracts, not just identifiers), how you verify (file existence, not compilation), and how you handle secrets (inject at subprocess launch, never on disk).&lt;/p&gt;

&lt;p&gt;The three insights from building this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Names that double as strings are the hard cases, and they can only be handled proactively.&lt;/strong&gt; No compile error catches &lt;code&gt;def index()&lt;/code&gt; → &lt;code&gt;def mtd_xxx()&lt;/code&gt; when Flask looks up the endpoint string &lt;code&gt;"blog.index"&lt;/code&gt;. The detector has to know the framework's discovery rules in advance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A silent verbatim fallback hides bugs for days.&lt;/strong&gt; If a file fails to obfuscate, the engine must surface that loudly. The &lt;code&gt;.env.promptcape-pointer&lt;/code&gt; and the verbatim-detection grep snippet exist precisely because the failure mode is silent otherwise.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;.env&lt;/code&gt; doesn't need to be on disk in the workspace.&lt;/strong&gt; The wrapper-injection pattern (&lt;code&gt;promptcape run&lt;/code&gt;) gives the workspace real values at runtime without ever writing them to the AI-readable directory. This is the load-bearing pattern that makes the rest of the security story coherent: if the developer can run the workspace and the secrets never leave the source project, the AI assistant has zero attack surface on credentials.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;PromptCape is open for trial at &lt;a href="https://promptcape.com/" rel="noopener noreferrer"&gt;https://promptcape.com/&lt;/a&gt; (free for 3 months, no credit card required). The Python pipeline, the 16 framework detectors, and the &lt;code&gt;promptcape run&lt;/code&gt; wrapper ship in the same JAR as the Java pipeline; the language is auto-detected from the source tree.&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>security</category>
      <category>privacy</category>
    </item>
    <item>
      <title>The AI Code Protection Landscape: 13 Products Compared</title>
      <dc:creator>Genevieve Breton</dc:creator>
      <pubDate>Mon, 01 Jun 2026 10:05:20 +0000</pubDate>
      <link>https://dev.to/genevieve_breton_cb795f52/the-ai-code-protection-landscape-13-products-compared-4pg8</link>
      <guid>https://dev.to/genevieve_breton_cb795f52/the-ai-code-protection-landscape-13-products-compared-4pg8</guid>
      <description>&lt;p&gt;&lt;em&gt;A practical comparison of tools that protect source code and sensitive data from leaking to AI assistants — across deployment model, target user, and lifecycle coverage.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;When developers use AI coding assistants — Claude Code, Cursor, GitHub Copilot, Cody, Aider — they implicitly send source code, comments, and configuration values to a remote server they do not control. For most companies that is a regulatory, contractual, or competitive risk: customer data inside test fixtures, IP inside class names, credentials inside config files, business logic spelled out in comments.&lt;/p&gt;

&lt;p&gt;A growing market of products tries to address some part of this. They are &lt;em&gt;not&lt;/em&gt; all the same product. Several are not even in the same category. This article maps 13 of them across three axes that actually matter at decision time:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Deployment model.&lt;/strong&gt; Does the data leave your network, and to whom?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Target user.&lt;/strong&gt; Is this a developer's tool that disappears into the IDE, or a CISO's tool that sits at the network gateway?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lifecycle coverage.&lt;/strong&gt; Does it just block or redact on the way out, or does it round-trip — obfuscate, let the AI work, apply changes back to the real code, and verify the build?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The third axis is where most products in this space stop short.&lt;/p&gt;




&lt;h2&gt;
  
  
  The three axes in detail
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Axis 1 — On-premise vs SaaS
&lt;/h3&gt;

&lt;p&gt;If your threat model is &lt;em&gt;"no proprietary code on third-party servers"&lt;/em&gt;, then a product that requires sending your prompts through &lt;em&gt;its own SaaS&lt;/em&gt; before forwarding to OpenAI/Anthropic is a substitution, not a solution: you swapped one third party for another. The configurations that actually fit this threat model are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Strict on-premise:&lt;/strong&gt; the product runs as a binary, container, or library inside your network. Examples: Presidio, LLM Guard, NeMo Guardrails, PromptCape (CLI mode), Quieta.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid proxy:&lt;/strong&gt; the engine runs locally and sanitizes prompts; the LLM call goes out to whatever provider you choose. Examples: PromptCape (proxy mode), ZeusLock self-hosted.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SaaS-only:&lt;/strong&gt; the product itself is a cloud service. Examples: GCP DLP, Lakera Guard, Skyflow, Cypher AI (SaaS by default).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A SaaS-only choice is not wrong — it is a different trade-off. If your data already lives in GCP, GCP DLP is a sensible choice. If you cannot send code to a third-party cloud at all, it is a non-starter.&lt;/p&gt;

&lt;h3&gt;
  
  
  Axis 2 — CISO tool vs developer tool
&lt;/h3&gt;

&lt;p&gt;Two product shapes coexist in this market and they are easy to confuse:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CISO/governance tools&lt;/strong&gt; sit at the network egress. They scan outbound prompts for PII, secrets, and policy violations; log hits; raise alerts; produce audit trails. They are bought by security teams and &lt;em&gt;imposed&lt;/em&gt; on developers — often visible to the dev only as "the thing that occasionally blocks my prompt." Adoption is enforced top-down.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Developer tools&lt;/strong&gt; integrate where the developer already works (IDE terminal, CLI, plugin, library). They aim to be invisible — same workflow, same commands, the obfuscation happens transparently. Adoption is bottom-up, driven by zero added friction.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most products are clearly one or the other. A few try to be both and end up not great at either. When a developer tool adds friction (extra step, copy-paste, context switch), developers route around it. When a CISO tool is exposed too directly to developers without sandboxing, it gets disabled or bypassed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Axis 3 — Coverage of the full cycle
&lt;/h3&gt;

&lt;p&gt;For &lt;em&gt;non-code&lt;/em&gt; data (a customer name in a chat prompt), redact-on-the-way-out is enough. For &lt;em&gt;code&lt;/em&gt;, you need round-trip: the AI's response has to land in the repo with the original names, comments, and structure restored — and the result must compile and pass tests. Otherwise the developer manually patches the AI's output, which destroys the productivity gain that motivated the AI in the first place.&lt;/p&gt;

&lt;p&gt;Specifically, full-cycle code protection requires:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Forward obfuscation that does not break framework conventions (Spring Data derived queries, JPA entity names in JPQL, Lombok-generated accessors, Jackson serialization, Bean Validation, OpenAPI schemas)&lt;/li&gt;
&lt;li&gt;An obfuscated workspace where the code still compiles and tests still pass&lt;/li&gt;
&lt;li&gt;Reverse application that only modifies AI-changed lines (preserves comments and formatting on untouched lines)&lt;/li&gt;
&lt;li&gt;Resolution of names the AI invented during its work (variables it created based on the obfuscated identifiers it saw)&lt;/li&gt;
&lt;li&gt;Build/test verification on the de-obfuscated source&lt;/li&gt;
&lt;/ul&gt;
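&lt;p&gt;The third requirement, reverse application that touches only AI-changed lines, can be sketched with the standard-library &lt;code&gt;difflib&lt;/code&gt; module. This is an illustration under stated assumptions, not any vendor's algorithm: it assumes the original file and its obfuscated baseline correspond line for line, and the registry, the &lt;code&gt;mtd_&lt;/code&gt; / &lt;code&gt;var_&lt;/code&gt; tokens, and the function names are hypothetical.&lt;/p&gt;

```python
import difflib
import re


def deobfuscate(line, registry):
    """Swap every known obfuscated token in a line back to its real name."""
    pattern = re.compile("|".join(re.escape(token) for token in registry))
    return pattern.sub(lambda m: registry[m.group(0)], line)


def reverse_apply(original, baseline, edited, registry):
    """Keep original lines where the AI changed nothing; translate the rest.

    Assumes original and baseline correspond line for line.
    """
    result = []
    matcher = difflib.SequenceMatcher(a=baseline, b=edited, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Untouched region: comments and formatting survive verbatim.
            result.extend(original[i1:i2])
        else:
            # Replaced, inserted, or deleted region: take the AI's lines, renamed.
            result.extend(deobfuscate(line, registry) for line in edited[j1:j2])
    return result


registry = {"mtd_1a2b3c4d": "total_due", "var_5e6f7a8b": "invoice"}
original = ["def total_due(invoice):  # billing rule 7", "    return invoice.net"]
baseline = ["def mtd_1a2b3c4d(var_5e6f7a8b):", "    return var_5e6f7a8b.net"]
edited = ["def mtd_1a2b3c4d(var_5e6f7a8b):",
          "    return var_5e6f7a8b.net + var_5e6f7a8b.tax"]
merged = reverse_apply(original, baseline, edited, registry)
```

&lt;p&gt;In this example the first line of &lt;code&gt;merged&lt;/code&gt; is the original line with its comment intact, because the AI never touched it; only the second line is rebuilt from the AI's edit.&lt;/p&gt;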

&lt;p&gt;Most products in this list cover &lt;strong&gt;none&lt;/strong&gt; of that. That is not a flaw — they are solving a different problem. Conflating "we redact PII in chatbot prompts" with "we let developers safely use Cursor on a closed-source codebase" leads to bad procurement decisions.&lt;/p&gt;




&lt;h2&gt;
  
  
  Comparison table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Product&lt;/th&gt;
&lt;th&gt;Deployment&lt;/th&gt;
&lt;th&gt;OSS&lt;/th&gt;
&lt;th&gt;Domain&lt;/th&gt;
&lt;th&gt;Target user&lt;/th&gt;
&lt;th&gt;Full code cycle?&lt;/th&gt;
&lt;th&gt;Distinguishing feature&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/chatwall/" rel="noopener noreferrer"&gt;ChatWall&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Browser extension only (Firefox)&lt;/td&gt;
&lt;td&gt;Partial (source-available)&lt;/td&gt;
&lt;td&gt;DLP for AI web chat (browser overlay)&lt;/td&gt;
&lt;td&gt;End-user&lt;/td&gt;
&lt;td&gt;Partial (chat-overlay reversal)&lt;/td&gt;
&lt;td&gt;Local-only token substitution for ChatGPT/Claude.ai/Gemini web UIs; very early adoption stage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://www.cypherai.ai/" rel="noopener noreferrer"&gt;Cypher AI&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Hybrid (client-side encryption + vendor or sovereign compute)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Encrypted LLM inference (TFHE)&lt;/td&gt;
&lt;td&gt;CISO&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Customer-managed keys; multi-scheme FHE (TFHE / BFV / CKKS / Paillier); 128-bit post-quantum lattice-based; ~400× speedup vs Microsoft SEAL at 10M records; deployments in defense, Tier-1 finance, biometric infrastructure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://cloud.google.com/sensitive-data-protection" rel="noopener noreferrer"&gt;Google Cloud DLP&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SaaS (GCP)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Generic PII redaction + tokenization&lt;/td&gt;
&lt;td&gt;CISO&lt;/td&gt;
&lt;td&gt;Partial (de-id reversible, not code-aware)&lt;/td&gt;
&lt;td&gt;150+ infoTypes; native to BigQuery / GCS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://www.lakera.ai/lakera-guard" rel="noopener noreferrer"&gt;Lakera Guard&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SaaS&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Prompt-injection guardrails&lt;/td&gt;
&lt;td&gt;Mixed&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Real-time AI/agent attack detection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://www.getlimina.ai/en" rel="noopener noreferrer"&gt;Limina AI&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;On-prem (VPC container)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;PII / PHI / PCI redaction&lt;/td&gt;
&lt;td&gt;CISO&lt;/td&gt;
&lt;td&gt;Partial (PII reversal, not code)&lt;/td&gt;
&lt;td&gt;Context-aware detection across 50+ entity types and 52 languages; healthcare/finance positioning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;&lt;a href="https://github.com/protectai/llm-guard" rel="noopener noreferrer"&gt;LLM Guard&lt;/a&gt;&lt;/strong&gt; (Protect AI)&lt;/td&gt;
&lt;td&gt;On-prem&lt;/td&gt;
&lt;td&gt;Yes (MIT)&lt;/td&gt;
&lt;td&gt;LLM I/O guardrails&lt;/td&gt;
&lt;td&gt;Developer&lt;/td&gt;
&lt;td&gt;Partial (code is &lt;em&gt;banned&lt;/em&gt;, not obfuscated)&lt;/td&gt;
&lt;td&gt;~35 input/output scanners around an LLM call&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://microsoft.github.io/presidio/" rel="noopener noreferrer"&gt;Microsoft Presidio&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;On-prem&lt;/td&gt;
&lt;td&gt;Yes (MIT)&lt;/td&gt;
&lt;td&gt;Generic PII redaction&lt;/td&gt;
&lt;td&gt;Mixed&lt;/td&gt;
&lt;td&gt;Partial (no code reverse)&lt;/td&gt;
&lt;td&gt;Pluggable PII recognizers in Python; broadly deployed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://github.com/NVIDIA/NeMo-Guardrails" rel="noopener noreferrer"&gt;NVIDIA NeMo Guardrails&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;On-prem&lt;/td&gt;
&lt;td&gt;Yes (Apache 2.0)&lt;/td&gt;
&lt;td&gt;Dialogue safety&lt;/td&gt;
&lt;td&gt;Developer&lt;/td&gt;
&lt;td&gt;No (out of scope)&lt;/td&gt;
&lt;td&gt;Programmable Colang DSL for conversational containment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://promptcape.com/" rel="noopener noreferrer"&gt;PromptCape&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;On-prem (CLI + proxy)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Code obfuscation&lt;/td&gt;
&lt;td&gt;Developer&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Java &amp;amp; Python; obfuscate → AI → apply → build verify; Cursor/VS Code terminal integration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://quieta.ai/en/" rel="noopener noreferrer"&gt;Quieta&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;On-prem (desktop app)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Generic PII redaction&lt;/td&gt;
&lt;td&gt;End-user&lt;/td&gt;
&lt;td&gt;Partial (copy-paste workflow)&lt;/td&gt;
&lt;td&gt;macOS/Windows app; "paste → mask → AI → restore" in one click, fully local&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://www.skyflow.com/" rel="noopener noreferrer"&gt;Skyflow&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SaaS&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Data vault + tokenization&lt;/td&gt;
&lt;td&gt;CISO&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;"LLM Privacy Vault" for sensitive customer data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://www.titanone.ai/" rel="noopener noreferrer"&gt;TitanOne&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Both (self-hosted + SaaS)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;PII redaction for AI&lt;/td&gt;
&lt;td&gt;CISO&lt;/td&gt;
&lt;td&gt;Partial (no code awareness)&lt;/td&gt;
&lt;td&gt;Context-preserving substitution + re-enrichment of LLM responses with original values&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;&lt;a href="https://zeuslock.ai/en/" rel="noopener noreferrer"&gt;ZeusLock&lt;/a&gt;&lt;/strong&gt; (Zeus DLP)&lt;/td&gt;
&lt;td&gt;Both (EU SaaS + sovereign on-prem)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;DLP + secrets + Shadow AI detection&lt;/td&gt;
&lt;td&gt;Mixed&lt;/td&gt;
&lt;td&gt;Partial (no source-code awareness)&lt;/td&gt;
&lt;td&gt;Browser extension + CLI + IDE + MCP coverage; blocks AI tools that train on data (Shadow AI Detection)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Reading the table by axis
&lt;/h2&gt;

&lt;h3&gt;
  
  
  By deployment
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Strict on-premise&lt;/strong&gt; (no third-party server in the loop except the LLM you choose to call): Presidio, LLM Guard, NeMo Guardrails, PromptCape, Quieta, Limina (VPC container), ChatWall (purely local extension) — plus the on-prem deployment options of TitanOne, ZeusLock, and Cypher AI (for defense and regulated industries).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SaaS-only&lt;/strong&gt; (your prompts/data hit the vendor's cloud first, then maybe go to an LLM): GCP DLP, Lakera Guard, Skyflow. Cypher AI is SaaS-by-default but with a cryptographic twist — prompts arrive encrypted under TFHE so the vendor sees no plaintext even on its own servers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hybrid proxy&lt;/strong&gt; (engine runs locally, LLM call goes out to whatever provider you pick): PromptCape, ZeusLock self-hosted. This configuration gives you both control and access to frontier models — the engine is yours, the model is theirs, and the bridge is yours.&lt;/p&gt;

&lt;h3&gt;
  
  
  By target user
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Developer-first (IDE-seamless):&lt;/strong&gt; PromptCape (Cursor/VS Code terminal profile), LLM Guard (library wrapping LLM calls in your code), NeMo Guardrails (library or local server for LLM apps), ZeusLock (browser extension + CLI + IDE plugins). These integrate where the developer already is and add little or no friction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CISO/governance-first:&lt;/strong&gt; Skyflow, GCP DLP, Lakera, Limina, TitanOne, Cypher AI. Strong dashboards, policies, audit trails, compliance certifications. Developers see them as the thing that gates their prompts; adoption requires top-down enforcement.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;End-user / individual:&lt;/strong&gt; Quieta (desktop app for pre-paste anonymization), ChatWall (Firefox extension for browser chat overlays). Bought and installed by individual users one at a time, no enterprise console.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Generic / mixed:&lt;/strong&gt; Presidio (library that can be wrapped in either direction).&lt;/p&gt;

&lt;h3&gt;
  
  
  By coverage of the full cycle
&lt;/h3&gt;

&lt;p&gt;For source code specifically, only &lt;strong&gt;PromptCape&lt;/strong&gt; advertises the full cycle: obfuscate the project → AI iterates on the obfuscated workspace → apply only AI-changed lines back → verify the source still compiles and tests still pass. Everything else in this list either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;(a) does redaction without round-trip (Presidio, GCP DLP, Lakera, ChatWall, ZeusLock when applied to code-containing prompts),&lt;/li&gt;
&lt;li&gt;(b) does round-trip but only on free-text data — not on code that has to compile against framework conventions (Limina, TitanOne, Quieta), or&lt;/li&gt;
&lt;li&gt;(c) is in a different category entirely (NeMo dialogue safety, Cypher FHE inference, LLM Guard input/output scanning).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That gap exists because the round-trip on code is much harder than on text. You have to handle Spring Data derived queries (the method name &lt;code&gt;findByActiveTrue&lt;/code&gt; is itself the query), JPA entity name strings in &lt;code&gt;@Query&lt;/code&gt; annotations, Lombok-generated accessor names, comment stripping vs. preserving line counts, AI-invented variable names, and build artifacts that should not flow back, and then you have to verify that the result compiles and passes tests. A redaction library is a few hundred lines; the round-trip is an order of magnitude more work.&lt;/p&gt;




&lt;h2&gt;
  
  
  Categorical, not directly comparable
&lt;/h2&gt;

&lt;p&gt;It is misleading to put all 13 products in one ranked list — they live in different categories. Roughly:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Products&lt;/th&gt;
&lt;th&gt;What they actually do&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Generic PII / DLP&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Presidio, GCP DLP, Limina, TitanOne, Quieta, ChatWall, ZeusLock&lt;/td&gt;
&lt;td&gt;Detect sensitive entities in text and redact, mask, or tokenize them&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LLM I/O guardrails&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;LLM Guard, Lakera Guard, NeMo Guardrails&lt;/td&gt;
&lt;td&gt;Sit between app and LLM; detect prompt injection, jailbreaks, scan input/output for policy violations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data vaults / tokenization&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Skyflow&lt;/td&gt;
&lt;td&gt;Store sensitive data in a vault; replace with tokens for downstream use including AI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Encrypted inference&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cypher AI&lt;/td&gt;
&lt;td&gt;Run inference on data that stays encrypted end-to-end (FHE)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Code obfuscation for AI coding&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;PromptCape&lt;/td&gt;
&lt;td&gt;Obfuscate source before AI sees it; round-trip back with build verification&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If your problem is &lt;em&gt;"we send PII into a chatbot,"&lt;/em&gt; go to row 1. If your problem is &lt;em&gt;"prompt-injection or jailbreaks in our AI app,"&lt;/em&gt; go to row 2. If your problem is &lt;em&gt;"our source code is being trained on by a model vendor,"&lt;/em&gt; only row 5 answers the question.&lt;/p&gt;




&lt;h2&gt;
  
  
  A pragmatic decision matrix
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Protecting customer PII inside chat or RAG prompts → CISO-driven.&lt;/strong&gt; Skyflow, GCP DLP, Limina, TitanOne, ZeusLock. Pick on deployment constraints (regulated cloud vs. on-prem VPC), language coverage, and existing data infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Protecting against prompt-injection and jailbreaks in your AI app → developer-driven.&lt;/strong&gt; LLM Guard (free, OSS) or NeMo Guardrails (free, OSS) for in-process; Lakera (commercial SaaS) for managed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Protecting AI agents that call tools (MCP-style integrations) → developer + CISO.&lt;/strong&gt; NeMo Guardrails for input-side containment; ZeusLock for MCP-protocol monitoring at the network layer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cryptographic guarantee that neither the LLM provider nor the privacy vendor itself can read the data → CISO-driven.&lt;/strong&gt; Cypher AI: prompts are encrypted client-side under TFHE with customer-managed keys; inference runs on encrypted tensors; only the user decrypts the output. Strong fit for defense and regulated finance. Alternative: use a private LLM deployment via AWS Bedrock / Azure OpenAI / Vertex with private endpoints, which gives you trust-based isolation rather than mathematical isolation but carries no FHE overhead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Protecting your team's source code while still using Claude Code / Cursor / Copilot every day → developer-driven, full cycle.&lt;/strong&gt; PromptCape. Java &amp;amp; Python today, more languages on the roadmap. The unique slot in this list — no other product in our set round-trips real code through framework conventions and verifies the build.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Just a desktop or browser tool to clean text before pasting into a web AI → individual user.&lt;/strong&gt; Quieta (desktop, macOS/Windows), or ChatWall (Firefox extension; very early stage at the time of writing).&lt;/p&gt;




&lt;h2&gt;
  
  
  Honesty about gaps in this comparison
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The line between "AI security" and "data privacy" is fuzzy and many products straddle it. We have categorized by primary marketing claim, not by every adjacent capability.&lt;/li&gt;
&lt;li&gt;Pricing is omitted from the table because it changes constantly and most enterprise products do not publish it. Where pricing is publicly listed, it is mentioned in the text.&lt;/li&gt;
&lt;li&gt;This is not exhaustive. Notable adjacent products not included: GitHub Copilot Enterprise data controls, Anthropic / OpenAI zero-retention enterprise tiers, AWS Bedrock / Azure OpenAI / Vertex private deployments, Tabnine self-hosted, Sourcegraph Cody Enterprise (including on-prem).&lt;/li&gt;
&lt;li&gt;All products were assessed from public web pages on a single research pass. Vendor positioning shifts quickly in this market — verify before purchase.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The AI privacy / code-protection space is crowded but &lt;strong&gt;not duplicative&lt;/strong&gt;. Most products are solving genuinely different problems and only collide on the executive's PowerPoint slide labelled "AI Security."&lt;/p&gt;

&lt;p&gt;If you are a CISO setting policy on AI use across the organization, your shortlist is in the &lt;em&gt;generic PII/DLP&lt;/em&gt; and &lt;em&gt;LLM I/O guardrails&lt;/em&gt; rows. Pick the one that fits your existing stack and compliance regime — most of them are good at what they do.&lt;/p&gt;

&lt;p&gt;If you are a developer who wants to keep using AI coding assistants without sending real source code to a third party, the field narrows fast. The full cycle — obfuscate before, work transparently in the IDE, apply back, and verify the build — is currently only addressed end-to-end by PromptCape. That is a technical gap, not a marketing one: round-tripping code through framework conventions and a 3-way merge is harder than redacting names from free text, and the rest of the market has reasonably chosen the easier problem.&lt;/p&gt;

&lt;p&gt;Both directions are valid and not in competition with each other. A mature AI strategy probably involves one product from each row of the categorization above: a DLP layer at egress, a guardrail layer around your AI apps, a tokenization layer for stored sensitive data, and — if developers in your org code in the IDE every day with AI assistants — a code-obfuscation layer that closes the loop between the IDE and the model.&lt;/p&gt;

&lt;p&gt;PromptCape is open for trial at &lt;a href="https://promptcape.com/" rel="noopener noreferrer"&gt;https://promptcape.com/&lt;/a&gt; — free for 3 months, no credit card required. The companion deep-dive on what makes the Java cycle hard is in &lt;a href="https://dev.to/genevieve_breton_cb795f52/java-code-obfuscation-for-ai-assistants-ensuring-the-full-cycle-works-d5"&gt;Java Code Obfuscation for AI Assistants: Ensuring the Full Cycle Works&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;p&gt;Links to the 13 products reviewed in this article, in the order they appear in the comparison table. All URLs verified May 2026.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://addons.mozilla.org/en-US/firefox/addon/chatwall/" rel="noopener noreferrer"&gt;ChatWall&lt;/a&gt; — Firefox extension for browser AI chat anonymization&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.cypherai.ai/" rel="noopener noreferrer"&gt;Cypher AI&lt;/a&gt; — TFHE-based encrypted LLM inference with customer-managed keys; multi-scheme FHE (TFHE/BFV/CKKS/Paillier); 128-bit post-quantum; ~400× faster than Microsoft SEAL on 10M records; NVIDIA Inception member, validated by 2 independent security agencies&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://cloud.google.com/sensitive-data-protection" rel="noopener noreferrer"&gt;Google Cloud DLP / Sensitive Data Protection&lt;/a&gt; — managed PII redaction and tokenization&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.lakera.ai/lakera-guard" rel="noopener noreferrer"&gt;Lakera Guard&lt;/a&gt; — real-time AI/agent attack detection&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.getlimina.ai/en" rel="noopener noreferrer"&gt;Limina AI&lt;/a&gt; — context-aware PII / PHI / PCI redaction in VPC containers&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/protectai/llm-guard" rel="noopener noreferrer"&gt;LLM Guard&lt;/a&gt; — input/output scanners around LLM calls (Protect AI)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://microsoft.github.io/presidio/" rel="noopener noreferrer"&gt;Microsoft Presidio&lt;/a&gt; — open-source PII redaction framework&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/NVIDIA/NeMo-Guardrails" rel="noopener noreferrer"&gt;NVIDIA NeMo Guardrails&lt;/a&gt; — programmable conversational containment&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://promptcape.com/" rel="noopener noreferrer"&gt;PromptCape&lt;/a&gt; — Java &amp;amp; Python code obfuscation proxy for AI coding assistants&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://quieta.ai/en/" rel="noopener noreferrer"&gt;Quieta&lt;/a&gt; — local-only desktop anonymizer (macOS / Windows)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.skyflow.com/" rel="noopener noreferrer"&gt;Skyflow&lt;/a&gt; — data vault and tokenization&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.titanone.ai/" rel="noopener noreferrer"&gt;TitanOne&lt;/a&gt; — AI Data Firewall with context-preserving substitution and response re-enrichment&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://zeuslock.ai/en/" rel="noopener noreferrer"&gt;ZeusLock / Zeus DLP&lt;/a&gt; — DLP, secrets detection, and Shadow AI Detection for AI tooling&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>java</category>
      <category>security</category>
      <category>productivity</category>
    </item>
    <item>
      <title>PromptCape vs PromptBase: similar names, different products</title>
      <dc:creator>Genevieve Breton</dc:creator>
      <pubDate>Tue, 26 May 2026 14:40:32 +0000</pubDate>
      <link>https://dev.to/genevieve_breton_cb795f52/promptcape-vs-promptbase-similar-names-different-products-4pej</link>
      <guid>https://dev.to/genevieve_breton_cb795f52/promptcape-vs-promptbase-similar-names-different-products-4pej</guid>
      <description>&lt;p&gt;I keep getting the same question: &lt;em&gt;"Is PromptCape the same thing as PromptBase?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;No. They're different products solving different problems for different audiences. The names look alike, the spaces overlap (both are AI-adjacent), and Google sometimes autocorrects one to the other. This short article exists to make the distinction explicit — for humans, and for search engines.&lt;/p&gt;

&lt;p&gt;If you're here because you typed "promptcape" into Google and landed on PromptBase, this article is the bridge.&lt;/p&gt;




&lt;h2&gt;
  
  
  PromptBase, in one paragraph
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://promptbase.com/" rel="noopener noreferrer"&gt;PromptBase&lt;/a&gt; is a marketplace for AI prompts. Designers, copywriters, and AI hobbyists list prompts they've engineered for image models (Midjourney, Stable Diffusion, DALL·E) and for text models (ChatGPT, Claude). Buyers download the prompts and use them to generate their own content. Think of it as Etsy for prompt engineering. It launched in 2022 and has been growing steadily as the "selling prompts as digital products" niche matured.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Target user:&lt;/strong&gt; anyone who uses AI tools to generate content (visual, marketing, copy) and wants higher-quality prompts than they could write themselves.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem solved:&lt;/strong&gt; distribution and monetization of prompt engineering as craft.&lt;/p&gt;




&lt;h2&gt;
  
  
  PromptCape, in one paragraph
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://promptcape.com/" rel="noopener noreferrer"&gt;PromptCape&lt;/a&gt; is a Java code obfuscation proxy for AI coding assistants. It sits between your IDE (Claude Code, Cursor, Mistral) and the AI API, renames every identifier in your source code — &lt;code&gt;InvoiceService&lt;/code&gt; becomes &lt;code&gt;Cls_a1b2c3d4&lt;/code&gt;, &lt;code&gt;customerName&lt;/code&gt; becomes &lt;code&gt;fld_e5f6a7b8&lt;/code&gt; — sends the obfuscated version to the AI, then reverses the rename on the way back. The AI works on the obfuscated code without ever seeing your real class names, package structure, or business domain language. It targets developers and teams who have IP protection clauses, NDAs, or compliance requirements that constrain what can be sent to cloud-based AI tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Target user:&lt;/strong&gt; Java developers and engineering teams who want to use AI coding assistants without exposing proprietary source code to AI training corpora or third-party logs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem solved:&lt;/strong&gt; keeping source code IP private while still benefiting from AI coding assistance.&lt;/p&gt;




&lt;h2&gt;
  
  
  Side by side
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;PromptBase&lt;/th&gt;
&lt;th&gt;PromptCape&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;What it is&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Marketplace for AI prompts&lt;/td&gt;
&lt;td&gt;Code obfuscation proxy for AI assistants&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Audience&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Content creators, designers, marketers&lt;/td&gt;
&lt;td&gt;Software developers and engineering teams&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Primary value&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Buy/sell pre-engineered prompts&lt;/td&gt;
&lt;td&gt;Protect source code from AI cloud APIs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Used for&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Image generation, copywriting&lt;/td&gt;
&lt;td&gt;Java coding with Claude Code / Cursor / Mistral&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Touches your code?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes — that's the whole point&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pricing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Per-prompt purchases&lt;/td&gt;
&lt;td&gt;Free for 3 months, then paid license&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Founded&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2022&lt;/td&gt;
&lt;td&gt;2026&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The functional overlap is zero. PromptBase is about the inputs (prompts as digital products). PromptCape is about the inputs &lt;em&gt;and&lt;/em&gt; the outputs of an AI coding loop, with a strong focus on what leaves your machine.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why the names look so close
&lt;/h2&gt;

&lt;p&gt;Both names start with &lt;em&gt;"Prompt"&lt;/em&gt; because both are in the AI space. The follow-on word makes the difference:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PromptBase&lt;/strong&gt; — "Base" as in &lt;em&gt;database&lt;/em&gt;, &lt;em&gt;foundation&lt;/em&gt;, the collection where prompts live and are exchanged.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PromptCape&lt;/strong&gt; — "Cape" as in &lt;em&gt;the garment that shields&lt;/em&gt;; a cape over your code before it travels to the AI.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I picked "Cape" for the protection metaphor, knowing the proximity to existing names was a risk. The metaphor is the whole product positioning: &lt;em&gt;your code is protected when it goes out and comes back.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Which one do you actually need?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;"I want to find a great Midjourney prompt for a vaporwave cityscape"&lt;/em&gt; → &lt;strong&gt;PromptBase&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;"I want to find a great ChatGPT prompt for cold sales emails"&lt;/em&gt; → &lt;strong&gt;PromptBase&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;"I want to use Claude Code on a private Java repo without sending real class names to Anthropic"&lt;/em&gt; → &lt;strong&gt;PromptCape&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;"My client added a no-AI-assistants clause and I want to comply without giving up AI productivity"&lt;/em&gt; → &lt;strong&gt;PromptCape&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;"I work in a regulated industry (banking, health, defense) and need to obfuscate source identifiers before AI calls"&lt;/em&gt; → &lt;strong&gt;PromptCape&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're a developer and any of the last three scenarios sounds relevant, the landing page is at &lt;a href="https://promptcape.com/" rel="noopener noreferrer"&gt;promptcape.com&lt;/a&gt;. If you're looking for prompts to buy, you want &lt;a href="https://promptbase.com/" rel="noopener noreferrer"&gt;promptbase.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;No rivalry, no overlap. Just two products with names that happen to start with the same six letters.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>java</category>
      <category>security</category>
    </item>
    <item>
      <title>Building a transparent terminal-based proxy for Claude Code in Cursor (or any IDE)</title>
      <dc:creator>Genevieve Breton</dc:creator>
      <pubDate>Thu, 21 May 2026 16:18:14 +0000</pubDate>
      <link>https://dev.to/genevieve_breton_cb795f52/building-a-transparent-terminal-based-proxy-for-claude-code-in-cursor-or-any-ide-3k1p</link>
      <guid>https://dev.to/genevieve_breton_cb795f52/building-a-transparent-terminal-based-proxy-for-claude-code-in-cursor-or-any-ide-3k1p</guid>
      <description>&lt;p&gt;The previous two articles in this series (&lt;a href="https://dev.to/genevieve_breton_cb795f52/java-code-obfuscation-for-ai-assistants-ensuring-the-full-cycle-works-d5"&gt;part 1: obfuscation&lt;/a&gt;, &lt;a href="https://dev.to/genevieve_breton_cb795f52/reverse-applying-ai-changes-to-obfuscated-code-a-3-way-merge-that-actually-works-15gm"&gt;part 2: the 3-way merge&lt;/a&gt;) were about what happens to your code. This one is about what happens to your &lt;em&gt;developer&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;I had a CLI that could obfuscate a Java project, send it to Claude, and merge the changes back. The pipeline worked. But the actual day-to-day flow was: run a CLI command to obfuscate, copy the obfuscated workspace path, paste it into Claude Code, work in Claude, copy the AI's output back, run another CLI command to merge. Five context switches per AI interaction. Nobody — including me — was going to use it twice.&lt;/p&gt;

&lt;p&gt;The friction was the integration. Every IDE has its own way of talking to Claude or to OpenAI. Cursor has its own Claude pane, JetBrains has its own AI assistant, VS Code has Copilot. I was &lt;em&gt;not&lt;/em&gt; going to build a plugin for each one, maintain it, and watch it break with every release.&lt;/p&gt;

&lt;p&gt;The shortcut that solved it: a transparent localhost HTTP proxy. About 200 lines of code, no IDE plugin, no Cursor extension, no fork of anything. The developer types &lt;code&gt;claude&lt;/code&gt; in Cursor's built-in terminal and PromptCape is silently between them and the API.&lt;/p&gt;

&lt;p&gt;This article is the &lt;em&gt;how&lt;/em&gt; of that proxy: the architectural choice, the five traps that made it harder than I expected, and why this approach generalizes to almost anything that talks to an LLM.&lt;/p&gt;




&lt;h2&gt;
  
  
  The decision: don't wrap the IDE, wrap the network
&lt;/h2&gt;

&lt;p&gt;When you set out to integrate a tool into an IDE, the obvious-looking path is to write a plugin. JetBrains has its plugin API, VS Code has its extension model, Cursor has its own integrations. You quickly realize:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each IDE has its own API, packaging, and review process.&lt;/li&gt;
&lt;li&gt;AI features inside each IDE evolve fast — every release threatens to move where the conversation hooks live.&lt;/li&gt;
&lt;li&gt;For a tool that has to see &lt;em&gt;every&lt;/em&gt; prompt and &lt;em&gt;every&lt;/em&gt; response, you end up reimplementing the wire protocol per IDE anyway.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The shortcut nobody mentions: &lt;strong&gt;most AI coding assistants and SDKs respect a base URL environment variable.&lt;/strong&gt; Claude Code uses &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt;. The OpenAI ecosystem (which Cursor and many others speak) uses &lt;code&gt;OPENAI_BASE_URL&lt;/code&gt;. Set it, and the client points at &lt;em&gt;your&lt;/em&gt; server instead of &lt;code&gt;api.anthropic.com&lt;/code&gt; or &lt;code&gt;api.openai.com&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That collapses the integration problem from "write N IDE plugins" to "run a reverse proxy on localhost." One code path. Every IDE that respects the env var works for free.&lt;/p&gt;
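
&lt;p&gt;As a rough illustration of the env-var trick, here is a minimal Java sketch that prepares the CLI process with the base URL pointed at a local proxy. The class &lt;code&gt;ClaudeLauncher&lt;/code&gt; and its method are hypothetical names for this example, and port 8077 simply matches the proxy port used in this article. In day-to-day use you would more likely export the variable in your shell profile.&lt;/p&gt;

```java
final class ClaudeLauncher {
    // Prepares a process for the given CLI command with ANTHROPIC_BASE_URL
    // pointing at a proxy on localhost. Nothing runs until start() is called.
    static ProcessBuilder viaProxy(String command, int proxyPort) {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.environment().put("ANTHROPIC_BASE_URL", "http://localhost:" + proxyPort);
        pb.inheritIO(); // share the terminal so the CLI stays interactive
        return pb;
    }
}
```

&lt;p&gt;Calling &lt;code&gt;ClaudeLauncher.viaProxy("claude", 8077).start().waitFor()&lt;/code&gt; would then run the CLI against the proxy, assuming the &lt;code&gt;claude&lt;/code&gt; binary is on the path.&lt;/p&gt;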

&lt;p&gt;The mental model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; Cursor terminal               PromptCape proxy        Anthropic API
 ┌─────────────┐  obfuscation  ┌──────────────┐  HTTPS  ┌──────────┐
 │   claude    │ ────────────► │  localhost   │ ───────►│  real    │
 │   (CLI)     │ ◄──────────── │   :8077      │ ◄───────│  API     │
 └─────────────┘   de-obf'd    └──────────────┘   obf'd └──────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From Cursor's point of view, the user opened a terminal and ran &lt;code&gt;claude&lt;/code&gt;. There is no extension. There is no patched binary. The proxy is invisible to the IDE because the IDE was never the integration point — the network was.&lt;/p&gt;




&lt;h2&gt;
  
  
  The bare minimum
&lt;/h2&gt;

&lt;p&gt;Stripped of the obfuscation logic, the proxy is uncomfortably simple. A Javalin-based catch-all that takes any POST, rewrites the body, forwards it to the real API, and pipes the response back:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;post&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/*"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;rewritten&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;interceptRequest&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="nc"&gt;HttpRequest&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HttpRequest&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newBuilder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;uri&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;URI&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;targetBaseUrl&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="o"&gt;()))&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;POST&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;HttpRequest&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BodyPublishers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rewritten&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
        &lt;span class="c1"&gt;// ... forward headers, minus hop-by-hop&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="nc"&gt;HttpResponse&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;send&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                             &lt;span class="nc"&gt;BodyHandlers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofString&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;statusCode&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;result&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;interceptResponse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;
&lt;span class="o"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your "interception" is a no-op, that's a transparent proxy. The two interception methods are where obfuscation happens — translating real names → obfuscated names on the way out, and obfuscated → real on the way back.&lt;/p&gt;
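
&lt;p&gt;To make the two directions concrete, here is a minimal sketch of a symmetric rename pass. It is not PromptCape's actual interceptor: &lt;code&gt;NameMapper&lt;/code&gt; is a hypothetical class, the mapping is passed in as two parallel arrays, and the replacement is a plain whole-word regex over text rather than anything aware of JSON or Java syntax.&lt;/p&gt;

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NameMapper {
    private final String[] realNames;
    private final String[] maskedNames; // maskedNames[i] stands in for realNames[i]

    NameMapper(String[] realNames, String[] maskedNames) {
        this.realNames = realNames;
        this.maskedNames = maskedNames;
    }

    // Outbound direction: real names become masked names.
    String obfuscate(String text) {
        return swap(text, realNames, maskedNames);
    }

    // Inbound direction: masked names become real names again.
    String deobfuscate(String text) {
        return swap(text, maskedNames, realNames);
    }

    private static String swap(String text, String[] from, String[] to) {
        String result = text;
        int i = 0;
        for (String name : from) {
            // \b keeps the match to whole identifiers, so a longer name that
            // merely starts with a mapped name is left alone.
            String regex = "\\b" + Pattern.quote(name) + "\\b";
            result = result.replaceAll(regex, Matcher.quoteReplacement(to[i]));
            i++;
        }
        return result;
    }
}
```

&lt;p&gt;With the mapping from this series (&lt;code&gt;InvoiceService&lt;/code&gt; to &lt;code&gt;Cls_a1b2c3d4&lt;/code&gt;), &lt;code&gt;deobfuscate(obfuscate(text))&lt;/code&gt; should return the original text as long as no masked name also occurs as a real identifier.&lt;/p&gt;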

&lt;p&gt;The thing that surprised me is how &lt;em&gt;little&lt;/em&gt; IDE knowledge is needed. The IDE never sees the proxy, never knows the URL was rewritten, never knows the conversation passed through anything. The contract is HTTP and a base URL.&lt;/p&gt;




&lt;h2&gt;
  
  
  Trap 1: streaming responses
&lt;/h2&gt;

&lt;p&gt;The first version handled responses with &lt;code&gt;BodyHandlers.ofString()&lt;/code&gt; — buffer the whole response, transform, return. Claude Code uses streaming responses (SSE — server-sent events). The first time I tested under real load, the user-visible behavior was: silence for 8 seconds, then the entire answer dumped at once.&lt;/p&gt;

&lt;p&gt;Streaming isn't a nice-to-have. Developers expect tokens to flow as they're generated; that's a big chunk of what "feels like AI" is. You have to forward chunks as they arrive &lt;em&gt;and&lt;/em&gt; de-obfuscate them on the fly.&lt;/p&gt;

&lt;p&gt;The Java HTTP client supports &lt;code&gt;BodyHandlers.ofInputStream()&lt;/code&gt;, which hands you the response body as a live stream instead of a buffered string. You read SSE events line by line, run each one through the de-obfuscation pass, write it back to the client's output stream, and flush after each event boundary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;HttpResponse&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;InputStream&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;send&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                         &lt;span class="nc"&gt;BodyHandlers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofInputStream&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;BufferedReader&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BufferedReader&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; 
                &lt;span class="nc"&gt;InputStreamReader&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="no"&gt;UTF_8&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
     &lt;span class="nc"&gt;OutputStream&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;outputStream&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readLine&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;processed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;processor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;processLine&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;write&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;processed&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getBytes&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;UTF_8&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;write&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;'\n'&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;flush&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;&lt;span class="c1"&gt;// SSE event boundary&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The subtlety is in &lt;code&gt;processor.processLine&lt;/code&gt;. SSE events look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;event:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;content_block_delta&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;data:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"content_block_delta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"index"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"delta"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"text_delta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"InvoiceService"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can't just regex-replace on the raw line, because an identifier such as &lt;code&gt;InvoiceService&lt;/code&gt; might be split across two chunks (&lt;code&gt;Invoice&lt;/code&gt; in one, &lt;code&gt;Service&lt;/code&gt; in the next) by the server's tokenizer. The processor maintains a small carry-over buffer: it holds the trailing part of the previous chunk, joins it with the new chunk, runs the replacement, then writes out everything except a tail as long as the longest mapped name. That tail stays in the buffer until the next chunk arrives, so a name cut in half is only replaced once it is complete.&lt;/p&gt;

&lt;p&gt;This is the kind of thing that doesn't show up in unit tests with full strings but breaks the moment a real API tokenizes mid-identifier. The fix is mechanical once you see it — but you'll only see it if you test against the real API, not a mocked one.&lt;/p&gt;
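
&lt;p&gt;The carry-over idea can be sketched roughly as follows. This is an illustration under assumptions, not the real &lt;code&gt;processor.processLine&lt;/code&gt;: the hypothetical &lt;code&gt;CarryOverReplacer&lt;/code&gt; works on text fragments that have already been extracted from the SSE events, uses plain substring replacement, and holds back one character less than the longest masked name.&lt;/p&gt;

```java
final class CarryOverReplacer {
    private final String[] maskedNames;
    private final String[] realNames; // realNames[i] replaces maskedNames[i]
    private final int holdBack;
    private final StringBuilder carry = new StringBuilder();

    CarryOverReplacer(String[] maskedNames, String[] realNames) {
        this.maskedNames = maskedNames;
        this.realNames = realNames;
        int longest = 1;
        for (String name : maskedNames) {
            longest = Math.max(longest, name.length());
        }
        // A split name can leave at most (longest - 1) characters at the end.
        this.holdBack = longest - 1;
    }

    // Accepts the next fragment and returns the part that is safe to emit now.
    String feed(String chunk) {
        carry.append(chunk);
        String replaced = replaceAll(carry.toString());
        int safe = Math.max(0, replaced.length() - holdBack);
        carry.setLength(0);
        carry.append(replaced, safe, replaced.length()); // keep the tail for later
        return replaced.substring(0, safe);
    }

    // Call once at end of stream to emit whatever is still held back.
    String finish() {
        String rest = replaceAll(carry.toString());
        carry.setLength(0);
        return rest;
    }

    private String replaceAll(String text) {
        String out = text;
        int i = 0;
        for (String name : maskedNames) {
            out = out.replace(name, realNames[i]);
            i++;
        }
        return out;
    }
}
```

&lt;p&gt;The trade-off in this sketch is latency: output always trails the stream by up to one name length, and &lt;code&gt;finish()&lt;/code&gt; has to run when the stream closes or the last few characters are lost.&lt;/p&gt;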




&lt;h2&gt;
  
  
  Trap 2: accept-encoding
&lt;/h2&gt;

&lt;p&gt;This one cost me a day. The proxy was buffering responses fine, but the de-obfuscation logic was matching zero identifiers. The response body looked like binary garbage in the logs.&lt;/p&gt;

&lt;p&gt;The cause: I was faithfully forwarding the IDE's request headers — including &lt;code&gt;accept-encoding: gzip, br&lt;/code&gt;. The real API obliged and returned a gzipped response. My text-based interceptor parsed the gzipped bytes as if they were JSON, found no identifiers to replace, and forwarded the still-gzipped bytes to the client. The client decompressed them on its end, so the user saw a plausible response — but with no obfuscation reversal.&lt;/p&gt;

&lt;p&gt;The fix is one line: strip &lt;code&gt;accept-encoding&lt;/code&gt; from the forwarded request. Now the API returns uncompressed JSON, the interceptor sees text, the round trip works.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="no"&gt;HOP_BY_HOP_HEADERS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"host"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"connection"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"keep-alive"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"transfer-encoding"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"te"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"trailer"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"upgrade"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"content-length"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"accept-encoding"&lt;/span&gt; &lt;span class="c1"&gt;// ← critical: keep responses uncompressed&lt;/span&gt;
&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Worth a half-line comment in the code. It's the kind of one-header oversight that produces a &lt;em&gt;silently wrong&lt;/em&gt; system, not a noisy crash.&lt;/p&gt;
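
&lt;p&gt;For completeness, this is roughly how such a skip list might be applied when copying headers onto the outgoing request. &lt;code&gt;HeaderFilter&lt;/code&gt; and the array-of-pairs input are assumptions made for this sketch; the one detail worth copying is the lowercase comparison, since clients are free to send &lt;code&gt;Accept-Encoding&lt;/code&gt; in any casing.&lt;/p&gt;

```java
import java.net.http.HttpRequest;
import java.util.Arrays;
import java.util.Locale;

final class HeaderFilter {
    // Same names as the HOP_BY_HOP_HEADERS set shown above.
    private static final String[] SKIPPED = {
        "host", "connection", "keep-alive", "transfer-encoding",
        "te", "trailer", "upgrade", "content-length", "accept-encoding"
    };

    // Header names are case-insensitive, so normalize before comparing.
    static boolean shouldForward(String headerName) {
        String normalized = headerName.toLowerCase(Locale.ROOT);
        return !Arrays.asList(SKIPPED).contains(normalized);
    }

    // incoming holds {name, value} pairs taken from the client request.
    static void copyHeaders(String[][] incoming, HttpRequest.Builder target) {
        for (String[] pair : incoming) {
            if (shouldForward(pair[0])) {
                target.header(pair[0], pair[1]);
            }
        }
    }
}
```

&lt;p&gt;Filtering before calling &lt;code&gt;header()&lt;/code&gt; also sidesteps the &lt;code&gt;IllegalArgumentException&lt;/code&gt; that the JDK client throws for restricted names such as &lt;code&gt;host&lt;/code&gt; and &lt;code&gt;content-length&lt;/code&gt;.&lt;/p&gt;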




&lt;h2&gt;
  
  
  Trap 3: don't translate tool blocks
&lt;/h2&gt;

&lt;p&gt;Claude's API content isn't a flat string. It's a list of typed blocks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"messages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
       &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Refactor InvoiceService to use Optional"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tool_result"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
       &lt;/span&gt;&lt;span class="nl"&gt;"tool_use_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
       &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"package com.acme; ..."&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tool_use"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
       &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
       &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"read_file"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
       &lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The user-typed text needs translation: &lt;code&gt;InvoiceService&lt;/code&gt; → &lt;code&gt;Cls_a1b2c3d4&lt;/code&gt;. But the &lt;code&gt;tool_result&lt;/code&gt; block contains the contents of a file the AI just read &lt;em&gt;from the obfuscated workspace&lt;/em&gt;, so it's already obfuscated. If I run it through the translator, usually nothing visibly happens, because the obfuscated names don't match the real-name patterns. But the moment a real name appears by accident, say in a comment that survived stripping, you've rewritten text inside content that was already in obfuscated space, and what the AI sees no longer matches the file it read. Round-tripping gets harder fast.&lt;/p&gt;

&lt;p&gt;The fix: walk the content array, look at the &lt;code&gt;type&lt;/code&gt; field, only translate &lt;code&gt;"text"&lt;/code&gt; blocks. Leave &lt;code&gt;"tool_result"&lt;/code&gt; and &lt;code&gt;"tool_use"&lt;/code&gt; blocks untouched.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;JsonNode&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;contentArray&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;asText&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;has&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="nc"&gt;ObjectNode&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; 
          &lt;span class="n"&gt;translateText&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;asText&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="c1"&gt;// tool_use, tool_result → leave alone, already in obfuscated space&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the corollary of the bigger architectural choice: the AI works in an obfuscated &lt;em&gt;workspace&lt;/em&gt;, not just on obfuscated &lt;em&gt;prompts&lt;/em&gt;. The file system the AI sees through &lt;code&gt;read_file&lt;/code&gt; is the obfuscated cache directory. Everything it reads is already obfuscated. The proxy only needs to translate the human-readable channel: what the user types, and what the AI replies in text.&lt;/p&gt;




&lt;h2&gt;
  
  
  Trap 4: HTTP/2 pseudo-headers
&lt;/h2&gt;

&lt;p&gt;This was the obscure one. The Java HTTP client speaks HTTP/2 to modern APIs. HTTP/2 has pseudo-headers (&lt;code&gt;:status&lt;/code&gt; on responses; &lt;code&gt;:method&lt;/code&gt; and &lt;code&gt;:path&lt;/code&gt; on requests) that are legal at the HTTP/2 layer but are not valid header names in HTTP/1.1. My proxy was happily copying &lt;em&gt;every&lt;/em&gt; response header from the API back to the client in the Cursor terminal, including &lt;code&gt;:status&lt;/code&gt;. Some clients tolerate this; some (Claude Code) reject the response.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="n"&gt;apiResponse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;map&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;forEach&lt;/span&gt;&lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;lower&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toLowerCase&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;startsWith&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;":"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// skip HTTP/2 pseudo-headers&lt;/span&gt;
    &lt;span class="c1"&gt;// ... forward the rest&lt;/span&gt;
&lt;span class="o"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is one of those bugs that lives at the seam between two HTTP versions. The Java HTTP client hands you the HTTP/2 headers in their HTTP/2 form, and you're forwarding them to a client that may or may not be speaking HTTP/2. Filter aggressively.&lt;/p&gt;




&lt;h2&gt;
  
  
  Trap 5: making it forget-about-it-able
&lt;/h2&gt;

&lt;p&gt;A foreground proxy in a terminal works for a demo. For daily use, developers want the proxy running quietly in the background so they can open a new terminal and run &lt;code&gt;claude&lt;/code&gt; immediately. So the CLI grew a &lt;code&gt;--detach&lt;/code&gt; mode:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spawn a child JVM running the proxy in the foreground.&lt;/li&gt;
&lt;li&gt;Inherit env (so the license key propagates).&lt;/li&gt;
&lt;li&gt;Redirect stdout/stderr to &lt;code&gt;~/.promptcape/proxy.log&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Write the child PID to &lt;code&gt;~/.promptcape/proxy.pid&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Wait up to 5 seconds for the port to come up, then exit.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;ProcessBuilder&lt;/span&gt; &lt;span class="n"&gt;pb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ProcessBuilder&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;redirectErrorStream&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;redirectOutput&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ProcessBuilder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Redirect&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;appendTo&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;logFile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toFile&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;
&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;redirectInput&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;File&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;isWin&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="s"&gt;"NUL"&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"/dev/null"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

&lt;span class="nc"&gt;Process&lt;/span&gt; &lt;span class="n"&gt;child&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;start&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="nc"&gt;Files&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;writeString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pidFile&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;valueOf&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;pid&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
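
&lt;p&gt;The last step in the list, waiting for the port to come up, can be sketched roughly like this. It's a minimal sketch, not the actual PromptCape code: the class name &lt;code&gt;PortWait&lt;/code&gt;, the method &lt;code&gt;waitForPort&lt;/code&gt;, and the attempt count and pause are my assumptions for illustration; only port 8077 and the 5-second budget come from the article.&lt;/p&gt;

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: the name and the retry numbers are illustrative only.
class PortWait {

    /** Polls until something accepts TCP connections on localhost:port, or the attempts run out. */
    static boolean waitForPort(int port, int attempts, long pauseMillis) throws InterruptedException {
        for (int i = 0; i != attempts; i++) {
            try (Socket probe = new Socket()) {
                // A successful connect means the child proxy has bound the port.
                probe.connect(new InetSocketAddress("127.0.0.1", port), 200);
                return true;
            } catch (IOException notUpYet) {
                // Connection refused: the child is still starting, so pause and retry.
                Thread.sleep(pauseMillis);
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // 50 attempts with a 100 ms pause is roughly the 5-second budget.
        boolean up = waitForPort(8077, 50, 100);
        System.out.println(up ? "proxy is listening" : "proxy did not start; check the log");
    }
}
```

&lt;p&gt;Probing the port instead of sleeping for a fixed time means the parent can return as soon as the child is ready, and can point the user at the log file when it never comes up.&lt;/p&gt;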



&lt;p&gt;Plus a &lt;code&gt;--stop&lt;/code&gt; that's idempotent (returns 0 if the proxy is already gone — a stale PID file isn't an error), and a &lt;code&gt;--logs&lt;/code&gt; that tails the log file with the rotation handling you'd expect.&lt;/p&gt;
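
&lt;p&gt;An idempotent stop along those lines might look like the sketch below. The class &lt;code&gt;ProxyStop&lt;/code&gt; and its &lt;code&gt;stop&lt;/code&gt; method are hypothetical names, and treating a corrupt PID file like a stale one is my assumption; the PID file path is the one from the list above.&lt;/p&gt;

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of an idempotent stop; not the actual PromptCape implementation.
class ProxyStop {

    /** Returns 0 whether the proxy was stopped just now or was already gone. */
    static int stop(Path pidFile) throws IOException {
        if (!Files.exists(pidFile)) {
            return 0; // nothing recorded, nothing to stop
        }
        String raw = Files.readString(pidFile).trim();
        try {
            long pid = Long.parseLong(raw);
            // ProcessHandle.of returns an empty Optional when no such process exists,
            // so a stale PID simply falls through.
            ProcessHandle.of(pid).ifPresent(ProcessHandle::destroy);
        } catch (NumberFormatException garbled) {
            // Assumption: a corrupt PID file is treated like a stale one.
        }
        Files.deleteIfExists(pidFile);
        return 0;
    }

    public static void main(String[] args) throws IOException {
        Path pidFile = Path.of(System.getProperty("user.home"), ".promptcape", "proxy.pid");
        System.out.println("exit code " + stop(pidFile));
    }
}
```

&lt;p&gt;One caveat with this sketch: operating systems reuse PIDs, so a stale file could point at an unrelated process. A more careful version would check that the process really is the proxy (for example by inspecting &lt;code&gt;ProcessHandle.info()&lt;/code&gt;) before destroying it.&lt;/p&gt;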

&lt;p&gt;These are the kinds of features users discover they need three days in. &lt;em&gt;"How do I see what the proxy is doing without restarting it in the foreground?"&lt;/em&gt; — &lt;code&gt;--logs&lt;/code&gt;. &lt;em&gt;"I don't remember if the proxy is running, can I just run &lt;code&gt;--stop&lt;/code&gt; to be safe?"&lt;/em&gt; — yes, it's idempotent. None of this is technically deep, but skipping it makes the tool feel rough.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Cursor angle: there is no Cursor angle
&lt;/h2&gt;

&lt;p&gt;Here's the punchline. Once you have a localhost reverse proxy that respects &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt;, integrating with Cursor isn't a feature. It's the absence of one.&lt;/p&gt;

&lt;p&gt;The workflow inside Cursor:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open Cursor.&lt;/li&gt;
&lt;li&gt;Open the built-in terminal (Ctrl+`).&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;promptcape proxy --detach&lt;/code&gt; (or have it running already from a startup script).&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;ANTHROPIC_BASE_URL=http://localhost:8077 claude&lt;/code&gt; — or just &lt;code&gt;claude&lt;/code&gt; if you exported the env var.&lt;/li&gt;
&lt;li&gt;Use Claude Code normally.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There is no Cursor plugin to install. There is no JSON config to edit. There is no &lt;code&gt;.cursorrules&lt;/code&gt; file to set up. The terminal is just a shell, the shell respects environment variables, the env var changes the API endpoint, the proxy does the rest.&lt;/p&gt;

&lt;p&gt;That's the win. The integration cost — for me, for the user, for every future IDE — collapsed to nothing.&lt;/p&gt;

&lt;p&gt;You can wrap it up as a small launcher script. I called mine &lt;code&gt;pcc&lt;/code&gt; (PromptCape Claude). It does &lt;code&gt;export ANTHROPIC_BASE_URL=...; exec claude "$@"&lt;/code&gt;. Three lines. The user types &lt;code&gt;pcc&lt;/code&gt; instead of &lt;code&gt;claude&lt;/code&gt; and everything is obfuscated end to end.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why this generalizes
&lt;/h2&gt;

&lt;p&gt;I think the broader takeaway is worth more than the specific implementation:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If a tool you want to integrate with reads an HTTP endpoint, write a reverse proxy before you write a plugin.&lt;/strong&gt; The endpoint is the integration point. The plugin is at best a config helper around the same indirection.&lt;/p&gt;

&lt;p&gt;This applies far beyond AI tooling. Anything that talks to a SaaS API and respects a base URL — analytics, observability, payments — can be sandboxed, intercepted, transformed, or replayed with the same pattern. Plugins are per-IDE; proxies are per-protocol. Per-protocol wins.&lt;/p&gt;

&lt;p&gt;The specific lesson for AI tooling: &lt;strong&gt;the prompt and the workspace are different channels.&lt;/strong&gt; Translating the workspace (the file system the AI reads through tools) and translating the prompt (the human-typed text) are two different problems. Conflate them and you double-obfuscate. Keep them separate, which type-tagged content blocks make trivial, and the proxy stays small.&lt;/p&gt;




&lt;p&gt;If you want to see the proxy code in full, the streaming SSE processor, and the conversation samples (real-name in, obfuscated-name on the wire, real-name back), the worked examples are in &lt;a href="https://gitlab.com/gbreton7/promptcape-docs" rel="noopener noreferrer"&gt;gitlab.com/gbreton7/promptcape-docs&lt;/a&gt;. This is the third and last article of the &lt;strong&gt;&lt;a href="https://promptcape.com/" rel="noopener noreferrer"&gt;PromptCape&lt;/a&gt;&lt;/strong&gt; series — obfuscation pipeline, 3-way merge, transparent proxy. MRs welcome on the docs repo if you've integrated this pattern with an IDE I haven't tried.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>security</category>
    </item>
    <item>
      <title>Building a transparent terminal-based proxy for Claude Code in Cursor (or any IDE)</title>
      <dc:creator>Genevieve Breton</dc:creator>
      <pubDate>Thu, 21 May 2026 16:18:14 +0000</pubDate>
      <link>https://dev.to/genevieve_breton_cb795f52/building-a-transparent-terminal-based-proxy-for-claude-code-in-cursor-or-any-ide-3i9j</link>
      <guid>https://dev.to/genevieve_breton_cb795f52/building-a-transparent-terminal-based-proxy-for-claude-code-in-cursor-or-any-ide-3i9j</guid>
      <description>&lt;p&gt;The previous two articles in this series (&lt;a href="https://dev.to/genevieve_breton_cb795f52/java-code-obfuscation-for-ai-assistants-ensuring-the-full-cycle-works-d5"&gt;part 1: obfuscation&lt;/a&gt;, &lt;a href="https://dev.to/genevieve_breton_cb795f52/reverse-applying-ai-changes-to-obfuscated-code-a-3-way-merge-that-actually-works-15gm"&gt;part 2: the 3-way merge&lt;/a&gt;) were about what happens to your code. This one is about what happens to your &lt;em&gt;developer&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;I had a CLI that could obfuscate a Java project, send it to Claude, and merge the changes back. The pipeline worked. But the actual day-to-day flow was: run a CLI command to obfuscate, copy the obfuscated workspace path, paste it into Claude Code, work in Claude, copy the AI's output back, run another CLI command to merge. Five context switches per AI interaction. Nobody — including me — was going to use it twice.&lt;/p&gt;

&lt;p&gt;The friction was the integration. Every IDE has its own way of talking to Claude or to OpenAI. Cursor has its own Claude pane, JetBrains has its own AI assistant, VS Code has Copilot. I was &lt;em&gt;not&lt;/em&gt; going to build a plugin for each one, maintain it, and watch it break with every release.&lt;/p&gt;

&lt;p&gt;The shortcut that solved it: a transparent localhost HTTP proxy. About 200 lines of code, no IDE plugin, no Cursor extension, no fork of anything. The developer types &lt;code&gt;claude&lt;/code&gt; in Cursor's built-in terminal and PromptCape is silently between them and the API.&lt;/p&gt;

&lt;p&gt;This article is the &lt;em&gt;how&lt;/em&gt; of that proxy: the architectural choice, the five traps that made it harder than I expected, and why this approach generalizes to almost anything that talks to an LLM.&lt;/p&gt;




&lt;h2&gt;
  
  
  The decision: don't wrap the IDE, wrap the network
&lt;/h2&gt;

&lt;p&gt;When you set out to integrate a tool into an IDE, the obvious-looking path is to write a plugin. JetBrains has its plugin API, VS Code has its extension model, Cursor has its own integrations. You quickly realize:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each IDE has its own API, packaging, and review process.&lt;/li&gt;
&lt;li&gt;AI features inside each IDE evolve fast — every release threatens to move where the conversation hooks live.&lt;/li&gt;
&lt;li&gt;For a tool that has to see &lt;em&gt;every&lt;/em&gt; prompt and &lt;em&gt;every&lt;/em&gt; response, you end up reimplementing the wire protocol per IDE anyway.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The shortcut nobody mentions: &lt;strong&gt;every modern AI coding assistant respects a base URL environment variable.&lt;/strong&gt; Claude Code uses &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt;. The OpenAI ecosystem (which Cursor and many others speak) uses &lt;code&gt;OPENAI_BASE_URL&lt;/code&gt;. Set it, and the client points at &lt;em&gt;your&lt;/em&gt; server instead of &lt;code&gt;api.anthropic.com&lt;/code&gt; or &lt;code&gt;api.openai.com&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That collapses the integration problem from "write N IDE plugins" to "run a reverse proxy on localhost." One code path. Every IDE that respects the env var works for free.&lt;/p&gt;

&lt;p&gt;The mental model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; Cursor terminal               PromptCape proxy        Anthropic API
 ┌─────────────┐  obfuscation  ┌──────────────┐  HTTPS  ┌──────────┐
 │   claude    │ ────────────► │  localhost   │ ───────►│  real    │
 │   (CLI)     │ ◄──────────── │   :8077      │ ◄───────│  API     │
 └─────────────┘   de-obf'd    └──────────────┘   obf'd └──────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From Cursor's point of view, the user opened a terminal and ran &lt;code&gt;claude&lt;/code&gt;. There is no extension. There is no patched binary. The proxy is invisible to the IDE because the IDE was never the integration point — the network was.&lt;/p&gt;




&lt;h2&gt;
  
  
  The bare minimum
&lt;/h2&gt;

&lt;p&gt;Stripped of the obfuscation logic, the proxy is uncomfortably simple. A Javalin-based catch-all that takes any POST, rewrites the body, forwards it to the real API, and pipes the response back:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;post&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/*"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;rewritten&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;interceptRequest&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="nc"&gt;HttpRequest&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HttpRequest&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newBuilder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;uri&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;URI&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;targetBaseUrl&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="o"&gt;()))&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;POST&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;HttpRequest&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BodyPublishers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rewritten&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
        &lt;span class="c1"&gt;// ... forward headers, minus hop-by-hop&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="nc"&gt;HttpResponse&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;send&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                             &lt;span class="nc"&gt;BodyHandlers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofString&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;statusCode&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;result&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;interceptResponse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;
&lt;span class="o"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your "interception" is a no-op, that's a transparent proxy. The two interception methods are where obfuscation happens — translating real names → obfuscated names on the way out, and obfuscated → real on the way back.&lt;/p&gt;

&lt;p&gt;The thing that surprised me is how &lt;em&gt;little&lt;/em&gt; IDE knowledge is needed. The IDE never sees the proxy, never knows the URL was rewritten, never knows the conversation passed through anything. The contract is HTTP and a base URL.&lt;/p&gt;




&lt;h2&gt;
  
  
  Trap 1: streaming responses
&lt;/h2&gt;

&lt;p&gt;The first version handled responses with &lt;code&gt;BodyHandlers.ofString()&lt;/code&gt; — buffer the whole response, transform, return. Claude Code uses streaming responses (SSE — server-sent events). The first time I tested under real load, the user-visible behavior was: silence for 8 seconds, then the entire answer dumped at once.&lt;/p&gt;

&lt;p&gt;Streaming isn't a nice-to-have. Developers expect tokens to flow as they're generated; that's a big chunk of what "feels like AI" is. You have to forward chunks as they arrive &lt;em&gt;and&lt;/em&gt; de-obfuscate them on the fly.&lt;/p&gt;

&lt;p&gt;The Java HTTP client supports &lt;code&gt;BodyHandlers.ofInputStream()&lt;/code&gt;, which hands you the response body as a live stream instead of a fully buffered string. You read SSE events line by line, run each one through the de-obfuscation pass, write it back to the client's output stream, and flush at each event boundary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;HttpResponse&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;InputStream&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;send&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                         &lt;span class="nc"&gt;BodyHandlers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofInputStream&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;BufferedReader&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BufferedReader&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; 
                &lt;span class="nc"&gt;InputStreamReader&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="no"&gt;UTF_8&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
     &lt;span class="nc"&gt;OutputStream&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;outputStream&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;readLine&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;processed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;processor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;processLine&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;write&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;processed&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getBytes&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;UTF_8&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;write&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;'\n'&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isEmpty&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;flush&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// SSE event boundary&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The subtlety is in &lt;code&gt;processor.processLine&lt;/code&gt;. SSE events look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;event:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;content_block_delta&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;data:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"content_block_delta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"index"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
       &lt;/span&gt;&lt;span class="nl"&gt;"delta"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"text_delta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"InvoiceService"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can't just regex-replace on the raw line — &lt;code&gt;InvoiceService&lt;/code&gt; might be split across two chunks (&lt;code&gt;Invoice&lt;/code&gt; in one, &lt;code&gt;Service&lt;/code&gt; in the next) by the server's tokenizer. The processor maintains a small carry-over buffer that holds the trailing bit of the previous chunk, joins it with the new chunk, runs the replacement, then writes everything except a tail of length max-mapping-length back out.&lt;/p&gt;

&lt;p&gt;This is the kind of thing that doesn't show up in unit tests with full strings but breaks the moment a real API tokenizes mid-identifier. The fix is mechanical once you see it — but you'll only see it if you test against the real API, not a mocked one.&lt;/p&gt;
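&lt;p&gt;A minimal sketch of that carry-over buffer, under a few stated assumptions: it works on text already extracted from each event's &lt;code&gt;text&lt;/code&gt; field rather than on the raw SSE line, it uses plain substring replacement with no word-boundary handling, and no substituted name contains a wire name. &lt;code&gt;CarryOverReplacer&lt;/code&gt;, &lt;code&gt;feed&lt;/code&gt; and &lt;code&gt;finish&lt;/code&gt; are illustrative names, not the actual processor.&lt;/p&gt;

```java
/**
 * Hypothetical sketch of a carry-over buffer for streamed text.
 * Class and method names are illustrative, not PromptCape's actual processor.
 */
final class CarryOverReplacer {
    private final String[] wireNames;   // names as they appear on the wire
    private final String[] localNames;  // names to substitute, in the same order
    private final int maxLen;           // length of the longest wire name
    private String carry = "";          // held-back tail of the previous chunks

    CarryOverReplacer(String[] wireNames, String[] localNames) {
        this.wireNames = wireNames.clone();
        this.localNames = localNames.clone();
        int longest = 0;
        for (String name : wireNames) {
            longest = Math.max(longest, name.length());
        }
        this.maxLen = longest;
    }

    /** Feeds one chunk and returns the text that is safe to emit now. */
    String feed(String chunk) {
        String joined = replaceAll(carry + chunk);
        // Hold back the last maxLen characters: they may be the start of a
        // name whose remainder only arrives with the next chunk.
        int cut = Math.max(0, joined.length() - maxLen);
        carry = joined.substring(cut);
        return joined.substring(0, cut);
    }

    /** Call once at end of stream to emit whatever is still held back. */
    String finish() {
        String rest = replaceAll(carry);
        carry = "";
        return rest;
    }

    private String replaceAll(String text) {
        String result = text;
        for (int i = wireNames.length - 1; i >= 0; i--) {
            result = result.replace(wireNames[i], localNames[i]);
        }
        return result;
    }
}
```

&lt;p&gt;The cost of this design is latency: up to &lt;code&gt;maxLen&lt;/code&gt; characters always lag behind the stream until &lt;code&gt;finish()&lt;/code&gt; is called, which is usually invisible to the user but worth knowing when debugging.&lt;/p&gt;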




&lt;h2&gt;
  
  
  Trap 2: accept-encoding
&lt;/h2&gt;

&lt;p&gt;This one cost me a day. The proxy was buffering responses fine, but the de-obfuscation logic was matching zero identifiers. The response body looked like binary garbage in the logs.&lt;/p&gt;

&lt;p&gt;The cause: I was faithfully forwarding the IDE's request headers — including &lt;code&gt;accept-encoding: gzip, br&lt;/code&gt;. The real API obliged and returned a gzipped response. My text-based interceptor parsed the gzipped bytes as if they were JSON, found no identifiers to replace, and forwarded the still-gzipped bytes to the client. The client decompressed them on its end, so the user saw a plausible response — but with no obfuscation reversal.&lt;/p&gt;

&lt;p&gt;The fix is one line: strip &lt;code&gt;accept-encoding&lt;/code&gt; from the forwarded request. Now the API returns uncompressed JSON, the interceptor sees text, the round trip works.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="no"&gt;HOP_BY_HOP_HEADERS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"host"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"connection"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"keep-alive"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"transfer-encoding"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"te"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"trailer"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"upgrade"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"content-length"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"accept-encoding"&lt;/span&gt; &lt;span class="c1"&gt;// ← critical: keep responses uncompressed&lt;/span&gt;
&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Worth a half-line comment in the code. It's the kind of one-line omission that produces a &lt;em&gt;silently wrong&lt;/em&gt; system, not a noisy crash.&lt;/p&gt;




&lt;h2&gt;
  
  
  Trap 3: don't translate tool blocks
&lt;/h2&gt;

&lt;p&gt;Claude's API content isn't a flat string. It's a list of typed blocks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"messages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
       &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Refactor InvoiceService to use Optional"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tool_result"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
       &lt;/span&gt;&lt;span class="nl"&gt;"tool_use_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
       &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"package com.acme; ..."&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tool_use"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
       &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
       &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"read_file"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; 
       &lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The user-typed text needs translation: &lt;code&gt;InvoiceService&lt;/code&gt; → &lt;code&gt;Cls_a1b2c3d4&lt;/code&gt;. But the &lt;code&gt;tool_result&lt;/code&gt; block contains the contents of a file the AI just read — &lt;em&gt;from the obfuscated workspace&lt;/em&gt;. It's already obfuscated. If I run it through the translator, nothing visibly happens (the obfuscated names don't match the real-name patterns), but the moment a real name accidentally appears in a comment that survived stripping, you've now obfuscated something inside a string that came from an already-obfuscated context. It rapidly gets harder to round-trip.&lt;/p&gt;

&lt;p&gt;The fix: walk the content array, look at the &lt;code&gt;type&lt;/code&gt; field, only translate &lt;code&gt;"text"&lt;/code&gt; blocks. Leave &lt;code&gt;"tool_result"&lt;/code&gt; and &lt;code&gt;"tool_use"&lt;/code&gt; blocks untouched.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;JsonNode&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;contentArray&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;asText&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;has&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="nc"&gt;ObjectNode&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; 
          &lt;span class="n"&gt;translateText&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;asText&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="c1"&gt;// tool_use, tool_result → leave alone, already in obfuscated space&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the corollary of the bigger architectural choice: the AI works in an obfuscated &lt;em&gt;workspace&lt;/em&gt;, not just on obfuscated &lt;em&gt;prompts&lt;/em&gt;. The file system the AI sees through &lt;code&gt;read_file&lt;/code&gt; is the obfuscated cache directory. Everything it reads is already obfuscated. The proxy only needs to translate the human-readable channel: what the user types, and what the AI replies in text.&lt;/p&gt;




&lt;h2&gt;
  
  
  Trap 4: HTTP/2 pseudo-headers
&lt;/h2&gt;

&lt;p&gt;This was the obscure one. The Java HTTP client speaks HTTP/2 to modern APIs. HTTP/2 has pseudo-headers (&lt;code&gt;:status&lt;/code&gt; on responses, &lt;code&gt;:method&lt;/code&gt; and &lt;code&gt;:path&lt;/code&gt; on requests) that are legal at the HTTP/2 layer but are not valid header names in HTTP/1.1. My proxy was happily copying &lt;em&gt;every&lt;/em&gt; response header from the API back to the client running in the Cursor terminal, including &lt;code&gt;:status&lt;/code&gt;. Some clients tolerate this; others (Claude Code among them) reject the response.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="n"&gt;apiResponse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;map&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;forEach&lt;/span&gt;&lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;values&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;lower&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toLowerCase&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;startsWith&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;":"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// skip HTTP/2 pseudo-headers&lt;/span&gt;
    &lt;span class="c1"&gt;// ... forward the rest&lt;/span&gt;
&lt;span class="o"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One of those bugs that exists at the protocol seam between two HTTP versions. The Java HTTP client gives you the HTTP/2 headers in their HTTP/2 form, and you're shipping them to a client that may or may not be reading HTTP/2 framing. Filter aggressively.&lt;/p&gt;




&lt;h2&gt;
  
  
  Trap 5: making it forget-about-it-able
&lt;/h2&gt;

&lt;p&gt;A foreground proxy in a terminal works for a demo. For daily use, developers want the proxy running quietly in the background so they can open a new terminal and &lt;code&gt;claude&lt;/code&gt; immediately. So the CLI grew a &lt;code&gt;--detach&lt;/code&gt; mode:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spawn a child JVM running the proxy in the foreground.&lt;/li&gt;
&lt;li&gt;Inherit env (so the license key propagates).&lt;/li&gt;
&lt;li&gt;Redirect stdout/stderr to &lt;code&gt;~/.promptcape/proxy.log&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Write the child PID to &lt;code&gt;~/.promptcape/proxy.pid&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Wait up to 5 seconds for the port to come up, then exit.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;ProcessBuilder&lt;/span&gt; &lt;span class="n"&gt;pb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ProcessBuilder&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;redirectErrorStream&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;redirectOutput&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ProcessBuilder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Redirect&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;appendTo&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;logFile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toFile&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;
&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;redirectInput&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;File&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;isWin&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="s"&gt;"NUL"&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"/dev/null"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

&lt;span class="nc"&gt;Process&lt;/span&gt; &lt;span class="n"&gt;child&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;start&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="nc"&gt;Files&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;writeString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pidFile&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;valueOf&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;pid&lt;/span&gt;&lt;span class="o"&gt;()));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Plus a &lt;code&gt;--stop&lt;/code&gt; that's idempotent (returns 0 if the proxy is already gone — a stale PID file isn't an error), and a &lt;code&gt;--logs&lt;/code&gt; that tails the log file with the rotation handling you'd expect.&lt;/p&gt;

&lt;p&gt;These are the kinds of features users discover they need three days in. &lt;em&gt;"How do I see what the proxy is doing without restarting it in the foreground?"&lt;/em&gt; — &lt;code&gt;--logs&lt;/code&gt;. &lt;em&gt;"I don't remember if the proxy is running, can I just run &lt;code&gt;--stop&lt;/code&gt; to be safe?"&lt;/em&gt; — yes, it's idempotent. None of this is technically deep, but skipping it makes the tool feel rough.&lt;/p&gt;
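&lt;p&gt;An idempotent &lt;code&gt;--stop&lt;/code&gt; can be sketched roughly like this. The &lt;code&gt;ProxyStopper&lt;/code&gt; class and the command-name check are assumptions made for illustration: because operating systems reuse PIDs, the sketch only signals a process whose command looks like a JVM, and a real implementation would want a stricter identity check.&lt;/p&gt;

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of an idempotent stop command; names are illustrative. */
final class ProxyStopper {

    /** Returns the process exit code for --stop: 0 whether or not a proxy was running. */
    static int stop(Path pidFile) throws IOException {
        if (!Files.exists(pidFile)) {
            return 0; // nothing recorded, nothing to stop
        }
        String raw = Files.readString(pidFile).trim();
        try {
            long pid = Long.parseLong(raw);
            ProcessHandle.of(pid)
                // Assumption: the proxy runs in a JVM, so skip anything else that
                // may have inherited a recycled PID.
                .filter(h -> h.info().command().map(c -> c.contains("java")).orElse(false))
                .ifPresent(ProcessHandle::destroy);
        } catch (NumberFormatException ignored) {
            // Corrupt PID file: treat it as stale rather than as an error.
        }
        Files.deleteIfExists(pidFile);
        return 0;
    }
}
```

&lt;p&gt;A missing file, a stale PID and a corrupt file all end the same way: the PID file is gone and the exit code is 0, which is what makes it safe to run "just in case".&lt;/p&gt;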




&lt;h2&gt;
  
  
  The Cursor angle: there is no Cursor angle
&lt;/h2&gt;

&lt;p&gt;Here's the punchline. Once you have a localhost reverse proxy that respects &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt;, integrating with Cursor isn't a feature. It's the absence of one.&lt;/p&gt;

&lt;p&gt;The workflow inside Cursor:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open Cursor.&lt;/li&gt;
&lt;li&gt;Open the built-in terminal (Ctrl+`).&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;promptcape proxy --detach&lt;/code&gt; (or have it running already from a startup script).&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;ANTHROPIC_BASE_URL=http://localhost:8077 claude&lt;/code&gt; — or just &lt;code&gt;claude&lt;/code&gt; if you exported the env var.&lt;/li&gt;
&lt;li&gt;Use Claude Code normally.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There is no Cursor plugin to install. There is no JSON config to edit. There is no &lt;code&gt;.cursorrules&lt;/code&gt; file to set up. The terminal is just a shell, the shell respects environment variables, the env var changes the API endpoint, the proxy does the rest.&lt;/p&gt;

&lt;p&gt;That's the win. The integration cost — for me, for the user, for every future IDE — collapsed to nothing.&lt;/p&gt;

&lt;p&gt;You can wrap it up as a small launcher script. I called mine &lt;code&gt;pcc&lt;/code&gt; (PromptCape Claude). It does &lt;code&gt;export ANTHROPIC_BASE_URL=...; exec claude "$@"&lt;/code&gt;. Three lines. The user types &lt;code&gt;pcc&lt;/code&gt; instead of &lt;code&gt;claude&lt;/code&gt; and everything is obfuscated end to end.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why this generalizes
&lt;/h2&gt;

&lt;p&gt;I think the broader takeaway is worth more than the specific implementation:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If a tool you want to integrate with reads an HTTP endpoint, write a reverse proxy before you write a plugin.&lt;/strong&gt; The endpoint is the integration point. The plugin is at best a config helper around the same indirection.&lt;/p&gt;

&lt;p&gt;This applies far beyond AI tooling. Anything that talks to a SaaS API and respects a base URL — analytics, observability, payments — can be sandboxed, intercepted, transformed, or replayed with the same pattern. Plugins are per-IDE; proxies are per-protocol. Per-protocol wins.&lt;/p&gt;

&lt;p&gt;The specific lesson for AI tooling: &lt;strong&gt;the prompt and the workspace are different channels.&lt;/strong&gt; Translating the workspace (the file system the AI reads through tools) and translating the prompt (the human-typed text) are two different problems. Conflate them and you double-obfuscate. Keep them separate (type-tagged content blocks make this trivial) and the proxy stays small.&lt;/p&gt;




&lt;p&gt;If you want to see the proxy code in full, the streaming SSE processor, and the conversation samples (real-name in, obfuscated-name on the wire, real-name back), the worked examples are in &lt;a href="https://gitlab.com/gbreton7/promptcape-docs" rel="noopener noreferrer"&gt;gitlab.com/gbreton7/promptcape-docs&lt;/a&gt;. This is the third and last article of the &lt;strong&gt;&lt;a href="https://promptcape.com/" rel="noopener noreferrer"&gt;PromptCape&lt;/a&gt;&lt;/strong&gt; series — obfuscation pipeline, 3-way merge, transparent proxy. MRs welcome on the docs repo if you've integrated this pattern with an IDE I haven't tried.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>security</category>
    </item>
    <item>
      <title>Reverse-applying AI changes to obfuscated code: a 3-way merge that actually works</title>
      <dc:creator>Genevieve Breton</dc:creator>
      <pubDate>Tue, 19 May 2026 20:02:23 +0000</pubDate>
      <link>https://dev.to/genevieve_breton_cb795f52/reverse-applying-ai-changes-to-obfuscated-code-a-3-way-merge-that-actually-works-15gm</link>
      <guid>https://dev.to/genevieve_breton_cb795f52/reverse-applying-ai-changes-to-obfuscated-code-a-3-way-merge-that-actually-works-15gm</guid>
      <description>&lt;p&gt;In the &lt;a href="https://dev.to/genevieve_breton_cb795f52/java-code-obfuscation-for-ai-assistants-ensuring-the-full-cycle-works-d5"&gt;last article&lt;/a&gt; I went through what breaks when you obfuscate Java code before sending it to an AI assistant — Spring Data, JPA, Lombok, the whole framework iceberg. That was about getting the obfuscated source &lt;em&gt;out&lt;/em&gt; in a state the AI can work on.&lt;/p&gt;

&lt;p&gt;This one is about the much subtler half: getting the AI's changes back &lt;em&gt;in&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;It looks trivial. You sent &lt;code&gt;Cls_a1b2c3d4&lt;/code&gt; to the AI, the AI returned a modified &lt;code&gt;Cls_a1b2c3d4&lt;/code&gt;, you have a mapping table, just walk the file and replace each obfuscated identifier with its original. Done in twenty lines of code.&lt;/p&gt;

&lt;p&gt;Except your real file — the one a human will read tomorrow morning — now has no comments, no Javadoc, no blank lines between methods, no formatting choices you made over six months. The obfuscation pipeline stripped all of that on the way out. Reversing the rename doesn't bring it back.&lt;/p&gt;

&lt;p&gt;This is the story of why naive reverse-translation is wrong, why &lt;strong&gt;it's a 3-way merge problem, not a translation problem&lt;/strong&gt;, and what the merge actually has to handle in practice.&lt;/p&gt;




&lt;h2&gt;
  
  
  The naive reverse
&lt;/h2&gt;

&lt;p&gt;Here's what most people try first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;aiOutput&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;readAiResponse&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;realSource&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aiOutput&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Mapping&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;mappings&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;realSource&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;realSource&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;replace&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;obfuscated&lt;/span&gt;&lt;span class="o"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;real&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;writeRealFile&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;realSource&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set aside that you also need word-boundary regex and longest-match-first ordering to avoid prefix collisions — assume you handled all of that. The output is still wrong.&lt;/p&gt;

&lt;p&gt;Why? Because the file you sent to the AI was not just &lt;em&gt;renamed&lt;/em&gt;. It was also:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Comment-stripped.&lt;/strong&gt; Sending Javadoc and inline comments to the AI is gratuitous leakage — they contain plain-English domain language. So they get replaced with blank-equivalent lines before transmission.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reformatted in subtle ways.&lt;/strong&gt; Multi-line string literals get sanitized. Annotations on separate lines get coalesced. Blank lines are preserved but only by accident.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you reverse-translate the AI's output and write it back, you're overwriting your real source with the &lt;em&gt;obfuscation-pipeline-shaped&lt;/em&gt; version of itself, plus the AI's changes. Every comment you wrote is gone. Every formatting choice. Every blank line in the right place.&lt;/p&gt;

&lt;p&gt;The first time I ran this end-to-end on a real project, I tested it on a service class. The AI added one method. I diffed the result against my source: &lt;strong&gt;312 lines changed.&lt;/strong&gt; One of them was the AI's new method. The other 311 were comments and formatting I had just nuked.&lt;/p&gt;
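&lt;p&gt;For reference, the word-boundary and longest-match-first handling that the naive loop glosses over can be sketched as below. The &lt;code&gt;BoundaryAwareReplacer&lt;/code&gt; name and the &lt;code&gt;{obfuscated, real}&lt;/code&gt; pair layout are assumptions for illustration, and the sketch assumes no real name is itself an obfuscated name. It fixes the rename only; it does nothing about the stripped comments and formatting.&lt;/p&gt;

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: whole-word, longest-first reverse renaming. */
final class BoundaryAwareReplacer {

    /** Each entry of pairs is {obfuscatedName, realName}; this shape is an assumption. */
    static String deobfuscate(String source, String[][] pairs) {
        String[][] ordered = pairs.clone();
        // Longest obfuscated name first, so a short name never pre-empts a longer one.
        Arrays.sort(ordered, Comparator.comparingInt((String[] p) -> p[0].length()).reversed());

        String result = source;
        for (String[] pair : ordered) {
            // \b on both sides: match the identifier only as a whole word.
            Pattern wholeWord = Pattern.compile("\\b" + Pattern.quote(pair[0]) + "\\b");
            // quoteReplacement keeps '$' and '\' in real names from being read as group references.
            result = wholeWord.matcher(result).replaceAll(Matcher.quoteReplacement(pair[1]));
        }
        return result;
    }
}
```

&lt;p&gt;With pairs &lt;code&gt;Cls_a1&lt;/code&gt; → &lt;code&gt;Invoice&lt;/code&gt; and &lt;code&gt;Cls_a1b2&lt;/code&gt; → &lt;code&gt;InvoiceService&lt;/code&gt;, an unmapped &lt;code&gt;Cls_a1b2c&lt;/code&gt; in the input is left alone instead of being half-renamed.&lt;/p&gt;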




&lt;h2&gt;
  
  
  The mental shift: it's a merge, not a translation
&lt;/h2&gt;

&lt;p&gt;Here's the model that finally clicked. The obfuscated file is not the canonical version of your source. It is a &lt;em&gt;projection&lt;/em&gt; of your source — one that lost information on purpose. You can't reconstruct your source from the projection alone. You need both.&lt;/p&gt;

&lt;p&gt;In &lt;code&gt;git&lt;/code&gt; terms: this is a 3-way merge.&lt;/p&gt;

&lt;p&gt;Three inputs:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Snapshot&lt;/strong&gt;: the obfuscated version of your code &lt;em&gt;before&lt;/em&gt; the AI touched it. (Your "common ancestor.")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache&lt;/strong&gt;: the obfuscated version &lt;em&gt;after&lt;/em&gt; the AI's changes. (The "theirs" side.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real&lt;/strong&gt;: your actual source file, with all comments and formatting intact. (The "ours" side.)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The output is your real file, with only the AI's changes applied.&lt;/p&gt;

&lt;p&gt;The merge logic, in one sentence: &lt;strong&gt;for each line, if the AI didn't change it (snapshot line == cache line), keep your real line; if the AI changed it, de-obfuscate the cache line and use that.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Stated like that, it's almost obvious. The implementation has interesting corners.&lt;/p&gt;




&lt;h2&gt;
  
  
  The easy case: same line count
&lt;/h2&gt;

&lt;p&gt;When the AI modifies lines without adding or removing any, the line indices line up across all three files. The merge is one pass:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;snapshotLines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;snapshot&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;split&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"\n"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;cacheLines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;split&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"\n"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;realLines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;real&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;split&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"\n"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="nc"&gt;StringBuilder&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;StringBuilder&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;cacheLines&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;length&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;'\n'&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;snapshotLines&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;].&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cacheLines&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;]))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// AI didn't touch this line — keep the real version (with comments, formatting)&lt;/span&gt;
        &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;realLines&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;]);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// AI changed this line — de-obfuscate it&lt;/span&gt;
        &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;deobfuscate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cacheLines&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;]));&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. The whole trick is the &lt;code&gt;snapshotLines[i].equals(cacheLines[i])&lt;/code&gt; check. Equal obfuscated lines mean the AI didn't write here, so your real line (comments, blank lines, formatting) survives untouched. Only the lines the AI actually changed get the de-obfuscated translation.&lt;/p&gt;

&lt;p&gt;This single trick made the merge usable. On a typical edit (the AI adds a parameter, changes a return type, inserts a guard clause), it touches 5–20 lines and the rest of the file stays bit-for-bit identical to my source. No phantom formatting changes, no destroyed Javadoc.&lt;/p&gt;




&lt;h2&gt;
  
  
  The hard case: AI added or removed lines
&lt;/h2&gt;

&lt;p&gt;When the AI adds an &lt;code&gt;if&lt;/code&gt; block or removes a redundant method, line counts diverge between snapshot and cache. Now indices don't line up — line &lt;em&gt;N&lt;/em&gt; of the cache might correspond to line &lt;em&gt;N+3&lt;/em&gt; of the snapshot, or to nothing at all.&lt;/p&gt;

&lt;p&gt;You can pull in &lt;a href="https://github.com/java-diff-utils/java-diff-utils" rel="noopener noreferrer"&gt;java-diff-utils&lt;/a&gt; and run a real LCS-based diff here. I tried that first. It works, but it adds a dependency, the diff format needs translation, and for the size of edits the AI typically makes (5–50 lines), a homegrown linear walker is faster and easier to reason about.&lt;/p&gt;

&lt;p&gt;The walker keeps three indices — one per file — and decides at each step whether the current cache line is an unchanged line (advance all three), a modification (advance all three, but de-obfuscate the cache line), or an insertion (advance only the cache index):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;si&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ci&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ci&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;cacheLines&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;length&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;si&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;snapshotLines&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;snapshotLines&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;si&lt;/span&gt;&lt;span class="o"&gt;].&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cacheLines&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ci&lt;/span&gt;&lt;span class="o"&gt;]))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// unchanged → keep real&lt;/span&gt;
        &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;realLines&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ri&lt;/span&gt;&lt;span class="o"&gt;]);&lt;/span&gt;
        &lt;span class="n"&gt;si&lt;/span&gt;&lt;span class="o"&gt;++;&lt;/span&gt; &lt;span class="n"&gt;ci&lt;/span&gt;&lt;span class="o"&gt;++;&lt;/span&gt; &lt;span class="n"&gt;ri&lt;/span&gt;&lt;span class="o"&gt;++;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// changed: modification or insertion?&lt;/span&gt;
        &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;isInsertion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;si&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;snapshotLines&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;length&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;look&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ci&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;look&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;cacheLines&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;look&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;ci&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;look&lt;/span&gt;&lt;span class="o"&gt;++)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;snapshotLines&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;si&lt;/span&gt;&lt;span class="o"&gt;].&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cacheLines&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;look&lt;/span&gt;&lt;span class="o"&gt;]))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                    &lt;span class="n"&gt;isInsertion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
                    &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
                &lt;span class="o"&gt;}&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;isInsertion&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// AI inserted a new line before the next snapshot line&lt;/span&gt;
            &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;deobfuscate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cacheLines&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ci&lt;/span&gt;&lt;span class="o"&gt;]));&lt;/span&gt;
            &lt;span class="n"&gt;ci&lt;/span&gt;&lt;span class="o"&gt;++;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// AI modified or replaced this line&lt;/span&gt;
            &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;deobfuscate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cacheLines&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ci&lt;/span&gt;&lt;span class="o"&gt;]));&lt;/span&gt;
            &lt;span class="n"&gt;si&lt;/span&gt;&lt;span class="o"&gt;++;&lt;/span&gt; &lt;span class="n"&gt;ci&lt;/span&gt;&lt;span class="o"&gt;++;&lt;/span&gt; &lt;span class="n"&gt;ri&lt;/span&gt;&lt;span class="o"&gt;++;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The 50-line look-ahead window deserves a comment. It's the heuristic that decides "did the AI insert new code before this snapshot line, or did it modify this snapshot line?" If the pending snapshot line (&lt;code&gt;snapshotLines[si]&lt;/code&gt;) shows up further down the cache within that window, treat the current cache line as an insertion. Otherwise treat it as a modification.&lt;/p&gt;

&lt;p&gt;Why 50? Two reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Cost.&lt;/strong&gt; A full O(N²) LCS on a 2000-line file does ~4M comparisons. Bounded look-ahead caps each step at 50 comparisons → 100k total. On a real file this is microseconds.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Realism about edit shapes.&lt;/strong&gt; AI assistants rarely insert 50+ contiguous lines without also modifying surrounding code. When they do, you're outside "merge a small edit" territory and you should be re-running the obfuscation pipeline on the result anyway.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It is a heuristic. On pathological inputs (the AI rewrites the entire file), it degrades to "treat everything as modification" which produces a usable but heavily de-obfuscated file. That's the right failure mode — you'll lose formatting on the affected stretch, but you won't lose data.&lt;/p&gt;




&lt;h2&gt;
  
  
  Three traps that the test suite found
&lt;/h2&gt;

&lt;p&gt;The merge in the previous sections is the version that &lt;em&gt;works&lt;/em&gt;. Getting there involved walking face-first into a few traps that aren't obvious from the algorithm.&lt;/p&gt;

&lt;h3&gt;
  
  
  Trap 1: snapshot and cache disagree on line count even when the AI didn't add lines
&lt;/h3&gt;

&lt;p&gt;Early on I was confused by failures that looked like the AI had inserted lines, when the diff in the assistant's output clearly hadn't.&lt;/p&gt;

&lt;p&gt;What was happening: the snapshot had been written months earlier, when the obfuscation pipeline's comment-stripping pass replaced multi-line &lt;code&gt;/* ... */&lt;/code&gt; comments with a single empty line. The current version replaces each line of the comment with its own empty line — preserving line count. So a snapshot from version &lt;em&gt;v1&lt;/em&gt; and a cache from version &lt;em&gt;v2&lt;/em&gt; could disagree by dozens of lines for the same source file, just because of comment-stripping format drift.&lt;/p&gt;

&lt;p&gt;The fix: when snapshot and cache disagree on line count, &lt;em&gt;re-obfuscate the real file on the fly&lt;/em&gt; to get a fresh snapshot in the current format, and use that as the merge ancestor. This applies only to &lt;code&gt;.java&lt;/code&gt; files: the sanitizers for &lt;code&gt;.properties&lt;/code&gt;, &lt;code&gt;.yml&lt;/code&gt;, and &lt;code&gt;pom.xml&lt;/code&gt; preserve line count by construction, so any line-count drift on those files is a genuine AI edit, not a format mismatch.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;snapshotLines&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;length&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;cacheLines&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;obfRelativePath&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;endsWith&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".java"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;freshObfuscated&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;obfuscateContent&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;realContent&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;freshObfuscated&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;snapshotContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;freshObfuscated&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I did &lt;em&gt;not&lt;/em&gt; gate this on the obfuscation pipeline version — the cost of re-obfuscating one file on demand is negligible, and the alternative (storing version metadata per snapshot and migrating on read) was complexity I didn't want.&lt;/p&gt;
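&lt;p&gt;For reference, the line-count-preserving behaviour described above (every line of a block comment becomes its own empty line) can be sketched as follows. This is a minimal illustration, not PromptCape's actual pass: the class and method names are hypothetical, and it assumes that comment markers never appear inside string literals.&lt;/p&gt;

```java
/** Hypothetical sketch: blank out block comments while keeping the line count stable. */
final class BlockCommentBlanker {

    /** Removes block comments; lines left with nothing but whitespace become empty lines. */
    static String blank(String source) {
        StringBuilder out = new StringBuilder();
        boolean inComment = false; // true while we are between the comment markers
        boolean first = true;
        for (String line : source.split("\n", -1)) {
            if (!first) out.append('\n');
            first = false;
            String rest = line;
            StringBuilder kept = new StringBuilder();
            while (true) {
                if (inComment) {
                    int end = rest.indexOf("*/");
                    if (end == -1) break;              // comment continues on the next line
                    rest = rest.substring(end + 2);
                    inComment = false;
                } else {
                    int start = rest.indexOf("/*");
                    if (start == -1) { kept.append(rest); break; }
                    kept.append(rest, 0, start);       // keep the code before the comment
                    rest = rest.substring(start + 2);
                    inComment = true;
                }
            }
            String text = kept.toString();
            out.append(text.trim().isEmpty() ? "" : text);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String source = "int a;\n/* one\n * two\n */\nint b;";
        String blanked = blank(source);
        // Both the input and the output have five lines.
        System.out.println(blanked.split("\n", -1).length);
    }
}
```

&lt;p&gt;Because one line goes out for every line that comes in, a snapshot and a cache produced by the same version of such a pass can only disagree on line count when the AI really added or removed lines.&lt;/p&gt;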

&lt;h3&gt;
  
  
  Trap 2: don't run the Java obfuscation pipeline on a .properties file
&lt;/h3&gt;

&lt;p&gt;There's a sharp corner in the fix above. The re-obfuscation call is &lt;code&gt;engine.obfuscateContent(realContent)&lt;/code&gt;. That method runs the &lt;em&gt;Java&lt;/em&gt; pipeline — JavaParser AST walk, identifier replacement, comment stripping, reflection-string post-processing.&lt;/p&gt;

&lt;p&gt;If I run it on a &lt;code&gt;.properties&lt;/code&gt; file, it produces a near-identity transformation (no Java identifiers to rename, no comments to strip the same way). The output is almost-but-not-quite the same as the real file. Now I have a "snapshot" that diverges from the cache on every single line, because the &lt;em&gt;properties sanitizer&lt;/em&gt; (a different pipeline) produced the cache, while the Java pipeline produced this fresh "snapshot."&lt;/p&gt;

&lt;p&gt;The merge then concludes that the AI rewrote every line of the properties file, and helpfully writes the sanitized (&lt;code&gt;REDACTED&lt;/code&gt; placeholder) values back into the real &lt;code&gt;application.properties&lt;/code&gt;. That's more than a corrupted file. It's the data exfiltration risk inverted: the redaction now overwrites the real secret.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;.endsWith(".java")&lt;/code&gt; guard above isn't a perf optimization. It's a correctness boundary.&lt;/p&gt;

&lt;h3&gt;
  
  
  Trap 3: AI-created files don't have a snapshot at all
&lt;/h3&gt;

&lt;p&gt;When the AI creates a new file — &lt;code&gt;Cls_a1b2c3d4Test.java&lt;/code&gt;, say — there's no snapshot to merge against. There's no real file either. You just have the cache.&lt;/p&gt;

&lt;p&gt;This case is simpler in some ways (full de-obfuscation of the content, no merge) but it has its own corner: the &lt;em&gt;filename itself&lt;/em&gt; contains obfuscated identifiers. &lt;code&gt;Cls_a1b2c3d4Test.java&lt;/code&gt; needs to become &lt;code&gt;InvoiceServiceTest.java&lt;/code&gt; — the AI used the obfuscated class name as a prefix to a new identifier, and the path resolver has to recognize the embedded mapping.&lt;/p&gt;

&lt;p&gt;The strategy: try a full-filename match against known class mappings first (&lt;code&gt;Cls_a1b2c3d4.java&lt;/code&gt; → &lt;code&gt;InvoiceService.java&lt;/code&gt;). If that fails, run the standard line de-obfuscation on the filename without extension and treat whatever comes out as the real name (&lt;code&gt;Cls_a1b2c3d4Test&lt;/code&gt; → &lt;code&gt;InvoiceServiceTest&lt;/code&gt;). Same for package path segments.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(!&lt;/span&gt;&lt;span class="n"&gt;matched&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;fileName&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;endsWith&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".java"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;stem&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fileName&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;substring&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fileName&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;length&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;fileName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;deobfuscateLine&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stem&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;".java"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The same machinery you wrote to de-obfuscate file &lt;em&gt;contents&lt;/em&gt; solves the file &lt;em&gt;path&lt;/em&gt; problem if you feed it the path as a string. Once I noticed this, several other corner cases I'd been hand-rolling (paths in stack traces, file references in error messages) collapsed into the same call.&lt;/p&gt;




&lt;h2&gt;
  
  
  What about deletions?
&lt;/h2&gt;

&lt;p&gt;The AI sometimes "cleans up" by deleting a file. I do not auto-apply deletions. The merge reports them — &lt;code&gt;applied: 3, created: 1, deletedByAi: 1&lt;/code&gt; — and the developer decides whether to follow through.&lt;/p&gt;

&lt;p&gt;This is not a technical limitation. It's a deliberate asymmetry. An accidentally created file is one &lt;code&gt;rm&lt;/code&gt; away from being gone. An accidentally deleted file that the developer hadn't checked in yet is unrecoverable. The merge plays defense on the irreversible side.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why this generalizes
&lt;/h2&gt;

&lt;p&gt;I started building this for obfuscation because I had to. But the pattern — &lt;em&gt;projection, transformation in the projected space, merge back into the original&lt;/em&gt; — shows up in a lot of places once you look for it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Source maps in JavaScript bundlers.&lt;/strong&gt; The bundled file is a projection; the original sources are the real version; you map errors back via the source map.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AST-based refactoring tools.&lt;/strong&gt; The AST is a projection; the textual source has comments and formatting the AST doesn't; round-tripping requires a 3-way merge.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Notebook → script extraction and back.&lt;/strong&gt; Strip cells to a script for review; merge edits back into the notebook.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anything that asks an LLM to edit code with stripped context.&lt;/strong&gt; Hide secrets, hide proprietary names, hide internal comments — and now you own a merge problem.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The takeaway from six months of breaking my own merge: &lt;strong&gt;don't think of the projection as a translation. Think of it as a branch.&lt;/strong&gt; Once you call it a branch, you stop trying to invent a clever inverse and you start writing a merge — which is a problem the industry has spent decades solving.&lt;/p&gt;




&lt;p&gt;If you want to see the merge running on real Java edits, including the tricky cases, the example diffs and test fixtures live in &lt;a href="https://gitlab.com/gbreton7/promptcape-docs" rel="noopener noreferrer"&gt;gitlab.com/gbreton7/promptcape-docs&lt;/a&gt;. It's the docs and worked-examples companion to &lt;strong&gt;&lt;em&gt;&lt;a href="https://promptcape.com" rel="noopener noreferrer"&gt;PromptCape&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;, the obfuscation proxy I'm building for Claude Code and Cursor. MRs welcome if you've run into a merge case I haven't.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>java</category>
      <category>security</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Java Code Obfuscation for AI Assistants: Ensuring the Full Cycle Works</title>
      <dc:creator>Genevieve Breton</dc:creator>
      <pubDate>Mon, 04 May 2026 18:14:55 +0000</pubDate>
      <link>https://dev.to/genevieve_breton_cb795f52/java-code-obfuscation-for-ai-assistants-ensuring-the-full-cycle-works-d5</link>
      <guid>https://dev.to/genevieve_breton_cb795f52/java-code-obfuscation-for-ai-assistants-ensuring-the-full-cycle-works-d5</guid>
      <description>&lt;p&gt;&lt;em&gt;How to obfuscate Java code for AI coding tools while guaranteeing that compilation, tests, and reverse-application all succeed.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;AI coding assistants (Claude Code, Cursor, GitHub Copilot) need access to your source code to help you. But sending proprietary code to an LLM means exposing your business domain, architecture, intellectual property, configuration data, and even personal data.&lt;/p&gt;

&lt;p&gt;Code obfuscation can solve this: rename identifiers before the AI sees the code, let the AI work on the obfuscated version, then reverse the changes back. Simple in theory. In practice, Java's rich ecosystem of frameworks, annotations, and conventions makes this a minefield.&lt;/p&gt;

&lt;p&gt;This article describes what a Java obfuscation tool must handle to guarantee the &lt;strong&gt;full cycle&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Source compiles &amp;amp; tests pass
    -&amp;gt; Obfuscation
        -&amp;gt; AI modifies code
            -&amp;gt; Obfuscated code compiles &amp;amp; tests pass
                -&amp;gt; De-obfuscation (apply)
                    -&amp;gt; Source compiles &amp;amp; tests pass
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each transition can break. Here is what you need to address at each step, and how PromptCape solves it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1: Source -&amp;gt; Obfuscation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1.1 What to rename
&lt;/h3&gt;

&lt;p&gt;A Java obfuscator for AI must rename:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Element&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Package names&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;com.acme.billing&lt;/code&gt; -&amp;gt; &lt;code&gt;pkg_a1b2c3d4&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Reveals company and domain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Class names&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;InvoiceService&lt;/code&gt; -&amp;gt; &lt;code&gt;Cls_e5f6a7b8&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Reveals business concepts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Method names&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;calculateDiscount&lt;/code&gt; -&amp;gt; &lt;code&gt;mtd_1a2b3c4d&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Reveals business logic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Field names&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;customerName&lt;/code&gt; -&amp;gt; &lt;code&gt;fld_9e8d7c6b&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Reveals data model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Comments&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;// Apply VAT to invoice&lt;/code&gt; -&amp;gt; &lt;code&gt;// Processed.&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Reveals business context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Javadoc&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;/** Calculates the total with tax */&lt;/code&gt; -&amp;gt; &lt;code&gt;/** Processed. */&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Config values&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;jdbc:postgresql://prod.acme.com&lt;/code&gt; -&amp;gt; &lt;code&gt;REDACTED&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Reveals infrastructure&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  1.2 What NOT to rename
&lt;/h3&gt;

&lt;p&gt;This is where most naive approaches fail. The following must be preserved:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;JDK types and methods:&lt;/strong&gt; &lt;code&gt;String&lt;/code&gt;, &lt;code&gt;List&lt;/code&gt;, &lt;code&gt;Map&lt;/code&gt;, &lt;code&gt;Optional&lt;/code&gt;, &lt;code&gt;toString&lt;/code&gt;, &lt;code&gt;equals&lt;/code&gt;, &lt;code&gt;hashCode&lt;/code&gt;, &lt;code&gt;main&lt;/code&gt;, &lt;code&gt;stream&lt;/code&gt;, &lt;code&gt;forEach&lt;/code&gt;...&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Framework annotations:&lt;/strong&gt; &lt;code&gt;@Autowired&lt;/code&gt;, &lt;code&gt;@Entity&lt;/code&gt;, &lt;code&gt;@RestController&lt;/code&gt;, &lt;code&gt;@GetMapping&lt;/code&gt;, &lt;code&gt;@JsonProperty&lt;/code&gt;, &lt;code&gt;@Data&lt;/code&gt;, &lt;code&gt;@Builder&lt;/code&gt;...&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Framework-specific identifiers&lt;/strong&gt; that carry semantic meaning for the framework at runtime:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Framework&lt;/th&gt;
&lt;th&gt;What breaks if renamed&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Spring Data JPA&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Derived query methods&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;findByActiveTrue()&lt;/code&gt; -&amp;gt; the method name IS the query. Renaming it to &lt;code&gt;mtd_xxx&lt;/code&gt; makes Spring fail with "No property mtd found"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;JPA/Hibernate&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Entity names in JPQL&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;@Query("SELECT e FROM Invoice e")&lt;/code&gt; — the string &lt;code&gt;Invoice&lt;/code&gt; must match the entity class name&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Lombok&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Generated accessor names&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;@Data&lt;/code&gt; generates &lt;code&gt;getName()&lt;/code&gt; from field &lt;code&gt;name&lt;/code&gt;. If &lt;code&gt;name&lt;/code&gt; is renamed to &lt;code&gt;fld_xxx&lt;/code&gt;, Lombok generates &lt;code&gt;getFld_xxx()&lt;/code&gt; — but code calling &lt;code&gt;getName()&lt;/code&gt; is also renamed to &lt;code&gt;getMtd_xxx()&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Jackson&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;JSON field mapping&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;@JsonProperty&lt;/code&gt; fields, or fields in DTOs in &lt;code&gt;model&lt;/code&gt;/&lt;code&gt;dto&lt;/code&gt; packages — renaming breaks serialization/deserialization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Spring Config&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Property binding&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;@ConfigurationProperties&lt;/code&gt; binds YAML keys to field names&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Bean Validation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Field references&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;@NotBlank&lt;/code&gt; on a field — the constraint message references the field name&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The solution: framework detection (Pass 0).&lt;/strong&gt; Before collecting identifiers, scan the entire project for framework annotations and produce exclusion rules. Each framework has a dedicated detector:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Project scan -&amp;gt; LombokDetector       -&amp;gt; exclude fields + get/set/is accessors
             -&amp;gt; SpringDataDetector   -&amp;gt; exclude findByXxx, countByXxx, existsByXxx methods
             -&amp;gt; JacksonDetector      -&amp;gt; exclude @Entity/@JsonProperty fields
             -&amp;gt; JpaHibernateDetector -&amp;gt; exclude @MappedSuperclass/@Embeddable fields
             -&amp;gt; SpringConfigDetector -&amp;gt; exclude @ConfigurationProperties fields
             -&amp;gt; ValidationDetector   -&amp;gt; exclude @NotBlank/@Min/@Size fields
             -&amp;gt; OpenApiDetector      -&amp;gt; exclude @Schema/@Operation fields and methods
             -&amp;gt; SpringBootDetector   -&amp;gt; track @SpringBootApplication for test fixing
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  1.3 String literals: a hidden trap
&lt;/h3&gt;

&lt;p&gt;Code replacement must skip string literals to avoid breaking values like &lt;code&gt;"Hello World"&lt;/code&gt; or &lt;code&gt;"/api/v1/users"&lt;/code&gt;. But some strings DO reference identifiers:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Context&lt;/th&gt;
&lt;th&gt;String content&lt;/th&gt;
&lt;th&gt;Must be updated?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@Query("SELECT e FROM Invoice e")&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;JPQL entity name&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Class.forName("com.acme.InvoiceService")&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fully qualified class name&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;getMethod("calculateTotal")&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Reflection method name&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@ComponentScan("com.acme.service")&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Package name&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;"Hello World"&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;User-facing string&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;"/api/v1/invoices"&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;REST endpoint&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The obfuscator must apply identifier replacement INSIDE specific string contexts while leaving general strings untouched. This requires post-processing passes for &lt;code&gt;@Query&lt;/code&gt;, reflection calls, and package annotations.&lt;/p&gt;
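&lt;p&gt;As a rough illustration of such a post-processing pass, the sketch below renames an entity only inside &lt;code&gt;@Query("...")&lt;/code&gt; strings and leaves every other string literal alone. The class name, the single-mapping signature, and the obfuscated name used in &lt;code&gt;main&lt;/code&gt; are assumptions made for the example; it also assumes the annotation fits on one line and its string contains no escaped quotes.&lt;/p&gt;

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: apply one identifier mapping inside @Query strings only. */
final class QueryStringRewriter {

    // Matches a single-line @Query("...") annotation; group 1 is the string body.
    private static final Pattern QUERY = Pattern.compile("@Query\\(\"([^\"]*)\"\\)");

    static String rewrite(String source, String realName, String obfName) {
        // Whole-word match, so "Invoice" does not touch "InvoiceLine".
        Pattern word = Pattern.compile("\\b" + Pattern.quote(realName) + "\\b");
        Matcher m = QUERY.matcher(source);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = word.matcher(m.group(1))
                    .replaceAll(Matcher.quoteReplacement(obfName));
            m.appendReplacement(out, Matcher.quoteReplacement("@Query(\"" + body + "\")"));
        }
        m.appendTail(out); // everything outside @Query is copied through unchanged
        return out.toString();
    }

    public static void main(String[] args) {
        String source = "@Query(\"SELECT e FROM Invoice e\")\nString label = \"Invoice\";";
        // The JPQL entity name is renamed; the plain label string is not.
        System.out.println(rewrite(source, "Invoice", "Cls_0a1b2c3d"));
    }
}
```

&lt;p&gt;A real pass would walk the full mapping table and cover the reflection and package-annotation contexts from the table above in the same way: find the specific call or annotation first, then rewrite only inside its string argument.&lt;/p&gt;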

&lt;h3&gt;
  
  
  1.4 Comment stripping and special characters
&lt;/h3&gt;

&lt;p&gt;Comments contain business context that reveals your domain. But stripping them introduces two problems:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Line count changes:&lt;/strong&gt; A multi-line Javadoc becomes a single-line &lt;code&gt;/** Processed. */&lt;/code&gt;, breaking line-number correspondence between obfuscated and original files.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Special characters in comments:&lt;/strong&gt; Comments written in French (or other languages) contain apostrophes (&lt;code&gt;// Service d'injection&lt;/code&gt;), accented characters, and other non-ASCII text. A character-by-character scanner that treats &lt;code&gt;'&lt;/code&gt; as a Java char literal delimiter will be confused by &lt;code&gt;d'injection&lt;/code&gt;, potentially skipping code after the comment.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; Process comments before string/char literal scanning. Replace line comments (&lt;code&gt;//&lt;/code&gt;) in-place (one line in, one line out). For multi-line Javadoc and block comments, accept the line count change and handle it during the reverse-apply step with a 3-way merge.&lt;/p&gt;
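&lt;p&gt;The in-place handling of line comments could look roughly like the Python sketch below. It assumes lines arrive without their trailing newline, it ignores Java text blocks, and the placeholder text is a made-up example rather than what PromptCape emits.&lt;/p&gt;

```python
def strip_line_comment(line: str, placeholder: str = "// ...") -> str:
    """Replace a trailing // comment, keeping exactly one output line.

    The scan tracks string and char literals so that "http://x" is not
    mistaken for a comment, and it returns at the comment start so an
    apostrophe inside the comment is never read as a char delimiter.
    """
    quote = None  # currently open delimiter: '"', "'" or None
    i = 0
    while len(line) > i:
        ch = line[i]
        if quote is not None:
            if ch == "\\":
                i += 2  # skip the escaped character
                continue
            if ch == quote:
                quote = None
        elif ch in ('"', "'"):
            quote = ch
        elif line.startswith("//", i):
            return line[:i] + placeholder
        i += 1
    return line
```

&lt;p&gt;Since the function returns as soon as it meets &lt;code&gt;//&lt;/code&gt; outside a literal, the text of the comment is never scanned at all, which is what makes the French apostrophe harmless.&lt;/p&gt;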




&lt;h2&gt;
  
  
  Step 2: Obfuscated code -&amp;gt; AI modification -&amp;gt; Compilation &amp;amp; tests
&lt;/h2&gt;

&lt;h3&gt;
  
  
  2.1 The obfuscated code must compile
&lt;/h3&gt;

&lt;p&gt;This seems obvious but is surprisingly hard. Even with framework detection, some identifiers cause compilation failures that can only be detected by actually compiling. Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A method name that collides with a JDK method after obfuscation&lt;/li&gt;
&lt;li&gt;A field name that matches a Java keyword&lt;/li&gt;
&lt;li&gt;An annotation processor that generates code based on identifier names&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Solution: auto-fix loop.&lt;/strong&gt; Compile the obfuscated code. If it fails, parse the compiler errors, reverse-map the broken identifiers, add them to an exclusion list, and re-obfuscate. Repeat until green or max iterations reached. Persist exclusions for future runs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Obfuscate -&amp;gt; Compile -&amp;gt; Parse errors -&amp;gt; Exclude broken identifiers -&amp;gt; Re-obfuscate -&amp;gt; Compile -&amp;gt; ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
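&lt;p&gt;As a rough Python sketch, the loop can be written against three caller-supplied functions. The names &lt;code&gt;obfuscate&lt;/code&gt;, &lt;code&gt;build&lt;/code&gt;, and &lt;code&gt;parse_broken&lt;/code&gt; are hypothetical placeholders for the real obfuscation run, the build command, and the compiler-error parser.&lt;/p&gt;

```python
def auto_fix(obfuscate, build, parse_broken, exclusions=None, max_iterations=5):
    """Re-obfuscate until the build is green or the iteration budget runs out.

    obfuscate(exclusions) writes the workspace, build() returns
    (ok, compiler_output), and parse_broken(output) returns the original
    names of the identifiers blamed by the compiler. All three are
    hypothetical callables supplied by the caller.
    """
    exclusions = set(exclusions or ())
    iteration = 0
    for iteration in range(1, max_iterations + 1):
        obfuscate(exclusions)
        ok, output = build()
        if ok:
            return True, exclusions, iteration
        broken = set(parse_broken(output)) - exclusions
        if not broken:
            break  # nothing new to exclude, so retrying cannot help
        exclusions |= broken
    return False, exclusions, iteration
```

&lt;p&gt;The returned exclusion set is what would be persisted and passed back in as &lt;code&gt;exclusions&lt;/code&gt; on the next run.&lt;/p&gt;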



&lt;h3&gt;
  
  
  2.2 Tests must pass on obfuscated code
&lt;/h3&gt;

&lt;p&gt;Compilation is necessary but not sufficient. Tests exercise the runtime behavior where framework conventions matter most:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Spring context loading:&lt;/strong&gt; &lt;code&gt;@SpringBootTest&lt;/code&gt; boots the full application context. A broken repository method or missing bean crashes the entire test suite.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spring Data query derivation:&lt;/strong&gt; happens at context startup, not at compile time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JPA schema generation:&lt;/strong&gt; Hibernate creates tables from &lt;code&gt;@Entity&lt;/code&gt; classes. If JPQL &lt;code&gt;@Query&lt;/code&gt; strings reference the original entity name but the class is renamed, the context fails.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;H2 compatibility:&lt;/strong&gt; Test profiles often use H2 instead of PostgreSQL. Database-specific types (&lt;code&gt;JSONB&lt;/code&gt;, &lt;code&gt;ARRAY&lt;/code&gt;) in column definitions fail on H2 regardless of obfuscation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key insight:&lt;/strong&gt; If the source tests pass and the obfuscated tests don't, the obfuscation broke something. The auto-fix loop should therefore use &lt;code&gt;mvn test&lt;/code&gt; as the build command to catch these failures; &lt;code&gt;mvn test-compile&lt;/code&gt; only catches compilation errors in the test sources, not failures that appear at context startup.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.3 The AI must be able to work effectively
&lt;/h3&gt;

&lt;p&gt;The AI needs to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read and understand the code structure (even with obfuscated names)&lt;/li&gt;
&lt;li&gt;Create new files, classes, and methods&lt;/li&gt;
&lt;li&gt;Modify existing code&lt;/li&gt;
&lt;li&gt;Run builds and tests to verify its work&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The obfuscated names should be &lt;strong&gt;deterministic&lt;/strong&gt; (same input always produces the same hash) so the AI can learn patterns across files. Prefixes (&lt;code&gt;Cls_&lt;/code&gt;, &lt;code&gt;mtd_&lt;/code&gt;, &lt;code&gt;fld_&lt;/code&gt;, &lt;code&gt;pkg_&lt;/code&gt;) help the AI understand the identifier type.&lt;/p&gt;
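&lt;p&gt;One way to get deterministic, prefixed names is a keyed hash of the original identifier. The Python sketch below assumes HMAC-SHA256 truncated to 8 hex characters and a per-project secret key; the truncation length and the key handling are assumptions made for illustration.&lt;/p&gt;

```python
import hashlib
import hmac

PREFIXES = {"class": "Cls_", "method": "mtd_", "field": "fld_", "package": "pkg_"}


def obfuscated_name(kind: str, original: str, key: bytes) -> str:
    """Derive a stable obfuscated name from the original identifier."""
    digest = hmac.new(key, original.encode("utf-8"), hashlib.sha256).hexdigest()
    # The prefix carries the identifier type, the hash carries the identity.
    return PREFIXES[kind] + digest[:8]  # 8 hex chars is an assumption
```

&lt;p&gt;With the same key, the same name always maps to the same token on every run, while someone without the key cannot confirm a guessed name by hashing it.&lt;/p&gt;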




&lt;h2&gt;
  
  
  Step 3: De-obfuscation (apply) -&amp;gt; Source compiles &amp;amp; tests pass
&lt;/h2&gt;

&lt;p&gt;This is where most obfuscation tools stop — they handle the forward direction but not the reverse. For AI coding, the reverse is just as critical.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 Only apply what the AI changed
&lt;/h3&gt;

&lt;p&gt;The naive approach: read the obfuscated file, de-obfuscate all identifiers, overwrite the real file. This breaks because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Comments were stripped during obfuscation.&lt;/strong&gt; The de-obfuscated file has &lt;code&gt;/** Processed. */&lt;/code&gt; where the original had full Javadoc. If the AI didn't touch that line, the original comment should be preserved.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Formatting may differ.&lt;/strong&gt; The obfuscated file may have different whitespace or line endings.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Solution: 3-way merge.&lt;/strong&gt; Compare the snapshot (obfuscated, pre-AI) with the cache (obfuscated, post-AI) line by line:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lines unchanged by the AI -&amp;gt; keep the original source line&lt;/li&gt;
&lt;li&gt;Lines modified by the AI -&amp;gt; de-obfuscate the new version
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Snapshot line == Cache line?
    Yes -&amp;gt; keep original source line (preserves comments, formatting)
    No  -&amp;gt; de-obfuscate cache line (AI changed it)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For added/removed lines, use chunk-based alignment to find sync points and apply the changes surgically.&lt;/p&gt;
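&lt;p&gt;A simplified Python sketch of this merge, using the standard &lt;code&gt;difflib&lt;/code&gt; module to find the sync points. It assumes the original file and the snapshot are line-aligned, which is not true once multi-line comments have been collapsed, and &lt;code&gt;deobfuscate&lt;/code&gt; is a hypothetical function that reverse-maps a single line.&lt;/p&gt;

```python
from difflib import SequenceMatcher


def merge_ai_changes(original, snapshot, cache, deobfuscate):
    """Keep original lines the AI left alone; de-obfuscate what it changed.

    original, snapshot and cache are lists of lines. This sketch assumes
    original and snapshot are line-aligned (same length), which holds only
    when obfuscation kept one line in, one line out.
    """
    if len(original) != len(snapshot):
        raise ValueError("original and snapshot must be line-aligned")
    merged = []
    matcher = SequenceMatcher(a=snapshot, b=cache, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            merged.extend(original[i1:i2])  # untouched: keep the real source
        else:
            # replace, insert or delete: take the AI's lines (possibly none)
            merged.extend(deobfuscate(line) for line in cache[j1:j2])
    return merged
```

&lt;p&gt;Unchanged regions come straight from the original file, so stripped comments and original formatting survive without ever being reconstructed.&lt;/p&gt;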

&lt;h3&gt;
  
  
  3.2 Handle AI-generated variable names
&lt;/h3&gt;

&lt;p&gt;When the AI creates a new variable for an obfuscated class, it invents a name based on what it sees:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// AI writes:&lt;/span&gt;
&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="n"&gt;Cls_f45371c4&lt;/span&gt; &lt;span class="n"&gt;fld_f45371c4&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Standard de-obfuscation produces:&lt;/span&gt;
&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;ZipBuilderService&lt;/span&gt; &lt;span class="n"&gt;fld_f45371c4&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// class de-obfuscated, but variable name is unreadable&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The variable name &lt;code&gt;fld_f45371c4&lt;/code&gt; is not in the mapping registry — the AI invented it. But the hash &lt;code&gt;f45371c4&lt;/code&gt; matches the known class &lt;code&gt;ZipBuilderService&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; After standard de-obfuscation, scan for remaining &lt;code&gt;fld_XXXXXXXX&lt;/code&gt;/&lt;code&gt;Cls_XXXXXXXX&lt;/code&gt;/&lt;code&gt;mtd_XXXXXXXX&lt;/code&gt; patterns. If the hash matches a known entry, generate a camelCase variable name from the original name:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;ZipBuilderService&lt;/span&gt; &lt;span class="n"&gt;zipBuilderService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// readable&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Track each unique token across the file to ensure consistent renaming (declaration and all usages get the same name).&lt;/p&gt;
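&lt;p&gt;A Python sketch of that second pass, assuming 8-character lowercase hex hashes and a hypothetical &lt;code&gt;class_by_hash&lt;/code&gt; lookup built from the mapping registry. It does not check whether the generated name collides with an existing variable, which a real implementation would need to do.&lt;/p&gt;

```python
import re

# Leftover tokens: a known prefix followed by an 8-character hex hash.
TOKEN = re.compile(r"\b(?:Cls|fld|mtd)_([0-9a-f]{8})\b")


def resolve_invented_names(text: str, class_by_hash: dict) -> str:
    """Rename leftover prefixed tokens whose hash matches a known class."""

    def camel(match):
        original = class_by_hash.get(match.group(1))
        if original is None:
            return match.group(0)  # unknown hash: leave the token untouched
        return original[0].lower() + original[1:]

    return TOKEN.sub(camel, text)
```

&lt;p&gt;Running the substitution over the whole file at once means the declaration and every usage of the same token receive the same readable name.&lt;/p&gt;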

&lt;h3&gt;
  
  
  3.3 Don't apply build artifacts
&lt;/h3&gt;

&lt;p&gt;The AI may run &lt;code&gt;mvn package&lt;/code&gt; in the obfuscated workspace, creating &lt;code&gt;target/&lt;/code&gt; with compiled &lt;code&gt;.class&lt;/code&gt; files, &lt;code&gt;.jar&lt;/code&gt; archives, and test reports. These must be excluded from the diff detection:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Skip directories: &lt;code&gt;target/&lt;/code&gt;, &lt;code&gt;build/&lt;/code&gt;, &lt;code&gt;node_modules/&lt;/code&gt;, &lt;code&gt;.idea/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Skip binary files: &lt;code&gt;.class&lt;/code&gt;, &lt;code&gt;.jar&lt;/code&gt;, &lt;code&gt;.war&lt;/code&gt;, images, fonts&lt;/li&gt;
&lt;li&gt;These patterns match what the obfuscation engine already skips&lt;/li&gt;
&lt;/ul&gt;
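&lt;p&gt;A filter along these lines can be sketched in Python as follows. The directory list comes from the bullets above, while the image and font extensions are assumptions added for illustration.&lt;/p&gt;

```python
from pathlib import PurePosixPath

SKIP_DIRS = {"target", "build", "node_modules", ".idea"}
# .class/.jar/.war come from the list above; the rest are assumed examples.
SKIP_SUFFIXES = {".class", ".jar", ".war", ".png", ".jpg", ".gif", ".woff", ".woff2", ".ttf"}


def is_candidate(relative_path: str) -> bool:
    """Return True if a workspace file should take part in diff detection."""
    path = PurePosixPath(relative_path)
    # Only directory components are checked, so src/build.gradle still passes.
    if SKIP_DIRS.intersection(path.parts[:-1]):
        return False
    return path.suffix.lower() not in SKIP_SUFFIXES
```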

&lt;h3&gt;
  
  
  3.4 Snapshot management
&lt;/h3&gt;

&lt;p&gt;The apply command needs a "before" snapshot to detect what the AI changed. After a successful apply, the snapshot is updated. But if the apply fails or the user reverts with &lt;code&gt;git restore&lt;/code&gt;, the snapshot is out of sync.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Don't update the snapshot when the apply has errors&lt;/li&gt;
&lt;li&gt;Provide a &lt;code&gt;--reset-snapshot&lt;/code&gt; option that re-obfuscates the source into the snapshot directory without touching the cache&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The complete cycle
&lt;/h2&gt;

&lt;p&gt;Here is what must work end-to-end:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. mvn test                       -&amp;gt; GREEN (source is healthy)
2. promptcape obfuscate --verify  -&amp;gt; Obfuscated workspace created
3. mvn test (in workspace)        -&amp;gt; GREEN (obfuscation didn't break anything)
4. AI modifies obfuscated code
5. mvn test (in workspace)        -&amp;gt; GREEN (AI changes work)
6. promptcape apply               -&amp;gt; Changes applied to source
7. mvn test                       -&amp;gt; GREEN (de-obfuscated changes work)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each transition requires specific handling:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Transition&lt;/th&gt;
&lt;th&gt;Challenge&lt;/th&gt;
&lt;th&gt;Solution&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1 -&amp;gt; 2&lt;/td&gt;
&lt;td&gt;Framework identifiers break&lt;/td&gt;
&lt;td&gt;Framework detection (8 detectors)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1 -&amp;gt; 2&lt;/td&gt;
&lt;td&gt;Some identifiers cause compile errors&lt;/td&gt;
&lt;td&gt;Auto-fix loop with exclusion persistence&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2 -&amp;gt; 3&lt;/td&gt;
&lt;td&gt;JPQL strings reference original names&lt;/td&gt;
&lt;td&gt;Post-processing: replace entity names in &lt;code&gt;@Query&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2 -&amp;gt; 3&lt;/td&gt;
&lt;td&gt;Reflection strings reference original names&lt;/td&gt;
&lt;td&gt;Post-processing: replace in &lt;code&gt;getMethod()&lt;/code&gt;, &lt;code&gt;forName()&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2 -&amp;gt; 3&lt;/td&gt;
&lt;td&gt;Spring Data query derivation fails&lt;/td&gt;
&lt;td&gt;Repository method name protection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4 -&amp;gt; 5&lt;/td&gt;
&lt;td&gt;AI must understand the code&lt;/td&gt;
&lt;td&gt;Deterministic naming, type prefixes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5 -&amp;gt; 6&lt;/td&gt;
&lt;td&gt;Comments stripped during obfuscation&lt;/td&gt;
&lt;td&gt;3-way merge (only apply AI-changed lines)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5 -&amp;gt; 6&lt;/td&gt;
&lt;td&gt;AI invents unreadable variable names&lt;/td&gt;
&lt;td&gt;Hash-based name resolution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5 -&amp;gt; 6&lt;/td&gt;
&lt;td&gt;Build artifacts in workspace&lt;/td&gt;
&lt;td&gt;Directory and binary file filtering&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6 -&amp;gt; 7&lt;/td&gt;
&lt;td&gt;Applied changes don't compile&lt;/td&gt;
&lt;td&gt;User review + re-apply capability&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  What PromptCape implements
&lt;/h2&gt;

&lt;p&gt;PromptCape is a Java-first obfuscation tool designed for this exact cycle. Here is what it covers today:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Obfuscation engine:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AST-based identifier collection via JavaParser (packages, classes, methods, fields, enums, records)&lt;/li&gt;
&lt;li&gt;Deterministic HMAC-SHA256 naming with type prefixes&lt;/li&gt;
&lt;li&gt;Package hierarchy flattening&lt;/li&gt;
&lt;li&gt;Word-boundary replacement (&lt;code&gt;\b&lt;/code&gt;) with longest-match-first ordering&lt;/li&gt;
&lt;li&gt;String literal preservation with post-processing for &lt;code&gt;@Query&lt;/code&gt;, reflection, &lt;code&gt;@ComponentScan&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Full comment stripping (Javadoc, block, and line comments)&lt;/li&gt;
&lt;li&gt;POM, properties, YAML, and XML file sanitization&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Framework detection (8 detectors):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lombok: field + accessor protection&lt;/li&gt;
&lt;li&gt;Spring Boot: application class tracking, test annotation fixing&lt;/li&gt;
&lt;li&gt;Spring Data: repository derived query method protection&lt;/li&gt;
&lt;li&gt;JPA/Hibernate: entity field protection, JPQL entity name replacement&lt;/li&gt;
&lt;li&gt;Jackson: DTO/entity field protection&lt;/li&gt;
&lt;li&gt;Spring Config: property-bound field protection&lt;/li&gt;
&lt;li&gt;Validation: constraint field protection&lt;/li&gt;
&lt;li&gt;OpenAPI: schema field and method protection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Auto-fix:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compile-and-fix loop with configurable build command&lt;/li&gt;
&lt;li&gt;Compiler error parsing and reverse mapping&lt;/li&gt;
&lt;li&gt;Persistent exclusion lists across runs&lt;/li&gt;
&lt;li&gt;Source verification option&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Reverse application:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;3-way merge (preserve original lines for unchanged content)&lt;/li&gt;
&lt;li&gt;AI-generated variable name resolution (hash-based)&lt;/li&gt;
&lt;li&gt;Build artifact and binary file exclusion&lt;/li&gt;
&lt;li&gt;Snapshot management with reset capability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Two modes:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CLI workspace (obfuscate -&amp;gt; AI works -&amp;gt; apply)&lt;/li&gt;
&lt;li&gt;HTTP proxy (transparent interception for IDE-based tools — see below)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Metrics:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Final identifier and duration counters at the end of every run, for instance:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;+-------------------------------+----------+
| Final Summary                 |          |
+-------------------------------+----------+
| Iterations                    |       4  |
| Identifiers obfuscated        |    3287  |
| Packages (flattened)          |      74  |
| Exclusions loaded (previous)  |       0  |
| Exclusions added (this run)   |     152  |
| Exclusions total              |     152  |
| Verification time             |  106,1s  |
| Total time                    |  224,5s  |
+-------------------------------+----------+
| Compilation                   |    OK    |
+-------------------------------+----------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Seamless IDE integration
&lt;/h2&gt;

&lt;p&gt;The obfuscation cycle described above can run as a one-shot CLI workflow, but friction kills adoption. Developers don't want to leave their IDE, run &lt;code&gt;promptcape obfuscate&lt;/code&gt;, switch to a workspace folder, ask the AI to do something, then run &lt;code&gt;promptcape apply&lt;/code&gt; and switch back. They want the assistant they already use, in the IDE they already use, with the obfuscation invisible.&lt;/p&gt;

&lt;p&gt;PromptCape provides this via an &lt;strong&gt;HTTP proxy mode&lt;/strong&gt; that intercepts traffic to the AI provider and applies the same forward/reverse cycle on the fly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;IDE -&amp;gt; Claude Code -&amp;gt; [PromptCape proxy] -&amp;gt; Anthropic API
                          obfuscates the prompt going out
                          de-obfuscates the response coming back
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The "PromptCape Claude" terminal in Cursor
&lt;/h3&gt;

&lt;p&gt;The simplest integration is a dedicated terminal profile. In Cursor (and equally in VS Code or any IDE that supports terminal profiles), you create a profile named &lt;strong&gt;PromptCape Claude&lt;/strong&gt; that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Starts the proxy in the background if it is not already running&lt;/li&gt;
&lt;li&gt;Sets &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; (and equivalent variables) to point Claude Code at the local proxy&lt;/li&gt;
&lt;li&gt;Launches &lt;code&gt;claude&lt;/code&gt; (the Claude Code CLI) inside that environment&lt;/li&gt;
&lt;/ol&gt;
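&lt;p&gt;Steps 2 and 3 can be sketched as a small Python launcher that a terminal profile would invoke. The proxy address is a hypothetical local port, and step 1 is left out because the command that starts the proxy is not shown here.&lt;/p&gt;

```python
import os
import subprocess

# Hypothetical local address; use whatever port your proxy listens on.
DEFAULT_PROXY_URL = "http://127.0.0.1:8080"


def proxied_environment(base_env=None, proxy_url=DEFAULT_PROXY_URL):
    """Copy the environment and point Claude Code at the local proxy."""
    env = dict(os.environ if base_env is None else base_env)
    env["ANTHROPIC_BASE_URL"] = proxy_url
    return env


def launch_claude(proxy_url=DEFAULT_PROXY_URL):
    """Run the claude CLI inside the proxied environment (steps 2 and 3)."""
    return subprocess.call(["claude"], env=proxied_environment(proxy_url=proxy_url))
```

&lt;p&gt;Only the child process sees the override, so the default terminal keeps talking to the provider directly.&lt;/p&gt;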

&lt;p&gt;From the developer's perspective, this is just &lt;strong&gt;another terminal in the IDE sidebar&lt;/strong&gt;. They open the &lt;em&gt;PromptCape Claude&lt;/em&gt; terminal instead of the default one, type their request to Claude as usual, and watch the AI work on their codebase. Behind the scenes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Outbound prompt:&lt;/strong&gt; identifiers, comments, and config values are obfuscated before leaving the machine&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inbound response:&lt;/strong&gt; file edits, suggestions, and explanations are de-obfuscated before reaching the IDE&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build artifacts and binaries&lt;/strong&gt; are filtered out of the cycle&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No workflow change. No &lt;code&gt;obfuscate&lt;/code&gt; or &lt;code&gt;apply&lt;/code&gt; command to remember. The same Claude Code experience, with the obfuscation guaranteeing that &lt;strong&gt;what reaches the provider is not your real source code&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why a terminal profile is the right shape for this
&lt;/h3&gt;

&lt;p&gt;The CLI workspace is the right primitive — it gives full control and fits CI/CD or one-shot review use cases. But for daily AI-assisted coding, friction wins or loses the security battle. A proxy that hooks into the existing tool's trust chain (env vars, &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt;) gives:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Zero training cost:&lt;/strong&gt; developers keep using Claude Code exactly as before — same commands, same outputs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero forgotten steps:&lt;/strong&gt; there is no &lt;code&gt;apply&lt;/code&gt; to forget — the response is reverse-mapped on the wire&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-project configuration:&lt;/strong&gt; terminal profiles ship in &lt;code&gt;.vscode/settings.json&lt;/code&gt;, &lt;code&gt;.cursor/&lt;/code&gt;, or JetBrains run configurations, so opening a project pre-configures the secure terminal automatically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auditability by default:&lt;/strong&gt; every prompt and response transits the proxy, which can log, redact, or block on policy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The same pattern extends to any AI tool that respects a base-URL override (Cursor's built-in chat, Aider, Continue.dev, OpenAI-compatible clients, etc.). The IDE doesn't need a plugin and the AI tool doesn't need to know the proxy exists — the integration is just a terminal away.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Java obfuscation for AI coding assistants is not just about renaming identifiers. It requires deep understanding of how Java frameworks use naming conventions, how annotation processors derive behavior from names, and how to surgically apply AI changes without losing information that was stripped during obfuscation.&lt;/p&gt;

&lt;p&gt;The key insight: &lt;strong&gt;framework detection before obfuscation is more effective than reactive error fixing after.&lt;/strong&gt; Proactively protecting Spring Data repository methods, JPA entity fields, and Lombok-generated accessors eliminates most compilation failures before they happen.&lt;/p&gt;

&lt;p&gt;The second insight: &lt;strong&gt;the reverse direction is just as hard as the forward.&lt;/strong&gt; A 3-way merge that only applies AI-changed lines, combined with hash-based resolution of AI-invented names, makes the de-obfuscated code readable and correct.&lt;/p&gt;

&lt;p&gt;The third insight: &lt;strong&gt;friction kills adoption, so the obfuscation has to disappear into the IDE.&lt;/strong&gt; A dedicated terminal profile (the &lt;em&gt;PromptCape Claude&lt;/em&gt; terminal in Cursor) that boots Claude Code through the proxy turns the entire cycle into a transparent operation — same tool, same commands, no extra steps. Security that requires discipline gets bypassed; security that ships as a terminal in the sidebar gets used.&lt;/p&gt;

&lt;p&gt;PromptCape is open for trial at &lt;a href="https://promptcape.com/" rel="noopener noreferrer"&gt;promptcape.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>java</category>
      <category>privacy</category>
      <category>security</category>
    </item>
    <item>
      <title>Why Your Source Code Is at Risk When Using AI Coding Assistants, but no dev future without AI coding!</title>
      <dc:creator>Genevieve Breton</dc:creator>
      <pubDate>Fri, 01 May 2026 16:41:29 +0000</pubDate>
      <link>https://dev.to/genevieve_breton_cb795f52/why-your-source-code-is-at-risk-when-using-ai-coding-assistants-but-no-dev-future-without-ai-5513</link>
      <guid>https://dev.to/genevieve_breton_cb795f52/why-your-source-code-is-at-risk-when-using-ai-coding-assistants-but-no-dev-future-without-ai-5513</guid>
      <description>&lt;p&gt;&lt;em&gt;Every line you send to an AI coding tool leaves your control. Here's what that means for your business, your clients, and your legal obligations.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  You are sending your source code to a foreign server
&lt;/h2&gt;

&lt;p&gt;When you use Claude Code, Cursor, GitHub Copilot, ChatGPT, Mistral Vibe, or any LLM-based coding assistant, your source code is sent over HTTPS to a remote API. That API runs on servers you don't control, in a jurisdiction you didn't choose, operated by a company whose data practices you've accepted by clicking "I agree."&lt;/p&gt;

&lt;p&gt;Let's be specific about where your code goes:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;API provider&lt;/th&gt;
&lt;th&gt;Server locations&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claude Code / Cursor (Claude)&lt;/td&gt;
&lt;td&gt;Anthropic&lt;/td&gt;
&lt;td&gt;US (AWS us-east, us-west)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GitHub Copilot&lt;/td&gt;
&lt;td&gt;Microsoft / OpenAI&lt;/td&gt;
&lt;td&gt;US (Azure data centers)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ChatGPT&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;US (Azure data centers)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cursor (OpenAI mode)&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mistral Vibe / Le Chat&lt;/td&gt;
&lt;td&gt;Mistral AI&lt;/td&gt;
&lt;td&gt;EU (France, via cloud providers)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DeepSeek&lt;/td&gt;
&lt;td&gt;DeepSeek&lt;/td&gt;
&lt;td&gt;China&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini Code Assist&lt;/td&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;td&gt;US (GCP data centers)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Most developers don't think twice about this. They open their IDE, the AI suggests code, they accept. Behind the scenes, the IDE sent the contents of the current file — and often surrounding files, imports, and project context — to a server thousands of kilometers away.&lt;/p&gt;




&lt;h2&gt;
  
  
  What exactly is being sent?
&lt;/h2&gt;

&lt;p&gt;It's not just "a few lines of code." Modern AI coding tools send rich context to produce better suggestions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The current file&lt;/strong&gt; — full content, not just the cursor position&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Open tabs and imported files&lt;/strong&gt; — the AI reads your project structure&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File paths&lt;/strong&gt; — revealing your package hierarchy (&lt;code&gt;com.acme.billing.service.InvoiceService&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configuration files&lt;/strong&gt; — &lt;code&gt;application.yml&lt;/code&gt;, &lt;code&gt;pom.xml&lt;/code&gt;, &lt;code&gt;.env&lt;/code&gt; with database URLs, API keys, internal hostnames&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Comments and Javadoc&lt;/strong&gt; — containing business logic descriptions, TODO items, bug references&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test files&lt;/strong&gt; — revealing edge cases, business rules, validation logic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Git context&lt;/strong&gt; — commit messages, branch names, sometimes diffs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A single prompt to an AI coding assistant can contain more context about your business than a 10-page architecture document.&lt;/p&gt;




&lt;h2&gt;
  
  
  The risks are real and specific
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Source code leakage
&lt;/h3&gt;

&lt;p&gt;Your code is transmitted to and processed on third-party infrastructure. Even if the provider promises not to train on your data (and many do), the code still:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Transits through networks&lt;/strong&gt; you don't control — intermediate proxies, load balancers, logging systems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Is stored temporarily&lt;/strong&gt; for processing — cache layers, request logs, debugging infrastructure&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;May be retained for abuse detection&lt;/strong&gt; — most providers log requests for safety monitoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Could be subpoenaed&lt;/strong&gt; — US providers are subject to US law enforcement requests, including the CLOUD Act which allows cross-border data access&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The question is not "will the provider deliberately steal my code?" It's "how many systems touch my code between my IDE and the model, and who has access to those systems?"&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Intellectual property exposure
&lt;/h3&gt;

&lt;p&gt;Source code is a trade secret. Once exposed, trade secret protection can be lost permanently — unlike patents or copyrights, trade secrets only have value as long as they remain secret.&lt;/p&gt;

&lt;p&gt;What your code reveals:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Element&lt;/th&gt;
&lt;th&gt;What it exposes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Class and method names&lt;/td&gt;
&lt;td&gt;Your business domain and capabilities (&lt;code&gt;FraudDetector&lt;/code&gt;, &lt;code&gt;TaxCalculator&lt;/code&gt;, &lt;code&gt;PatentAnalyzer&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Package structure&lt;/td&gt;
&lt;td&gt;Your architecture and module boundaries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Algorithm implementations&lt;/td&gt;
&lt;td&gt;Your competitive advantage (pricing logic, recommendation engines, risk models)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database schema&lt;/td&gt;
&lt;td&gt;Your data model and relationships&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API endpoints&lt;/td&gt;
&lt;td&gt;Your service surface and capabilities&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Configuration&lt;/td&gt;
&lt;td&gt;Your infrastructure topology&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Comments&lt;/td&gt;
&lt;td&gt;Your business rules in plain language&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A competitor with access to your AI provider's logs could reconstruct your product's architecture, business rules, and technical approach without ever seeing your actual repository.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Client code exposure (integrators and freelancers)
&lt;/h3&gt;

&lt;p&gt;If you're a &lt;strong&gt;consulting firm&lt;/strong&gt;, &lt;strong&gt;systems integrator&lt;/strong&gt;, or &lt;strong&gt;freelance developer&lt;/strong&gt;, the risk multiplies. You're not just exposing your own code — you're exposing your client's code.&lt;/p&gt;

&lt;p&gt;Consider the scenarios:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;You customize an ERP for a bank.&lt;/strong&gt; You send controller code to Claude that contains transaction processing logic, compliance rules, and internal API endpoints. That code belongs to the bank, not to you.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;You build a SaaS platform for a healthcare company.&lt;/strong&gt; You use Copilot while working on patient data models. HIPAA-regulated data structures are now on Microsoft's servers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;You maintain a defense contractor's codebase.&lt;/strong&gt; You use an AI to debug a networking module. The code may be subject to ITAR export controls — sending it to a US cloud provider may technically comply, but sending it to a Chinese provider (DeepSeek) would be a violation.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most client contracts include clauses about code confidentiality and data handling. Using AI coding tools on client code may violate these contracts, and the client may never know until a breach occurs. If one does occur and you are the one responsible for the code, it becomes a very painful stone in your shoe.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Regulatory and compliance risks
&lt;/h3&gt;

&lt;p&gt;Depending on your industry and jurisdiction, sending source code to external AI services can create compliance issues:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Regulation&lt;/th&gt;
&lt;th&gt;Risk&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;GDPR&lt;/strong&gt; (EU)&lt;/td&gt;
&lt;td&gt;If your code processes personal data and the code itself contains PII patterns, field names, or test data, sending it to a US server may violate data transfer rules&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SOC 2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Requires documented controls over data access. Using AI tools without data loss prevention (DLP) controls can cause an audit failure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ISO 27001&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Requires risk assessment for third-party data processing. AI coding tools are a new attack vector&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;HIPAA&lt;/strong&gt; (US healthcare)&lt;/td&gt;
&lt;td&gt;Code containing PHI field names, validation rules, or test fixtures with patient data patterns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PCI DSS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Code handling payment card data, encryption keys, or tokenization logic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;ITAR&lt;/strong&gt; (US defense)&lt;/td&gt;
&lt;td&gt;Export-controlled technical data cannot be shared with foreign persons or servers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;NIS2&lt;/strong&gt; (EU)&lt;/td&gt;
&lt;td&gt;Critical infrastructure operators must control their software supply chain&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Even if you're not in a regulated industry, your clients might be. And their auditors will ask how their code is protected.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. The training data question
&lt;/h3&gt;

&lt;p&gt;Most AI providers now offer policies like "we don't train on your data." But:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Policies change.&lt;/strong&gt; OpenAI initially trained on API data, then reversed course after backlash. Today's policy may not be tomorrow's.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Policies have exceptions.&lt;/strong&gt; Abuse detection, safety monitoring, and model evaluation may still use your data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Free tiers have different rules.&lt;/strong&gt; ChatGPT Free explicitly trains on your conversations. Many developers prototype with the free tier before switching to paid.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Subprocessors matter.&lt;/strong&gt; The AI provider may not train on your data, but what about their cloud provider? Their logging vendor? Their CDN?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data breaches happen.&lt;/strong&gt; Samsung semiconductor engineers pasted proprietary source code and internal meeting notes into ChatGPT in 2023. OpenAI suffered an incident in March 2023 in which some users could see other users' chat titles. Even Claude Code has recently leaked!&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The safest assumption: anything you send to an AI service should be treated as if it could become public.&lt;/p&gt;




&lt;h2&gt;
  
  
  The false sense of security
&lt;/h2&gt;

&lt;h3&gt;
  
  
  "But we use the enterprise plan"
&lt;/h3&gt;

&lt;p&gt;Enterprise plans typically offer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No training on your data&lt;/li&gt;
&lt;li&gt;Data processing agreements (DPAs)&lt;/li&gt;
&lt;li&gt;SOC 2 compliance of the provider&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What they don't offer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Control over where the data is processed&lt;/li&gt;
&lt;li&gt;Guarantees about intermediate systems&lt;/li&gt;
&lt;li&gt;Protection against subpoenas or government data requests&lt;/li&gt;
&lt;li&gt;Deletion verification (you can't audit what you can't see)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  "But we use a self-hosted model"
&lt;/h3&gt;

&lt;p&gt;Self-hosted models (Llama, Mistral, CodeLlama) solve the data residency problem but introduce others:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dramatically lower code quality compared to frontier models&lt;/li&gt;
&lt;li&gt;Significant infrastructure costs&lt;/li&gt;
&lt;li&gt;No access to the latest model capabilities (Claude Opus, GPT-4o)&lt;/li&gt;
&lt;li&gt;Still requires GPU infrastructure that someone must maintain&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  "But we only send small snippets"
&lt;/h3&gt;

&lt;p&gt;AI coding tools send more context than you think. And even small snippets reveal information:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// "Just a small function"&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;BigDecimal&lt;/span&gt; &lt;span class="nf"&gt;calculateRoyalty&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Contract&lt;/span&gt; &lt;span class="n"&gt;contract&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;SalesReport&lt;/span&gt; &lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;BigDecimal&lt;/span&gt; &lt;span class="n"&gt;baseRate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;contract&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getRoyaltyRate&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="nc"&gt;BigDecimal&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getNetSales&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;subtract&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getReturns&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;contract&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;hasMinimumGuarantee&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;multiply&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;baseRate&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;max&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;contract&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMinimumGuarantee&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;multiply&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;baseRate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This "small snippet" reveals: you have a royalty calculation business, contracts have minimum guarantees, you track returns separately from net sales, and your financial model uses &lt;code&gt;BigDecimal&lt;/code&gt; precision. A competitor now knows your pricing model structure.&lt;/p&gt;




&lt;h2&gt;
  
  
  The solution: pseudonymize and obfuscate before sending
&lt;/h2&gt;

&lt;p&gt;The principle is simple: &lt;strong&gt;rename everything that reveals business meaning before the AI sees it, then reverse the renaming when applying the AI's changes.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Your code:                          What the AI sees:
calculateRoyalty()          -&amp;gt;      mtd_a1b2c3d4()
Contract contract           -&amp;gt;      Cls_e5f6a7b8 fld_9c8d7e6f
getRoyaltyRate()            -&amp;gt;      mtd_1a2b3c4d()
hasMinimumGuarantee()       -&amp;gt;      mtd_5e6f7a8b()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AI can still:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understand the code structure (types, control flow, patterns)&lt;/li&gt;
&lt;li&gt;Suggest refactorings and bug fixes&lt;/li&gt;
&lt;li&gt;Add new functionality&lt;/li&gt;
&lt;li&gt;Write tests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What it cannot do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Infer your business domain&lt;/li&gt;
&lt;li&gt;Reconstruct your architecture from meaningful names&lt;/li&gt;
&lt;li&gt;Extract business rules from comments (stripped)&lt;/li&gt;
&lt;li&gt;Identify your company from package names (flattened)&lt;/li&gt;
&lt;/ul&gt;
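
&lt;p&gt;To make the rename-and-reverse idea concrete, here is a minimal sketch. It is not PromptCape's implementation: the &lt;code&gt;RenameSketch&lt;/code&gt; class, the &lt;code&gt;pseudonym&lt;/code&gt; and &lt;code&gt;rename&lt;/code&gt; helpers, the salt value, and the &lt;code&gt;var&lt;/code&gt; prefix are all assumptions made for illustration. It also works on raw text with whole-word matching, whereas a real tool would work on the parsed syntax tree.&lt;/p&gt;

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RenameSketch {
    // Hypothetical helper: derives a stable alias such as "mtd_1a2b3c4d"
    // from a salt that never leaves your machine and the original name.
    static String pseudonym(String prefix, String salt, String name) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha.digest((salt + ":" + name).getBytes(StandardCharsets.UTF_8));
            StringBuilder out = new StringBuilder(prefix).append('_');
            // Keep the first four bytes, printed as eight hex characters.
            for (int i = 0; i != 4; i++) {
                out.append(String.format("%02x", digest[i]));
            }
            return out.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    // Replaces each whole-word occurrence of from[i] with to[i].
    // Swapping the two arrays reverses the renaming.
    static String rename(String code, String[] from, String[] to) {
        String result = code;
        for (int i = 0; i != from.length; i++) {
            Pattern wholeWord = Pattern.compile("\\b" + Pattern.quote(from[i]) + "\\b");
            result = wholeWord.matcher(result).replaceAll(Matcher.quoteReplacement(to[i]));
        }
        return result;
    }

    public static void main(String[] args) {
        String source = "BigDecimal baseRate = contract.getRoyaltyRate();";
        String[] names = {"getRoyaltyRate", "contract", "baseRate"};
        String[] prefixes = {"mtd", "fld", "var"};
        String[] aliases = new String[names.length];
        for (int i = 0; i != names.length; i++) {
            aliases[i] = pseudonym(prefixes[i], "local-secret-salt", names[i]);
        }
        String sent = rename(source, names, aliases);      // what the AI would see
        String restored = rename(sent, aliases, names);    // mapping applied in reverse
        System.out.println(sent);
        System.out.println(restored.equals(source));
    }
}
```

&lt;p&gt;The name-to-alias table is the only thing that links the two versions, so in this sketch it stays on your side and is never sent with the prompt.&lt;/p&gt;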

&lt;h3&gt;
  
  
  What a proper obfuscation tool must handle
&lt;/h3&gt;

&lt;p&gt;It's not as simple as find-and-replace. Java's framework ecosystem means certain identifiers carry semantic meaning for the runtime:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Spring Data&lt;/strong&gt; repository methods (&lt;code&gt;findByName&lt;/code&gt;) derive SQL queries from the method name&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lombok&lt;/strong&gt; generates accessor methods from field names&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JPA&lt;/strong&gt; uses entity class names in JPQL query strings&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Jackson&lt;/strong&gt; derives JSON field names from Java field names&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spring Config&lt;/strong&gt; binds YAML keys to field names&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A good obfuscation tool detects these frameworks and protects the identifiers that would break. Everything else gets renamed.&lt;/p&gt;

&lt;h3&gt;
  
  
  The full cycle must work
&lt;/h3&gt;

&lt;p&gt;Obfuscation is only useful if the cycle is complete:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Source compiles     -&amp;gt; Obfuscate -&amp;gt; Obfuscated compiles
                                 -&amp;gt; AI modifies -&amp;gt; Still compiles
                                                -&amp;gt; Apply back -&amp;gt; Source still compiles
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every transition can break. Framework detection, JPQL string updating, comment stripping, 3-way merge for reverse-application — all are necessary for a production-ready workflow.&lt;/p&gt;




&lt;h2&gt;
  
  
  What you should do today
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Immediate steps
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Audit what your AI tools send.&lt;/strong&gt; Enable request logging or use a proxy to see what context is transmitted. You'll likely be surprised.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Check your client contracts.&lt;/strong&gt; Look for clauses about code confidentiality, data processing, and third-party tools. Many contracts written before 2023 don't explicitly address AI coding tools — which doesn't mean they allow them.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Establish an AI coding policy.&lt;/strong&gt; Define which projects can use AI tools, which cannot (client code, regulated code), and what safeguards are required.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Consider obfuscation.&lt;/strong&gt; For projects where AI assistance is valuable but code exposure is unacceptable, obfuscation provides the best of both worlds: AI productivity without IP exposure.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  For regulated industries
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Document your AI tool usage&lt;/strong&gt; in your risk register. Auditors will ask.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Include AI tools in your data processing agreements&lt;/strong&gt; with clients.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Evaluate data residency requirements.&lt;/strong&gt; If your data must stay in the EU, most US-based AI providers don't qualify without additional safeguards.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  For integrators and freelancers
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Get explicit written consent&lt;/strong&gt; from clients before using AI tools on their code.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use obfuscation by default&lt;/strong&gt; on client projects. It's a competitive advantage: "We use AI to deliver faster, and we protect your code while doing it."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Include AI tool policies in your contracts.&lt;/strong&gt; Define what tools you use, how code is protected, and what the client's options are.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;AI coding assistants are transformative tools. They make developers faster, reduce boilerplate, and help navigate unfamiliar codebases. But they come with a fundamental trade-off: to help you, the AI needs to see your code. And "seeing your code" means transmitting it to infrastructure you don't control, in jurisdictions you didn't choose, with data handling practices you can't verify.&lt;/p&gt;

&lt;p&gt;The answer is not to stop using AI tools. The answer is to stop sending your code in clear text.&lt;/p&gt;

&lt;p&gt;Obfuscate your identifiers. Strip your comments. Sanitize your configuration. Let the AI work on the structure of your code without knowing what your code does. You get the productivity benefits. Your intellectual property stays yours.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;PromptCape is a Java code obfuscation tool designed for AI coding workflows. It handles framework detection, compilation verification, and smart reverse-application. Free trial at &lt;a href="https://promptcape.com/" rel="noopener noreferrer"&gt;promptcape.com&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Why Your Source Code Is at Risk When Using AI Coding Assistants</title>
      <dc:creator>Genevieve Breton</dc:creator>
      <pubDate>Fri, 10 Apr 2026 06:35:02 +0000</pubDate>
      <link>https://dev.to/genevieve_breton_cb795f52/why-your-source-code-is-at-risk-when-using-ai-coding-assistants-29hn</link>
      <guid>https://dev.to/genevieve_breton_cb795f52/why-your-source-code-is-at-risk-when-using-ai-coding-assistants-29hn</guid>
      <description>&lt;p&gt;&lt;em&gt;Every line you send to an AI coding tool leaves your control. Here's what that means for your business, your clients, and your legal obligations.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  You are sending your source code to a foreign server
&lt;/h2&gt;

&lt;p&gt;When you use Claude Code, Cursor, GitHub Copilot, ChatGPT, Mistral Vibe, or any LLM-based coding assistant, your source code is sent over HTTPS to a remote API. That API runs on servers you don't control, in a jurisdiction you didn't choose, operated by a company whose data practices you've accepted by clicking "I agree."&lt;/p&gt;

&lt;p&gt;Let's be specific about where your code goes:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;API provider&lt;/th&gt;
&lt;th&gt;Server locations&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claude Code / Cursor (Claude)&lt;/td&gt;
&lt;td&gt;Anthropic&lt;/td&gt;
&lt;td&gt;US (AWS us-east, us-west)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GitHub Copilot&lt;/td&gt;
&lt;td&gt;Microsoft / OpenAI&lt;/td&gt;
&lt;td&gt;US (Azure data centers)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ChatGPT&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;US (Azure data centers)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cursor (OpenAI mode)&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mistral Vibe / Le Chat&lt;/td&gt;
&lt;td&gt;Mistral AI&lt;/td&gt;
&lt;td&gt;EU (France, via cloud providers)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DeepSeek&lt;/td&gt;
&lt;td&gt;DeepSeek&lt;/td&gt;
&lt;td&gt;China&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini Code Assist&lt;/td&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;td&gt;US (GCP data centers)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Most developers don't think twice about this. They open their IDE, the AI suggests code, they accept. Behind the scenes, the IDE sent the contents of the current file — and often surrounding files, imports, and project context — to a server thousands of kilometers away.&lt;/p&gt;




&lt;h2&gt;
  
  
  What exactly is being sent?
&lt;/h2&gt;

&lt;p&gt;It's not just "a few lines of code." Modern AI coding tools send rich context to produce better suggestions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The current file&lt;/strong&gt; — full content, not just the cursor position&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Open tabs and imported files&lt;/strong&gt; — the AI reads your project structure&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File paths&lt;/strong&gt; — revealing your package hierarchy (&lt;code&gt;com.acme.billing.service.InvoiceService&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configuration files&lt;/strong&gt; — &lt;code&gt;application.yml&lt;/code&gt;, &lt;code&gt;pom.xml&lt;/code&gt;, &lt;code&gt;.env&lt;/code&gt; with database URLs, API keys, internal hostnames&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Comments and Javadoc&lt;/strong&gt; — containing business logic descriptions, TODO items, bug references&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test files&lt;/strong&gt; — revealing edge cases, business rules, validation logic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Git context&lt;/strong&gt; — commit messages, branch names, sometimes diffs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A single prompt to an AI coding assistant can contain more context about your business than a 10-page architecture document.&lt;/p&gt;




&lt;h2&gt;
  
  
  The risks are real and specific
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Source code leakage
&lt;/h3&gt;

&lt;p&gt;Your code is transmitted to and processed on third-party infrastructure. Even if the provider promises not to train on your data (and many do), the code still:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Transits through networks&lt;/strong&gt; you don't control — intermediate proxies, load balancers, logging systems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Is stored temporarily&lt;/strong&gt; for processing — cache layers, request logs, debugging infrastructure&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;May be retained for abuse detection&lt;/strong&gt; — most providers log requests for safety monitoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Could be subpoenaed&lt;/strong&gt; — US providers are subject to US law enforcement requests, including the CLOUD Act which allows cross-border data access&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The question is not "will the provider deliberately steal my code?" It's "how many systems touch my code between my IDE and the model, and who has access to those systems?"&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Intellectual property exposure
&lt;/h3&gt;

&lt;p&gt;Source code is a trade secret. Once exposed, trade secret protection can be lost permanently — unlike patents or copyrights, trade secrets only have value as long as they remain secret.&lt;/p&gt;

&lt;p&gt;What your code reveals:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Element&lt;/th&gt;
&lt;th&gt;What it exposes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Class and method names&lt;/td&gt;
&lt;td&gt;Your business domain and capabilities (&lt;code&gt;FraudDetector&lt;/code&gt;, &lt;code&gt;TaxCalculator&lt;/code&gt;, &lt;code&gt;PatentAnalyzer&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Package structure&lt;/td&gt;
&lt;td&gt;Your architecture and module boundaries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Algorithm implementations&lt;/td&gt;
&lt;td&gt;Your competitive advantage (pricing logic, recommendation engines, risk models)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database schema&lt;/td&gt;
&lt;td&gt;Your data model and relationships&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API endpoints&lt;/td&gt;
&lt;td&gt;Your service surface and capabilities&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Configuration&lt;/td&gt;
&lt;td&gt;Your infrastructure topology&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Comments&lt;/td&gt;
&lt;td&gt;Your business rules in plain language&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A competitor with access to your AI provider's logs could reconstruct your product's architecture, business rules, and technical approach without ever seeing your actual repository.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Client code exposure (integrators and freelancers)
&lt;/h3&gt;

&lt;p&gt;If you're a &lt;strong&gt;consulting firm&lt;/strong&gt;, &lt;strong&gt;systems integrator&lt;/strong&gt;, or &lt;strong&gt;freelance developer&lt;/strong&gt;, the risk multiplies. You're not just exposing your own code — you're exposing your client's code.&lt;/p&gt;

&lt;p&gt;Consider the scenarios:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;You customize an ERP for a bank.&lt;/strong&gt; You send controller code to Claude that contains transaction processing logic, compliance rules, and internal API endpoints. That code belongs to the bank, not to you.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;You build a SaaS platform for a healthcare company.&lt;/strong&gt; You use Copilot while working on patient data models. HIPAA-regulated data structures are now on Microsoft's servers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;You maintain a defense contractor's codebase.&lt;/strong&gt; You use an AI to debug a networking module. The code may be subject to ITAR export controls — sending it to a US cloud provider may technically comply, but sending it to a Chinese provider (DeepSeek) would be a violation.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most client contracts include clauses about code confidentiality and data handling. Using AI coding tools on client code may violate these contracts, and the client may never know until a breach occurs. If one does occur and you are the one responsible for the code, the liability lands on you.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Regulatory and compliance risks
&lt;/h3&gt;

&lt;p&gt;Depending on your industry and jurisdiction, sending source code to external AI services can create compliance issues:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Regulation&lt;/th&gt;
&lt;th&gt;Risk&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;GDPR&lt;/strong&gt; (EU)&lt;/td&gt;
&lt;td&gt;If your code processes personal data and the code itself contains PII patterns, field names, or test data, sending it to a US server may violate data transfer rules&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SOC 2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Requires documented controls over data access. Using AI tools without DLP controls may fail audit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ISO 27001&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Requires risk assessment for third-party data processing. AI coding tools are a new attack vector&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;HIPAA&lt;/strong&gt; (US healthcare)&lt;/td&gt;
&lt;td&gt;Code containing PHI field names, validation rules, or test fixtures with patient data patterns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PCI DSS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Code handling payment card data, encryption keys, or tokenization logic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;ITAR&lt;/strong&gt; (US defense)&lt;/td&gt;
&lt;td&gt;Export-controlled technical data cannot be shared with foreign persons or servers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;NIS2&lt;/strong&gt; (EU)&lt;/td&gt;
&lt;td&gt;Critical infrastructure operators must control their software supply chain&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Even if you're not in a regulated industry, your clients might be. And their auditors will ask how their code is protected.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. The training data question
&lt;/h3&gt;

&lt;p&gt;Most AI providers now offer policies like "we don't train on your data." But:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Policies change.&lt;/strong&gt; OpenAI initially trained on API data, then reversed course after backlash. Today's policy may not be tomorrow's.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Policies have exceptions.&lt;/strong&gt; Abuse detection, safety monitoring, and model evaluation may still use your data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Free tiers have different rules.&lt;/strong&gt; ChatGPT Free explicitly trains on your conversations. Many developers prototype with the free tier before switching to paid.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Subprocessors matter.&lt;/strong&gt; The AI provider may not train on your data, but what about their cloud provider? Their logging vendor? Their CDN?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data breaches happen.&lt;/strong&gt; Samsung semiconductor engineers pasted proprietary source code and internal meeting notes into ChatGPT in 2023. OpenAI suffered an incident in March 2023 in which some users could see other users' chat titles. Even Claude Code has recently leaked!&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The safest assumption: anything you send to an AI service should be treated as if it could become public.&lt;/p&gt;




&lt;h2&gt;
  
  
  The false sense of security
&lt;/h2&gt;

&lt;h3&gt;
  
  
  "But we use the enterprise plan"
&lt;/h3&gt;

&lt;p&gt;Enterprise plans typically offer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No training on your data&lt;/li&gt;
&lt;li&gt;Data processing agreements (DPAs)&lt;/li&gt;
&lt;li&gt;SOC 2 compliance of the provider&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What they don't offer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Control over where the data is processed&lt;/li&gt;
&lt;li&gt;Guarantees about intermediate systems&lt;/li&gt;
&lt;li&gt;Protection against subpoenas or government data requests&lt;/li&gt;
&lt;li&gt;Deletion verification (you can't audit what you can't see)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  "But we use a self-hosted model"
&lt;/h3&gt;

&lt;p&gt;Self-hosted models (Llama, Mistral, CodeLlama) solve the data residency problem but introduce others:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dramatically lower code quality compared to frontier models&lt;/li&gt;
&lt;li&gt;Significant infrastructure costs&lt;/li&gt;
&lt;li&gt;No access to the latest model capabilities (Claude Opus, GPT-4o)&lt;/li&gt;
&lt;li&gt;Still requires GPU infrastructure that someone must maintain&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  "But we only send small snippets"
&lt;/h3&gt;

&lt;p&gt;AI coding tools send more context than you think. And even small snippets reveal information:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// "Just a small function"&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;BigDecimal&lt;/span&gt; &lt;span class="nf"&gt;calculateRoyalty&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Contract&lt;/span&gt; &lt;span class="n"&gt;contract&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;SalesReport&lt;/span&gt; &lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;BigDecimal&lt;/span&gt; &lt;span class="n"&gt;baseRate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;contract&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getRoyaltyRate&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="nc"&gt;BigDecimal&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getNetSales&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;subtract&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;report&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getReturns&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;contract&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;hasMinimumGuarantee&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;multiply&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;baseRate&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;max&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;contract&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMinimumGuarantee&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;multiply&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;baseRate&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This "small snippet" reveals: you have a royalty calculation business, contracts have minimum guarantees, you track returns separately from net sales, and your financial model uses &lt;code&gt;BigDecimal&lt;/code&gt; precision. A competitor now knows your pricing model structure.&lt;/p&gt;




&lt;h2&gt;
  
  
  The solution: obfuscate before sending
&lt;/h2&gt;

&lt;p&gt;The principle is simple: &lt;strong&gt;rename everything that reveals business meaning before the AI sees it, then reverse the renaming when applying the AI's changes.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Your code:                          What the AI sees:
calculateRoyalty()          -&amp;gt;      mtd_a1b2c3d4()
Contract contract           -&amp;gt;      Cls_e5f6a7b8 fld_9c8d7e6f
getRoyaltyRate()            -&amp;gt;      mtd_1a2b3c4d()
hasMinimumGuarantee()       -&amp;gt;      mtd_5e6f7a8b()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AI can still:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understand the code structure (types, control flow, patterns)&lt;/li&gt;
&lt;li&gt;Suggest refactorings and bug fixes&lt;/li&gt;
&lt;li&gt;Add new functionality&lt;/li&gt;
&lt;li&gt;Write tests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What it cannot do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Infer your business domain&lt;/li&gt;
&lt;li&gt;Reconstruct your architecture from meaningful names&lt;/li&gt;
&lt;li&gt;Extract business rules from comments (stripped)&lt;/li&gt;
&lt;li&gt;Identify your company from package names (flattened)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What a proper obfuscation tool must handle
&lt;/h3&gt;

&lt;p&gt;It's not as simple as find-and-replace. Java's framework ecosystem means certain identifiers carry semantic meaning for the runtime:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Spring Data&lt;/strong&gt; repository methods (&lt;code&gt;findByName&lt;/code&gt;) derive SQL queries from the method name&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lombok&lt;/strong&gt; generates accessor methods from field names&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JPA&lt;/strong&gt; uses entity class names in JPQL query strings&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Jackson&lt;/strong&gt; derives JSON field names from Java field names&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spring Config&lt;/strong&gt; binds YAML keys to field names&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A good obfuscation tool detects these frameworks and protects the identifiers that would break. Everything else gets renamed.&lt;/p&gt;
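
&lt;p&gt;As a rough illustration of the Spring Data case, the check below decides whether a method name has to survive renaming. The &lt;code&gt;ProtectedNames&lt;/code&gt; class, the &lt;code&gt;mustKeep&lt;/code&gt; method, and the &lt;code&gt;inRepositoryInterface&lt;/code&gt; flag are hypothetical, and the pattern is a simplified approximation of Spring Data's derived-query naming rather than its full grammar.&lt;/p&gt;

```java
import java.util.regex.Pattern;

class ProtectedNames {
    // Simplified shape of a derived query: a known subject prefix,
    // an optional middle part, then "By" followed by a capitalised property.
    private static final Pattern DERIVED_QUERY = Pattern.compile(
        "(find|read|get|query|search|stream|count|exists|delete|remove)"
        + "[A-Za-z0-9]*By[A-Z][A-Za-z0-9]*");

    // Hypothetical rule: keep the name only when the method sits in a
    // repository interface and looks like a derived query.
    static boolean mustKeep(String methodName, boolean inRepositoryInterface) {
        if (!inRepositoryInterface) {
            return false;
        }
        return DERIVED_QUERY.matcher(methodName).matches();
    }

    public static void main(String[] args) {
        System.out.println(mustKeep("findByName", true));        // prints true
        System.out.println(mustKeep("calculateRoyalty", true));  // prints false
        System.out.println(mustKeep("findByName", false));       // prints false
    }
}
```

&lt;p&gt;The context flag matters: a name-only check would also protect an ordinary method that merely happens to fit the pattern, so a real detector would presumably confirm the enclosing type first.&lt;/p&gt;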

&lt;h3&gt;
  
  
  The full cycle must work
&lt;/h3&gt;

&lt;p&gt;Obfuscation is only useful if the cycle is complete:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Source compiles     -&amp;gt; Obfuscate -&amp;gt; Obfuscated compiles
                                 -&amp;gt; AI modifies -&amp;gt; Still compiles
                                                -&amp;gt; Apply back -&amp;gt; Source still compiles
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every transition can break. Framework detection, JPQL string updating, comment stripping, 3-way merge for reverse-application — all are necessary for a production-ready workflow.&lt;/p&gt;
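
&lt;p&gt;One way to enforce the "still compiles" condition at each arrow is a small compile gate, sketched below with the JDK's built-in compiler API. The &lt;code&gt;CompileGate&lt;/code&gt; class, the &lt;code&gt;compiles&lt;/code&gt; method, and the &lt;code&gt;Snippet.java&lt;/code&gt; file name are assumptions for illustration. A real project would compile the whole module with its classpath, not a single string.&lt;/p&gt;

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class CompileGate {
    // Returns true when the given source text compiles on its own.
    static boolean compiles(String source) throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system compiler found: run on a JDK, not a JRE");
        }
        // Write the source to a scratch directory (not cleaned up in this sketch).
        Path dir = Files.createTempDirectory("compile-gate");
        Path file = dir.resolve("Snippet.java");
        Files.writeString(file, source);
        // Capture compiler diagnostics instead of printing them.
        ByteArrayOutputStream diagnostics = new ByteArrayOutputStream();
        int exitCode = compiler.run(null, null, diagnostics, "-d", dir.toString(), file.toString());
        return exitCode == 0;
    }

    public static void main(String[] args) throws IOException {
        String obfuscated = "class Cls_e5f6a7b8 { int mtd_a1b2c3d4() { return 1; } }";
        String brokenEdit = "class Cls_e5f6a7b8 { int mtd_a1b2c3d4() { return 1 } }";
        System.out.println(compiles(obfuscated));
        System.out.println(compiles(brokenEdit));
    }
}
```

&lt;p&gt;Running the same gate after obfuscation, after the AI's edit, and after applying the changes back shows which transition introduced a failure.&lt;/p&gt;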




&lt;h2&gt;
  
  
  What you should do today
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Immediate steps
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Audit what your AI tools send.&lt;/strong&gt; Enable request logging or use a proxy to see what context is transmitted. You'll likely be surprised.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Check your client contracts.&lt;/strong&gt; Look for clauses about code confidentiality, data processing, and third-party tools. Many contracts written before 2023 don't explicitly address AI coding tools — which doesn't mean they allow them.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Establish an AI coding policy.&lt;/strong&gt; Define which projects can use AI tools, which cannot (client code, regulated code), and what safeguards are required.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Consider obfuscation.&lt;/strong&gt; For projects where AI assistance is valuable but code exposure is unacceptable, obfuscation keeps the productivity gain while the names, comments, and configuration values that reveal your business logic stay on your machine.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  For regulated industries
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Document your AI tool usage&lt;/strong&gt; in your risk register. Auditors will ask.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Include AI tools in your data processing agreements&lt;/strong&gt; with clients.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Evaluate data residency requirements.&lt;/strong&gt; If your data must stay in the EU, most US-based AI providers don't qualify without additional safeguards.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  For integrators and freelancers
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Get explicit written consent&lt;/strong&gt; from clients before using AI tools on their code.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use obfuscation by default&lt;/strong&gt; on client projects. It's a competitive advantage: "We use AI to deliver faster, and we protect your code while doing it."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Include AI tool policies in your contracts.&lt;/strong&gt; Define what tools you use, how code is protected, and what the client's options are.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;AI coding assistants are transformative tools. They make developers faster, reduce boilerplate, and help navigate unfamiliar codebases. But they come with a fundamental trade-off: to help you, the AI needs to see your code. And "seeing your code" means transmitting it to infrastructure you don't control, in jurisdictions you didn't choose, with data handling practices you can't verify.&lt;/p&gt;

&lt;p&gt;The answer is not to stop using AI tools. The answer is to stop sending your code in clear text.&lt;/p&gt;

&lt;p&gt;Obfuscate your identifiers. Strip your comments. Sanitize your configuration. Let the AI work on the structure of your code without knowing what your code does. You get the productivity benefits. Your intellectual property stays yours.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;PromptCape is a Java code obfuscation tool designed for AI coding workflows. It handles framework detection, compilation verification, and smart reverse-application. Free trial at &lt;a href="https://promptcape.com/" rel="noopener noreferrer"&gt;PromptCape&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>privacy</category>
      <category>promptcape</category>
      <category>java</category>
    </item>
  </channel>
</rss>
