Genevieve Breton

Posted on Jun 3

Python obfuscation for AI assistants: runnable workspaces and off-disk secrets

#python #ai #security #privacy

Why obfuscating Python for AI tools requires a different mental model than Java — and how .env handling becomes the load-bearing question.

Java vs Python: a different relationship with the workspace

Obfuscating Java for an AI assistant is — at heart — about producing a workspace that still compiles. The developer rarely runs the obfuscated workspace directly; they let the AI work in it, apply the changes back to source, and run the app from there. Compilation is the contract. If mvn test-compile passes after obfuscation, you're 95% done.

Python is a fundamentally different game. There is no compile step. The workspace's "validation" happens at runtime, when the developer fires up:

streamlit run dashboard.py
pytest -v
python main.py
uvicorn main:app --reload

If a framework introspects a class name, a function name, a string in a URL pattern, or a Pydantic field — and that name was rewritten by the obfuscator — the error surfaces only when Python tries to call it. There is no compiler to catch it for you.

That changes the obfuscator's job in three concrete ways:

What you protect changes. Identifier names that double as string identifiers (template references, JSON keys, test discovery names) become the primary battle, not the secondary one.
What you check after obfuscation changes. A Python --verify step can't compile; it has to do static import resolution and let runtime catch the rest.
What you do with secrets changes. A Java workspace is edited by the AI but never run. A Python workspace is run by the developer, which means real values have to be available somewhere, and the naive choice (copying .env to the workspace) instantly defeats the obfuscation's purpose.

This article walks through each of those, then explains the promptcape run pattern: how to give the Python workspace the env vars it needs at launch time without ever writing them to disk in the AI-visible location.

Names that double as string contracts

In Java, framework conventions usually leave a compile-time trace: a call to a Lombok-generated getName() whose field was renamed fails with cannot find symbol at javac time. You can detect it and you can auto-fix it. Spring Data derived queries (findByActiveTrue) are the rare exception that bites at startup rather than at compile time, and that's already documented as a hard case.

Python frameworks are full of conventions like Spring Data's derived queries. Names are silent contracts:

Framework	Identifier	Contract
Pydantic v2	`class User(BaseModel): email: str`	`email` is the JSON key in every `user.model_dump()` call. Rename it, every API consumer breaks silently.
Flask	`def index(): ...` decorated with `@bp.route("/")`	The function name becomes the default endpoint string for `url_for("blog.index")` and `{{ url_for('blog.index') }}`. Rename it, every redirect and template link 500s with `werkzeug.routing.BuildError`.
Django	`class Post(models.Model)`	The class name drives the DB table (`app_label_post`) AND every migration reference. Rename it, your `INSERT` query targets a table that doesn't exist.
SQLAlchemy	`id = Column(Integer, primary_key=True)`	`id` is the column name on the table. Plus it's accessed as `instance.id` everywhere.
pytest	`def test_login_succeeds(...)`	pytest discovers tests by the `test_` prefix. Rename one to `mtd_xxx` and it silently drops out of the run; rename them all and pytest collects 0 tests (exit code 5), which a lenient CI script can still treat as a pass.
dataclass / attrs	`@dataclass class Post: title: str`	Field names are accessed as `obj.title`, dumped via `asdict()` and rendered in Jinja templates as `{{ post.title }}`.
Django forms / DRF serializers	`def clean_email(self): ...`	Django forms look up field-level validators by the literal `clean_<field>` method name; DRF serializers do the same with `validate_<field>`. Rename one and that validation silently disappears.
Celery	`@shared_task def send_email(recipient, subject)`	`send_email.delay(recipient="alice@…")` serialises the kwarg name through the broker (Redis, RabbitMQ). The worker reconstructs the call as `send_email(recipient=…)`; rename the parameter to `p_xxx` and the worker raises `TypeError: got an unexpected keyword argument 'recipient'`. Affects function name AND every parameter name.
Click / Typer	`@click.option("--config") def run(config)`	Click maps the CLI option `--config` to the Python kwarg `config` by string. Rename the parameter and the CLI call `run --config foo.yaml` raises `TypeError: got an unexpected keyword argument 'config'`. Affects every option/argument parameter.

All of these are invisible at "compile" time (which doesn't exist anyway). They fail at runtime, often in the form of a 500 in the second route the AI touches.
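To make the "silent contract" idea concrete, here is a minimal stdlib-only sketch. It does not use any of the frameworks in the table; it only imitates two of their mechanisms (a dataclass dumped to JSON, and a queued call replayed by keyword). The renamed identifiers such as fld_1a2b3c4d and mtd_5e6f7a8b are hypothetical stand-ins for what an obfuscator might emit.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Post:
    title: str


@dataclass
class ObfuscatedPost:
    # Hypothetical obfuscated counterpart: only the field name changed.
    fld_1a2b3c4d: str


def mtd_5e6f7a8b(p_0001: str, p_0002: str) -> str:
    # Hypothetical obfuscated version of send_email(recipient, subject).
    return f"to={p_0001} subject={p_0002}"


# The field name leaks into the serialised payload as a dictionary key.
original_payload = json.dumps(asdict(Post(title="Hello")))
obfuscated_payload = json.dumps(asdict(ObfuscatedPost(fld_1a2b3c4d="Hello")))

# A message queued under the original parameter names is replayed by keyword.
queued_kwargs = {"recipient": "alice@example.com", "subject": "Hi"}
try:
    mtd_5e6f7a8b(**queued_kwargs)
    rename_error = None
except TypeError as exc:
    rename_error = str(exc)

print(original_payload)
print(obfuscated_payload)
print(rename_error)
```

Nothing here fails at definition time. The mismatch only appears when a consumer reads the payload or when the queued call is replayed, which is the same shape as the failures in the table.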

The fix has to be proactive detection, not reactive. For each framework, scan the project for the relevant declarations and add the discovered names to a project-wide exclusion list before identifier collection. The PromptCape codebase has 16 Python detectors today, 11 of which run an AST scan (the rest are pure import-check + fixed name lists). A selection:

PydanticDetector       AST scan: every BaseModel/RootModel field name
SqlalchemyDetector     AST scan: every declarative-model column / relationship
StreamlitDetector      AST scan: every top-level callable in streamlit scripts
FlaskDetector          AST scan: every @bp.route / @bp.get / @bp.errorhandler view
DataclassDetector      AST scan: every @dataclass / @attrs.define field
PytestDetector         AST scan: every def test_* / class Test* in the project
DjangoDetector         AST scan: model class+fields, CBV+FBV view names, form fields + clean_X methods
CeleryDetector         AST scan: every @app.task / @shared_task — function name + every parameter
ClickTyperDetector     AST scan: every @click.command / @app.command — function name + every parameter
RequestsHttpxDetector  Fixed list: ~110 names (Response attrs, request kwargs, exceptions); no AST scan
StdlibCommonAttrsDetector  Fixed list: ~230 stdlib attribute and method names (close, year, keys, items, split, …); no AST scan

The AST scans run via a bundled Python sidecar that parses each candidate file with LibCST and emits the discovered names back to the Java engine as JSON. The engine merges every detector's output into a single exclusion set before the obfuscation pass starts.
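As an illustration of what one such scan might look like, here is a small sketch that collects pytest discovery names and dataclass field names from a module and emits them as JSON. It uses the standard-library ast module rather than LibCST to stay dependency-free, and the function names (collect_protected_names, _is_dataclass_decorator) are hypothetical, not PromptCape's.

```python
import ast
import json


def _is_dataclass_decorator(node: ast.expr) -> bool:
    # Accepts @dataclass, @dataclasses.dataclass and @dataclass(frozen=True).
    if isinstance(node, ast.Call):
        node = node.func
    if isinstance(node, ast.Name):
        return node.id == "dataclass"
    if isinstance(node, ast.Attribute):
        return node.attr == "dataclass"
    return False


def collect_protected_names(source: str) -> set[str]:
    """Return names a renamer should leave alone in one module (sketch)."""
    protected: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.name.startswith("test_"):
                protected.add(node.name)
        elif isinstance(node, ast.ClassDef):
            if node.name.startswith("Test"):
                protected.add(node.name)
            if any(_is_dataclass_decorator(d) for d in node.decorator_list):
                for stmt in node.body:
                    # Annotated class-level assignments are the dataclass fields.
                    if isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name):
                        protected.add(stmt.target.id)
    return protected


SAMPLE = '''
from dataclasses import dataclass

@dataclass
class Post:
    title: str
    views: int = 0

class TestLogin:
    def test_login_succeeds(self):
        helper()

def helper():
    pass
'''

names = collect_protected_names(SAMPLE)
print(json.dumps(sorted(names)))
```

The helper function is deliberately absent from the result: it carries no framework contract, so it stays eligible for renaming.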

--verify for Python: import resolution, not compilation

Java's --verify runs mvn test-compile and reads javac output. The Python side has no one-line equivalent.

Python's closest analogue is importlib.util.find_spec(...). Given a dotted name like staffing.database, it returns None if the module can't be located (or raises ModuleNotFoundError if a parent package is missing), or a ModuleSpec if it can. The catch: to resolve a dotted name it imports the parent package, which executes that package's __init__.py. If staffing/__init__.py does from .database import sqlalchemy_stuff, then find_spec transitively imports SQLAlchemy, your DB driver, and probably half your app.

That's a non-starter for an obfuscation verification step: you don't want to import the user's code, you don't want the user's third-party dependencies installed in the sidecar's Python interpreter, and you definitely don't want side effects (database connections opened at module import time — a real Python anti-pattern, but common).
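The parent-import behaviour is easy to reproduce in isolation. The sketch below builds a throwaway package in a temporary directory (demo_staffing is a made-up name) whose __init__.py sets an environment variable as a stand-in for a real side effect, then asks find_spec for a submodule.

```python
import importlib.util
import os
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "demo_staffing"  # hypothetical package name
    pkg.mkdir()
    # Stand-in for an import-time side effect such as opening a DB connection.
    (pkg / "__init__.py").write_text(
        "import os\nos.environ['DEMO_STAFFING_INIT_RAN'] = '1'\n"
    )
    (pkg / "database.py").write_text("X = 1\n")

    sys.path.insert(0, tmp)
    try:
        spec = importlib.util.find_spec("demo_staffing.database")
    finally:
        sys.path.remove(tmp)

    found = spec is not None
    parent_imported = "demo_staffing" in sys.modules
    submodule_imported = "demo_staffing.database" in sys.modules
    init_ran = os.environ.get("DEMO_STAFFING_INIT_RAN") == "1"
    sys.modules.pop("demo_staffing", None)  # leave the interpreter clean

print(found, parent_imported, submodule_imported, init_ran)
```

The submodule itself is located without being executed, but the parent package's __init__.py does run, and that is enough to pull in everything it imports.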

The strategy PromptCape ended up with classifies each import statement at the AST level and routes it to a different check:

Import shape	Check
`import xmlrpc.client` (top-level is a stdlib name)	`importlib.util.find_spec("xmlrpc.client")`. Treated as safe: the only code it runs is the stdlib parent package's `__init__.py`, which is normally free of side effects.
`from staffing.database import X` (top-level is a workspace-local directory)	Check that `workspace/staffing/database.py` or `workspace/staffing/database/__init__.py` exists on disk. Never imports the file.
`import sqlalchemy` (third-party)	Skipped. Can't verify without the project's virtualenv installed alongside the sidecar — too much false-positive noise. Trust it.

This catches the canonical bug — import xmlrpc.client rewritten to import xmlrpc.fld_b8460726 when a user identifier client lands in the registry — without needing any project dependencies to be installed where the obfuscator runs.

It does NOT catch runtime AttributeError on stdlib instances (e.g. today.year where year was renamed because the user has a function called year). For those, the proactive detector pattern is the only option: a StdlibCommonAttrsDetector carrying a fixed list of over 200 of the most commonly accessed stdlib attribute names, applied unconditionally. The trade-off is real (user methods literally called year won't be obfuscated either) but the alternative is a workspace that crashes on the first date in the codebase.

Comments and docstrings: line count is the load-bearing property

Java obfuscation strips comments to // Processed. while preserving line count, because the reverse-apply 3-way merge needs 1:1 line correspondence between the source and the obfuscated cache.

Python has the same requirement but two distinct constructs:

Line comments (# something) — analogous to Java's // something.
Docstrings ("""multi-line""") — strings that are the first statement of a Module / FunctionDef / ClassDef body.

Stripping both is straightforward. The line-count preservation is what takes care:

# Original                          # After obfuscation
"""Module docstring                 """Processed.
spanning four
lines.
"""                                 """
                                    (3 newlines, same 4-line span)

def foo():                          def mtd_xxx():
    """Function docstring."""           """Processed."""
    return 42                           return 42

# A line comment                    # Processed.

For multi-line docstrings the rule is: count the \n characters in the original string value, emit """Processed. + N newlines + """. The replacement stays on the same number of source lines, so any File "...", line 243 in a traceback still points at the same source line in both versions.
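The rule above fits in a few lines. This sketch counts physical line breaks in the docstring literal as it is written in the source file (an assumption on my part, chosen so that escaped \n sequences inside the string do not inflate the count); placeholder_docstring is a hypothetical helper name.

```python
def placeholder_docstring(original_literal: str) -> str:
    """Build a replacement docstring literal spanning the same number of lines."""
    newline_count = original_literal.count("\n")
    return '"""Processed.' + "\n" * newline_count + '"""'


multi_line = '"""Module docstring\nspanning four\nlines.\n"""'
single_line = '"""Function docstring."""'

replaced_multi = placeholder_docstring(multi_line)
replaced_single = placeholder_docstring(single_line)

print(repr(replaced_multi))
print(repr(replaced_single))
```

The closing quotes of a multi-line replacement land in column 0, which is still valid Python inside an indented body because the blank lines belong to the string literal rather than to the block's indentation.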

The first version of the docstring stripper had a subtle bug: it assumed FunctionDef.body was always an IndentedBlock (the multi-line form, def foo():\n body). One-liner functions like def foo(): return 1 use a SimpleStatementSuite body — a totally different LibCST node type — and the stripper crashed with 'SimpleStatementSuite' object is not subscriptable. The exception was caught and the whole file was silently copied verbatim to the workspace, which manifested days later as ImportError: cannot import name 'OdooClient' (the import line was preserved as-is in the verbatim copy while the class definition was renamed in the obfuscated odoo_client.py).

The fix is mechanical (handle both body shapes), but the lesson is general: in Python obfuscation, the silent verbatim fallback is a foot-gun. The diagnostic command is worth memorising:

# Lists every .py in the workspace that has zero obfuscation markers
# (replace <hash> with the workspace's cache directory name)
find ~/.promptcape/cache/<hash> -name "*.py" -size +10c -print0 |
  while IFS= read -r -d '' f; do
    grep -qE 'fld_|mtd_|Cls_|Processed' "$f" || echo "VERBATIM: $f"
  done

Files that come out are either empty placeholders (fine — conftest.py is often empty in test suites) or fell through the fallback (file the bug).
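The same check can be written in Python when it needs to run somewhere without a POSIX shell, for example as a CI step that fails on any suspect file. This is a sketch, not part of PromptCape: find_verbatim_files is a hypothetical name, and the marker list simply mirrors the grep pattern above.

```python
import re
import tempfile
from pathlib import Path

MARKER = re.compile(r"fld_|mtd_|Cls_|Processed")


def find_verbatim_files(workspace: Path, min_size: int = 11) -> list[Path]:
    """Return .py files of at least `min_size` bytes with no obfuscation marker."""
    suspects = []
    for path in sorted(workspace.rglob("*.py")):
        if path.stat().st_size < min_size:
            continue  # empty placeholders such as a bare conftest.py
        text = path.read_text(encoding="utf-8", errors="replace")
        if not MARKER.search(text):
            suspects.append(path)
    return suspects


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "ok.py").write_text("def mtd_098fd2b6():\n    return 42\n")
    (root / "leaked.py").write_text("class OdooClient:\n    pass\n")
    (root / "conftest.py").write_text("")
    suspects = [p.name for p in find_verbatim_files(root)]

print(suspects)
```

A wrapper script could exit non-zero whenever the returned list is non-empty, turning the silent fallback into a visible failure.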

The .env problem

This is the question that splits Python obfuscation from Java obfuscation more than anything else.

A Java workspace is typically never executed. The developer obfuscates, the AI works in the obfuscated copy, the developer applies changes back to source, and the app runs from the source project (with the real .env, the real application.properties, the real DB). The obfuscated workspace's job is to be readable, not runnable.

A Python workspace gets run by the developer. They iterate. They open Streamlit. They run pytest. They start the dev server. That requires real config values at runtime — but .env files are pure secrets: API keys, database URLs, OAuth client secrets. There is no "structure" to preserve in a .env file the way there is in application.properties (where keys are part of the architecture and values are leaf secrets). It's secrets all the way down.

The first iteration of the Python pipeline ran the existing Java sanitizer on .env:

# Original .env
DATABASE_URL=postgres://prod-db.acme.com:5432/myapp
SECRET_KEY=hunter2
ACTIVITY_MONTHS=6

# Sanitized .env (copied to workspace)
DATABASE_URL=REDACTED
SECRET_KEY=REDACTED
ACTIVITY_MONTHS=REDACTED

The first time the developer ran streamlit run from the workspace, it crashed instantly:

ValueError: invalid literal for int() with base 10: 'REDACTED'
  File ".../dashboard.py", line 243, in <module>
    ACTIVITY_MONTHS = int(os.getenv('ACTIVITY_MONTHS', '6'))

ACTIVITY_MONTHS=6 is not a secret. It's a config knob. But the sanitizer was uniform: redact everything because some entries are sensitive. That works for Java where the workspace doesn't run, but it instantly bricks the Python use case.

Three options surfaced:

Option	Workspace runs?	AI sees secrets?
A. Copy `.env` verbatim	Yes	Yes (any tool that reads files sees them)
B. Sanitize all values	No (crashes on first int/bool/URL parse)	No
C. Sanitize selectively (heuristics for "looks like a secret")	Maybe (depends on heuristic quality)	Mostly no

A and B are bad in different ways. C is fragile — every secret format you don't think of becomes a leak, and every config value that happens to match the heuristic becomes a crash.
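A toy version of option C shows both failure directions at once. The key-name pattern below is a deliberately naive heuristic of my own, and two of the entries (a DATABASE_URL with an embedded password, and TOKEN_TTL_SECONDS) are hypothetical additions to the .env from the example above.

```python
import re

# Naive heuristic: redact any entry whose key looks secret-ish.
SECRET_KEY_PATTERN = re.compile(r"SECRET|KEY|TOKEN|PASSWORD", re.IGNORECASE)


def sanitize_selectively(env: dict[str, str]) -> dict[str, str]:
    return {
        key: ("REDACTED" if SECRET_KEY_PATTERN.search(key) else value)
        for key, value in env.items()
    }


env = {
    "SECRET_KEY": "hunter2",
    "ACTIVITY_MONTHS": "6",
    # Hypothetical: credentials embedded in a URL, under a harmless-looking key.
    "DATABASE_URL": "postgres://app:s3cret@prod-db.acme.com:5432/myapp",
    # Hypothetical: a plain config knob whose key happens to contain TOKEN.
    "TOKEN_TTL_SECONDS": "3600",
}
sanitized = sanitize_selectively(env)

leaked = "s3cret" in sanitized["DATABASE_URL"]
try:
    int(sanitized["TOKEN_TTL_SECONDS"])
    crashed = False
except ValueError:
    crashed = True

print(sanitized)
print("leak:", leaked, "crash:", crashed)
```

Tightening the pattern to fix one of these cases tends to reopen the other, which is why the heuristic route never quite converges.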

The fix that actually worked is to recognise that the workspace doesn't need .env on disk at all. It needs the env vars at the moment a child process starts. There's a layer between "secrets at rest" and "secrets in the running app's environment" that PromptCape can sit on.

`promptcape run`: inject `.env` at subprocess launch, never on disk

The pattern is borrowed from how 12-factor apps deploy in containers: the orchestrator reads the secret store at container start time and exports keys into the process environment. The container image itself contains no secrets.

Translated to PromptCape:

promptcape obfuscate writes the workspace without .env. A small file .env.promptcape-pointer is written instead, with the absolute path to the source .env and instructions to use promptcape run. The AI sees the pointer if it opens it — that's intentional; we want the indirection documented.
promptcape run <command> is a wrapper that:
- Resolves the source project from the current working directory (same mechanism as promptcape apply / promptcape status).
- Parses <source>/.env and <source>/.env.local with a minimal python-dotenv-compatible parser.
- Spawns <command> with cwd = workspace, the child's environment populated from the current OS env layered with the parsed .env entries.
- Inherits stdin/stdout/stderr so the child has a real TTY (colors, prompts, progress bars all work).
- Propagates the child's exit code.
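The wrapper steps above can be sketched in Python. This is not PromptCape's implementation (which ships in a JAR); it is a minimal illustration under stated assumptions: the parser handles only KEY=VALUE lines, an optional export prefix and surrounding quotes, .env.local overrides .env, and parsed entries are layered on top of the inherited OS environment. All function names are hypothetical.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def parse_dotenv(path: Path) -> dict[str, str]:
    """Parse KEY=VALUE lines; a deliberately small subset of dotenv syntax."""
    entries: dict[str, str] = {}
    if not path.is_file():
        return entries
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]  # drop matching surrounding quotes
        entries[key.strip()] = value
    return entries


def build_child_env(source_dir: Path) -> dict[str, str]:
    """Inherited OS env, then .env, then .env.local (assumed precedence)."""
    env = dict(os.environ)
    for name in (".env", ".env.local"):
        env.update(parse_dotenv(source_dir / name))
    return env


def run_in_workspace(command: list[str], source_dir: Path, workspace: Path) -> int:
    """Spawn `command` in the workspace; stdio is inherited by default."""
    completed = subprocess.run(command, cwd=workspace, env=build_child_env(source_dir))
    return completed.returncode


with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as ws:
    (Path(src) / ".env").write_text("# demo values\nACTIVITY_MONTHS=6\nSECRET_KEY='hunter2'\n")
    parsed = parse_dotenv(Path(src) / ".env")
    probe = "import os, sys; sys.exit(0 if os.environ.get('ACTIVITY_MONTHS') == '6' else 3)"
    exit_code = run_in_workspace([sys.executable, "-c", probe], Path(src), Path(ws))
    workspace_files = sorted(p.name for p in Path(ws).iterdir())

print(exit_code, workspace_files)
```

The child process sees the values, yet the workspace directory stays empty of them: the secrets exist only in the source project and in the child's process environment.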

The flow:

# Source project: ~/projects/my-streamlit-app/.env
# DATABASE_URL=postgres://prod-db.acme.com:5432/myapp
# SECRET_KEY=hunter2
# ACTIVITY_MONTHS=6

cd ~/projects/my-streamlit-app
promptcape obfuscate --language python --verify .
# -> ~/.promptcape/cache/a1b2c3d4/
#    ├── (the obfuscated code)
#    └── .env.promptcape-pointer    (text file, no values)

cd ~/.promptcape/cache/a1b2c3d4
promptcape run streamlit run dashboard.py
# 1. reads ~/projects/my-streamlit-app/.env
# 2. spawns `streamlit run dashboard.py` in cwd=workspace
# 3. child environment: OS env + DATABASE_URL=postgres://... + SECRET_KEY=hunter2 + ACTIVITY_MONTHS=6
# 4. streamlit starts normally; os.getenv('DATABASE_URL') returns the real value

For pytest, the same shape:

promptcape run pytest -v
# tests run against the obfuscated source, with real env vars injected at child launch

For Java apps using Spring Boot's relaxed binding (DATABASE_PASSWORD env var overrides database.password property), the SAME command works without any extra plumbing:

promptcape run mvn spring-boot:run
# Spring Boot gives OS environment variables higher precedence than application.properties.
# The sanitized application.properties in the workspace has database.password=REDACTED.
# The OS env var DATABASE_PASSWORD=real overrides it. App starts with real credentials.
# No .env ever copied to the workspace.

Three properties this gets right:

Secrets never touch the AI-visible workspace directory. An AI tool with file-read access can grep ~/.promptcape/cache/<hash> all it wants — there are no values to find.
Explicit failure mode. If the developer runs pytest directly (without promptcape run), the app starts without the .env values and fails at the first os.environ['REQUIRED_KEY'] lookup, or at the first use of the None that os.getenv('REQUIRED_KEY') returns. That's loud, it's traceable, and it's correct: they're missing the wrapper.
No code change in the user's project. load_dotenv() calls in the user's code become a graceful no-op (no .env to find), but os.getenv('KEY') finds the value in the child environment. The framework's startup path is unchanged.

The downside: developers have to remember to use promptcape run. The mitigation is documentation (.env.promptcape-pointer is the first place they look when something doesn't read env vars), the proxy/Cursor-terminal integration (which can wrap launches automatically), and a clear failure message when the wrapper is forgotten.

The complete Python cycle

1. pytest                              -> GREEN (source is healthy)
2. promptcape obfuscate --verify       -> Obfuscated workspace created
                                          .env NOT copied; pointer file written
3. promptcape run pytest               -> GREEN (workspace runs with real env vars
                                          injected at subprocess launch)
4. AI modifies obfuscated code
5. promptcape run pytest               -> GREEN (AI changes work in the runtime)
6. promptcape apply                    -> Changes applied to source
7. pytest                              -> GREEN (de-obfuscated changes work)

Each step has a specific failure mode:

Step 2 → 3: if the workspace fails to run, it's almost always a framework-name collision the detectors missed. The fix is to grep the obfuscated workspace for the obfuscated identifier (grep -rn "mtd_098fd2b6" .) to see what real name the AI sees in context, then add it to the relevant detector. Real-world examples that surfaced this way: cursor.close() (sqlite3 Cursor method), today.year (datetime.date attribute), df.value_counts().to_dict() (pandas chain), engine.connect() (SQLAlchemy lifecycle). Each got added to the protected list once.
Step 5 → 6: the AI invented an obfuscated name that's not in the registry. The reverse-apply step has a hash-resolver that maps Cls_e5f6a7b8 patterns back to known real names. Same mechanism as Java.
Step 6 → 7: rare — usually means the AI introduced a syntax error. The pre-apply --compile-gate check (which for Python is the static import verifier) catches most of these.
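For the step 5 → 6 case, a reverse-mapping pass might look like the sketch below. The registry contents are invented for illustration (the article does not say which real name mtd_098fd2b6 maps to), the pattern assumes the prefix-plus-eight-hex-digits shape seen in the examples, and reporting unknown names instead of guessing is my assumption about a sensible behaviour, not a description of PromptCape's resolver.

```python
import re

# Assumed shape of obfuscated identifiers: prefix + 8 lowercase hex digits.
OBFUSCATED_NAME = re.compile(r"\b(?:Cls|mtd|fld)_[0-9a-f]{8}\b")


def restore_names(text: str, registry: dict[str, str]) -> tuple[str, set[str]]:
    """Map known obfuscated names back to real ones; collect the unknown ones."""
    unknown: set[str] = set()

    def _swap(match: re.Match) -> str:
        name = match.group(0)
        if name in registry:
            return registry[name]
        unknown.add(name)  # probably invented by the AI, needs a human look
        return name

    return OBFUSCATED_NAME.sub(_swap, text), unknown


# Hypothetical registry entries, for illustration only.
registry = {"Cls_e5f6a7b8": "OdooClient", "mtd_098fd2b6": "connect"}
edited = (
    "client = Cls_e5f6a7b8()\n"
    "client.mtd_098fd2b6()\n"
    "client.mtd_deadbeef()\n"
)

restored, unknown = restore_names(edited, registry)
print(restored)
print(sorted(unknown))
```

Surfacing the unknown set before applying changes keeps an invented identifier from landing in the source tree unnoticed.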

A note on what this does NOT protect

It is worth being explicit about the threat boundary, because Python's source-distributed nature makes the question come up naturally: if my distributed Python app ships as .py files anyone can read, why bother obfuscating it for the AI in the first place?

The answer is that those are two different threats living in two different lifecycle stages.

Threat	When	Who reads the source	What protects
AI-provider transit	Development sessions (Claude Code, Cursor, Aider…)	Anthropic / OpenAI / Mistral on their servers	PromptCape — obfuscate before sending, reverse-map the reply
End-user inspection	After product release	Anyone who installs the `.py`, `.pyc`, or PyInstaller bundle	Native compilation (Nuitka, Cython), commercial obfuscators (PyArmor), or SaaS-only deployment

PromptCape's obfuscated workspace lives in ~/.promptcape/cache/<hash>/ on the developer's own machine, only during AI sessions. It never ships with the product. After promptcape apply, the developer's source tree is back to real names. Whatever the developer builds and distributes is independent of whether they used PromptCape that day or not.

The two layers are also independent in the opposite direction: a Nuitka-compiled binary doesn't help the developer at all while they're prompting Claude with their real source code — that's not when end users are looking, that's when the AI provider's logs are being written. A developer who needs both protections uses both: PromptCape during development, Nuitka at release. The combination covers the full lifecycle.

Specifically on Python distribution effort levels:

.py files: readable as-is. Zero effort.
.pyc-only (python -m compileall): decompiles cleanly in seconds with decompyle3 or uncompyle6 on the Python versions those tools support.
PyInstaller / cx_Freeze / py2exe: embed .pyc inside a bundle that pyinstxtractor cracks open in 5–10 minutes.
PyArmor (commercial): custom encrypted loader. Hours-to-days of reverse-engineering effort depending on the obfuscation level chosen.
Nuitka or Cython: compile Python through C to a real native binary. Days-to-weeks of effort for a determined reverser. The strongest open-source option.
SaaS / cloud-only: the only mathematically tight answer — if the code never leaves your servers, no one can read it on their disk.

This isn't a Python-specific issue. Java has the same shape — .class files in a .jar decompile cleanly with jd-gui / CFR / Procyon, and the traditional answer is ProGuard or R8 name-mangling at release-build time, which is conceptually identical to what PromptCape does at AI-session time but applied at a different lifecycle point. The two layers don't replace each other; a Java product that ships obfuscated bytecode AND uses PromptCape during development covers both transit and distribution leaks.

Conclusion

Python obfuscation for AI assistants is not a port of the Java pipeline. The fundamental shift — the developer runs the workspace, not just reads it — changes every layer: what you protect (name contracts, not just identifiers), how you verify (file existence, not compilation), and how you handle secrets (inject at subprocess launch, never on disk).

The three insights from building this:

Names that double as strings are the hard cases, and they're proactive-only. No compile error catches def index() → def mtd_xxx() when Flask looks up the endpoint string "blog.index". The detector has to know the framework's discovery rules in advance.
A silent verbatim fallback hides bugs for days. If a file fails to obfuscate, the engine must surface that loudly. The .env.promptcape-pointer and the verbatim-detection grep snippet exist precisely because the failure mode is silent otherwise.
.env doesn't need to be on disk in the workspace. The wrapper-injection pattern (promptcape run) gives the workspace real values at runtime without ever writing them to the AI-readable directory. This is the load-bearing pattern that makes the rest of the security story coherent: if the developer can run the workspace and the secrets never leave the source project, an AI assistant reading the workspace has no credentials to find.

PromptCape ships open for trial at https://promptcape.com/ — free for 3 months, no credit card required. The Python pipeline, the 16 framework detectors, and the promptcape run wrapper ship in the same JAR as the Java pipeline; the language is auto-detected from the source tree.
