Labyrinx

Posted on Jun 18 • Edited on Jul 1 • Originally published at labyrinx-dev.github.io

How to Protect Python Code in 2026 — Obfuscation, Compilation, and Encryption That Actually Works

#python #security #tutorial #cython

Python ships as readable source code. Even when bundled into an EXE, well-known extraction tools recover your source in under a minute. If you're selling desktop Python software, distributing internal tools, or shipping code you'd rather not hand over as plaintext — you need protection.

After spending months building a Python code protection tool (Labyrinx), here's what I learned about what actually works — and what doesn't.

The 30-Second Attack

# Attacker recovers your source in under a minute:
> pyinstxtractor your_app.exe                  # extracts .pyc bytecode into your_app.exe_extracted/
> uncompyle6 your_app.exe_extracted/main.pyc   # decompiles to readable Python (pycdc for newer bytecode)
# Output: near-original source. Names, strings, docstrings. Only comments are lost.

This isn't a Python flaw. Interpreted languages are inherently open. Protection means adding layers that each make reverse engineering harder.
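To see why decompilers recover so much, here is a minimal sketch that compiles a small function and inspects what the bytecode still carries. The source string and the `collect` helper are illustrative assumptions, reusing the discount function from the Level 1 example further down; only standard-library calls are used.

```python
import marshal

# Hypothetical module source, reusing this article's discount example.
SOURCE = '''
def calculate_discount(price, user_tier):
    # this comment will not survive compilation
    if user_tier == "premium":
        return price * 0.8
    return price
'''

def collect(code, names, consts):
    """Walk a code object and gather identifiers and string constants."""
    names.update(code.co_names)
    names.update(code.co_varnames)
    names.add(code.co_name)
    for const in code.co_consts:
        if isinstance(const, str):
            consts.add(const)
        elif hasattr(const, "co_code"):  # nested function or class body
            collect(const, names, consts)

# compile() + marshal.dumps() is roughly what a .pyc holds after its header.
module_code = compile(SOURCE, "main.py", "exec")
blob = marshal.dumps(module_code)

names, consts = set(), set()
collect(marshal.loads(blob), names, consts)

print(sorted(n for n in names if not n.startswith("<")))
print(sorted(consts))
```

The function name, parameter names, and the `"premium"` literal all come back out of the serialized bytes. The comment does not, because the compiler discards comments before bytecode is produced.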

The Four Levels of Python Code Protection

Level 1: Obfuscation (Free)

Rename every variable, function, and class to random tokens. Strip comments and docstrings. The code runs identically but reads like gibberish.

# Before
def calculate_discount(price, user_tier):
    if user_tier == "premium":
        return price * 0.8

# After name obfuscation (the string literal is also swapped for a table lookup)
def _x7f3a(_a1, _a2):
    if _a2 == _l_s(0x17):
        return _a1 * 0.8

Stops: Casual reading. A competitor opening your .py file.
Doesn't stop: Anyone with a debugger. Names are cosmetic.
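A renaming pass of this kind can be sketched with the standard `ast` module. This is a simplified illustration, not how Labyrinx does it: the `Renamer` class and the `_x0000`-style tokens are assumptions, and the sketch ignores globals, attributes, imports, and keyword arguments at call sites. The input adds a docstring and a fallback `return price` to the example above so both behaviors are visible.

```python
import ast

SOURCE = '''
def calculate_discount(price, user_tier):
    """Apply the premium discount."""
    if user_tier == "premium":
        return price * 0.8
    return price
'''

class Renamer(ast.NodeTransformer):
    """Rename function names and parameters; drop docstrings (sketch only)."""

    def __init__(self):
        self.mapping = {}

    def _alias(self, name):
        # Counter-based tokens keep the demo deterministic;
        # a real tool would likely use random ones.
        if name not in self.mapping:
            self.mapping[name] = f"_x{len(self.mapping):04x}"
        return self.mapping[name]

    def visit_FunctionDef(self, node):
        node.name = self._alias(node.name)
        for arg in node.args.args:
            arg.arg = self._alias(arg.arg)
        # Strip a leading docstring, but never leave an empty body.
        if ast.get_docstring(node) is not None and len(node.body) > 1:
            node.body = node.body[1:]
        self.generic_visit(node)
        return node

    def visit_Name(self, node):
        # Only rewrite names that were registered for renaming.
        if node.id in self.mapping:
            node.id = self.mapping[node.id]
        return node

renamer = Renamer()
tree = ast.fix_missing_locations(renamer.visit(ast.parse(SOURCE)))
obfuscated = ast.unparse(tree)  # requires Python 3.9+
print(obfuscated)

# The renamed function still behaves the same.
namespace = {}
exec(obfuscated, namespace)
new_name = renamer.mapping["calculate_discount"]
print(namespace[new_name](100, "premium"))
```

The string literal `"premium"` survives untouched, which is why renaming alone leaves plenty for an attacker to search for.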

Level 2: Compilation to Native Code

Python → C translation → C compiler → .pyd (native x64 shared library). No bytecode to decompile. An attacker gets assembly, not Python.

# Native compilation example (Windows, MSVC)
cython my_module.pyx                          # Python → C
cl.exe /LD my_module.c /I Python313\include /link /LIBPATH:Python313\libs /OUT:my_module.cp313-win_amd64.pyd
# Output: my_module.cp313-win_amd64.pyd  ← machine code, not bytecode

Stops: pycdc, uncompyle6, every Python decompiler.
Doesn't stop: strings.exe finds every string literal. An experienced reverse engineer can trace the assembly.
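The string leak is easy to reproduce. Below is a minimal stand-in for what strings.exe does: scan raw bytes for runs of printable ASCII. The `fake_binary` blob and the URL inside it are made up for illustration; a real .pyd would be far larger, but the scan works the same way.

```python
import re

def printable_strings(data: bytes, min_len: int = 4):
    """Yield runs of printable ASCII at least min_len long, like strings.exe."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    for match in re.finditer(pattern, data):
        yield match.group().decode("ascii")

# Stand-in for a compiled .pyd: opaque bytes with two literals embedded.
fake_binary = (
    b"\x4d\x5a\x90\x00\x03\x00\x00\x00"
    + b"https://api.example.com/v1/license\x00"
    + b"\x48\x8b\x05\x00\x10\x00\x00\xc3"
    + b"premium\x00"
    + b"\x00\xff\xfe\x01"
)

found = list(printable_strings(fake_binary))
print(found)
```

Native compilation changes the instructions, not the data: string literals are still stored byte for byte in the output file.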

Level 3: String and Module Encryption (AES-256)

Encrypt every string literal so strings.exe finds nothing. Encrypt entire modules so they're opaque ciphertext on disk.

# Before encryption
> strings.exe my_app.pyd | grep "api_key"
sk-proj-4f8a3b2c1d9e6f7a8b3c2d1e9f6a7b8c  # ← found!

# After AES-256 string encryption
> strings.exe my_app.pyd | grep "api_key"
# Nothing. Every string encrypted at rest.

Stops: strings.exe and any other tool that scans the file on disk for plaintext literals.
Doesn't stop: A memory dump captured after decryption, or an attacker who locates the key and the decryption routine inside the binary.

Level 4: Code Virtualization (Custom VM)

Replace Python bytecode with a custom bytecode that only a custom VM understands. The VM's instruction set is randomized per build, so an opcode map recovered from one build does not decode the next.

Instead of import my_module loading standard Python bytecode, it loads instructions for a VM only Labyrinx can interpret. The opcode for "add" is 0x4F in build #1 and 0x93 in build #2.

Stops: Every standard reverse engineering tool. IDA Pro and Ghidra understand x64 — not Labyrinx VM bytecode.
Doesn't stop: A dedicated attacker who traces the VM interpreter itself. This takes weeks, not seconds.
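Per-build opcode randomization can be illustrated with a toy stack machine. This is a sketch under stated assumptions, not Labyrinx's VM: the four-instruction set, `make_opcode_table`, `assemble`, and `run` are all hypothetical, and the build seed stands in for whatever randomness a real build would use.

```python
import random

# Logical instruction set of a hypothetical stack VM.
INSTRUCTIONS = ["PUSH", "ADD", "MUL", "HALT"]

def make_opcode_table(build_seed: int) -> dict:
    """Assign each instruction a random, unique byte value for this build."""
    rng = random.Random(build_seed)
    values = rng.sample(range(256), len(INSTRUCTIONS))
    return dict(zip(INSTRUCTIONS, values))

def assemble(program, table) -> bytes:
    """Encode (instruction, operand) pairs using this build's opcode table."""
    out = bytearray()
    for name, operand in program:
        out.append(table[name])
        if name == "PUSH":
            out.append(operand)  # single-byte operand keeps the sketch small
    return bytes(out)

def run(bytecode: bytes, table) -> int:
    """Interpret bytecode; only the matching table decodes it correctly."""
    decode = {value: name for name, value in table.items()}
    stack, pc = [], 0
    while True:
        name = decode[bytecode[pc]]
        pc += 1
        if name == "PUSH":
            stack.append(bytecode[pc])
            pc += 1
        elif name == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif name == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif name == "HALT":
            return stack.pop()

# (2 + 3) * 4
PROGRAM = [
    ("PUSH", 2), ("PUSH", 3), ("ADD", None),
    ("PUSH", 4), ("MUL", None), ("HALT", None),
]

table_1 = make_opcode_table(build_seed=1)
table_2 = make_opcode_table(build_seed=2)
code_1 = assemble(PROGRAM, table_1)
code_2 = assemble(PROGRAM, table_2)

print(code_1.hex(), run(code_1, table_1))
print(code_2.hex(), run(code_2, table_2))
```

Both builds compute the same result from different byte sequences. Note that in this sketch only the encoding changes between builds; the interpreter loop itself has the same shape, which is why tracing the interpreter remains the viable attack.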

The Stack in Practice: Labyrinx

I built Labyrinx to chain all four levels:

Source → Name Obfuscation → Control Flow Flattening → String Encryption
  → Module Encryption → Cython → MSVC → .pyd → Custom VM → Output Folder

The output is a self-contained folder:

MyApp/
├── MyApp.exe              ← 30 KB launcher
├── python313.dll          ← embedded Python (no install)
├── my_app.pyd             ← your code (6 layers deep)
└── Lib/site-packages/     ← dependencies

Zip it and ship it. No Python install required on the target machine.
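Control flow flattening, the second stage in the pipeline above, is the one step not covered by the four levels. A hand-written sketch of the idea: the branches of a function become numbered states inside a single dispatcher loop, so the original if/else shape is no longer visible. The state numbers and the `_flat` function are illustrative assumptions, and a fallback `return price` is added to the earlier example so both branches return a value.

```python
def calculate_discount(price, user_tier):
    """Original, structured control flow."""
    if user_tier == "premium":
        return price * 0.8
    return price

def calculate_discount_flat(price, user_tier):
    """Same logic as a dispatcher loop over numbered states.

    State numbers are arbitrary; a hypothetical tool would randomize them
    and usually obscure how the next state is computed.
    """
    state = 0x3A
    result = None
    while state != 0x00:
        if state == 0x3A:      # entry: evaluate the branch condition
            state = 0x91 if user_tier == "premium" else 0x5C
        elif state == 0x91:    # "premium" branch body
            result = price * 0.8
            state = 0x00
        elif state == 0x5C:    # fall-through branch body
            result = price
            state = 0x00
    return result

# Both versions agree on every input tried here.
for args in [(100, "premium"), (100, "basic")]:
    assert calculate_discount(*args) == calculate_discount_flat(*args)
print(calculate_discount_flat(100, "premium"))
```

After native compilation, a disassembler shows one loop with a jump table instead of a readable branch structure, which slows down manual tracing.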

Why Folders, Not Single EXEs

I used single-file EXE bundlers for years. Every other release, customers emailed: "Your app is a virus!" Antivirus heuristics flag packed EXEs. Labyrinx outputs a clean folder — plain .pyd files and a tiny launcher. No packing, no temp extraction, no false positives.

How Labyrinx Compares to Other Approaches

Feature	Labyrinx (Enterprise)	Bytecode Obfuscation	C-to-Native Compilation	Transpiler EXE
Name obfuscation	✅	✅	❌	❌
String encryption	✅ AES-256	✅	❌	❌
Native compilation	✅ .pyd	❌ bytecode	✅ .pyd	✅ EXE
Code VM	✅ randomized	❌	❌	❌
Anti-debug	✅ multi-point	❌	❌	❌
License system	✅ built-in	Varies	❌	❌
AV-friendly	✅	⚠️ (packed EXE)	✅	⚠️
Pricing	$29/mo	$49-$199	Free	Free

Full comparison: labyrinx-dev.github.io/compare/

What No Tool Can Do

Nothing is unbreakable. The CPU must execute your logic eventually, and a debugger can trace it. The realistic goal is making reverse engineering more expensive than rewriting:

Raw .py: 0 seconds
Bytecode obfuscation (free tier): A few hours (known decryptor approach)
C-to-native compilation: A few days (assembly, not bytecode)
Labyrinx Enterprise: Weeks+ (layered: native code + AES-256 strings + flattened control flow + encrypted modules + custom VM + anti-debug + integrity hashes)

For nearly all potential attackers, that's enough.

Getting Started

Free: Download Labyrinx — Freemium mode includes name obfuscation + Cython compilation at no cost
Pro ($9/mo): String encryption, module encryption, license system
Enterprise ($29/mo): AES-256, custom VM, anti-debug, PYD integrity hashes

Website: labyrinx-dev.github.io
Comparison guide: labyrinx-dev.github.io/compare/
Email: labyrinx@yahoo.com

What protection strategies have you tried for shipping Python apps? What worked and what didn't?

Top comments (1)

kg8888 • Jun 18

Solid defense-in-depth breakdown. The "each level alone is beatable, combined they're a real barrier" framing is the right one.

Couple of practical questions from someone who's shipped Python tools:

Cython → pyd compatibility: have you hit edge cases with asyncio or dynamic attribute access (getattr / setattr) after compilation? Those are the two things that consistently break when I Cythonize. Cython's binding directive helps but never fully.

Performance: what's the runtime overhead of the AES string decryption layer? Naively decrypting strings on every access kills perf. I assume you're doing lazy decryption with a cached plaintext pool?

CI/CD integration: how do you handle the C compiler dependency in CI? Cython → MSVC/GCC is the biggest friction point for teams that aren't set up for native builds. Any container image you recommend?

The VM layer (randomized opcodes per build) is the most interesting bit. That's the kind of asymmetry that actually frustrates determined reverse engineers — they can't reuse any work between builds. Have you measured how much slowdown the VM introduces for CPU-bound code paths?