<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Labdev</title>
    <description>The latest articles on DEV Community by Labdev (@labdev_c81554ba3d4ae28317).</description>
    <link>https://dev.to/labdev_c81554ba3d4ae28317</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3662040%2Fb38512d7-fc99-45fc-9ad4-3eb03dd1b502.png</url>
      <title>DEV Community: Labdev</title>
      <link>https://dev.to/labdev_c81554ba3d4ae28317</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/labdev_c81554ba3d4ae28317"/>
    <language>en</language>
    <item>
      <title>PyTorch Models Are Executables: Why I Built a Scanner to Stop "Pickle Bombs"</title>
      <dc:creator>Labdev</dc:creator>
      <pubDate>Tue, 16 Dec 2025 15:58:43 +0000</pubDate>
      <link>https://dev.to/labdev_c81554ba3d4ae28317/pytorch-models-are-executables-why-i-built-a-scanner-to-stop-pickle-bombs-92i</link>
      <guid>https://dev.to/labdev_c81554ba3d4ae28317/pytorch-models-are-executables-why-i-built-a-scanner-to-stop-pickle-bombs-92i</guid>
      <description>&lt;p&gt;&lt;strong&gt;The Lie We Tell Ourselves&lt;/strong&gt;&lt;br&gt;
If you work in DevOps or Security, you probably scan your &lt;code&gt;requirements.txt&lt;/code&gt; religiously. You use Snyk or Dependabot to catch that one vulnerable version of &lt;code&gt;requests&lt;/code&gt;. You feel safe.&lt;/p&gt;

&lt;p&gt;Then, you turn around and let your Data Science team download a 5GB &lt;code&gt;.pt&lt;/code&gt; (PyTorch) file from a random Hugging Face repository and load it directly into your production environment.&lt;/p&gt;

&lt;p&gt;We treat AI models like "Data": inert, harmless static files.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They are not. They are Executables.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Technical Reality: What is a &lt;code&gt;.pt&lt;/code&gt; file?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A standard PyTorch model file is actually a Zip archive. Inside that archive, the model weights and architecture are serialized using Python's &lt;code&gt;pickle&lt;/code&gt; module. The pickle module is famous for one thing in the security world: &lt;strong&gt;Remote Code Execution (RCE)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When you run &lt;code&gt;torch.load('model.pt')&lt;/code&gt;, the unpickler parses a stream of opcodes. If that stream contains instructions to import &lt;code&gt;os&lt;/code&gt; and run &lt;code&gt;system()&lt;/code&gt;, your machine executes it. Instantly. No sandbox. No warning.&lt;/p&gt;

&lt;p&gt;This isn't theoretical. Here is how easy it is to create a "Malicious Model" in a few lines of Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import torch
import os

class Malicious(object):
    def __reduce__(self):
        return (os.system, ("rm -rf / --no-preserve-root",))

torch.save(Malicious(), "bert_finetune.pt")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you load that file, your server is gone.&lt;/p&gt;
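&lt;p&gt;To see exactly what the unpickler is fed, you can disassemble such a payload with the standard library's &lt;code&gt;pickletools&lt;/code&gt;. This is a sketch with the destructive command swapped for a harmless &lt;code&gt;echo&lt;/code&gt;; disassembly never executes anything:&lt;/p&gt;

```python
import os
import pickle
import pickletools

class Payload:
    def __reduce__(self):
        # Same trick as above, but with a harmless command
        return (os.system, ("echo pwned",))

blob = pickle.dumps(Payload())

# Print the opcode stream without running it
pickletools.dis(blob)
```

&lt;p&gt;The disassembly shows the function's real name (&lt;code&gt;posix.system&lt;/code&gt; on Linux, &lt;code&gt;nt.system&lt;/code&gt; on Windows) pushed by string opcodes and resolved by &lt;code&gt;STACK_GLOBAL&lt;/code&gt;, then a &lt;code&gt;REDUCE&lt;/code&gt; opcode: "call this function now."&lt;/p&gt;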

&lt;p&gt;&lt;strong&gt;Why Standard SCA Tools Fail&lt;/strong&gt;&lt;br&gt;
Most teams rely on Software Composition Analysis (SCA) tools like Dependabot or standard SBOM generators. These tools work by matching library versions in &lt;code&gt;requirements.txt&lt;/code&gt; against databases of known vulnerabilities (CVEs).&lt;/p&gt;

&lt;p&gt;This approach fails for AI models because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Not a Library:&lt;/strong&gt; A custom model file isn't a "package" with a version number. It's a binary blob.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Not a CVE:&lt;/strong&gt; A Pickle Bomb isn't a public vulnerability; it's a custom exploit payload hidden inside the file structure.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Blind Spot:&lt;/strong&gt; You get a clean security report for your dependencies, while a 5GB RCE payload sits undetected in your Docker container.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Introducing AIsbom&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I realized we needed a tool that respects the unique nature of AI artifacts. We don't just need to scan the wrapper; we need to scan the brain.&lt;/p&gt;

&lt;p&gt;So, I built AIsbom.&lt;/p&gt;

&lt;p&gt;It is an open-source CLI that performs &lt;strong&gt;Deep Binary Introspection&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;It Unzips the Artifact:&lt;/strong&gt; It parses &lt;code&gt;PyTorch/Safetensors&lt;/code&gt; structures in memory (without loading the heavy weights).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It Disassembles the Bytecode:&lt;/strong&gt; It uses &lt;code&gt;pickletools&lt;/code&gt; to iterate over the opcode stream.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It Detects the Bomb:&lt;/strong&gt; It flags dangerous globals like &lt;code&gt;os.system&lt;/code&gt;, &lt;code&gt;subprocess&lt;/code&gt;, &lt;code&gt;eval&lt;/code&gt;, and &lt;code&gt;socket&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
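&lt;p&gt;The core idea is small enough to sketch. This is &lt;em&gt;not&lt;/em&gt; AIsbom's actual implementation, just a minimal illustration of opcode-level scanning: walk the stream with &lt;code&gt;pickletools.genops&lt;/code&gt; and collect every global the unpickler would import, without ever executing the payload:&lt;/p&gt;

```python
import pickletools

# Modules whose mere presence in a model file deserves a CRITICAL flag
DANGEROUS_MODULES = {"os", "posix", "nt", "subprocess", "socket", "builtins"}

def iter_globals(pickle_bytes):
    """Yield (module, name) for every global the pickle stream would import."""
    strings = []  # recent string pushes, consumed by STACK_GLOBAL
    for op, arg, pos in pickletools.genops(pickle_bytes):
        if op.name in ("SHORT_BINUNICODE", "BINUNICODE", "UNICODE"):
            strings.append(arg)
        elif op.name == "GLOBAL":
            # Older protocols: "module name" packed into one argument
            module, name = arg.split(" ", 1)
            yield module, name
        elif op.name == "STACK_GLOBAL":
            # Protocol 4+: the last two pushed strings
            # (simplified; a real scanner also tracks the memo)
            yield strings[-2], strings[-1]

def risk_level(pickle_bytes):
    hits = [f"{mod}.{name}" for mod, name in iter_globals(pickle_bytes)
            if mod in DANGEROUS_MODULES]
    return ("CRITICAL", hits) if hits else ("LOW", [])
```

&lt;p&gt;For a real &lt;code&gt;.pt&lt;/code&gt; file you would first open it with &lt;code&gt;zipfile&lt;/code&gt; and feed each embedded &lt;code&gt;*.pkl&lt;/code&gt; member to the scanner; the heavy weight tensors live in separate archive members and never need to be read.&lt;/p&gt;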

&lt;p&gt;&lt;strong&gt;How to check your models right now&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can install the CLI from PyPI:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pip install aisbom-cli&lt;br&gt;
&lt;/code&gt;&lt;br&gt;
Then, point it at your model folder:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;aisbom scan ./my-models&lt;br&gt;
&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;You'll get a risk assessment like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Filename           | Framework | Risk Level
-------------------|-----------|-------------------------------------

bert_finetune.pt   | PyTorch   | CRITICAL (RCE Detected: posix.system)
safe_model.st      | SafeTensors | LOW (Binary Safe)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Future is SafeTensors (But we aren't there yet)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The industry is moving toward &lt;code&gt;.safetensors&lt;/code&gt;, a format that stores a plain JSON header followed by raw tensor data, so loading it never executes code. But millions of legacy &lt;code&gt;.pt&lt;/code&gt; files still exist, and developers still download them.&lt;/p&gt;
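&lt;p&gt;Why is that format safe? Because parsing it is pure data handling: an 8-byte little-endian length, a JSON header, then raw bytes. Here is a hand-rolled header reader (a standard-library sketch, not the official &lt;code&gt;safetensors&lt;/code&gt; API):&lt;/p&gt;

```python
import json
import struct

def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without touching weights."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian length of the JSON header
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))
```

&lt;p&gt;Nothing in that code path can execute attacker-controlled logic; the worst a hostile file can do is raise a JSON parse error.&lt;/p&gt;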

&lt;p&gt;Until everyone migrates, we need guardrails.&lt;/p&gt;

&lt;p&gt;I open-sourced this tool because I believe AI Supply Chain Security shouldn't be a black box.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/Lab700xOrg/aisbom" rel="noopener noreferrer"&gt;https://github.com/Lab700xOrg/aisbom&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Live Demo:&lt;/strong&gt; &lt;a href="https://aisbom.io" rel="noopener noreferrer"&gt;https://aisbom.io&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let me know what you think. I'm actively looking for edge cases in Pickle protocols to make the detection even more robust.&lt;/p&gt;

</description>
      <category>security</category>
      <category>python</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
