DEV Community

Nishal K


How I Built a Memory-Safe Steganography Engine in Rust to Protect Data from AI Scrapers

As AI models scale, data provenance is becoming a massive engineering challenge. Automated web scrapers are vacuuming up datasets without any regard for creator licenses or intellectual property.

I wanted to build a mathematical solution to this problem, so I architected Sigil: a zero-knowledge cryptographic vault that embeds verifiable, HMAC-SHA256 signed ownership IDs directly into the pixels of an image.

While the desktop vault (built with Tauri, Svelte, and an offline SQLite daemon) is strictly closed-source to protect the cryptographic keys, I realized that "Security by Obscurity" isn't enough. If AI companies don't know how to read the hidden IDs, they will just scrape the images anyway.

So, I open-sourced the extraction layer. Here is a deep dive into how I used Rust to build a memory-safe Least Significant Bit (LSB) steganography reader.

The Concept: LSB Steganography

Every pixel in a standard image is made of Red, Green, and Blue channels. Each channel is represented by a byte (8 bits), with values ranging from 0 to 255.

If you change the Least Significant Bit (the last 1 or 0 in that byte), the channel value shifts by at most 1 out of 255, a change that is invisible to the human eye. Those hidden bits can then store a secret payload, such as a 32-byte cryptographic ID.
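
To make the idea concrete, here is a minimal sketch of reading and writing a single LSB. The helper names `get_lsb` and `set_lsb` are hypothetical and are not part of Sigil. The embedding side of Sigil is closed-source, so this only illustrates the general technique, not the actual embedder.

```rust
/// Returns the least significant bit of a channel byte (0 or 1).
fn get_lsb(channel: u8) -> u8 {
    channel & 1
}

/// Returns `channel` with its least significant bit replaced by `bit`.
/// Only the lowest bit of `bit` is used; the upper 7 bits of `channel`
/// are left untouched, so the value changes by at most 1.
fn set_lsb(channel: u8, bit: u8) -> u8 {
    (channel & 0b1111_1110) | (bit & 1)
}
```

For example, `set_lsb(200, 1)` turns `0b1100_1000` into `0b1100_1001` (201), and `get_lsb(201)` reads the hidden `1` back out.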

The Rust Implementation

To keep this fast and memory-safe, I used Rust's image crate to decode the pixels. Here is the open-source reference implementation for extracting the payload:

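/// Extracts a hidden Sigil Cryptographic ID from an image's LSB layer.
///
/// Reads one bit from each R, G, and B channel (alpha is skipped) and
/// returns the first `expected_id_len` bytes as a hex string.
pub fn verify_steganography(path: &str, expected_id_len: usize) -> Result<String, String> {
    let img = image::open(path).map_err(|e| e.to_string())?.to_rgba8();
    let needed_bits = expected_id_len * 8;
    let mut bits = Vec::with_capacity(needed_bits);

    // 1. Extract the Least Significant Bits, stopping once we have enough
    'pixels: for pixel in img.pixels() {
        for channel in 0..3 { // Iterate over R, G, B
            if bits.len() == needed_bits {
                break 'pixels;
            }
            // The bitwise AND operator isolates the final bit
            bits.push(pixel[channel] & 1);
        }
    }

    // An image with too few pixels cannot hold a complete ID
    if bits.len() < needed_bits {
        return Err(format!(
            "image too small: need {} bits, found {}",
            needed_bits,
            bits.len()
        ));
    }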

    // 2. Reconstruct the bytes (chunks_exact skips any incomplete trailing group)
    let mut extracted_bytes = Vec::with_capacity(expected_id_len);
    for chunk in bits.chunks_exact(8) {
        let mut byte = 0u8;
        for (i, &bit) in chunk.iter().enumerate() {
            // Shift the bit back into its correct position
            byte |= bit << (7 - i);
        }
        extracted_bytes.push(byte);
    }

    // 3. Return the Hex String
    Ok(hex::encode(extracted_bytes))
}
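
For unit testing without image files, the same logic can be sketched over a raw pixel buffer. `extract_lsb_payload` is a hypothetical helper, not part of the published extractor, and it assumes a tightly packed buffer of 4 bytes per pixel in R, G, B, A order (the layout `to_rgba8()` produces).

```rust
/// Reads `payload_len` bytes from the LSBs of the R, G, B channels of an
/// RGBA buffer. Returns `None` if the buffer holds too few pixels.
/// Hypothetical helper for illustration; assumes 4 bytes per pixel.
fn extract_lsb_payload(rgba: &[u8], payload_len: usize) -> Option<Vec<u8>> {
    let needed_bits = payload_len * 8;

    // Take the first three bytes (R, G, B) of every 4-byte pixel; skip alpha.
    let bits: Vec<u8> = rgba
        .chunks_exact(4)
        .flat_map(|px| px[..3].iter().map(|c| c & 1))
        .take(needed_bits)
        .collect();

    if bits.len() < needed_bits {
        return None;
    }

    // Pack each group of 8 bits into one byte, most significant bit first.
    Some(
        bits.chunks_exact(8)
            .map(|chunk| chunk.iter().fold(0u8, |byte, &bit| (byte << 1) | bit))
            .collect(),
    )
}
```

Three pixels carry nine usable bits, so a buffer such as `[0, 1, 0, 255, 0, 0, 0, 255, 0, 1, 0, 255]` is just enough to hold one payload byte; the alpha values are never read.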

Breaking Down the Bitwise Math

The magic happens in two specific lines of code:

  • pixel[channel] & 1: This is a bitwise AND operation. Masking the channel byte with 00000001 clears the upper 7 bits and isolates only the final bit. If the channel value is even, the result is 0. If it is odd, the result is 1. We push this bit into our Vec.
  • byte |= bit << (7 - i): Once we have 8 hidden bits, we need to stitch them back into a single byte. The left-shift operator (<<) moves each bit into its slot, with the first extracted bit landing in position 7 (the most significant) and the last in position 0, and the bitwise OR assignment (|=) merges it into the final u8.
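
As a quick check on the shift arithmetic, the sketch below packs eight bits most-significant-first and unpacks them again. Both helper names are hypothetical and exist only for illustration.

```rust
/// Packs 8 bits (each 0 or 1), most significant first, into one byte.
fn bits_to_byte(bits: [u8; 8]) -> u8 {
    let mut byte = 0u8;
    for (i, &bit) in bits.iter().enumerate() {
        // The bit at index 0 lands in position 7, index 7 in position 0.
        byte |= bit << (7 - i);
    }
    byte
}

/// Splits a byte into its 8 bits, most significant first.
fn byte_to_bits(byte: u8) -> [u8; 8] {
    let mut bits = [0u8; 8];
    for (i, slot) in bits.iter_mut().enumerate() {
        *slot = (byte >> (7 - i)) & 1;
    }
    bits
}
```

With this ordering, the bit sequence 0, 1, 0, 0, 0, 0, 0, 1 reassembles into 0x41, and `byte_to_bits(0x41)` returns the same eight bits.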

The Open Standard

Because this extraction logic is public, AI procurement teams can integrate this exact function into their scraping pipelines. If their crawler detects a payload, they know the asset carries a signed ownership ID and requires API clearance.

You can check out the full open-source extraction standard on GitHub, or visit the lightning-fast Astro-powered documentation site here:
https://nishal21.github.io/Sigil-extractor/

I’d love to hear your thoughts on the Rust implementation or LSB steganography in general!

https://github.com/nishal21/Sigil-extractor
