Hiding in the Bits: Mastering AMBTC Steganography with Combination Theory

#python #cybersecurity #steganography #researchtocode

In our previous explorations, we looked at Coverless Steganography—a method where we don't change a single pixel, but instead find images that already "look" like our data.

Today, we are shifting gears into the Compressed Domain. We are going to hide secret messages inside the mathematics of image compression: Absolute Moment Block Truncation Coding (AMBTC).

However, naively overwriting bits in a compressed file often leaves messy visual artifacts. This is why we add Combination Theory. By using a pseudo-random Matrix P to map our message to a block's parity, we can hide 4 bits of data in each 4x4 block by flipping at most one bit of its bitmap. This gives us high capacity while keeping the changes hard to see.

Academic Attribution 📚
This implementation is based on the 2023 research paper:
"High Imperceptible Data Hiding Method for AMBTC Compressed Images Based on Combination Theory" by Kurnia Anggriani, Shu-Fen Chiou, Nan-I Wu, and Min-Shiang Hwang.
DOI: 10.20944/preprints202304.0341.v1

🖼️ What is AMBTC?

Before we hide data, we have to compress the image. AMBTC works by dividing an image into 4x4 blocks. For each block, it calculates:

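High Quantizer (hq): The average value of the "bright" pixels, those on the high side of the block's mean intensity.
Low Quantizer (lq): The average value of the "dark" pixels, those on the low side of the block's mean.
Bitmap: A 16-bit grid of 0s and 1s recording which pixels are rebuilt as lq (0) and which as hq (1).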

This reduces a 16-pixel block (128 bits at 8 bits per pixel) to two 8-bit values and a 16-bit map (32 bits in total), a 4:1 compression ratio. It’s efficient, fast, and, most importantly, the bitmap gives us a convenient place to hide secrets.
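To make the three values concrete, here is a minimal pure-Python sketch of AMBTC on a single flattened 4x4 block. The function names `ambtc_encode` and `ambtc_decode` are my own, and treating pixels equal to the mean as "bright" is an assumed convention; the repository code may differ on both counts.

```python
def ambtc_encode(block):
    """Compress 16 grayscale pixels (a flattened 4x4 block) to (hq, lq, bitmap)."""
    mean = sum(block) / len(block)
    # Assumed convention: pixels at or above the mean count as "bright" (bit 1).
    bitmap = [1 if pixel >= mean else 0 for pixel in block]
    bright = [pixel for pixel, bit in zip(block, bitmap) if bit]
    dark = [pixel for pixel, bit in zip(block, bitmap) if not bit]
    hq = round(sum(bright) / len(bright))  # never empty: the max pixel is >= mean
    lq = round(sum(dark) / len(dark)) if dark else hq  # flat block: no dark pixels
    return hq, lq, bitmap


def ambtc_decode(hq, lq, bitmap):
    """Rebuild the block: every 1 becomes hq, every 0 becomes lq."""
    return [hq if bit else lq for bit in bitmap]


block = [
    12, 14, 200, 210,
    10, 16, 205, 215,
    11, 13, 198, 220,
    15, 21, 202, 208,
]
hq, lq, bitmap = ambtc_encode(block)
print(hq, lq)       # 207 14
print(bitmap[:4])   # [0, 0, 1, 1]
```

Decoding replaces all eight dark pixels with 14 and all eight bright pixels with 207, which is where the compression loss comes from.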

🎲 The Secret Sauce: Combination Theory

The "magic" happens in the bitmap. Standard steganography might just flip bits randomly, but that creates "salt and pepper" noise that's easy to spot. Combination Theory uses a Pseudo-Random Matrix P.

How it works:

Generate Matrix P: Using a secret Seed (our key), we create a 4x4 matrix filled with numbers 0–15 in a random order.
Calculate Parity: We compute a 4-bit parity value from the current bitmap, with Matrix P deciding how each bit position contributes.
The Flip: If the parity doesn't match our 4 secret message bits, we flip exactly one bit in the bitmap. If it already matches, we leave the block untouched.

Because we change at most one pixel in a 16-pixel block, the difference is hard for the human eye to spot, and the small number of changes gives statistical detectors less to work with.
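The three steps above can be sketched in a few lines of Python. One loud caveat: the post does not spell out the exact parity rule, so this sketch assumes an XOR rule (XOR together the Matrix P entries that sit under a 1 in the bitmap). That is one well-known way to get "4 bits for at most one flip", not necessarily the paper's exact mapping. The names `make_matrix_p`, `parity`, `embed`, and `extract` are hypothetical too.

```python
import random


def make_matrix_p(seed):
    """Return the numbers 0..15 in a seed-dependent order (Matrix P, flattened)."""
    values = list(range(16))
    random.Random(seed).shuffle(values)
    return values


def parity(bitmap, p):
    """Assumed rule: XOR the P entries at every position where the bitmap is 1."""
    result = 0
    for bit, value in zip(bitmap, p):
        if bit:
            result ^= value
    return result


def embed(bitmap, p, message):
    """Hide a 4-bit message (0..15) by flipping at most one bitmap bit."""
    diff = parity(bitmap, p) ^ message
    stego = list(bitmap)
    if diff:  # diff == 0 means the block already carries the message
        stego[p.index(diff)] ^= 1  # flipping this bit XORs diff into the parity
    return stego


def extract(bitmap, p):
    """The receiver rebuilds P from the shared seed and reads the parity back."""
    return parity(bitmap, p)


p = make_matrix_p(seed=2023)
cover = [0, 0, 1, 1] * 4
stego = embed(cover, p, message=0b1011)
flipped = sum(a != b for a, b in zip(cover, stego))
print(extract(stego, p) == 0b1011, flipped <= 1)  # True True
```

Note that `random.Random` is a Mersenne Twister, which is fine for a demo but is not a cryptographically secure generator.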

🧪 Benchmark: Performance Results

I tested this "Base Implementation" on several standard 512x512 grayscale images. Each image carried a full payload of 65,536 bits (16,384 blocks × 4 bits per block).

Image	MSE	PSNR (dB)	Capacity
Splash	14.6467	36.4734	65,536 bits
Lena	33.3143	32.9045	65,536 bits
Peppers	34.9454	32.6969	65,536 bits
Airplane	44.6739	31.6303	65,536 bits
Sailboat	76.0594	29.3193	65,536 bits
Baboon	142.7971	26.5836	65,536 bits
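If you want to sanity-check the table, the PSNR column follows from the MSE column through the standard formula for 8-bit images, PSNR = 10 · log10(255² / MSE), and the capacity is plain block arithmetic. The helper names below are illustrative, not taken from the repository.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def psnr(mse_value, peak=255):
    """PSNR in dB for a given MSE; peak is 255 for 8-bit grayscale."""
    if mse_value == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse_value)


blocks = (512 // 4) ** 2     # 16,384 blocks of 4x4 pixels
capacity_bits = blocks * 4   # 4 message bits per block

print(capacity_bits)                # 65536
print(round(psnr(14.6467), 2))      # Splash row: 36.47
print(round(psnr(142.7971), 2))     # Baboon row: 26.58
```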

Observations:

The Complexity Trade-off: Images with smooth surfaces (like Splash) achieve a much higher PSNR. Complex textures (like Baboon's fur) are harder to compress via AMBTC, which naturally lowers the baseline quality.
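Capacity is King: Regardless of the image, capacity stays fixed at 65,536 bits, because every block carries 4 bits no matter what it contains. Relative to the 64 KiB AMBTC file (16,384 blocks × 32 bits), that is one payload bit for every eight stored bits. This is the same ratio that 1-bit LSB embedding reaches on an uncompressed 8-bit image, but in a file a quarter of the size.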

🔍 Why Hide in the Compressed Domain?

Smaller Footprint: You aren't sending a massive raw PNG; you're sending compressed data that already looks "efficient."
Bit-Level Security: Without the correct Seed, an attacker cannot easily reconstruct Matrix P. They might see the modified bitmap, but they won't know how its bits map to the message. How strong this protection is depends on the size of the seed space and on the random generator used.
Minimal Distortion: Since we flip a maximum of 1 bit per block, the stego image stays very close to the plain AMBTC reconstruction.
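How much a single flip costs depends on the block. Assuming hq and lq are left unchanged after embedding, a flipped bitmap bit moves exactly one reconstructed pixel from lq to hq or back, so the error it adds (measured against the plain AMBTC reconstruction) is easy to estimate. The helper `flip_cost` is a hypothetical name used only for this back-of-the-envelope check.

```python
def flip_cost(hq, lq, block_size=16):
    """Error added by one flipped bitmap bit, relative to the plain AMBTC block.

    Returns (absolute change of the affected pixel, resulting block MSE).
    """
    pixel_error = abs(hq - lq)
    block_mse = pixel_error ** 2 / block_size
    return pixel_error, block_mse


print(flip_cost(130, 120))   # smooth block: (10, 6.25)
print(flip_cost(207, 14))    # high-contrast block: (193, 2328.0625)
```

Smooth blocks, where hq and lq are close together, absorb a flip almost for free, while a flip in a high-contrast block is a much bigger change.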

🎯 Conclusion

AMBTC with Combination Theory represents a sophisticated balance between compression and secrecy. It shows that we don't need to choose between a small file size and a large hidden message: we can have both.

If you want to dive into the code, check out the full Python implementation (including the performance test suite) on my GitHub:

🚀 GitHub Repository: AMBTC Steganography

What do you think? Is compressed-domain steganography the future, or do you prefer the "zero-touch" approach of coverless methods?