Thomas Pegler

Steganography: Part 2 - Advanced LSB

In Part 1 I showed a simple example of LSB steganography. Today I'll show how one more simple step can improve resiliency and make the hidden data harder for classic steganalysis tools to detect.

Note: One thing I didn't mention in Part 1 is that the code in this series will always attempt to preserve the integrity of the image over the integrity, and amount, of the data that can be embedded. I operate under the assumption that an adversary has access to the original copies of the images. This means that the amount of data that can be stored will always be lower than with algorithms that alter the images more heavily.

The easiest way to improve LSB steganography is to change how the data is embedded. There are a few proposed methods, but for now let's use simple for-loops to split the image into blocks of pixels, similar to the blocks used in JPEG compression.

from PIL import Image

def encode(filepath):
    start = '#####'
    stop = '*****'
    full = start + 'Some string that you want to encode into an image' + stop

    binary_text = ''.join('{0:08b}'.format(ord(x), 'b') for x in full)
    print(binary_text, len(binary_text))

    with Image.open(filepath) as im:
        i = 0
        w, h = im.size

        # A good block size is 8x8 or a multiple of 8
        min_block_size = 24

        print(f'Minimum block size: {min_block_size}')

        if min_block_size > w or min_block_size > h:
            print('Data too large to store in image')

        for x in range(0, w - min_block_size, min_block_size):
            for y in range(0, h - min_block_size, min_block_size):

                for j in range(x, x + min_block_size):
                    for k in range(y, y + min_block_size):
                        if i >= len(binary_text):
                            i = 0

                        bit = binary_text[i]
                        pixel = im.getpixel((j, k))

                        if bit == "0":
                            # Is odd, should be even.
                            if pixel[0] % 2 != 0:
                                new_pix = (pixel[0] - 1, pixel[1], pixel[2])
                                im.putpixel((j, k), new_pix)
                        else:
                            # Is even, should be odd.
                            if pixel[0] % 2 == 0:
                                new_pix = (pixel[0] + 1, pixel[1], pixel[2])
                                im.putpixel((j, k), new_pix)

                        i += 1

        # Write the modified image next to the original
        im.save(f'encoded_{filepath}')

As you can see, the above is almost identical to the previous example; the only difference is the pair of inner for loops:

for j in range(x, x + min_block_size):
    for k in range(y, y + min_block_size):
        if i >= len(binary_text):
            i = 0

        bit = binary_text[i]
        pixel = im.getpixel((j, k))

        if bit == "0":
            # Is odd, should be even.
            if pixel[0] % 2 != 0:
                new_pix = (pixel[0] - 1, pixel[1], pixel[2])
                im.putpixel((j, k), new_pix)
        else:
            # Is even, should be odd.
            if pixel[0] % 2 == 0:
                new_pix = (pixel[0] + 1, pixel[1], pixel[2])
                im.putpixel((j, k), new_pix)

        i += 1

This iterates over a square of min_block_size x min_block_size pixels and encodes the data sequentially within it. In theory, this makes the encoding more robust and harder for standard steganalysis tools to extract, since you have to know the block size to retrieve the data. This is both the strength and the weakness of the approach: you have to either define a fixed block size, derive it from the length of the input text, or send the block size some other way so that whoever is decoding knows which one to use.
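To make the traversal order concrete, here is a minimal sketch that reproduces the same four nested loops without touching an image. The helper names `block_pixel_order` and `capacity_bits` are made up for illustration, and the 10x10 image with 4x4 blocks is a hypothetical example, not something from the script above.

```python
def block_pixel_order(width, height, block_size):
    """Yield (j, k) pixel coordinates in the same order as the nested loops."""
    for x in range(0, width - block_size, block_size):
        for y in range(0, height - block_size, block_size):
            for j in range(x, x + block_size):
                for k in range(y, y + block_size):
                    yield (j, k)


def capacity_bits(width, height, block_size):
    """Number of pixels visited, i.e. bits written at one bit per pixel."""
    blocks_x = len(range(0, width - block_size, block_size))
    blocks_y = len(range(0, height - block_size, block_size))
    return blocks_x * blocks_y * block_size * block_size


# Hypothetical 10x10 image split into 4x4 blocks
order = list(block_pixel_order(10, 10, 4))
print(order[:5])                 # first pixels of the first block
print(capacity_bits(10, 10, 4))  # total bits the loops would write
```

Because the decoder walks the pixels in exactly this order, a different block size produces a different ordering, which is why both sides need to agree on it.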

Speaking of decoding, this method is essentially the same, just with the inner double for loop again.

import string

def decode(filepath, block_size=None):
    start = '#####'
    stop = '*****'
    bit_count = 0
    message = ''

    with Image.open(filepath) as im:
        w, h = im.size
        binary_text = ''
        # A good block size is 8x8 or a multiple of 8
        min_block_size = block_size or 24

        for x in range(0, w - min_block_size, min_block_size):
            for y in range(0, h - min_block_size, min_block_size):

                for j in range(x, x + min_block_size):
                    for k in range(y, y + min_block_size):
                        pixel = im.getpixel((j, k))

                        # Since we always want to get the LSB, we
                        # can just use the result of the modulo as
                        # our value
                        binary_text += f'{pixel[0] % 2}'
                        bit_count += 1

                        if bit_count == 8:
                            char = chr(int(binary_text, 2))

                            # Only keep printable characters
                            if char in string.printable:
                                message += char

                            # Reset for the next byte either way
                            bit_count = 0
                            binary_text = ''

                            # Stop as soon as the stop marker has been read
                            if message.endswith(stop):
                                start_point = message.find(start) + len(start)
                                end = message.find(stop)
                                return message[start_point:end]

    # No stop marker found anywhere in the image
    return None

As you can see, this is essentially the same loop as in the encoder. The block size is passed as an argument in this example, with the known 24 as a fallback. I've also added a check to ensure each decoded character is printable (less helpful for this example but much more so later, when we attempt to process cropped images).
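The printable check and the marker search can be tried in isolation on a plain bit string. This is only a sketch: `bits_to_text` and `extract` are hypothetical helpers, not part of the script, and the noisy input is constructed by hand.

```python
import string


def bits_to_text(bits):
    """Group a string of '0'/'1' characters into bytes and keep printable chars."""
    chars = []
    for n in range(0, len(bits) - len(bits) % 8, 8):
        char = chr(int(bits[n:n + 8], 2))
        if char in string.printable:
            chars.append(char)
    return ''.join(chars)


def extract(message, start='#####', stop='*****'):
    """Return the text between the first start marker and the next stop marker."""
    begin = message.find(start)
    if begin == -1:
        return None
    begin += len(start)
    end = message.find(stop, begin)
    return message[begin:end] if end != -1 else None


bits = ''.join('{0:08b}'.format(ord(c)) for c in '#####hi*****')
# Insert one non-printable byte (0x01) right after the start marker
noisy = bits[:40] + '00000001' + bits[40:]
print(extract(bits_to_text(noisy)))
```

The non-printable byte is dropped before the markers are searched, so the payload between them comes out intact.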


Putting both parts together with a little argparse for ease of command line use, we get:

import argparse
import string

from PIL import Image

def encode(filepath):
    start = '#####'
    stop = '*****'
    full = start + 'Some string that you want to encode into an image' + stop

    binary_text = ''.join('{0:08b}'.format(ord(x), 'b') for x in full)
    print(binary_text, len(binary_text))

    with Image.open(filepath) as im:
        i = 0
        w, h = im.size

        # A good block size is 8x8 or a multiple of 8
        min_block_size = 24

        print(f'Minimum block size: {min_block_size}')

        if min_block_size > w or min_block_size > h:
            print('Data too large to store in image')

        for x in range(0, w - min_block_size, min_block_size):
            for y in range(0, h - min_block_size, min_block_size):

                for j in range(x, x + min_block_size):
                    for k in range(y, y + min_block_size):
                        if i >= len(binary_text):
                            i = 0

                        bit = binary_text[i]
                        pixel = im.getpixel((j, k))

                        if bit == "0":
                            # Is odd, should be even.
                            if pixel[0] % 2 != 0:
                                new_pix = (pixel[0] - 1, pixel[1], pixel[2])
                                im.putpixel((j, k), new_pix)
                        else:
                            # Is even, should be odd.
                            if pixel[0] % 2 == 0:
                                new_pix = (pixel[0] + 1, pixel[1], pixel[2])
                                im.putpixel((j, k), new_pix)

                        i += 1

        # Write the modified image next to the original
        im.save(f'encoded_{filepath}')

def decode(filepath, block_size=None):
    start = '#####'
    stop = '*****'
    bit_count = 0
    message = ''

    with Image.open(filepath) as im:
        w, h = im.size
        binary_text = ''
        # A good block size is 8x8 or a multiple of 8
        min_block_size = block_size or 24

        for x in range(0, w - min_block_size, min_block_size):
            for y in range(0, h - min_block_size, min_block_size):

                for j in range(x, x + min_block_size):
                    for k in range(y, y + min_block_size):
                        pixel = im.getpixel((j, k))

                        # Since we always want to get the LSB, we
                        # can just use the result of the modulo as
                        # our value
                        binary_text += f'{pixel[0] % 2}'
                        bit_count += 1

                        if bit_count == 8:
                            char = chr(int(binary_text, 2))

                            # Only keep printable characters
                            if char in string.printable:
                                message += char

                            # Reset for the next byte either way
                            bit_count = 0
                            binary_text = ''

                            # Stop as soon as the stop marker has been read
                            if message.endswith(stop):
                                start_point = message.find(start) + len(start)
                                end = message.find(stop)
                                return message[start_point:end]

    # No stop marker found anywhere in the image
    return None

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-a", "--action", help="encode or decode")
    parser.add_argument("-f", "--filepath", help="path to image")
    parser.add_argument("-b", "--block_size", required=False, type=int, help="block size")
    args = parser.parse_args()

    if args.filepath:
        if args.action == "encode":
            encode(args.filepath)
        elif args.action == "decode":
            print(decode(args.filepath, args.block_size))
        else:
            print("Invalid action")
    else:
        print("No filepath provided")

With that very simple script, you have your very own PNG steganographic tool. Simply ensure you have Pillow installed and from the terminal run something like:

python ./ -a encode -f file.png


python ./ -a decode -f encoded_file.png -b 24

And you'll have your very own secretly encoded messaging system. The changes are undetectable to the human eye, even when compared side by side with the original.
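To put a number on how small the change is, the sketch below works on plain RGB tuples rather than a real image. `embed_bit` and `max_channel_delta` are hypothetical helpers; `embed_bit` uses a bit mask instead of the explicit +1/-1 branches in the script, which gives the same result on the red channel.

```python
def embed_bit(pixel, bit):
    """Set the red channel's least significant bit to bit ('0' or '1')."""
    red = (pixel[0] & ~1) | int(bit)
    return (red, pixel[1], pixel[2])


def max_channel_delta(original, encoded):
    """Largest absolute per-channel difference between two pixel lists."""
    return max(
        abs(a - b)
        for p, q in zip(original, encoded)
        for a, b in zip(p, q)
    )


# Hand-picked sample pixels, including the extremes 0 and 255
original = [(200, 10, 30), (201, 11, 31), (0, 0, 0), (255, 255, 255)]
bits = '1001'
encoded = [embed_bit(p, b) for p, b in zip(original, bits)]
print(encoded)
print(max_channel_delta(original, encoded))
```

No channel moves by more than one step out of 255, and the green and blue channels are never touched.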

Header by Isis França on Unsplash

Top comments (3)

Retiago Drago • Edited

The first thing I notice about your approach is the depth of your loop. I believe this increases the time complexity since you go as deep as 4-5 levels. Do you think there's a faster way to do this? I'm considering flattening the image and using a specific mathematical formula to reduce the level of the loop. What are your thoughts?

Thomas Pegler

You're absolutely right, I really didn't optimise this at all. I mostly wanted to just get something simple, with only 1 or 2 libraries to try and make the topic a little easier to start with. I ended up doing my own implementation in C++ to improve the speed (because this Python one was far too slow).

I think a good approach might be to flatten the image into a single array and then use another algorithm to figure out the correct pixels to alter or just use Pandas. Do you know any better ways of handling it?

Retiago Drago

In my post, I just utilized NumPy index operation and lookup table aka memoization for better and faster performance. Let me know what you think about my approach there. I'm still new and always exploring new stuff as time goes on.
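The flattening idea raised in this thread can be sketched in plain Python. This is an illustration only, not the NumPy approach from the linked post: `flat_to_pixel` is a hypothetical helper that maps a single running index to the pixel the four nested loops would visit at that step, and the 10x10 image with 4x4 blocks is a made-up example.

```python
def flat_to_pixel(n, height, block_size):
    """Map the n-th visited pixel to (j, k), matching the nested-loop order."""
    blocks_y = len(range(0, height - block_size, block_size))
    block_index, offset = divmod(n, block_size * block_size)
    bx, by = divmod(block_index, blocks_y)   # x is the outer block loop
    dj, dk = divmod(offset, block_size)      # j is the outer pixel loop
    return (bx * block_size + dj, by * block_size + dk)


# Compare against the original nested loops
width, height, block_size = 10, 10, 4
nested = [
    (j, k)
    for x in range(0, width - block_size, block_size)
    for y in range(0, height - block_size, block_size)
    for j in range(x, x + block_size)
    for k in range(y, y + block_size)
]
flat = [flat_to_pixel(n, height, block_size) for n in range(len(nested))]
print(flat == nested)
```

This removes the nesting but not the work: every pixel is still visited once, so any real speed-up would have to come from vectorised operations or a compiled implementation.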