Split any file into a keyless block and a 128-bit key - in two bitwise operations

#showdev #security #opensource #python

What if you could take any file — a photo, a database dump, a movie — and split it into two parts where neither part is useful on its own? Not encryption. Not compression. Just a clean cut.

That's bitsplit. It's pure Python, zero dependencies, and the entire restore operation is a single line (data is the 128-bit key value, count is the shift, and indices is the block read as an integer):

restored = (data << count) | indices

The idea

Treat the whole file as one giant integer. Slice the top 128 bits off the front. Those 128 bits become your key (a short text string). Everything else becomes your block (a binary file).

File (bytes)  -->  Number  -->  [ data: 128 bits | indices: the rest ]
                                       |                  |
                                    key file          data file

To restore: shift the key left, OR with the block, write bytes. Done.

photo.jpg  -->  data.bin + key.txt
  1.05 MB       1.05 MB    102 B
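The split-and-restore arithmetic can be checked by hand on a tiny input. This is a toy sketch, not the library's code: it assumes an 8-bit key instead of 128 bits (the KEY_BITS constant is made up for illustration) so the hex values stay readable.

```python
# Toy version of the split: an 8-bit "key" instead of 128 bits.
# KEY_BITS = 8 is an illustrative assumption, not a bitsplit setting.
KEY_BITS = 8

data = bytes([0xCA, 0xFE, 0x42])        # a 3-byte "file"
n = int.from_bytes(data, "big")         # 0xCAFE42 as one integer
shift = n.bit_length() - KEY_BITS       # 24 - 8 = 16 bits stay in the block

key = n >> shift                        # top 8 bits  -> 0xCA
block = n & ((1 << shift) - 1)          # low 16 bits -> 0xFE42

restored = (key << shift) | block       # shift the key left, OR with the block
print(hex(key), hex(block), restored.to_bytes(len(data), "big") == data)
# prints: 0xca 0xfe42 True
```

With the real 128-bit key the picture is the same, just with a much longer number on each side of the cut.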

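Why does this work? Because the 128 missing bits sit at the most significant positions of the number. Without them, the block is a number whose top is unknown, and there are 2^128 possible tops (~3.4 × 10^38). Guessing blindly, even at a trillion tries per second, would take around 10^19 years, far longer than the age of the universe. Note what that count covers: only the top 128 bits are withheld, and every bit below them is stored in the block unchanged.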
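To put a number on the 2^128 figure, here is a back-of-the-envelope estimate. The attacker speed is an assumption picked only to make the arithmetic concrete, and the age of the universe is rounded.

```python
# Rough size of the search space for the withheld 128 bits.
# GUESSES_PER_SECOND is a made-up attacker speed, used only for illustration.
GUESSES_PER_SECOND = 10**12
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10         # approximate

candidates = 2**128
years = candidates / GUESSES_PER_SECOND / SECONDS_PER_YEAR

print(f"{candidates:.3e} candidates")
print(f"{years:.2e} years, about {years / AGE_OF_UNIVERSE_YEARS:.1e} universe ages")
```

This covers blind enumeration only. Bits an attacker can predict, such as a fixed file-format header at the very start of a file, shrink the real search space.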

Try it

pip install bitsplit

bitsplit encode photo.jpg
# -> photo.jpg.dat  +  photo.jpg.key

bitsplit decode restored.jpg
# -> restored.jpg

Or from Python:

from bitsplit import encode, decode

block, key = encode(open("photo.jpg", "rb").read())
# key looks like: "340079864808174098294188674279182237768:8843264:1105424"

restored = decode(block, key)

The key has three parts: the 128-bit number, the bit shift count, and the original byte size. That's all you need to reconstruct the file.
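The three fields can be pulled apart and sanity-checked with a few lines. This is a sketch under the assumption that the key is always "number:shift:size" as in the example above; ParsedKey and parse_key are hypothetical helper names, not part of bitsplit's API.

```python
from typing import NamedTuple

class ParsedKey(NamedTuple):    # hypothetical helper, not bitsplit's API
    top_bits: int   # the withheld most significant bits, as an integer
    shift: int      # how many bits remain in the block
    size: int       # original file size in bytes

def parse_key(key_str: str) -> ParsedKey:
    parts = key_str.split(":")
    if len(parts) != 3:
        raise ValueError("expected 'number:shift:size'")
    top_bits, shift, size = (int(p) for p in parts)
    # Key bits plus block bits cannot exceed the file's total bit length.
    if top_bits.bit_length() + shift > size * 8:
        raise ValueError("key fields are inconsistent")
    return ParsedKey(top_bits, shift, size)

k = parse_key("340079864808174098294188674279182237768:8843264:1105424")
print(k.top_bits.bit_length(), k.shift + 128 == k.size * 8)
# prints: 128 True
```

For the example key the numbers line up exactly: 8,843,264 block bits plus 128 key bits equals 1,105,424 bytes times 8.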

Where it's actually useful

This isn't a replacement for AES. It's a different tool for a different shape of problem: you want one piece of data to be useless without another, and you want to control where each piece lives.

Split storage — block in S3, key on your laptop. A bucket leak reveals nothing.
Two-channel transfer — block over Telegram, key over SMS. Intercepting one channel is worthless.
Offline backups — drive in a drawer, key on paper in a safe.
Shared access — Alice holds the key, Bob holds the block. Both required.
CI/CD secrets — commit the block, store the key in env vars.
Geo-distribution — block in eu-west, key in us-east. Single-region breach, no data.

Performance

Two bitwise ops, no rounds, no block processing. On an Apple M2:

File size	bitsplit	OpenSSL AES-256	GPG AES-256	7-Zip AES-256
100 MB	0.13 s	0.64 s	2.43 s	4.86 s
1 GB	1.45 s	5.11 s	3.58 s	3.16 s
5 GB	15.6 s	58.8 s	148.5 s	372.2 s

Output size equals input size — no overhead. Streaming I/O keeps memory flat at ~20 MB regardless of file size. All files restored with identical SHA-256 checksums.
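If you want to repeat the checksum comparison on your own files, a chunked SHA-256 keeps memory flat in the same spirit. This is a sketch using only the standard library; sha256_of and same_content are hypothetical helper names, and the 1 MiB chunk size is an arbitrary choice.

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so memory use does not grow with file size."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def same_content(original: str, restored: str) -> bool:
    """True when both files hash to the same SHA-256 digest."""
    return sha256_of(original) == sha256_of(restored)
```

After a decode, same_content("photo.jpg", "restored.jpg") should return True.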

What it is NOT

I want to be loud about this, because it matters:

bitsplit is not encryption.
No ciphers. No rounds. No key derivation. No authentication. No padding. No tamper detection.

If you need compliance, audits, or signatures — use AES-GCM or ChaCha20-Poly1305. Those exist for a reason.

bitsplit is a different primitive. Think of it as tearing a document in half, not locking it in a safe. The 128-bit key makes brute-forcing the missing bits infeasible, but an attacker who can flip bits in the block can corrupt your data, and decode will not notice: you only find out if you check the restored file yourself.
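The bit-flipping point is easy to see in a few lines. This sketch does the split inline with plain integer operations, following the description in this post rather than calling the library, and the sample payload is made up.

```python
data = b"transfer to account 1234567890123456: 100"

# Split as described above: top 128 bits are the key, the rest is the block.
n = int.from_bytes(data, "big")
shift = n.bit_length() - 128
key = n >> shift
block = n & ((1 << shift) - 1)

# Someone with access to the block alone flips its lowest bit.
tampered = block ^ 1

# Restoring with the genuine key raises no error and returns altered bytes.
restored = ((key << shift) | tampered).to_bytes(len(data), "big")
print(restored)
```

The final 100 comes back as 101, and nothing in the restore step signals that the block was changed.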

For a lot of real-world use cases — split storage, two-channel transfer, offline backup — that's exactly what you want. For others, it's not enough. Pick the right tool.

The whole library

The core is essentially this:

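def encode(data: bytes) -> tuple[bytes, str]:
    """Split data into (block bytes, key string "number:shift:size")."""
    n = int.from_bytes(data, "big")          # whole file as one big-endian integer
    bits = n.bit_length()                    # leading zero bits are not counted
    key_bits = min(128, bits)                # tiny inputs fit entirely in the key
    shift = bits - key_bits                  # number of bits left in the block
    key = n >> shift                         # top key_bits bits
    block = n & ((1 << shift) - 1)           # everything below them, unchanged
    block_bytes = block.to_bytes((shift + 7) // 8, "big")
    return block_bytes, f"{key}:{shift}:{len(data)}"

def decode(block: bytes, key_str: str) -> bytes:
    """Rejoin a block and its key string into the original bytes."""
    key, shift, size = map(int, key_str.split(":"))
    n = (key << shift) | int.from_bytes(block, "big")
    return n.to_bytes(size, "big")           # size restores any leading zero bytes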

That's the whole idea. Everything else is CLI, file handling, and streaming for huge files.
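A few awkward inputs are worth running through that logic before trusting it. The sketch below condenses the split and rejoin into one hypothetical roundtrip helper that mirrors the core shown above; it does not exercise the real CLI or the streaming path.

```python
def roundtrip(data: bytes) -> bytes:
    """Split and rejoin in memory, mirroring the core encode/decode logic."""
    n = int.from_bytes(data, "big")
    shift = n.bit_length() - min(128, n.bit_length())
    key, block = n >> shift, n & ((1 << shift) - 1)
    block_bytes = block.to_bytes((shift + 7) // 8, "big")
    joined = (key << shift) | int.from_bytes(block_bytes, "big")
    return joined.to_bytes(len(data), "big")

cases = {
    "empty file": b"",
    "shorter than 16 bytes": b"hi",
    "leading zero bytes": b"\x00\x00\x01" + b"x" * 40,
    "all zero bytes": bytes(64),
    "exactly 16 bytes": bytes(range(1, 17)),
}
for name, data in cases.items():
    assert roundtrip(data) == data, name
print("all cases round-trip")
```

Two things stand out. Leading zero bytes survive only because the key records the original size, and for inputs of 16 bytes or fewer the block is empty, so the key string holds the entire file.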

Try it, break it, tell me what's wrong

Repo: github.com/frolpaxa/bitsplit

Issues, PRs, and "actually you're wrong because…" comments very welcome. The math is simple enough that bugs hide in the I/O and edge cases, not the algorithm — exactly the kind of thing more eyes help with.