byte271

6cy: A Content-Addressed Archive Format in Rust

Source code: https://github.com/byte271/6cy

I built 6cy as an experimental archive format focused on content-addressing, deduplication, and practical streaming workflows. This post is a short, concrete overview of the design and its current state.


Core Architectural Features

Streaming-First Design

The format is optimized for single-pass reads and writes. Data can be appended without seeking back, which suits network streams and large-scale pipelines.
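To make the "append without seeking back" idea concrete, here is a minimal sketch. It assumes a hypothetical length-prefixed framing (`append_block` and `read_block` are illustrative names, not the 6cy API or its real block layout). The writer only needs `Write` and the reader only needs `Read`, so neither side ever requires `Seek`.

```rust
use std::io::{self, Read, Write};

/// Hypothetical framing for illustration: [u32 little-endian length][payload].
/// The real 6cy block header is defined by the spec, not by this sketch.
fn append_block<W: Write>(out: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "block too large"));
    }
    // Everything is written strictly forward: length first, then the payload.
    out.write_all(&(payload.len() as u32).to_le_bytes())?;
    out.write_all(payload)
}

/// Reads the next block in one forward pass.
/// Returns Ok(None) when the stream ends before a full length prefix can be read.
fn read_block<R: Read>(input: &mut R) -> io::Result<Option<Vec<u8>>> {
    let mut len_buf = [0u8; 4];
    match input.read_exact(&mut len_buf) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => return Ok(None),
        Err(e) => return Err(e),
    }
    let mut payload = vec![0u8; u32::from_le_bytes(len_buf) as usize];
    input.read_exact(&mut payload)?;
    Ok(Some(payload))
}
```

Because the functions are generic over `Write` and `Read`, the same code would work against a file, a TCP stream, or an in-memory buffer.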

Data Recoverability

Blocks are self-describing and include integrity checks. Periodic checkpoints (a recovery map) let readers recover intact data even if the archive is truncated or partially corrupted.
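The sketch below shows why self-describing blocks help with recovery. Everything in it is an assumption made for illustration: the `BLK1` magic, the 12-byte header layout, and the FNV-1a checksum are hypothetical stand-ins, and the recovery map itself is not modeled. The reader simply scans forward, keeps every block whose header and checksum verify, and resynchronizes byte by byte past damaged regions.

```rust
/// Hypothetical block layout for illustration only:
/// [magic: 4 bytes][payload length: u32 LE][checksum: u32 LE][payload]
const MAGIC: [u8; 4] = *b"BLK1";
const HEADER_LEN: usize = 12;

/// 32-bit FNV-1a, a stand-in for whatever check the real format uses.
fn fnv1a(data: &[u8]) -> u32 {
    let mut h: u32 = 0x811c_9dc5;
    for &b in data {
        h ^= u32::from(b);
        h = h.wrapping_mul(0x0100_0193);
    }
    h
}

fn encode_block(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.extend_from_slice(&MAGIC);
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&fnv1a(payload).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

fn le_u32(bytes: &[u8]) -> u32 {
    u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]])
}

/// Scans forward and returns every payload whose header and checksum verify.
/// Damaged or truncated regions are skipped one byte at a time.
fn recover_blocks(archive: &[u8]) -> Vec<Vec<u8>> {
    let mut recovered = Vec::new();
    let mut pos = 0;
    while pos + HEADER_LEN <= archive.len() {
        if &archive[pos..pos + 4] == &MAGIC[..] {
            let len = le_u32(&archive[pos + 4..pos + 8]) as usize;
            let sum = le_u32(&archive[pos + 8..pos + 12]);
            let start = pos + HEADER_LEN;
            // `get` returns None if the claimed payload runs past the end.
            if let Some(payload) = archive.get(start..start.saturating_add(len)) {
                if fnv1a(payload) == sum {
                    recovered.push(payload.to_vec());
                    pos = start + len;
                    continue;
                }
            }
        }
        pos += 1; // not a valid block here; try to resynchronize on the next byte
    }
    recovered
}
```

A checkpoint or recovery map would presumably let a reader skip the byte-by-byte scan; this sketch only shows the fallback that per-block headers make possible.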

Codec Polymorphism

Multiple compression algorithms can coexist in one archive (e.g., Zstd, LZ4). Each block can pick the best codec for its data, which lets the writer trade speed against compression ratio per block.
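A rough sketch of per-block codec selection follows. To keep it free of external crates, two toy codecs (`Store` and a simple run-length encoder) stand in for Zstd and LZ4; the `CodecId` tags and function names are hypothetical and not taken from the 6cy source. The writer encodes each block, keeps the smaller result, and records the codec tag so the reader can dispatch on it.

```rust
/// Hypothetical codec tags. A real archive would list Zstd, LZ4, and so on;
/// two toy codecs stand in here so the sketch needs no external crates.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CodecId {
    Store = 0,
    Rle = 1,
}

/// Run-length encoding as (count, byte) pairs, with runs capped at 255.
fn rle_encode(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < data.len() {
        let byte = data[i];
        let mut run = 1;
        while i + run < data.len() && data[i + run] == byte && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
    out
}

fn rle_decode(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    for pair in data.chunks_exact(2) {
        out.extend(std::iter::repeat(pair[1]).take(pair[0] as usize));
    }
    out
}

/// Encodes one block with whichever codec yields the smaller output.
fn encode_adaptive(data: &[u8]) -> (CodecId, Vec<u8>) {
    let rle = rle_encode(data);
    if rle.len() < data.len() {
        (CodecId::Rle, rle)
    } else {
        (CodecId::Store, data.to_vec())
    }
}

/// Decodes a block by dispatching on the codec tag stored alongside it.
fn decode_tagged(codec: CodecId, payload: &[u8]) -> Vec<u8> {
    match codec {
        CodecId::Store => payload.to_vec(),
        CodecId::Rle => rle_decode(payload),
    }
}
```

Trying every codec on every block is the simplest policy but also the slowest; a writer could instead sample the block or pick by file type.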

Plugin Architecture

A simple plugin ABI and manifest allow closed-source or third-party codecs to be loaded as binary plugins without changing the core codebase. This keeps the core implementation small and auditable.
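The actual plugin ABI and manifest are defined in the repo and are not reproduced here. As a loose illustration of the idea that the core can stay codec-agnostic, the sketch below uses an in-process trait and a registry keyed by codec ID. The `Codec` trait, `Registry` type, and `XorCodec` toy are all hypothetical; a binary plugin would sit behind a C-compatible interface rather than a Rust trait object.

```rust
use std::collections::HashMap;

/// Hypothetical in-process stand-in for a codec plugin.
trait Codec {
    fn name(&self) -> &'static str;
    fn encode(&self, input: &[u8]) -> Vec<u8>;
    fn decode(&self, input: &[u8]) -> Vec<u8>;
}

/// Toy codec: XORs every byte with a fixed key. It exists only to exercise the registry.
struct XorCodec(u8);

impl Codec for XorCodec {
    fn name(&self) -> &'static str {
        "xor-demo"
    }
    fn encode(&self, input: &[u8]) -> Vec<u8> {
        input.iter().map(|b| b ^ self.0).collect()
    }
    fn decode(&self, input: &[u8]) -> Vec<u8> {
        self.encode(input) // XOR is its own inverse
    }
}

/// The core only knows codec IDs; implementations are registered at runtime.
#[derive(Default)]
struct Registry {
    codecs: HashMap<u32, Box<dyn Codec>>,
}

impl Registry {
    /// Registers a codec under `id`, refusing to overwrite an existing entry.
    fn register(&mut self, id: u32, codec: Box<dyn Codec>) -> Result<(), String> {
        if let Some(existing) = self.codecs.get(&id) {
            return Err(format!("codec id {} already taken by {}", id, existing.name()));
        }
        self.codecs.insert(id, codec);
        Ok(())
    }

    fn encode(&self, id: u32, data: &[u8]) -> Result<Vec<u8>, String> {
        self.codecs
            .get(&id)
            .map(|c| c.encode(data))
            .ok_or_else(|| format!("no codec registered for id {}", id))
    }

    fn decode(&self, id: u32, payload: &[u8]) -> Result<Vec<u8>, String> {
        self.codecs
            .get(&id)
            .map(|c| c.decode(payload))
            .ok_or_else(|| format!("no codec registered for id {}", id))
    }
}
```

Returning an error for an unknown ID, rather than panicking, lets a reader skip blocks it cannot decode and still extract the rest.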

Metadata-First Indexing

A central index maps files to blocks. This enables fast listing and random extraction. Readers do not need to scan the entire archive to find files.
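Here is a small sketch of what a file-to-blocks index buys you. The `BlockRef` fields, the `Index` type, and `add_file` are assumptions for illustration, and blocks are stored uncompressed to keep the example short. Listing touches only the index, and extraction jumps straight to the referenced byte ranges.

```rust
use std::collections::BTreeMap;

/// Hypothetical reference to one stored block: where it starts and how long it is.
#[derive(Debug, Clone, Copy, PartialEq)]
struct BlockRef {
    offset: usize,
    len: usize,
}

/// Maps each file name to the ordered list of blocks that make up its content.
#[derive(Default)]
struct Index {
    files: BTreeMap<String, Vec<BlockRef>>,
}

impl Index {
    /// Lists file names without touching any block data.
    fn list(&self) -> Vec<&str> {
        self.files.keys().map(|k| k.as_str()).collect()
    }

    /// Extracts one file by reading only the byte ranges its BlockRefs point to.
    fn extract(&self, archive: &[u8], name: &str) -> Option<Vec<u8>> {
        let refs = self.files.get(name)?;
        let mut out = Vec::new();
        for r in refs {
            out.extend_from_slice(archive.get(r.offset..r.offset + r.len)?);
        }
        Some(out)
    }
}

/// Appends `data` as one uncompressed block and records it under `name`.
fn add_file(archive: &mut Vec<u8>, index: &mut Index, name: &str, data: &[u8]) {
    let r = BlockRef { offset: archive.len(), len: data.len() };
    archive.extend_from_slice(data);
    index.files.entry(name.to_string()).or_default().push(r);
}
```

A `BTreeMap` is used here so that listings come back in a stable, sorted order regardless of insertion order.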

Rust Reference Implementation

The canonical implementation is written in Rust. It prioritizes memory safety, clear error handling, and predictable performance. The repo serves as the reference for the format.


Architecture at a Glance

[ Data Blocks (content-addressed) ... ]
[ Central Index (BlockRefs + metadata) ]
[ Superblock (index offset, codecs, UUID) ]

Blocks carry block headers, compressed payloads, and a content hash. The index contains BlockRef entries that point to blocks or to remote archive references.
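To show how content-addressing leads to deduplication, here is a minimal in-memory sketch. The `Archive` and `BlockRef` types are hypothetical, remote archive references are not modeled, and `DefaultHasher` is only a stand-in: a real format would use a cryptographic hash so that equal hashes can safely be treated as equal content.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

type ContentHash = u64;

/// Stand-in content hash. NOT collision-resistant; assumed here for brevity.
fn content_hash(data: &[u8]) -> ContentHash {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

/// Hypothetical index entry pointing at a stored block.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BlockRef {
    hash: ContentHash,
    offset: usize,
    len: usize,
}

#[derive(Default)]
struct Archive {
    data: Vec<u8>,                           // data block region
    by_hash: HashMap<ContentHash, BlockRef>, // dedup table keyed by content hash
    index: Vec<(String, BlockRef)>,          // central index: file name -> block
}

impl Archive {
    /// Stores `content` under `name`. If an identical block already exists,
    /// the index entry reuses it and no new bytes are written.
    fn add(&mut self, name: &str, content: &[u8]) -> BlockRef {
        let hash = content_hash(content);
        let block = if let Some(existing) = self.by_hash.get(&hash).copied() {
            existing
        } else {
            let r = BlockRef { hash, offset: self.data.len(), len: content.len() };
            self.data.extend_from_slice(content);
            self.by_hash.insert(hash, r);
            r
        };
        self.index.push((name.to_string(), block));
        block
    }
}
```

Adding the same bytes under two names yields two index entries that share one `BlockRef`, so the data region grows only once.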


Quickstart (try it)

git clone https://github.com/byte271/6cy
cd 6cy
# build (Linux / macOS)
cargo build --release
# pack files
./target/release/6cy pack -o test.6cy path/to/file1 path/to/file2
# inspect
./target/release/6cy info test.6cy

Windows users: run the same commands in PowerShell or WSL. Be mindful of line endings and file ordering when testing cross-platform determinism.
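File ordering is one source of cross-platform differences that a test harness can control. The helper below is a hypothetical sketch, not something 6cy is known to do: it normalizes path separators and sorts by raw bytes so the input order does not depend on directory iteration order. Note the assumption that a backslash is always a separator, which is not true on Unix, where it is a legal file name character.

```rust
/// Normalizes separators to '/' and sorts the result by raw bytes.
/// Assumes '\\' is always a path separator (true on Windows, not on Unix).
fn canonical_order(paths: &[&str]) -> Vec<String> {
    let mut out: Vec<String> = paths.iter().map(|p| p.replace('\\', "/")).collect();
    // `String` ordering in Rust compares bytes, so this is locale-independent.
    out.sort();
    out.dedup();
    out
}
```

Feeding the archiver paths in this order on every platform removes one variable when comparing the resulting archives.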


Current Status

  • Feature set implemented in the reference repo: streaming write/read, central index, block-level dedup, content hashes, root hash, solid mode, plugin hooks.
  • Validation in progress: cross-platform determinism (Linux/Windows), crash-recovery semantics, fuzz testing, and performance benchmarks.
  • State: experimental / stabilizing. The spec and implementation are evolving but now have a stable core surface.

How to Help

If you want to try the format, please:

  • Clone the repo and run the quickstart above
  • Try packing the same file multiple times to see dedup in action
  • Open issues for bugs or missing docs
  • Send PRs for tests, examples, or platform fixes

Open Source & Contact

Code and issues: https://github.com/byte271/6cy
