Most archive formats make a simple task unnecessarily expensive: you need one file, so you download and decompress everything.
I built ARCX, a compressed archive format designed to fix that.
ARCX combines cross-file compression (like tar+zstd) with indexed random access (like zip), so you can retrieve a single file from a large archive in milliseconds without decompressing the rest.
## Try it

GitHub: https://github.com/getarcx/arcx

Install: `cargo install arcx`
## Benchmark results
Across 5 real-world datasets:
- ~7ms to retrieve a file from a ~200MB archive
- up to 200x less data read than tar+zstd
- compression within ~3% of tar+zstd
Example:
| Dataset | ARCX Bytes Read | TAR+ZSTD Bytes Read | Reduction |
|---|---|---|---|
| Python ML | 326 KB | 63.1 MB | 198x less |
| Build Artifacts | 714 KB | 140.4 MB | 202x less |
## Why this matters
Modern systems don't need entire archives. They need one file, immediately.
This shows up in:
- CI/CD pipelines (artifacts)
- cloud storage (partial retrieval)
- large codebases
- package registries
ARCX reduces archive access to a manifest lookup, one block read, and one block decompress.
## How it works
ARCX uses:
- block-based compression
- a binary manifest index
- direct offset reads
Instead of scanning or decompressing the full archive:
1. Look up the file in the index
2. Seek to the relevant block
3. Decompress only that block
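Those three steps can be sketched in a few lines of Rust. The `Entry` layout here is a hypothetical stand-in for the manifest, and the "decompress" step is identity so the sketch stays dependency-free; ARCX's real on-disk format uses block-based zstd compression and will differ in the details.

```rust
use std::collections::HashMap;
use std::io::{Cursor, Read, Seek, SeekFrom};

// Hypothetical manifest entry; ARCX's actual index format will differ.
struct Entry {
    block_offset: u64, // where the compressed block starts in the archive
    block_len: u64,    // length of that block on disk
}

/// Step 1: look up the file in the index.
/// Step 2: seek to the relevant block.
/// Step 3: decompress only that block (identity here; zstd in the real format).
fn read_one_file<R: Read + Seek>(
    archive: &mut R,
    index: &HashMap<String, Entry>,
    path: &str,
) -> std::io::Result<Vec<u8>> {
    let entry = index.get(path).ok_or_else(|| {
        std::io::Error::new(std::io::ErrorKind::NotFound, "no such file in archive")
    })?;
    archive.seek(SeekFrom::Start(entry.block_offset))?;
    let mut block = vec![0u8; entry.block_len as usize];
    archive.read_exact(&mut block)?;
    Ok(block)
}

fn main() -> std::io::Result<()> {
    // Fake archive: two "blocks" laid out back to back.
    let mut archive = Cursor::new(b"hello world".to_vec());
    let mut index = HashMap::new();
    index.insert("greeting.txt".to_string(), Entry { block_offset: 0, block_len: 5 });
    index.insert("object.txt".to_string(), Entry { block_offset: 6, block_len: 5 });

    let data = read_one_file(&mut archive, &index, "object.txt")?;
    println!("{}", String::from_utf8_lossy(&data)); // prints "world"
    Ok(())
}
```

The key property: the cost of the read is proportional to one block, not the whole archive.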
## Comparison
| Format | Compression | Selective Access |
|---|---|---|
| ZIP | weaker (per-file) | fast |
| tar+zstd | strong | slow |
| ARCX | strong | fast |
## Tradeoffs
Unlike tar, ARCX is not designed for streaming: the manifest is written at the end of the archive, so the archive must be complete before it can be read.
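A minimal sketch of why the manifest-at-the-end layout rules out streaming reads. It assumes a hypothetical 16-byte footer (manifest offset + length, little-endian), similar in spirit to zip's end-of-central-directory record; ARCX's actual trailer layout may differ.

```rust
use std::io::{Cursor, Read, Seek, SeekFrom};

// Hypothetical 16-byte footer: manifest offset + manifest length,
// both little-endian u64. ARCX's real trailer layout may differ.
const FOOTER_LEN: i64 = 16;

fn read_manifest<R: Read + Seek>(archive: &mut R) -> std::io::Result<Vec<u8>> {
    // The manifest's location is only known once the archive is complete,
    // which is why an ARCX archive cannot be consumed as a stream:
    // the reader must seek to the end first.
    archive.seek(SeekFrom::End(-FOOTER_LEN))?;
    let mut footer = [0u8; 16];
    archive.read_exact(&mut footer)?;
    let offset = u64::from_le_bytes(footer[0..8].try_into().unwrap());
    let len = u64::from_le_bytes(footer[8..16].try_into().unwrap());

    archive.seek(SeekFrom::Start(offset))?;
    let mut manifest = vec![0u8; len as usize];
    archive.read_exact(&mut manifest)?;
    Ok(manifest)
}

fn main() -> std::io::Result<()> {
    // Fake archive: 4 data bytes, an 8-byte "manifest", then the footer.
    let mut bytes = b"DATA".to_vec();
    let manifest_offset = bytes.len() as u64;
    bytes.extend_from_slice(b"MANIFEST");
    bytes.extend_from_slice(&manifest_offset.to_le_bytes());
    bytes.extend_from_slice(&8u64.to_le_bytes());

    let manifest = read_manifest(&mut Cursor::new(bytes))?;
    println!("{}", String::from_utf8_lossy(&manifest)); // prints "MANIFEST"
    Ok(())
}
```

The upside of this tradeoff is that the writer can emit blocks in one pass and only has to buffer the index, not the data.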
## Current limitations
- Remote/S3 range-read workflows not fully benchmarked yet
- Metadata/index overhead still being optimized for very large file counts
- Full extraction benchmarks in Rust are still in progress
## Feedback
Still early -- feedback welcome.