I built a faster alternative to cp and rsync — here's how it works

I'm a systems engineer. I spend a lot of time copying files — backups to USB drives, transfers to NAS boxes, moving data between servers over SSH. And I kept running into the same frustrations:

  • cp -r is painfully slow on HDDs when you have tens of thousands of small files
  • rsync is powerful but complex, and still slow for bulk copies
  • scp and SFTP top out at 1-2 MB/s on transfers that should be much faster
  • No tool tells you upfront if the destination even has enough space

So I built fast-copy — a Python CLI that copies files at maximum sequential disk speed.

The core idea

When you run cp -r, files are read in directory order — which is essentially random on disk. Every file seek on an HDD costs 5-10ms. Multiply that by 60,000 files and you're spending minutes just on head movement.

fast-copy does something different: it resolves the physical disk offset of every file before copying. On Linux it uses the FIEMAP ioctl, on macOS the F_LOG2PHYS fcntl, and on Windows the FSCTL_GET_RETRIEVAL_POINTERS control code. It then sorts the files by block position and reads them in sequential order.

That alone makes a big difference. But there's more.

Deduplication

Many directories have duplicate files — node_modules across projects, cached downloads, backup copies. fast-copy hashes every file with xxHash-128 (or SHA-256 as fallback), copies each unique file once, and creates hard links for duplicates.

In my test with 92K files, over half were duplicates — saving 379 MB and a lot of I/O time.

It also keeps a SQLite database of hashes, so repeated copies to the same destination skip files that were already copied in previous runs.
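A minimal version of that cache is just one table keyed by digest. I don't know fast-copy's actual schema, so treat this as an illustration of the idea: look a digest up before copying, record it after.

```python
import sqlite3

def open_cache(db_path):
    """Open (or create) the persistent hash cache."""
    db = sqlite3.connect(db_path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS copied (
               digest TEXT PRIMARY KEY,
               dest   TEXT NOT NULL)"""
    )
    return db

def already_copied(db, digest):
    """Return the destination path from a previous run, or None."""
    row = db.execute(
        "SELECT dest FROM copied WHERE digest = ?", (digest,)
    ).fetchone()
    return row[0] if row else None

def record_copy(db, digest, dest):
    db.execute("INSERT OR REPLACE INTO copied VALUES (?, ?)", (digest, dest))
    db.commit()
```

On a repeat run, any file whose digest is already in the table can be skipped (or hard-linked) without re-reading the destination.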

SSH tar streaming

This is the part I'm most proud of. Instead of using SFTP (which has significant protocol overhead), fast-copy streams files as chunked ~100 MB tar batches over raw SSH channels.

The remote side runs tar xf - and files land directly on disk — no temp files, no SFTP overhead. This even works on servers that have SFTP disabled, like some Synology NAS configurations.

Three modes are supported:

  • Local → Remote
  • Remote → Local
  • Remote → Remote (relay through your machine)

Real benchmarks

Local copy — 92K files to USB:

  • 44,718 unique files copied + 47,146 hard-linked
  • 509.8 MB written, 378.9 MB saved by dedup
  • 17.9 seconds, 28.5 MB/s
  • All files verified after copy

Remote to local — 92K files over LAN:

  • 509.8 MB downloaded in 14 minutes
  • 46,951 duplicates detected, saving 378.5 MB of transfer
  • Roughly 3x faster than the same transfer over SFTP

Getting started

The simplest way — just run the Python script:

python fast_copy.py /source /destination

Or download a standalone binary (no Python needed) from the Releases page — available for Linux, macOS, and Windows.

For SSH transfers, install paramiko:

pip install paramiko

For faster hashing:

pip install xxhash

Links

I'd love to hear feedback — especially from anyone dealing with large file transfers or backup workflows. What tools are you currently using? What's missing from them?
