<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: krit.k83 (ΚρητικόςIGB)</title>
    <description>The latest articles on DEV Community by krit.k83 (ΚρητικόςIGB) (@krit83).</description>
    <link>https://dev.to/krit83</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3862758%2F9597463f-b5b3-4745-ad96-b0eee21ca618.png</url>
      <title>DEV Community: krit.k83 (ΚρητικόςIGB)</title>
      <link>https://dev.to/krit83</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/krit83"/>
    <language>en</language>
    <item>
      <title>I built a faster alternative to cp and rsync — here's how it works</title>
      <dc:creator>krit.k83 (ΚρητικόςIGB)</dc:creator>
      <pubDate>Sun, 05 Apr 2026 20:27:08 +0000</pubDate>
      <link>https://dev.to/krit83/i-built-a-faster-alternative-to-cp-and-rsync-heres-how-it-works-39fa</link>
      <guid>https://dev.to/krit83/i-built-a-faster-alternative-to-cp-and-rsync-heres-how-it-works-39fa</guid>
      <description>&lt;p&gt;I'm a systems engineer. I spend a lot of time copying files — backups to USB drives, transfers to NAS boxes, moving data between servers over SSH. And I kept running into the same frustrations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;cp -r&lt;/code&gt; is painfully slow on HDDs when you have tens of thousands of small files&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;rsync&lt;/code&gt; is powerful but complex, and still slow for bulk copies&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;scp&lt;/code&gt; and SFTP top out at 1-2 MB/s on transfers that should be much faster&lt;/li&gt;
&lt;li&gt;No tool tells you upfront if the destination even has enough space&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So I built &lt;strong&gt;fast-copy&lt;/strong&gt; — a Python CLI that copies files at maximum sequential disk speed.&lt;/p&gt;

&lt;h2&gt;The core idea&lt;/h2&gt;

&lt;p&gt;When you run &lt;code&gt;cp -r&lt;/code&gt;, files are read in directory order — which is essentially random on disk. Every file seek on an HDD costs 5-10 ms. Multiply that by 60,000 files and you're spending five to ten minutes on head movement alone.&lt;/p&gt;

&lt;p&gt;fast-copy does something different: before copying, it resolves the physical disk offset of every file (via the &lt;code&gt;FIEMAP&lt;/code&gt; ioctl on Linux, &lt;code&gt;fcntl&lt;/code&gt; on macOS, and &lt;code&gt;FSCTL&lt;/code&gt; on Windows), then sorts the files by block position and reads them in one sequential pass.&lt;/p&gt;
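
&lt;p&gt;To sketch the idea: gather all paths, then sort by a physical-order key before reading. Here I sort by inode number as a cheap, portable stand-in for the extent lookup — the real tool maps actual block offsets via &lt;code&gt;FIEMAP&lt;/code&gt;, and inode order only roughly tracks on-disk layout:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import os

def files_in_disk_order(root):
    """Collect every file under root, then sort by inode number.
    A portable approximation of the physical-offset sort; the real
    tool resolves block positions with the FIEMAP ioctl instead."""
    paths = []
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            paths.append(os.path.join(dirpath, name))
    return sorted(paths, key=lambda p: os.stat(p).st_ino)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;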

&lt;p&gt;That alone makes a big difference. But there's more.&lt;/p&gt;

&lt;h2&gt;Deduplication&lt;/h2&gt;

&lt;p&gt;Many directories have duplicate files — node_modules across projects, cached downloads, backup copies. fast-copy hashes every file with xxHash-128 (or SHA-256 as fallback), copies each unique file once, and creates hard links for duplicates.&lt;/p&gt;
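
&lt;p&gt;The dedup pass can be sketched in a few lines: hash each file and hard-link anything already seen. SHA-256 stands in for xxHash-128 here for portability, and &lt;code&gt;copy_with_dedup&lt;/code&gt; is a hypothetical helper, not the tool's actual API (the real tool also preserves directory structure):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import hashlib
import os
import shutil

def file_hash(path, chunk=1024 * 1024):
    """Hash a file in 1 MB chunks (SHA-256 here; xxHash-128 is faster)."""
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        for block in iter(lambda: f.read(chunk), b''):
            h.update(block)
    return h.hexdigest()

def copy_with_dedup(sources, dst_dir):
    """Copy files, hard-linking any whose content was already copied."""
    seen = {}  # maps content hash to the first copied path
    os.makedirs(dst_dir, exist_ok=True)
    for src in sources:
        dst = os.path.join(dst_dir, os.path.basename(src))
        digest = file_hash(src)
        if digest in seen:
            os.link(seen[digest], dst)   # duplicate: hard link, no data I/O
        else:
            shutil.copy2(src, dst)       # unique content: real copy
            seen[digest] = dst
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;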

&lt;p&gt;In my test with 92K files, over half were duplicates — saving 379 MB and a lot of I/O time.&lt;/p&gt;

&lt;p&gt;It also keeps a SQLite database of hashes, so repeated copies to the same destination skip files that were already copied in previous runs.&lt;/p&gt;
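
&lt;p&gt;A minimal version of such a hash cache, assuming a single table keyed by content hash (the table name and schema are my invention, not fast-copy's actual layout):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import sqlite3

def open_hash_db(path):
    """Persistent hash cache so repeat runs can skip files already copied."""
    db = sqlite3.connect(path)
    db.execute('CREATE TABLE IF NOT EXISTS copied (hash TEXT PRIMARY KEY, dst TEXT)')
    return db

def already_copied(db, digest):
    """Return the recorded destination for this hash, or None."""
    return db.execute('SELECT dst FROM copied WHERE hash = ?', (digest,)).fetchone()

def record(db, digest, dst):
    """Remember that this content hash now exists at dst."""
    db.execute('INSERT OR REPLACE INTO copied VALUES (?, ?)', (digest, dst))
    db.commit()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;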

&lt;h2&gt;SSH tar streaming&lt;/h2&gt;

&lt;p&gt;This is the part I'm most proud of. Instead of using SFTP (which has significant protocol overhead), fast-copy streams files in tar batches of roughly 100 MB over raw SSH channels.&lt;/p&gt;

&lt;p&gt;The remote side runs &lt;code&gt;tar xf -&lt;/code&gt; and files land directly on disk — no temp files, no SFTP overhead. This even works on servers that have SFTP disabled, like some Synology NAS configurations.&lt;/p&gt;

&lt;p&gt;Three modes are supported:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Local → Remote&lt;/li&gt;
&lt;li&gt;Remote → Local&lt;/li&gt;
&lt;li&gt;Remote → Remote (relay through your machine)&lt;/li&gt;
&lt;/ul&gt;
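
&lt;p&gt;The streaming pipeline can be sketched locally with two piped &lt;code&gt;tar&lt;/code&gt; processes; over SSH, the packer's output would feed a remote &lt;code&gt;tar xf -&lt;/code&gt; through a paramiko channel rather than a local pipe, so treat the subprocess plumbing below as a stand-in:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import os
import subprocess

def stream_tar(src_dir, dst_dir):
    """Pack src_dir to a tar stream and unpack it on the fly in dst_dir.
    No temp files: bytes flow straight from the packer into 'tar xf -',
    which is the same shape as fast-copy's SSH transfer path."""
    os.makedirs(dst_dir, exist_ok=True)
    pack = subprocess.Popen(['tar', 'cf', '-', '-C', src_dir, '.'],
                            stdout=subprocess.PIPE)
    unpack = subprocess.Popen(['tar', 'xf', '-', '-C', dst_dir],
                              stdin=pack.stdout)
    pack.stdout.close()  # let the unpacker see EOF when packing ends
    unpack.wait()
    pack.wait()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;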

&lt;h2&gt;Real benchmarks&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Local copy — 92K files to USB:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;44,718 unique files copied + 47,146 hard-linked&lt;/li&gt;
&lt;li&gt;509.8 MB written, 378.9 MB saved by dedup&lt;/li&gt;
&lt;li&gt;17.9 seconds, 28.5 MB/s&lt;/li&gt;
&lt;li&gt;All files verified after copy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Remote to local — 92K files over LAN:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;509.8 MB downloaded in 14 minutes&lt;/li&gt;
&lt;li&gt;46,951 duplicates detected, saving 378.5 MB of transfer&lt;/li&gt;
&lt;li&gt;3x faster than SFTP&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Getting started&lt;/h2&gt;

&lt;p&gt;The simplest way — just run the Python script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python fast_copy.py /source /destination
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or download a standalone binary (no Python needed) from the Releases page — available for Linux, macOS, and Windows.&lt;/p&gt;

&lt;p&gt;For SSH transfers, install paramiko:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;paramiko
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For faster hashing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;xxhash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Links&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/gekap/fast-copy" rel="noopener noreferrer"&gt;https://github.com/gekap/fast-copy&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;License: Apache 2.0&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I'd love to hear feedback — especially from anyone dealing with large file transfers or backup workflows. What tools are you currently using? What's missing from them?&lt;/p&gt;




</description>
      <category>python</category>
      <category>linux</category>
      <category>opensource</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
