<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Tyler Tan</title>
    <description>The latest articles on DEV Community by Tyler Tan (@tyler_tan_13b1f742020d35a).</description>
    <link>https://dev.to/tyler_tan_13b1f742020d35a</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3938418%2F11691764-c9e5-4c1c-9738-7fe2c8eb3d08.png</url>
      <title>DEV Community: Tyler Tan</title>
      <link>https://dev.to/tyler_tan_13b1f742020d35a</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tyler_tan_13b1f742020d35a"/>
    <language>en</language>
    <item>
      <title>Building BitTorrent from Scratch: What 2500 Lines of Modern C++ Can Do</title>
      <dc:creator>Tyler Tan</dc:creator>
      <pubDate>Mon, 18 May 2026 15:51:50 +0000</pubDate>
      <link>https://dev.to/tyler_tan_13b1f742020d35a/building-bittorrent-from-scratch-what-2500-lines-of-modern-c-can-do-3hhn</link>
      <guid>https://dev.to/tyler_tan_13b1f742020d35a/building-bittorrent-from-scratch-what-2500-lines-of-modern-c-can-do-3hhn</guid>
      <description>&lt;p&gt;A working BitTorrent downloader — from raw TCP sockets to SHA-1 hashing, all written by hand.&lt;/p&gt;

&lt;p&gt;This project starts at the socket level: I wrote my own SHA-1, hand-rolled HTTP requests, implemented bencoding from scratch, defined all seven peer wire protocol message types one by one, and finally spawned multiple peer connections with std::jthread for parallel downloading. It supports both .torrent files and magnet links, and comes with 83 unit tests. Apart from a JSON formatting library and the test framework, it has zero external dependencies.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/Tenaryo/TinyBitTorrent" rel="noopener noreferrer"&gt;TinyBitTorrent&lt;/a&gt;, built with C++23.&lt;/p&gt;

&lt;h2&gt;
  
  
  What BitTorrent Is
&lt;/h2&gt;

&lt;p&gt;Before diving into the implementation, let's take a minute to understand what BitTorrent actually does.&lt;/p&gt;

&lt;p&gt;The traditional file download model is straightforward: you click a link, your browser sends an HTTP request, and the server pushes the file to you. The bottleneck is equally straightforward — all the bandwidth pressure sits on a single server. More users means slower speeds, and if the server goes down, the file is gone.&lt;/p&gt;

&lt;p&gt;BitTorrent turns this model on its head by making every downloader an uploader at the same time. A file is split into many small chunks called pieces, each with its own SHA-1 hash. Instead of downloading all pieces from one central server, you grab a few from each of dozens — or hundreds — of peers who are also downloading, or have already finished. Meanwhile, the pieces you already have can be uploaded to other peers. Paradoxically, the more people participate, the faster the entire distribution network becomes.&lt;/p&gt;

&lt;p&gt;To implement this protocol, the first problem to solve is: how do you encode and transmit data and metadata? BitTorrent uses a format called bencoding — simple, compact, and unambiguous. Let's start there.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bencoding: BitTorrent's JSON
&lt;/h2&gt;

&lt;p&gt;Bencoding is BitTorrent's native serialization format. You can think of it as JSON's binary cousin. Where JSON uses curly braces and square brackets to mark structure, bencoding uses type prefixes and length prefixes. There are only four types.&lt;/p&gt;

&lt;p&gt;The first is the string, formatted as &lt;code&gt;length:content&lt;/code&gt;. For example, &lt;code&gt;4:spam&lt;/code&gt; means the string "spam", and &lt;code&gt;11:hello world&lt;/code&gt; means "hello world". The number before the colon must be a decimal integer with no leading zeros.&lt;/p&gt;

&lt;p&gt;The second type is the integer, wrapped in &lt;code&gt;i&lt;/code&gt; and &lt;code&gt;e&lt;/code&gt;. So &lt;code&gt;i42e&lt;/code&gt; is 42, and &lt;code&gt;i-3e&lt;/code&gt; is -3. Leading zeros are forbidden (zero itself is written &lt;code&gt;i0e&lt;/code&gt;), and &lt;code&gt;i-0e&lt;/code&gt; is not allowed either.&lt;/p&gt;

&lt;p&gt;The third type is the list, wrapped in &lt;code&gt;l&lt;/code&gt; and &lt;code&gt;e&lt;/code&gt;, containing any number of bencoded values. For instance, &lt;code&gt;l4:spami42ee&lt;/code&gt; is a list with the string "spam" and the integer 42. Lists can nest other lists and dictionaries.&lt;/p&gt;

&lt;p&gt;The fourth type is the dictionary, wrapped in &lt;code&gt;d&lt;/code&gt; and &lt;code&gt;e&lt;/code&gt;, with keys and values alternating. Keys must be strings; values can be any type. Something like &lt;code&gt;d3:foo3:bar4:infod6:lengthi1024eee&lt;/code&gt; represents &lt;code&gt;{"foo": "bar", "info": {"length": 1024}}&lt;/code&gt;. Dictionary keys must appear in sorted order, compared as raw byte strings; the protocol explicitly requires this, and it is what gives every dictionary exactly one valid encoding.&lt;/p&gt;
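
&lt;p&gt;The four rules can be sketched as a handful of small functions. This is a minimal illustration rather than the project's encoder: the helper names (&lt;code&gt;encode_string&lt;/code&gt;, &lt;code&gt;encode_int&lt;/code&gt;, &lt;code&gt;encode_list&lt;/code&gt;, &lt;code&gt;encode_dict&lt;/code&gt;) are hypothetical, and list items and dictionary values are passed in already encoded to keep it short.&lt;/p&gt;

<![CDATA[
```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helpers illustrating the four bencoding rules.

// String: "<length>:<bytes>"
std::string encode_string(const std::string& s) {
    return std::to_string(s.size()) + ":" + s;
}

// Integer: "i<decimal>e"
std::string encode_int(std::int64_t n) {
    return "i" + std::to_string(n) + "e";
}

// List: "l<items>e", where every item is already bencoded.
std::string encode_list(const std::vector<std::string>& encoded_items) {
    std::string out = "l";
    for (const auto& item : encoded_items) out += item;
    return out + "e";
}

// Dictionary: "d<key><value>...e" with keys sorted as raw byte strings.
// Keys are plain strings; values are already bencoded.
std::string encode_dict(std::vector<std::pair<std::string, std::string>> entries) {
    std::sort(entries.begin(), entries.end(),
              [](const auto& a, const auto& b) { return a.first < b.first; });
    std::string out = "d";
    for (const auto& [key, value] : entries) out += encode_string(key) + value;
    return out + "e";
}
```
]]>

&lt;p&gt;Passing the entries for &lt;code&gt;info&lt;/code&gt; and &lt;code&gt;foo&lt;/code&gt; in either order produces the same &lt;code&gt;d3:foo3:bar4:infod6:lengthi1024eee&lt;/code&gt; shown above, because the keys are sorted before they are written.&lt;/p&gt;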

&lt;p&gt;At this point you might wonder: why not just use JSON? Two reasons. First, JSON can't directly represent binary data like SHA-1 hashes without something like Base64 encoding, which inflates the data by about a third. Second, bencoding is extremely simple to parse: no quote escaping, no Unicode handling, none of the complexity a JSON parser has to deal with. For BitTorrent in 2001, a format with zero library dependencies was the right call.&lt;/p&gt;

&lt;p&gt;My implementation uses std::variant as the data model. String and Integer are plain type aliases, List and Dict are small structs, and all four are wrapped together in a variant:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;String&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;Integer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;int64_t&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Declare List and Dict first so the Value alias can name them&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;variant&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Integer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;items_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;Dict&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;pair&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;items_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There's an interesting circular dependency here: the definition of Value names List and Dict, and both List and Dict contain Value. This works because the Value alias only needs List and Dict to be declared, not defined, and because std::vector has officially accepted incomplete element types since C++17. By the time anything actually constructs a Value, both structs are complete. It's the cleanest way to write it, so that's what I went with.&lt;/p&gt;

&lt;p&gt;The parser is a recursive descent design that takes a mutable string_view reference and dispatches on the first character:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string_view&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Value&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="sc"&gt;'0'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="sc"&gt;'9'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="n"&gt;likely&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;colon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;':'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="cm"&gt;/* parse int from data[0..colon) */&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;remove_prefix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;colon&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;substr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;)};&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;remove_prefix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sc"&gt;'i'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="n"&gt;unlikely&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* parse integer... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sc"&gt;'l'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="n"&gt;unlikely&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* parse list... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sc"&gt;'d'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="n"&gt;unlikely&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* parse dict... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the first character is a digit, we enter the string branch (the most common case, marked &lt;code&gt;[[likely]]&lt;/code&gt;); &lt;code&gt;i&lt;/code&gt; means integer, and so on. The encoder runs in reverse, using std::format_to to assemble the prefix strings, with dict keys sorted via std::ranges::sort before encoding.&lt;/p&gt;
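
&lt;p&gt;For readers who want to see the elided branches filled in, here is a self-contained sketch of the same recursive descent idea. It is not the project's parser: the helper &lt;code&gt;parse_int&lt;/code&gt; and the error messages are assumptions made for illustration, the types are redeclared so the block stands alone, and the leading-zero rules are not enforced.&lt;/p&gt;

<![CDATA[
```cpp
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <string_view>
#include <system_error>
#include <utility>
#include <variant>
#include <vector>

struct List;
struct Dict;
using Value = std::variant<std::string, std::int64_t, List, Dict>;
struct List { std::vector<Value> items_; };
struct Dict { std::vector<std::pair<std::string, Value>> items_; };

// Hypothetical helper: parse all of `text` as a decimal integer or throw.
std::int64_t parse_int(std::string_view text) {
    std::int64_t value{};
    const char* last = text.data() + text.size();
    auto [ptr, ec] = std::from_chars(text.data(), last, value);
    if (ec != std::errc{} || ptr != last) throw std::runtime_error("bad integer");
    return value;
}

// Decode one value from the front of `data` and consume the bytes it used.
Value decode(std::string_view& data) {
    if (data.empty()) throw std::runtime_error("unexpected end of input");
    const char tag = data.front();

    if (tag >= '0' && tag <= '9') {  // string: <len>:<bytes>
        const auto colon = data.find(':');
        if (colon == std::string_view::npos) throw std::runtime_error("missing ':'");
        const auto len = static_cast<std::size_t>(parse_int(data.substr(0, colon)));
        data.remove_prefix(colon + 1);
        if (len > data.size()) throw std::runtime_error("string runs past input");
        std::string str{data.substr(0, len)};
        data.remove_prefix(len);
        return str;
    }
    if (tag == 'i') {  // integer: i<digits>e
        const auto end = data.find('e');
        if (end == std::string_view::npos) throw std::runtime_error("missing 'e'");
        const std::int64_t number = parse_int(data.substr(1, end - 1));
        data.remove_prefix(end + 1);
        return number;
    }
    if (tag == 'l') {  // list: l<values>e
        data.remove_prefix(1);
        List list;
        while (!data.empty() && data.front() != 'e') list.items_.push_back(decode(data));
        if (data.empty()) throw std::runtime_error("unterminated list");
        data.remove_prefix(1);
        return list;
    }
    if (tag == 'd') {  // dict: d<key><value>...e
        data.remove_prefix(1);
        Dict dict;
        while (!data.empty() && data.front() != 'e') {
            Value key = decode(data);
            auto* name = std::get_if<std::string>(&key);
            if (name == nullptr) throw std::runtime_error("dict key must be a string");
            Value value = decode(data);
            dict.items_.emplace_back(std::move(*name), std::move(value));
        }
        if (data.empty()) throw std::runtime_error("unterminated dict");
        data.remove_prefix(1);
        return dict;
    }
    throw std::runtime_error("unknown type tag");
}
```
]]>

&lt;p&gt;Every branch checks for truncated input before indexing, which matters once the parser is fed tracker responses from the network rather than trusted local files.&lt;/p&gt;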

&lt;h2&gt;
  
  
  .torrent Files: the Download Shopping List
&lt;/h2&gt;

&lt;p&gt;With bencoding in place, parsing .torrent files is the natural next step. So why do we even need a torrent file? The answer is simple: to download something, you have to know what it is, how big it is, and where to find people who have it. A .torrent file is exactly that shopping list — it tells you the file size, how many pieces it's split into, the hash of each piece, and the tracker URL for finding peers.&lt;/p&gt;

&lt;p&gt;A .torrent file is essentially a single bencoded dictionary. At the top level there are two critical keys: &lt;code&gt;announce&lt;/code&gt;, which is the tracker URL, and &lt;code&gt;info&lt;/code&gt;, a sub-dictionary containing everything directly related to the download — &lt;code&gt;length&lt;/code&gt; (total file size in bytes), &lt;code&gt;piece length&lt;/code&gt; (the size of each piece, typically 256 KB to 1 MB), and &lt;code&gt;pieces&lt;/code&gt; (a long string of all 20-byte SHA-1 hashes concatenated together).&lt;/p&gt;

&lt;p&gt;My Metainfo struct stores these values, plus the computed info hash, in five fields:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="nc"&gt;Metainfo&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;announce_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;              &lt;span class="c1"&gt;// tracker URL&lt;/span&gt;
    &lt;span class="kt"&gt;int64_t&lt;/span&gt; &lt;span class="n"&gt;length_&lt;/span&gt;&lt;span class="p"&gt;{};&lt;/span&gt;                  &lt;span class="c1"&gt;// total file size&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;info_hash_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;             &lt;span class="c1"&gt;// 20-byte raw SHA1&lt;/span&gt;
    &lt;span class="kt"&gt;int64_t&lt;/span&gt; &lt;span class="n"&gt;piece_length_&lt;/span&gt;&lt;span class="p"&gt;{};&lt;/span&gt;            &lt;span class="c1"&gt;// size of each piece&lt;/span&gt;
    &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;piece_hashes_&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// hex hashes per piece&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Parsing is a two-level iteration. First pass over the top-level dict grabs announce and info; second pass over the info sub-dict extracts length, piece length, and pieces. The pieces field needs a bit of special handling — the raw data is every 20-byte SHA-1 hash concatenated end-to-end. I slice it into 20-byte chunks and convert each one into a 40-character hex string for storage.&lt;/p&gt;
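
&lt;p&gt;The slicing step can be sketched as follows. The function name &lt;code&gt;split_piece_hashes&lt;/code&gt; is hypothetical, and rejecting input whose length is not a multiple of 20 is an assumption of this sketch rather than something the project is known to do.&lt;/p&gt;

<![CDATA[
```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper: split the raw `pieces` string into 40-character hex hashes.
std::vector<std::string> split_piece_hashes(std::string_view pieces) {
    constexpr std::size_t kHashSize = 20;  // SHA-1 digest size in bytes
    if (pieces.size() % kHashSize != 0) {
        throw std::invalid_argument("pieces length is not a multiple of 20");
    }
    constexpr char kHex[] = "0123456789abcdef";
    std::vector<std::string> hashes;
    hashes.reserve(pieces.size() / kHashSize);
    for (std::size_t pos = 0; pos < pieces.size(); pos += kHashSize) {
        std::string hex;
        hex.reserve(kHashSize * 2);
        for (char c : pieces.substr(pos, kHashSize)) {
            // Go through unsigned char so bytes above 0x7F index the table correctly.
            const auto byte = static_cast<unsigned char>(c);
            hex.push_back(kHex[byte >> 4]);
            hex.push_back(kHex[byte & 0x0F]);
        }
        hashes.push_back(std::move(hex));
    }
    return hashes;
}
```
]]>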

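&lt;p&gt;The most noteworthy step is computing the info_hash. This isn't just any hash: you re-bencode the entire info dictionary, then compute SHA-1 over the encoded result. Think of it as taking a "fingerprint" of the info dict. The re-encoded bytes have to be identical to the bytes in the original file, which holds as long as the decoder keeps every key and the file was encoded canonically; hashing the original byte range of the info value is the alternative that avoids this assumption. Everything downstream (tracker requests, peer handshakes) identifies the torrent by this fingerprint. The info_hash is the file's universal identity card in the BitTorrent world.&lt;br&gt;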
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="n"&gt;util&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Sha1&lt;/span&gt; &lt;span class="n"&gt;hasher&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;hasher&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bencode&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;}));&lt;/span&gt;
&lt;span class="n"&gt;info_hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hasher&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;finalize&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As a side note, there's also a from_info_dict function that reconstructs a Metainfo from an info dictionary obtained through the ut_metadata extension protocol. This comes into play with magnet link downloads, which I'll cover later.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finding Peers, Shaking Hands, Downloading
&lt;/h2&gt;

&lt;p&gt;Once you have the metadata from a .torrent file, the first order of business is finding who has your data. That's the tracker's job.&lt;/p&gt;

&lt;p&gt;A tracker is essentially an HTTP service. You send it your info_hash and your peer_id (a 20-byte random string that identifies you), and it returns a list of peers currently downloading or seeding that file. I construct an HTTP GET request with the info_hash, peer_id, port number, and download progress as URL parameters, appending &lt;code&gt;compact=1&lt;/code&gt; at the end. compact=1 means "give me the peer list in compact form" — 6 bytes per peer: 4 for the IP address and 2 for the port. This keeps the tracker response tiny; even dozens of peers fit in a few hundred bytes. After parsing the response, I split the peers field into 6-byte chunks, extract the IP and port from each, and the peer list is ready.&lt;/p&gt;
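
&lt;p&gt;Two details of this exchange are easy to get wrong: the info_hash is raw binary and has to be percent-encoded before it goes into the URL, and the port in each compact entry is big-endian. A minimal sketch of both, where the &lt;code&gt;PeerAddress&lt;/code&gt; type and the function names are hypothetical and only IPv4 compact entries are handled:&lt;/p&gt;

<![CDATA[
```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

struct PeerAddress {  // hypothetical type for this sketch
    std::string ip;
    std::uint16_t port{};
};

// Percent-encode raw bytes (such as the 20-byte info_hash) for a URL query string.
std::string percent_encode(std::string_view bytes) {
    constexpr char kHex[] = "0123456789ABCDEF";
    std::string out;
    for (char c : bytes) {
        const auto b = static_cast<unsigned char>(c);
        const bool unreserved = (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z') ||
                                (b >= '0' && b <= '9') || b == '-' || b == '_' ||
                                b == '.' || b == '~';
        if (unreserved) {
            out.push_back(c);
        } else {
            out.push_back('%');
            out.push_back(kHex[b >> 4]);
            out.push_back(kHex[b & 0x0F]);
        }
    }
    return out;
}

// Split a compact peer string into 6-byte entries:
// 4 bytes of IPv4 address followed by a 2-byte big-endian port.
std::vector<PeerAddress> parse_compact_peers(std::string_view peers) {
    if (peers.size() % 6 != 0) {
        throw std::invalid_argument("peers length is not a multiple of 6");
    }
    std::vector<PeerAddress> result;
    for (std::size_t pos = 0; pos < peers.size(); pos += 6) {
        const auto byte = [&](std::size_t i) {
            return static_cast<unsigned>(static_cast<unsigned char>(peers[pos + i]));
        };
        PeerAddress peer;
        peer.ip = std::to_string(byte(0)) + "." + std::to_string(byte(1)) + "." +
                  std::to_string(byte(2)) + "." + std::to_string(byte(3));
        peer.port = static_cast<std::uint16_t>((byte(4) << 8) | byte(5));
        result.push_back(std::move(peer));
    }
    return result;
}
```
]]>

&lt;p&gt;For example, the six bytes C0 A8 01 02 1A E1 decode to 192.168.1.2 on port 6881.&lt;/p&gt;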

&lt;p&gt;With a peer's IP and port in hand, the next step is to open a TCP connection and perform the BitTorrent handshake. The handshake packet is a neat 68 bytes, each segment with a clear purpose. Byte 1 is the protocol string length (always 19). The next 19 bytes are "BitTorrent protocol". Then come 8 reserved bytes, bytes 21 through 28; setting the 0x10 bit of byte 26 (offset 25 when counting from zero) signals extension protocol support. Then 20 bytes of info_hash. Finally, 20 bytes of peer_id. The peer responds with an identically formatted packet; I verify the protocol string and info_hash match, and the handshake is done. The code for this is shorter than the description:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="nf"&gt;make_handshake&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;string_view&lt;/span&gt; &lt;span class="n"&gt;info_hash&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;string_view&lt;/span&gt; &lt;span class="n"&gt;peer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;reserve_extensions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;68&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sc"&gt;'\0'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;19&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;string_view&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"BitTorrent protocol"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;begin&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// string_view drops the literal's trailing NUL&lt;/span&gt;
    &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;reserve_extensions&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="sc"&gt;'\x10'&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="sc"&gt;'\x00'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info_hash&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;begin&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;28&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;peer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;begin&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;48&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After the handshake, both sides enter a simple state machine. The peer first sends a bitfield message — a bitmap where each bit indicates whether the peer has the corresponding piece. After inspecting the bitfield, I send an interested message, essentially saying "I'd like to download from you." Then I wait for an unchoke message. Only after receiving it am I officially granted permission to request data. Choke and unchoke form BitTorrent's flow control mechanism; a peer can choke you at any time to deny transfers, though in practice most peers unchoke right after receiving an interested message.&lt;/p&gt;
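
&lt;p&gt;All of these messages share one framing: a 4-byte big-endian length, a 1-byte message ID, then the payload. The sketch below shows that framing and a bitfield lookup. The IDs are the ones defined by the base protocol; the names &lt;code&gt;MessageId&lt;/code&gt;, &lt;code&gt;frame_message&lt;/code&gt;, and &lt;code&gt;has_piece&lt;/code&gt; are hypothetical.&lt;/p&gt;

<![CDATA[
```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Message IDs from the base peer wire protocol.
enum class MessageId : std::uint8_t {
    Choke = 0, Unchoke = 1, Interested = 2, NotInterested = 3,
    Have = 4, Bitfield = 5, Request = 6, Piece = 7, Cancel = 8,
};

// Frame a message: 4-byte big-endian length (ID byte plus payload), the ID, the payload.
std::string frame_message(MessageId id, std::string_view payload = {}) {
    const auto length = static_cast<std::uint32_t>(payload.size() + 1);
    std::string out;
    out.reserve(4 + length);
    for (int shift = 24; shift >= 0; shift -= 8) {
        out.push_back(static_cast<char>((length >> shift) & 0xFF));
    }
    out.push_back(static_cast<char>(id));
    out.append(payload);
    return out;
}

// Bitfield lookup: piece 0 is the most significant bit of the first byte.
bool has_piece(std::string_view bitfield, std::size_t index) {
    const std::size_t byte = index / 8;
    if (byte >= bitfield.size()) return false;
    const auto bits = static_cast<unsigned char>(bitfield[byte]);
    return ((bits >> (7 - index % 8)) & 1U) != 0;
}
```
]]>

&lt;p&gt;An interested message has no payload, so it is always the five bytes 00 00 00 01 02.&lt;/p&gt;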

&lt;p&gt;The logic for actually downloading a piece is the most interesting part of the entire project. A piece can be several megabytes, and you can't request it all at once: mainstream clients use 16 KB requests and typically drop connections that ask for more. So each piece is split into 16 KB blocks (the last one may be shorter), with a separate Request message for each block carrying the piece index, the block's offset within the piece, and its length. But waiting for one block to arrive before requesting the next wastes network bandwidth, since every block would cost a full round trip. The better approach is pipelining: keep up to 5 requests in flight at all times. Whenever a Piece message arrives, I copy the data into the piece buffer at the correct offset and immediately send a new Request to fill the freed slot.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Fill the pipeline: send up to 5 block requests at once&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pending&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;send_idx&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;total_blocks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;piece_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;blocks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;send_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;begin_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="n"&gt;blocks&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;send_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;length_&lt;/span&gt;&lt;span class="p"&gt;}));&lt;/span&gt;
    &lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="n"&gt;send_idx&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="n"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Event loop: receive Pieces, fill buffer, replenish requests&lt;/span&gt;
&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;received&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;total_blocks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;recv_message&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="n"&gt;visit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Overloaded&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Piece&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;pce&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pce&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;block_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;piece_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;begin&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;pce&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;begin_&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="n"&gt;received&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;send_idx&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;total_blocks&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;pending&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* send next request */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;[](&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;auto&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;  &lt;span class="c1"&gt;// ignore other message types&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// After all blocks arrive, verify SHA-1&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sha1_hex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;piece_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;expected_hash&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="p"&gt;...;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
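
&lt;p&gt;The loop above assumes a precomputed &lt;code&gt;blocks&lt;/code&gt; array. One way to build it, and to encode each Request, is sketched below. The &lt;code&gt;Block&lt;/code&gt; struct mirrors the &lt;code&gt;begin_&lt;/code&gt; and &lt;code&gt;length_&lt;/code&gt; members used in the snippet, but its exact definition, along with &lt;code&gt;split_into_blocks&lt;/code&gt;, &lt;code&gt;piece_size&lt;/code&gt;, and &lt;code&gt;encode_request&lt;/code&gt;, is an assumption.&lt;/p&gt;

<![CDATA[
```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

struct Block {                // hypothetical; mirrors blocks[i].begin_ / .length_
    std::uint32_t begin_{};   // byte offset inside the piece
    std::uint32_t length_{};  // block size in bytes
};

constexpr std::uint32_t kBlockSize = 16 * 1024;

// Split a piece into 16 KB blocks; the final block may be shorter.
std::vector<Block> split_into_blocks(std::uint32_t piece_bytes) {
    std::vector<Block> blocks;
    for (std::uint32_t begin = 0; begin < piece_bytes; begin += kBlockSize) {
        blocks.push_back({begin, std::min(kBlockSize, piece_bytes - begin)});
    }
    return blocks;
}

// Size of piece `index`: every piece is piece_length bytes except possibly the last.
std::uint32_t piece_size(std::int64_t total_length, std::int64_t piece_length,
                         std::int64_t index) {
    const std::int64_t remaining = total_length - index * piece_length;
    return static_cast<std::uint32_t>(std::min(piece_length, remaining));
}

// Encode a Request: length prefix 13, ID 6, then index, begin, and length,
// each as a big-endian 32-bit integer (17 bytes in total).
std::string encode_request(std::uint32_t index, const Block& block) {
    std::string out;
    const auto put_u32 = [&out](std::uint32_t v) {
        for (int shift = 24; shift >= 0; shift -= 8) {
            out.push_back(static_cast<char>((v >> shift) & 0xFF));
        }
    };
    put_u32(13);
    out.push_back('\x06');
    put_u32(index);
    put_u32(block.begin_);
    put_u32(block.length_);
    return out;
}
```
]]>

&lt;p&gt;A 40000-byte final piece, for instance, splits into two full 16384-byte blocks and one 7232-byte block.&lt;/p&gt;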



&lt;p&gt;Once all blocks are in, I run SHA-1 over the assembled piece and compare it against the hash recorded in the .torrent file. A match means the piece is good. A mismatch means something went wrong in transit or the peer gave us bad data, so we throw an exception.&lt;/p&gt;

&lt;h2&gt;
  
  
  Multithreading: Full-Speed Download
&lt;/h2&gt;

&lt;p&gt;If you can download one piece, you can download the entire file. The logic connecting these two concepts is surprisingly straightforward.&lt;/p&gt;

&lt;p&gt;First, I set the output file to its final size with ftruncate. Think of this as "reserving your spot" on disk: the file has its full length from the start, and each piece's data just gets written to its correct offset with pwrite. (On most filesystems ftruncate produces a sparse file, so the blocks are only allocated as pieces are written.) No need to accumulate a file-sized buffer in memory.&lt;/p&gt;
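
&lt;p&gt;A minimal POSIX sketch of that write path follows. The function names &lt;code&gt;set_file_length&lt;/code&gt; and &lt;code&gt;write_piece&lt;/code&gt; are hypothetical, and the retry loop for short writes is an assumption of this sketch rather than something the project is known to do.&lt;/p&gt;

<![CDATA[
```cpp
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string_view>

// Set the output file to its final length. ftruncate only sets the size;
// many filesystems leave the file sparse until data is written.
void set_file_length(int fd, std::int64_t length) {
    if (::ftruncate(fd, static_cast<off_t>(length)) != 0) {
        throw std::runtime_error("ftruncate failed");
    }
}

// Write one verified piece at its final offset, index * piece_length.
// pwrite takes an explicit offset, so threads sharing `fd` never race on a file cursor.
void write_piece(int fd, std::int64_t piece_index, std::int64_t piece_length,
                 std::string_view data) {
    auto offset = static_cast<off_t>(piece_index * piece_length);
    while (!data.empty()) {
        const ssize_t written = ::pwrite(fd, data.data(), data.size(), offset);
        if (written <= 0) throw std::runtime_error("pwrite failed");
        data.remove_prefix(static_cast<std::size_t>(written));  // retry the remainder
        offset += written;
    }
}
```
]]>

&lt;p&gt;Because every write names its own offset, pieces can land in any order and from any worker without extra locking around the file descriptor.&lt;/p&gt;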

&lt;p&gt;Then comes the multithreading. I spawn one std::jthread worker per peer, each responsible for a contiguous range of pieces. Within a thread, a single TCP connection is established and reused for all pieces in that worker's range (saving handshake overhead). Across threads, everything runs in parallel, each talking to a different peer. The core logic is clean enough to fit in a handful of lines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Pre-allocate the file&lt;/span&gt;
&lt;span class="n"&gt;ftruncate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metainfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;length_&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// start and end (computed per worker, elided here) bound that worker's piece range&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;num_workers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;workers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;emplace_back&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;peer_idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;establish_connection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;peers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;peer_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;ip_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;peers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;peer_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;port_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...);&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;pi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;pi&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="n"&gt;pi&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;download_piece_on_connection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metainfo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pi&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
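            &lt;span class="c1"&gt;// Widen to off_t before multiplying so large byte offsets cannot overflow int&lt;/span&gt;
            &lt;span class="n"&gt;pwrite&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                   &lt;span class="k"&gt;static_cast&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;off_t&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pi&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;metainfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;piece_length_&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;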
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Error handling is taken care of too. An exception that escapes a thread's entry function calls std::terminate, so each worker catches its own exceptions; the first one is stored in a std::exception_ptr behind a mutex and rethrown after all threads have joined. A single failure therefore doesn't kill the whole process before the other threads have had a chance to clean up their resources.&lt;/p&gt;

&lt;h2&gt;
  
  
  Magnet Links: Throw Away the Torrent File
&lt;/h2&gt;

&lt;p&gt;The .torrent file path is done, but there's another — arguably more common — way to start a download: magnet links. You've definitely seen something like this: &lt;code&gt;magnet:?xt=urn:btih:abc123...&amp;amp;dn=filename&amp;amp;tr=tracker_url&lt;/code&gt;. At its core, it's just a URL embedding the file's info_hash, a suggested display name, and one or more tracker addresses.&lt;/p&gt;

&lt;p&gt;Why magnet links? A .torrent file may be small, but it's still a file — you have to get it from a website, a forum, or some other channel first. A magnet link is just a string. Sharing a link is infinitely more convenient than sharing a file. For the BitTorrent network, magnet links are also more decentralized: even if every torrent index site goes down, as long as someone is still seeding, pasting a link is enough to start downloading.&lt;/p&gt;

&lt;p&gt;The full magnet download flow adds one critical step compared to the .torrent path: since you don't have a torrent file, you have no idea how big the file is or what its piece hashes are — you have to ask a peer for this information. The rough flow goes like this: parse the magnet link to extract the info_hash and tracker URL, query the tracker for a peer list, establish a TCP connection and perform the base handshake, then use the extension protocol to request the info dictionary from the peer. Once the info_dict passes verification, the rest is exactly the same as the .torrent path — download all the pieces as usual.&lt;/p&gt;

&lt;p&gt;Parsing the magnet link itself is straightforward string processing: check for the &lt;code&gt;magnet:?&lt;/code&gt; prefix, find the 40-character hex hash after &lt;code&gt;xt=urn:btih:&lt;/code&gt; and convert it to 20 raw bytes, locate the tracker URL after &lt;code&gt;tr=&lt;/code&gt; and URL-decode it. Compared to bencoding, this is about as hard as drinking a glass of water.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Extension Handshake
&lt;/h2&gt;

&lt;p&gt;The core challenge of magnet link downloads is "without a torrent file, how do I know what to download?" BitTorrent's answer is the ut_metadata extension defined in BEP 9, which allows peers to exchange torrent info dictionaries. But to use ut_metadata, you first need to complete the extension handshake defined in BEP 10.&lt;/p&gt;

&lt;p&gt;The extension handshake is an extra round of negotiation that happens immediately after the standard handshake. First, I set the 0x10 bit of the sixth reserved byte (reserved[5]) in my handshake packet. That byte sits at offset 25 of the 68-byte handshake, which makes it the 26th byte when counting from 1. This flag tells the peer "I speak the extension protocol." If the peer also supports it, it will set the same bit in its own handshake.&lt;/p&gt;

&lt;p&gt;Right after the handshake, I send an extension handshake message. This is an Extended message (peer wire message ID 20) whose extended message ID is 0 (by convention, ID 0 is always the extension handshake), and its payload is the bencoded dictionary &lt;code&gt;{"m": {"ut_metadata": 1}}&lt;/code&gt;. This says "I support the ut_metadata extension, and messages you send me for it should use ID 1." The peer responds with a similarly structured dictionary &lt;code&gt;{"m": {"ut_metadata": N}}&lt;/code&gt;, telling me what message ID it has assigned to ut_metadata: it might be 1, 2, or some other number. From this point on, every ut_metadata message I send to this peer uses its ID N, while the messages it sends back to me carry the ID I declared.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// After the standard handshake, check if the peer supports extensions&lt;/span&gt;
&lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;has_ext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hs_buf&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="mh"&gt;0x10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;has_ext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Send extension handshake: {"m": {"ut_metadata": 1}}&lt;/span&gt;
    &lt;span class="n"&gt;Dict&lt;/span&gt; &lt;span class="n"&gt;ext&lt;/span&gt;&lt;span class="p"&gt;{{&lt;/span&gt;&lt;span class="s"&gt;"m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;{{&lt;/span&gt;&lt;span class="s"&gt;"ut_metadata"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Integer&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}}}}};&lt;/span&gt;
    &lt;span class="n"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sock&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Extended&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bencode&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ext&lt;/span&gt;&lt;span class="p"&gt;})}));&lt;/span&gt;

    &lt;span class="c1"&gt;// Parse the response to get the peer's ut_metadata message ID&lt;/span&gt;
    &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;recv_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sock&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;metadata_ext_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parse_ext_handshake_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;payload_&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once this step completes, I know exactly which message ID to use when requesting metadata.&lt;/p&gt;

&lt;h2&gt;
  
  
  Asking a Peer for Metadata
&lt;/h2&gt;

&lt;p&gt;With the peer's ut_metadata message ID in hand, requesting metadata means constructing the bencoded dictionary &lt;code&gt;{"msg_type": 0, "piece": 0}&lt;/code&gt; and sending it as an Extended message. msg_type=0 means "this is a request," and piece=0 means "give me chunk 0 of the metadata." (The ut_metadata protocol splits the info dictionary into 16 KiB chunks for transmission. Small torrents have an info dict that fits in a single chunk, so piece=0 is all this client asks for. Each piece hash takes 20 bytes, though, so an info dict with more than roughly 800 pieces exceeds 16 KiB and would need further requests for piece 1, 2, and so on.)&lt;/p&gt;

&lt;p&gt;The peer responds with an Extended message whose payload is a bencoded dictionary &lt;code&gt;{"msg_type": 1, "piece": 0, "total_size": N}&lt;/code&gt; followed immediately by the raw bytes of the bencoded info dictionary, which are appended after the dictionary rather than stored inside it. msg_type=1 means this is a data response, and total_size tells me how many bytes the info dictionary's bencoded form takes. The key operation is extracting the last total_size bytes from the payload. When the whole info dictionary fits in one chunk, that's the complete bencoded info dictionary.&lt;/p&gt;

&lt;p&gt;Once I have info_bencode, I do two things. First, bencode-decode it and feed it into from_info_dict to reconstruct a Metainfo — now I have piece_hashes, length, and piece_length, everything I need. Second, and this is the critical part, I compute SHA-1 over info_bencode and compare it against the info_hash from the magnet link. If they don't match, the peer gave me bogus data — throw an exception, try a different peer. This is "trust but verify"; the entire BitTorrent protocol's security rests on hash verification.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Request metadata&lt;/span&gt;
&lt;span class="n"&gt;send_metadata_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sock&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metadata_ext_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Receive the response&lt;/span&gt;
&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;recv_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sock&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;info_bencode&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parse_metadata_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;payload_&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Parse the info dict&lt;/span&gt;
&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;info_dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bencode&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info_bencode&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="k"&gt;auto&lt;/span&gt; &lt;span class="n"&gt;metainfo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;from_info_dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info_dict&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Verify: info_hash must match the magnet link&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sha1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info_bencode&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;info_hash&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="p"&gt;...;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once verification passes, the path forward is identical to the .torrent download: use the Metainfo to query the tracker for a peer list, spawn multiple threads for parallel piece download, and pwrite everything to disk. Magnet links and .torrent files converge on the same destination.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;From raw TCP sockets to bencoding, from .torrent parsing and tracker communication to the peer wire protocol's handshake and block pipelining, from multithreaded parallel download to magnet links and extension protocols — bit by bit, a working BitTorrent client came together. Around 2500 lines of source code, just under 3500 including tests and build configuration.&lt;/p&gt;

&lt;p&gt;The biggest takeaway from this project is that the best way to understand a protocol or a system is to implement it yourself. The BitTorrent protocol specification is only a handful of pages. But there's an ocean of difference between calling someone else's library and filling every byte of a socket buffer by hand, cross-referencing BEP documents to figure out why the peer won't send an unchoke.&lt;/p&gt;

&lt;p&gt;Of course, this implementation is aggressively minimal. No seeding (download-only), no DHT for decentralized peer discovery (fully tracker-dependent), no UDP tracker support (HTTP only), no rarest-first piece selection (just sequential assignment), no peer exchange (PEX), and no end-game mode. These are the clear dividing lines between a production-grade BitTorrent client and a wheel reinvented for learning. As a practical tool, it doesn't hold a candle to qBittorrent or Transmission. As a learning exercise, it did everything I wanted it to do.&lt;/p&gt;

&lt;p&gt;If the project interests you, the code is at &lt;a href="https://github.com/Tenaryo/TinyBitTorrent" rel="noopener noreferrer"&gt;https://github.com/Tenaryo/TinyBitTorrent&lt;/a&gt;. Feedback and drive-by comments welcome.&lt;/p&gt;

</description>
      <category>cpp</category>
      <category>networking</category>
      <category>opensource</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
