Some time ago I completed the CodeCrafters BitTorrent challenge. Afterwards I rewrote the torrent client in Blazor, both to understand the protocol more deeply and to have a portfolio project. I chose Blazor rather than going with .NET MVC.
I took on this project to sharpen my C# skills and logic, and along the way I learned more about cryptography, networking, the TCP handshake, and of course how BitTorrent files work.
This article isn’t a deep dive, but rather a quick overview of how I built the BitTorrent client and how it works under the hood.
1. BitTorrent Encoding
BitTorrent files are encoded using bencoding, and you have to decode this format to get the file contents. Bencoding supports 4 types, encoded as follows:
- Integers are encoded as i<base10 integer>e.
  - Zero is encoded as i0e.
- Byte strings are encoded as <length>:<contents>.
  - The length is the number of bytes in the string, encoded in base 10.
  - A colon (:) separates the length and the contents.
  - The string "bencode" is encoded as 7:bencode.
- Lists are encoded as l<elements>e.
  - A list containing the string "bencode" and the integer -20 is encoded as l7:bencodei-20ee.
- Dictionaries are encoded as d<pairs>e, with keys in sorted order.
  - A dictionary with keys "wiki" → "bencode" and "meaning" → 42 is encoded as d7:meaningi42e4:wiki7:bencodee.
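To make the four rules concrete, here is a minimal decoder sketch. It is written in Python for brevity (not the C# of the actual project), and the function name `decode` is my own choice for illustration. It returns the decoded value together with the position where the next value starts.

```python
def decode(data: bytes, pos: int = 0):
    """Decode one bencoded value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":                      # integer: i<base10 integer>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                      # list: l<elements>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = decode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                      # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = decode(data, pos)
            value, pos = decode(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():                    # byte string: <length>:<contents>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"unexpected byte at offset {pos}")


value, _ = decode(b"d7:meaningi42e4:wiki7:bencodee")
print(value)  # {b'meaning': 42, b'wiki': b'bencode'}
```

Strings stay as raw bytes on purpose, because some values in a torrent file (such as the piece hashes) are binary and not valid text.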
2. BitTorrent File Content
A BitTorrent file contains a dictionary with keys like:
- announce
- announce-list
- comment
- created by
- creation date
- encoding
- info
What we mainly need is the info dictionary, because it describes the pieces we will download: it stores the piece length and the SHA-1 hash of every piece.
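Inside the info dictionary, the `pieces` value is one long byte string made of 20-byte SHA-1 hashes, one per piece, and `piece length` gives the size of every piece except possibly the last. A small Python sketch of how that could be unpacked follows; the helper names and the sample sizes are assumptions for illustration, not taken from the project.

```python
import hashlib


def split_piece_hashes(pieces: bytes) -> list:
    """Split the concatenated `pieces` value into 20-byte SHA-1 hashes."""
    if len(pieces) % 20 != 0:
        raise ValueError("pieces length must be a multiple of 20")
    return [pieces[i:i + 20] for i in range(0, len(pieces), 20)]


def piece_size(index: int, total_length: int, piece_length: int) -> int:
    """Size in bytes of the piece at `index`; the last piece may be shorter."""
    start = index * piece_length
    return min(piece_length, total_length - start)


# Made-up single-file info dictionary with three pieces.
info = {
    b"length": 40000,
    b"piece length": 16384,
    b"pieces": b"".join(hashlib.sha1(bytes([n])).digest() for n in range(3)),
}
hashes = split_piece_hashes(info[b"pieces"])
print(len(hashes))                                                # 3
print(piece_size(2, info[b"length"], info[b"piece length"]))      # 7232
```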
3. Peers
We use peers to download the file. To find peers, we use the tracker listed in the announce field of the torrent dictionary. We read the tracker URL from announce and send a GET request there, and in the response we get the list of available peers, including their IP and port. Then we use those peers to download the pieces.
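As a rough Python sketch of that exchange: the GET request carries query parameters such as `info_hash` (the SHA-1 of the bencoded info dictionary) and `peer_id`, and many trackers answer with a compact peer list where every peer takes 6 bytes (4 for the IPv4 address, 2 for the port). The function names, the tracker URL, and the default port below are assumptions for illustration, and a tracker may also return peers in a non-compact form that this sketch does not handle.

```python
import socket
import struct
from urllib.parse import urlencode


def build_announce_url(announce: str, info_hash: bytes, peer_id: bytes,
                       left: int, port: int = 6881) -> str:
    """Build the tracker GET URL; binary values are percent-encoded."""
    query = urlencode({
        "info_hash": info_hash,
        "peer_id": peer_id,
        "port": port,
        "uploaded": 0,
        "downloaded": 0,
        "left": left,       # bytes still to download
        "compact": 1,       # ask for the 6-bytes-per-peer format
    })
    separator = "&" if "?" in announce else "?"
    return f"{announce}{separator}{query}"


def parse_compact_peers(blob: bytes) -> list:
    """Turn a compact peer list into (ip, port) tuples."""
    if len(blob) % 6 != 0:
        raise ValueError("compact peer list must be a multiple of 6 bytes")
    peers = []
    for offset in range(0, len(blob), 6):
        ip = socket.inet_ntoa(blob[offset:offset + 4])
        (port,) = struct.unpack(">H", blob[offset + 4:offset + 6])
        peers.append((ip, port))
    return peers


print(parse_compact_peers(bytes([192, 168, 0, 1, 0x1A, 0xE1])))
# [('192.168.0.1', 6881)]
```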
We can also download from just one peer, but using multiple peers in parallel is recommended to improve speed. Many peers are dead or slow, so we should check that each peer responds properly before relying on it.
4. TCP Handshake
BitTorrent uses TCP for the communication between client and peer. We connect to the peer and exchange handshakes, and after a successful handshake the peer sends messages like bitfield (which pieces it has) and unchoke (permission to start requesting). On our side we have to send messages like interested, and then request to ask for a specific piece to download.
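On the wire, the handshake is a fixed 68-byte message, and every message after it is a 4-byte big-endian length, a 1-byte message ID, and a payload. The Python sketch below only builds and parses those bytes (no sockets), with helper names of my own choosing; the 16384-byte block size is the commonly used request size, taken here as an assumption.

```python
import struct

PROTOCOL = b"BitTorrent protocol"
# Message IDs in protocol order: 0 = choke ... 7 = piece.
CHOKE, UNCHOKE, INTERESTED, NOT_INTERESTED, HAVE, BITFIELD, REQUEST, PIECE = range(8)


def build_handshake(info_hash: bytes, peer_id: bytes) -> bytes:
    """1 length byte + protocol name + 8 reserved bytes + info hash + peer id."""
    return bytes([len(PROTOCOL)]) + PROTOCOL + bytes(8) + info_hash + peer_id


def build_message(msg_id: int, payload: bytes = b"") -> bytes:
    """Length prefix counts the ID byte plus the payload."""
    return struct.pack(">IB", len(payload) + 1, msg_id) + payload


def build_request(index: int, begin: int, length: int = 16384) -> bytes:
    """Ask for `length` bytes of piece `index`, starting at offset `begin`."""
    return build_message(REQUEST, struct.pack(">III", index, begin, length))


def parse_message(frame: bytes):
    """Return (msg_id, payload); a zero length is a keep-alive with no ID."""
    (length,) = struct.unpack(">I", frame[:4])
    if length == 0:
        return None, b""
    return frame[4], frame[5:4 + length]


def has_piece(bitfield: bytes, index: int) -> bool:
    """Bit 0 is the highest bit of the first byte."""
    return bool(bitfield[index // 8] >> (7 - index % 8) & 1)


print(len(build_handshake(bytes(20), bytes(20))))  # 68
print(build_message(INTERESTED))                   # b'\x00\x00\x00\x01\x02'
```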
If a piece fails to download, you can put its index in a retry queue and request it again later, either from a new peer or from the same peer (I still have to implement this too 💀).
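One possible shape for such a retry queue, sketched in Python: failed piece indexes go back to the end of the queue and are tried against a different peer next time. This is only an illustration of the idea, not the project's code; `download_piece` is a hypothetical callable that returns the piece bytes or raises `OSError` on a network failure.

```python
from collections import deque


def download_with_retries(piece_indexes, peers, download_piece, max_attempts=3):
    """Return (done, failed): done maps index -> data, failed lists given-up indexes."""
    pending = deque((index, 0) for index in piece_indexes)
    done, failed = {}, []
    while pending:
        index, attempts = pending.popleft()
        # Rotate through the peer list so a retry lands on a different peer.
        peer = peers[(index + attempts) % len(peers)]
        try:
            done[index] = download_piece(peer, index)
        except OSError:
            if attempts + 1 < max_attempts:
                pending.append((index, attempts + 1))  # try again later
            else:
                failed.append(index)
    return done, failed
```

Capping the attempts keeps a piece that no peer can serve from looping forever.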
5. Download File
Now that the connection with the peer is established, the next step is to download the file, but it is not that simple. There are many things you have to take care of, like making sure each piece arrived intact by comparing its SHA-1 hash with the expected hash from the info dictionary.
The hashes have to match, and as I wrote before, you should download from multiple peers and properly assign pieces to peers.
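Both checks can be sketched in a few lines of Python. The verification is just a SHA-1 comparison; the assignment shown here is one simple strategy I picked for illustration (give each piece to the least-loaded peer that has it), and the function names and the peer-to-pieces mapping are assumptions rather than the project's actual design.

```python
import hashlib


def verify_piece(data: bytes, expected_hash: bytes) -> bool:
    """A piece is valid when its SHA-1 digest equals the hash from the info dictionary."""
    return hashlib.sha1(data).digest() == expected_hash


def assign_pieces(piece_count: int, peers_with_pieces: dict) -> dict:
    """Map each piece index to the least-loaded peer that has that piece.

    peers_with_pieces maps a peer to the set of piece indexes it advertised.
    Pieces that no peer has are left out of the plan.
    """
    load = {peer: 0 for peer in peers_with_pieces}
    plan = {}
    for index in range(piece_count):
        candidates = [p for p, have in peers_with_pieces.items() if index in have]
        if not candidates:
            continue
        peer = min(candidates, key=load.__getitem__)
        plan[index] = peer
        load[peer] += 1
    return plan


print(assign_pieces(3, {"a": {0, 1, 2}, "b": {1, 2}}))
# {0: 'a', 1: 'b', 2: 'a'}
```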
If you got this far, pray that your file is not corrupt and that no piece is still pending, otherwise all your hard work is useless.
Anyway, this was just a learning project to build my skills, and it helped me a lot, though it was not easy. This post is not about the full details or the code itself, just the under-the-hood process of how it works.
✨ Here’s the full source code on GitHub if you want to take a look.