DEV Community

Andreas Tzionis

How BitTorrent (really) works:

1. Discovering nodes

The IP addresses of BitTorrent nodes (for a given torrent) are discovered through a DHT (distributed hash table) network based on Kademlia.

Kademlia assigns each node a random ID in the same format as the data IDs (i.e. the torrent hashes), so that nodes know about torrents with IDs close to their own. Closeness is measured by XOR-ing the two IDs.
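
To make "close" a bit more concrete, here is a minimal sketch that assumes the XOR metric Kademlia uses and sorts a few made-up node IDs by their distance to a torrent hash. The helper names (`xorDistance`, `sortByDistance`) and the sample IDs are hypothetical and are not taken from the scripts linked in this post.

```javascript
// Sketch: the distance between two IDs is their XOR, compared as a big-endian number.
// Real IDs are 20 bytes (160 bits); the helpers work for any equal-length buffers.

function xorDistance(a, b) {
  if (a.length !== b.length) throw new Error('IDs must have the same length');
  const out = Buffer.alloc(a.length);
  for (let i = 0; i < a.length; i++) out[i] = a[i] ^ b[i];
  return out;
}

// Return a new array of node IDs ordered from closest to farthest from `target`.
function sortByDistance(nodeIds, target) {
  return [...nodeIds].sort((x, y) =>
    Buffer.compare(xorDistance(x, target), xorDistance(y, target)));
}

// Hypothetical IDs: each buffer is 20 copies of the same byte.
const torrentHash = Buffer.alloc(20, 0xab);
const nodes = [Buffer.alloc(20, 0x00), Buffer.alloc(20, 0xaa), Buffer.alloc(20, 0xff)];

const sorted = sortByDistance(nodes, torrentHash);
// Closest first: aaaa, ffff, 0000 (first two bytes of each ID)
console.log(sorted.map((id) => id.toString('hex').slice(0, 4)));
```

The node whose ID starts with `aa` differs from the hash only in the lowest bit of each byte, so it ends up first.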

Kademlia messages are sent over UDP and contain a payload and some metadata (type and length).

The "get_peers" message returns the peers for a given ID (e.g. a torrent hash) if the queried node knows any; otherwise it returns the nodes it knows that are closer to that ID.
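
As an illustration, the sketch below builds a "get_peers" query and parses the node list from a reply. It assumes the message layout described in the DHT specification (BEP 5). The small `bencode` function stands in for the bencode npm package so the example has no dependencies, and `buildGetPeersQuery` and `parseCompactNodes` are hypothetical names.

```javascript
// Minimal bencode encoder (buffers, strings, integers, lists, dictionaries).
function bencode(value) {
  if (Buffer.isBuffer(value)) {
    return Buffer.concat([Buffer.from(`${value.length}:`), value]);
  }
  if (typeof value === 'string') return bencode(Buffer.from(value));
  if (Number.isInteger(value)) return Buffer.from(`i${value}e`);
  if (Array.isArray(value)) {
    const items = value.map((item) => bencode(item));
    return Buffer.concat([Buffer.from('l'), ...items, Buffer.from('e')]);
  }
  // Dictionaries: keys must appear in sorted order.
  const entries = Object.keys(value).sort()
    .flatMap((key) => [bencode(key), bencode(value[key])]);
  return Buffer.concat([Buffer.from('d'), ...entries, Buffer.from('e')]);
}

// t = transaction id, y = message type ("q" for query), q = query name, a = arguments.
function buildGetPeersQuery(transactionId, ownNodeId, infoHash) {
  return bencode({
    t: transactionId,
    y: 'q',
    q: 'get_peers',
    a: { id: ownNodeId, info_hash: infoHash },
  });
}

// Replies list nodes in "compact" form: 20-byte ID + 4-byte IPv4 address + 2-byte port.
function parseCompactNodes(buf) {
  const nodes = [];
  for (let i = 0; i + 26 <= buf.length; i += 26) {
    nodes.push({
      id: buf.subarray(i, i + 20),
      host: Array.from(buf.subarray(i + 20, i + 24)).join('.'),
      port: buf.readUInt16BE(i + 24),
    });
  }
  return nodes;
}

// Sample values taken from the example in BEP 5.
const query = buildGetPeersQuery(
  'aa',
  Buffer.from('abcdefghij0123456789'),
  Buffer.from('mnopqrstuvwxyz123456'),
);
console.log(query.toString());
```

The resulting buffer could then be sent with Node's `dgram` module, for example `dgram.createSocket('udp4').send(query, port, host)`, where `host` and `port` belong to a DHT node you already know.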

Script: https://github.com/liveduo/bittorrent-scripts/blob/main/1-connect-dht.js

*82 lines of code (depends on bencode npm package)

To find the nodes that own a given ID (i.e. a torrent hash), "get_peers" messages are sent recursively: each reply returns nodes that are closer and closer to the torrent, and those nodes are queried next.
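
The loop can be sketched roughly as follows. This is a simplified illustration, not the linked script: `findPeers` and `compareDistance` are hypothetical names, and `queryGetPeers(node, infoHash)` is an assumed async function that sends one "get_peers" query and resolves to `{ peers }` and/or `{ nodes }`. It queries one node at a time, whereas real clients query several in parallel and handle timeouts.

```javascript
// Compare two IDs by their XOR distance to `target` (negative if `a` is closer).
function compareDistance(a, b, target) {
  for (let i = 0; i < target.length; i++) {
    const da = a[i] ^ target[i];
    const db = b[i] ^ target[i];
    if (da !== db) return da - db;
  }
  return 0;
}

async function findPeers(infoHash, bootstrapNodes, queryGetPeers, maxRounds = 16) {
  const queried = new Set();
  const candidates = [...bootstrapNodes];
  for (let round = 0; round < maxRounds; round++) {
    // Always ask the closest node that has not been asked yet.
    candidates.sort((a, b) => compareDistance(a.id, b.id, infoHash));
    const next = candidates.find((n) => !queried.has(`${n.host}:${n.port}`));
    if (!next) break;
    queried.add(`${next.host}:${next.port}`);

    const reply = await queryGetPeers(next, infoHash);
    if (reply.peers && reply.peers.length > 0) return reply.peers;
    candidates.push(...(reply.nodes || []));
  }
  return [];
}

// Simulated network with 1-byte IDs so the example runs without any sockets.
const target = Buffer.from([0xff]);
const fakeNetwork = {
  '10.0.0.1:6881': { nodes: [{ id: Buffer.from([0xf0]), host: '10.0.0.2', port: 6881 }] },
  '10.0.0.2:6881': { nodes: [{ id: Buffer.from([0xfe]), host: '10.0.0.3', port: 6881 }] },
  '10.0.0.3:6881': { peers: [{ host: '10.0.0.9', port: 51413 }] },
};
const fakeQuery = async (node) => fakeNetwork[`${node.host}:${node.port}`];
const bootstrap = [{ id: Buffer.from([0x00]), host: '10.0.0.1', port: 6881 }];

const lookup = findPeers(target, bootstrap, fakeQuery);
lookup.then((peers) => console.log(peers));
```

In the simulated run, each hop lands on a node whose ID is closer to `ff` (00, then f0, then fe) until one of them answers with peers.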

Script: https://github.com/LiveDuo/bittorrent-scripts/blob/main/2-discover-nodes.js

*130 lines of code (depends on bencode npm package)

2. Connecting to a node

Initially (before 2008), BitTorrent depended on centralised servers: trackers, which kept the list of peers for each torrent, and sites hosting the .torrent files that contained the torrent metadata.

This metadata includes the total size, the list of files, the piece hashes used as checksums, and more.
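
For a rough idea of what this looks like once decoded, the sketch below models the metadata of a made-up two-file torrent. The field names follow the .torrent format (BEP 3), but the values and the helper names (`totalSize`, `pieceCount`, `pieceHashes`) are hypothetical.

```javascript
// Decoded metadata ("info" dictionary) of a hypothetical two-file torrent.
const info = {
  name: 'example',
  'piece length': 262144,       // bytes per piece (256 KiB)
  pieces: Buffer.alloc(3 * 20), // concatenated 20-byte SHA-1 hashes, one per piece
  files: [
    { path: ['a.txt'], length: 500000 },
    { path: ['b.txt'], length: 200000 },
  ],
};

// Single-file torrents carry `length` directly; multi-file torrents carry `files`.
function totalSize(meta) {
  if (!meta.files) return meta.length;
  return meta.files.reduce((sum, file) => sum + file.length, 0);
}

function pieceCount(meta) {
  return Math.ceil(totalSize(meta) / meta['piece length']);
}

// Split the `pieces` blob into one 20-byte hash per piece.
function pieceHashes(meta) {
  const hashes = [];
  for (let i = 0; i < meta.pieces.length; i += 20) {
    hashes.push(meta.pieces.subarray(i, i + 20));
  }
  return hashes;
}

console.log(totalSize(info), pieceCount(info), pieceHashes(info).length);
```

Here 700000 bytes at 262144 bytes per piece gives 3 pieces, which matches the 3 hashes stored in `pieces`.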

Now metadata can be downloaded from nodes in the DHT without centralised parties.

BitTorrent messages are sent over TCP and contain a payload and some metadata.

The "handshake", "interested" and "unchoke" messages establish a connection and signal interest in a torrent.

The "bitfield" message shows which pieces a node has, and the "piece" message is used to transfer data.
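
Here is a minimal sketch of how these messages could be framed and read. It assumes the layout from the peer wire protocol (BEP 3): a 4-byte length prefix, a 1-byte message ID, then the payload. The "handshake" is the exception and has its own fixed layout. The helper names are hypothetical.

```javascript
// Message IDs from the peer wire protocol.
const MESSAGE_ID = {
  choke: 0, unchoke: 1, interested: 2, notInterested: 3,
  have: 4, bitfield: 5, request: 6, piece: 7, cancel: 8,
};

function encodeMessage(id, payload = Buffer.alloc(0)) {
  const header = Buffer.alloc(5);
  header.writeUInt32BE(1 + payload.length, 0); // length covers the ID byte + payload
  header.writeUInt8(id, 4);
  return Buffer.concat([header, payload]);
}

// Split received TCP data into complete messages and return the leftover bytes.
function decodeMessages(buffer) {
  const messages = [];
  let offset = 0;
  while (buffer.length - offset >= 4) {
    const length = buffer.readUInt32BE(offset);
    if (buffer.length - offset - 4 < length) break; // message not fully received yet
    if (length > 0) { // a length of 0 is a keep-alive with no ID
      messages.push({
        id: buffer[offset + 4],
        payload: buffer.subarray(offset + 5, offset + 4 + length),
      });
    }
    offset += 4 + length;
  }
  return { messages, rest: buffer.subarray(offset) };
}

// In a bitfield, piece 0 is the most significant bit of the first byte.
function hasPiece(bitfield, index) {
  return (bitfield[index >> 3] & (0x80 >> (index & 7))) !== 0;
}

const data = Buffer.concat([
  encodeMessage(MESSAGE_ID.unchoke),
  encodeMessage(MESSAGE_ID.bitfield, Buffer.from([0b10100000])),
]);
const { messages } = decodeMessages(data);
console.log(messages.map((m) => m.id));       // [ 1, 5 ]
console.log(hasPiece(messages[1].payload, 2)); // true
```

Because TCP delivers a byte stream, `decodeMessages` returns the unparsed remainder so it can be prepended to the next chunk.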

To connect to a node we first send a "handshake" message.

If the node is ready for "piece" requests (i.e. requests for data), we should receive "handshake" and "unchoke" messages in return.
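
The "handshake" itself could be built and parsed along these lines. This is a sketch that assumes the 68-byte layout from the peer wire protocol (BEP 3); `buildHandshake`, `parseHandshake` and the sample peer ID are hypothetical.

```javascript
const PROTOCOL = 'BitTorrent protocol';

// Layout: 1-byte protocol name length, protocol name, 8 reserved bytes,
// 20-byte torrent hash, 20-byte peer ID.
function buildHandshake(infoHash, peerId) {
  return Buffer.concat([
    Buffer.from([PROTOCOL.length]), // 19
    Buffer.from(PROTOCOL),
    Buffer.alloc(8),                // reserved bytes, all zero in this sketch
    infoHash,
    peerId,
  ]);
}

function parseHandshake(buf) {
  const nameLength = buf[0];
  if (buf.length < 49 + nameLength) return null; // not fully received yet
  return {
    protocol: buf.toString('latin1', 1, 1 + nameLength),
    reserved: buf.subarray(1 + nameLength, 9 + nameLength),
    infoHash: buf.subarray(9 + nameLength, 29 + nameLength),
    peerId: buf.subarray(29 + nameLength, 49 + nameLength),
  };
}

// Made-up values: a torrent hash of twenty 0x01 bytes and an arbitrary 20-byte peer ID.
const infoHash = Buffer.alloc(20, 1);
const peerId = Buffer.from('-XX0001-123456789012');

const handshake = buildHandshake(infoHash, peerId);
console.log(handshake.length); // 68
console.log(parseHandshake(handshake).protocol);
```

With Node's `net` module, the handshake would be written right after connecting, e.g. `const socket = net.connect(port, host)` followed by `socket.write(handshake)`. The node's reply should carry the same torrent hash, which is worth checking before sending anything else.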

Script: https://github.com/LiveDuo/bittorrent-scripts/blob/main/3-connect-node.js

*69 lines of code (depends on bencode npm package)

3. Downloading data

Files are transferred in pieces from multiple nodes in parallel.

When all pieces have been received, the torrent files can be reconstructed.
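
As a sketch of the bookkeeping involved (not the linked script), the example below builds the "request" messages for one piece, checks a received piece against its SHA-1 hash, and cuts the reassembled content back into files. It assumes the "request" layout from the peer wire protocol (BEP 3) and a 16 KiB block size, which is a common client choice rather than something stated in this post. All helper names are hypothetical.

```javascript
const crypto = require('crypto');

const BLOCK_SIZE = 16384; // assumed block size (16 KiB)

// "request" message: length prefix (13), ID 6, then piece index, offset and length.
function buildRequest(index, begin, length) {
  const msg = Buffer.alloc(17);
  msg.writeUInt32BE(13, 0);
  msg.writeUInt8(6, 4);
  msg.writeUInt32BE(index, 5);
  msg.writeUInt32BE(begin, 9);
  msg.writeUInt32BE(length, 13);
  return msg;
}

// A piece is fetched in blocks; the last block may be shorter.
function requestsForPiece(index, pieceLength) {
  const requests = [];
  for (let begin = 0; begin < pieceLength; begin += BLOCK_SIZE) {
    const length = Math.min(BLOCK_SIZE, pieceLength - begin);
    requests.push(buildRequest(index, begin, length));
  }
  return requests;
}

// Compare the SHA-1 of the received data with the hash from the metadata.
function verifyPiece(data, expectedHash) {
  return crypto.createHash('sha1').update(data).digest().equals(expectedHash);
}

// Join verified pieces in index order, then cut the result at the file boundaries.
function rebuildFiles(pieces, fileLengths) {
  const content = Buffer.concat(pieces);
  const files = [];
  let offset = 0;
  for (const length of fileLengths) {
    files.push(content.subarray(offset, offset + length));
    offset += length;
  }
  return files;
}

console.log(requestsForPiece(0, 40000).length); // 3

const pieces = [Buffer.from('hello '), Buffer.from('world')];
const files = rebuildFiles(pieces, [4, 7]);
console.log(files.map((f) => f.toString()));
```

Since each piece is verified on its own, different pieces can be requested from different nodes and retried individually if a hash does not match.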

Script: https://github.com/LiveDuo/bittorrent-scripts/blob/main/4-download-data.js

*165 lines of code (depends on bencode npm package)

More details on:
https://www.tzionis.com/bittorrent-protocol-in-25-minutes
