Hooking Up Online: a symposium on sockets

One of the coolest things that I remember doing is creating a little chat application and using it to chat with my friends. It made me feel powerful, even though the actual code was simple. Of course, I knew what had to be done next...

In this article I will share some insight that I've gained in my time writing networking code for real-time applications. So let's start with that age-old question.

TCP Or UDP?

Traditional wisdom says to use UDP for speed, and TCP for reliability, but fails to really explain the reasoning behind it. In a real time application there may be reasons to use both - for example, in a video game you may want to use TCP for text chat, and UDP for communicating game state. Why is that, though?

TCP is arguably the one most people learn first, which makes it a worthy starting point. Its main draw is that it is reliable; that is, data written to a TCP socket is guaranteed to arrive at its destination, and in the order it was sent. In order to accomplish this, the TCP socket has an internal buffer, which we will call the send queue. When data is written to a socket, it simply goes to the send queue and awaits transmission. The exact time it takes to drain the send queue is subject to vagaries both analogue and digital, but the short of it is that data arrives with no guarantee of an upper time bound. This makes TCP ill-suited for streaming data in real time.

"But just turn off Nagle's Algorithm!" This advice is parroted a lot, and it misses the point. The issue with using TCP to stream data in a real-time application is that TCP makes no guarantees about delivery time. This is problematic in a real-time program, where the data is expected to be current, or as near current as possible - not some crusty old news from a few hundred milliseconds ago! So even with Nagle's Algorithm disabled, there is little difference, because data is buffered in the send queue until the destination acknowledges its reception.
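
For completeness, disabling Nagle's Algorithm is a single socket option. A minimal sketch, assuming an already-connected TCP socket descriptor fd and omitting error handling:

```c
/* Minimal sketch: disabling Nagle's Algorithm on an already-connected
 * TCP socket `fd`. Error handling omitted for brevity. */
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

int disable_nagle(int fd)
{
    int one = 1;
    /* TCP_NODELAY stops small writes from being coalesced, but it does
     * nothing about data already waiting in the send queue, nor about
     * retransmission delays. */
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
}
```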

Some readers may of course point out that TCP works just fine in a real-time setting - for example, WebSockets are often used for just this purpose, and they sit entirely on top of TCP. This is true to a degree, but it assumes that the send queue can be emptied faster than it is filled. Each time data is written to the socket before the send queue can clear, more latency is introduced, and the queue itself will grow - and it will grow - until it overflows and the system throws an out-of-memory error. It is a race between producing data and draining it, and the sender is losing. A tell-tale sign of this for users is common in VoIP programs, where after some hitch or stall it sounds like several people are talking all at once, or way out of time.
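
If you want to watch this backlog build up, Linux (and only Linux) exposes the number of unsent bytes in a TCP socket's send queue through the SIOCOUTQ ioctl. A rough sketch:

```c
/* Rough, Linux-only sketch: how many bytes are still sitting in a TCP
 * socket's send queue. Watching this number climb is the backlog
 * described above. */
#include <linux/sockios.h>
#include <sys/ioctl.h>

int unsent_bytes(int fd)
{
    int pending = 0;
    if (ioctl(fd, SIOCOUTQ, &pending) < 0)
        return -1;
    return pending;
}
```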

Another way of putting the problem is this: because TCP is a reliable protocol, it is ill-suited for streaming data in real time; what we need is an unreliable protocol.

Streaming In Real-Time

At this point it's worth stating how streaming is supposed to be done in a real-time scenario. The number one priority in a real-time system is immediacy. Everything should be as close to current as possible. This means if data is dropped, then it should be ignored; new data will be sent along soon to take its place. This may result in some artifacts - think of audio dropping out of a call on your phone, or characters warping across the screen in an online game. While these artifacts are ugly, they are unavoidable.

This may all seem a bit esoteric, so for clarity think of a TCP streaming application as a VCR, and a real-time streaming application as live television. If the power goes out, the VCR will simply resume from where it was, but live television resumes from wherever things currently are. As a viewer you miss out on what happened during the blackout, but when the power comes back on you are just as current as anyone else watching. In this analogy, data loss is the power outage. It is also worth mentioning that though you may miss out on what happens during the outage of live television, you may well guess what happened based on prior context. This would be equivalent to extrapolating objects in an online game.

This all holds regardless of the protocols used: though I refer to TCP specifically here, in general any reliable, buffered protocol will have these same problems. In order to provide the best experience possible, we have to use an unreliable protocol and take control of the process ourselves.

Enter UDP

UDP was designed for one thing: speed. In order to get that speed, all of the nice features of TCP were stripped away to the bare bones. It is a stateless protocol, and the only guarantee it offers is that sent data will arrive intact, provided it arrives at all. Being stateless, or connectionless, the only two functions we have to worry about for the most part are sendto and recvfrom.
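
For reference, this is roughly what the bare-bones UDP workflow looks like in C; the address and port are placeholders and error checking is omitted:

```c
/* Bare-bones UDP sketch: one connectionless socket, one sendto, one
 * (blocking) recvfrom. The address and port are placeholders, and
 * error checking is omitted for brevity. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    struct sockaddr_in peer = {0};
    peer.sin_family = AF_INET;
    peer.sin_port = htons(9000);                     /* example port */
    inet_pton(AF_INET, "127.0.0.1", &peer.sin_addr); /* example address */

    const char msg[] = "hello";
    sendto(fd, msg, sizeof msg, 0, (struct sockaddr *)&peer, sizeof peer);

    char buf[1500];
    struct sockaddr_in from;
    socklen_t fromlen = sizeof from;
    recvfrom(fd, buf, sizeof buf, 0, (struct sockaddr *)&from, &fromlen);

    close(fd);
    return 0;
}
```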

An important point to note here is that each call to sendto on a UDP socket generates a datagram that is sent at once over the network. Datagrams are subject to the path MTU, which is around 1500 bytes on most networks, though in practice it may be lower; the only size every host is required to accept is 576 bytes. If your datagram exceeds the limit, the network may fragment it at the IP layer or drop it outright. This is further complicated by the fact that the internet is composed of many different networks, each of which may vary in MTU size. It is possible to probe for the highest allowed MTU between two parties by sending packets of dummy data of increasing size until the other party stops receiving them, but in practice commercial applications get away with assuming a conservative MTU.
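
One common way to live with this - and the assumption I'll carry into the sketches below - is to cap your payloads at a conservative budget rather than probing the path. The 1200-byte figure is an assumption on my part, not a number from any standard:

```c
/* Sketch: cap payloads at a conservative budget instead of probing the
 * path MTU. 1200 bytes is an assumption, not a figure from any standard;
 * anything larger goes through the fragmentation path discussed later. */
#include <stddef.h>

#define MAX_PAYLOAD 1200

static int fits_in_one_datagram(size_t len)
{
    return len <= MAX_PAYLOAD;
}
```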

Another consideration is handling transmission errors. UDP makes no guarantee about when data will arrive at its destination. This means it is possible for data to arrive out of the order it was sent, or to be lost altogether. We have a natural interest in being able to control some of this chaos. A simple and effective way to do this is to stamp each packet with a monotonic sequence id. The application should also keep track of the last sequence id it received. When reading a packet off the network, the two values should be compared. If the new id follows directly on from the last one, then there was no problem in transmission. A gap indicates data loss, and the size of the gap tells you how many packets went missing. The last-received id is also monotonic, and should always reflect the most recent packet, i.e. never set it to a lower value if a latent packet arrives. This method works great for one-way streams, but may easily be extended to duplex streams by encoding an extra sequence id that keeps track of the last packet received from the remote side. The same rules apply as given above.
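
A sketch of that sequence-id bookkeeping might look like this. The header layout and names are my own, and wraparound handling is omitted to keep it short:

```c
/* Sketch of the sequence-id bookkeeping. The header layout and names are
 * illustrative; wraparound handling is omitted to keep things short. */
#include <stdint.h>

struct packet_header {
    uint32_t sequence;   /* incremented by the sender for every packet */
    /* payload follows */
};

static uint32_t last_seen = 0;  /* most recent sequence id received */

/* Returns how many packets were lost before this one, or -1 if the
 * packet is latent or duplicated and should be ignored. */
static int32_t on_packet(const struct packet_header *hdr)
{
    if (hdr->sequence <= last_seen)
        return -1;                      /* never move the id backwards */

    int32_t lost = (int32_t)(hdr->sequence - last_seen - 1);
    last_seen = hdr->sequence;          /* always track the newest packet */
    return lost;
}
```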

If reliability is desired in some part, then the above method can be extended once again in the same fashion. An additional sequence id may be added for reliable data: the connection simply maintains a send queue for reliable data, and each time a packet is written, some portion of reliable data is copied into it and the reliable sequence id incremented. The remote side should reply with the last reliable sequence id it has read, and the local send queue is drained up to that point. If the two ids match, then there is no more reliable data to send. Reliable data is resent with each packet until it is acknowledged by the remote side. If you find yourself using reliable transfer a lot, then it may be better to simply use TCP instead, or to rethink your approach if you're writing a real-time application.
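
Here is one shape the reliable extension could take. The structures and names are illustrative rather than prescribed; the idea is simply a queue of messages that is resent until acknowledged and then drained up to the acknowledged id:

```c
/* Sketch of the reliable-data extension: a queue of messages, each with
 * its own id, copied into every outgoing packet until the remote side
 * acknowledges it. Structures and names are illustrative. */
#include <stdint.h>
#include <string.h>

#define MAX_RELIABLE 64
#define MAX_MSG      256

struct reliable_msg {
    uint32_t id;
    size_t   len;
    uint8_t  data[MAX_MSG];
};

struct reliable_queue {
    struct reliable_msg msgs[MAX_RELIABLE];
    size_t   count;
    uint32_t next_id;   /* id assigned to the next queued message */
};

/* Queue a message for reliable delivery. */
static void reliable_push(struct reliable_queue *q, const void *data, size_t len)
{
    if (q->count == MAX_RELIABLE || len > MAX_MSG)
        return; /* a real implementation would report the error */
    struct reliable_msg *m = &q->msgs[q->count++];
    m->id = q->next_id++;
    m->len = len;
    memcpy(m->data, data, len);
}

/* Drain the queue up to the id the remote side says it has read. */
static void reliable_ack(struct reliable_queue *q, uint32_t acked_id)
{
    size_t keep = 0;
    for (size_t i = 0; i < q->count; i++) {
        if (q->msgs[i].id > acked_id)
            q->msgs[keep++] = q->msgs[i];
    }
    q->count = keep;
}
```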

At this point, if you know that your largest possible packet will be less than the MTU size, then no other framing is required to get a usable protocol up and running. On the other hand, if it's conceivable that your packet size will exceed the MTU, then it's time to consider a fragmentation approach. Fragmentation is the act of splitting a large packet up into a series of smaller ones that fit within the MTU size, sending them out individually, and reassembling them on the remote side. To facilitate this, I find it best to write another monotonic id to the packet, which keeps track of each fragment within the whole. The concept is straightforward enough, but there is room for finessing - for example, how should out-of-order fragments be handled?
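
A minimal fragmentation sketch, under the same assumed 1200-byte budget. How out-of-order or missing fragments are handled is left open, as in the text:

```c
/* Fragmentation sketch under the same assumed 1200-byte budget: split a
 * large packet into numbered fragments so the remote side can reassemble
 * them. Names are illustrative; reassembly is left as an exercise. */
#include <stddef.h>
#include <stdint.h>

#define FRAG_PAYLOAD 1200

struct fragment_header {
    uint32_t sequence;   /* id of the whole packet */
    uint16_t index;      /* which fragment this is */
    uint16_t total;      /* how many fragments make up the packet */
};

/* Calls emit() once per fragment; emit() would prepend the header and
 * hand the result to sendto(). */
static void send_fragmented(uint32_t sequence, const uint8_t *data, size_t len,
                            void (*emit)(const struct fragment_header *,
                                         const uint8_t *, size_t))
{
    uint16_t total = (uint16_t)((len + FRAG_PAYLOAD - 1) / FRAG_PAYLOAD);
    for (uint16_t i = 0; i < total; i++) {
        size_t offset = (size_t)i * FRAG_PAYLOAD;
        size_t chunk  = len - offset < FRAG_PAYLOAD ? len - offset : FRAG_PAYLOAD;
        struct fragment_header hdr = { sequence, i, total };
        emit(&hdr, data + offset, chunk);
    }
}
```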

The Final Word

So from here you should have a good idea of why reliable protocols à la TCP are a poor choice for streaming in real time, and moreover, how to get started with UDP to do something better. There are numerous avenues to explore from here that are beyond the scope of this introduction, e.g. how to handle data loss in a graceful manner.

Some readers may be wondering if there are any other alternatives as well, and the answer is yes! There is DCCP, but as of this writing it is available only under Linux and BSD. DCCP offers essentially all of the same features described above, sans the reliability mechanism, which makes it a perfect out-of-the-box solution for any real-time application.

Top comments (2)

Kyle Harrison

This is an excellent breakdown of the two protocols, their use cases, and why; coupled with some very clear analogies.

Wondering if there will be a followup article with some basic code samples?

Thanks!

sifting

Sure, if there's enough interest.