The term "rate-limiting" is known to almost everyone who has worked with web servers. I think It's one of those subjects that we often just use and don't think about it. But I think knowing how rate limiting works is useful If you have ever used it.
As you may know, "rate-limiting" is basically monitoring the incoming requests and making sure they don't violate a threshold. Basically, we define a rate of requests per time-unit and discard packets of a stream that is sending more packets than the one defined in our rate.
You probably noticed that I used the word "discard" when our threshold is reached. That's not always the case. In a lot of cases, we put the packets in a queue to keep the packets at a constant output rate. (And packets won't be discarded.)
What we'll be discussing doesn't require a queue and I dare say It's probably the simplest way to implement rate-limiting.
The most famous ways of implementing rate-limiting (Traffic Shaping) are:
- Token Bucket
- Leaky Bucket
- (r, t) Traffic Shaping
The Leaky Bucket is somewhat similar to the Token Bucket but right now we don't care about the constant output rate and Token Bucket is the only algorithm we will talk about.
The Token Bucket analogy is very simple. It's all about a bucket and tokens in it. Let's discuss it step by step.
- Picture a bucket in your mind.
- Fill the buckets with tokens at a constant rate.
- When a packet arrives, check if there is any token in the bucket.
- If there was any token left, remove one from the bucket and forward the packet. If the bucket was empty, simply drop the packet.
That's it. Wasn't it simple?
Enough talking. Let's write a simple Token Bucket throttler in Python.
We start by defining a class with 4 arguments when It's being instantiated.
- tokens: number of tokens added to the bucket in each time unit.
- time_unit: the tokens are added in this time frame.
- forward_callback: this function is called when the packet is being forwarded.
- drop_callback: this function is called when the packet should be dropped.
We also prefill the bucket with the given tokens. This allows for a burst of packets when the throttler first starts working.
last_check is the timestamp that we previously handled a packet. For initialization, we set the time when we created the bucket.
import time class TokenBucket: def __init__(self, tokens, time_unit, forward_callback, drop_callback): self.tokens = tokens self.time_unit = time_unit self.forward_callback = forward_callback self.drop_callback = drop_callback self.bucket = tokens self.last_check = time.time()
Then, we define
handle() where the magic happens.
def handle(self, packet): #1 current = time.time() #2 time_passed = current - self.last_check #3 self.last_check = current #4 self.bucket = self.bucket + \ time_passed * (self.tokens / self.time_unit) #5 if (self.bucket > self.tokens): #6 self.bucket = self.tokens if (self.bucket < 1): #7 self.drop_callback(packet) else: self.bucket = self.bucket - 1 #8 self.forward_callback(packet)
handleaccepts only 1 parameter: the packet.
- The current timestamp is put into
- The time passed between now and the previous
handlecall is put into
- We update the
last_checkto the current timestamp.
- This is the most important part. By multiplying the
time_passedand our rate (which is
time_unit) we find out how many token needs to be added to the bucket.
- Reset the bucket if it has more tokens than the default value.
- If the bucket doesn't have enough token, drop the packet.
- Otherwise, remove one token and forward the packet.
That's all. This is the final working version:
import time class TokenBucket: def __init__(self, tokens, time_unit, forward_callback, drop_callback): self.tokens = tokens self.time_unit = time_unit self.forward_callback = forward_callback self.drop_callback = drop_callback self.bucket = tokens self.last_check = time.time() def handle(self, packet): current = time.time() time_passed = current - self.last_check self.last_check = current self.bucket = self.bucket + \ time_passed * (self.tokens / self.time_unit) if (self.bucket > self.tokens): self.bucket = self.tokens if (self.bucket < 1): self.drop_callback(packet) else: self.bucket = self.bucket - 1 self.forward_callback(packet) def forward(packet): print("Packet Forwarded: " + str(packet)) def drop(packet): print("Packet Dropped: " + str(packet)) throttle = TokenBucket(1, 1, forward, drop) packet = 0 while True: time.sleep(0.2) throttle.handle(packet) packet += 1
You should be getting something like this:
Packet Forwarded: 0 Packet Dropped: 1 Packet Dropped: 2 Packet Dropped: 3 Packet Dropped: 4 Packet Forwarded: 5 Packet Dropped: 6 Packet Dropped: 7 Packet Dropped: 8 Packet Dropped: 9 Packet Forwarded: 10 Packet Dropped: 11 Packet Dropped: 12 Packet Dropped: 13 Packet Dropped: 14 Packet Forwarded: 15
As you can see in the code, our rate is 1 packet per second. We also send a packet every 0.2 seconds. When we first send a packet it quickly empties the bucket and it takes some time for the bucket to be filled again and that's why 4 packets are dropped between each successful forward.
Rate limiting and traffic policing, in general, is a very vast subject so If you liked what you just read, there are many materials available online about this subject that you can use to have a much deeper understanding of traffic policing.
Thank you for your time.