## DEV Community

Amir Keshavarz

Posted on • Updated on

# Rate limiting using the Sliding Window algorithm

In this part of the rate-limiting series, we will introduce the Sliding Window algorithm and implement it in Python.

Even though there are more rate-limiting algorithms out there, I'm going to end the series here since I think These three algorithms are a pretty good gateway to the rate-limiting techniques.

## Sliding Window

Unlike the previous Fixed Window algorithm, this algorithm does not limit the requests based on a fixed time unit.

A problem with Fixed Window was that It allowed a huge burst at the edge of windows because you can combine the capacity of the current and the next window to send a burst of requests.

Sliding Window tries to fix that by taking the previous counter into account, causing the flow to be more smooth.

Let's explain it with an example. Imagine that we have a rate limiter with a time unit of 60 seconds and a capacity of 50.

A request has entered at 00:01:20. The current time unit still has capacity left but as we learned the previous counter also affects the outcome.

The previous counter only affects as much as the space it still has occupied in the window. In our example, that would be ~67%.

So with that in mind, we use the following formula to calculate an estimated count.

ec = previous counter * ((time unit - time into the current counter) / time unit) + current counter

ec = 50 * 0.67 + 20 = 53.5

Our capacity was 50 and because of that, this request is dropped since 53.5 > 50.

Now, If another request comes in at 00:01:40, the ec would be:
ec = 50 * 0.34 + 20 = 37

This request is allowed to pass since 37 < 50.

## Implementation

In this very simple implementation, We will build a rate-limiter that uses Sliding Window to limit packets in 1-second time frames.

We start by defining a class with 3 arguments when It's being instantiated.

1. capacity: number of the allowed packets that can pass through in a second.
2. time_unit: length of time unit in seconds.
3. forward_callback: this function is called when the packet is being forwarded.
4. drop_callback: this function is called when the packet should be dropped.
``````from time import time, sleep

class SlidingWindow:

def __init__(self, capacity, time_unit, forward_callback, drop_callback):
self.capacity = capacity
self.time_unit = time_unit
self.forward_callback = forward_callback
self.drop_callback = drop_callback

self.cur_time = time()
self.pre_count = capacity
self.cur_count = 0
``````

`cur_time` is the edge of our current time unit.
`pre_count` is the previous counter. We pre-fill this to prevent bursting at the beginning of the rate limiter.
`cur_count` is the current counter that will get filled as new requests come in.

Then, we define `handle()` where the magic happens.

``````def handle(self, packet): #1
if (time() - self.cur_time) > self.time_unit: #2
self.cur_time = time()
self.pre_count = self.cur_count
self.cur_count = 0

ec = (self.pre_count * (self.time_unit - (time() - self.cur_time)) / self.time_unit) + self.cur_count #3

if (ec > self.capacity): #4
return self.drop_callback(packet)

self.cur_count += 1 #5
return self.forward_callback(packet)

``````
1. `handle` accepts only 1 parameter: the packet.
2. If the current time unit is passed, switch over to a new time unit.
3. Calculate the estimated count based on the given formula
4. Drop the packet if `ec` is bigger than the capacity.
5. Otherwise, add one to the current counter and forward the packet.

### Final Code

``````from time import time, sleep

class SlidingWindow:

def __init__(self, capacity, time_unit, forward_callback, drop_callback):
self.capacity = capacity
self.time_unit = time_unit
self.forward_callback = forward_callback
self.drop_callback = drop_callback

self.cur_time = time()
self.pre_count = capacity
self.cur_count = 0

def handle(self, packet):

if (time() - self.cur_time) > self.time_unit:
self.cur_time = time()
self.pre_count = self.cur_count
self.cur_count = 0

ec = (self.pre_count * (self.time_unit - (time() - self.cur_time)) / self.time_unit) + self.cur_count

if (ec > self.capacity):
return self.drop_callback(packet)

self.cur_count += 1
return self.forward_callback(packet)

def forward(packet):
print("Packet Forwarded: " + str(packet))

def drop(packet):
print("Packet Dropped: " + str(packet))

throttle = SlidingWindow(5, 1, forward, drop)

packet = 0

while True:
sleep(0.1)
throttle.handle(packet)
packet += 1

``````

You should be getting something like this:

``````Packet Forwarded: 0
Packet Forwarded: 1
Packet Dropped: 2
Packet Forwarded: 3
Packet Dropped: 4
Packet Forwarded: 5
Packet Dropped: 6
Packet Forwarded: 7
Packet Dropped: 8
Packet Forwarded: 9
Packet Dropped: 10
Packet Forwarded: 11
Packet Dropped: 12
Packet Forwarded: 13
Packet Dropped: 14
Packet Forwarded: 15
Packet Dropped: 16
Packet Forwarded: 17
Packet Dropped: 18
Packet Forwarded: 19
Packet Dropped: 20
Packet Forwarded: 21
Packet Dropped: 22
Packet Forwarded: 23

``````

## Conclusion

In this post, we learned about the Sliding Window algorithm and tried to implement it in Python.

During this series, we covered Token Bucket, Fixed Window, and Sliding Window. Even though there are famous algorithms like Leaky Bucket or Sliding Log, I decided to end this series here since I believe these three covers the basic ideas of most rate-limiting algorithms, and this series was also expected to be short and simple.

Thank you for your time and I hope to see you on my other posts :)

miquelvir

Great post! I believe that the last implementation is not right however.

``````if (time() - self.cur_time) > self.time_unit:
self.cur_time = time()
self.pre_count = self.cur_count
self.cur_count = 0

ec = (self.pre_count * (self.time_unit - (time() - self.cur_time)) / self.time_unit) + self.cur_count
``````

Let time() = self.cur_time + 1.5*self.time_unit. Let self.cur_count = capacity. That is, we are in the middle of the next window. Then, it enters inside the if. According to the logic, we would expect to have an estimated rate of 1/2*previousWindowCount+1/2*currentWindowCount. Nonetheless, since self.cur_time = time(), ec = ((self.pre_count * (self.time_unit - (0)) / self.time_unit) + self.cur_count = (self.pre_count * (self.time_unit / self.time_unit) + self.cur_count = self.pre_count + self.cur_count = capacity + 0 = capacity. The estimated rate will be the capacity for a whole new window.

I believe that the right approach would be setting curTime to the "fixed window" start time, and not the time of the request. That is:

``````if (time() - self.cur_time) > self.time_unit:
current_window_start = self.cur_time
while current_window_start + self.time_unit < time():
current_window_start += self.time_unit

self.cur_time = current_window_start
self.pre_count = self.cur_count
self.cur_count = 0

ec = (self.pre_count * (self.time_unit - (time() - self.cur_time)) / self.time_unit) + self.cur_count
``````

What do you think? Am I missing something?

OrenRosen

Thanks, I also noticed it.
Also, you could just do

``````current_window_start = math.floor(time() / time_unit) * time_unit
``````

Hi Amir, this is a great post. I'm networking engineer, studying network programmability and also studying for Cisco Devnet certifications. I read the theory about these rate limiting techniques, but it wasn't until i read this post that i understood. Thank you for sharing this series.

I would like to ask you. In a REST API, what is the best approach to do this rate limiting by user? Any guidance will be appreciate.

ravsrist26

Hi Amir,

Thanks for the post. Why are we rejecting the packets based on estimated count rather than the actual count?