**Section 1 — Introduction**
What Is This Project and Why Does It Matter?
Imagine you run a website that thousands of people use every day.
One afternoon, an attacker decides to flood your server with millions
of fake requests, slowing it down for everyone else. This is called
a DDoS attack — Distributed Denial of Service.
At HNG, I was given the task of building a tool that watches all
incoming traffic to a Nextcloud server in real time, learns what
normal traffic looks like, and automatically blocks attackers the
moment something looks suspicious.
No third-party tools. No shortcuts. Just Python, Linux, and logic.
Here is how I built it.
**Section 2 — The sliding window**
How the Sliding Window Works
The first problem I had to solve was: how do I track how many
requests are arriving right now?
The answer is a sliding window — a data structure that always
shows you the last 60 seconds of traffic, no matter when you look.
I used Python's deque (double-ended queue) to build it.
Think of it like a conveyor belt at a supermarket checkout:
- New items (requests) are added to the RIGHT end every second
- Old items older than 60 seconds are removed from the LEFT end
- At any moment, the belt only holds the last 60 seconds of data
Here is the core idea in code:
```python
from collections import deque
import time

window = deque()

def record_request():
    now = int(time.time())
    window.append((now, 1))
    # Evict entries older than 60 seconds from the left
    cutoff = now - 60
    while window and window[0][0] < cutoff:
        window.popleft()

def get_rate():
    # Requests per second, averaged over the 60-second window
    counts = [count for _, count in window]
    return sum(counts) / 60.0
```
I maintained two of these windows — one tracking global traffic
across all IPs, and one per individual IP address. This lets me
detect both a single aggressive attacker and a distributed attack
spread across many IPs.
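Here is a minimal sketch of the per-IP side, assuming a dict of deques keyed by source IP (the names `ip_windows`, `record_ip_request`, and `get_ip_rate` are illustrative, not the project's actual identifiers):

```python
from collections import defaultdict, deque
import time

WINDOW_SECONDS = 60

# One sliding window per source IP, created on first sight
ip_windows = defaultdict(deque)

def record_ip_request(ip):
    now = int(time.time())
    win = ip_windows[ip]
    win.append((now, 1))
    # Same eviction rule as the global window
    cutoff = now - WINDOW_SECONDS
    while win and win[0][0] < cutoff:
        win.popleft()

def get_ip_rate(ip):
    # Requests per second from this IP over the last 60 seconds
    return sum(count for _, count in ip_windows[ip]) / WINDOW_SECONDS
```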
**Section 3 — The baseline**
How the Baseline Learns From Traffic
Knowing the current rate is not enough. I also need to know
what NORMAL looks like so I can compare.
If a street normally has 5 cars per minute and suddenly 50 show up,
that is suspicious. But if it normally has 40 cars, then 50 is fine.
I built a rolling baseline that:
- Keeps a 30-minute history of per-second request counts
- Recalculates the mean (average) and stddev every 60 seconds
- Stores results per hour of the day (hour 14, hour 15, etc.)
- Prefers the current hour's data when enough has been collected
The mean tells me the typical rate. The standard deviation tells
me how much traffic normally varies. Together they let me judge
whether current traffic is unusual.
I also set floor values — the mean never drops below 1.0 and
stddev never drops below 0.5. This prevents division by zero
and stops the system from being too trigger-happy when traffic
is very low.
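To make those moving pieces concrete, here is a minimal sketch of the baseline logic under the assumptions above; the names (`history`, `hourly_stats`, `recalculate_baseline`) are mine, not the project's:

```python
import statistics
import time
from collections import deque

HISTORY_SECONDS = 30 * 60   # 30-minute rolling history
MEAN_FLOOR = 1.0            # mean never drops below this
STDDEV_FLOOR = 0.5          # stddev never drops below this

history = deque()           # (timestamp, per-second request count)
hourly_stats = {}           # hour of day -> (mean, stddev)

def record_second(count):
    # Append this second's request count and trim old history
    now = time.time()
    history.append((now, count))
    cutoff = now - HISTORY_SECONDS
    while history and history[0][0] < cutoff:
        history.popleft()

def recalculate_baseline():
    # Called every 60 seconds by the main loop
    counts = [c for _, c in history]
    if len(counts) < 2:
        return
    mean = max(statistics.mean(counts), MEAN_FLOOR)
    stddev = max(statistics.pstdev(counts), STDDEV_FLOOR)
    hourly_stats[time.localtime().tm_hour] = (mean, stddev)

def get_baseline():
    # Prefer the current hour's stats; fall back to the floors
    return hourly_stats.get(time.localtime().tm_hour,
                            (MEAN_FLOOR, STDDEV_FLOOR))
```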
**Section 4 — The detection logic**
How the Detection Logic Makes a Decision
With the current rate and the baseline ready, the detector
asks two questions for every incoming request:
Question 1 — Z-score check
The z-score measures how many standard deviations above normal
the current rate is. The formula is:
z = (current_rate - mean) / stddev
If the baseline mean is 5 req/s and stddev is 2:
- 7 req/s → z = (7-5)/2 = 1.0 → normal
- 11 req/s → z = (11-5)/2 = 3.0 → borderline
- 20 req/s → z = (20-5)/2 = 7.5 → attack!
If the z-score exceeds 3.0, an anomaly is flagged.
Question 2 — Rate multiplier check
If the current rate is more than 5 times the baseline mean,
flag it regardless of the z-score. This catches sudden extreme
spikes even before the baseline has enough data.
Whichever check fires first wins.
There is also an error surge check — if an IP's 4xx/5xx error
rate is 3 times the baseline error rate, the detection threshold
is tightened automatically, making it easier to catch that IP.
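Putting both questions (and the tightening rule) together, the decision could look like this sketch; the threshold constants mirror the numbers above, but the function name and the `tighten` parameter are my own illustration:

```python
Z_THRESHOLD = 3.0       # z-score above this flags an anomaly
RATE_MULTIPLIER = 5.0   # rate above 5x the mean flags regardless

def is_anomalous(current_rate, mean, stddev, tighten=1.0):
    # tighten < 1.0 lowers the z threshold when an IP's
    # error rate surges, as described above
    z = (current_rate - mean) / stddev
    if z > Z_THRESHOLD * tighten:
        return True   # Question 1: statistical outlier
    if current_rate > RATE_MULTIPLIER * mean:
        return True   # Question 2: extreme spike over baseline
    return False
```

With the numbers from the example above, `is_anomalous(20, mean=5.0, stddev=2.0)` returns `True` on the z-score check alone.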
**Section 5 — How iptables blocks an IP**
How iptables Blocks an Attacker
Once an anomaly is detected, the tool needs to actually stop
the attacker from sending more traffic. I used iptables for this
— a built-in Linux firewall that operates at the kernel level.
Think of iptables as a bouncer standing at the door of your server.
You can give it a list of rules — "if a packet comes from this IP,
drop it before it even reaches Nginx."
When an attacker is detected, my tool runs this command:
```bash
iptables -I INPUT -s 1.2.3.4 -j DROP
```
Breaking that down:
- `-I INPUT` — insert this rule at the top of the incoming traffic list
- `-s 1.2.3.4` — match packets from this source IP
- `-j DROP` — silently discard them
The attacker's requests never reach Nginx or Nextcloud.
They just disappear into nothing.
The ban follows a backoff schedule:
- First offense → 10-minute ban
- Second offense → 30-minute ban
- Third offense → 2-hour ban
- Fourth offense → permanent ban
When the timer expires, the tool automatically removes the rule:
```bash
iptables -D INPUT -s 1.2.3.4 -j DROP
```
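Tying the commands and the backoff schedule together, a ban manager could look like this sketch; the helper names and the `subprocess` wiring are my illustration of the approach, with timer scheduling and Slack notification left out:

```python
import subprocess

# Backoff schedule in seconds: 10 min, 30 min, 2 h, then permanent
BAN_SCHEDULE = [600, 1800, 7200, None]
offenses = {}  # ip -> how many times this IP has been banned

def ban(ip):
    offenses[ip] = offenses.get(ip, 0) + 1
    step = min(offenses[ip], len(BAN_SCHEDULE)) - 1
    subprocess.run(["iptables", "-I", "INPUT", "-s", ip, "-j", "DROP"],
                   check=True)
    return BAN_SCHEDULE[step]  # None means permanent; caller schedules unban

def unban(ip):
    subprocess.run(["iptables", "-D", "INPUT", "-s", ip, "-j", "DROP"],
                   check=True)
```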
Every ban and unban sends a Slack notification so I always know
what is happening on my server in real time.
**Section 6 — Closing**
Wrapping Up
Building this project taught me a lot about how real security
tooling works under the hood. The key lessons were:
- A sliding window is just a deque with eviction logic
- A baseline is just a mean and stddev recalculated over time
- A z-score is just a way of asking "how unusual is this number?"
- iptables is just Linux telling itself to ignore certain packets
None of these concepts are complicated once you strip away the
jargon. Security tooling is just math plus system administration.
The full source code is available on GitHub:
https://github.com/ejalonibudamilola/hng-anomaly-detector.git
The live dashboard is running at:
http://hng-detector.damiloladeborah.link:8080
This project was built as part of the HNG14 DevOps track.