DEV Community

Joseph Joshua
Detecting & Blocking Anomalous Traffic with Cloud Anomaly Detector

A lightweight, containerized anomaly detection system that monitors traffic in real time, detects abuse patterns, and automatically blocks malicious IPs at the host firewall level.


I built a real-time anomaly detection system that monitors nginx access logs, computes adaptive rolling baselines per time window, detects traffic anomalies using statistical methods (z-score + spike multipliers), and automatically blocks malicious IPs using host-level iptables rules. The system includes Slack alerts and a live dashboard for observability and debugging.


🧠 Background / Motivation

Modern systems face constant threats such as:

  • DDoS attacks
  • Credential stuffing
  • API abuse and scraping bots
  • Sudden traffic spikes that degrade service

Most production solutions rely on expensive managed WAFs or cloud security tools. I wanted to build a low-cost, self-hosted anomaly detection engine that runs entirely on a VPS using logs, statistics, and system-level enforcement.

Constraints:

  • Must be containerized (Docker-based)
  • Must run on low-cost VPS infrastructure
  • Must use logs (not packet inspection tools)
  • Must enforce bans at host level (not only inside containers)
  • Must provide real-time visibility and debugging

๐Ÿ—๏ธ What I Built

A full-stack anomaly detection pipeline composed of:

  • Detector Service (Python)
  • Baseline Engine (rolling statistical model)
  • Blocker Service (iptables enforcement on host)
  • Dashboard (real-time monitoring UI)
  • Slack Alerting System (incident notifications)

⚙️ How It Works

Nginx logs every request in structured JSON format.

```json
{
  "ip": "1.2.3.4",
  "endpoint": "/",
  "status": 200,
  "timestamp": 1710000000
}
```
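Entries like this can be produced with a custom nginx `log_format`. A minimal sketch, assuming field names that mirror the example above (the format name and log path are placeholders; note `$msec` emits a fractional Unix timestamp, so truncate downstream if you need integers):

```nginx
# Sketch: JSON access-log format matching the entry shown above.
# Requires nginx >= 1.11.8 for escape=json.
log_format anomaly_json escape=json
  '{'
    '"ip":"$remote_addr",'
    '"endpoint":"$uri",'
    '"status":$status,'
    '"timestamp":$msec'
  '}';

access_log /var/log/nginx/access.json anomaly_json;
```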

🔄 From Logs to Detection

Once nginx writes request logs, the detector continuously processes them in real time.

Each incoming log entry goes through the following pipeline:

  1. Parse JSON log entry
  2. Extract IP, timestamp, and status code
  3. Update per-second counters
  4. Feed values into rolling baseline engine
  5. Evaluate anomaly conditions

This pipeline runs continuously with minimal latency, ensuring near real-time detection.
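The steps above can be sketched as a small Python loop. This is a simplified illustration, not the project's actual detector: `follow` is a naive `tail -f`, and the counter layout is an assumption.

```python
import json
import time
from collections import defaultdict

def follow(path):
    """Yield new lines appended to a file, like `tail -f` (no rotation handling)."""
    with open(path) as f:
        f.seek(0, 2)  # start at end of file; only new entries matter
        while True:
            line = f.readline()
            if line:
                yield line
            else:
                time.sleep(0.1)

def update_counters(per_second, line):
    """Parse one JSON log entry and update per-second counters.

    Returns the (second, bucket) pair that was updated, or None for bad lines.
    """
    try:
        entry = json.loads(line)
        ip, sec, status = entry["ip"], int(entry["timestamp"]), int(entry["status"])
    except (json.JSONDecodeError, KeyError, ValueError):
        return None  # skip malformed entries instead of crashing the detector
    bucket = per_second.setdefault(sec, {"total": 0, "errors": 0, "ips": defaultdict(int)})
    bucket["total"] += 1
    bucket["ips"][ip] += 1
    if status >= 400:
        bucket["errors"] += 1
    return sec, bucket

def run_pipeline(path, per_second, handle_second):
    """Tail the log and feed each updated second into anomaly evaluation."""
    for line in follow(path):
        result = update_counters(per_second, line)
        if result:
            handle_second(*result)  # -> baseline update + anomaly check
```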


📉 Rolling Baseline Behavior

The system does not rely on fixed thresholds. Instead, it learns traffic behavior over time.

For each time window, the baseline tracks:

  • Average request rate (mean)
  • Variance (standard deviation)
  • Traffic distribution per second

This allows the system to adapt dynamically to traffic changes.

Example behavior:

  • Normal traffic period → stable baseline
  • Gradual increase → baseline adjusts slightly
  • Sudden spike → deviation becomes statistically significant

⚠️ Anomaly Decision Process

Every second, the detector evaluates:

  • Current request rate vs baseline mean
  • Z-score deviation
  • Spike multiplier threshold
  • Error rate deviation

If any condition exceeds its configured threshold, the offending IP (or the overall system state) is flagged.

This ensures:

  • Low false positives during normal usage
  • Fast reaction to sudden abuse patterns

🚫 Blocking Execution Flow

When an anomaly is confirmed, the system does not block immediately inside the application layer.

Instead, it uses a decoupled enforcement pipeline:

  1. IP is added to a shared ban queue
  2. Host worker process reads queue
  3. Firewall rule is applied at kernel level

This ensures:

  • Separation of detection and enforcement
  • Reliability even if app crashes
  • Immediate packet-level blocking

🔥 Why Host-Level Blocking Matters

Blocking inside containers or application code is not sufficient because:

  • Traffic may already have been routed through the Docker bridge
  • App-level blocking still consumes CPU, memory, and connection resources
  • Reverse proxies may have already forwarded the request

Inserting rules into the iptables `DOCKER-USER` chain ensures that traffic is dropped before it ever reaches the container network stack, which makes enforcement fast and reliable.


📊 Observability Layer

To ensure visibility, the system exposes:

  • Live request rate graphs
  • Current baseline values
  • Active banned IP list
  • Recent anomaly events

The dashboard updates in real time based on detector outputs.


🧪 Testing Strategy (k6)

The system is validated using controlled load testing:

  • Gradual ramp-up tests
  • Sudden spike injection
  • Sustained high traffic simulation

This ensures:

  • Baseline accuracy
  • Proper Z-score calibration
  • Reliable ban triggering

🧩 System Reliability Design

Several mechanisms improve stability:

  • Warm-up period (prevents early noise)
  • Duplicate ban suppression
  • Rolling window smoothing
  • Queue-based enforcement (decoupled architecture)

These ensure the system remains stable under continuous load.
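Two of these mechanisms, warm-up gating and smoothing, can be sketched together. This uses EWMA-style smoothing as an illustration (the project describes rolling-window smoothing; the parameters here are invented):

```python
class SmoothedWarmupBaseline:
    """Smoothed baseline with a warm-up gate (illustrative sketch)."""

    def __init__(self, alpha=0.1, warmup_samples=60):
        self.alpha = alpha
        self.warmup_samples = warmup_samples
        self.count = 0
        self.ewma = 0.0

    def update(self, rate):
        """Fold in one per-second rate; return whether decisions are allowed."""
        self.count += 1
        # Smoothing: recent traffic weighs more, old spikes decay away
        if self.count == 1:
            self.ewma = rate
        else:
            self.ewma = self.alpha * rate + (1 - self.alpha) * self.ewma
        return self.ready

    @property
    def ready(self):
        """False during warm-up, so early noisy samples never trigger bans."""
        return self.count >= self.warmup_samples
```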


🧭 Summary of Flow

  1. Nginx logs requests
  2. Detector parses logs
  3. Baseline is updated
  4. Anomaly detected using statistical rules
  5. IP is queued for blocking
  6. Host worker applies firewall rule
  7. Slack alert is sent
  8. Dashboard reflects updated state

Visit the repo for the full code and workflow: https://github.com/izzyjosh/cloud-anomaly-detector
