Sharon

Which Open-Source WAF Really Delivers? A Head-to-Head Benchmark

Web Application Firewalls (WAFs) play a critical role in protecting websites and APIs by filtering and monitoring HTTP traffic. But with so many open-source options available, how do you know which one actually performs best?

I recently ran a controlled benchmark across several open-source WAFs to measure detection accuracy, false positives, and performance latency. Here’s a breakdown of how I tested, what I found, and why SafeLine stood out.


Testing Methodology

To keep things fair, I used only open-source tools and environments. All WAFs were tested with their default configurations, so no vendor tuning or custom rules gave one an unfair advantage.

Metrics evaluated:

  • Detection Rate: How effectively does the WAF block attack traffic?
  • False Positive Rate: How often does it mistakenly block legitimate requests?
  • Accuracy Rate: The overall share of requests, attack and legitimate alike, that are classified correctly.
  • Detection Latency: How quickly does the WAF process requests under load?

How Metrics Were Calculated

Borrowing the standard confusion-matrix terms from binary classification:

  • TP (True Positives): Attacks correctly blocked.
  • TN (True Negatives): Legitimate requests correctly allowed.
  • FN (False Negatives): Attacks that slipped through.
  • FP (False Positives): Legitimate requests wrongly blocked.

Formulas used:

  • Detection Rate = TP / (TP + FN)
  • False Positive Rate = FP / (TP + FP)
  • Accuracy Rate = (TP + TN) / (TP + TN + FP + FN)
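
These formulas translate directly into code. Below is a minimal sketch using the definitions above; the counts at the bottom are placeholders, not results from this benchmark.

def rates(tp, tn, fp, fn):
    # Compute the three rates exactly as defined above.
    detection_rate = tp / (tp + fn)
    false_positive_rate = fp / (tp + fp)
    accuracy_rate = (tp + tn) / (tp + tn + fp + fn)
    return detection_rate, false_positive_rate, accuracy_rate

# Illustrative numbers only, not measurements from this benchmark:
print(rates(tp=570, tn=60000, fp=700, fn=30))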

To reduce randomness, I also report two latency figures for each WAF: 90% average latency and 99% average latency.
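
One way to compute these figures, taking the 90% number as the average over the fastest 90% of requests (and similarly for 99%):

def trimmed_average(latencies_ms, keep_fraction):
    # Average over the fastest `keep_fraction` of the collected latencies.
    ordered = sorted(latencies_ms)
    kept = ordered[: max(1, int(len(ordered) * keep_fraction))]
    return sum(kept) / len(kept)

# Illustrative values, not measurements from this benchmark:
samples_ms = [2.1, 2.2, 2.3, 2.4, 3.0, 150.0]
print(trimmed_average(samples_ms, 0.90), trimmed_average(samples_ms, 0.99))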


Test Samples

Traffic was collected over 10 hours using Burp Suite:

  • White samples (normal traffic):

    • 60,707 HTTP requests, 2.7 GB of data
    • Collected from real-world browsing behavior (forums and general web use)
  • Black samples (attack traffic):

    • 600 HTTP requests
    • 4 categories of attacks:

      • Simple DVWA vulnerabilities
      • Payloads from PortSwigger’s official attack library
      • VulHub PoCs against classic CVEs
      • DVWA hardened mode attack scenarios

Testing Setup

  • Target Machine: Nginx returning a static 200 response for every request
location / {
    return 200 'hello WAF!';
    default_type text/plain;
}
  • Testing Tool Requirements (a simplified sketch follows this list):

    • Parse Burp export
    • Repackage HTTP traffic correctly
    • Strip cookies (for open-source sharing)
    • Modify host headers for routing
    • Determine blocking based on HTTP 200 response
    • Evenly mix attack and normal traffic
    • Auto-calculate metrics
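
A simplified sketch of how such a tool can be put together (not the exact code used in the benchmark; function names and parameters are illustrative). It assumes a Burp Suite "Save items" XML export with base64-encoded requests:

# Simplified replay sketch. The target behind the WAF always answers 200,
# so any non-200 status counts as "blocked".
import base64
import xml.etree.ElementTree as ET

import requests


def load_burp_items(path):
    # Yield (method, path, headers, body) tuples from the Burp XML export.
    for item in ET.parse(path).getroot().iter("item"):
        raw = base64.b64decode(item.findtext("request"))
        head, _, body = raw.partition(b"\r\n\r\n")
        lines = head.decode("utf-8", errors="replace").split("\r\n")
        method, url_path, _ = lines[0].split(" ", 2)
        headers = {}
        for line in lines[1:]:
            name, _, value = line.partition(": ")
            # Strip cookies (for open-source sharing) and let the client handle framing.
            if name.lower() in ("cookie", "content-length", "connection"):
                continue
            headers[name] = value
        yield method, url_path, headers, body


def count_blocked(samples, waf_base_url, host_header):
    # Send each sample through the WAF; non-200 responses count as blocked.
    blocked = 0
    for method, url_path, headers, body in samples:
        headers["Host"] = host_header  # rewrite Host so the WAF routes to the target
        resp = requests.request(method, waf_base_url + url_path,
                                headers=headers, data=body, timeout=10)
        if resp.status_code != 200:
            blocked += 1
    return blocked

Run against the black and white sample sets separately, the blocked counts map directly onto TP and FP for the formulas above.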

Results

The WAFs evaluated:

  • SafeLine WAF
  • Coraza
  • ModSecurity
  • Baota WAF
  • nginx-lua-waf


Comparison Table


Key Takeaways

  • SafeLine WAF: Best overall balance with the lowest false positives and false negatives. Strong detection without breaking normal traffic.
  • Coraza & ModSecurity: High detection rates but struggled with too many false positives, which could frustrate end-users in production.
  • Other WAFs: Useful in niche scenarios but lagged behind in accuracy or performance.

⚠️ Keep in mind: Different samples and testing methods can produce different outcomes. Always align testing with your real-world environment and traffic patterns.

This benchmark provides insight, but your final choice should depend on your unique needs, infrastructure, and threat model.


Join the SafeLine Community

If you have questions about SafeLine or want to dig deeper into these results, feel free to reach out to the SafeLine community and support for further assistance.
