Web Application Firewalls (WAFs) play a critical role in protecting websites and APIs by filtering and monitoring HTTP traffic. But with so many open-source options available, how do you know which one actually performs best?
I recently ran a controlled benchmark across several open-source WAFs to measure detection accuracy, false positives, and performance latency. Here’s a breakdown of how I tested, what I found, and why SafeLine stood out.
Testing Methodology
To keep things fair, I used only open-source tools and environments. All WAFs were tested with their default configurations, so no vendor tuning or custom rules gave one an unfair advantage.
Metrics evaluated:
- Detection Rate: How effectively does the WAF block attack traffic?
- False Positive Rate: How often does it mistakenly block legitimate requests?
- Accuracy Rate: The overall share of requests, attack and legitimate alike, that are handled correctly.
- Detection Latency: How quickly does the WAF process requests under load?
How Metrics Were Calculated
Borrowing the confusion-matrix terms from binary classification in statistics:
- TP (True Positives): Attacks correctly blocked.
- TN (True Negatives): Legitimate requests correctly allowed.
- FN (False Negatives): Attacks that slipped through.
- FP (False Positives): Legitimate requests wrongly blocked.
Formulas used:
- Detection Rate = TP / (TP + FN)
- False Positive Rate = FP / (TP + FP)
- Accuracy Rate = (TP + TN) / (TP + TN + FP + FN)
To reduce randomness, I report latency as two figures: 90% average latency and 99% average latency.
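To make the arithmetic concrete, here is a minimal Python sketch of the formulas above plus the latency figures. The function names are my own illustration, not part of the original test harness, and the latency helper assumes "90% average latency" means the mean over the fastest 90% of requests.

```python
def detection_rate(tp: int, fn: int) -> float:
    """Share of attacks that were blocked: TP / (TP + FN)."""
    return tp / (tp + fn)

def false_positive_rate(tp: int, fp: int) -> float:
    """Per the formula above, FP / (TP + FP): the share of blocked
    requests that were actually legitimate."""
    return fp / (tp + fp)

def accuracy_rate(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all requests, attack and legitimate, handled correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def trimmed_average_latency(latencies_ms: list[float], keep: float) -> float:
    """Average over the fastest `keep` fraction of requests (keep=0.90 or 0.99)."""
    fastest = sorted(latencies_ms)[: max(1, int(len(latencies_ms) * keep))]
    return sum(fastest) / len(fastest)
```

Given the TP/TN/FP/FN counts and latency samples from a run, these reduce directly to the percentages compared later.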
Test Samples
Traffic was collected over 10 hours using Burp Suite:
- White samples (normal traffic):
  - 60,707 HTTP requests, 2.7 GB of data
  - Real-world browsing behavior, such as reading forums
- Black samples (attack traffic):
  - 600 HTTP requests
  - 4 categories of attacks:
    - Simple DVWA vulnerabilities
    - Payloads from PortSwigger's official attack library
    - VulHub PoCs against classic CVEs
    - DVWA hardened-mode attack scenarios
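The white and black samples then need to be merged into a single replay stream, with attacks spread evenly through the normal traffic (one of the tool requirements in the next section). Here is a minimal sketch of that interleaving; it assumes each sample is already a parsed request object, and the "white"/"black" labels are my own convention rather than the author's tooling.

```python
import random

def mix_evenly(white: list, black: list, seed: int = 42) -> list:
    """Spread attack samples evenly through the normal-traffic stream.

    Returns a list of (label, request) tuples, label being "white" or "black".
    """
    attacks = list(black)
    random.Random(seed).shuffle(attacks)        # avoid clustering attack categories
    mixed = [("white", r) for r in white]
    step = len(mixed) // max(1, len(attacks))   # ~101 with 60,707 white / 600 black
    for i, attack in enumerate(attacks):
        mixed.insert(i * (step + 1), ("black", attack))
    return mixed
```

With the sample counts above, this drops roughly one attack request into every hundred or so normal requests.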
Testing Setup
- Target Machine: Nginx returning a static 200 response for every request

  ```nginx
  location / {
      return 200 'hello WAF!';
      default_type text/plain;
  }
  ```
- Testing Tool Requirements (a minimal sketch follows this list):
- Parse Burp export
- Repackage HTTP traffic correctly
- Strip cookies (so the captured traffic can be shared publicly)
- Modify host headers for routing
- Determine whether a request was blocked by checking for the backend's HTTP 200 response
- Evenly mix attack and normal traffic
- Auto-calculate metrics
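The post does not reproduce the replay tool itself, so the following is a minimal sketch of how these requirements could fit together, using the Python requests library. The `waf_url` and `host_header` parameters and the request object's attributes are assumptions for illustration; only the 200-means-allowed rule and the cookie/Host handling come from the list above.

```python
import time
import requests

def strip_cookies(headers: dict) -> dict:
    """Drop Cookie headers so the captured traffic can be shared publicly."""
    return {k: v for k, v in headers.items() if k.lower() != "cookie"}

def replay(samples, waf_url: str, host_header: str):
    """Replay mixed traffic through the WAF and tally the confusion matrix.

    `samples` is an iterable of (label, request) tuples; each request is
    assumed to expose method, path, headers and body. A request counts as
    allowed only if the backend's HTTP 200 makes it back through the WAF.
    """
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    latencies_ms = []
    session = requests.Session()

    for label, req in samples:
        headers = strip_cookies(req.headers)
        headers["Host"] = host_header             # rewrite Host so the WAF routes to the target
        start = time.perf_counter()
        resp = session.request(req.method, waf_url + req.path,
                               headers=headers, data=req.body, timeout=10)
        latencies_ms.append((time.perf_counter() - start) * 1000)

        allowed = resp.status_code == 200         # the backend always answers 200
        if label == "black":
            counts["FN" if allowed else "TP"] += 1   # attack allowed = missed, blocked = caught
        else:
            counts["TN" if allowed else "FP"] += 1   # legit allowed = correct, blocked = false alarm

    return counts, latencies_ms
```

The returned counts and latency list plug straight into the metric helpers sketched earlier.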
Results
The following WAFs were benchmarked:
- SafeLine WAF
- Coraza
- ModSecurity
- Baota WAF
- nginx-lua-waf
Key Takeaways
- SafeLine WAF: Best overall balance with the lowest false positives and false negatives. Strong detection without breaking normal traffic.
- Coraza & ModSecurity: High detection rates but struggled with too many false positives, which could frustrate end-users in production.
- Baota WAF & nginx-lua-waf: Useful in niche scenarios but lagged behind in accuracy or performance.
⚠️ Keep in mind: Different samples and testing methods can produce different outcomes. Always align testing with your real-world environment and traffic patterns.
This benchmark provides insight, but your final choice should depend on your unique needs, infrastructure, and threat model.
Join the SafeLine Community
If you have questions about SafeLine or this benchmark, feel free to reach out to the SafeLine community or contact support for further assistance.