Priyansh Dimri

Posted on • Originally published at priyanshdimri.hashnode.dev

Reducing False Positives in WAF: Combining OWASP Rules with AI Context

Not every API request carries the same weight. A product catalog handling tens of thousands of requests per second cannot tolerate 100ms of added latency; a highly sensitive admin panel handling 10 requests per second can. In our evaluation, commonly used Web Application Firewalls such as ModSecurity, Cloudflare, and Fastly apply a default rule configuration across the whole application, and while they support custom rules, they do not provide per-route security profiles without significant configuration effort.

Additionally, WAFs face an intrinsic tradeoff between speed and accuracy. Rule-based WAFs detect SQL injection or XSS attacks in microseconds, but the lack of context generates false positives. LLM-based security can understand context, but it adds unacceptable latency to every request. Therefore, we decided to go hybrid.

This article is about Argus, a hybrid open-source WAF in Go that combines pattern matching (Coraza, the OWASP WAF engine) with a probabilistic AI model (the Gemini LLM) across three risk profiles, letting developers choose the latency and context tradeoff per route. Achieving this required grappling with three fundamental problems:

  1. How do we merge speed and context awareness without forcing developers into a binary choice?

  2. How does the WAF degrade when external dependencies fail?

  3. How do we make it easier to adopt in production?


Merging Speed and Context

Deterministic regex-based rules catch the majority of attacks, and adding AI context limits the false positives where legitimate requests merely look suspicious. A SQL tutorial containing a DROP TABLE statement will be blocked by regex rules but allowed by the AI once context is provided.

However, every endpoint has a different latency and security budget, so instead of defining the tradeoff globally, we decided to let developers choose per route.
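The false-positive problem can be seen with a toy pattern. The regex below is a deliberately naive stand-in, not Coraza's actual ruleset: it flags any text containing a DROP/DELETE TABLE phrase, including a perfectly legitimate tutorial paragraph.

```go
package main

import (
	"fmt"
	"regexp"
)

// sqli is a toy SQL-injection pattern, far simpler than real
// OWASP CRS rules, used only to illustrate the false positive.
var sqli = regexp.MustCompile(`(?i)\b(drop|delete)\s+table\b`)

func main() {
	tutorial := "In SQL, DROP TABLE users removes the table entirely."
	// A rules-only WAF blocks this harmless documentation text.
	fmt.Println(sqli.MatchString(tutorial)) // true
}
```

A context-aware layer can recognize that the phrase appears inside documentation rather than inside a query parameter, which is exactly the gap the AI layer fills.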

The Solution: Three Modes

  1. Latency First: Coraza alone delivers the final verdict to allow or block the request.

  2. Paranoid: Coraza runs first, and every request, whether blocked or not, is also validated by Gemini.

  3. Smart Shield: if Coraza blocks a request, Gemini reviews it with context to eliminate false positives.

Finally, each request’s Gemini verdict is logged in the database for admin analysis.
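The three modes boil down to a small dispatch over Coraza's verdict. Here is a minimal sketch in Go; all names (Mode, Verdict, decide, askGemini) are illustrative placeholders, not the actual Argus internals:

```go
package main

import "fmt"

type Mode int

const (
	LatencyFirst Mode = iota
	SmartShield
	Paranoid
)

type Verdict int

const (
	Allow Verdict = iota
	Block
)

// askGemini stands in for the context-aware AI call; here it
// always allows, as a placeholder for the real check.
func askGemini() Verdict { return Allow }

// decide maps the Coraza result to a final verdict per mode.
func decide(mode Mode, corazaBlocked bool) Verdict {
	switch mode {
	case LatencyFirst:
		// Coraza's verdict is final.
		if corazaBlocked {
			return Block
		}
		return Allow
	case SmartShield:
		// Gemini only reviews requests Coraza blocked,
		// filtering out false positives.
		if corazaBlocked {
			return askGemini()
		}
		return Allow
	case Paranoid:
		// Every request that passes Coraza is also
		// validated by Gemini.
		if corazaBlocked {
			return Block
		}
		return askGemini()
	}
	return Block
}

func main() {
	// A request Coraza blocked, cleared by the AI layer.
	fmt.Println(decide(SmartShield, true) == Allow) // true
}
```

Note that in Latency First the AI never runs, so clean traffic pays no Gemini latency at all.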

Figure: request processing flowchart. The Coraza filter runs first, followed by one of the three modes: Latency First (immediate block/allow), Smart Shield (AI verification of WAF blocks), and Paranoid (AI checks all requests).

Resilience: What Happens When AI Fails?

Argus depends directly on Gemini's API for the Smart Shield and Paranoid modes. But what happens when Google's service has an outage?

The OWASP rules still block threats (SQL injection, XSS patterns) even when Gemini is down, so the foundation of protection doesn't disappear; only the AI layer is lost.

Circuit Breaker

To design a system where the AI layer degrades gracefully rather than catastrophically, we plugged in a three-state circuit breaker:

  1. Closed (Normal): the Gemini API is responding and requests flow through the modes normally.

  2. Open: after 3 consecutive failures, the circuit stays open for 30 seconds. No Gemini calls are made and each mode falls back to the Coraza-based verdict.

  3. Half Open: after 30 seconds, a single probe request is sent to the Gemini API.

    Success → Closed circuit

    Failure → Re-open the circuit for another 30 seconds

Figure: circuit breaker state diagram with three states (Closed, Open, Half Open) and transitions driven by probe success or failure and the cooldown timeout.

Drop-in Adoption in Production

A hybrid WAF with circuit breakers, three modes, and Gemini integration sounds complex. Argus encapsulates that complexity entirely and offers two simple integration paths, designed for different infrastructures.

  1. Go SDK

    For Go apps, Argus integrates as a middleware that wraps your http.Handler with the chosen protection mode.

    // Initialize the Coraza rule engine and the Gemini-backed client.
    waf, err := argus.NewWAF()
    if err != nil {
        log.Fatal(err)
    }
    client := argus.NewClient(
        "https://argus-5qai.onrender.com",
        "api-key",
        20*time.Second, // timeout for AI verdicts
    )
    config := argus.Config{
        Mode: argus.SmartShield, // AI reviews only the requests Coraza blocks
    }
    shield := argus.NewMiddleware(client, waf, config)
    // Wrap your handler; every /api/ request passes through Argus first.
    http.Handle("/api/", shield.Protect(yourHandler))
    http.ListenAndServe(":8080", nil)
    
  2. Docker Sidecar

    For applications written in other languages (Node, Python, Ruby, PHP), Argus can run as a lightweight sidecar reverse proxy. Your application remains unchanged and traffic is routed through Argus before reaching your service.

    
    docker run -d \
      --name argus-sidecar \
      -p 8000:8000 \
      -e TARGET_URL=http://host.docker.internal:3000 \
      -e ARGUS_API_KEY=api-key \
      -e ARGUS_API_URL=https://argus-5qai.onrender.com/ \
      ghcr.io/priyansh-dimri/argus-sidecar:latest
    

Optimizing the Hot Path

Because this middleware sits on the critical path of every request, performance was the primary constraint during development. We optimized for the hot path, achieving:

  • 262µs processing time for clean requests.

  • 151ns overhead for the circuit breaker (atomic state checks instead of mutexes).

  • 56% parallel efficiency scaling up to 4 cores.
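The 151ns figure comes from keeping the hot-path check lock-free: deciding whether the AI layer is available is a single atomic load rather than a mutex acquisition. A sketch of that design choice, with illustrative names rather than the Argus internals:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

const (
	stateClosed int32 = iota
	stateOpen
)

// breakerState is written rarely (on state transitions) but read
// on every request, so a lock-free atomic read keeps the hot path cheap.
var breakerState atomic.Int32

// aiAvailable runs on every request and never blocks.
func aiAvailable() bool {
	return breakerState.Load() == stateClosed
}

func main() {
	fmt.Println(aiAvailable()) // true: the circuit starts closed
	breakerState.Store(stateOpen)
	fmt.Println(aiAvailable()) // false: fall back to Coraza only
}
```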

You can check out the source code and contribute at github.com/priyansh-dimri/argus/
