Shahraan Hussain

Posted on Jun 22

I Tested an Open-Source Anti-Bot Firewall (Anubis) Against Requests, AsyncIO, Selenium, and Playwright

#antibot #firewall #ai #opensource

Lately I've been spending a lot of time studying modern anti-bot systems.

Most discussions today revolve around Cloudflare, DataDome, Akamai, Kasada, Human Security, or PerimeterX. But while exploring the anti-bot ecosystem, I stumbled upon an interesting open-source project called Anubis.

Its description immediately caught my attention:

"Anubis is a Web AI Firewall Utility that weighs the soul of your connection using one or more challenges in order to protect upstream resources from scraper bots."

The project was created to help smaller communities defend themselves against the massive amount of automated traffic generated by AI crawlers and large-scale scrapers.

Naturally, I wanted to see how it behaved in practice.

What Is Anubis?

Anubis positions itself as a lightweight anti-bot layer that sits in front of websites and presents computational challenges to visitors before allowing access.

The idea is simple:

Legitimate users pass the challenge.
Automated traffic gets filtered.
Website resources are protected from abuse.

The project openly states that it can be considered a "nuclear option" because it may block smaller scrapers and even impact beneficial crawlers such as Internet Archive bots.

Unlike enterprise anti-bot vendors, Anubis focuses on simplicity and self-hosting.

The Test Targets

I found several publicly accessible websites running Anubis. The code in this post targets one of them, https://bugs.winehq.org/.

The goal wasn't to bypass anything.

I simply wanted to understand:

What happens when common scraping tools interact with Anubis-protected websites?

Experiment #1 — Plain Requests

Like most scrapers, I started with the simplest possible setup.

No browser.

No JavaScript.

No special headers.

Just Python Requests.

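import time

import requests

URL = "https://bugs.winehq.org/"

# Send 20 sequential requests from one IP with default headers and time each one.
for x in range(20):
    start = time.perf_counter()

    # A timeout keeps the loop from hanging forever on a stalled connection.
    response = requests.get(URL, timeout=30)

    elapsed = time.perf_counter() - start

    print(
        f"Status code: {response.status_code} "
        f"Interval {x + 1}: {elapsed:.2f}s"
    )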

My expectation was straightforward.

I thought I would see:

403 Forbidden

or perhaps:

429 Too Many Requests

Instead, every request returned:

200 OK

Twenty consecutive requests.

Same IP.

No proxy rotation.

No browser fingerprinting.

No TLS spoofing.

Just Requests.
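A status code alone does not show whether the body was the real page or a challenge interstitial served with 200. Here is a minimal sketch of a stricter check. It assumes the interstitial can be recognised by marker strings in the HTML. The strings in `CHALLENGE_MARKERS` are assumptions about what an Anubis challenge page typically contains, not a documented contract, and `classify_response` is a hypothetical helper that is not part of the original test.

```python
# Assumed marker strings; verify them against the HTML of the deployment under test.
CHALLENGE_MARKERS = (
    "anubis_challenge",
    "Making sure you're not a bot",
)


def classify_response(status_code: int, body: str) -> str:
    """Label a response as 'challenge', 'blocked', 'error', or 'content'."""
    lowered = body.lower()
    # Check the body first: a challenge page may be served with any status code.
    if any(marker.lower() in lowered for marker in CHALLENGE_MARKERS):
        return "challenge"
    if status_code in (403, 429):
        return "blocked"
    if status_code >= 400:
        return "error"
    return "content"
```

Inside the loop this would be called as `classify_response(response.status_code, response.text)`, so that each line of output shows what kind of page came back and not only the status code.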

First Surprise

At this point I became suspicious.

Many anti-bot solutions allow a few requests before escalating defenses.

So I wondered:

Maybe Anubis is tracking request frequency.

Maybe rate limits trigger after a threshold.

Maybe concurrent traffic changes the outcome.

So I moved to the next test.

Experiment #2 — 100 Concurrent Requests

I used aiohttp to generate 100 simultaneous requests.

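import asyncio
import time

import aiohttp

URL = "https://bugs.winehq.org/"

async def fetch(session, i):
    start = time.perf_counter()

    # The context manager releases the connection back to the pool when done.
    async with session.get(URL) as response:
        await response.read()  # include the body download in the timing
        print(
            f"status: {response.status} "
            f"req{i + 1}: {time.perf_counter() - start:.2f}s"
        )

async def main():
    async with aiohttp.ClientSession() as session:
        # Start all 100 requests at once and wait for every one to finish.
        await asyncio.gather(
            *[fetch(session, i) for i in range(100)]
        )

asyncio.run(main())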

Again, I expected something to happen.

Potential outcomes:

Rate limiting
Challenge escalation
Temporary blocking
Connection throttling

The result?

Every request eventually completed successfully.

100 requests
100 x 200 OK
0 blocks

Some requests took longer than others, but they all succeeded.
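The spread in response times is worth quantifying, because a long tail under load can hint at queuing or throttling even when every status is 200. The sketch below summarises a batch of results. It assumes `fetch` is changed to return a `(status, elapsed_seconds)` tuple instead of printing, which the original code does not do, and `summarize` is a hypothetical helper.

```python
from collections import Counter
from statistics import median


def summarize(results):
    """Summarise (status_code, elapsed_seconds) pairs from a batch of requests."""
    if not results:
        raise ValueError("no results to summarise")
    statuses = Counter(status for status, _ in results)
    latencies = sorted(seconds for _, seconds in results)
    return {
        "statuses": dict(statuses),  # e.g. how many 200s versus 429s
        "fastest": latencies[0],
        "median": median(latencies),
        "slowest": latencies[-1],
    }
```

With that change, `summarize(await asyncio.gather(...))` would give one compact report per run, which makes it easier to compare runs at different concurrency levels.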

The Same IP the Entire Time

One important detail:

I never changed IP addresses.

Throughout the experiment I used:

Requests
AsyncIO (aiohttp)
Selenium
Playwright
Playwright MCP

all from the same IP.

This is significant because IP reputation is often one of the earliest signals used by anti-bot systems.

Repeated requests from the same source can contribute to:

Reputation scoring
Velocity analysis
Abuse detection
Challenge escalation

Yet I continued receiving successful responses.
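To make "velocity analysis" concrete, here is a minimal sketch of a sliding-window request counter keyed by client IP. It is a generic illustration of the technique, not how Anubis or any particular vendor implements it. The class name, the window size, and the threshold are all assumptions.

```python
from collections import defaultdict, deque


class VelocityTracker:
    """Count requests per client within a sliding time window."""

    def __init__(self, window_seconds: float, max_requests: int):
        self.window_seconds = window_seconds
        self.max_requests = max_requests
        # client IP -> timestamps of that client's recent requests
        self._hits = defaultdict(deque)

    def record(self, client_ip: str, now: float) -> bool:
        """Record one request; return True if the client is over the limit."""
        hits = self._hits[client_ip]
        hits.append(now)
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_requests
```

A system built on something like this would flag the fourth request within ten seconds if constructed as `VelocityTracker(window_seconds=10, max_requests=3)`. My 100 concurrent requests from one IP would trip almost any threshold of that kind, which is why the lack of a visible reaction stood out.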

Experiment #3 — Selenium

At this point I assumed browser automation would be more likely to trigger protections.

I launched Selenium.

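The browser displayed the familiar banner:

Chrome is being controlled by automated test software

The banner itself is only visible to the user, but the automation mode behind it exposes signals, such as navigator.webdriver being true, that many anti-bot vendors actively check.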

The outcome?

Normal page access.

No obvious blocking.

No visible challenge failure.

Experiment #4 — Playwright

Next came Playwright.

Playwright drives real browser engines (Chromium, Firefox, and WebKit) and is often used when anti-bot protections become stricter.

The result remained the same.

Successful page loads.

No visible enforcement action.
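For a browser run, "successful page load" can be made more precise by recording the final title, the final URL, and any cookies the protection layer set along the way. The sketch below does that with Playwright's sync API. It assumes that an Anubis-issued cookie has "anubis" somewhere in its name, which is a guess to verify against the real deployment, and both helper names are hypothetical.

```python
def anubis_cookies(cookies):
    """Return names of cookies that look Anubis-related (name contains 'anubis')."""
    return [c["name"] for c in cookies if "anubis" in c["name"].lower()]


def inspect_page(url: str) -> dict:
    """Load a page in headless Chromium and report its final title, URL, and cookies."""
    # Imported here so the helper above can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context()
        page = context.new_page()
        page.goto(url)
        # Give any interstitial time to finish its work and redirect.
        page.wait_for_load_state("networkidle")
        report = {
            "title": page.title(),
            "url": page.url,
            "anubis_cookies": anubis_cookies(context.cookies()),
        }
        browser.close()
    return report
```

Calling `inspect_page("https://bugs.winehq.org/")` and finding a matching cookie would suggest a challenge was issued and solved. An empty list would suggest the request was never challenged in the first place.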

Experiment #5 — Playwright MCP

Finally, I tested using Playwright MCP.

Again:

Success

The Anubis challenge completed successfully and access was granted.

What Does This Mean?

The most important takeaway is:

A 200 response does not automatically mean the anti-bot system failed.

There are multiple possible explanations.

Possibility 1: Conservative Configuration

The websites may be running relatively relaxed Anubis settings.

Many administrators prioritize accessibility over aggressive enforcement.

Possibility 2: Challenge Completion

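The browser-based clients (Selenium and Playwright) execute JavaScript, so they may simply have completed the challenge without triggering additional scrutiny. Requests and aiohttp cannot run JavaScript, so this explanation is unlikely to apply to them.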

Possibility 3: Enforcement Thresholds Were Not Reached

Anubis may require:

Larger traffic volumes
Longer observation windows
Different behavioral patterns

before escalating protections.

Possibility 4: Configuration Gaps

Like any security product, deployment configuration matters.

A protection layer can only be as effective as the rules applied to it.
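A toy rule matcher makes the point about configuration concrete. This is not Anubis's code or its policy format. The rule names, the patterns, and the first-match-wins logic with a default of `ALLOW` are all assumptions for illustration. It shows how a rule set that only challenges browser-like user agents would wave through a client that announces itself as `python-requests`.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    user_agent_pattern: str
    action: str  # "ALLOW", "CHALLENGE", or "DENY"


# Hypothetical rule set, evaluated top to bottom.
RULES = [
    Rule("known-bad-bot", r"BadBot", "DENY"),
    Rule("generic-browser", r"Mozilla", "CHALLENGE"),
]


def decide(user_agent: str, rules=RULES, default: str = "ALLOW") -> str:
    """Return the action of the first rule whose pattern matches the user agent."""
    for rule in rules:
        if re.search(rule.user_agent_pattern, user_agent):
            return rule.action
    return default
```

Under these made-up rules, `decide("Mozilla/5.0 (X11; Linux x86_64)")` yields `CHALLENGE` while `decide("python-requests/2.32.3")` falls through to `ALLOW`. Whether the sites I tested are configured along these lines is something only a controlled deployment can confirm.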

Possibility 5: Areas for Future Improvement

Every anti-bot solution evolves over time.

Open-source projects especially benefit from community testing and feedback.

Experiments like this can help identify edge cases and potential improvements.

What I'm Going To Test Next

Rather than speculate, I plan to deploy my own Anubis-protected environment.

That will allow controlled testing of:

Request-based scraping
Browser automation
TLS fingerprints
Header anomalies
Concurrency limits
IP reputation
Behavioral analysis

Only then can I determine whether what I observed was:

A configuration choice
A deployment-specific behavior
An intentional design decision
Or an area that deserves further investigation

Final Thoughts

One thing I've learned after years of working in web scraping:

Never underestimate an anti-bot system.

A successful request today doesn't mean a defense is weak.

Likewise, a blocked request doesn't necessarily mean a defense is strong.

The interesting part is understanding why a request succeeds or fails.

This experiment raised more questions than answers, which is exactly what makes anti-bot research fascinating.

The next step is building a controlled environment and digging deeper into how Anubis evaluates connections, challenges clients, and decides who gets through.

And honestly, that's where the fun begins.


DEV Community