DEV Community: Shahraan Hussain

I Tested an Open-Source Anti-Bot Firewall (Anubis) Against Requests, AsyncIO, Selenium, and Playwright

Shahraan Hussain — Mon, 22 Jun 2026 17:00:03 +0000

Lately I've been spending a lot of time studying modern anti-bot systems.

Most discussions today revolve around Cloudflare, DataDome, Akamai, Kasada, Human Security, or PerimeterX. But while exploring the anti-bot ecosystem, I stumbled upon an interesting open-source project called Anubis.

Its description immediately caught my attention:

"Anubis is a Web AI Firewall Utility that weighs the soul of your connection using one or more challenges in order to protect upstream resources from scraper bots."

The project was created to help smaller communities defend themselves against the massive amount of automated traffic generated by AI crawlers and large-scale scrapers.

Naturally, I wanted to see how it behaved in practice.

What Is Anubis?

Anubis positions itself as a lightweight anti-bot layer that sits in front of websites and presents computational challenges to visitors before allowing access.

The idea is simple:

Legitimate users pass the challenge.
Automated traffic gets filtered.
Website resources are protected from abuse.
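
To make "computational challenge" concrete, here is a minimal proof-of-work sketch. It assumes a hashcash-style scheme in which the client must find a nonce whose SHA-256 digest starts with a given number of zero hex digits. The function names, the challenge string, and the difficulty value are illustrative assumptions, not Anubis's actual code or defaults.

```python
import hashlib
from itertools import count


def solve_challenge(challenge: str, difficulty: int) -> int:
    """Brute-force a nonce so sha256(challenge + nonce) starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce


def verify_solution(challenge: str, nonce: int, difficulty: int) -> bool:
    """Checking a solution costs a single hash, while finding one takes many attempts."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)
```

The asymmetry is the point of the design: each extra zero digit multiplies the client's expected work by 16, while verification stays at one hash.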

The project openly states that it can be considered a "nuclear option" because it may block smaller scrapers and even impact beneficial crawlers such as Internet Archive bots.

Unlike enterprise anti-bot vendors, Anubis focuses on simplicity and self-hosting.

The Test Targets

I found several publicly accessible websites running Anubis. The code in this post targets https://bugs.winehq.org/.

The goal wasn't to bypass anything.

I simply wanted to understand:

What happens when common scraping tools interact with Anubis-protected websites?

Experiment #1 — Plain Requests

Like most scrapers, I started with the simplest possible setup.

No browser.

No JavaScript.

No special headers.

Just Python Requests.

import requests
from datetime import datetime

URL = "https://bugs.winehq.org/"

for x in range(20):
    interval_start = datetime.now()

    # A timeout keeps a stalled connection from hanging the loop.
    response = requests.get(URL, timeout=30)

    interval_end = datetime.now()

    print(
        f"Status code: {response.status_code} "
        f"Interval {x+1}: {interval_end - interval_start}"
    )

My expectation was straightforward.

I thought I would see:

403 Forbidden

or perhaps:

429 Too Many Requests

Instead, every request returned:

200 OK

Twenty consecutive requests.

Same IP.

No proxy rotation.

No browser fingerprinting.

No TLS spoofing.

Just Requests.
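
A status code on its own does not say what the body contained, since a challenge or interstitial page can also be delivered with a 200 status. The sketch below labels a response by looking for marker strings in the body. The marker strings are hypothetical placeholders; the real ones would have to be taken from the challenge page a given site serves.

```python
# Hypothetical marker strings; replace with text observed on the actual challenge page.
MARKERS = ("making sure you're not a bot", "proof-of-work")


def classify_response(status_code: int, body: str, challenge_markers: tuple = MARKERS) -> str:
    """Label a response as 'blocked', 'challenge', or 'content'."""
    if status_code in (403, 429):
        return "blocked"
    lowered = body.lower()
    # A 200 whose body matches a marker is treated as a challenge page, not real content.
    if any(marker.lower() in lowered for marker in challenge_markers):
        return "challenge"
    return "content"
```

With Requests, this could be called as classify_response(response.status_code, response.text) inside the loop.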

First Surprise

At this point I became suspicious.

Many anti-bot solutions allow a few requests before escalating defenses.

So I wondered:

Maybe Anubis is tracking request frequency.

Maybe rate limits trigger after a threshold.

Maybe concurrent traffic changes the outcome.

So I moved to the next test.

Experiment #2 — 100 Concurrent Requests

I used aiohttp to generate 100 simultaneous requests.

import asyncio
import aiohttp
from datetime import datetime

URL = "https://bugs.winehq.org/"

async def fetch(session, i):
    start = datetime.now()

    # "async with" releases the connection back to the pool when the block exits.
    async with session.get(URL) as response:
        print(
            f"status:{response.status} "
            f"req{i+1}: {datetime.now()-start}"
        )

async def main():
    async with aiohttp.ClientSession() as session:
        # Start all 100 requests concurrently and wait for every one to finish.
        await asyncio.gather(
            *[fetch(session, i) for i in range(100)]
        )

asyncio.run(main())

Again, I expected something to happen.

Potential outcomes:

Rate limiting
Challenge escalation
Temporary blocking
Connection throttling

The result?

Every request eventually completed successfully.

100 requests
100 x 200 OK
0 blocks

Some requests took longer than others, but they all succeeded.
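
"Some requests took longer than others" is easier to reason about with numbers. A small sketch of how the per-request results could be summarized, assuming each request is recorded as a (status_code, seconds) pair; the function name and the output keys are my own choices.

```python
from collections import Counter
from statistics import median


def summarize(results: list) -> dict:
    """Summarize (status_code, seconds) pairs into status counts and latency stats."""
    statuses = Counter(status for status, _ in results)
    latencies = sorted(seconds for _, seconds in results)
    return {
        "statuses": dict(statuses),
        "min_s": latencies[0],
        "median_s": median(latencies),
        "max_s": latencies[-1],
    }
```

A widening gap between the median and the maximum across runs would be one hint of throttling, even when every status is 200.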

The Same IP the Entire Time

One important detail:

I never changed IP addresses.

Throughout the experiment I used:

Requests
aiohttp (asyncio)
Selenium
Playwright
Playwright MCP

all from the same IP.

This is significant because IP reputation is often one of the earliest signals used by anti-bot systems.

Repeated requests from the same source can contribute to:

Reputation scoring
Velocity analysis
Abuse detection
Challenge escalation

Yet I continued receiving successful responses.
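
For readers unfamiliar with velocity analysis, here is a generic sliding-window sketch of the idea. It is purely illustrative: the class name, the window, and the limit are assumptions, and nothing here is taken from Anubis or any vendor's implementation.

```python
from collections import defaultdict, deque


class VelocityTracker:
    """Counts requests per client inside a sliding time window."""

    def __init__(self, window_seconds: float, max_requests: int) -> None:
        self.window_seconds = window_seconds
        self.max_requests = max_requests
        self._hits = defaultdict(deque)  # client IP -> timestamps of recent requests

    def record(self, client_ip: str, now: float) -> bool:
        """Record one request and return True if the client is over the limit."""
        hits = self._hits[client_ip]
        hits.append(now)
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_requests
```

Under a rule like this, 100 simultaneous requests from one IP would exceed almost any reasonable limit, which is why the absence of any visible reaction stood out.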

Experiment #3 — Selenium

At this point I assumed browser automation would be more likely to trigger protections.

I launched Selenium.

The browser displayed the familiar message:

Chrome is being controlled by automated test software

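The banner itself is cosmetic, but the automation flags behind it (such as navigator.webdriver being set to true) are signals that many anti-bot vendors actively monitor.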

The outcome?

Normal page access.

No obvious blocking.

No visible challenge failure.

Experiment #4 — Playwright

Next came Playwright.

Playwright generally provides a more realistic browser environment and is often used when anti-bot protections become stricter.

The result remained the same.

Successful page loads.

No visible enforcement action.

Experiment #5 — Playwright MCP

Finally, I tested using Playwright MCP.

Again:

Success

The Anubis challenge completed successfully and access was granted.

What Does This Mean?

The most important takeaway is:

A 200 response does not automatically mean the anti-bot system failed.

There are multiple possible explanations.

Possibility 1: Conservative Configuration

The websites may be running relatively relaxed Anubis settings.

Many administrators prioritize accessibility over aggressive enforcement.

Possibility 2: Challenge Completion

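The browser-based clients (Selenium, Playwright, Playwright MCP) execute JavaScript, so they may simply have completed the challenge the way a regular browser would, without triggering additional scrutiny. Requests and aiohttp do not run JavaScript, so this explanation cannot account for their 200 responses.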

Possibility 3: Enforcement Thresholds Were Not Reached

Anubis may require:

Larger traffic volumes
Longer observation windows
Different behavioral patterns

before escalating protections.

Possibility 4: Configuration Gaps

Like any security product, deployment configuration matters.

A protection layer can only be as effective as the rules applied to it.

Possibility 5: Areas for Future Improvement

Every anti-bot solution evolves over time.

Open-source projects especially benefit from community testing and feedback.

Experiments like this can help identify edge cases and potential improvements.

What I'm Going To Test Next

Rather than speculate, I plan to deploy my own Anubis-protected environment.

That will allow controlled testing of:

Request-based scraping
Browser automation
TLS fingerprints
Header anomalies
Concurrency limits
IP reputation
Behavioral analysis

Only then can I determine whether what I observed was:

A configuration choice
A deployment-specific behavior
An intentional design decision
Or an area that deserves further investigation

Final Thoughts

One thing I've learned after years of working in web scraping:

Never underestimate an anti-bot system.

A successful request today doesn't mean a defense is weak.

Likewise, a blocked request doesn't necessarily mean a defense is strong.

The interesting part is understanding why a request succeeds or fails.

This experiment raised more questions than answers—which is exactly what makes anti-bot research fascinating.

The next step is building a controlled environment and digging deeper into how Anubis evaluates connections, challenges clients, and decides who gets through.

And honestly, that's where the fun begins.

#WebScraping #DataExtraction #BotDetection #AntiScraping #SecurityFail #DevOps #WebSecurity #CyberSecurity #BotDefense #ScraperLife

Anti-Bot Evasion 2026: Why Your TLS Handshake Is Getting You Flagged (And How to Fix It)

Shahraan Hussain — Sun, 21 Jun 2026 11:05:06 +0000

Why Your Browser Version Could Be Exposing Your Scraper Before the First Request

Modern anti-bot systems no longer rely solely on HTTP headers, JavaScript fingerprints, or IP reputation. Increasingly, detection begins before the first HTTP request is even processed—during the TLS handshake itself.

One signal that has become difficult to ignore is the rise of Post-Quantum (PQ) key exchange support in modern browsers.

Recently, I ran a series of tests to understand how this affects browser impersonation and scraping infrastructure. The results were interesting.

The Evolution of Browser Fingerprinting

For years, many scraping tools focused on matching:

User-Agent strings
HTTP headers
Browser APIs
Canvas and WebGL fingerprints

However, anti-bot vendors have steadily moved lower in the networking stack.

Today, platforms such as Cloudflare, Akamai, DataDome, Kasada, and others analyze signals including:

TLS ClientHello fingerprints
Cipher suite ordering
TLS extension ordering
JA3 and JA4 fingerprints
HTTP/2 SETTINGS fingerprints
HTTP/3 and QUIC characteristics
Browser behavior consistency

This means that claiming to be Chrome 149 while presenting a TLS handshake that looks nothing like Chrome 149 can immediately increase suspicion.
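
As an illustration of how a ClientHello becomes a fingerprint, here is a sketch of the JA3 calculation: five comma-separated fields of dash-joined decimal values, hashed with MD5, with GREASE placeholder values skipped. The input values in the usage note are made up for the example and are not a real Chrome profile.

```python
import hashlib

# GREASE placeholder values (0x0A0A, 0x1A1A, ..., 0xFAFA) are ignored by JA3.
GREASE = {(b << 8) | b for b in range(0x0A, 0x100, 0x10)}


def ja3(version, ciphers, extensions, curves, point_formats):
    """Return (ja3_string, md5_hex) for the given ClientHello fields."""

    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    ja3_string = ",".join(
        [str(version), join(ciphers), join(extensions), join(curves), join(point_formats)]
    )
    digest = hashlib.md5(ja3_string.encode(), usedforsecurity=False).hexdigest()
    return ja3_string, digest
```

For example, ja3(771, [4865, 4866], [0, 10, 43], [4588, 29], [0]) yields the string "771,4865-4866,0-10-43,4588-29,0", where 4588 (0x11EC) is the code point for X25519MLKEM768. Because ordering is part of the string, the same ciphers in a different order produce a different hash.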

The Post-Quantum Shift

Recent browser versions have started deploying hybrid post-quantum key exchanges.

A commonly observed example is:

X25519MLKEM768

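This hybrid mechanism combines the classical X25519 elliptic-curve key exchange with the post-quantum ML-KEM-768 key encapsulation mechanism, so the session key stays protected as long as either component remains unbroken.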

From an anti-bot perspective, the important observation is simple:

If a client claims to be a modern browser but does not exhibit characteristics commonly associated with that browser generation, it becomes easier to identify inconsistencies.

A Simple Experiment

To explore this, I tested a modern browser impersonation stack and inspected the negotiated connection details using Cloudflare's trace endpoint (/cdn-cgi/trace).

The response included:

tls=TLSv1.3
http=http/2
kex=X25519MLKEM768

The interesting field here is:

kex=X25519MLKEM768

which indicates that a post-quantum hybrid key exchange was successfully negotiated.
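
The trace output is plain key=value text, so checking it programmatically is straightforward. A small sketch, using the three lines quoted above as sample input; the helper names are my own, and the "MLKEM" substring check is a simplifying assumption.

```python
def parse_trace(text: str) -> dict:
    """Parse key=value lines from a trace response into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip blank or malformed lines
            fields[key.strip()] = value.strip()
    return fields


def negotiated_pq_hybrid(fields: dict) -> bool:
    """True if the negotiated key exchange name contains an ML-KEM component."""
    return "MLKEM" in fields.get("kex", "").upper()


SAMPLE = "tls=TLSv1.3\nhttp=http/2\nkex=X25519MLKEM768\n"
```

Running parse_trace on the response body of each impersonation profile makes it easy to compare the negotiated kex value across client versions.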

Why This Matters

Consider two clients:

Client A

User-Agent: Chrome 149
TLS Key Share: X25519 only

Client B

User-Agent: Chrome 149
TLS Key Share: X25519MLKEM768

Neither signal alone determines whether the client is a bot.

However, modern anti-bot systems are built around consistency.

When every layer of the connection aligns with what is expected from a real browser, the client's overall risk score tends to drop.

When multiple inconsistencies accumulate, the opposite happens.

The Common Misconception

Many engineers assume that bypassing anti-bot systems is primarily about headers:

headers = {
    "User-Agent": "Chrome/149"
}

Unfortunately, that approach stopped being sufficient years ago.

Today, anti-bot systems may inspect:

TLS fingerprints
HTTP/2 fingerprints
HTTP/3 fingerprints
Browser APIs
Behavioral signals
Session history
IP reputation

TLS is only one layer, but it is often the first layer.

Testing Modern TLS Profiles

When validating a browser impersonation stack, I now check:

TLS Layer

Cipher suite ordering
Extension ordering
Supported groups
Signature algorithms
PQ key share support

HTTP Layer

HTTP/2 SETTINGS frames
Header ordering
Priority behavior

Browser Layer

Navigator properties
WebGL
Canvas
Audio fingerprints

A mismatch at any layer can become a useful signal for detection systems.
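
One way to turn this checklist into something repeatable is to compare an observed profile against a reference profile, layer by layer. The sketch below assumes both are nested dicts keyed by layer and field; the structure and the field names are hypothetical and would need to be filled from real captures.

```python
def diff_profile(expected: dict, observed: dict) -> list:
    """List every field whose observed value differs from the reference profile."""
    mismatches = []
    for layer, fields in expected.items():
        for name, want in fields.items():
            got = observed.get(layer, {}).get(name)
            if got != want:
                mismatches.append(f"{layer}.{name}: expected {want!r}, got {got!r}")
    return mismatches
```

An empty list means the observed profile matches the reference on every field that was checked, which says nothing about fields the reference does not cover.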

What This Does NOT Mean

It's important not to overstate the impact.

The absence of a PQ key share does not automatically mean:

No PQ = Blocked

Real-world traffic includes:

Older browsers
Enterprise-managed devices
Corporate TLS proxies
Embedded browsers
Mobile WebViews

Blocking solely on PQ support would generate too many false positives.

A more accurate conclusion is:

The absence of a post-quantum key share is becoming an increasingly useful negative signal when a client claims to be a recent browser version.
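
A sketch of how such a soft signal might be expressed: it contributes a small weight to a risk score instead of triggering a block. The version threshold and the weight are assumptions chosen for illustration, not values taken from any anti-bot vendor.

```python
import re
from typing import Optional


def claimed_chrome_major(user_agent: str) -> Optional[int]:
    """Extract the Chrome major version a User-Agent claims, if any."""
    match = re.search(r"Chrome/(\d+)", user_agent)
    return int(match.group(1)) if match else None


def pq_mismatch_score(user_agent, key_share_groups, pq_since_major=131, weight=0.2):
    """Return `weight` when a recent-Chrome claim comes without an ML-KEM key share.

    `pq_since_major` and `weight` are hypothetical tuning parameters.
    """
    major = claimed_chrome_major(user_agent)
    if major is None or major < pq_since_major:
        return 0.0  # no claim to a PQ-capable version, so nothing is inconsistent
    has_pq = any("MLKEM" in group.upper() for group in key_share_groups)
    return 0.0 if has_pq else weight
```

Returning a weight instead of a boolean reflects the point above: the signal is meant to be combined with others, not acted on alone.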

Practical Takeaways

If you're building browser impersonation or scraping infrastructure:

Review Your TLS Stack

Verify that your TLS implementation matches the browser version you claim to emulate.

Stop Focusing Only on Headers

Headers are just one component of a much larger fingerprint.

Validate End-to-End Consistency

The goal isn't merely to send a modern User-Agent.

The goal is to make every layer of the connection look consistent with that User-Agent.

Monitor Browser Changes

Browser fingerprints evolve continuously.

A profile that looked authentic six months ago may now be outdated.

Final Thoughts

Anti-bot detection continues to move deeper into the networking stack.

While Post-Quantum key exchanges are not a magic bypass, they are becoming part of the broader fingerprint expected from modern browsers.

For scraping engineers, the lesson is straightforward:

The challenge is no longer making your headers look like Chrome.

The challenge is making your entire connection behave like Chrome.

And increasingly, that starts with the TLS handshake.

#webscraping #antibot #cybersecurity #tls #cloudflare #postquantum #python #golang #devops #programming #antibotbypass

Can an AI Agent Behave Like a Human? A 12-Hour Experiment with StoryCaptcha

Shahraan Hussain — Thu, 18 Jun 2026 13:33:46 +0000

A day ago, I came across a LinkedIn post from Tyler Richards showcasing an experimental CAPTCHA called StoryCaptcha.

The concept was simple but unusual.

Instead of asking users to identify traffic lights or solve image puzzles, StoryCaptcha asks users to write a short story based on a random prompt and then evaluates the interaction using behavioral signals.

The goal wasn't to build a production-ready CAPTCHA.

It was an experiment exploring behavioral biometrics and user interaction patterns.

As someone working in web scraping and anti-bot research, I immediately became curious.

What happens when an AI agent attempts the challenge?

More importantly:

Can an AI-controlled browser generate interaction patterns that a behavioral CAPTCHA considers human?

I spent the next 12 hours trying to answer that question.

The Setup

For this experiment I used:

Playwright MCP
VS Code
GitHub Copilot
Chromium

The objective wasn't to bypass the CAPTCHA.

The objective was to understand how a behavioral scoring system evaluates AI-driven interactions.

First Attempt: 56/100

My first run scored 56/100 and failed.

The reason quickly became obvious.

The AI agent was behaving exactly how an automation system would behave:

Copying and pasting content
Completing actions immediately
Following deterministic patterns
Showing almost no hesitation

Efficient.

But not very human.

The Interesting Part

Unlike many behavioral systems, StoryCaptcha actually exposes a large portion of the signals it evaluates.

The dashboard displayed metrics such as:

Typing Signals

Typed vs Pasted
Keystrokes per character
Key-hold (dwell) profile
Key-overlap (rollover)
Rhythm variability
Non-repeating intervals

Behavioral Signals

Cognitive pauses
Inter-interaction timing
Correction behavior
Backspace usage

Mouse Signals

Mouse path curvature
Straightness
Teleport detection

Content Signals

Reads like language
On-topic for prompt

This transformed the experiment from simple testing into a feedback-driven behavioral analysis exercise.

Instead of guessing blindly, I could observe which signals were being evaluated and adjust the agent's behavior accordingly.

Observation #1: Copy-Paste Was a Dead Giveaway

Initially the AI agent preferred copying and pasting the story.

StoryCaptcha immediately detected this.

The first optimization was simple:

Instead of pasting content, I instructed the agent to type the response character by character.

The score improved.

Observation #2: Human Typing Isn't Uniform

The next issue was typing cadence.

Humans don't type with perfectly consistent timing.

Sometimes we pause.

Sometimes we think.

Sometimes we speed up.

I instructed the agent to:

Use random keystroke delays
Avoid identical intervals
Pause naturally between thoughts

The score improved again.

One metric I paid particular attention to was:

Non-Repeating Intervals

StoryCaptcha was actively measuring how repetitive the timing patterns were.
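
I never saw how StoryCaptcha computes this, so the following is only a guess at what such metrics could look like: the coefficient of variation of the gaps between keystrokes, and the share of distinct gap values after rounding to the millisecond. The function and key names are my own.

```python
from statistics import mean, pstdev


def rhythm_metrics(key_times_ms: list) -> dict:
    """Compute timing-variability metrics from keydown timestamps in milliseconds."""
    intervals = [b - a for a, b in zip(key_times_ms, key_times_ms[1:])]
    if not intervals:
        raise ValueError("need at least two keystrokes")
    avg = mean(intervals)
    # Coefficient of variation: 0.0 means perfectly uniform typing.
    variation = pstdev(intervals) / avg if avg else 0.0
    # Share of distinct intervals: low values mean the same delay keeps repeating.
    distinct = len({round(i) for i in intervals}) / len(intervals)
    return {"coefficient_of_variation": variation, "non_repeating_ratio": distinct}
```

A fixed delay between keystrokes gives a coefficient of variation of 0.0 and a very low non-repeating ratio, which matches the pattern the first runs were penalized for.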

Observation #3: Humans Make Mistakes

Humans aren't perfect typists.

We:

Misspell words
Hit incorrect keys
Use backspace
Correct ourselves

Automation rarely does.

So I instructed the agent to:

Occasionally introduce spelling mistakes
Use backspace corrections
Continue naturally after correction

The dashboard reflected these behaviors through correction metrics and the overall score improved.
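
As a rough illustration of correction metrics, the sketch below replays a list of key names, applies Backspace, and reports keystrokes per final character and the share of Backspace presses. This is my own reconstruction of what "keystrokes per character" and "backspace usage" might mean, not StoryCaptcha's code.

```python
def correction_metrics(keys: list) -> dict:
    """Replay key names (single characters or 'Backspace') and measure corrections."""
    buffer = []
    backspaces = 0
    for key in keys:
        if key == "Backspace":
            backspaces += 1
            if buffer:
                buffer.pop()
        elif len(key) == 1:
            buffer.append(key)
    final_text = "".join(buffer)
    return {
        "final_text": final_text,
        "backspace_ratio": backspaces / len(keys) if keys else 0.0,
        # Exactly 1.0 means every keystroke survived, i.e. no corrections at all.
        "keystrokes_per_char": len(keys) / len(final_text) if final_text else 0.0,
    }
```

Typing "helo", pressing Backspace once, then typing "lo" produces "hello" from 7 keystrokes, so keystrokes per character is 1.4 instead of the flawless 1.0.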

Observation #4: Humans Don't Instantly Click Everything

The agent was still too efficient.

Humans typically:

Read content
Hover over elements
Pause before actions
Explore pages

I encouraged more natural cursor movement and hovering behavior.

StoryCaptcha evaluates:

Mouse path curvature
Teleport detection
Interaction timing

So this adjustment had a measurable impact.
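
To show what straightness and teleport detection could mean in practice, here is a sketch that works on a list of sampled cursor positions. The 200-pixel teleport threshold and the metric definitions are assumptions for illustration only.

```python
from math import hypot


def path_metrics(points: list, teleport_px: float = 200.0) -> dict:
    """Compute straightness and teleport count from sampled (x, y) cursor positions."""
    steps = [hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in zip(points, points[1:])]
    path_length = sum(steps)
    direct = hypot(points[-1][0] - points[0][0], points[-1][1] - points[0][1])
    return {
        # 1.0 is a perfectly straight line; curved paths score lower.
        "straightness": direct / path_length if path_length else 1.0,
        # A single sample-to-sample jump larger than the threshold counts as a teleport.
        "teleports": sum(1 for step in steps if step > teleport_px),
    }
```

A cursor that jumps from its start position directly to the target produces one long step: straightness of 1.0 and one teleport, which is the signature of a scripted click.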

Observation #5: One Signal Refused To Cooperate

The most fascinating metric was:

Key Overlap (Rollover)

StoryCaptcha reported:

Human ≈ 25%–50% overlap

My agent consistently scored:

0%

Even after improving almost every other metric.

This was particularly interesting because it exposed a difference between simulated typing and real human keyboard behavior.

Humans frequently begin pressing the next key before fully releasing the previous key.

Many automation frameworks generate perfectly sequential key events.

The CAPTCHA was successfully identifying that distinction.

Despite scoring well overall, this remained one of the strongest indicators that the interaction was not genuinely human.
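
The overlap metric can be illustrated with key-down and key-up timestamps. In this sketch, a pair of consecutive keys overlaps when the next key goes down before the previous one comes up; the exact definition StoryCaptcha uses is unknown to me, so treat this as one plausible reading.

```python
def rollover_ratio(events: list) -> float:
    """Fraction of consecutive key pairs that overlap.

    `events` holds (keydown_ms, keyup_ms) tuples ordered by keydown time.
    """
    pairs = list(zip(events, events[1:]))
    if not pairs:
        return 0.0
    overlapping = sum(
        1 for (_, prev_up), (next_down, _) in pairs if next_down < prev_up
    )
    return overlapping / len(pairs)
```

An automation layer that emits key-down and key-up for one character before starting the next can never produce an overlapping pair, which is consistent with the 0% my agent kept scoring.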

Final Result

After roughly 10 experimental runs:

Attempt	Score
Initial	56
Intermediate	60–70
Optimized	76–77

The challenge eventually passed consistently.

However, the score wasn't the most valuable outcome.

The real value was understanding how behavioral features influenced the evaluation.

What I Learned

Behavioral Biometrics Are More Than Mouse Movement

Before this experiment, most discussions I encountered focused on:

Browser fingerprints
TLS fingerprints
Device identification
Network reputation

This experiment reminded me that behavior itself can become a powerful signal.

Not just what actions occur.

But how they occur.

AI Agents Create New Challenges

Traditional automation focuses on:

Speed
Efficiency
Determinism

AI agents introduce:

Exploration
Context awareness
Adaptive behavior

As AI agents become more common, behavioral detection systems will likely become increasingly important.

Reverse Engineering Doesn't Always Require Source Code

I never saw StoryCaptcha's implementation.

I never saw its scoring algorithm.

But by observing outputs, forming hypotheses, and iteratively adjusting behavior, I was still able to learn a surprising amount about what the system valued.

That's one of the things I enjoy most about reverse engineering:

Observe.

Hypothesize.

Test.

Repeat.

Final Thoughts

I started this experiment asking:

Can an AI agent behave like a human?

Twelve hours later, I think the more interesting question is:

Which parts of human behavior are hardest for machines to reproduce?

The answer, at least from this experiment, appears to be much more nuanced than simply moving a mouse or typing text.

And that's exactly what made the exercise worth exploring.

#antibotbypass #antibot #cybersecurity #webscraping