
Shubham Gupta

Posted on • Originally published at siliconfriendly.com

Your AI Agents Are Flying Blind. Here's a Pre-Flight Check.

Your agent hits a 403. It retries. Gets a CAPTCHA. Retries again. Finally gets some HTML back — 200KB of JavaScript-rendered noise it burns 4,000 tokens trying to parse before concluding it can't extract anything useful.

You've been there. I've been there. Every agent builder has been there.

The problem isn't your agent. It's that there's no way to know if a website will cooperate before your agent wastes time and tokens on it.

The Spectrum of Agent-Friendliness

Not all websites treat agents the same. Some have structured APIs, machine-readable docs, and llms.txt files. Others serve you a Cloudflare challenge page and call it a day.

Silicon Friendly rates 834+ websites on a scale from L0 to L5:

| Level | What it means |
|-------|---------------|
| L0 | Hostile — blocks agents outright |
| L1 | Passive — no agent support, you're scraping |
| L2 | Basic — has an API, but docs are human-only |
| L3 | Structured — good APIs, machine-readable content |
| L4 | Verified — strong agent support across 30 criteria |
| L5 | Agent-native — built for programmatic access first |

Each rating is based on 30 concrete criteria: structured error responses, rate limit headers, auth documentation, machine-readable pricing, llms.txt support, and more.
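The levels translate directly into a go/no-go decision your agent can make before touching a domain. Here's a minimal sketch of that gate; the `decide` helper and its thresholds are my own illustration, not part of Silicon Friendly's API:

```python
# Sketch: turn an L0-L5 rating into a pre-flight decision before an agent
# touches a domain. The thresholds here are illustrative, not official.

def decide(level: int) -> str:
    """Map a Silicon Friendly level to a pre-flight decision."""
    if level >= 3:
        return "integrate"            # structured APIs, machine-readable content
    if level >= 1:
        return "scrape-with-caution"  # no agent support, budget tokens carefully
    return "skip"                     # L0: hostile, aborting is cheaper than retrying

print(decide(4))  # stripe.com rates L4, so: integrate
```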

Wire It Into Your Agent in 5 Minutes

Silicon Friendly exposes an MCP server your agents can query at runtime. If you're using CrewAI (which supports MCP natively), here's the setup:

```shell
pip install crewai "crewai-tools[mcp]"
```
```python
from crewai import Agent, Task, Crew
from crewai_tools import MCPServerAdapter

sf_server = {
    "url": "https://siliconfriendly.com/mcp",
    "transport": "streamable_http",
}

with MCPServerAdapter(sf_server, connect_timeout=30) as tools:
    scout = Agent(
        role="Integration Scout",
        goal="Evaluate services for agent compatibility before integration",
        backstory=(
            "You check whether websites and APIs are built for AI agents. "
            "You use Silicon Friendly to look up agent-friendliness ratings "
            "and search for alternatives when a service scores poorly."
        ),
        tools=tools,
        verbose=True,
    )

    task = Task(
        description=(
            "We need to integrate a payment processor. "
            "Check if stripe.com is agent-friendly. "
            "Then search for other payment processors and compare."
        ),
        expected_output=(
            "Stripe's agent-friendliness level, "
            "alternatives with their levels, "
            "and a recommendation."
        ),
        agent=scout,
    )

    crew = Crew(agents=[scout], tasks=[task])
    result = crew.kickoff()
    print(result)
```

The agent calls check_agent_friendliness on stripe.com (L4), then search_websites for "payment processing" to find alternatives like Razorpay (L5) and Square Developer (L5). It makes an informed recommendation instead of guessing.

Key Tools Available via MCP

| Tool | Purpose |
|------|---------|
| check_agent_friendliness | Quick L0-L5 check for any domain |
| search_websites | Semantic search by use case ("email API", "cloud storage") |
| get_website | Full 30-criteria breakdown for a domain |
| get_levels_info | Explains the rating system |

No auth required for reads. The MCP endpoint is https://siliconfriendly.com/mcp using Streamable HTTP transport.
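If your framework doesn't ship an MCP adapter, the endpoint speaks plain JSON-RPC over HTTP. Here's a sketch of the request shape using only the standard library. Two caveats: the `domain` argument name is my assumption about the tool's schema, and a real client would first perform the MCP `initialize` handshake, which the adapters handle for you:

```python
import json
import urllib.request

MCP_URL = "https://siliconfriendly.com/mcp"

# JSON-RPC 2.0 envelope for an MCP tools/call request.
payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "check_agent_friendliness",
        "arguments": {"domain": "stripe.com"},  # argument name is assumed
    },
}

def call_tool() -> str:
    """POST the envelope; Streamable HTTP expects both Accept types."""
    req = urllib.request.Request(
        MCP_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
    )
    with urllib.request.urlopen(req) as resp:  # network call
        return resp.read().decode()
```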

Works with smolagents Too

If you're using HuggingFace's smolagents instead of CrewAI, the same MCP server works — smolagents has native MCP support. Any framework that speaks MCP can connect.
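A sketch of the same lookup in smolagents, assuming `smolagents` with its MCP extras installed and a HuggingFace inference backend available; the config dict mirrors the CrewAI setup above, and the prompt text is illustrative:

```python
# Same MCP server, smolagents client. Requires: pip install "smolagents[mcp]"
SF_SERVER = {
    "url": "https://siliconfriendly.com/mcp",
    "transport": "streamable-http",
}

def run_scout(question: str):
    """Connect to the Silicon Friendly MCP server and answer with its tools."""
    from smolagents import CodeAgent, InferenceClientModel, ToolCollection

    with ToolCollection.from_mcp(SF_SERVER, trust_remote_code=True) as tc:
        agent = CodeAgent(tools=[*tc.tools], model=InferenceClientModel())
        return agent.run(question)

# run_scout("Is stripe.com agent-friendly? Suggest higher-rated alternatives.")
```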

Practical Advice

L3+ is the sweet spot. Anything rated L3 or above has structured data, clear APIs, and machine-readable content. Below that, your agent is scraping HTML and guessing.

Use search instead of hardcoding. Instead of assuming which service to use, let your agent search for "email API" or "cloud storage" and pick the highest-rated option. The directory has 834+ websites indexed.

Check the details when it matters. get_website returns the full 30-criteria breakdown — you'll see exactly what works (structured error responses, rate limit headers) and what doesn't (no machine-readable pricing, no agent auth flow).

Check Your Own Stack

Before your next agent build, check whether the services you depend on will actually cooperate:

👉 siliconfriendly.com/llms.txt

If a site you use isn't rated yet, you can submit it through the MCP server or the website. You get bonus API queries for each site you verify.


Silicon Friendly is an open directory. The MCP server is free to use. Registry entry: com.siliconfriendly/directory.

Top comments (1)

signalstack

The pre-flight check framing is right. The 403→CAPTCHA→200KB noise loop isn't just wasteful — it corrupts your agent's context for the rest of the session. It's still trying to reason about what it 'found' when the page was basically empty.

The L0-L5 taxonomy makes sense. The check I'd add before hitting any new domain: does the site expose an API or llms.txt? If yes, use that. If not, decide up front whether to scrape or skip. The 'try and see what comes back' approach multiplies token waste across every page load in a pipeline.

The MCP integration is clever because it removes a whole class of 'worked in testing, broke in production' failures. Testing usually hits cooperative domains. Production hits the uncooperative ones at 2am.

One thing worth building into your agent's prompting: explicit instructions to abort on L0/L1, not retry. Most agents retry on transient-looking errors. But a CAPTCHA isn't transient — it's a policy. Treating it like a network timeout just burns more tokens on the same wall.
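A minimal sketch of that abort-don't-retry rule; the status codes and body check are illustrative heuristics, not a complete policy:

```python
# Sketch: treat CAPTCHAs and hard blocks as policy failures (abort),
# and retry only genuinely transient-looking failures.

POLICY_STATUSES = {401, 403, 451}

def should_retry(status: int, body: str) -> bool:
    """Abort on policy walls; retry only transient failures."""
    if status in POLICY_STATUSES or "captcha" in body.lower():
        return False  # a CAPTCHA is a policy, not a timeout
    return status in {429, 500, 502, 503, 504}

print(should_retry(403, ""))           # False: policy wall, abort
print(should_retry(503, "try later"))  # True: transient, retry
```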