
Fred Santos

New IteraTools Endpoint: POST /crawl — BFS Web Crawler for AI Agents

We just shipped POST /crawl to IteraTools — a breadth-first web crawler that extracts structured content from multiple pages in a single API call.

This is particularly useful for AI agents that need to digest entire documentation sites, product catalogs, or any multi-page website.


What it does

Starting from a seed URL, it performs BFS traversal, visiting each page and returning:

  • title — the page title
  • markdown — full page content as clean markdown (up to 20,000 chars/page)
  • links — outbound links found on the page

By default it stays on the same domain, so you won't accidentally crawl the entire internet.
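IteraTools' server-side implementation isn't shown here, but the same-domain BFS traversal it describes can be sketched locally. The link graph and helper below are purely illustrative (the URLs mirror the example above; nothing here calls the real API):

```python
from collections import deque
from urllib.parse import urlparse

# Toy link graph standing in for fetched pages (hypothetical URLs).
LINKS = {
    "https://docs.example.com": [
        "https://docs.example.com/guide",
        "https://docs.example.com/api",
        "https://other.example.org",  # off-domain, should be skipped
    ],
    "https://docs.example.com/guide": ["https://docs.example.com/api"],
    "https://docs.example.com/api": [],
}

def bfs_crawl(seed, max_pages=5, same_domain=True):
    """Visit pages breadth-first: pages closest to the seed come first."""
    seen, order = {seed}, []
    queue = deque([seed])
    while queue and len(order) < max_pages:
        url = queue.popleft()
        order.append(url)
        for link in LINKS.get(url, []):
            if same_domain and urlparse(link).netloc != urlparse(seed).netloc:
                continue  # stay on the seed's domain
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

print(bfs_crawl("https://docs.example.com"))
# → ['https://docs.example.com', 'https://docs.example.com/guide', 'https://docs.example.com/api']
```

Breadth-first order matters for crawling: with a page budget like `max_pages`, you get the pages nearest the seed (usually the most important ones) before anything buried deep in the site.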


Quick example

curl -X POST https://api.iteratools.com/crawl \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://docs.example.com","max_pages":10}'

Response:

{
  "ok": true,
  "data": {
    "pages": [
      {
        "url": "https://docs.example.com",
        "title": "Documentation Home",
        "markdown": "# Getting Started\n\nWelcome to the docs...",
        "links": ["https://docs.example.com/guide", "https://docs.example.com/api"]
      },
      {
        "url": "https://docs.example.com/guide",
        "title": "Getting Started Guide",
        "markdown": "## Installation\n\n```bash\nnpm install example\n```...",
        "links": ["https://docs.example.com/api", "https://docs.example.com/examples"]
      }
    ],
    "total": 2,
    "crawl_time_ms": 4821
  }
}
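Once the response comes back, an agent typically needs the pages in prompt-ready form. Here is a small helper for that, using a trimmed copy of the sample response above; the function name and separator format are my own, not part of the API:

```python
# Trimmed copy of the sample /crawl response shown above.
RESPONSE = {
    "ok": True,
    "data": {
        "pages": [
            {"url": "https://docs.example.com",
             "title": "Documentation Home",
             "markdown": "# Getting Started\n\nWelcome to the docs..."},
            {"url": "https://docs.example.com/guide",
             "title": "Getting Started Guide",
             "markdown": "## Installation\n\n..."},
        ],
        "total": 2,
        "crawl_time_ms": 4821,
    },
}

def pages_to_context(resp, max_chars=8000):
    """Concatenate crawled pages into one markdown block for an LLM prompt."""
    if not resp.get("ok"):
        raise RuntimeError("crawl failed")
    chunks = [f"## {p['title']} ({p['url']})\n\n{p['markdown']}"
              for p in resp["data"]["pages"]]
    return "\n\n---\n\n".join(chunks)[:max_chars]

context = pages_to_context(RESPONSE)
print(context[:40])
```

The `max_chars` cap is a crude but effective guard: at up to 20,000 characters per page, a 20-page crawl can easily overflow a model's context window.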

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| url | string | required | Starting URL |
| max_pages | integer | 5 | Max pages to crawl (1–20) |
| same_domain | boolean | true | Only follow same-domain links |
| include_pattern | string | null | Regex: only crawl matching URLs |
| exclude_pattern | string | null | Regex: skip matching URLs |

Filter examples

Crawl only blog posts:

{"url": "https://myblog.com", "max_pages": 20, "include_pattern": "/blog/"}

Skip admin and login pages:

{"url": "https://mysite.com", "max_pages": 10, "exclude_pattern": "/(admin|login|logout)"}
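It's worth sanity-checking your patterns locally before spending a crawl job on them. The sketch below assumes the filters behave like an unanchored regex search over the full URL (the docs above don't specify matching semantics, so treat that as an assumption):

```python
import re

urls = [
    "https://mysite.com/blog/post-1",
    "https://mysite.com/admin/users",
    "https://mysite.com/login",
    "https://mysite.com/pricing",
]

def should_crawl(url, include=None, exclude=None):
    """Assumed filter semantics: unanchored re.search against the full URL."""
    if include and not re.search(include, url):
        return False  # include_pattern set, URL doesn't match
    if exclude and re.search(exclude, url):
        return False  # URL matches exclude_pattern
    return True

kept = [u for u in urls if should_crawl(u, exclude=r"/(admin|login|logout)")]
print(kept)
# → ['https://mysite.com/blog/post-1', 'https://mysite.com/pricing']
```

Remember the patterns are matched against full URLs, so `"/blog/"` also matches a path segment like `/not-a/blog/page`; anchor with something like `r"mysite\.com/blog/"` if that matters.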

Pricing

$0.010 per job (up to 20 pages included) via x402 micropayment on Base (USDC), or Bearer API key.

That works out to $0.0005–$0.001 per page when you crawl 10–20 pages per job — significantly cheaper than most scraping services.


Use cases

  • Documentation ingestion — feed your AI agent an entire docs site before answering questions
  • Competitive research — extract product pages, pricing, and feature lists from competitor sites
  • Content auditing — scan your own site for outdated content or broken structure
  • RAG pipeline seeding — combine with /embeddings to build a searchable knowledge base from any website
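For the RAG seeding case, the crawled `markdown` fields usually need chunking before they go to an embeddings endpoint. A minimal chunker, with overlap so sentences split at a boundary still appear whole in one chunk (the sizes here are illustrative, not recommendations):

```python
def chunk_markdown(md, size=500, overlap=50):
    """Split one page's markdown into overlapping fixed-size chunks
    ready for an embeddings endpoint."""
    chunks, start = [], 0
    step = size - overlap
    while start < len(md):
        chunks.append(md[start:start + size])
        start += step
    return chunks

doc = "word " * 300  # stand-in for one page's 1,500-char markdown field
chunks = chunk_markdown(doc)
print(len(chunks), len(chunks[0]))
# → 4 500
```

Each consecutive pair of chunks shares its last/first 50 characters, so a fact that straddles a cut point is still retrievable from at least one chunk. Fancier strategies (splitting on markdown headings, for instance) work the same way on this output.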

MCP support

Available in the mcp-iteratools package (v1.0.27+):

{
  "mcpServers": {
    "iteratools": {
      "command": "npx",
      "args": ["-y", "mcp-iteratools"],
      "env": { "ITERATOOLS_API_KEY": "your-key" }
    }
  }
}

Try it
