
Simon

Posted on • Originally published at crawlforge.dev

How to Scrape Websites with Claude Code (2026 Guide)

Claude Code can edit files, run shell commands, and write tests — but it cannot fetch live web pages on its own. Connect it to CrawlForge MCP and it gains 20 scraping tools that run straight from your terminal.

npm install -g crawlforge-mcp-server
npx crawlforge-setup  # paste your API key
# Now in Claude Code:
# > Fetch https://news.ycombinator.com and list the top 5 stories

This guide shows you how to scrape websites with Claude Code using CrawlForge MCP, from installation to stealth-mode bypass. Every code block below is runnable.


TL;DR: Claude Code has no network access by default. Install CrawlForge MCP and Claude Code gets 20 scraping tools — fetch_url, extract_content, scrape_with_actions, stealth_mode, and more. Free tier = 1,000 credits, no credit card.

The Problem: Claude Code Cannot Fetch URLs

By default, Claude Code has no network access. Ask it to "read this blog post" and it will tell you it cannot open URLs. The built-in WebFetch helper exists in Claude Desktop but is limited, rate-capped, and frequently blocked by Cloudflare, Akamai, and other edge protections.

CrawlForge MCP solves this by exposing 20 scraping tools — fetch_url, extract_content, scrape_structured, stealth_mode, deep_research, and more — as Model Context Protocol tools that Claude Code can call like any other function. For more background on the protocol itself, see the complete guide to MCP web scraping.

+------------+     MCP       +-------------+     HTTPS     +-----------+
| Claude Code| <-----------> | CrawlForge  | <-----------> |  Target   |
| (terminal) |  JSON-RPC     |   Server    |               |   site    |
+------------+   over stdio  +-------------+               +-----------+
                                    |
                                Credits +
                                Stealth +
                                Actions
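Under the hood, the arrow labeled "MCP JSON-RPC over stdio" in the diagram above carries newline-delimited JSON-RPC 2.0 messages. As a rough sketch (the exact params schema comes from the server's tool manifest, so treat the field values here as illustrative), a `fetch_url` invocation looks like this:

```javascript
// Illustrative JSON-RPC 2.0 message that an MCP client such as Claude Code
// sends to the CrawlForge server process over stdio. One JSON object per
// line is the stdio transport's framing.
const request = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/call',
  params: {
    name: 'fetch_url',
    arguments: { url: 'https://news.ycombinator.com' },
  },
};

// Serialize and frame with a trailing newline, as the stdio transport expects.
process.stdout.write(JSON.stringify(request) + '\n');
```

You never write these messages yourself; Claude Code generates them whenever a prompt calls for a tool.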

Prerequisites

  • Node.js 18+ — check with node --version
  • Claude Code — install with npm install -g @anthropic-ai/claude-code
  • A CrawlForge account — free at crawlforge.dev/signup (1,000 credits included, no credit card)

Step 1: Install CrawlForge MCP

npm install -g crawlforge-mcp-server

Verify the install:

crawlforge-mcp-server --version
# crawlforge-mcp-server 3.0.16

Step 2: Get Your API Key

  1. Go to crawlforge.dev/signup and create an account.
  2. Open the dashboard at crawlforge.dev/dashboard/api-keys.
  3. Copy the key — it starts with cf_live_.

Step 3: Register the MCP Server with Claude Code

The fastest path is the setup wizard:

npx crawlforge-setup

It writes the correct entry to ~/.config/claude-code/mcp.json (Linux/macOS) or %APPDATA%\claude-code\mcp.json (Windows).

Prefer manual configuration? Here's the exact JSON

Add this to your Claude Code MCP config file:

{
  "mcpServers": {
    "crawlforge": {
      "command": "crawlforge-mcp-server",
      "env": {
        "CRAWLFORGE_API_KEY": "cf_live_your_key_here"
      }
    }
  }
}

Restart Claude Code so it picks up the new server.


Step 4: Verify the Connection

Open Claude Code and run:

/mcp

You should see crawlforge listed as connected with 20 tools available. If not, jump to Troubleshooting.

Step 5: Your First Scrape

Paste this prompt into Claude Code:

Fetch https://news.ycombinator.com using CrawlForge and give me the
top 5 story titles with their URLs as a JSON array.

Claude Code will call fetch_url (1 credit), parse the HTML, and return something like:

[
  { "title": "Show HN: My side project", "url": "https://example.com/post/1" },
  { "title": "Why X is changing Y", "url": "https://example.com/post/2" }
]

That is it. You are scraping.
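For the curious, the `fetch_url` call Claude Code just made maps to a plain HTTP request against the CrawlForge API, by analogy with the `scrape_structured` example later in this guide (the `/api/v1/tools/fetch_url` path is an assumption, not something documented here). A small builder makes the shape explicit:

```javascript
// Sketch: construct the request that a fetch_url tool call corresponds to.
// The endpoint path is assumed by analogy with the scrape_structured example.
function buildFetchUrlRequest(targetUrl, apiKey) {
  return {
    endpoint: 'https://crawlforge.dev/api/v1/tools/fetch_url',
    options: {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ url: targetUrl }),
    },
  };
}

// Usage (requires network access and a real key):
// const { endpoint, options } = buildFetchUrlRequest(
//   'https://news.ycombinator.com', process.env.CRAWLFORGE_API_KEY);
// const result = await fetch(endpoint, options).then(r => r.json());
```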

Full Working Example: Scrape a Pricing Page

Here is a realistic task: extract pricing tiers from a SaaS site. Paste this prompt:

Use scrape_structured to extract pricing from https://crawlforge.dev/pricing.
Return an array of { plan, price, credits, features[] }.

See what Claude Code builds behind the scenes
// What Claude Code sends to CrawlForge via MCP
const response = await fetch('https://crawlforge.dev/api/v1/tools/scrape_structured', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.CRAWLFORGE_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    url: 'https://crawlforge.dev/pricing',
    selectors: {
      plan: '.pricing-card h3',
      price: '.pricing-card .price',
      credits: '.pricing-card .credits',
      features: '.pricing-card ul li',
    },
  }),
});

const data = await response.json();
console.log(data.plan); // ['Free', 'Hobby', 'Professional', 'Business']

Cost: 2 credits. Compare that to running a headless browser locally: zero infrastructure, no Puppeteer debugging, no Cloudflare roulette.

Advanced: Scrape JavaScript-Rendered Sites

Some sites render pricing or product data through client-side React. fetch_url returns the pre-hydration HTML shell and misses the data. Switch to scrape_with_actions (5 credits):

scrape_with_actions — clicks, waits, and scrolls for SPAs
// Prompt Claude Code with the goal, and it generates this call
const payload = {
  url: 'https://app.example.com/dashboard',
  actions: [
    { type: 'wait', selector: '.data-grid', timeout: 5000 },
    { type: 'click', selector: 'button.load-more' },
    { type: 'wait', timeout: 1500 },
  ],
  formats: ['json', 'markdown'],
};

const result = await fetch('https://crawlforge.dev/api/v1/tools/scrape_with_actions', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.CRAWLFORGE_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify(payload),
}).then(r => r.json());

For Cloudflare and Akamai-protected sites, use stealth_mode (also 5 credits). The fingerprint-rotation tradeoffs are covered in the stealth mode deep dive.

Tool Quick Reference

Tool                 Credits   When to use
fetch_url            1         Static HTML that you will parse yourself
extract_text         1         Clean, readable text from article pages
extract_content      2         Readability-style main-content extraction
scrape_structured    2         CSS selectors mapped to typed fields
search_web           5         You do not know the URL yet
scrape_with_actions  5         SPAs that need clicks, waits, or scrolls
stealth_mode         5         Anti-bot systems (Cloudflare, DataDome)
deep_research        10        Multi-source research with citations

Full list in the 20-tools overview.
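The per-tool costs in the table above make it easy to budget against the free tier's 1,000 monthly credits. A throwaway estimator:

```javascript
// Credit costs from the quick-reference table above.
const CREDIT_COSTS = {
  fetch_url: 1,
  extract_text: 1,
  extract_content: 2,
  scrape_structured: 2,
  search_web: 5,
  scrape_with_actions: 5,
  stealth_mode: 5,
  deep_research: 10,
};

// Sum credits for a plan given as [tool, numberOfCalls] pairs.
function estimateCredits(plan) {
  return plan.reduce((total, [tool, calls]) => total + CREDIT_COSTS[tool] * calls, 0);
}

// e.g. 200 plain fetches, 50 structured scrapes, 20 stealth requests:
console.log(estimateCredits([
  ['fetch_url', 200],
  ['scrape_structured', 50],
  ['stealth_mode', 20],
])); // 200 + 100 + 100 = 400 credits
```

That hypothetical month lands at 400 credits, comfortably inside the free tier.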

Troubleshooting

MCP server failed to start

Confirm crawlforge-mcp-server is on your PATH:

which crawlforge-mcp-server

If empty, reinstall globally:

npm install -g crawlforge-mcp-server

Unauthorized or 401 errors

Your API key is missing or malformed. It must start with cf_live_. Re-export in your shell and restart Claude Code:

export CRAWLFORGE_API_KEY="cf_live_..."

Insufficient credits

Check usage at crawlforge.dev/dashboard/usage. Free tier = 1,000 credits/month. Upgrade to Hobby ($19/mo) for 25,000.

Tools list is empty when running /mcp

MCP config is not being read. Check the right path for your OS:

  • macOS: ~/Library/Application Support/claude-code/mcp.json
  • Linux: ~/.config/claude-code/mcp.json
  • Windows: %APPDATA%\claude-code\mcp.json

Cloudflare 403 on every fetch

Swap fetch_url for stealth_mode. If you still get blocked, the target uses server-side JA3/JA4 fingerprinting — open a GitHub issue with the URL.
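If you script against the API directly, the swap can be automatic. This sketch reuses the endpoint pattern from the earlier examples and takes a fetch-like function as a parameter so the retry logic is testable without network access; the escalation rule itself (403 means try stealth) is the one described above:

```javascript
// Sketch: try the 1-credit fetch_url first, escalate to the 5-credit
// stealth_mode only when the target answers 403.
async function fetchWithStealthFallback(url, apiKey, fetchImpl = fetch) {
  const callTool = (tool) =>
    fetchImpl(`https://crawlforge.dev/api/v1/tools/${tool}`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ url }),
    });

  const first = await callTool('fetch_url');
  if (first.status !== 403) return first;
  return callTool('stealth_mode');
}
```

Note the cost asymmetry: the fallback only spends the extra 4 credits on pages that actually block the cheap path.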

FAQ

Can Claude Code scrape websites without CrawlForge?

Not reliably. Claude Code has no built-in network access, and the WebFetch helper in Claude Desktop is rate-limited and blocked by most anti-bot systems. CrawlForge MCP adds 20 dedicated scraping tools that handle static HTML, JavaScript-rendered pages, and Cloudflare-protected sites.

How much does it cost to scrape with Claude Code?

CrawlForge uses a credit model: basic fetches cost 1 credit, structured extraction 2 credits, search 5, stealth 5, deep research 10. Free accounts get 1,000 credits per month with no credit card. The Hobby plan ($19/mo) includes 25,000 credits.

Why do I get 403 errors when Claude Code fetches certain URLs?

Sites protected by Cloudflare, DataDome, or Akamai block generic HTTP clients via TLS fingerprinting and JavaScript challenges. Switch from fetch_url (1 credit) to stealth_mode (5 credits), which rotates browser fingerprints and solves challenges automatically.

Does CrawlForge MCP work with Claude Code on Windows?

Yes. Install via npm, run npx crawlforge-setup, and the config lands at %APPDATA%\claude-code\mcp.json. Node.js 18+ is the only system requirement. Windows users should run the setup command from PowerShell or Windows Terminal for the cleanest experience.

What is the difference between fetch_url and scrape_with_actions?

fetch_url returns raw HTML via a fast HTTP request (1 credit). scrape_with_actions spins up a headless browser, executes clicks/waits/scrolls, then captures the hydrated DOM (5 credits). Use fetch_url for static pages and scrape_with_actions only when JavaScript rendering is required.
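That rule of thumb fits in a one-line helper (the option names here are made up for illustration):

```javascript
// Prefer the 1-credit fetch unless the page needs JavaScript rendering
// or user actions, per the FAQ answer above.
function chooseTool({ needsJsRendering = false, needsActions = false } = {}) {
  return needsJsRendering || needsActions ? 'scrape_with_actions' : 'fetch_url';
}

console.log(chooseTool());                           // fetch_url
console.log(chooseTool({ needsJsRendering: true })); // scrape_with_actions
```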


Next Steps

Start Free — 1,000 Credits Included
