
Simon

Posted on • Originally published at crawlforge.dev

How to Scrape Websites with Claude Code (2026 Guide)

Claude Code can edit files, run shell commands, and write tests — but it cannot fetch live web pages on its own. Connect it to CrawlForge MCP and it gains 20 scraping tools that run straight from your terminal.

npm install -g crawlforge-mcp-server
npx crawlforge-setup  # paste your API key
# Now in Claude Code:
# > Fetch https://news.ycombinator.com and list the top 5 stories

This guide shows you how to scrape websites with Claude Code using CrawlForge MCP, from installation to stealth-mode bypass. Every code block below is runnable.


TL;DR: Claude Code has no network access by default. Install CrawlForge MCP and Claude Code gets 20 scraping tools — fetch_url, extract_content, scrape_with_actions, stealth_mode, and more. Free tier = 1,000 credits, no credit card.

The Problem: Claude Code Cannot Fetch URLs

By default, Claude Code has no network access. Ask it to "read this blog post" and it will tell you it cannot open URLs. The built-in WebFetch helper exists in Claude Desktop but is limited, rate-capped, and frequently blocked by Cloudflare, Akamai, and other edge protections.

CrawlForge MCP solves this by exposing 20 scraping tools — fetch_url, extract_content, scrape_structured, stealth_mode, deep_research, and more — as Model Context Protocol tools that Claude Code can call like any other function. For more background on the protocol itself, see the complete guide to MCP web scraping.

+------------+     MCP       +-------------+     HTTPS     +-----------+
| Claude Code| <-----------> | CrawlForge  | <-----------> |  Target   |
| (terminal) |  JSON-RPC     |   Server    |               |   site    |
+------------+   over stdio  +-------------+               +-----------+
                                    |
                                Credits +
                                Stealth +
                                Actions
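Under the hood, the arrow labeled "MCP JSON-RPC over stdio" in the diagram above carries newline-delimited JSON-RPC 2.0 messages. As a rough sketch (the exact params schema comes from the server's tool manifest, so treat the field values here as illustrative), a `fetch_url` invocation looks like this:

```javascript
// Illustrative JSON-RPC 2.0 message that an MCP client such as Claude Code
// sends to the CrawlForge server process over stdio. One JSON object per
// line is the stdio transport's framing.
const request = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/call',
  params: {
    name: 'fetch_url',
    arguments: { url: 'https://news.ycombinator.com' },
  },
};

// Serialize and frame with a trailing newline, as the stdio transport expects.
process.stdout.write(JSON.stringify(request) + '\n');
```

You never write these messages yourself; Claude Code generates them whenever a prompt calls for a tool.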

Prerequisites

  • Node.js 18+ — check with node --version
  • Claude Code — install with npm install -g @anthropic-ai/claude-code
  • A CrawlForge account — free at crawlforge.dev/signup (1,000 credits included, no credit card)

Step 1: Install CrawlForge MCP

npm install -g crawlforge-mcp-server

Verify the install:

crawlforge-mcp-server --version
# crawlforge-mcp-server 3.0.16

Step 2: Get Your API Key

  1. Go to crawlforge.dev/signup and create an account.
  2. Open the dashboard at crawlforge.dev/dashboard/api-keys.
  3. Copy the key — it starts with cf_live_.

Step 3: Register the MCP Server with Claude Code

The fastest path is the setup wizard:

npx crawlforge-setup

It writes the correct entry to ~/.config/claude-code/mcp.json (Linux/macOS) or %APPDATA%\claude-code\mcp.json (Windows).

Prefer manual configuration? Here's the exact JSON

Add this to your Claude Code MCP config file:

{
  "mcpServers": {
    "crawlforge": {
      "command": "crawlforge-mcp-server",
      "env": {
        "CRAWLFORGE_API_KEY": "cf_live_your_key_here"
      }
    }
  }
}

Restart Claude Code so it picks up the new server.


Step 4: Verify the Connection

Open Claude Code and run:

/mcp

You should see crawlforge listed as connected with 20 tools available. If not, jump to Troubleshooting.

Step 5: Your First Scrape

Paste this prompt into Claude Code:

Fetch https://news.ycombinator.com using CrawlForge and give me the
top 5 story titles with their URLs as a JSON array.

Claude Code will call fetch_url (1 credit), parse the HTML, and return something like:

[
  { "title": "Show HN: My side project", "url": "https://example.com/post/1" },
  { "title": "Why X is changing Y", "url": "https://example.com/post/2" }
]

That is it. You are scraping.
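For the curious, the `fetch_url` call Claude Code just made maps to a plain HTTP request against the CrawlForge API, by analogy with the `scrape_structured` example later in this guide (the `/api/v1/tools/fetch_url` path is an assumption, not something documented here). A small builder makes the shape explicit:

```javascript
// Sketch: construct the request that a fetch_url tool call corresponds to.
// The endpoint path is assumed by analogy with the scrape_structured example.
function buildFetchUrlRequest(targetUrl, apiKey) {
  return {
    endpoint: 'https://crawlforge.dev/api/v1/tools/fetch_url',
    options: {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ url: targetUrl }),
    },
  };
}

// Usage (requires network access and a real key):
// const { endpoint, options } = buildFetchUrlRequest(
//   'https://news.ycombinator.com', process.env.CRAWLFORGE_API_KEY);
// const result = await fetch(endpoint, options).then(r => r.json());
```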

Full Working Example: Scrape a Pricing Page

Here is a realistic task: extract pricing tiers from a SaaS site. Paste this prompt:

Use scrape_structured to extract pricing from https://crawlforge.dev/pricing.
Return an array of { plan, price, credits, features[] }.

See what Claude Code builds behind the scenes
// What Claude Code sends to CrawlForge via MCP
const response = await fetch('https://crawlforge.dev/api/v1/tools/scrape_structured', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.CRAWLFORGE_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    url: 'https://crawlforge.dev/pricing',
    selectors: {
      plan: '.pricing-card h3',
      price: '.pricing-card .price',
      credits: '.pricing-card .credits',
      features: '.pricing-card ul li',
    },
  }),
});

const data = await response.json();
console.log(data.plan); // ['Free', 'Hobby', 'Professional', 'Business']

Cost: 2 credits. Compare that to running a headless browser locally: zero infrastructure, no Puppeteer debugging, no Cloudflare roulette.

Advanced: Scrape JavaScript-Rendered Sites

Some sites render pricing or product data through client-side React. fetch_url returns the pre-hydration HTML shell and misses the data. Switch to scrape_with_actions (5 credits):

scrape_with_actions — clicks, waits, and scrolls for SPAs
// Prompt Claude Code with the goal, and it generates this call
const payload = {
  url: 'https://app.example.com/dashboard',
  actions: [
    { type: 'wait', selector: '.data-grid', timeout: 5000 },
    { type: 'click', selector: 'button.load-more' },
    { type: 'wait', timeout: 1500 },
  ],
  formats: ['json', 'markdown'],
};

const result = await fetch('https://crawlforge.dev/api/v1/tools/scrape_with_actions', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.CRAWLFORGE_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify(payload),
}).then(r => r.json());

For Cloudflare and Akamai-protected sites, use stealth_mode (also 5 credits). The fingerprint-rotation tradeoffs are covered in the stealth mode deep dive.

Tool Quick Reference

Tool                 Credits   When to use
fetch_url            1         Static HTML that you will parse yourself
extract_text         1         Clean, readable text from article pages
extract_content      2         Readability-style main-content extraction
scrape_structured    2         CSS selectors mapped to typed fields
search_web           5         You do not know the URL yet
scrape_with_actions  5         SPAs that need clicks, waits, or scrolls
stealth_mode         5         Anti-bot systems (Cloudflare, DataDome)
deep_research        10        Multi-source research with citations

Full list in the 20-tools overview.
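The per-tool costs in the table above make it easy to budget against the free tier's 1,000 monthly credits. A throwaway estimator:

```javascript
// Credit costs from the quick-reference table above.
const CREDIT_COSTS = {
  fetch_url: 1,
  extract_text: 1,
  extract_content: 2,
  scrape_structured: 2,
  search_web: 5,
  scrape_with_actions: 5,
  stealth_mode: 5,
  deep_research: 10,
};

// Sum credits for a plan given as [tool, numberOfCalls] pairs.
function estimateCredits(plan) {
  return plan.reduce((total, [tool, calls]) => total + CREDIT_COSTS[tool] * calls, 0);
}

// e.g. 200 plain fetches, 50 structured scrapes, 20 stealth requests:
console.log(estimateCredits([
  ['fetch_url', 200],
  ['scrape_structured', 50],
  ['stealth_mode', 20],
])); // 200 + 100 + 100 = 400 credits
```

That hypothetical month lands at 400 credits, comfortably inside the free tier.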

Troubleshooting

MCP server failed to start

Confirm crawlforge-mcp-server is on your PATH:

which crawlforge-mcp-server

If empty, reinstall globally:

npm install -g crawlforge-mcp-server

Unauthorized or 401 errors

Your API key is missing or malformed. It must start with cf_live_. Re-export in your shell and restart Claude Code:

export CRAWLFORGE_API_KEY="cf_live_..."

Insufficient credits

Check usage at crawlforge.dev/dashboard/usage. Free tier = 1,000 credits/month. Upgrade to Hobby ($19/mo) for 25,000.

Tools list is empty when running /mcp

MCP config is not being read. Check the right path for your OS:

  • macOS: ~/Library/Application Support/claude-code/mcp.json
  • Linux: ~/.config/claude-code/mcp.json
  • Windows: %APPDATA%\claude-code\mcp.json

Cloudflare 403 on every fetch

Swap fetch_url for stealth_mode. If you still get blocked, the target uses server-side JA3/JA4 fingerprinting — open a GitHub issue with the URL.
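If you script against the API directly, the swap can be automatic. This sketch reuses the endpoint pattern from the earlier examples and takes a fetch-like function as a parameter so the retry logic is testable without network access; the escalation rule itself (403 means try stealth) is the one described above:

```javascript
// Sketch: try the 1-credit fetch_url first, escalate to the 5-credit
// stealth_mode only when the target answers 403.
async function fetchWithStealthFallback(url, apiKey, fetchImpl = fetch) {
  const callTool = (tool) =>
    fetchImpl(`https://crawlforge.dev/api/v1/tools/${tool}`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ url }),
    });

  const first = await callTool('fetch_url');
  if (first.status !== 403) return first;
  return callTool('stealth_mode');
}
```

Note the cost asymmetry: the fallback only spends the extra 4 credits on pages that actually block the cheap path.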

FAQ

Can Claude Code scrape websites without CrawlForge?

Not reliably. Claude Code has no built-in network access, and the WebFetch helper in Claude Desktop is rate-limited and blocked by most anti-bot systems. CrawlForge MCP adds 20 dedicated scraping tools that handle static HTML, JavaScript-rendered pages, and Cloudflare-protected sites.

How much does it cost to scrape with Claude Code?

CrawlForge uses a credit model: basic fetches cost 1 credit, structured extraction 2 credits, search 5, stealth 5, deep research 10. Free accounts get 1,000 credits per month with no credit card. The Hobby plan ($19/mo) includes 25,000 credits.

Why do I get 403 errors when Claude Code fetches certain URLs?

Sites protected by Cloudflare, DataDome, or Akamai block generic HTTP clients via TLS fingerprinting and JavaScript challenges. Switch from fetch_url (1 credit) to stealth_mode (5 credits), which rotates browser fingerprints and solves challenges automatically.

Does CrawlForge MCP work with Claude Code on Windows?

Yes. Install via npm, run npx crawlforge-setup, and the config lands at %APPDATA%\claude-code\mcp.json. Node.js 18+ is the only system requirement. Windows users should run the setup command from PowerShell or Windows Terminal for the cleanest experience.

What is the difference between fetch_url and scrape_with_actions?

fetch_url returns raw HTML via a fast HTTP request (1 credit). scrape_with_actions spins up a headless browser, executes clicks/waits/scrolls, then captures the hydrated DOM (5 credits). Use fetch_url for static pages and scrape_with_actions only when JavaScript rendering is required.
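That rule of thumb fits in a one-line helper (the option names here are made up for illustration):

```javascript
// Prefer the 1-credit fetch unless the page needs JavaScript rendering
// or user actions, per the FAQ answer above.
function chooseTool({ needsJsRendering = false, needsActions = false } = {}) {
  return needsJsRendering || needsActions ? 'scrape_with_actions' : 'fetch_url';
}

console.log(chooseTool());                           // fetch_url
console.log(chooseTool({ needsJsRendering: true })); // scrape_with_actions
```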


Next Steps

Start Free — 1,000 Credits Included
