AlterLab

Posted on Jun 18 • Originally published at alterlab.io

How to Migrate from Firecrawl to AlterLab: Step-by-Step Guide (2026)

#antibot #automation #python #dataextraction

Note: Both APIs are capable – this guide is for developers prioritizing pay-as-you-go pricing and no subscription requirements.

TL;DR

To migrate from Firecrawl to AlterLab, install the alterlab package, replace your FirecrawlApp instantiation with alterlab.Client, and update your API key. The scraping parameters translate directly, allowing you to switch without rewriting your core extraction logic. You can complete the migration in under an hour.

Why migrate?

If you are evaluating alternatives to Firecrawl, the most common reason to switch is pricing structure. Firecrawl relies on credit-based billing with monthly plan minimums. AlterLab uses a pure pay-as-you-go model where you only pay for successful requests, and your account balance never expires. Read our detailed Firecrawl comparison for more context.

Beyond pricing, AlterLab provides fine-grained control over browser rendering, intelligent tier routing to bypass captchas, and a unified API for scraping, monitoring, and AI extraction. Our architecture handles scale without requiring you to manage concurrent job queues or complex asynchronous workflows manually.

Prerequisites

Before you start the migration, you need:

An AlterLab account (free sign-up)
An active API key
Python 3.8+ or Node.js environment
5 minutes to update your code

Step 1: Install the AlterLab SDK

The fastest way to migrate is using the AlterLab Python SDK. You can also use the REST API directly if you prefer writing your own HTTP wrappers. See our Getting started guide for full installation details.

```bash title="Terminal: Install AlterLab"
pip install alterlab
```

For Node.js users, the process is identical using npm:

```bash title="Terminal: Install AlterLab Node"
npm install @alterlab/client
```

Step 2: Replace your API calls

AlterLab is designed to be highly compatible with existing scraping pipelines. You only need to swap the initialization and the core scraping method.

Here is what your Firecrawl implementation likely looks like:

```python title="before_firecrawl.py"
# Firecrawl (before migration)
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="fc-YOUR_API_KEY")
response = app.scrape_url('https://example.com')

print(response.get('markdown'))
print(response.get('metadata').get('title'))
```

Here is the equivalent implementation after migrating to AlterLab:

```python title="after_alterlab.py" {3-8}
# AlterLab (after migration)
import alterlab

client = alterlab.Client(api_key="al-YOUR_API_KEY")
response = client.scrape(
    url="https://example.com",
    formats=["markdown", "html"]
)

print(response.markdown)
print(response.metadata.title)
```

The primary difference is specifying the output formats in the request rather than extracting them from a monolithic response dictionary. This reduces bandwidth overhead by only fetching the formats you actually need.

Step 3: Handle response format differences

Firecrawl returns a dictionary containing metadata, markdown, HTML, and other fields depending on the request. AlterLab returns a strongly-typed response object, which provides better IDE autocompletion and type safety.

Instead of dictionary lookups, access properties directly on the response object:

response.get('html') becomes response.html
response.get('markdown') becomes response.markdown
response.get('metadata') becomes response.metadata

If you are passing the response directly into an LLM or a downstream pipeline, update the accessor syntax to match the AlterLab object structure. The markdown output generated by AlterLab is optimized for LLM context windows, stripping navigation elements, footers, and boilerplate automatically.
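During an incremental migration, some downstream functions may still receive Firecrawl-style dictionaries (from test fixtures or cached results) while others already expect attribute access. The sketch below shows one way to bridge the two shapes. `to_namespace` is a hypothetical helper, not part of either SDK, and it assumes every dictionary key is a valid Python identifier.

```python
from types import SimpleNamespace


def to_namespace(value):
    """Recursively wrap dicts so their keys can be read as attributes."""
    if isinstance(value, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in value.items()})
    if isinstance(value, list):
        return [to_namespace(item) for item in value]
    return value


# A cached Firecrawl-style result (illustrative data, not real API output).
legacy_result = {
    "markdown": "# Example Domain",
    "metadata": {"title": "Example Domain"},
}

response = to_namespace(legacy_result)
print(response.markdown)
print(response.metadata.title)
```

With a wrapper like this, code written against `response.markdown` can be exercised with old dictionary fixtures before the live client is swapped in.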

Format Conversion Pipelines

Firecrawl often requires you to write custom parsing logic after retrieving the markdown. AlterLab natively supports multiple formats in a single request.

You can request json, markdown, and text simultaneously.

```python title="multi_format.py" {4}
response = client.scrape(
    url="https://example.com",
    formats=["markdown", "json", "text"]
)

# Use text for a rough token estimate (counts whitespace-separated words)
token_count = len(response.text.split())

# Use markdown for LLM context
llm_prompt = f"Summarize this: {response.markdown}"
```




The `text` format is specifically designed for RAG (Retrieval-Augmented Generation) pipelines, stripping all HTML and markdown formatting to provide clean, readable prose.
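For a RAG pipeline, the plain text usually has to be split into chunks before embedding. The following is a minimal sketch of word-based chunking with overlap. `chunk_words`, the chunk size, and the overlap are illustrative choices rather than AlterLab features, and the sample string stands in for `response.text`.

```python
def chunk_words(text, chunk_size=200, overlap=20):
    """Split plain text into overlapping chunks of whitespace-separated words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


# Stand-in for response.text; any plain string works here.
sample_text = " ".join(f"word{i}" for i in range(10))
chunks = chunk_words(sample_text, chunk_size=4, overlap=1)
print(chunks)
```

Word counts only approximate model tokens, so leave headroom when sizing chunks against a context window.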

Migrating Batch Processing

If you use Firecrawl to scrape multiple URLs simultaneously, you likely map over the `scrape_url` method or use their async batch endpoints. AlterLab handles batching seamlessly.

Pass a list of URLs directly to the `scrape` method. AlterLab automatically parallelizes the requests and returns a list of response objects.



```python title="batch_processing.py" {4-7}
import alterlab

# With no api_key argument, the SDK reads ALTERLAB_API_KEY from the environment
client = alterlab.Client()
responses = client.scrape(
    urls=["https://example.com/1", "https://example.com/2"],
    formats=["markdown"]
)

for response in responses:
    print(response.markdown)
```

Migrating LLM Extraction

If you use Firecrawl's LLM extraction capabilities to convert raw text into structured JSON, you will migrate to AlterLab's Cortex AI. The schema definition remains exactly the same, using standard JSON Schema.

```python title="cortex_extraction.py" {3-17}
# AlterLab Cortex AI Extraction
import alterlab

client = alterlab.Client()
response = client.extract(
    url="https://example.com/products",
    schema={
        "type": "object",
        "properties": {
            "products": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "price": {"type": "number"}
                    }
                }
            }
        }
    }
)

print(response.data.products)
```




Cortex AI natively understands complex DOM structures and requires no manual CSS selectors.
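The example schema declares no `required` fields, and in JSON Schema a property is optional unless it is listed under `required`. A result that conforms to the schema could therefore include a product with no name or price. How Cortex AI treats missing values is not covered here, so a defensive check before the data reaches your pipeline is a reasonable precaution. The sketch below uses a plain dictionary as a stand-in for the extracted data; `valid_products` is a hypothetical helper.

```python
def valid_products(data):
    """Keep only products with a string name and a numeric (non-bool) price."""
    kept = []
    for item in data.get("products", []):
        name = item.get("name")
        price = item.get("price")
        is_number = isinstance(price, (int, float)) and not isinstance(price, bool)
        if isinstance(name, str) and is_number:
            kept.append(item)
    return kept


# Illustrative payload shaped like the schema above, written as a plain dict.
extracted = {
    "products": [
        {"name": "Widget", "price": 19.99},
        {"name": "Gadget"},                 # price missing
        {"name": "Gizmo", "price": "N/A"},  # wrong type
    ]
}

print(valid_products(extracted))
```

Adding `"required": ["name", "price"]` to the item schema is the declarative way to express the same constraint.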

Step 4: Update your error handling

AlterLab simplifies error handling through intelligent tier routing. If a scrape fails due to an anti-bot block on a standard residential proxy, AlterLab automatically escalates the request using a higher tier. Tier 1 handles standard sites using cURL. Tier 3 introduces headless browsers for JavaScript rendering. Tier 5 resolves complex captchas and Cloudflare challenges.

You do not need to implement manual retry logic for blocks. The API returns the data or throws a terminal exception.

Catch `alterlab.errors.ScrapeError` for terminal failures:



```python title="error_handling.py" {7-9}
import alterlab
from alterlab.errors import ScrapeError

client = alterlab.Client(api_key="al-YOUR_API_KEY")

try:
    response = client.scrape("https://example.com")
except ScrapeError as e:
    print(f"Scrape failed: {e.message}")
    print(f"Status code: {e.status_code}")
```

Remove any exponential backoff or retry decorators you previously used to handle 429 Too Many Requests errors from the target site. AlterLab manages proxy rotation, rate limits, and concurrent session tracking internally.
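Once the retry decorators are gone, a loop over many URLs only needs to separate successes from terminal failures. The sketch below illustrates that pattern with stand-ins so it runs without the SDK: `FakeScrapeError` mimics the `message` and `status_code` attributes used in the snippet above, and `scrape_all` and `fake_scrape` are hypothetical names.

```python
class FakeScrapeError(Exception):
    """Stand-in for alterlab.errors.ScrapeError so this sketch runs offline."""

    def __init__(self, message, status_code):
        super().__init__(message)
        self.message = message
        self.status_code = status_code


def scrape_all(urls, scrape_fn, terminal_error):
    """Call scrape_fn once per URL; record terminal failures instead of retrying."""
    results, failures = {}, {}
    for url in urls:
        try:
            results[url] = scrape_fn(url)
        except terminal_error as exc:
            failures[url] = (exc.status_code, exc.message)
    return results, failures


def fake_scrape(url):
    # Simulates one URL that fails terminally.
    if url.endswith("/blocked"):
        raise FakeScrapeError("target refused the request", 403)
    return f"content of {url}"


results, failures = scrape_all(
    ["https://example.com/ok", "https://example.com/blocked"],
    fake_scrape,
    FakeScrapeError,
)
print(results)
print(failures)
```

In real code you would pass `client.scrape` and `ScrapeError` in place of the two stand-ins.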

Cost comparison

When you migrate, your billing shifts from monthly subscriptions to strict usage-based billing. See the full AlterLab pricing page for details.

Here is a practical comparison:

With AlterLab, 10,000 basic HTML requests cost exactly $2.00. You add funds to your balance, and those funds remain available until you use them. There are no monthly quotas to manage, no overage penalties, and no credits that expire at the end of the billing cycle.
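The quoted figure works out to $0.0002 per basic HTML request. As a rough budgeting aid, the sketch below scales that rate to other volumes. It assumes cost grows linearly with the number of successful requests and covers only the basic HTML rate; requests that need higher tiers may be priced differently.

```python
from decimal import Decimal

# Rate quoted above: $2.00 per 10,000 basic HTML requests.
PRICE_PER_10K = Decimal("2.00")


def estimated_cost(successful_requests):
    """Estimate spend in USD, assuming cost scales linearly with request count."""
    return (PRICE_PER_10K * successful_requests / 10_000).quantize(Decimal("0.0001"))


print(estimated_cost(1))        # a single request
print(estimated_cost(2_500))    # a low-volume month
print(estimated_cost(10_000))   # the volume quoted above
```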

Firecrawl pricing requires you to select a monthly tier, which means you pay the fixed price even if your scraping volume drops. AlterLab aligns costs directly with your infrastructure usage.

Migrating Advanced Workflows

Webhooks

If you rely on Firecrawl webhooks to push scraped data to your servers, you must update your endpoint to receive the AlterLab JSON payload structure. AlterLab webhooks trigger immediately upon task completion, removing the need to poll for status updates.
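Because the exact payload keys are not listed in this guide, one cautious approach is to have the receiver accept any JSON object and log its top-level keys before you map fields. The sketch below uses only the Python standard library. The port and the sample payload are made up for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_webhook_body(raw):
    """Decode a webhook body and require a JSON object at the top level."""
    payload = json.loads(raw.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    return payload


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = parse_webhook_body(self.rfile.read(length))
        except ValueError:  # also covers json.JSONDecodeError
            self.send_response(400)
            self.end_headers()
            return
        # Log the wrapper keys so the old and new payload shapes can be compared.
        print("received keys:", sorted(payload))
        self.send_response(200)
        self.end_headers()


def serve(port=8000):
    """Run the receiver locally; the port is an arbitrary choice."""
    HTTPServer(("", port), WebhookHandler).serve_forever()


# Offline check with a made-up payload; real payload keys will differ.
print(sorted(parse_webhook_body(b'{"data": {}, "status": "done"}')))
```

Call `serve()` to run the receiver locally while you compare the logged keys against what your existing handler expects.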

To configure a webhook in AlterLab, pass the webhook_url parameter during your scrape request. The core data payload remains the same, but the wrapper keys differ.

```python title="webhooks.py" {4}
response = client.scrape(
    url="https://example.com",
    formats=["json"],
    webhook_url="https://your-server.com/webhooks/alterlab"
)
```




Scheduling and Diff Monitoring

Many developers run Firecrawl within CRON jobs on their own servers to track page changes over time. AlterLab handles this natively. You can migrate your local CRON schedules directly to AlterLab's infrastructure.



```python title="scheduling.py" {4-5}
client.schedules.create(
    url="https://example.com/pricing",
    formats=["markdown"],
    cron="0 0 * * *",
    detect_diff=True
)
```

This configuration runs the scrape daily at midnight and only triggers a webhook if the markdown content has changed since the previous run.
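If you keep your own history of snapshots, or want to double-check a delivery on the receiving side, the same idea can be approximated locally. The sketch below fingerprints the markdown with SHA-256 after trimming trailing whitespace and compares it with the previous run. It illustrates the general technique only; how `detect_diff` compares content internally is not described in this guide.

```python
import hashlib


def fingerprint(markdown):
    """Hash markdown after trimming trailing whitespace on each line."""
    normalized = "\n".join(line.rstrip() for line in markdown.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous, current):
    """Return True when two snapshots differ beyond trailing whitespace."""
    return fingerprint(previous) != fingerprint(current)


yesterday = "# Pricing\n\nStarter: $10  \n"
today_same = "# Pricing\n\nStarter: $10\n"
today_new = "# Pricing\n\nStarter: $12\n"

print(has_changed(yesterday, today_same))
print(has_changed(yesterday, today_new))
```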

Team Collaboration and API Keys

If you are migrating a team from Firecrawl, AlterLab simplifies access control. In Firecrawl, teams often share a single API key or struggle with segmented billing. AlterLab provides native multi-user organizations with shared billing.

You can issue scoped API keys for different environments:

al-prod-... with a $100/day spend limit
al-dev-... with a $5/day spend limit

This ensures a rogue script in development cannot drain your account balance. You configure these limits in the AlterLab dashboard without modifying your application code.

Common issues and fixes

Missing format error: AlterLab requires you to specify the formats you need (e.g., formats=["json", "markdown"]). If you omit this parameter, the API defaults to raw HTML. Explicitly declare your required formats to ensure consistent pipeline behavior and minimize bandwidth.
Timeout limits: AlterLab defaults to a 60-second timeout. If you are scraping extremely slow, JavaScript-heavy sites, increase this limit via client.scrape(url, timeout=120). The maximum allowed timeout is 300 seconds.
Authentication: Ensure your ALTERLAB_API_KEY is loaded in your environment variables. The Python SDK will automatically pick it up if you omit the api_key parameter during client initialization.
JavaScript rendering: Firecrawl attempts to automatically detect when JavaScript rendering is needed. In AlterLab, you can explicitly control this by setting the minimum proxy tier. Set min_tier=3 to guarantee headless browser execution for Single Page Applications (SPAs) built with React or Vue.
Pagination handling: If your Firecrawl script handled pagination by manually finding links, you can migrate this logic directly. AlterLab's Cortex AI can also extract pagination URLs automatically by adding a next_page_url field to your JSON schema.
Geolocation restrictions: If you need to scrape sites that restrict access by country, specify the proxy location in your AlterLab request: client.scrape(url, geo="us"). We support over 40 country codes natively without requiring third-party proxy integrations.
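Several of these fixes come down to passing the right options on every call. One way to keep them consistent is to validate them in a single place before they reach the client. The sketch below is a hypothetical helper, not part of the SDK. The parameter names `timeout`, `min_tier`, and `geo` and the 60- and 300-second limits come from the notes above. The 1 to 5 tier range is an assumption, since this guide only names tiers 1, 3, and 5.

```python
import os

DEFAULT_TIMEOUT = 60   # seconds, per the timeout note above
MAX_TIMEOUT = 300      # seconds, per the timeout note above


def build_scrape_options(timeout=DEFAULT_TIMEOUT, min_tier=None, geo=None):
    """Validate optional scrape settings and return them as keyword arguments."""
    if not 0 < timeout <= MAX_TIMEOUT:
        raise ValueError(f"timeout must be above 0 and at most {MAX_TIMEOUT} seconds")
    options = {"timeout": timeout}
    if min_tier is not None:
        # Assumption: tiers run from 1 to 5; the guide only names 1, 3, and 5.
        if min_tier not in range(1, 6):
            raise ValueError("min_tier must be between 1 and 5")
        options["min_tier"] = min_tier
    if geo is not None:
        options["geo"] = geo.lower()
    return options


def require_api_key(env=os.environ):
    """Fail fast with a clear message when ALTERLAB_API_KEY is missing."""
    key = env.get("ALTERLAB_API_KEY")
    if not key:
        raise RuntimeError("ALTERLAB_API_KEY is not set")
    return key


print(build_scrape_options(timeout=120, min_tier=3, geo="US"))
```

The resulting dictionary could then be unpacked into a call such as `client.scrape(url, **options)`, assuming the client accepts these keyword arguments together.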

You're done

That is the entire migration process. Swap the client library, update the API key, and adjust your response object accessors. Your existing data extraction logic, LLM prompts, and downstream data pipelines will continue working as normal.

Contact our support team if you encounter any unexpected behavior during your transition. We monitor API logs 24/7 and assist with custom extraction schemas if you hit edge cases.
