Note: Both APIs are capable – this guide is for developers prioritizing pay-as-you-go pricing and no subscription requirements.
## TL;DR
To migrate from Firecrawl to AlterLab, install the alterlab package, replace your FirecrawlApp instantiation with alterlab.Client, and update your API key. The scraping parameters translate directly, allowing you to switch without rewriting your core extraction logic. You can complete the migration in under an hour.
## Why migrate?
If you are evaluating alternatives to Firecrawl, the most common reason to switch is pricing structure. Firecrawl relies on credit-based billing with monthly plan minimums. AlterLab uses a pure pay-as-you-go model where you only pay for successful requests, and your account balance never expires. Read our detailed Firecrawl comparison for more context.
Beyond pricing, AlterLab provides fine-grained control over browser rendering, intelligent tier routing to bypass captchas, and a unified API for scraping, monitoring, and AI extraction. Our architecture handles scale without requiring you to manage concurrent job queues or complex asynchronous workflows manually.
## Prerequisites
Before you start the migration, you need:
- An AlterLab account (free sign-up)
- An active API key
- A Python 3.8+ or Node.js environment
- 5 minutes to update your code
## Step 1: Install the AlterLab SDK
The fastest way to migrate is using the AlterLab Python SDK. You can also use the REST API directly if you prefer writing your own HTTP wrappers. See our Getting started guide for full installation details.
```bash title="Terminal — Install AlterLab"
pip install alterlab
```
For Node.js users, the process is identical using npm:
```bash title="Terminal — Install AlterLab Node"
npm install @alterlab/client
```
## Step 2: Replace your API calls
AlterLab is designed to be highly compatible with existing scraping pipelines. You only need to swap the initialization and the core scraping method.
Here is what your Firecrawl implementation likely looks like:
```python title="before_firecrawl.py"
# Firecrawl (before migration)
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="fc-YOUR_API_KEY")
response = app.scrape_url('https://example.com')
print(response.get('markdown'))
print(response.get('metadata').get('title'))
```
Here is the equivalent implementation after migrating to AlterLab:
```python title="after_alterlab.py" {3-8}
# AlterLab (after migration)
import alterlab
client = alterlab.Client(api_key="al-YOUR_API_KEY")
response = client.scrape(
    url="https://example.com",
    formats=["markdown", "html"]
)
print(response.markdown)
print(response.metadata.title)
```
The primary difference is specifying the output formats in the request rather than extracting them from a monolithic response dictionary. This reduces bandwidth overhead by only fetching the formats you actually need.
## Step 3: Handle response format differences
Firecrawl returns a dictionary containing metadata, markdown, HTML, and other fields depending on the request. AlterLab returns a strongly-typed response object, which provides better IDE autocompletion and type safety.
Instead of dictionary lookups, access properties directly on the response object:
- `response.get('html')` becomes `response.html`
- `response.get('markdown')` becomes `response.markdown`
- `response.get('metadata')` becomes `response.metadata`
If you are passing the response directly into an LLM or a downstream pipeline, update the accessor syntax to match the AlterLab object structure. The markdown output generated by AlterLab is optimized for LLM context windows, stripping navigation elements, footers, and boilerplate automatically.
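If you are migrating gradually, downstream code that still expects Firecrawl-style dictionaries can be bridged with a small adapter. The sketch below is an illustration only and is not part of either SDK: `to_legacy_dict` is a hypothetical helper, and it assumes nothing beyond the `html`, `markdown`, and `metadata` attributes listed above. A `SimpleNamespace` stands in for a real response so the snippet runs without a network call.

```python
from types import SimpleNamespace
from typing import Any, Dict

# Attribute names taken from the accessor list above.
FIELDS = ("html", "markdown", "metadata")


def to_legacy_dict(response: Any) -> Dict[str, Any]:
    """Copy known attributes into a dict so existing .get() calls keep working.

    Hypothetical helper: attributes missing on the response map to None,
    mirroring what dict.get() returns for an absent key.
    """
    return {name: getattr(response, name, None) for name in FIELDS}


# Stand-in for a response object (an assumption, used instead of a live call).
fake_response = SimpleNamespace(
    markdown="# Example Domain",
    metadata=SimpleNamespace(title="Example Domain"),
)

legacy = to_legacy_dict(fake_response)
print(legacy.get("markdown"))    # the markdown string
print(legacy.get("html"))        # None, because the stand-in has no html
print(legacy["metadata"].title)  # metadata is still an object, not a dict
```

Note that the adapter is shallow: nested values such as `metadata` remain objects, so chained lookups like `.get('metadata').get('title')` still need to be rewritten as attribute access.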
### Format Conversion Pipelines
Firecrawl often requires you to write custom parsing logic after retrieving the markdown. AlterLab natively supports multiple formats in a single request.
You can request json, markdown, and text simultaneously.
```python title="multi_format.py" {4}
response = client.scrape(
    url="https://example.com",
    formats=["markdown", "json", "text"]
)
# Use text for a rough, whitespace-based token estimate
token_count = len(response.text.split())
# Use markdown for LLM context
llm_prompt = f"Summarize this: {response.markdown}"
```
The `text` format is specifically designed for RAG (Retrieval-Augmented Generation) pipelines, stripping all HTML and markdown formatting to provide clean, readable prose.
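A RAG pipeline usually splits that plain text into overlapping chunks before embedding it. The following is a minimal sketch of one common approach (fixed-size word windows); `chunk_words` and its default sizes are illustrative assumptions, not an AlterLab feature, and a hard-coded string stands in for `response.text`.

```python
from typing import List


def chunk_words(text: str, size: int = 200, overlap: int = 20) -> List[str]:
    """Split text into word windows of `size`, sharing `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")
    words = text.split()
    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once the current window already reaches the end of the text.
        if start + size >= len(words):
            break
    return chunks


# Stand-in for response.text (an assumption, used instead of a live scrape).
sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(sample, size=4, overlap=1)
for chunk in chunks:
    print(chunk)
```

Word windows are the simplest option; splitting on sentence or paragraph boundaries tends to give more coherent chunks at the cost of uneven sizes.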
### Migrating Batch Processing
If you use Firecrawl to scrape multiple URLs simultaneously, you likely map over the `scrape_url` method or use their async batch endpoints. AlterLab handles batching seamlessly.
Pass a list of URLs directly to the `scrape` method. AlterLab automatically parallelizes the requests and returns a list of response objects.
```python title="batch_processing.py" {4-7}
import alterlab
client = alterlab.Client()  # picks up ALTERLAB_API_KEY from the environment
responses = client.scrape(
    urls=["https://example.com/1", "https://example.com/2"],
    formats=["markdown"]
)
for response in responses:
    print(response.markdown)
```
### Migrating LLM Extraction
If you use Firecrawl's LLM extraction capabilities to convert raw text into structured JSON, you will migrate to AlterLab's Cortex AI. The schema definition remains exactly the same, using standard JSON Schema.
```python title="cortex_extraction.py" {3-17}
# AlterLab Cortex AI Extraction
import alterlab
client = alterlab.Client()
response = client.extract(
    url="https://example.com/products",
    schema={
        "type": "object",
        "properties": {
            "products": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "price": {"type": "number"}
                    }
                }
            }
        }
    }
)
print(response.data.products)
```
Cortex AI natively understands complex DOM structures and requires no manual CSS selectors.
## Step 4: Update your error handling
AlterLab simplifies error handling through intelligent tier routing. If a scrape fails due to an anti-bot block on a standard residential proxy, AlterLab automatically escalates the request using a higher tier. Tier 1 handles standard sites using cURL. Tier 3 introduces headless browsers for JavaScript rendering. Tier 5 resolves complex captchas and Cloudflare challenges.
You do not need to implement manual retry logic for blocks. The API either returns the data or raises a terminal exception.
Catch `alterlab.errors.ScrapeError` for terminal failures:
```python title="error_handling.py" {7-9}
import alterlab
from alterlab.errors import ScrapeError
client = alterlab.Client(api_key="al-YOUR_API_KEY")

try:
    response = client.scrape("https://example.com")
except ScrapeError as e:
    print(f"Scrape failed: {e.message}")
    print(f"Status code: {e.status_code}")
```
Remove any exponential backoff or retry decorators you previously used to handle 429 Too Many Requests errors from the target site. AlterLab manages proxy rotation, rate limits, and concurrent session tracking internally.
## Cost comparison
When you migrate, your billing shifts from monthly subscriptions to strict usage-based billing. See the full AlterLab pricing page for details.
Here is a practical comparison:
With AlterLab, 10,000 basic HTML requests cost exactly $2.00. You add funds to your balance, and those funds remain available until you use them. There are no monthly quotas to manage, no overage penalties, and no credits that expire at the end of the billing cycle.
Firecrawl pricing requires you to select a monthly tier, which means you pay the fixed price even if your scraping volume drops. AlterLab aligns costs directly with your infrastructure usage.
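To budget from the figure quoted above ($2.00 per 10,000 basic HTML requests, which works out to $0.0002 per request), a small estimator can help. This is a rough sketch: it assumes the quoted rate scales linearly and applies only to basic HTML requests, and `estimate_cost` is a hypothetical helper rather than anything provided by AlterLab. Check the pricing page for rates on other request types.

```python
from decimal import Decimal

# Rate quoted above: $2.00 per 10,000 basic HTML requests.
RATE_PER_10K = Decimal("2.00")


def estimate_cost(successful_requests: int) -> Decimal:
    """Estimate spend in dollars, assuming the quoted rate scales linearly."""
    if successful_requests < 0:
        raise ValueError("request count cannot be negative")
    per_request = RATE_PER_10K / Decimal(10_000)  # $0.0002 per request
    return (per_request * successful_requests).quantize(Decimal("0.01"))


for volume in (1_000, 10_000, 250_000):
    print(f"{volume:>7,} requests -> ${estimate_cost(volume)}")
```

`Decimal` is used instead of `float` so that small per-request amounts do not accumulate rounding error.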
## Migrating Advanced Workflows
### Webhooks
If you rely on Firecrawl webhooks to push scraped data to your servers, you must update your endpoint to receive the AlterLab JSON payload structure. AlterLab webhooks trigger immediately upon task completion, removing the need to poll for status updates.
To configure a webhook in AlterLab, pass the webhook_url parameter during your scrape request. The core data payload remains the same, but the wrapper keys differ.
```python title="webhooks.py" {4}
response = client.scrape(
    url="https://example.com",
    formats=["json"],
    webhook_url="https://your-server.com/webhooks/alterlab"
)
```
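Because the wrapper keys differ between providers, it can help to stand up a throwaway receiver and inspect what actually arrives before rewriting your endpoint. The sketch below uses only the Python standard library. The example payload and its `status` and `data` keys are invented for illustration and are not the documented AlterLab payload; `parse_webhook_body` and `WebhookHandler` are hypothetical names.

```python
import json
from http.server import BaseHTTPRequestHandler
from typing import Any, Dict


def parse_webhook_body(raw: bytes) -> Dict[str, Any]:
    """Decode a JSON webhook body and require a top-level object."""
    payload = json.loads(raw.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object at the top level")
    return payload


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = parse_webhook_body(self.rfile.read(length))
        except ValueError:  # also covers JSON and UTF-8 decoding errors
            self.send_response(400)
            self.end_headers()
            return
        # Log the wrapper keys so they can be compared with the old payload.
        print("top-level keys:", sorted(payload))
        self.send_response(204)
        self.end_headers()


# To run locally (blocks until interrupted):
#   from http.server import HTTPServer
#   HTTPServer(("127.0.0.1", 8000), WebhookHandler).serve_forever()

# Invented payload, used only to exercise the parser.
example = parse_webhook_body(b'{"status": "done", "data": {"markdown": "# Hi"}}')
print(sorted(example))
```

Once you know the real key names, map them onto whatever structure your existing handler expects instead of changing the downstream code.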
### Scheduling and Diff Monitoring
Many developers run Firecrawl from cron jobs on their own servers to track page changes over time. AlterLab handles this natively, so you can move your local cron schedules directly to AlterLab's infrastructure.
```python title="scheduling.py" {4-5}
client.schedules.create(
    url="https://example.com/pricing",
    formats=["markdown"],
    cron="0 0 * * *",
    detect_diff=True
)
```
This configuration runs the scrape daily at midnight and only triggers a webhook if the markdown content has changed since the previous run.
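If your existing cron job did its own change detection, the logic you are retiring probably resembles the sketch below, which compares a hash of the current markdown with the one stored from the previous run. It illustrates the general idea only and makes no claim about how `detect_diff` is implemented; `content_fingerprint` and `has_changed` are hypothetical helpers.

```python
import hashlib
from typing import Optional


def content_fingerprint(markdown: str) -> str:
    """Hash the markdown after trimming whitespace that does not affect content."""
    normalized = "\n".join(line.rstrip() for line in markdown.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint: Optional[str], markdown: str) -> bool:
    """Return True when there is no previous run or the content differs."""
    return previous_fingerprint != content_fingerprint(markdown)


first = content_fingerprint("# Pricing\nPro: $20")
print(has_changed(first, "# Pricing\nPro: $20  \n"))  # False: whitespace only
print(has_changed(first, "# Pricing\nPro: $25"))      # True: price changed
print(has_changed(None, "# Pricing\nPro: $20"))       # True: no previous run
```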
### Team Collaboration and API Keys
If you are migrating a team from Firecrawl, AlterLab simplifies access control. In Firecrawl, teams often share a single API key or struggle with segmented billing. AlterLab provides native multi-user organizations with shared billing.
You can issue scoped API keys for different environments:
- `al-prod-...` with a $100/day spend limit
- `al-dev-...` with a $5/day spend limit
This ensures a rogue script in development cannot drain your account balance. You configure these limits in the AlterLab dashboard without modifying your application code.
## Common issues and fixes
- Missing format error: AlterLab requires you to specify the formats you need (e.g., `formats=["json", "markdown"]`). If you omit this parameter, the API defaults to raw HTML. Explicitly declare your required formats to ensure consistent pipeline behavior and minimize bandwidth.
- Timeout limits: AlterLab defaults to a 60-second timeout. If you are scraping extremely slow, JavaScript-heavy sites, increase this limit via `client.scrape(url, timeout=120)`. The maximum allowed timeout is 300 seconds.
- Authentication: Ensure your `ALTERLAB_API_KEY` is loaded in your environment variables. The Python SDK will automatically pick it up if you omit the `api_key` parameter during client initialization.
- JavaScript rendering: Firecrawl attempts to automatically detect when JavaScript rendering is needed. In AlterLab, you can explicitly control this by setting the minimum proxy tier. Set `min_tier=3` to guarantee headless browser execution for Single Page Applications (SPAs) built with React or Vue.
- Pagination handling: If your Firecrawl script handled pagination by manually finding links, you can migrate this logic directly. AlterLab's Cortex AI can also extract pagination URLs automatically if you add a `next_page_url` field to your JSON schema.
- Geolocation restrictions: If you need to scrape sites that restrict access by country, specify the proxy location in your AlterLab request: `client.scrape(url, geo="us")`. We support over 40 country codes natively without requiring third-party proxy integrations.
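For the pagination item, the schema change can be as small as one extra property. The sketch below extends the product schema from the extraction example with a top-level `next_page_url` field. The helper name `with_next_page` and the `description` text are assumptions for illustration; the snippet only builds the schema dictionary and does not call the API.

```python
import copy
import json
from typing import Any, Dict

# Product schema from the extraction example earlier in this guide.
product_schema: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "products": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "price": {"type": "number"},
                },
            },
        }
    },
}


def with_next_page(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the schema with a top-level next_page_url field."""
    extended = copy.deepcopy(schema)  # leave the caller's schema untouched
    extended.setdefault("properties", {})["next_page_url"] = {
        "type": "string",
        "description": "Absolute URL of the next results page, if one exists",
    }
    return extended


paged_schema = with_next_page(product_schema)
print(json.dumps(paged_schema["properties"]["next_page_url"], indent=2))
```

You would then pass `paged_schema` as the `schema` argument and keep requesting pages until the field comes back empty.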
## You're done
That is the entire migration process. Swap the client library, update the API key, and adjust your response object accessors. Your existing data extraction logic, LLM prompts, and downstream data pipelines will continue working as normal.
Contact our support team if you encounter any unexpected behavior during your transition. We monitor API logs 24/7 and can help with custom extraction schemas if you hit edge cases.