Ben Sabic
How to Scrape Web Data and Import It into Memberstack with Firecrawl CLI

You can pull content from any website and load it straight into your Memberstack data tables using two CLI tools: Firecrawl for scraping and Memberstack CLI for the import. No backend code needed. If you're using an AI agent like Claude Code or Codex, you can hand it this guide and it will run the whole workflow for you.

This guide walks through each step, from installation to a working import, with real commands you can copy and run. All three walkthroughs (scrape a single page, crawl a full site, and search the web) follow the same pattern: pull the data, shape it, import it.

What You Need

Before you start, make sure you have:

  • Node.js v18 or higher installed on your machine. Check with node --version.
  • A Firecrawl account for web scraping. Sign up free at firecrawl.dev.
  • A Memberstack account to store the data. Sign up free at memberstack.com.
  • A terminal (or an AI agent with terminal access like Claude Code, Codex, Cursor, or Gemini CLI).

If you're using an AI agent, install both Agent Skills so the agent knows how to use these tools:

npx skills add Flash-Brew-Digital/memberstack-skills --skill memberstack-cli
npx skills add firecrawl/cli --skill firecrawl

Step 1: Install Both CLIs

Install Firecrawl CLI and Memberstack CLI globally with a single command:

npm install -g firecrawl-cli memberstack-cli

Check that both are working:

firecrawl --version
memberstack --version

Step 2: Authenticate

Authenticate with both services. Each tool stores credentials locally so you only do this once.

Firecrawl:

firecrawl login --browser

This opens your browser to sign in and link your API key. For AI agents, the --browser flag handles this without manual prompting.

Memberstack:

memberstack auth login

This opens the Memberstack OAuth page, where you'll be prompted to choose a Memberstack application.

Verify both are connected:

firecrawl --status
memberstack whoami

Step 3: Scrape Your Source Data

Here's where Firecrawl does its work. Choose the approach that matches what you need.

Option A: Scrape a Single Page

Pull clean content from one URL. Good for grabbing a product page, directory listing, or article.

firecrawl scrape https://example.com/products --only-main-content -o .firecrawl/products.md

The --only-main-content flag strips out navigation, footers, and ads, leaving just the useful content.

For structured data like links and metadata alongside the markdown:

firecrawl scrape https://example.com/products --format markdown,links --pretty -o .firecrawl/products.json

Option B: Crawl an Entire Site Section

When you need content from multiple pages, like all docs or all blog posts on a site, use crawl.

First, discover what pages exist:

firecrawl map https://example.com --search "products" -o .firecrawl/product-urls.txt

Then crawl the section:

firecrawl crawl https://example.com --include-paths /products --limit 50 --wait --progress -o .firecrawl/crawl-results.json

The --wait flag keeps the command running until the crawl finishes. --progress shows you how far along it is.

Option C: Search the Web and Scrape Results

If you don't have a specific URL yet, search for what you need:

firecrawl search "organic coffee suppliers wholesale" --scrape --scrape-formats markdown --pretty -o .firecrawl/coffee-suppliers.json

The --scrape flag tells Firecrawl not only to return search results but also to scrape the content of each result page, so you have full page content to work with.

Step 4: Prepare the Data for Memberstack

Memberstack's records import accepts CSV or JSON files. You need to shape your scraped data to match the table columns you want in Memberstack.

Here's where an AI agent really shines. If you're using Claude Code, Codex, or a similar agent, you can say:

"Read the scraped data in .firecrawl/crawl-results.json, extract the page titles, URLs, and descriptions, and create a CSV file with columns: title, url, description."

The agent will write a small script to transform the data. That's it. No coding on your part.

If you prefer to do it yourself, here's a quick example using jq (a command-line JSON tool):

# Extract data from a crawl result into CSV format
echo "title,url,description" > import-data.csv
jq -r '.data[] | [.metadata.title, .metadata.sourceURL, .metadata.description] | @csv' \
  .firecrawl/crawl-results.json >> import-data.csv
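If jq isn't installed, the same transformation can be sketched in Python. This assumes the crawl output has the shape the jq filter above expects — a top-level data array whose items carry metadata.title, metadata.sourceURL, and metadata.description — so verify against your actual file first:

```python
import csv

def crawl_to_rows(crawl_result):
    """Flatten a Firecrawl crawl result (parsed JSON) into CSV rows.

    Assumes a top-level "data" array whose entries expose page metadata
    under "metadata" -- check your own crawl output before relying on this.
    """
    rows = []
    for page in crawl_result.get("data", []):
        meta = page.get("metadata", {})
        rows.append([
            meta.get("title", ""),
            meta.get("sourceURL", ""),
            meta.get("description", ""),
        ])
    return rows

def write_csv(rows, path):
    # Write the header plus one row per scraped page
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["title", "url", "description"])
        writer.writerows(rows)

# Demo on a sample shaped like a crawl result
sample = {"data": [{"metadata": {
    "title": "Organic Beans Co",
    "sourceURL": "https://example.com/products/beans",
    "description": "Fair-trade organic coffee beans",
}}]}
rows = crawl_to_rows(sample)
write_csv(rows, "import-data.csv")
```

For the real file, replace sample with json.load(open(".firecrawl/crawl-results.json")).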

Your CSV file should look something like this:

title,url,description
"Organic Beans Co","https://example.com/products/beans","Fair-trade organic coffee beans"
"Mountain Roast","https://example.com/products/roast","Single-origin dark roast"
"Sunrise Blend","https://example.com/products/blend","Morning blend with notes of chocolate"

For JSON imports, format the data as an array of objects:

[
  {
    "title": "Organic Beans Co",
    "url": "https://example.com/products/beans",
    "description": "Fair-trade organic coffee beans"
  },
  {
    "title": "Mountain Roast",
    "url": "https://example.com/products/roast",
    "description": "Single-origin dark roast"
  }
]
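If you've already built a CSV and want the JSON shape instead, Python's standard library converts between the two in a few lines. The file names here are just examples:

```python
import csv
import json

def csv_to_json(csv_path, json_path):
    """Convert an import CSV into the array-of-objects JSON shown above."""
    with open(csv_path, newline="") as f:
        # Each CSV row becomes one object; headers become the keys
        records = [dict(row) for row in csv.DictReader(f)]
    with open(json_path, "w") as f:
        json.dump(records, f, indent=2)
    return records

# Demo: convert a one-row CSV like the sample above
with open("import-data.csv", "w", newline="") as f:
    f.write('title,url,description\n'
            '"Organic Beans Co","https://example.com/products/beans","Fair-trade organic coffee beans"\n')

records = csv_to_json("import-data.csv", "import-data.json")
```

The resulting import-data.json is an array of objects, matching the format Memberstack's JSON import expects.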

Step 5: Create a Memberstack Data Table

Before importing, you need a table in Memberstack to hold the data. Create one from the CLI:

memberstack tables create --name "Products" --key products --read-rule PUBLIC --create-rule ADMIN_ONLY

The access rules control who can interact with the data:

  • PUBLIC means anyone can read the records, including visitors who aren't logged in.
  • AUTHENTICATED requires a logged-in member.
  • AUTHENTICATED_OWN limits members to their own records.
  • ADMIN_ONLY restricts access to admins.

Choose the rules that make sense for your use case. A product directory might be public. A members-only resource library might need AUTHENTICATED.

Verify the table was created:

memberstack tables list

Step 6: Import the Data

Now load your prepared file into Memberstack:

memberstack records import products --file import-data.csv

Or if you used JSON:

memberstack records import products --file import-data.json

That's it. Your scraped web data is now in a Memberstack data table.

Check the results:

memberstack records find products --take 5
memberstack records count products

Putting It All Together

Here's the complete workflow from start to finish. Copy this block and run it, or give it to your AI agent:

# 1. Scrape a website section
firecrawl crawl https://example.com --include-paths /products --limit 50 --wait -o .firecrawl/crawl-results.json

# 2. Transform the data into CSV
echo "title,url,description" > import-data.csv
jq -r '.data[] | [.metadata.title, .metadata.sourceURL, .metadata.description] | @csv' \
  .firecrawl/crawl-results.json >> import-data.csv

# 3. Create a Memberstack table
memberstack tables create --name "Products" --key products --read-rule PUBLIC --create-rule ADMIN_ONLY

# 4. Import the records
memberstack records import products --file import-data.csv

# 5. Verify
memberstack records count products
memberstack records find products --take 3

For AI Agents: How to Run This Workflow

If you're an AI agent reading this page (or a person who wants to delegate this to an agent), here's the summary:

  1. Install: npm install -g firecrawl-cli memberstack-cli
  2. Authenticate: firecrawl login --browser and memberstack auth login
  3. Scrape: Use firecrawl scrape, firecrawl crawl, or firecrawl search --scrape to get the source data. Save output to a file with -o.
  4. Transform: Read the scraped JSON/markdown and reshape it into a CSV or JSON file matching the target table columns. Column headers become field keys in Memberstack.
  5. Create table: memberstack tables create --name "Your Table" --key your-table --read-rule PUBLIC --create-rule ADMIN_ONLY
  6. Import: memberstack records import your-table --file your-data.csv
  7. Verify: memberstack records count your-table and memberstack records find your-table --take 5

For the Firecrawl CLI, run firecrawl --help or firecrawl <command> --help for full option details. For the Memberstack CLI, run memberstack --help or see the command reference.

Real-World Use Cases

This scrape-and-import pattern works for a lot of scenarios beyond a simple product directory.

Build a resource library. Crawl a collection of industry articles or documentation sites, extract titles, URLs, and summaries, then import them as a curated resource library for your members. Gate access with Memberstack's AUTHENTICATED read rules so only logged-in members can browse the collection.

Populate a supplier or vendor directory. Search for businesses in a niche with firecrawl search, scrape the results to get company names, descriptions, and URLs, and import them into a member-facing directory. Members can browse or filter the data on your Webflow or WordPress frontend.

Seed a course catalog. If you're building an education platform, crawl an existing public course listing site to get course names, descriptions, and categories. Import them as records, then link each to a Memberstack plan so only paid members can access certain courses.

Migrate content from another platform. Crawl your existing site on Teachable, Kajabi, or another platform to extract page content. Reshape it and import it into Memberstack data tables as part of a platform migration.

Keep data fresh. Script the whole workflow as a cron job or CI step. Crawl a source site on a schedule, overwrite or update your Memberstack records with memberstack records bulk-update, and your membership site stays current without manual work.
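For the refresh scenario, you usually want to update only the records that actually changed between crawls. Here's a sketch of that diff step, assuming your exported backup includes an id column and that url uniquely identifies each record — both are assumptions about your table, not guarantees of either CLI:

```python
import csv

def diff_for_update(export_path, fresh_path, out_path, key="url"):
    """Write an updates CSV containing only rows whose fields changed.

    Assumes the export has an "id" column plus your data columns, and that
    `key` uniquely identifies each record -- verify both for your table.
    """
    with open(export_path, newline="") as f:
        existing = {row[key]: row for row in csv.DictReader(f)}

    updates = []
    with open(fresh_path, newline="") as f:
        for row in csv.DictReader(f):
            old = existing.get(row[key])
            if old is None:
                continue  # brand-new record: import it instead of updating
            if any(old.get(k) != v for k, v in row.items()):
                updates.append({"id": old["id"], **row})

    if updates:
        with open(out_path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(updates[0].keys()))
            writer.writeheader()
            writer.writerows(updates)
    return updates

# Demo with synthetic files: one record whose description changed
with open("backup.csv", "w", newline="") as f:
    f.write("id,title,url,description\n"
            "rec_1,Organic Beans Co,https://example.com/products/beans,Old description\n")
with open("fresh.csv", "w", newline="") as f:
    f.write("title,url,description\n"
            "Organic Beans Co,https://example.com/products/beans,Fair-trade organic coffee beans\n")

updates = diff_for_update("backup.csv", "fresh.csv", "updates.csv")
```

Feed the resulting updates.csv to memberstack records bulk-update --file updates.csv --dry-run first, then drop --dry-run once the preview looks right.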

Helpful Flags to Know

A few flags across both CLIs that are especially useful in this workflow:

  • Scrape only the main content (skip nav/footer): firecrawl scrape <url> --only-main-content
  • Wait for JavaScript to finish rendering: firecrawl scrape <url> --wait-for 3000
  • Find specific pages on a site before crawling: firecrawl map <url> --search "keyword"
  • Preview a bulk update before applying it: memberstack records bulk-update --file data.csv --dry-run
  • Delete old records before re-importing: memberstack records bulk-delete table-key --where "field equals value"
  • Export existing records for backup: memberstack records export table-key --format csv --output backup.csv
  • Run everything against your test environment: add --sandbox to any Memberstack command (this is the default)
  • Switch to production: add --live to any Memberstack command

Frequently Asked Questions

Can I scrape a website and put the data into Memberstack without coding?

Yes. Both Firecrawl CLI and Memberstack CLI are terminal tools you can run with simple commands. If you use an AI agent like Claude Code, you can describe what you want in plain English and the agent runs the commands for you. No scripting or programming is required.

What data formats does Memberstack accept for record imports?

Memberstack CLI accepts CSV and JSON files. CSV column headers can use the data.* prefix or plain field names (both title and data.title work). JSON files should be arrays of objects where each key becomes a field name.

Does Firecrawl work on JavaScript-heavy websites?

Yes. Firecrawl handles JavaScript rendering, dynamic content, and single-page applications automatically. Use the --wait-for flag to give a page extra rendering time, or use firecrawl browser for sites that need user interaction like clicking pagination buttons or expanding dropdowns.

Do I need an API key for both tools?

Firecrawl requires an API key (sign up at firecrawl.dev for a free tier with 500 credits). Memberstack CLI uses OAuth browser login with no API key needed. Both tools store credentials locally so you authenticate once.

Which AI agents can run Firecrawl and Memberstack CLI together?

Both tools ship Agent Skills that work with Claude Code, OpenAI Codex, Gemini CLI, GitHub Copilot, Cursor, and any agent that supports the open skills standard. Install both skills and the agent can chain the two tools together in a single workflow.

Will this work with my live Memberstack environment?

By default, Memberstack CLI runs all commands against your sandbox (test mode). This lets you test the full import without affecting live data. When you're happy with the results, add --live to your commands to run against production.

Can I update existing records instead of creating new ones?

Yes. Use memberstack records bulk-update --file updates.csv to update records you've already imported. The file needs an id column with each record's ID. Add --dry-run to preview what would change before committing.

Key Takeaways

  • Firecrawl CLI scrapes the web and outputs clean markdown or JSON. Memberstack CLI imports that data into your membership platform. Together, they form a no-code pipeline from web to membership site.
  • Install both with npm install -g firecrawl-cli memberstack-cli. Authenticate each tool once.
  • Use firecrawl scrape for single pages, firecrawl crawl for site sections, or firecrawl search --scrape to find and scrape in one step.
  • Shape the scraped data into CSV or JSON, then import with memberstack records import.
  • AI agents like Claude Code can run this entire workflow from a plain-English prompt. Install both Agent Skills and let the agent handle it.
