As a technical founder, I’ve built products that work, but marketing them was always a struggle. We lacked time, team, and budget, so our growth stalled on manual effort. This time, for Clura (our AI web-scraper Chrome extension), we decided to flip the script: use the product to power our marketing. We built an automated pipeline driven by an MCP (Model Context Protocol) orchestrator running an OpenClaw AI agent, with Clura as one of its tools.
In practice, the system generates search keywords, runs Google Maps searches, scrapes business leads via Clura, then enriches and funnels the data into an email outreach agent. All steps are automated by the agent. We treated marketing as data flow, not guesswork.
This article covers the full architecture and implementation: what MCP and OpenClaw are, how they interact, how we integrated the Clura Chrome extension for Google Maps scraping, the email sending and warm-up strategies, plus security and compliance. I provide pseudocode, a mermaid flowchart, a comparison table of orchestration approaches, and a step-by-step guide to replicate this system.
👉 If you’re an early founder stuck doing outbound by hand, this story shows how to turn marketing into reliable infrastructure — an always-on, agent-driven engine that scales without extra hires.
The Problem: Small Teams, Big Marketing Challenges
In the early days, marketing felt like a manual grind. We were a small team with no marketing specialists. Finding leads meant Google searches and copy-pasting data into spreadsheets. Cold emails were crafted one by one. We’d do a flurry of activity after launch, only to see it fade out.
**We realized:** It wasn’t a lack of ideas — it was a lack of systems. Marketing tasks were linear, not compounding. Every day required new effort, with little to show long-term. We asked ourselves: What if Clura could help? If Clura can extract structured data from any site, why are we still gathering leads by hand?
That question sparked a change: Stop doing marketing; build marketing as a data pipeline. Instead of hiring an expensive agency or tools, we turned inward. We asked, “If Clura can do X for users, can it do X for us?” In short, we became our own first users of Clura. We designed a growth engine where Clura is a tool in the loop, not just a product to sell.
What Are MCP and OpenClaw?
Before we explain the pipeline, let’s define the core concepts:
Model Context Protocol (MCP): An open standard (introduced by Anthropic) that lets language models talk to external tools and services in a unified way. Think of it as “USB-C for AI”: one interface to plug any tool (APIs, databases, scrapers) into any model (GPT, Claude, etc.). An MCP server is a lightweight program exposing capabilities (tools) via this protocol. The model (the “brain”) connects as a client, discovers available tools, and calls them during its processing. In effect, the MCP mediates between the agent and the outside world. As one guide puts it, MCPs act like a “traffic cop between your apps and your models” — they route requests and enforce rules.
OpenClaw: An open-source AI agent framework that implements the MCP architecture. It runs locally (or in the cloud) and provides an agent interface for models. OpenClaw calls the Gateway its control plane, connecting the model to tools (called “skills”). Each skill in OpenClaw is essentially an MCP server: when enabled, it registers tools (like web-search or calendar) that the agent can call. For example, enabling a web-search skill lets the model use a search API; enabling an email skill lets it draft messages. Crucially, OpenClaw is model-agnostic (works with GPT, Claude, etc.) and can hot-load skills without restarting. In OpenClaw’s words, the Gateway is the control plane — the agent (assistant) is the product.
In our system, MCP is the overarching orchestration layer, and OpenClaw is our agent running atop it. We built or enabled skills (tools) for scraping, browser control, email, etc., and the LLM agent manages the workflow. This decoupling of “brain” and “body” means we can swap out tools easily, and the LLM just issues generic “tool calls” (like browser.open or exec.run) as needed.
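To make the “generic tool call” idea concrete, here is a minimal sketch of the two core MCP interactions as JSON-RPC 2.0 payloads. The method names follow the MCP spec; the tool name `browser.open` and its argument are our own example, not part of the standard:

```python
# Minimal sketch of MCP's two core interactions, expressed as the
# JSON-RPC 2.0 payloads a client sends to an MCP server.

# 1. Discover what the server offers (MCP method: "tools/list").
list_tools_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
}

# 2. Invoke one tool by name (MCP method: "tools/call").
# "browser.open" and its argument are our own example, not spec-defined.
call_tool_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "browser.open",
        "arguments": {"url": "https://www.google.com/maps"},
    },
}
```

The agent never needs to know how a tool is implemented — it only needs the tool’s name and argument schema, which is exactly what makes swapping tools cheap.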
Architecture: The Autonomous Agent Pipeline
Here’s the high-level pipeline we built (see chart below):
Keyword Generation (LLM): The agent generates search queries (e.g. “dentists in Birmingham”) each cycle.
Google Maps Search (Browser): The agent uses a browser tool to perform a Google Maps search for each query.
Clura Scraping: The agent runs the Clura extension (via a command or browser automation) to scrape the visible Maps results.
Data Structuring: Scraped data (business name, address, phone, website, etc.) is cleaned, deduped, and enriched (e.g. adding missing emails via API).
Outreach (Email Agent): The agent drafts and sends personalized emails to the collected leads using an email tool.
Response Handling: Replies are automatically fetched and handled (flagging interested leads, sending follow-ups).
Monitoring & Logging: Every step logs metrics (lead counts, open/reply rates, errors) to a dashboard.
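The seven stages above form a loop — monitoring feeds back into the next cycle’s keyword generation:

```mermaid
flowchart TD
    A[Keyword Generation - LLM] --> B[Google Maps Search - Browser]
    B --> C[Clura Scraping]
    C --> D[Data Structuring and Enrichment]
    D --> E[Outreach - Email Agent]
    E --> F[Response Handling]
    F --> G[Monitoring and Logging]
    G --> A
```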
Why This Works
This design turns marketing into repeatable data flow. Every day, new search queries feed new leads to the system without human intervention. We never “run out of things to do”; we just add more queries or regions to target. The MCP architecture makes it easy to integrate new tools too (e.g. a new enrichment API, another scraping skill, etc.).
Because OpenClaw can invoke arbitrary tools, we treat Clura as just one of them. In effect, Clura is a “skill” plugged into our agent via MCP. We combined that with built-in OpenClaw tools: for instance, the browser tool (to open Maps) and the exec tool (to run our scripts). This means the agent’s prompt says “do A, then B,” and the MCP handles calling browser.open() or exec.run() accordingly. No hardcoded glue code – just prompt-driven orchestration.
Keyword Generation with LLM
We needed relevant search phrases to find businesses. Instead of manually listing them, we let the LLM agent brainstorm. At each run (or once a week), the agent prompts itself something like: “List 5 Google Maps search queries for local [industry] in [region]”. For example, “recruitment agencies in Manchester” or “cafes near London Bridge”. These prompts yielded a mix of city, region, and niche combinations.
Once generated, these queries queue up as tasks. We limit the list (say 50 queries) so the agent doesn’t overwhelm itself. Each query flows through the pipeline in order. The agent can also track which queries it already processed (via a simple memory or log) to avoid repeats.
(Implementation note: We stored queries in a small database table. The agent’s code (via exec or its own memory) reads new queries and processes them one by one.)
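As a rough illustration of that queue, the pattern looks like the sketch below. The schema and names are illustrative only (we used a small table in our own database); SQLite stands in for whatever store you have:

```python
import sqlite3

# A tiny query queue: the agent enqueues generated searches and marks
# each one processed so nothing is scraped twice.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queries (text TEXT PRIMARY KEY, processed INTEGER DEFAULT 0)"
)

def enqueue(queries):
    # INSERT OR IGNORE dedupes repeat suggestions from the LLM.
    conn.executemany(
        "INSERT OR IGNORE INTO queries (text) VALUES (?)",
        [(q,) for q in queries],
    )

def next_query():
    # Oldest unprocessed query first.
    row = conn.execute(
        "SELECT text FROM queries WHERE processed = 0 ORDER BY rowid LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE queries SET processed = 1 WHERE text = ?", row)
    return row[0]

enqueue(["dentists in Birmingham", "cafes near London Bridge"])
enqueue(["dentists in Birmingham"])  # duplicate, silently ignored
```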
Google Maps Scraping with Clura
With a query ready, the agent uses the browser tool to open Google Maps for that query:
```javascript
await browser.open("https://www.google.com/maps/search/" + encodeURIComponent(query));
```
OpenClaw’s browser tool launches a controlled Chromium session, separate from my user browser. It loads the Maps results page for, say, “restaurants in Birmingham”.
Next, we tap Clura. Clura’s Chrome extension (available in the Chrome Web Store) is designed for exactly this: extracting structured data without coding. We scripted the agent to click the Clura icon or otherwise trigger the “Google Maps Places Scraper” template (built into Clura). Conceptually:
```javascript
// Instruct agent to trigger Clura’s scraping on the current Maps page
await exec("clura-scrape --template maps-places --output=leads.csv");
```
In reality, we built a tiny helper script (clura-scrape) that uses Puppeteer to run the extension’s “Google Maps Places” template on the open page. The template scrolls automatically through results and collects all listings. As Clura describes:
“Clura’s Chrome extension includes a Google Maps Places Scraper template that extracts structured place and business data from the currently open Google Maps search results pages”.
The result is a CSV or JSON file of leads: business name, address, phone, website, etc. This file is saved in our system (either on disk or in a cloud store), ready for the next stage. All of this happened in one agent session — the LLM said “run the scraper” and MCP executed the tools.
Data Structuring and Enrichment
The raw CSV from Clura is a good start, but we refine it before outreach:
Deduplication: We remove exact duplicates (same name/address). Clura typically avoids duplicates by design, but extra caution doesn’t hurt.
Validation: We drop entries with missing emails or nonsense names. If Clura missed a phone number, we might drop that lead or find it via a lookup.
Email Finding: Many business listings don’t show an email. We use an enrichment API (for example, Hunter.io or our own domain query) to find corporate emails from the company website. This step is crucial for outreach.
Tagging: We add tags or notes (e.g. industry, query used) to each lead. This helps in personalised emails.
This processing runs in a simple script or Python function that the agent calls via exec:
```python
import csv

seen = set()
leads = []
with open('leads.csv', newline='') as f:
    reader = csv.DictReader(f)
    for row in reader:
        # simple dedupe by business name
        if row['Name'] not in seen:
            seen.add(row['Name'])
            leads.append(row)

# enrich leads (pseudo)
for lead in leads:
    lead['Email'] = find_email(lead['Website'])

# save clean leads
with open('clean_leads.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=leads[0].keys())
    writer.writeheader()
    writer.writerows(leads)
```
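The `find_email` call above is pseudocode. Before paying for an enrichment API, a crude but free fallback is to fetch the company homepage and scan it for an address — a minimal sketch, with no claim to the hit rate of a real enrichment service:

```python
import re
import urllib.request

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def extract_email(html):
    """Return the first email-looking string in a page, or None."""
    match = EMAIL_RE.search(html)
    return match.group(0) if match else None

def find_email(website, timeout=10):
    """Best-effort lookup: fetch the homepage and scan it for an address.
    A real pipeline would fall back to an enrichment API on failure."""
    if not website:
        return None
    try:
        with urllib.request.urlopen(website, timeout=timeout) as resp:
            html = resp.read().decode("utf-8", errors="replace")
    except Exception:
        return None
    return extract_email(html)
```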
These clean leads become our target list. We pass them to the email agent step.
Email Outreach Automation
Now we have qualified leads with names and emails. The agent handles email outreach via a dedicated skill:
Drafting Emails: We prompt the model with a template and lead data. Example prompt: “Write a concise email to [[Name]] at [Company] introducing Clura as a tool that helps [[benefit]], and ask for a meeting.” The LLM (e.g. Claude) outputs a personalized message. This is done in an agent turn, e.g. AI: ....
Sending Emails: We use a simple email-sending script. For instance, in Node or Python:
```javascript
await exec(`node send_email.js --to ${lead.Email} --subject "Quick question" --body "${emailText}"`);
```
The send_email.js script handles SMTP or Gmail API auth. We set up a dedicated sending domain (with proper SPF/DKIM) to keep reputation clean.
Tracking Responses: We use the agent to periodically check for replies. For example, every night the agent runs python check_replies.py which reads our inbox (via IMAP or an API) and logs any new responses. The agent can even parse and auto-reply to simple replies, or flag interesting leads for us.
Key tools: OpenClaw’s built-in exec for running our scripts, and cron to schedule them. The agent logic ties it together: it takes the cleaned leads and for each runs the draft-and-send subroutine.
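The send step itself is only a few lines. Our `send_email.js` used Node, but the same logic in Python’s standard library looks like this sketch — the host, account, and password here are placeholders, not our real configuration:

```python
import smtplib
from email.message import EmailMessage

def build_message(from_addr, to_addr, subject, body):
    """Assemble a plain-text email."""
    msg = EmailMessage()
    msg["From"] = from_addr
    msg["To"] = to_addr
    msg["Subject"] = subject
    msg.set_content(body)
    return msg

def send_email(msg, host="smtp.example.com", port=587,
               user="outreach@example-outreach.com", password="app-password"):
    """Deliver over SMTP with STARTTLS. Host and credentials are
    placeholders; set up SPF/DKIM on the sending domain first."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()
        smtp.login(user, password)
        smtp.send_message(msg)
```

Keeping message assembly separate from delivery makes it easy to test drafts without touching the network.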
Deliverability and Warm-Up
Automated emailing only works if messages land in inboxes. We treated this seriously:
Dedicated Domain: We used a fresh domain (e.g. example-outreach.com) so our main domain wasn’t at risk.
Authentication: We published SPF, DKIM, DMARC records before sending any emails.
Low Initial Volume: In week one we sent maybe 5–10 emails per day per address. Then we gradually increased volume. This aligns with best practices: Instantly’s guide advises “start with 10 warmups daily, then increase over time”.
Engagement Focus: We crafted genuine messages to encourage replies. The agent even had a rule to respond promptly to any reply, simulating human interaction.
Consistent Sending: We maintained a regular schedule (no bursts), and varied send times slightly each day. This mimics natural usage patterns.
Monitoring: We tracked open and bounce rates. Initially we aimed for 20–30% open rate (benchmarked as “great” ≈27%). Bounces were kept under 2%. When deliverability dipped, we paused and fixed the issue (e.g. removed a broken sender, cleaned list).
As a result of this careful warm-up, our email metrics stayed healthy. By week three our open rates were ~30%, reply rates ~5–10% (above the ~2.9% typical cold reply), and no spam complaints.
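A ramp like “start around 10/day and increase gradually” can be encoded as a simple daily-cap function. The growth rate and ceiling below are illustrative knobs, not deliverability rules:

```python
def daily_send_cap(day, start=10, growth=1.25, ceiling=100):
    """How many emails one address may send on warm-up day `day` (1-based).
    Starts small and grows ~25% per day until it reaches a ceiling.
    start/growth/ceiling are illustrative, not official provider limits."""
    cap = start * (growth ** (day - 1))
    return min(int(cap), ceiling)
```

The agent checked this cap before each send, so scaling up was a config change rather than a judgment call.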
Results and KPIs
We treated the whole engine like a product: we measured everything. Key metrics included:
Leads Generated: Number of new business contacts per week. We went from ~50/week initially to ~300/week after scaling.
Email Opens: Tracked via pixels/trackers. We consistently saw ~30% open rate on each campaign.
Reply Rate: The percentage of emails getting a response. We averaged ~6%, which outperformed typical cold benchmarks.
Meetings Booked: Conversions to calls/demos. This started around 1% of leads, later rising to 3–5% after refining our email copy.
Pipeline Growth: Leads qualified per month. We integrated replies into our CRM, so we could measure how many deals came from this pipeline.
Time Saved: As a founder, I no longer spend hours on spreadsheets. The automation freed up ~15 hours/week of staff time.
We logged all data in a simple database. Every run of the agent recorded how many leads scraped, how many emails sent, bounce/complaint counts, etc. This let us iterate: for example, if a certain industry query gave few replies, we adjusted messaging or paused that query.
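The per-run bookkeeping reduces to one row per run plus a couple of aggregate queries. A minimal sketch (the schema and column names are our own, and SQLite stands in for the production database):

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE runs (
        run_date      TEXT,
        leads_scraped INTEGER,
        emails_sent   INTEGER,
        bounces       INTEGER,
        replies       INTEGER
    )
""")

def log_run(leads_scraped, emails_sent, bounces, replies):
    # One row per agent run; queried later for trend dashboards.
    conn.execute(
        "INSERT INTO runs VALUES (?, ?, ?, ?, ?)",
        (date.today().isoformat(), leads_scraped, emails_sent, bounces, replies),
    )

def reply_rate():
    sent, replies = conn.execute(
        "SELECT SUM(emails_sent), SUM(replies) FROM runs"
    ).fetchone()
    return replies / sent if sent else 0.0

log_run(leads_scraped=120, emails_sent=50, bounces=1, replies=3)
```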
The Architecture & Implementation Details
Let’s delve into the technical layers and how we connected them:
OpenClaw Agent: We ran OpenClaw in Docker on a small server. It hosted the LLM (we chose Claude 2 due to cost). The agent was configured with skills for browser, exec, and our custom scripts. We used the CLI (openclaw acp) to host a session.
Browser Automation: We used OpenClaw’s browser tool. It opens a headless Chrome (island profile) for the agent to control. This was how we navigated to Google Maps and interacted with Clura’s UI if needed. In effect, Clura ran inside that Chrome profile.
Clura Integration: Because Clura is a Chrome extension, we had two integration paths:
Manual trigger via browser tool: The agent could click the extension icon in the browser toolbar using browser.click() at the right selector. This triggered the scraping UI.
Command-line helper: We also wrote a Node.js script using Puppeteer that takes the current Maps URL and runs the Clura template automatically. The agent called it via exec.run("node maps_scrape.js ..."). This script used Clura’s internal API or commanded the extension headlessly.
After execution, Clura returned a CSV, which our agent fetched (e.g. browser.download()) or our script saved locally.
Data Pipeline: After scraping, data flowed into a central store. We used a Postgres database. Our scripts (called by exec) loaded leads.csv into the DB, upserting by a unique key (business name+address). This ensured persistent record-keeping. The agent could then query the DB or read from a shared CSV.
Scheduling: OpenClaw’s cron tool scheduled the entire pipeline once per day. Cron opened a new session: generate keywords, loop through the pipeline, send emails, and finish by checking responses. All tool calls were logged.
Error Handling: We anticipated failures: Google might block a request, Clura might time out, email send could error. We coded try/catch around each critical step. For example, if browser.open() failed, the agent logs “Maps load failed” and retries once with a delay. If an email bounce happens, it logs the address for removal. Errors were aggregated into an alert system (we set up a simple webhook to Slack for critical failures).
Rate Limits: We respected all APIs: Google Maps is a bit finicky, so we added a 2-second pause after loading each page to mimic human use. Our enrichment API had a strict quota, so we cached lookups. The email script throttled to stay under Gmail’s send limits (approx 2000/day for Google Workspace).
Monitoring & Logging: Each agent run produced a log file. We logged the number of leads scraped per query, open/response rates per campaign, and any tool errors. For observability, we used Prometheus/Grafana to graph daily leads and open rates. If we saw the graph flatten, we’d know the pipeline needed tuning.
Infrastructure: Nothing exotic — one modest VPS (e.g. t3.small) hosted the agent, database, and scripts. Clura itself runs in the browser, not on our server, so no extra cost. We did use small paid LLM credits and a couple of email accounts (free Gmail) for sending.
Costs: Roughly: $20/month for the VPS, $10/month on OpenAI/Claude usage (few hundred calls), minimal IP/proxy costs for Maps (we used a free proxy pool at times). Essentially zero compared to hiring a SDR.
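The upsert in the data-pipeline step boils down to a single `INSERT ... ON CONFLICT` statement. Sketched here with SQLite, whose upsert syntax matches Postgres; the table and column names are our own:

```python
import sqlite3

# Unique key = (name, address): re-scraping the same business refreshes
# its phone/website instead of creating a duplicate row.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE leads (
        name    TEXT,
        address TEXT,
        phone   TEXT,
        website TEXT,
        UNIQUE (name, address)
    )
""")

UPSERT = """
    INSERT INTO leads (name, address, phone, website)
    VALUES (?, ?, ?, ?)
    ON CONFLICT (name, address) DO UPDATE SET
        phone   = excluded.phone,
        website = excluded.website
"""

conn.execute(UPSERT, ("Acme Dental", "1 High St", "0121 000 0000", "acme.example"))
# Second scrape of the same business: updates in place, no duplicate.
conn.execute(UPSERT, ("Acme Dental", "1 High St", "0121 111 1111", "acme.example"))
```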
Security, Compliance, and Ethical Considerations
Building an autonomous scraper+mailer raises some flags, so we followed best practices:
Respect Robots.txt: Wherever possible, we obey website crawling rules. Google Maps doesn’t have a public robots.txt rule disallowing these listing pages, but our system still added delays to avoid rapid-fire requests. We treated Google Maps like a public directory. In general, our scraper used a clear user-agent and paused between scrolls, aligning with ethical scraping tips (do not “hammer” sites).
No Sensitive Data: We only scraped business listings (public info). Clura inherently requires data to be visible on the page, so we never accessed any login or private data. We stored scraped data securely and encrypted in our database.
Email Legality: We complied with CAN-SPAM (opt-out in every email) and GDPR-like principles (we did not email EU residents without consent, as we targeted UK and US businesses). We only sent to business emails, not personal ones, to avoid privacy issues.
Monitoring Abuse: We built safeguards: if a website attempted to block us, our agent would stop (we noticed immediately if Maps showed a CAPTCHA). Also, we coded a throttle to limit email sends per day.
Disclosure: All outreach emails clearly identified our company and gave a way to unsubscribe. We didn’t engage in deceptive subject lines or spam content.
Agent Isolation: OpenClaw’s browser was sandboxed and isolated from my personal profile. The agent had no access to unrelated browser data. Our sending scripts had only limited scopes (SMTP) and did not store login credentials in code (used token-based OAuth).
Overall, we engineered the system to be ethical and above-board — the same rules a careful manual marketer would follow. This alignment with best practices (e.g. Google’s and Clura’s terms) kept our operations clean and legally sound.
Measurable Outcomes and KPIs
We tracked several key metrics to gauge success:
Lead Volume: Starting from ~50 leads/day in week 1, we scaled to ~200 leads/day by week 4 by adding more queries. This metric justifies the engine — each lead is one less thing we had to gather manually.
Email Open Rate: Averaged ~30%, meeting the expectation for cold B2B emails. If it dropped, we investigated (often an email formatting tweak or waiting a bit fixed it).
Reply Rate: Around 5–10%. Outreach benchmarks suggest ~12% response rate for effective sequences, so we considered 8% a success. Each positive reply was a conversation.
Conversion Rate: We measured how many qualified leads booked demos or trials. This moved from ~1% early on to ~3–5% as we refined messaging. The cost was essentially developer time, so each new customer had a high ROI.
System Uptime: Did the pipeline run daily without fail? We counted runs instead of downtime. By month two, our agents ran 95% of scheduled jobs without manual intervention.
These KPIs live on a simple dashboard. Whenever any key metric lagged, we’d review logs and tweak. For instance, a downward open rate might mean our sender IP got flagged, so we’d postpone or spread out sends.
Comparing Orchestration Approaches

| Approach | Trade-off |
| --- | --- |
| Workflow tools (Zapier, n8n) | Easy to set up, but often lack custom scraping or complex logic |
| Direct function-calling (e.g. OpenAI’s function API) | Ties you to one model vendor; requires reinventing if you switch |
| Pure code (cron jobs) | Fully flexible, but burdensome to maintain and scale |
| MCP + OpenClaw agent | Structured, reusable tool interfaces; flexibility of code with agent-driven flow |

Key takeaways: Workflow tools (Zapier, n8n) are easy but often lack custom scraping or complex logic. Direct function-calling (like OpenAI’s function API) ties you to one model vendor and requires reinventing if you switch. Pure code (cron jobs) is flexible but burdensome to maintain and scale.
By contrast, MCP with OpenClaw gives us structured, reusable interfaces. We got the flexibility of coding (we could add any tool) with the higher-level simplicity of the agent deciding the flow. As one analysis notes, MCP “reduces coding effort” with clear interfaces (list_tools, call_tool).
Thus, we leaned on MCP/OpenClaw to glue it all together. The table above is simplified, but it captures why an MCP-based agent made sense for a dynamic pipeline that mixes web scraping, AI, and automation.
Step-by-Step Implementation Guide
For those who want to replicate this setup, here’s a concise recipe:
Install OpenClaw: On your server or laptop, follow OpenClaw’s docs to install and set up a Gateway. Configure an LLM (GPT-4/Claude) for your agent.
Enable Tools: In OpenClaw’s config, enable the essential tools: browser, exec/process, web_search, and cron. These come built-in.
Install Clura Extension: Add Clura from the Chrome Web Store. Log into Clura (or use a user profile) so it’s ready.
Build Scraper Helper: Write a small script (Node/Puppeteer) that:
Opens Google Maps for a query (or uses the current page).
Triggers Clura’s Maps template (via Clura’s JS API or DOM button).
Saves the output CSV/JSON.
Example pseudocode:

```javascript
const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto(`https://maps.google.com/search?q=${encodeURIComponent(query)}`);
  // Assume Clura injects a global function or button:
  await page.click('#clura-scrape-button');
  await page.waitForSelector('#download-link');
  await page.click('#download-link'); // save leads.csv
  await browser.close();
})();
```
Write Data Processing Scripts: e.g. Python scripts to clean CSV, find emails, and upsert into a database. These can be run via exec.
Setup Email Scripts: Use nodemailer (Node) or Python’s smtplib to send emails. Store credentials securely and set SPF/DKIM on your domain.
Create Agent Skill: In OpenClaw, write a skill (prompt) for the agent: “Given a list of search terms, for each do: open maps, scrape, clean data, send email.” Use the tools (browser, exec) in your messages.
Add Scheduling: Use OpenClaw’s cron tool to run the agent skill on a daily/weekly schedule.
Test & Warm-Up: Before scaling, test the full flow once or twice. Warm up the email domain by sending a few emails (to friends or tester accounts) gradually.
Monitor Metrics: Track leads and email stats. Tweak queries and messaging as needed.
Infrastructure: A basic VM (2 vCPU, 4GB RAM) running Linux is sufficient. Clura runs in the browser, so your user Chrome needs Clura. If unattended, you can use headless Chrome with extension (advanced). Cost: only your time and any paid API keys (LLM usage, enrichment API).
That’s the core! With these steps, an autonomous marketing agent can run itself daily.
Conclusion
We built not just a scraper, but a self-driving sales engine. By treating marketing as a series of data-driven steps, and plugging them together with an LLM + MCP agent, we automated away all the grunt work. Now leads come in automatically every morning, and replies flow out without us pushing a button.
The key insight was using our own product (Clura) on ourselves. If Clura can find leads for us, it proves the product works — and it builds momentum. The Model Context Protocol and OpenClaw gave us the flexibility to integrate Clura and other tools seamlessly. We turned marketing into code, and that code runs every day.
This system is not static; it will improve. We’ll add more skills (maybe a LinkedIn scraper), refine prompts, and let the metrics guide us. But even today, it’s a huge win: zero manual copy-paste, a growing pipeline, and the freedom to focus on the product itself.
If you’re an early-stage founder, I encourage you to ask: Can my product solve my problem? In our case, the answer was yes — and it changed everything.