DEV Community: Ben

How to Scrape Emails from Google Maps (a Pay-As-You-Go Apollo & ZoomInfo Alternative)

Ben — Sat, 27 Jun 2026 14:26:57 +0000

Google Maps is the biggest free B2B database on the planet — every local business, with its website, phone and address. The one thing it won't give you is the email. So most people end up exporting businesses from one tool, then pasting domains into a second email-finder, then verifying in a third. Here's how to do the whole thing in one step, and pay only for what you get.

The usual workaround (and why it's painful)

The common stack is: a Maps scraper → a spreadsheet → an email-finder → a verifier. Three tools, three bills, and a lot of copy-pasting. Tools like Apollo or ZoomInfo bundle this, but they're priced for enterprise (annual contracts, per-seat fees) and they're thin on local SMEs, especially outside the US.

The one-step approach

The Google Maps Email Scraper does the search and the email enrichment in a single run. Give it a niche + a city, and for every business it:

Scrapes the listing — name, phone, address, website, rating, categories.
Visits that business's website — homepage, contact/Kontakt page and Impressum — and pulls the published email address.
If nothing is public, tests common business inboxes (info@, sales@, kontakt@…) and validates them with MX and SMTP checks.

You get one clean row per lead with a primary_email, a confidence score, and every email it found:

{
  "name": "Studio Berlin Marketing GmbH",
  "phone": "+49 30 1234567",
  "website": "https://www.studio-berlin.de",
  "primary_email": "info@studio-berlin.de",
  "email_confidence": 95,
  "email_source": "impressum",
  "rating": 4.8,
  "lead_score": 92
}
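To make the website step more concrete, here is a minimal sketch of pulling published addresses out of a page's HTML and choosing one as the primary. It is not the scraper's actual code: `extract_emails`, `pick_primary` and the `ROLE_PREFIXES` order are hypothetical names and choices made for illustration, and the regex is deliberately simple.

```python
import re

# Simple pattern for addresses that appear in page text or mailto: links.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Asset names such as "logo@2x.png" look like emails to the regex; skip them.
ASSET_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")

# Hypothetical preference order for role inboxes (an assumption, not the tool's).
ROLE_PREFIXES = ("info", "kontakt", "contact", "sales", "office")


def extract_emails(html):
    """Return unique, lower-cased addresses in order of first appearance."""
    seen = {}
    for match in EMAIL_RE.findall(html):
        email = match.lower()
        if not email.endswith(ASSET_SUFFIXES):
            seen.setdefault(email, None)
    return list(seen)


def pick_primary(emails, domain):
    """Prefer addresses on the business's own domain, then known role inboxes."""
    on_domain = [e for e in emails if e.endswith("@" + domain)]
    pool = on_domain or emails
    for prefix in ROLE_PREFIXES:
        for email in pool:
            if email.split("@")[0] == prefix:
                return email
    return pool[0] if pool else None
```

A real run would also need the MX/SMTP validation described above, which this sketch leaves out.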

Why it's a strong Apollo / ZoomInfo alternative

Pay-as-you-go — a small fee per lead, plus a premium only when an email is actually found. No annual contract, no per-seat tax, no credit rationing.
You own the data — export to CSV/JSON or straight to your CRM via API, Make, Zapier or n8n.
Fresh and targeted — sourced live for your exact niche + city, not a stale shared database.
Especially strong in the DACH market: commercial websites in Germany are legally required to publish a contact email in their Impressum, and Austria and Switzerland have similar disclosure rules, so hit rates on German, Austrian and Swiss businesses are very high. Most US-centric tools are thin here.

Example: build a prospect list in one run

{
  "mode": "search",
  "query": "dentists",
  "location": "Munich, Germany",
  "maxResults": 100,
  "requireEmail": true
}

requireEmail: true means you only get leads that actually have an email — you're not charged the email fee for the ones without. Already have a list of companies? Switch to websites mode and paste your domains to enrich them directly.

Honest trade-offs

This finds business contact emails (the info@/sales@/Impressum kind), not a specific named person's inbox — for that, pair it with a name-based email finder. Published addresses are real and high-confidence; role-based guesses are validated by MX (and SMTP where the network allows), so filter by email_confidence if you only want the surest ones. And use the data for legitimate B2B outreach — honor GDPR/CAN-SPAM and opt-outs.

FAQ

Do I need an API key? No — a search query + location is enough.

Will it work outside Germany? Yes, anywhere Google Maps covers; the Impressum edge just makes DACH hit rates exceptionally high.

Can I schedule it? Yes — recurring runs and full API/Make/Zapier/n8n integration.

How much does it cost? Per result, with the email premium charged only when an email is found — far below an enterprise seat for the same verified-lead output.

Build verified B2B lead lists straight from Google Maps with the Google Maps Email Scraper.

ImmoScout24 Alternative: German Real-Estate Data Without the Blocks

Ben — Sat, 27 Jun 2026 13:31:22 +0000

ImmoScout24 is Germany's #1 property portal — and one of the hardest to scrape, with
aggressive blocking and pricey API access. The good news: most of the same German
real-estate data is available from sources that are far more accessible, including
private-landlord inventory ImmoScout under-represents. Here are three ImmoScout24
alternatives for actual data, by use case.

1. Private-landlord rentals & sales → Kleinanzeigen

Kleinanzeigen is where private owners list directly — often cheaper and absent from
the big portals. The
Kleinanzeigen Immobilien Scraper
parses rooms, living space, Kaltmiete/Warmmiete, deposit, address and PLZ into clean
fields, with city/radius search.

2. Agent-listed properties → Immowelt

For broader agent-listed inventory, the
Immowelt Scraper pulls
rental and sale listings with price, size, rooms, location and details — a strong
ImmoScout substitute for market analysis.

3. Shared flats & rentals (WGs) → WG-Gesucht

For the shared-apartment and rental market (huge in German cities), the
WG-Gesucht Scraper extracts
room/flat listings with price, location and availability — inventory you won't find on
ImmoScout at all.

{ "locationCode": "münchen", "radiusKm": 20, "maxResults": 200 }

Why these beat fighting ImmoScout24

Accessible — they don't wall you off the way ImmoScout does, so runs finish.
Different inventory — private-landlord and WG listings ImmoScout under-covers.
Pay-as-you-go — per-result pricing, no portal API contract.
Clean, typed fields — rooms, m², rent, deposit, address — ready for analysis.
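As a rough illustration of what "ready for analysis" can mean, the sketch below computes rent per square metre from exported rows. The field names `kaltmiete` and `living_space_m2` are assumptions made for this example; check the actual output of whichever scraper you run.

```python
from statistics import median


def rent_per_sqm(listings):
    """Return EUR/m² for each listing that has both a rent and a living space."""
    values = []
    for row in listings:
        rent = row.get("kaltmiete")          # assumed field name
        area = row.get("living_space_m2")    # assumed field name
        if rent and area:
            values.append(round(rent / area, 2))
    return values


def median_rent_per_sqm(listings):
    """Median EUR/m² across the usable listings, or None if there are none."""
    values = rent_per_sqm(listings)
    return median(values) if values else None
```

Rows with a missing rent or area are skipped rather than guessed, which keeps the median honest on sparse listings.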

Honest note

These are alternative sources of German property data, not ImmoScout24 scrapers. If
you specifically need ImmoScout's listings, you'll need ImmoScout. For market
analysis, private-landlord leads, rentals and WGs, this trio is cheaper, more
reliable, and covers inventory the big portal misses. (For Austria, add
willhaben.)

FAQ

Do I need an API key? No — give a city/PLZ and your filters.

Rent and buy? Yes — Kleinanzeigen and Immowelt cover both; WG-Gesucht covers shared/rentals.

Can I search by city? Yes — city names auto-resolve; radius search supported.

Is it legal? These read publicly available listing data. Use responsibly and within
each site's terms and applicable law.

German real-estate data without the ImmoScout wall: Kleinanzeigen Immobilien, Immowelt, WG-Gesucht.

Phantombuster Alternative: Pay-As-You-Go Social Media Data Extraction

Ben — Sat, 27 Jun 2026 13:27:32 +0000

Phantombuster is a handy social-automation suite — but it's subscription-based with
execution-time limits, and a lot of people only use it for one thing: pulling social
media data (posts, profiles, followers). If that's you, you can do the data side
pay-as-you-go, with no monthly seat and no time budget to ration.

What most people actually use it for

Extracting structured data from social platforms — a creator's posts, a profile's
stats, a follower list, hashtag/keyword streams. That's a scraping job, and you can
price it per result instead of renting hours.

The pay-as-you-go data stack

Pick the network:

Bluesky Scraper — posts, profiles, followers, search (no login needed for most).
Mastodon Scraper — hashtags, accounts, trends from any instance.
Instagram Intelligence Scraper — profiles, posts, comments, hashtags + lead scoring.
TikTok Profile Scraper — public profile stats at scale.
Reddit Scraper + Lemmy Scraper for discussion data.

Each returns clean JSON (text, engagement counts, author, media) ready for analysis or
an LLM:

{ "mode": "search", "searchTerms": ["#yourbrand"], "maxItems": 500 }

Why this beats a subscription

Per-result pricing — pay for the data you pull, no monthly seat or execution-time cap.
Own the data — export to CSV/JSON, no lock-in.
One consistent shape across networks — easy to pipe into a warehouse or LLM.
Emerging networks covered — Bluesky/Mastodon/Lemmy, where competition is thin.

Honest trade-offs

Phantombuster also does automation (auto-connect, auto-message — mostly LinkedIn
growth flows). These actors don't automate actions on your account; they extract
data. If you need account automation, that's a different (and ToS-riskier) job. For
clean social-data extraction and listening, pay-as-you-go scraping is cheaper and
simpler.

FAQ

Do I need logins? Mostly no (Bluesky/Mastodon/Lemmy public data); keyword search on
some networks works best with an optional app token.

Can I schedule it? Yes — recurring runs + API/Make/Zapier/n8n.

Which networks? Bluesky, Mastodon, Lemmy, Instagram, TikTok, Reddit today.

Is it legal? It reads publicly available data via public APIs/pages. Use it
responsibly and within each platform's terms.

Extract social data pay-as-you-go: Bluesky, Mastodon, Instagram, TikTok, Reddit.

ZoomInfo Alternative: Build B2B Contact Data Without the Enterprise Contract

Ben — Sat, 27 Jun 2026 13:23:52 +0000

ZoomInfo is powerful — and priced for enterprises, with annual contracts that start in
the five figures. If what you actually need is targeted company + contact data
(businesses in a vertical and region, with phone, website and email), you can source
it yourself from live public data for a tiny fraction of the cost, and own every row.

What you're really paying ZoomInfo for

A giant prebuilt contact database with intent signals and org charts, billed annually
per seat. That's overkill (and over-budget) if your job is building targeted
prospect lists for outbound — the most common use case by far.

The build-your-own stack

Find the companies. Pick your source by market:

Google Maps Business Scraper — any niche + city (name, address, phone, website, rating) with a lead score.
Gelbe Seiten Scraper (Germany) and local.ch Scraper (Switzerland) for DACH B2B directories.

Get the emails. Feed the companies' domains into the
Smart Email Finder & Verifier —
it tests common patterns, runs SMTP/MX checks, flags catch-all domains and returns a
confidence score, so your list is deliverable before you send.

{ "search": "logistics companies", "location": "Hamburg, Germany", "maxResults": 500 }

Why this beats an enterprise seat

Pay-as-you-go — per result, no annual contract or per-seat tax.
You own the data — export to CSV/CRM, no usage caps or credit rationing.
Fresh & targeted — sourced live for your exact ICP, not a stale shared DB.
DACH-strong — most US-centric tools are thin on German/Swiss SMEs; directory scrapers aren't.

Honest trade-offs

ZoomInfo's edge is scale, intent data and org hierarchy. If you need buying-intent
signals or a pre-enriched database of millions, that's a different product. For
building targeted, verified outbound lists on demand — especially in DACH — live
sourcing + verification is dramatically cheaper and just as actionable.

FAQ

Is this GDPR-compliant? You're collecting publicly listed business contact data;
use it for legitimate B2B outreach and honor opt-outs and local law.

Do I need an API key? No for the directory/Maps scrapers or the email finder.

Can I automate it? Yes — schedule runs and pipe results to your CRM via API, Make,
Zapier or n8n.

How accurate are the emails? SMTP/MX-verified with a confidence score; drop the
low-confidence ones before sending.

Build B2B lists on demand: Google Maps Business Scraper + Smart Email Finder & Verifier, plus Gelbe Seiten and local.ch for DACH.

mobile.de Alternative: Get Private-Seller German Used-Car Data

Ben — Sat, 27 Jun 2026 11:19:18 +0000

mobile.de is the default for German used-car data — but it's dealer-dominated,
heavily defended, and you're seeing the same inventory every other dealer and tool
sees. The biggest blind spot it leaves is private sellers, who list on
Kleinanzeigen (formerly eBay Kleinanzeigen). That's where the arbitrage, the cheaper
cars and the motivated sellers are — and it's far more accessible to pull.

Why private-seller data is the edge

Different inventory — private owners list cheaper and earlier than dealers; it's the half of the market mobile.de under-represents.
Arbitrage & sourcing — dealers and flippers buy private and resell; this is the sourcing side.
Cleaner economics — Kleinanzeigen is easier to access reliably than mobile.de's defenses.

The approach

The Kleinanzeigen Autos Scraper
parses the German car attributes most tools drop — Marke (make), Modell,
Erstzulassung (year), Kilometerstand (mileage), Kraftstoff (fuel), Getriebe
(gearbox), Leistung (power), price — into clean, typed fields, with city/PLZ +
radius search.

{ "locationCode": "berlin", "maxPrice": 20000, "minYear": 2016, "maxMileage": 120000 }

Output is one tidy row per car — make, model, year, mileage, fuel, transmission,
power, price, location, URL — ready for price analysis, arbitrage screening or a
search product.
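For the arbitrage-screening case, one simple approach is to compare each car with the median price of its make/model group. The sketch below assumes rows with `make`, `model` and `price` keys (hypothetical names), and the 85% threshold and minimum group size of three are arbitrary choices for illustration.

```python
from collections import defaultdict
from statistics import median


def flag_underpriced(cars, threshold=0.85, min_group=3):
    """Return cars priced below threshold * median of their make/model group."""
    groups = defaultdict(list)
    for car in cars:
        groups[(car["make"], car["model"])].append(car["price"])

    flagged = []
    for car in cars:
        prices = groups[(car["make"], car["model"])]
        # Skip tiny groups: a median over one or two cars says very little.
        if len(prices) >= min_group and car["price"] < threshold * median(prices):
            flagged.append(car)
    return flagged
```

A more careful screen would also bucket by year and mileage before comparing prices.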

Pair it for full DACH coverage

Used cars are one slice of the DACH market. The same engine powers the
Kleinanzeigen Immobilien Scraper
(real estate) and Kleinanzeigen Jobs Scraper
(local jobs), plus willhaben
for Austria — so you can cover the German-speaking market with one consistent data shape.

Honest note

This complements mobile.de rather than replacing it: mobile.de still has the deepest
dealer inventory. For private-seller cars, market-price analysis and sourcing,
Kleinanzeigen is the better, more accessible source — and it's the half most people miss.

FAQ

Do I need an API key? No — give a location and optional filters.

Can I filter by make/model? Best-effort by title keyword; for exact make filtering,
paste a Kleinanzeigen search URL with the make facet selected.

What fields do I get? Make, model, first registration (year), mileage, price, fuel,
transmission, power, condition, color and location, where the listing provides them.

Is it legal? It reads publicly available listing data. Use it responsibly and
within applicable laws and Kleinanzeigen's terms.

Get the private-seller side of the German car market with the Kleinanzeigen Autos Scraper — a data complement to mobile.de.

Zillow Alternative: Free Real-Estate Data Sources That Don't Block You

Ben — Sat, 27 Jun 2026 11:15:59 +0000

Everyone tries to scrape Zillow first — and hits a wall of CAPTCHAs, bans and a $300–
$800/month lead bill. The smarter move is to pull comparable real-estate data from
sources that are far easier to access and often carry inventory Zillow doesn't.
Here are three Zillow alternatives for actual data, by use case.

1. Off-market & FSBO leads → Craigslist

Zillow's leads are bought by every agent in your zip. Craigslist is full of
for-sale-by-owner sellers with no agent — exclusive, direct-contact leads that
aren't on the MLS. The
Craigslist Real Estate Scraper
pulls price, beds, baths, sqft, photos, contact availability and a 0–100 lead score,
with an owner-only filter for pure FSBO.

2. US homes for sale, rent & sold comps → Redfin

For structured listing data and sold comps (the part Zillow gates hardest), Redfin
is more accessible. The
Redfin Scraper returns price,
beds, baths, sqft, address and photos across for-sale, rental and sold — clean enough
for valuation models and market dashboards.

3. International / UK property → OnTheMarket

Zillow is US-only. For the UK, the
OnTheMarket Scraper pulls
for-sale and to-rent listings with price, beds, type, address, agent and even
latitude/longitude — without the heavy blocking Rightmove and Zoopla throw up.

Why these beat fighting Zillow

They don't ban you on sight — Craigslist, Redfin and OnTheMarket are far more scrapable than Zillow, so runs actually finish.
Different, exclusive inventory — FSBO and off-market deals Zillow never shows.
Pay-as-you-go — per-result pricing instead of a $300–800/month lead contract.
Own the data — export to CSV/Excel/JSON for your CRM, model or product.

Honest note

These aren't Zillow scrapers — they're alternative sources of comparable data
(listings, comps, leads). If you specifically need Zillow's Zestimate, you'll need
Zillow. For everything else — leads, listings, comps, market trends — these are
cheaper, more reliable, and often richer.

FAQ

Do I need an API key? No — give a city/location and your filters.

Can I get sold comps? Yes — via the Redfin scraper (for-sale, rent and sold).

FSBO only? Yes — the Craigslist scraper has an owner-only filter.

Is it legal? These read publicly available listing data. Use responsibly and
within each site's terms and applicable law.

Real-estate data without the Zillow wall: Craigslist Real Estate (FSBO), Redfin (US comps), OnTheMarket (UK).

Apollo.io Alternative: Build Your Own B2B Lead Lists (Pay-As-You-Go)

Ben — Sat, 27 Jun 2026 11:11:30 +0000

Apollo.io is great until the bill arrives and the seats stack up — and you're still
renting access to a contact database everyone else also rents. If you mainly need
targeted B2B lead lists (businesses in a niche + a location, with phone and
email), you can build them yourself from live public sources for a fraction of the
cost, and own the data outright. Here's the pay-as-you-go approach.

Where Apollo's cost comes from

Apollo charges per seat and gates exports/credits by plan. For a small team or a
solo operator running campaigns, that's $49–$149+/seat/month before you've sent a
single email — and the data is a shared database, not freshly sourced for your niche.

The build-your-own approach (two steps)

Step 1 — find the businesses. Google Maps is the largest live local-business
directory: name, address, phone, website, category, rating. Every "niche + city"
search is a lead list. The
Google Maps Business Scraper
turns a search like "dentists in Austin" into a clean, exportable table — with an AI
lead score so you prioritize.

Step 2 — get the emails. Maps gives you the website; feed those domains into the
Smart Email Finder & Verifier,
which tests common patterns, runs SMTP/MX checks, flags catch-all domains and returns
a confidence score. Now you've got deliverable, verified B2B contacts — sourced fresh
for your exact target, not pulled from a shared pool.

{ "search": "marketing agencies", "location": "Berlin, Germany", "maxResults": 300 }
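The hand-off between the two steps is mostly a matter of turning website URLs into a clean, de-duplicated domain list. Here is a small sketch using only the standard library; it assumes each Maps row is a dict with a `website` key, and `to_domain` / `unique_domains` are names invented for this example.

```python
from urllib.parse import urlparse


def to_domain(website):
    """Reduce a website URL (with or without scheme) to a bare lower-case host."""
    if not website:
        return None
    # urlparse only fills in the host when the string contains "//".
    parsed = urlparse(website if "//" in website else "//" + website)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host or None


def unique_domains(rows):
    """Return distinct domains in first-seen order, skipping rows without a site."""
    seen = {}
    for row in rows:
        domain = to_domain(row.get("website"))
        if domain:
            seen.setdefault(domain, None)
    return list(seen)
```

The resulting list is what you would paste or pipe into the email-finding step.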

Why this beats renting a database

Pay-as-you-go, not per-seat — you pay for the rows you pull, nothing else.
You own the data — export to CSV/CRM, no lock-in, no export caps.
Fresh & niche-specific — sourced live for your target, not a stale shared DB.
No team-seat tax — run it solo or wire it into Make/Zapier/n8n.

Honest trade-offs

Apollo has intent data, org charts and a massive prebuilt contact database — if you
need those, it's a different product. This approach is for the most common job:
building targeted, verified B2B lead lists cheaply and on demand. For that, live
sourcing + verification is faster and far cheaper.

FAQ

Is this really cheaper than Apollo? For list-building, yes — you pay per result
instead of per seat/month, with no export credits to ration.

Do I need an API key? No for Google Maps; the email finder needs no third-party
key either.

Will the emails be valid? They're SMTP/MX-verified with a confidence score, so you
can drop the risky ones before sending.

Can I automate it? Yes — schedule runs and connect to your CRM via API, Make,
Zapier or n8n.

Is this legal? It reads publicly available business data and verifies emails
without sending. Use it for legitimate, compliant outreach (respect GDPR/CAN-SPAM).

Build lead lists on demand with the Google Maps Business Scraper + Smart Email Finder & Verifier — pay per result, own the data.

How to Scrape Google Maps for Business Leads (Python + No-Code)

Ben — Sat, 27 Jun 2026 09:45:32 +0000

Google Maps is the world's biggest local-business database — name, address, phone,
website, rating, hours, category, for tens of millions of businesses. For B2B sales,
local agencies and market research, it's the single best source of fresh, targeted
leads. The official Places API is expensive, quota-limited and caps results; here's
how to get the same data at scale.

Why Google Maps for lead-gen?

Targeted by niche + location — "dentists in Austin", "plumbers in Berlin", "law firms in London". Every search is a lead list.
Contactable — phone and (often) website for direct outreach.
Enrichable — pair the website with an email finder and you've got a full B2B contact record.

The hard part

Google Maps is heavily defended: it lazy-loads results as you scroll and rate-limits
aggressively, and the official Places API returns at most about 60 results per query.
Doing this reliably means a headless browser or a maintained scraping engine,
residential proxies, scroll/pagination handling, and de-duplication. A naive script
gets blocked fast, so this is one to use a maintained tool for rather than build from
scratch.
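Of the pieces listed above, de-duplication is the easiest to show in isolation. The sketch below is one possible approach, not what any particular tool does: it keys each place on its phone digits when a phone exists, and on normalized name plus address otherwise. The `name`, `phone` and `address` keys are assumed field names.

```python
import re


def dedupe_key(place):
    """Build a comparison key: phone digits if present, else name + address."""
    phone = re.sub(r"\D", "", place.get("phone") or "")
    if phone:
        return ("phone", phone)
    name = re.sub(r"\s+", " ", (place.get("name") or "").strip().lower())
    address = re.sub(r"\s+", " ", (place.get("address") or "").strip().lower())
    return ("name_addr", name, address)


def dedupe(places):
    """Keep the first occurrence of each key, preserving input order."""
    seen = set()
    unique = []
    for place in places:
        key = dedupe_key(place)
        if key not in seen:
            seen.add(key)
            unique.append(place)
    return unique
```

Overlapping searches (say, adjacent neighborhoods) tend to return the same business more than once, which is where a pass like this helps.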

The no-code option

The Google Maps Business Scraper
on Apify does it — enter a search and location, click Run, get a clean lead list.

{
  "search": "dentists",
  "location": "Austin, TX",
  "maxResults": 200
}

Output is one clean row per business — name, full address, phone, website, rating,
review count, category, coordinates and hours — plus an AI lead score so you work
the best prospects first. Export to CSV/Excel for your CRM, or JSON for a pipeline.

Turn places into a real contact list

Maps gives you the business + website; the money move is enrichment. Feed the
websites into the
Smart Email Finder & Verifier
to get verified email addresses, and you've gone from "a map pin" to "a deliverable
B2B contact" — the exact workflow lead-gen agencies sell for real money.

Common use cases

🎯 B2B sales & agencies — build targeted prospect lists by industry + city.
📈 Market research — map competitor density, ratings and coverage by area.
📍 Local SEO / data products — power directories and local-data tools.
🤝 Partnerships — find every business of a type in a region in minutes.

FAQ

Do I need a Google API key? No — and you're not limited to the Places API's
~60-result cap.

Can I get emails? Maps gives the website; pair it with an email-finder actor to
get verified emails.

How many results? Set your max — it scrolls and paginates past the usual caps.

Can I run it on a schedule? Yes — schedule recurring runs to keep lead lists fresh,
or call it via API / Make / Zapier / n8n.

Is scraping Google Maps legal? It reads publicly available business listings. Use
it responsibly for legitimate lead-gen and within applicable laws and terms.

Building a lead pipeline? The Google Maps Business Scraper plus the Smart Email Finder & Verifier turn local search into deliverable B2B contacts.

How to Find FSBO Real Estate Leads on Craigslist (Python + No-Code)

Ben — Sat, 27 Jun 2026 09:41:15 +0000

Every real-estate agent and investor wants FSBO leads — for-sale-by-owner
listings, where the seller has no agent and is reachable directly. Zillow charges
$300–$800/month for lead lists that every other agent also buys. Meanwhile
Craigslist is full of FSBO sellers, off-market deals and rentals, with almost no
competition mining it. If you can pull those listings into a spreadsheet, you've got
exclusive leads for pennies.

Why Craigslist for real estate?

Direct-to-owner — FSBO sellers post themselves, so there's no listing agent gatekeeping the contact.
Off-market — these properties aren't on the MLS, so they're not bid up.
Cheap & open — Craigslist is far easier to read than Zillow/Realtor, and most agents ignore it entirely.

The manual way (Python)

Craigslist exposes its categories per city subdomain. Real estate for sale is rea,
apartments/rentals are apa. You can request the "by owner" filter and parse the
result cards:

import httpx
from bs4 import BeautifulSoup

# San Francisco Bay Area, real estate for sale, by owner:
url = "https://sfbay.craigslist.org/search/rea?purveyor=owner"
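resp = httpx.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=30)
resp.raise_for_status()  # stop early on a block page or other HTTP error
html = resp.text
# "lxml" needs the lxml package installed; "html.parser" works without it.
soup = BeautifulSoup(html, "lxml")
for el in soup.select("li.cl-static-search-result"):
    title_el = el.select_one(".title")
    if title_el is None:  # skip cards that have no title node
        continue
    title = title_el.get_text(strip=True)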
    price = el.select_one(".price")
    print(title, "—", price.get_text(strip=True) if price else "n/a")

That gets you titles and prices. The real value — beds, baths, sqft, photos, the
posting body, FSBO detection and contact availability — lives on each listing's
detail page, so you'd fetch and parse those too.
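Once you have a detail page's text, the attributes still need parsing. The sketch below assumes attribute text shaped like "3BR / 2Ba 1,500ft2"; that shape is an assumption about how listings are typically written, so treat the patterns as a starting point. `parse_housing` is a hypothetical helper name.

```python
import re


def parse_housing(text):
    """Extract beds, baths and square footage from free text; None when absent."""
    beds = re.search(r"(\d+)\s*br\b", text, re.I)
    baths = re.search(r"(\d+(?:\.\d+)?)\s*ba\b", text, re.I)
    sqft = re.search(r"(\d[\d,]*)\s*(?:ft2|sq\.?\s*ft|sqft)", text, re.I)
    return {
        "beds": int(beds.group(1)) if beds else None,
        "baths": float(baths.group(1)) if baths else None,
        "sqft": int(sqft.group(1).replace(",", "")) if sqft else None,
    }
```

Returning None for missing values, rather than 0, keeps incomplete listings distinguishable from real zeros when you filter later.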

The catch: scale, detail pages & lead scoring

A usable lead list means paging through results, opening every detail page, parsing
the inconsistent attributes, detecting by-owner vs agent, pulling photos and contact
info, and ideally scoring each lead so you work the hot ones first. Doing that
reliably across cities is the part worth automating.

The no-code option

The Craigslist Real Estate Scraper
on Apify does all of it — pick a city, category and the owner-only filter, click Run.

{
  "mode": "search",
  "city": "sfbay",
  "category": "rea",
  "listingType": "owner",
  "minPrice": 200000,
  "maxPrice": 900000,
  "includeLeadScore": true,
  "maxListings": 100
}

Output is one clean row per listing — price, beds, baths, sqft, location, photos,
body text, an FSBO flag, contact availability and a 0–100 AI lead score — ready
for a spreadsheet, a CRM, or your outreach tool. It covers rentals too (apa), and
supports postal-code radius search.

Common use cases

🎯 Agents — find FSBO sellers for listing appointments before competitors do.
💰 Investors — source motivated, un-bid-up sellers for flips and wholesale.
🏢 Property managers — pull rental inventory and benchmark rates by area.
📊 Market research — track pricing and supply trends across cities over time.

FAQ

How do I get FSBO leads specifically? Set the listing type to "owner" — it keeps
only by-owner listings and flags + scores each one.

Does it include rentals? Yes — both for-sale and rental categories.

Do I need an API key or login? No — just a city and your filters.

Can I search a specific area? Yes — use a postal code + radius.

Is scraping Craigslist legal? It reads publicly available listing data. Use it
responsibly for legitimate lead-gen and follow applicable laws and Craigslist's terms.

Building a real-estate lead pipeline? The Craigslist Real Estate Scraper handles the scraping. See also the FSBO Real Estate Scraper and Zumper Rental Scraper.

How to Scrape UK Property Listings from OnTheMarket — Python + No-Code

Ben — Sat, 27 Jun 2026 09:21:40 +0000

If you want UK property data, everyone fights over Rightmove and Zoopla — both of
which block hard. OnTheMarket, Britain's third major portal, is far more
accessible and carries the same agent-listed for-sale and to-rent inventory. Better
still, it's a modern site that ships its listing data as structured JSON inside the
page, so you get clean, typed fields without brittle HTML scraping.

Why OnTheMarket?

Accessible — unlike Rightmove/Zoopla, it doesn't immediately wall off basic requests, so you don't need heavy residential proxies for modest runs.
Structured data built in — it's a Next.js site that embeds the full results set as JSON, so fields come out typed and consistent.
For sale and to rent — the same source covers both, nationwide.

Read the embedded JSON (Python)

OnTheMarket renders results into a __NEXT_DATA__ script tag. Parse it instead of
scraping cards:

import httpx, json, re

html = httpx.get("https://www.onthemarket.com/for-sale/property/london/",
                 headers={"User-Agent": "Mozilla/5.0"}, timeout=30).text
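match = re.search(r'<script id="__NEXT_DATA__"[^>]*>(.*?)</script>', html, re.S)
if match is None:
    # Either the page layout changed or the request was blocked.
    raise SystemExit("__NEXT_DATA__ script tag not found")
data = json.loads(match.group(1))
# This key path reflects the page structure at the time of writing and may change.
for p in data["props"]["initialReduxState"]["results"]["list"]: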
    print(p["property-title"], "—", p["price"], "—", p["address"])

Each listing carries price, bedrooms, property type, address, agent and even
latitude/longitude. Pagination is just ?page=2, ?page=3, …

The catch: filters, paging & clean fields

A useful dataset means resolving location slugs, threading price/bedroom filters,
paging through thousands of results, mapping the raw keys into clean fields
(price_value as a number, geo-coordinates, agent contact) and skipping the promo
cards mixed into the list. That's the part worth automating.
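Two of those chores are easy to sketch: building page URLs and turning a price string into a number. The helper names below are made up for this example, `normalize` only uses the raw keys that appear in the snippet above, and the price formats handled ("£450,000", "£1,200 pcm") are assumptions about typical listing text.

```python
import re


def page_url(base_url, page):
    """Page 1 is the bare URL; later pages append ?page=N."""
    return base_url if page <= 1 else f"{base_url}?page={page}"


def price_value(price_text):
    """Return the first pound amount in the text as an int, or None (e.g. 'POA')."""
    match = re.search(r"£\s*(\d[\d,]*)", price_text or "")
    return int(match.group(1).replace(",", "")) if match else None


def normalize(raw):
    """Map a raw listing dict onto a flat row with a numeric price."""
    return {
        "title": raw.get("property-title"),
        "price_text": raw.get("price"),
        "price_value": price_value(raw.get("price")),
        "address": raw.get("address"),
    }
```

Taking only the first amount matters for rentals, where the text can carry both a monthly and a weekly figure.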

The no-code option

The OnTheMarket Scraper on
Apify does it — pick for-sale or to-rent, a location, optional price/bedroom filters,
click Run.

{
  "listingType": "for-sale",
  "location": "manchester",
  "minPrice": 200000,
  "maxPrice": 500000,
  "minBedrooms": 2,
  "maxResults": 200
}

Output is one clean row per property — price (text + numeric), bedrooms, type,
address, agent name & phone, latitude/longitude, features, image and URL — ready for
mapping, dashboards or a property product.

Common use cases

UK market analysis — track asking prices and supply by area, type and beds.
Lead generation — build estate-agent and listing lead lists with contacts.
Proptech & portals — power a search product or alerting tool with fresh listings.
Investment sourcing — filter by price/beds/location, then map with the geo data.

FAQ

Do I need an API key? No — just a location (or a search URL) and your filters.

For sale and to rent? Both — set listingType.

Do listings include geo-coordinates? Yes — every listing has latitude/longitude.

Is it legal? You're reading publicly available listing data. Use it responsibly
and within applicable laws and OnTheMarket's terms.

Building something with UK property data? The OnTheMarket Scraper handles it. See also the Redfin Scraper and Craigslist Real Estate Scraper.

How to Scrape Mastodon (Hashtags, Accounts & Trends) — Python + No-Code

Ben — Sat, 27 Jun 2026 09:18:48 +0000

Mastodon is the largest open, federated social network — millions of active users
across thousands of independent servers. And like the rest of the fediverse, its data
is refreshingly easy to get: every instance exposes a public REST API, no login
required for public content. If you do social listening or trend research, Mastodon
is a clean, ad-free, bot-light signal that almost no one is mining.

Why Mastodon?

Public API on every instance — mastodon.social, mas.to, fosstodon.org and thousands more, all with the same endpoints, no key.
Federated reach — query one instance or the broader federated timeline it sees.
Clean signal — real communities, hashtags and trends, with engagement counts.

Hashtag posts (Python, no auth)

import httpx
from bs4 import BeautifulSoup

base = "https://mastodon.social"
r = httpx.get(f"{base}/api/v1/timelines/tag/bitcoin", params={"limit": 40}, timeout=30)
r.raise_for_status()  # some instances restrict or rate-limit anonymous access
for s in r.json():
    text = BeautifulSoup(s["content"], "lxml").get_text(" ", strip=True)
    print(text[:80], "—", s["favourites_count"], "favs")

Post bodies are HTML, so strip the tags for clean text. A user's posts come from
/api/v1/accounts/lookup?acct=Gargron → /api/v1/accounts/{id}/statuses; the public
or federated firehose from /api/v1/timelines/public; and what's hot right now from
/api/v1/trends/statuses.

The catch: boosts, paging & HTML

Pagination uses max_id (older items); boosts (reblogs) carry their real content in a
nested reblog object you need to unwrap; and every post body is HTML to clean. Doing
that across hashtags or accounts is the part worth automating.
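Those three chores can be sketched with the standard library alone. The helper names (`strip_html`, `unwrap`, `paginate`) are invented for this example, and `fetch_page` is a hypothetical callable you supply that takes a `max_id` (or None) and returns one page of statuses.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(content):
    """Turn a status body into plain text with collapsed whitespace."""
    parser = _TextExtractor()
    parser.feed(content)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def unwrap(status):
    """For a boost, the real content lives in the nested 'reblog' object."""
    return status.get("reblog") or status


def paginate(fetch_page, max_items):
    """Walk backwards through a timeline using the last seen id as max_id."""
    items, max_id = [], None
    while len(items) < max_items:
        page = fetch_page(max_id)
        if not page:
            break
        items.extend(page)
        max_id = page[-1]["id"]  # the outer status id, even for boosts
    return items[:max_items]
```

In practice `fetch_page` would wrap the `httpx.get` call from the earlier snippet, adding `max_id` to `params` whenever it is set.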

The no-code option

The Mastodon Scraper on Apify
handles it — pick an instance and a mode (hashtag, account, public, trends, profile),
click Run.

{
  "mode": "hashtag",
  "instance": "mastodon.social",
  "hashtags": ["ai", "bitcoin"],
  "maxItems": 500
}

Output is one clean row per post — text (HTML stripped), date, language, boosts,
favourites, replies, hashtags, media URLs and author — ready for a spreadsheet, a
database, or an LLM.

Common use cases

Social listening — track a topic or brand across the whole fediverse.
Trend & sentiment analysis — feed hashtag/trend streams into an LLM.
Open-social research — study communities and how content federates.
Influencer & audience research — profile stats and posting activity.

FAQ

Do I need an account or API key? No for hashtags, accounts, public timelines and
trends. Only keyword search needs an access token.

Which instance? Any — mastodon.social is the largest and sees most of the
federated timeline; niche servers are great for niche communities.

Can I scrape remote accounts? Yes — use the full user@instance handle.

Is it legal? You're reading publicly available data via Mastodon's own public API.
Use it responsibly and within each instance's terms.

Building something with social data? The Mastodon Scraper handles the API for you. See also the Bluesky Scraper and Lemmy Scraper.

How to Scrape Lemmy — the Federated Reddit Alternative (Python + No-Code)

Ben — Sat, 27 Jun 2026 09:16:06 +0000

When Reddit locked down its API, a chunk of its communities moved to Lemmy — an
open, federated Reddit alternative. The best part for anyone who needs discussion
data: Lemmy's API is completely public, no key, no login, and it returns clean
JSON with Reddit-style metrics (score, upvotes, downvotes, comments). If you build
RAG datasets, do social listening, or track communities, Lemmy is an easy, untapped
source.

Why Lemmy?

Open API — every instance (lemmy.world, lemmy.ml, beehaw.org, …) exposes /api/v3/ with no authentication for public data.
Federated — query one instance or the whole network; reach cross-instance communities like technology@lemmy.world.
Reddit-shaped data — posts, comments, communities, scores and vote counts, so it slots straight into anything you built for Reddit.

Front-page or community posts (Python)

import httpx

base = "https://lemmy.world"
# front page (whole federated network), sorted Hot:
r = httpx.get(f"{base}/api/v3/post/list",
              params={"sort": "Hot", "type_": "All", "limit": 50}, timeout=30)
r.raise_for_status()  # fail loudly if the instance rejects or rate-limits the call
for it in r.json()["posts"]:
    p, c = it["post"], it["counts"]
    print(p["name"], "—", c["score"], "pts,", c["comments"], "comments")

Want a specific community? Add community_name=technology. Want comments? Hit
/api/v3/comment/list. Want to find communities? /api/v3/community/list or
/api/v3/search?type_=Communities.

The catch: pagination & normalization

Lemmy paginates with page=N (up to 50 per page), and each item nests post,
creator, community and counts objects you'll want flattened into one clean row.
Across multiple communities and sort windows, that's the part worth automating.
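Flattening and paging might look roughly like this. The output column names are my own choice for illustration, and `fetch_page` is a hypothetical callable that takes a page number and returns the `posts` list for that page.

```python
def flatten(item):
    """Collapse one nested post view into a single flat row."""
    post = item["post"]
    counts = item["counts"]
    creator = item.get("creator", {})
    community = item.get("community", {})
    return {
        "title": post.get("name"),
        "url": post.get("url"),
        "body": post.get("body"),
        "score": counts.get("score"),
        "upvotes": counts.get("upvotes"),
        "downvotes": counts.get("downvotes"),
        "comments": counts.get("comments"),
        "author": creator.get("name"),
        "community": community.get("name"),
    }


def collect(fetch_page, max_items, per_page=50):
    """Request page=1, 2, ... until a short page or max_items is reached."""
    rows, page = [], 1
    while len(rows) < max_items:
        batch = fetch_page(page)
        rows.extend(flatten(it) for it in batch)
        if len(batch) < per_page:  # a short page means there is nothing further
            break
        page += 1
    return rows[:max_items]
```

Link posts have no body and text posts have no external URL, which is why the optional fields are read with `.get`.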

The no-code option

The Lemmy Scraper on Apify does
it for you — pick a mode (posts, community, search, comments, communities), an
instance and a sort order, click Run.

{
  "mode": "community",
  "instance": "lemmy.world",
  "communities": ["technology", "asklemmy"],
  "sort": "TopWeek",
  "maxItems": 500
}

Output is one clean row per post (or comment/community) — title, body, link, score,
upvotes, downvotes, comment count, community, author and URLs — ready for a
spreadsheet, a database, or an LLM.

Common use cases

Reddit-migration research — follow communities and audiences that left Reddit.
Community & topic monitoring — track discussions across the fediverse.
RAG / LLM datasets — clean, open, license-friendly discussion data.
Trend & sentiment analysis — feed posts and comments into an LLM.

FAQ

Do I need an account or API key? No — public posts, comments and communities work
with no login.

Which instance should I use? Any. lemmy.world is the largest; with
listingType: All it sees most of the federated network.

Can I scrape a community on another instance? Yes — use community@instance.

Is it legal? You're reading publicly available data via Lemmy's own public API.
Use it responsibly and within each instance's terms.

Building something with discussion data? The Lemmy Scraper handles the API so you can focus on the product. See also the Reddit Scraper and Bluesky Scraper.