DEV Community

Freshactors

How to scrape Redfin (US real estate) data with Node.js (no API key needed)

Redfin has some of the cleanest US real-estate data on the web — list prices, sold prices, beds/baths, square footage, lat/long, MLS IDs — but there's no public Redfin API you can just call. In this tutorial we'll pull structured Redfin listings as JSON using Node.js, without an API key, a login, or writing a single CSS selector.

We'll use a maintained Apify actor that handles the annoying parts (region resolution, Redfin's anti-bot "soft block" behavior, retries) so we can focus on the data.

Why not just fetch() Redfin yourself?

You can hit Redfin's internal GIS endpoint directly — for about a day. Two things bite you:

  1. The soft block. Redfin's data API returns a 200 with an envelope that says errorMessage: "Success" even when it's blocking you as a bot. Your scraper "succeeds" and writes zero rows. You don't notice until your dashboard is empty.
  2. Region IDs. Redfin doesn't take a city name — it takes an internal region_id and region_type. You have to resolve those first.
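To make the first point concrete, here is a minimal sketch of a retry loop that refuses to trust an empty "success". It is not how Redfin or the actor works internally: `fetchRows` is a hypothetical caller-supplied function that resolves to an array of rows, and `sleep` is injectable only so the loop can be tested without real delays.

```javascript
// Sketch: retry with exponential backoff, treating an empty result as suspect.
// `fetchRows` (hypothetical) resolves to an array; it receives the attempt number.
async function fetchWithSoftBlockRetry(fetchRows, {
  maxAttempts = 4,
  baseDelayMs = 1000,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
} = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const rows = await fetchRows(attempt);
    if (rows.length > 0) {
      return { rows, attempts: attempt, stillEmpty: false };
    }
    if (attempt < maxAttempts) {
      // Wait baseDelayMs, then 2x, then 4x, ... before the next attempt.
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
  // Empty on every attempt: either a genuine zero-result or a persistent block.
  // This loop alone cannot tell which, so surface the ambiguity to the caller.
  return { rows: [], attempts: maxAttempts, stillEmpty: true };
}
```

Note the limitation: repeated emptiness is only a hint. Telling a real zero-result from a block needs extra signals from the response itself, which is the part a DIY script tends to get wrong.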

We'll offload both to the FreshActors Redfin Scraper, which parses the region straight from a Redfin URL and distinguishes a real zero-result from a block (retrying with backoff + IP rotation when it sees one).
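For a feel of what "parsing the region from a URL" involves, here is a rough sketch based only on the two URL shapes used in this tutorial (`/city/<id>/<state>/<name>` and `/zipcode/<zip>`). The helper name is hypothetical, and whether the number in a city URL is exactly the `region_id` Redfin's internal endpoint expects is an assumption. The sketch deliberately does not guess at `region_type` codes.

```javascript
// Hypothetical helper: pull the region kind and ID out of a Redfin URL path.
// Only handles the two shapes shown in this article; anything else returns null.
function parseRedfinRegion(redfinUrl) {
  const { hostname, pathname } = new URL(redfinUrl);
  if (hostname !== 'redfin.com' && !hostname.endsWith('.redfin.com')) {
    throw new Error(`Not a Redfin URL: ${redfinUrl}`);
  }
  const [kind, id] = pathname.split('/').filter(Boolean);
  if (kind === 'city' && /^\d+$/.test(id ?? '')) return { kind, id };
  if (kind === 'zipcode' && /^\d{5}$/.test(id ?? '')) return { kind, id };
  return null; // Unrecognized page type: let the caller decide what to do.
}

console.log(parseRedfinRegion('https://www.redfin.com/city/30818/TX/Austin'));
```

County and neighborhood pages would need their own branches, which is one more reason to let the actor handle it.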

Prerequisites

  • Node.js 18+
  • A free Apify account and your API token (Settings → Integrations)

Install the client:

npm install apify-client

Step 1 — Call the actor

The only required input is a Redfin URL. Paste a city/ZIP/county page from redfin.com and the actor reads the region from it.

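// Save as index.mjs (or set "type": "module" in package.json) so that
// `import` and top-level `await` work.
import { ApifyClient } from 'apify-client';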

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

const input = {
  redfinUrls: ['https://www.redfin.com/city/30818/TX/Austin'],
  listingType: 'forSale',   // or 'sold'
  sort: 'price-desc',
  maxListings: 500,
};

// Run the actor and wait for it to finish.
const run = await client.actor('freshactors/redfin-scraper').call(input);

// Pull the results from the run's dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();

console.log(`Got ${items.length} listings`);
console.log(items[0]);

That's the whole integration. No selectors, no proxy config, no Redfin region lookup.
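One thing worth adding before you build on the results is a guard that fails loudly. The sketch below checks the `status` field on the run object returned by `.call()` and rejects an empty dataset. The helper name is hypothetical, and treating zero listings as an error is a choice for this tutorial rather than a rule (some searches legitimately return nothing).

```javascript
// Sketch: stop early instead of silently processing a failed or empty run.
// `run` is the object returned by `.call()`; `items` comes from `listItems()`.
function assertUsableRun(run, items) {
  if (run.status !== 'SUCCEEDED') {
    throw new Error(`Actor run ${run.id} ended with status ${run.status}`);
  }
  if (items.length === 0) {
    throw new Error(`Actor run ${run.id} succeeded but returned no listings`);
  }
  return items;
}
```

In the script above you would call `assertUsableRun(run, items)` right after `listItems()` and before any analysis.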

Step 2 — What the output looks like

Each item is one property. A trimmed real record:

{
  "_type": "listing",
  "_schemaVersion": "1.0",
  "listingId": 216197035,
  "mlsId": "2116358131372161577",
  "mlsStatus": "Active",
  "address": "4602 Indian Wells Dr",
  "city": "Austin",
  "state": "TX",
  "zip": "78747",
  "price": 560000,
  "beds": 3,
  "baths": 2,
  "sqft": 2097,
  "pricePerSqFt": 267,
  "yearBuilt": 2019,
  "propertyType": "Single Family Residential",
  "latitude": 30.1327479,
  "longitude": -97.79,
  "daysOnMarket": 12,
  "url": "https://www.redfin.com/TX/Austin/4602-Indian-Wells-Dr-78747/home/31845362",
  "_scrapedAt": "2026-06-01T08:08:20.543Z"
}

You also get propertyId, unit, hoa, lotSize, soldDate, and listingType. The schema is versioned (_schemaVersion), so you can check the version on each record and catch a format change instead of having fields silently vanish between runs.
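Here is one way to put `_schemaVersion` to work: a small sketch that separates records you can trust from ones that need a look. The expected version `'1.0'` comes from the sample record above; the list of required fields is my own assumption about what a downstream analysis needs, not something the actor guarantees.

```javascript
const EXPECTED_SCHEMA = '1.0'; // taken from the sample record
const REQUIRED_FIELDS = ['listingId', 'address', 'price', 'url']; // assumption

// Split items into usable records and rejects (wrong version or missing fields).
function splitBySchema(items) {
  const ok = [];
  const rejected = [];
  for (const item of items) {
    const missing = REQUIRED_FIELDS.filter((field) => item[field] == null);
    if (item._schemaVersion !== EXPECTED_SCHEMA || missing.length > 0) {
      rejected.push({ item, missing });
    } else {
      ok.push(item);
    }
  }
  return { ok, rejected };
}
```

Logging `rejected.length` on every run gives you an early warning if the output shape ever shifts.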

Step 3 — Do something useful with it

Let's find the cheapest-per-square-foot homes — a quick proxy for value hunting:

const deals = items
  .filter((h) => h.price && h.sqft)
  .map((h) => ({
    address: `${h.address}, ${h.city} ${h.state}`,
    price: h.price,
    ppsf: h.pricePerSqFt ?? Math.round(h.price / h.sqft),
    dom: h.daysOnMarket,
    url: h.url,
  }))
  .sort((a, b) => a.ppsf - b.ppsf)
  .slice(0, 10);

console.table(deals);

Combining a low pricePerSqFt with a high daysOnMarket is a classic motivated-seller filter.
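A minimal sketch of that filter follows. The thresholds are hypothetical placeholders; sensible values depend entirely on the market you're looking at.

```javascript
// Sketch: listings that are cheap per square foot AND have sat on the market.
// `maxPpsf` and `minDaysOnMarket` are caller-chosen thresholds (hypothetical).
function motivatedSellerCandidates(items, { maxPpsf, minDaysOnMarket }) {
  return items
    .filter((h) => h.price && h.sqft && Number.isFinite(h.daysOnMarket))
    .map((h) => ({
      ...h,
      ppsf: h.pricePerSqFt ?? Math.round(h.price / h.sqft),
    }))
    .filter((h) => h.ppsf <= maxPpsf && h.daysOnMarket >= minDaysOnMarket)
    .sort((a, b) => b.daysOnMarket - a.daysOnMarket); // longest-listed first
}
```

For example, `motivatedSellerCandidates(items, { maxPpsf: 250, minDaysOnMarket: 60 })` keeps homes at or under $250 per square foot that have been listed for at least 60 days.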

Step 4 — Get sold comps (the "Redfin API" alternative people actually want)

For an AVM, a CMA tool, or a price-trend model, you want sold data. Flip one field:

const soldInput = {
  redfinUrls: ['https://www.redfin.com/zipcode/78701'],
  listingType: 'sold',           // sold homes
  sort: 'days-on-redfin-asc',    // most recent first
  maxListings: 300,
};

const soldRun = await client.actor('freshactors/redfin-scraper').call(soldInput);
const { items: sold } = await client.dataset(soldRun.defaultDatasetId).listItems();

const medianSoldPrice = (() => {
  const prices = sold.map((h) => h.price).filter(Boolean).sort((a, b) => a - b);
  if (prices.length === 0) return null; // no priced sold records came back
  const mid = Math.floor(prices.length / 2);
  return prices.length % 2 ? prices[mid] : (prices[mid - 1] + prices[mid]) / 2;
})();

console.log(
  medianSoldPrice === null
    ? 'No sold prices returned for 78701'
    : `Median sold price in 78701: $${medianSoldPrice.toLocaleString()}`
);

Sold records carry a soldDate, so you can bucket by month and chart how a ZIP's median price moves over time.
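The monthly bucketing could look something like the sketch below. It assumes `soldDate` is a value that `new Date()` can parse (for example an ISO string); the article's sample record doesn't show the field's format, so check a real sold record before relying on this. Months are computed in UTC.

```javascript
// Median of a numeric array; returns null for an empty array.
function median(values) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Group sold records by "YYYY-MM" and compute the median price per month.
// Assumes `soldDate` is parseable by `new Date()` (an assumption, see above).
function medianPriceByMonth(sold) {
  const buckets = new Map();
  for (const h of sold) {
    if (h.soldDate == null || !h.price) continue;
    const date = new Date(h.soldDate);
    if (Number.isNaN(date.getTime())) continue; // skip unparseable dates
    const month = date.toISOString().slice(0, 7);
    if (!buckets.has(month)) buckets.set(month, []);
    buckets.get(month).push(h.price);
  }
  return [...buckets.entries()]
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([month, prices]) => ({ month, count: prices.length, median: median(prices) }));
}
```

Passing the `sold` array through `medianPriceByMonth` and then `console.table` gives a chart-ready series. Keep an eye on `count`: a median over three sales is not much of a trend.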

Step 5 — Scale it up (or down)

Pricing is pay-per-result: $0.002 per listing. So 100 listings cost $0.20 and a 1,000-row run is $2.00. Use maxListings to keep any run within budget, and pass several URLs in redfinUrls to cover multiple cities in one call (the cap applies across all of them).
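The arithmetic is simple enough to fold into a helper, so that `maxListings` is derived from a budget rather than guessed. This is a sketch using the $0.002 rate quoted above; the function names are mine, and the rate is worth re-checking on the actor's page before you rely on it.

```javascript
// $0.002 per listing, expressed per 1,000 listings to keep the math in integers.
const PRICE_PER_1000_LISTINGS_USD = 2;

// Estimated cost in USD for a given number of listings.
function estimateCostUsd(listings) {
  return (listings * PRICE_PER_1000_LISTINGS_USD) / 1000;
}

// Largest maxListings value that stays within a USD budget.
function maxListingsForBudget(budgetUsd) {
  return Math.floor((budgetUsd * 1000) / PRICE_PER_1000_LISTINGS_USD);
}

console.log(estimateCostUsd(1000));   // cost of a 1,000-row run
console.log(maxListingsForBudget(5)); // listings affordable with $5
```

You can then set `maxListings: maxListingsForBudget(5)` in the actor input to cap a run at a $5 spend.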

For ongoing data, schedule the actor (Apify → Schedules) to run daily and append to a dataset — that's your always-fresh Redfin feed.

A note on reliability

The reason I reach for this actor over a hand-rolled script is the daily canary: it runs the scraper against live Redfin every day and flags field drift or block-pattern changes, so it gets patched before your pipeline returns garbage. With a DIY scraper, you're the canary — and you usually find out when a report is already empty.

Wrap-up

With ~15 lines of Node you can pull structured Redfin listings — for-sale or sold — without an API key, region lookups, or anti-bot wrestling. Point it at a URL, get clean JSON, and build your analysis on top.

Actor reference and full input docs: apify.com/freshactors/redfin-scraper.

Happy scraping. If Redfin changes something, open an issue on the actor — that's what keeps it fresh.
