DEV Community

agenthustler

Posted on

Scraping Rental Market Data: Analyzing Rent Trends by City

Understanding rental market trends helps renters negotiate better deals and helps investors identify opportunities. Let's build a rental data scraper that tracks prices across major cities.

The Rental Data Landscape

Sites like Zillow, Apartments.com, and Craigslist have massive rental listing datasets. By scraping and analyzing this data over time, we can identify trends like seasonal price drops, neighborhood gentrification signals, and fair market rent for any area.

Setting Up the Scraper

pip install requests beautifulsoup4 pandas matplotlib

We'll use ScraperAPI for reliable access to rental sites that have aggressive bot detection:

import requests
from bs4 import BeautifulSoup
import pandas as pd
import json

API_KEY = "YOUR_SCRAPERAPI_KEY"

def scrape_listings(city, state, max_pages=5):
    """Scrape rental listings for a given city."""
    listings = []
    for page in range(1, max_pages + 1):
        url = (
            f"https://www.apartments.com/"
            f"{city}-{state}/{page}/"
        )
        resp = requests.get(
            "http://api.scraperapi.com",
            params={"api_key": API_KEY, "url": url, "render": "true"},
            timeout=60,
        )
        resp.raise_for_status()  # Surface blocked or failed requests early
        soup = BeautifulSoup(resp.text, "html.parser")

        for card in soup.select(".placard"):
            name = card.select_one(".property-title")
            price = card.select_one(".property-pricing")
            beds = card.select_one(".bed-range")

            if name and price:
                listings.append({
                    "name": name.text.strip(),
                    "price": price.text.strip(),
                    "beds": beds.text.strip() if beds else "N/A",
                    "city": city,
                    "state": state
                })
    return listings
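Even through a proxy API, individual page requests can fail or come back blocked. A minimal retry wrapper (a sketch; the function name and backoff schedule are my own) keeps a multi-page crawl from dying on one bad response:

```python
import time

import requests


def fetch_with_retries(url, params, retries=3, backoff=2.0, timeout=60):
    """Fetch a URL, retrying transient failures with exponential backoff."""
    for attempt in range(retries):
        try:
            resp = requests.get(url, params=params, timeout=timeout)
            if resp.status_code == 200:
                return resp
        except requests.RequestException:
            pass  # Network hiccup; fall through to the retry delay
        time.sleep(backoff * (2 ** attempt))  # 2s, 4s, 8s, ...
    return None  # Caller decides how to handle a page that never loaded
```

Swap the bare `requests.get` call in `scrape_listings` for this wrapper and skip pages that return `None` rather than crashing the whole run.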

Cleaning and Normalizing Price Data

Rental listings have messy pricing (ranges, "Call for Rent", etc.):

import re

def parse_price(price_str):
    """Extract a numeric rent from formats like "$1,200 - $1,800"."""
    # Handle ranges by averaging the endpoints
    matches = re.findall(r"\$([\d,]+)", price_str)
    if matches:
        prices = [int(m.replace(",", "")) for m in matches]
        return sum(prices) / len(prices)
    return None  # "Call for Rent" and other non-numeric strings

def clean_listings(listings):
    """Clean and normalize listing data."""
    df = pd.DataFrame(listings)
    df["rent"] = df["price"].apply(parse_price)
    df = df.dropna(subset=["rent"])
    df = df[df["rent"].between(300, 10000)]  # Filter outliers
    return df
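The beds field is just as messy ("Studio", "1-2 Beds", "N/A"). A small companion parser (a sketch; the input formats here are assumptions about what the site emits) normalizes it to a minimum bedroom count:

```python
import re


def parse_beds(beds_str):
    """Normalize a beds label to a minimum bedroom count (0 for studios)."""
    if "studio" in beds_str.lower():
        return 0
    match = re.search(r"(\d+)", beds_str)
    return int(match.group(1)) if match else None
```

With this you can group rents by bedroom count instead of comparing studios against three-bedroom units.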

Analyzing Trends Across Cities

import matplotlib.pyplot as plt

def compare_cities(cities):
    """Scrape and compare median rent across cities."""
    all_data = []
    for city, state in cities:
        listings = scrape_listings(city, state)
        df = clean_listings(listings)
        df["city_name"] = f"{city}, {state}"
        all_data.append(df)

    combined = pd.concat(all_data)
    summary = combined.groupby("city_name")["rent"].agg(
        ["median", "mean", "count"]
    ).round(0)

    print(summary.sort_values("median"))
    return combined

cities = [
    ("new-york", "ny"), ("austin", "tx"),
    ("denver", "co"), ("chicago", "il"),
    ("portland", "or"), ("miami", "fl"),
]
data = compare_cities(cities)

Tracking Trends Over Time

Store daily snapshots to a SQLite database:

import sqlite3
from datetime import date

def save_snapshot(df, db_path="rental_data.db"):
    """Save daily rental data snapshot."""
    conn = sqlite3.connect(db_path)
    df["snapshot_date"] = date.today().isoformat()
    df.to_sql("listings", conn, if_exists="append", index=False)
    conn.close()

def get_trend(city, db_path="rental_data.db"):
    """Get average rent trend for a city.

    SQLite has no built-in MEDIAN, so we track the mean here;
    compute the median in pandas if you need it.
    """
    conn = sqlite3.connect(db_path)
    query = """
        SELECT snapshot_date, AVG(rent) AS avg_rent
        FROM listings
        WHERE city = ?
        GROUP BY snapshot_date
        ORDER BY snapshot_date
    """
    trend = pd.read_sql(query, conn, params=[city])
    conn.close()
    return trend
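The trend DataFrame plots directly with matplotlib. A minimal sketch (the function name and layout choices are my own; it reads the date from the first column and the rent metric from the second, whatever it's called):

```python
import matplotlib
matplotlib.use("Agg")  # Render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd


def plot_trend(trend, city, out_path="trend.png"):
    """Plot a rent metric over time; expects date in column 0, rent in column 1."""
    dates = pd.to_datetime(trend.iloc[:, 0])
    rents = trend.iloc[:, 1]
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.plot(dates, rents, marker="o")
    ax.set_title(f"Rent Over Time: {city}")
    ax.set_ylabel("Monthly Rent ($)")
    fig.autofmt_xdate()
    fig.tight_layout()
    fig.savefig(out_path, dpi=150)
    plt.close(fig)
```

One point per daily snapshot means the line stays noisy for the first few weeks; the trend gets readable once a month or two of data accumulates.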

Handling Scale with Proxies

Scraping multiple cities means thousands of requests. Services like ThorData provide residential proxies that help you avoid IP blocks, while ScrapeOps monitors your scraper's success rate across different sites.
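If you manage a proxy pool yourself instead of routing through an API, a simple round-robin rotator spreads requests across it. A sketch, where the proxy URLs are placeholders for whatever your provider gives you:

```python
from itertools import cycle

import requests

# Placeholder endpoints; substitute your provider's actual proxy URLs
PROXY_POOL = cycle([
    "http://user:pass@proxy1.example.com:8000",
    "http://user:pass@proxy2.example.com:8000",
])


def fetch_via_proxy(url, timeout=30):
    """Route each request through the next proxy in the pool."""
    proxy = next(PROXY_POOL)
    return requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=timeout)
```

`itertools.cycle` gives even distribution with no bookkeeping; a production setup would also evict proxies that start failing.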

Visualization Dashboard

def plot_city_comparison(data):
    """Create box plot comparing rent distributions."""
    fig, ax = plt.subplots(figsize=(12, 6))
    data.boxplot(column="rent", by="city_name", ax=ax)
    ax.set_title("Rent Distribution by City")
    ax.set_ylabel("Monthly Rent ($)")
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.savefig("rent_comparison.png", dpi=150)
    plt.show()

Key Insights You Can Extract

  • Seasonal patterns: Rent typically drops 5-10% in winter months
  • Neighborhood arbitrage: Adjacent neighborhoods can differ by 30%+
  • Supply signals: Rising listing counts often precede price drops
  • Amenity premiums: Calculate exactly how much "in-unit laundry" adds
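The seasonal pattern, for instance, falls out of the accumulated snapshots with a month-level groupby. A sketch, assuming a DataFrame with the `snapshot_date` and `rent` columns used above:

```python
import pandas as pd


def monthly_medians(df):
    """Median rent by calendar month from accumulated snapshots."""
    df = df.copy()
    df["month"] = pd.to_datetime(df["snapshot_date"]).dt.month
    return df.groupby("month")["rent"].median()
```

Comparing the January and July medians for the same city gives you a concrete number to bring to a winter lease negotiation.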

Conclusion

Rental market data scraping gives you an information edge whether you're renting, investing, or building a proptech product. Start with one city, automate daily snapshots, and let the trends reveal themselves over weeks of data collection.
