Understanding rental market trends helps renters negotiate better deals and helps investors identify opportunities. Let's build a rental data scraper that tracks prices across major cities.
## The Rental Data Landscape
Sites like Zillow, Apartments.com, and Craigslist have massive rental listing datasets. By scraping and analyzing this data over time, we can identify trends like seasonal price drops, neighborhood gentrification signals, and fair market rent for any area.
## Setting Up the Scraper

```bash
pip install requests beautifulsoup4 pandas matplotlib
```
We'll use ScraperAPI for reliable access to rental sites that have aggressive bot detection:
```python
import requests
from bs4 import BeautifulSoup
import pandas as pd

API_KEY = "YOUR_SCRAPERAPI_KEY"

def scrape_listings(city, state, max_pages=5):
    """Scrape rental listings for a given city."""
    listings = []
    for page in range(1, max_pages + 1):
        url = f"https://www.apartments.com/{city}-{state}/{page}/"
        resp = requests.get(
            "http://api.scraperapi.com",
            params={"api_key": API_KEY, "url": url, "render": "true"},
            timeout=60,
        )
        soup = BeautifulSoup(resp.text, "html.parser")
        for card in soup.select(".placard"):
            name = card.select_one(".property-title")
            price = card.select_one(".property-pricing")
            beds = card.select_one(".bed-range")
            if name and price:
                listings.append({
                    "name": name.text.strip(),
                    "price": price.text.strip(),
                    "beds": beds.text.strip() if beds else "N/A",
                    "city": city,
                    "state": state,
                })
    return listings
```
## Cleaning and Normalizing Price Data
Rental listings have messy pricing (ranges, "Call for Rent", etc.):
```python
import re

def parse_price(price_str):
    """Extract a numeric rent from formats like "$1,200 - $1,800"."""
    price_str = price_str.replace(",", "")
    matches = re.findall(r"\$(\d+)", price_str)
    if matches:
        prices = [int(m) for m in matches]
        return sum(prices) / len(prices)  # average the endpoints of a range
    return None  # "Call for Rent" and similar

def clean_listings(listings):
    """Clean and normalize listing data."""
    df = pd.DataFrame(listings)
    df["rent"] = df["price"].apply(parse_price)
    df = df.dropna(subset=["rent"])
    df = df[df["rent"].between(300, 10000)]  # drop implausible outliers
    return df
```
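The `beds` field is just as messy. A small helper can normalize it to a number; this is a sketch, and the sample strings ("Studio", "1-2 Beds") are assumptions based on typical Apartments.com formats:

```python
import re

def parse_beds(beds_str):
    """Normalize bed strings like "Studio", "1 Bed", "1-2 Beds" to a number.
    Studio counts as 0; ranges resolve to the low end."""
    if not beds_str or beds_str == "N/A":
        return None
    if "studio" in beds_str.lower():
        return 0
    match = re.search(r"\d+", beds_str)
    return int(match.group()) if match else None

print(parse_beds("Studio"))    # 0
print(parse_beds("1-2 Beds"))  # 1
```

Taking the low end of a range is a deliberately conservative choice; you could just as easily average it, as `parse_price` does.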
## Analyzing Trends Across Cities

```python
import matplotlib.pyplot as plt

def compare_cities(cities):
    """Scrape and compare median rent across cities."""
    all_data = []
    for city, state in cities:
        listings = scrape_listings(city, state)
        df = clean_listings(listings)
        df["city_name"] = f"{city}, {state}"
        all_data.append(df)
    combined = pd.concat(all_data)
    summary = combined.groupby("city_name")["rent"].agg(
        ["median", "mean", "count"]
    ).round(0)
    print(summary.sort_values("median"))
    return combined

cities = [
    ("new-york", "ny"), ("austin", "tx"),
    ("denver", "co"), ("chicago", "il"),
    ("portland", "or"), ("miami", "fl"),
]
data = compare_cities(cities)
```
## Tracking Trends Over Time
Store daily snapshots to a SQLite database:
```python
import sqlite3
from datetime import date

def save_snapshot(df, db_path="rental_data.db"):
    """Save a daily rental data snapshot."""
    conn = sqlite3.connect(db_path)
    df["snapshot_date"] = date.today().isoformat()
    df.to_sql("listings", conn, if_exists="append", index=False)
    conn.close()

def get_trend(city, db_path="rental_data.db"):
    """Get the average rent trend for a city.
    SQLite has no built-in MEDIAN aggregate, so we track the mean here."""
    conn = sqlite3.connect(db_path)
    query = """
        SELECT snapshot_date, AVG(rent) AS avg_rent
        FROM listings
        WHERE city = ?
        GROUP BY snapshot_date
        ORDER BY snapshot_date
    """
    trend = pd.read_sql(query, conn, params=[city])
    conn.close()
    return trend
```
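Because SQLite lacks a MEDIAN aggregate, the SQL query can only give you a mean, which is skewed by luxury listings. If you want a true median trend, one approach (a sketch, reusing the `listings` table schema above) is to pull the raw rows and aggregate in pandas:

```python
import sqlite3
import pandas as pd

def get_median_trend(city, db_path="rental_data.db"):
    """Pull raw rents for a city and compute a per-day median in pandas."""
    conn = sqlite3.connect(db_path)
    df = pd.read_sql(
        "SELECT snapshot_date, rent FROM listings WHERE city = ?",
        conn, params=[city],
    )
    conn.close()
    return (
        df.groupby("snapshot_date")["rent"]
        .median()
        .reset_index(name="median_rent")
    )
```

This trades a larger result set for a more robust statistic; with daily snapshots of a few hundred listings per city, the extra transfer is negligible.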
## Handling Scale with Proxies

Scraping multiple cities means thousands of requests. Services like ThorData provide residential proxies that reduce the chance of being blocked, while ScrapeOps helps monitor your scraper's success rate across different sites.
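Whichever provider you use, the client-side pattern is the same: rotate through a proxy pool and retry on failure. Here is a minimal sketch; the `PROXIES` endpoints are placeholders, not real provider URLs:

```python
import random
import requests

# Hypothetical proxy pool -- substitute real endpoints from your provider.
PROXIES = [
    "http://user:pass@proxy1.example.com:8000",
    "http://user:pass@proxy2.example.com:8000",
]

def fetch_with_proxy(url, max_retries=3):
    """Try the request through a randomly chosen proxy, retrying on failure."""
    for _ in range(max_retries):
        proxy = random.choice(PROXIES)
        try:
            resp = requests.get(
                url,
                proxies={"http": proxy, "https": proxy},
                timeout=30,
            )
            if resp.status_code == 200:
                return resp
        except requests.RequestException:
            continue  # rotate to another proxy on the next attempt
    return None
```

Adding a short random sleep between retries is also worth doing to avoid hammering a site from several IPs at once.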
## Visualization Dashboard
```python
def plot_city_comparison(data):
    """Create a box plot comparing rent distributions."""
    fig, ax = plt.subplots(figsize=(12, 6))
    data.boxplot(column="rent", by="city_name", ax=ax)
    ax.set_title("Rent Distribution by City")
    ax.set_ylabel("Monthly Rent ($)")
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.savefig("rent_comparison.png", dpi=150)
    plt.show()
```
## Key Insights You Can Extract
- **Seasonal patterns:** Rents typically dip 5-10% in the winter months
- **Neighborhood arbitrage:** Adjacent neighborhoods can differ by 30%+
- **Supply signals:** Rising listing counts often precede price drops
- **Amenity premiums:** Estimate how much "in-unit laundry" adds to rent
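As an example of the last point, an amenity premium can be estimated with a simple groupby, assuming you have extracted a boolean amenity flag from the listing details (the `has_laundry` column here is hypothetical; the scraper above does not collect it):

```python
import pandas as pd

def amenity_premium(df, amenity_col="has_laundry"):
    """Median rent gap between listings with and without an amenity flag."""
    medians = df.groupby(amenity_col)["rent"].median()
    return float(medians.get(True, float("nan")) - medians.get(False, float("nan")))

sample = pd.DataFrame({
    "has_laundry": [True, True, False, False],  # hypothetical amenity flag
    "rent": [1300, 1500, 1000, 1100],
})
print(amenity_premium(sample))  # 350.0
```

A raw median gap conflates the amenity with building quality and location, so treat it as a ceiling on the true premium rather than a causal estimate.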
## Conclusion
Rental market data scraping gives you an information edge whether you're renting, investing, or building a proptech product. Start with one city, automate daily snapshots, and let the trends reveal themselves over weeks of data collection.