agenthustler

Posted on Mar 26

Scraping Airbnb: Listings, Prices, and Availability Data

#python #tutorial #webdev #programming

Scraping Airbnb: Listings, Prices, and Availability Data

Airbnb data is gold for real estate investors, travel startups, and market researchers. In this guide, I'll show you how to extract listings, prices, and availability data from Airbnb using Python.

Why Scrape Airbnb?

Market research: Compare pricing across neighborhoods
Investment analysis: Track occupancy rates and revenue potential
Travel apps: Build price comparison tools
Academic research: Study short-term rental market trends

Setting Up Your Environment

pip install requests beautifulsoup4 pandas

Basic Scraper Structure

import requests
from bs4 import BeautifulSoup
import json
import time

class AirbnbScraper:
    def __init__(self, api_key=None):
        self.session = requests.Session()
        self.api_key = api_key
        self.base_url = "https://www.airbnb.com"

    def get_page(self, url):
        """Fetch page with optional proxy rotation."""
        if self.api_key:
            # Use ScraperAPI for reliable scraping
            proxy_url = f"http://api.scraperapi.com?api_key={self.api_key}&url={url}"
            response = self.session.get(proxy_url)
        else:
            response = self.session.get(url, headers={
                "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
            })
        return response

    def search_listings(self, location, checkin, checkout):
        """Search for listings in a specific location."""
        params = {
            "query": location,
            "checkin": checkin,
            "checkout": checkout
        }
        url = f"{self.base_url}/s/{location}/homes"
        response = self.get_page(url)
        return self.parse_listings(response.text)

    def parse_listings(self, html):
        """Extract listing data from search results."""
        soup = BeautifulSoup(html, "html.parser")
        listings = []

        # Airbnb uses JSON-LD for structured data
        scripts = soup.find_all("script", type="application/json")
        for script in scripts:
            try:
                data = json.loads(script.string)
                if "searchResults" in str(data):
                    listings.extend(self.extract_from_json(data))
            except (json.JSONDecodeError, TypeError):
                continue

        return listings

    def extract_from_json(self, data):
        """Parse structured listing data."""
        results = []
        # Navigate nested JSON structure
        if isinstance(data, dict):
            for key, value in data.items():
                if key == "listing":
                    results.append({
                        "title": value.get("name", ""),
                        "price": value.get("price", {}).get("amount", 0),
                        "currency": value.get("price", {}).get("currency", "USD"),
                        "rating": value.get("avgRating", 0),
                        "reviews": value.get("reviewsCount", 0),
                        "type": value.get("roomType", ""),
                        "beds": value.get("beds", 0),
                        "lat": value.get("lat", 0),
                        "lng": value.get("lng", 0)
                    })
                elif isinstance(value, (dict, list)):
                    results.extend(self.extract_from_json(value))
        elif isinstance(data, list):
            for item in data:
                results.extend(self.extract_from_json(item))
        return results

# Usage
scraper = AirbnbScraper()
listings = scraper.search_listings("Miami-FL", "2025-06-01", "2025-06-07")
for listing in listings:
    print(f"{listing['title']} - ${listing['price']}/night")

Tracking Price Changes Over Time

import pandas as pd
from datetime import datetime

def track_prices(scraper, listing_ids, days=30):
    """Monitor price changes for specific listings."""
    price_history = []

    for listing_id in listing_ids:
        url = f"https://www.airbnb.com/rooms/{listing_id}"
        response = scraper.get_page(url)
        price = extract_price(response.text)

        price_history.append({
            "listing_id": listing_id,
            "price": price,
            "timestamp": datetime.now().isoformat()
        })
        time.sleep(2)  # Be respectful

    df = pd.DataFrame(price_history)
    df.to_csv("airbnb_prices.csv", mode="a", index=False)
    return df

def extract_price(html):
    """Pull nightly price from listing page."""
    soup = BeautifulSoup(html, "html.parser")
    price_el = soup.find("span", class_="_tyxjp1")
    if price_el:
        return float(price_el.text.replace("$", "").replace(",", ""))
    return None

Handling Anti-Scraping Measures

Airbnb has aggressive bot detection. Here's how to handle it:

Use a proxy service: ScraperAPI handles rotation, CAPTCHAs, and headers automatically
Rotate user agents: Keep a pool of realistic browser strings
Respect rate limits: Add delays between requests
Use residential proxies: ThorData offers residential IPs that bypass geo-restrictions

# Using ScraperAPI for reliable Airbnb scraping
import requests

API_KEY = "your_scraperapi_key"  # Get one at scraperapi.com
url = "https://www.airbnb.com/s/New-York/homes"

response = requests.get(
    "http://api.scraperapi.com",
    params={"api_key": API_KEY, "url": url, "render": "true"}
)

Storing and Analyzing Data

import pandas as pd
import matplotlib.pyplot as plt

# Load scraped data
df = pd.read_csv("airbnb_prices.csv")

# Average price by neighborhood
avg_prices = df.groupby("neighborhood")["price"].mean().sort_values()
print(avg_prices)

# Price distribution
df["price"].hist(bins=50)
plt.xlabel("Price per Night ($)")
plt.ylabel("Number of Listings")
plt.title("Airbnb Price Distribution")
plt.savefig("price_distribution.png")

Ethical Considerations

Always check Airbnb's Terms of Service
Don't scrape personal host information
Rate-limit your requests to avoid server strain
Use data responsibly and in compliance with local regulations
Consider using official APIs where available

Scaling Up

For production scraping, consider using ScrapeOps to monitor your scrapers, track success rates, and get alerts when things break.

Conclusion

Airbnb scraping opens up powerful market intelligence for real estate, travel, and research applications. Start small, respect the platform, and scale gradually with proper proxy infrastructure.

Found this useful? Follow me for more web scraping tutorials!

DEV Community

Scraping Airbnb: Listings, Prices, and Availability Data

Scraping Airbnb: Listings, Prices, and Availability Data

Why Scrape Airbnb?

Setting Up Your Environment

Basic Scraper Structure

Tracking Price Changes Over Time

Handling Anti-Scraping Measures

Storing and Analyzing Data

Ethical Considerations

Scaling Up

Conclusion

Top comments (0)