E-commerce pricing is rarely static. On sites like Crate & Barrel, the cost of a sofa or dining set can fluctuate significantly based on seasonal sales, clearance events, or inventory shifts. Checking a page manually every day is a chore you're likely to forget, often missing the sale entirely.
This guide covers building an automated Crate & Barrel price monitor. We’ll use an open-source Playwright scraper to extract data, build a wrapper to track price history, and create a visual dashboard using Pandas and Matplotlib. By the end, you’ll have a system that watches prices and visualizes trends so you can buy when the price is lowest.
Prerequisites & Setup
We’ll use the Crateandbarrel.com-Scrapers repository as a foundation.
1. Clone the Repository
Open your terminal and clone the base scrapers:
git clone https://github.com/scraper-bank/Crateandbarrel.com-Scrapers.git
cd Crateandbarrel.com-Scrapers
2. Install Dependencies
You'll need Python 3.8 or higher. Install the libraries for browser automation and data visualization:
pip install playwright pandas matplotlib playwright-stealth
playwright install chromium
3. Get a ScrapeOps API Key
Crate & Barrel uses anti-bot measures that often block standard headless browsers. To ensure the monitor runs reliably, we’ll use ScrapeOps to handle proxy rotation and browser headers.
You can sign up for a free ScrapeOps API key here.
Step 1: Understanding the Base Scraper
In the cloned repository, navigate to the Playwright implementation:
python/playwright/product_data/scraper/crateandbarrel_scraper_product_data_v1.py.
This script handles the heavy lifting of data extraction. It uses a PROXY_CONFIG dictionary to route requests through ScrapeOps:
```python
# python/playwright/product_data/scraper/crateandbarrel_scraper_product_data_v1.py
PROXY_CONFIG = {
    "server": "http://residential-proxy.scrapeops.io:8181",
    "username": "scrapeops",
    "password": "YOUR_SCRAPEOPS_API_KEY"
}
```
The core of this scraper is the extract_data function. It parses JSON-LD (structured data embedded in the page) to find the most accurate price, name, and availability. If the price drops, the scraper captures both the current price and the preDiscountPrice.
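To make the JSON-LD approach concrete, here is a simplified sketch of how such an extractor might work. This is an illustration, not the repository's actual implementation; the sample HTML and field names are assumptions based on the schema.org `Product` format:

```python
import json
import re

# Hypothetical sample of the JSON-LD block a product page embeds.
SAMPLE_HTML = """
<script type="application/ld+json">
{"@type": "Product", "name": "Lounge Deep Sofa",
 "offers": {"price": "1899.00", "availability": "https://schema.org/InStock"}}
</script>
"""

def parse_json_ld(html):
    """Pull the first JSON-LD Product block out of raw HTML."""
    match = re.search(
        r'<script type="application/ld\+json">(.*?)</script>',
        html, re.DOTALL,
    )
    if not match:
        return None
    data = json.loads(match.group(1))
    if data.get("@type") != "Product":
        return None
    offers = data.get("offers", {})
    return {
        "name": data.get("name"),
        "price": float(offers.get("price", 0)),
        "availability": offers.get("availability"),
    }
```

Because JSON-LD is machine-readable structured data, parsing it is far more robust than scraping prices out of styled HTML elements, which change whenever the page design does.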
Step 2: Configuring the Target List
Instead of scraping the whole site, we'll focus on specific items. Create a configuration file named products.py to store your target URLs.
```python
# products.py
TARGET_PRODUCTS = [
    "https://www.crateandbarrel.com/lounge-deep-3-piece-sectional-sofa/s533475",
    "https://www.crateandbarrel.com/tate-walnut-gateleg-dining-table/s454238",
    "https://www.crateandbarrel.com/direction-ash-wood-floor-lamp/s582497"
]
```
Step 3: Building the Monitor Script
The base scraper runs once and saves a JSONL file. For a price tracker, we need a wrapper that appends data to a master CSV file every time it runs. Create monitor.py and import the logic from the repository.
```python
# monitor.py
import asyncio
import os
from datetime import datetime

import pandas as pd
from playwright.async_api import async_playwright

from products import TARGET_PRODUCTS
from python.playwright.product_data.scraper.crateandbarrel_scraper_product_data_v1 import extract_data, PROXY_CONFIG

async def run_monitor():
    results = []
    scrape_date = datetime.now().strftime("%Y-%m-%d")

    async with async_playwright() as p:
        # Launch browser with ScrapeOps proxy
        browser = await p.chromium.launch(proxy=PROXY_CONFIG)
        page = await browser.new_page()

        for url in TARGET_PRODUCTS:
            print(f"Checking price for: {url}")
            await page.goto(url, wait_until="domcontentloaded")
            data = await extract_data(page)
            if data:
                results.append({
                    "date": scrape_date,
                    "name": data.name,
                    "price": data.price,
                    "url": url
                })
        await browser.close()

    # Save to CSV, appending if the history file already exists
    df_new = pd.DataFrame(results)
    file_path = "price_history.csv"
    if not os.path.isfile(file_path):
        df_new.to_csv(file_path, index=False)
    else:
        df_new.to_csv(file_path, mode='a', header=False, index=False)
    print(f"Updated price history for {len(results)} items.")

if __name__ == "__main__":
    asyncio.run(run_monitor())
```
By appending to price_history.csv, we create a time-series dataset. Each row contains the date, product name, and price, which serves as the foundation for the dashboard.
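One caveat: if the monitor runs more than once on the same day, the CSV accumulates duplicate rows, which distorts the chart. A small cleanup pass with pandas (a suggested addition, not part of the base repository) keeps one row per product per day:

```python
import pandas as pd

def dedupe_history(path="price_history.csv"):
    """Keep only the most recent row per (date, name) pair
    and write the cleaned data back to the same file."""
    df = pd.read_csv(path)
    df = df.drop_duplicates(subset=["date", "name"], keep="last")
    df.to_csv(path, index=False)
    return df
```

Calling `dedupe_history()` at the end of `run_monitor()` would keep the file tidy automatically.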
Step 4: Automating the Run
To catch daily price changes, the script needs to run regularly.
Option A: Simple Loop
For quick testing, wrap the execution in a loop:
```python
# monitor.py (testing only)
import time

if __name__ == "__main__":
    while True:
        asyncio.run(run_monitor())
        print("Waiting 24 hours for next check...")
        time.sleep(86400)
```

Note the added `import time` at the top; without it, the loop crashes on the first sleep.
Option B: Cron Job
In production, use a system scheduler like Cron. Because `monitor.py` uses relative paths for both its imports and the CSV output, change into the project directory before running it. To run the monitor every morning at 8:00 AM, add this to your crontab (`crontab -e`):

```shell
0 8 * * * cd /path/to/Crateandbarrel.com-Scrapers && /usr/bin/python3 monitor.py
```
Step 5: Visualizing the Data
We can turn the CSV file into a trend chart. Create dashboard.py using Pandas to clean the data and Matplotlib to plot it.
```python
# dashboard.py
import os

import matplotlib.pyplot as plt
import pandas as pd

def generate_dashboard():
    if not os.path.exists("price_history.csv"):
        print("No data found. Run the monitor first!")
        return

    df = pd.read_csv("price_history.csv")
    df['date'] = pd.to_datetime(df['date'])
    # Coerce malformed values (e.g. blank scrapes) to NaN instead of crashing
    df['price'] = pd.to_numeric(df['price'], errors='coerce')
    df = df.dropna(subset=['price'])

    plt.figure(figsize=(10, 6))
    for product in df['name'].unique():
        product_data = df[df['name'] == product]
        plt.plot(product_data['date'], product_data['price'], marker='o', label=product)

    plt.title("Crate & Barrel Price Trends")
    plt.xlabel("Date")
    plt.ylabel("Price ($)")
    plt.legend(loc='upper left', bbox_to_anchor=(1, 1))
    plt.grid(True, linestyle='--', alpha=0.7)
    plt.tight_layout()
    plt.savefig("price_dashboard.png")
    plt.show()

if __name__ == "__main__":
    generate_dashboard()
```
This script groups data by product name. If you track three items, you’ll see three lines. When Crate & Barrel launches a sale, a sharp dip in the line chart is your signal to buy.
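You can also detect dips programmatically instead of eyeballing the chart. The helper below (a suggested extension, not part of the repository) flags any product whose latest price sits at, or within a small tolerance of, its historical low:

```python
import pandas as pd

def find_deals(df, tolerance=0.01):
    """Return products whose latest recorded price is at (or within
    `tolerance` of) their historical minimum."""
    deals = []
    for name, group in df.groupby("name"):
        group = group.sort_values("date")
        latest = group["price"].iloc[-1]
        low = group["price"].min()
        if latest <= low * (1 + tolerance):
            deals.append({"name": name, "latest": latest, "low": low})
    return deals
```

Run it against the same DataFrame the dashboard loads, and print (or later email) whatever it returns.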
Practical Tips

- Anti-Bot Protection: Crate & Barrel uses advanced detection. Running this script without a proxy, or with basic `requests`, will likely result in a 403 Forbidden error. Playwright, `playwright-stealth`, and ScrapeOps residential proxies are necessary for reliable monitoring.
- Handling Out-of-Stock: The base scraper extracts an `availability` field. You can modify `monitor.py` to record when an item is out of stock, as prices often change when items are replenished.
- Data Integrity: If Crate & Barrel changes their page layout, the CSS selectors might break. Periodically check `price_history.csv` to ensure prices aren't returning as `0.0` or `None`.
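That last integrity check is easy to automate. This sketch (an assumed helper; adjust `min_price` to your catalog) surfaces rows whose price is missing or implausibly low, which usually signals a broken selector:

```python
import pandas as pd

def flag_bad_rows(path="price_history.csv", min_price=1.0):
    """Return rows whose price is missing or below `min_price`,
    so you can spot broken selectors before they pollute the chart."""
    df = pd.read_csv(path)
    df["price"] = pd.to_numeric(df["price"], errors="coerce")
    return df[df["price"].isna() | (df["price"] < min_price)]
```

An empty result means the history file is healthy; anything else is worth investigating before you trust the trend lines.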
To Wrap Up
A custom price monitor provides a data-driven advantage. Instead of relying on luck, you can track the actual market behavior of the furniture you want.
Key Takeaways:
- Playwright handles dynamic content and JSON-LD extraction better than static scrapers.
- ScrapeOps bypasses anti-bot walls that would otherwise block automated checks.
- Pandas simplifies turning a CSV into a time-series analysis.
Try expanding this project by adding an email notification system using Python's smtplib. You can set a target price in products.py and have the script email you the moment the price hits your threshold.
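As a starting point for that extension, here is a minimal alert sketch using `smtplib`. The SMTP host, sender, recipient, and the `TARGET_PRICES` mapping are all placeholders you would supply yourself:

```python
import smtplib
from email.message import EmailMessage

# Hypothetical per-product thresholds you might keep in products.py.
TARGET_PRICES = {
    "https://www.crateandbarrel.com/lounge-deep-3-piece-sectional-sofa/s533475": 1500.00,
}

def check_thresholds(results):
    """Return the subset of scraped results that beat their target price."""
    return [r for r in results
            if r["url"] in TARGET_PRICES and r["price"] <= TARGET_PRICES[r["url"]]]

def send_price_alert(name, price, url, smtp_host="smtp.example.com",
                     sender="monitor@example.com", recipient="you@example.com"):
    """Email a one-line alert for a single price hit."""
    msg = EmailMessage()
    msg["Subject"] = f"Price alert: {name} is now ${price:.2f}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"{name} dropped to ${price:.2f}\n{url}")
    with smtplib.SMTP(smtp_host) as server:
        server.send_message(msg)
```

Call `check_thresholds(results)` at the end of `run_monitor()` and loop over the hits with `send_price_alert` to close the loop from scraper to inbox.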