Pinterest is a goldmine of visual content data — from trending design ideas to product inspiration. Whether you're building a visual content aggregator, doing market research, or analyzing design trends, extracting Pinterest data programmatically can save you hundreds of hours.
In this guide, I'll show you how to scrape Pinterest boards and pins using Python.
Why Scrape Pinterest?
Pinterest has over 450 million monthly active users pinning images across millions of boards. This data is valuable for:
- Market research: See what products and designs are trending
- Content strategy: Analyze what visual content gets the most engagement
- Competitive analysis: Track competitor boards and pin strategies
- Image dataset building: Collect visual data for ML projects
Setting Up Your Environment
```python
import requests
from bs4 import BeautifulSoup
import json
import time

# Use a residential proxy to avoid blocks
PROXY = {
    "http": "http://your-proxy:port",
    "https": "http://your-proxy:port",
}

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
}
```
For reliable proxy rotation, I recommend ThorData residential proxies — they have excellent success rates for image-heavy sites like Pinterest.
Extracting Board Data
Pinterest loads content dynamically, so we need to target their internal API endpoints:
```python
def scrape_pinterest_board(board_url):
    """Extract pins from a Pinterest board."""
    response = requests.get(board_url, headers=HEADERS, proxies=PROXY)
    soup = BeautifulSoup(response.text, "html.parser")

    # Pinterest embeds data in script tags
    scripts = soup.find_all("script", {"type": "application/json"})
    pins = []
    for script in scripts:
        try:
            data = json.loads(script.string)
            # Navigate the nested structure to find pin data
            if "props" in data:
                pin_data = extract_pins_from_json(data)
                pins.extend(pin_data)
        except (json.JSONDecodeError, TypeError):
            continue
    return pins
```
```python
def extract_pins_from_json(data):
    """Parse pin details from Pinterest's JSON data."""
    pins = []

    # Recursively search for pin objects
    def find_pins(obj):
        if isinstance(obj, dict):
            if "images" in obj and "description" in obj:
                pins.append({
                    "description": obj.get("description", ""),
                    "image_url": obj.get("images", {}).get("orig", {}).get("url", ""),
                    "link": obj.get("link", ""),
                    "repin_count": obj.get("repin_count", 0),
                    "comment_count": obj.get("comment_count", 0),
                })
            for value in obj.values():
                find_pins(value)
        elif isinstance(obj, list):
            for item in obj:
                find_pins(item)

    find_pins(data)
    return pins
```
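Because Pinterest's embedded JSON is deeply nested and its exact shape shifts often, it helps to sanity-check the recursive search against a mock payload before pointing it at live pages. The nested structure below is invented for illustration; only the "has both `images` and `description`" heuristic mirrors the extractor above.

```python
# Mock payload: one object that looks like a pin, one that doesn't.
# Pinterest's real JSON layout differs and changes frequently.
sample = {
    "props": {
        "initialReduxState": {
            "pins": [
                {
                    "description": "Mid-century living room",
                    "images": {"orig": {"url": "https://i.pinimg.com/originals/abc.jpg"}},
                    "link": "https://example.com/product",
                    "repin_count": 42,
                },
                {"unrelated": "no pin fields here"},
            ]
        }
    }
}

def extract_pins(data):
    """Collect dicts that look like pins (have both 'images' and 'description')."""
    pins = []

    def walk(obj):
        if isinstance(obj, dict):
            if "images" in obj and "description" in obj:
                pins.append({
                    "description": obj.get("description", ""),
                    "image_url": obj.get("images", {}).get("orig", {}).get("url", ""),
                })
            for value in obj.values():
                walk(value)
        elif isinstance(obj, list):
            for item in obj:
                walk(item)

    walk(data)
    return pins

pins = extract_pins(sample)
print(len(pins))  # 1
```

Only the first object is collected: the second lacks the two marker keys, so the walk passes over it without appending anything.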
Downloading Pin Images
```python
import os

def download_pin_images(pins, output_dir="pinterest_images"):
    os.makedirs(output_dir, exist_ok=True)
    for i, pin in enumerate(pins):
        if pin["image_url"]:
            try:
                img_response = requests.get(pin["image_url"], timeout=10)
                filepath = os.path.join(output_dir, f"pin_{i}.jpg")
                with open(filepath, "wb") as f:
                    f.write(img_response.content)
                print(f"Downloaded pin {i}: {pin['description'][:50]}")
                time.sleep(1)  # Be respectful
            except Exception as e:
                print(f"Failed to download pin {i}: {e}")
```
Handling Pagination
Pinterest uses infinite scroll, so pins load in batches as you scroll. The function below assumes you are calling one of Pinterest's internal JSON feed endpoints rather than the HTML board page (note the `response.json()` call), and it follows the bookmark cursor returned in each response:
```python
def scrape_full_board(board_url, max_pins=500):
    """Scrape all pins from a board with pagination."""
    all_pins = []
    bookmark = None

    while len(all_pins) < max_pins:
        params = {"bookmark": bookmark} if bookmark else {}
        response = requests.get(
            board_url,
            headers=HEADERS,
            params=params,
            proxies=PROXY,
        )
        if response.status_code != 200:
            print(f"Request failed: {response.status_code}")
            break

        data = response.json()
        pins = data.get("resource_response", {}).get("data", [])
        if not pins:
            break

        all_pins.extend(pins)
        bookmark = data.get("resource", {}).get("options", {}).get("bookmark")
        if not bookmark or bookmark == "-end-":
            break

        time.sleep(2)  # Rate limiting

    return all_pins[:max_pins]
```
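Bookmark logic is easy to get subtly wrong, so it is worth exercising the loop against a stubbed fetcher before touching the network. `fake_fetch` and its two pages are invented for this sketch, but the termination conditions (empty page, missing bookmark, the `-end-` sentinel) mirror the function above.

```python
# Two fake "API pages" keyed by bookmark; None is the first request.
PAGES = {
    None: {"data": [{"id": 1}, {"id": 2}], "bookmark": "page2"},
    "page2": {"data": [{"id": 3}], "bookmark": "-end-"},
}

def fake_fetch(bookmark):
    """Stand-in for the HTTP request: returns a pre-baked page dict."""
    return PAGES[bookmark]

def collect_pins(fetch, max_pins=500):
    """Same bookmark-following loop, with the transport injected."""
    all_pins, bookmark = [], None
    while len(all_pins) < max_pins:
        page = fetch(bookmark)
        pins = page.get("data", [])
        if not pins:
            break
        all_pins.extend(pins)
        bookmark = page.get("bookmark")
        if not bookmark or bookmark == "-end-":
            break
    return all_pins[:max_pins]

print(collect_pins(fake_fetch))  # [{'id': 1}, {'id': 2}, {'id': 3}]
```

Injecting the fetch function also makes the real scraper easier to unit-test later: swap `fake_fetch` for a closure around `requests.get`.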
The Easy Way: Use a Pre-Built Scraper
Building and maintaining a Pinterest scraper is complex — Pinterest frequently changes its frontend structure and internal API endpoints, and each change can silently break your parsing code.
For production use, I recommend the Pinterest Scraper on Apify. It handles all the complexity — dynamic rendering, pagination, proxy rotation, and anti-bot bypasses — so you can focus on analyzing the data instead of fighting with selectors.
Storing Results
```python
import csv

def save_to_csv(pins, filename="pinterest_data.csv"):
    if not pins:
        return
    keys = pins[0].keys()
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=keys)
        writer.writeheader()
        writer.writerows(pins)
    print(f"Saved {len(pins)} pins to {filename}")
```
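A quick round-trip check of this CSV approach, using invented sample rows. One thing to keep in mind: `csv.DictReader` hands every field back as a string, so numeric fields like `repin_count` need an `int()` when you load the data again.

```python
import csv
import os
import tempfile

# Sample rows invented for illustration.
sample_pins = [
    {"description": "Boho bedroom ideas", "image_url": "https://i.pinimg.com/1.jpg", "repin_count": 12},
    {"description": "Minimalist desk setup", "image_url": "https://i.pinimg.com/2.jpg", "repin_count": 7},
]

def save_to_csv(pins, filename):
    """Write a list of flat dicts to CSV, using the first row's keys as columns."""
    if not pins:
        return
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=pins[0].keys())
        writer.writeheader()
        writer.writerows(pins)

path = os.path.join(tempfile.gettempdir(), "pinterest_data.csv")
save_to_csv(sample_pins, path)

# Read it back: every value comes back as a string.
with open(path, newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

print(len(rows))  # 2
```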
Best Practices
- Respect rate limits: Add delays between requests (2-5 seconds minimum)
- Use proxies: Pinterest actively blocks scrapers. ThorData provides reliable residential proxies perfect for this
- Cache results: Don't re-scrape data you already have
- Check robots.txt: Always review the site's robots.txt before scraping
- Handle errors gracefully: Network issues are common — implement retries with exponential backoff
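The last point can be sketched as a small wrapper. The helper name and delay schedule here are my own, not part of any library; the injectable `sleep` makes the backoff schedule checkable without actually waiting.

```python
import time

def with_retries(fn, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on any exception with doubling delays."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

# Demo: a flaky function that fails twice before succeeding.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated network hiccup")
    return "ok"

delays = []
result = with_retries(flaky, sleep=delays.append)
print(result, delays)  # ok [1.0, 2.0]
```

In real use you would pass a closure around your request, e.g. `with_retries(lambda: requests.get(url, timeout=10))`, and probably catch only network-related exceptions rather than bare `Exception`.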
Conclusion
Pinterest scraping opens up powerful possibilities for visual content analysis, market research, and data collection. Whether you build your own scraper or use a managed solution like the Pinterest Scraper on Apify, always scrape responsibly and respect the platform's terms of service.
Happy scraping!