Web scraping lets you automatically collect data from web pages, whether the content sits in the static HTML or is rendered by JavaScript. Here's how to do it with Python.
Basic Scraping with BeautifulSoup
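import requests
from bs4 import BeautifulSoup

def scrape_prices(url):
    # Send a browser-like User-Agent; some sites reject requests without one
    headers = {'User-Agent': 'Mozilla/5.0'}
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()  # Fail early on 4xx/5xx responses
    soup = BeautifulSoup(response.content, 'html.parser')
    # Collect the text of every <span class="price"> element
    return [item.get_text(strip=True) for item in soup.find_all('span', class_='price')]

prices = scrape_prices('https://example-shop.com')
print(f'Found {len(prices)} prices')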
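The scraped prices come back as raw strings, so they usually need cleaning before you can compare or sort them. The following is a minimal sketch, assuming the text looks like "$1,299.00" (comma as thousands separator, dot as decimal point). The helper name `parse_price` is hypothetical and not part of the original code.

```python
import re
from decimal import Decimal

# Assumption: prices use "," for thousands and "." for decimals, e.g. "$1,299.00"
_NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")

def parse_price(text):
    """Return the first number found in text as a Decimal, or None if there is none."""
    match = _NUMBER.search(text)
    if match is None:
        return None
    # Drop thousands separators before converting
    return Decimal(match.group().replace(",", ""))

# Example with hard-coded strings standing in for scraped values
cleaned = [parse_price(p) for p in ["$1,299.00", " £19.99 ", "Sold out"]]
print(cleaned)
```

`Decimal` is used instead of `float` here to avoid rounding surprises with currency values. Sites that format prices differently (for example "1.299,00 €") would need a different pattern.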
For JavaScript-Heavy Sites
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()  # Headless by default
    page = browser.new_page()
    page.goto('https://dynamic-site.com')
    # Wait until the JavaScript-rendered table is present in the DOM
    page.wait_for_selector('.data-table')
    data = page.evaluate('() => document.querySelector(".data-table").innerText')
    browser.close()
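The Playwright snippet leaves `data` as one block of text. If `.data-table` is an HTML table, browsers typically render its `innerText` with tabs between cells and newlines between rows. Under that assumption, this sketch splits the text into a list of rows. The function name `parse_table_text` and the sample string are illustrative only.

```python
def parse_table_text(text):
    """Split tab/newline-delimited table text into a list of rows of cells."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # Skip blank lines
        rows.append([cell.strip() for cell in line.split("\t")])
    return rows

# Sample text standing in for the value of `data` from the browser
sample = "Name\tPrice\nWidget\t$9.99\n\nGadget\t$24.50\n"
for row in parse_table_text(sample):
    print(row)
```

If the element is built from nested `div`s rather than a real table, the separators will differ, and selecting the individual cells in the page is likely to be more reliable.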
Use Cases
- Price monitoring across competitors
- Lead generation (business contacts)
- Job market research
- Real estate listings
- News aggregation
Legal Reminder
Before scraping, check the site's robots.txt, keep your request rate low enough not to burden the server, and only collect publicly accessible data.
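Both checks can be automated with the standard library. This is a rough sketch rather than a complete compliance solution: `is_allowed` tests a URL against robots.txt rules using `urllib.robotparser`, and `RateLimiter` enforces a minimum delay between requests. The robots.txt content, the user agent name `my-scraper`, and the 0.05-second interval are made-up values for illustration.

```python
import time
from urllib.robotparser import RobotFileParser

def is_allowed(robots_txt, user_agent, url):
    """Check a URL against robots.txt rules supplied as a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

class RateLimiter:
    """Block so that consecutive wait() calls are at least min_interval seconds apart."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self._last = None

    def wait(self):
        if self._last is not None:
            remaining = self.min_interval - (time.monotonic() - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()

# Hypothetical rules; a real scraper would download them from <site>/robots.txt
rules = "User-agent: *\nDisallow: /private/\n"
limiter = RateLimiter(min_interval=0.05)  # Use a much longer interval in practice

for url in ["https://example-shop.com/products", "https://example-shop.com/private/x"]:
    if is_allowed(rules, "my-scraper", url):
        limiter.wait()
        print("OK to fetch:", url)
    else:
        print("Disallowed by robots.txt:", url)
```

To read the live file instead of a string, `RobotFileParser` also offers `set_url()` followed by `read()`.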
Python Business Automation Toolkit includes a complete web scraping module with proxy support, rate limiting, and multiple output formats.