How to Build a Supply Chain Visibility Tool with Web Scraping


Supply chain disruptions cost businesses billions. Real-time visibility into shipping, inventory signals, and supplier status means the difference between a minor delay and a major crisis.

What We'll Track

  • Port congestion and shipping delays
  • Supplier website changes (stock status, lead times)
  • Commodity price movements
  • Logistics provider status pages
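To keep the trackers below easy to configure, the sources can live in a single watchlist. Everything in this sketch is illustrative: the URLs, names, and `stock_sel` selectors are placeholders you would swap for your real ports and suppliers.

```python
# Hypothetical watchlist; every URL and selector here is a placeholder.
WATCHLIST = {
    "ports": [
        {"name": "Port of Example", "url": "https://port.example.com/vessels"},
    ],
    "suppliers": [
        {
            "name": "Acme Components",
            "url": "https://acme.example.com/catalog",
            "stock_sel": ".availability",  # CSS selector for the stock badge
        },
    ],
    "commodities": [
        {"name": "copper", "url": "https://prices.example.com/copper"},
    ],
}
```

The supplier entries follow the shape `check_supplier_status` expects further down (`name`, `url`, `stock_sel`), so you can loop straight over `WATCHLIST["suppliers"]`.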

Setup

import requests
from bs4 import BeautifulSoup
import json
import hashlib
from datetime import datetime

PROXY_URL = "https://api.scraperapi.com"
API_KEY = "YOUR_SCRAPERAPI_KEY"

Shipping and logistics sites often serve region-specific content, and some port authority pages respond only to local IPs. ScraperAPI's geo-targeted requests let you fetch international supply chain data from the right country.
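As a sketch of how that might look, ScraperAPI's `country_code` parameter routes the request through an exit node in the given country; the helper names here are my own.

```python
import requests

PROXY_URL = "https://api.scraperapi.com"

def geo_params(api_key, url, country_code="us"):
    """Build the query params for a geo-targeted ScraperAPI request."""
    return {
        "api_key": api_key,
        "url": url,
        "country_code": country_code,  # ScraperAPI's geo-targeting parameter
        "render": "true",
    }

def fetch_geo(api_key, url, country_code="us"):
    """Fetch a page as seen from the given country."""
    response = requests.get(
        PROXY_URL, params=geo_params(api_key, url, country_code), timeout=60
    )
    response.raise_for_status()
    return response
```

For example, `fetch_geo(API_KEY, port_url, country_code="sg")` would fetch a Singapore port page from a local IP.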

Monitoring Port Congestion

def scrape_port_status(port_url):
    params = {
        "api_key": API_KEY,
        "url": port_url,
        "render": "true"
    }
    response = requests.get(PROXY_URL, params=params, timeout=60)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")

    vessels = []
    for row in soup.select("table.vessel-list tbody tr"):
        cells = row.select("td")
        if len(cells) >= 4:
            vessels.append({
                "vessel_name": cells[0].text.strip(),
                "status": cells[1].text.strip(),
                "eta": cells[2].text.strip(),
                "berth": cells[3].text.strip()
            })

    waiting = len([v for v in vessels if "waiting" in v["status"].lower()])
    return {
        "total_vessels": len(vessels),
        "waiting": waiting,
        "congestion_ratio": round(waiting/len(vessels)*100, 1) if vessels else 0,
        "vessels": vessels
    }
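To turn the congestion ratio into something actionable, a simple classifier can bucket it into severity levels. The 30% and 50% thresholds here are assumptions to tune against your own shipping lanes.

```python
def congestion_level(ratio):
    """Bucket a congestion ratio (percent of vessels waiting) into a severity label."""
    if ratio >= 50:
        return "severe"
    if ratio >= 30:
        return "elevated"
    return "normal"
```

A port with 60% of vessels waiting reads as "severe", while 10% reads as "normal".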

Tracking Supplier Inventory

def check_supplier_status(supplier):
    params = {
        "api_key": API_KEY,
        "url": supplier["url"],
        "render": "true"
    }
    response = requests.get(PROXY_URL, params=params, timeout=60)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")

    products = []
    for item in soup.select(".product-item"):
        name = item.select_one(".product-name, h3")
        stock = item.select_one(supplier["stock_sel"])
        lead_time = item.select_one(".lead-time")

        products.append({
            "supplier": supplier["name"],
            "product": name.text.strip() if name else "",
            "in_stock": "in stock" in (stock.text.lower() if stock else ""),
            "lead_time": lead_time.text.strip() if lead_time else "",
            "page_hash": hashlib.md5(response.text.encode()).hexdigest()
        })
    return products
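The `page_hash` computed above only becomes useful when you compare it against the previous run. One way, sketched here with a local JSON state file (the filename is arbitrary), is:

```python
import json
from pathlib import Path

def detect_changes(products, state_file=Path("supplier_state.json")):
    """Return the suppliers whose page hash changed since the last run."""
    previous = json.loads(state_file.read_text()) if state_file.exists() else {}
    current = {p["supplier"]: p["page_hash"] for p in products}
    changed = [name for name, h in current.items() if previous.get(name) != h]
    state_file.write_text(json.dumps(current))  # persist hashes for the next run
    return changed
```

On the first run every supplier reports as changed; after that, only pages that actually differ show up, which is your trigger to re-parse stock status and lead times.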

Risk Scoring

def calculate_supply_risk(port_data, supplier_data, commodity_data):
    risk_score = 0
    factors = []

    if port_data["congestion_ratio"] > 30:
        risk_score += 3
        factors.append(f"Port congestion at {port_data['congestion_ratio']}%")

    out_of_stock = sum(1 for s in supplier_data if not s["in_stock"])
    if out_of_stock > len(supplier_data) * 0.2:
        risk_score += 3
        factors.append(f"{out_of_stock} supplier items out of stock")

    # commodity_data is assumed to carry a 30-day percentage change;
    # adjust the key and threshold to match your price source
    price_change = commodity_data.get("pct_change_30d", 0)
    if price_change > 15:
        risk_score += 2
        factors.append(f"Commodity prices up {price_change}% in 30 days")

    level = "LOW" if risk_score < 3 else "MEDIUM" if risk_score < 6 else "HIGH"
    return {"risk_score": risk_score, "risk_level": level, "factors": factors}
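Risk scores are most useful when they drive notifications without spamming. A small guard, with the escalation rule as an assumption, only fires when the level goes up:

```python
def should_alert(risk, last_level, levels=("LOW", "MEDIUM", "HIGH")):
    """Alert only when the risk level escalates past the last notified level."""
    return levels.index(risk["risk_level"]) > levels.index(last_level)
```

Feed it the dict returned by `calculate_supply_risk` plus the level you last alerted on; a drop or a repeat stays quiet.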

Infrastructure

  • ScraperAPI — geo-targeting for port authorities and logistics sites
  • ThorData — residential proxies for global supply chain coverage
  • ScrapeOps — monitor uptime across your supply chain scraping network

Conclusion

Supply chain visibility through web scraping turns scattered public data into an early warning system. Start with your most critical suppliers and expand from there.
