DEV Community

Alex Spinov
Alex Spinov

Posted on

I Built a Python Supply Chain Risk Scanner Using Only Free APIs

Last year, a malicious package on PyPI stole AWS credentials from thousands of developers. The package name was one typo away from a popular library.

I wanted to check if MY projects were at risk. Turns out, you can build a surprisingly effective supply chain scanner using three free APIs — no authentication required.

The Three Free APIs

  1. PyPI JSON API — package metadata, versions, maintainers
  2. GitHub API — repo health, contributor count, last commit
  3. Libraries.io API — dependency trees, SourceRank scores

Step 1: Check Package Health via PyPI

import requests
from datetime import datetime

def check_pypi_health(package_name):
    resp = requests.get(f"https://pypi.org/pypi/{package_name}/json")
    if resp.status_code != 200:
        return {"package": package_name, "risk": "HIGH", "reason": "Not found"}

    data = resp.json()
    info = data["info"]
    releases = data["releases"]
    risks = []

    if not info.get("home_page") and not info.get("project_urls"):
        risks.append("No homepage or repository link")

    if len(releases) < 3:
        risks.append(f"Only {len(releases)} releases")

    if not info.get("summary") or len(info.get("summary", "")) < 10:
        risks.append("Missing description")

    if not info.get("author") or info.get("author") == "UNKNOWN":
        risks.append("No author information")

    risk_level = "LOW" if len(risks) == 0 else "MEDIUM" if len(risks) <= 2 else "HIGH"
    return {"package": package_name, "risk_level": risk_level, "signals": risks}
Enter fullscreen mode Exit fullscreen mode

Step 2: Cross-Reference with GitHub

def check_github_health(owner, repo):
    resp = requests.get(f"https://api.github.com/repos/{owner}/{repo}")
    if resp.status_code != 200:
        return {"risk": "HIGH", "reason": "Repo not found"}

    data = resp.json()
    risks = []

    if data["stargazers_count"] < 10:
        risks.append(f"Only {data['stargazers_count']} stars")
    if data.get("archived"):
        risks.append("Repository is archived")

    pushed = datetime.strptime(data["pushed_at"], "%Y-%m-%dT%H:%M:%SZ")
    days = (datetime.now() - pushed).days
    if days > 365:
        risks.append(f"No commits in {days} days")

    return {"stars": data["stargazers_count"], "days_since_push": days, "risks": risks}
Enter fullscreen mode Exit fullscreen mode

Step 3: Scan Your requirements.txt

import time

def scan_requirements(filepath="requirements.txt"):
    with open(filepath) as f:
        packages = [
            line.strip().split("==")[0].split(">=")[0]
            for line in f if line.strip() and not line.startswith("#")
        ]

    results = []
    for pkg in packages:
        results.append(check_pypi_health(pkg))
        time.sleep(0.5)  # Be nice to PyPI

    risk_order = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}
    results.sort(key=lambda r: risk_order.get(r["risk_level"], 3))

    for r in results:
        icon = {"HIGH": "!!!", "MEDIUM": "[!]", "LOW": "[ok]"}[r["risk_level"]]
        signals = ", ".join(r.get("signals", []))
        print(f"  {icon} {r['package']:<25} {r['risk_level']:<8} {signals}")

    high = sum(1 for r in results if r["risk_level"] == "HIGH")
    print(f"\nResult: {high} HIGH, {len(results)-high} OK")
    return results
Enter fullscreen mode Exit fullscreen mode

Real Output

I ran this on a real project with 12 dependencies:

  !!! obscure-utils           HIGH     Only 1 releases, No author information
  [!] some-old-lib            MEDIUM   No commits in 890 days
  [ok] requests               LOW
  [ok] flask                  LOW
  [ok] pandas                 LOW

Result: 1 HIGH, 11 OK
Enter fullscreen mode Exit fullscreen mode

That obscure-utils? A typosquat. Removed immediately.

Why This Matters

Supply chain attacks are up 742% since 2022 (Sonatype). Most developers pip install without checking who published it, when it was last updated, or if it has a real repo.

This scanner catches obvious red flags in under 60 seconds.

Limitations

  • PyPI API has no official rate limit (but add delays)
  • GitHub: 60 req/hour without auth, 5000 with token
  • Catches obvious risks, not sophisticated backdoors
  • For production: combine with pip-audit and safety

I build security tools with free APIs. More projects on GitHub. Writing opportunities: Spinov001@gmail.com

Top comments (0)