Most websites have hidden JSON APIs that return cleaner data than their HTML pages. I built a simple Node.js script that discovers them automatically.
## The Problem
Traditional web scraping is fragile:
- CSS selectors break when sites redesign
- HTML parsing is slow and error-prone
- Anti-bot systems block headless browsers
But many sites expose internal JSON APIs that are faster, more stable, and return structured data.
## The Discovery Script
```javascript
// Uses the built-in fetch API, so this needs Node.js 18+ (no https module required).
async function findAPIs(domain) {
  const commonPaths = [
    "/api/v1",
    "/api/v2",
    "/api/graphql",
    "/_next/data",
    "/wp-json/wp/v2/posts",
    "/feed.json",
    "/sitemap.xml",
    "/.well-known/openid-configuration",
    "/robots.txt",
    "/manifest.json"
  ];

  const results = [];
  for (const path of commonPaths) {
    try {
      const res = await fetch(`https://${domain}${path}`);
      if (res.ok) {
        const contentType = res.headers.get("content-type") || "";
        results.push({
          path,
          status: res.status,
          type: contentType.split(";")[0],
          size: res.headers.get("content-length") || "unknown"
        });
      }
    } catch (e) {
      // Skip unreachable paths (DNS failures, TLS errors, timeouts)
    }
  }
  return results;
}

// Usage
findAPIs("dev.to").then(apis => {
  console.log(`Found ${apis.length} endpoints:\n`);
  apis.forEach(api => {
    console.log(`  ${api.path} → ${api.type} (${api.size} bytes)`);
  });
});
```
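The loop above probes each path one at a time, so a long candidate list runs slowly. Here's a concurrent variant with a per-request timeout; this is a sketch under my own helper names (`probe`, `looksLikeAPI`, `findAPIsConcurrently`), assuming Node.js 18+ for the built-in `fetch` and `AbortSignal.timeout`:

```javascript
const CANDIDATE_PATHS = ["/api/v1", "/api/v2", "/feed.json", "/robots.txt"];

function looksLikeAPI(contentType) {
  // Treat JSON (and JSON-like, e.g. application/problem+json) responses
  // as probable API endpoints.
  const type = (contentType || "").split(";")[0].trim();
  return type === "application/json" || type.endsWith("+json");
}

async function probe(domain, path) {
  try {
    const res = await fetch(`https://${domain}${path}`, {
      signal: AbortSignal.timeout(5000) // don't hang on slow hosts
    });
    if (!res.ok) return null;
    const type = (res.headers.get("content-type") || "").split(";")[0];
    return { path, status: res.status, type, api: looksLikeAPI(type) };
  } catch {
    return null; // DNS failures, TLS errors, timeouts
  }
}

async function findAPIsConcurrently(domain, paths = CANDIDATE_PATHS) {
  // Fire all probes at once; failed ones resolve to null and get filtered out.
  const settled = await Promise.all(paths.map((p) => probe(domain, p)));
  return settled.filter(Boolean);
}

// Usage:
// findAPIsConcurrently("dev.to").then(console.log);
```

With this shape the total run takes roughly as long as the slowest single request, rather than the sum of all of them.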
## Real Examples I Found

| Website | Hidden API | What It Returns |
|---|---|---|
| dev.to | `/api/articles` | Full article data as JSON |
| npm | `registry.npmjs.org/{pkg}` | Complete package metadata |
| PyPI | `/pypi/{pkg}/json` | Package info, versions, downloads |
| Wikipedia | `/api/rest_v1/page/summary/{title}` | Clean article summaries |
| GitHub | `api.github.com/repos/{owner}/{repo}` | Repository data |
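Each of these takes only a few lines to use. As one example, here's a minimal sketch against the PyPI endpoint from the table (the `pypiUrl`/`pypiInfo` helper names are mine; assumes Node.js 18+ for the built-in `fetch`):

```javascript
// Build the PyPI JSON endpoint for a package (row from the table above).
function pypiUrl(pkg) {
  return `https://pypi.org/pypi/${encodeURIComponent(pkg)}/json`;
}

// Fetch package metadata and pull out a few useful fields.
async function pypiInfo(pkg) {
  const res = await fetch(pypiUrl(pkg));
  if (!res.ok) throw new Error(`PyPI returned ${res.status} for ${pkg}`);
  const data = await res.json();
  return {
    name: data.info.name,
    version: data.info.version, // latest release
    summary: data.info.summary
  };
}

// Usage:
// pypiInfo("requests").then(console.log);
```

No browser, no HTML parsing, no selectors: just one GET and a few property lookups.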
## Why This Matters

For one of my projects, switching from HTML scraping to the hidden JSON API:

- Reduced the scraper from 120 lines to 15
- Made it roughly 10x faster (no headless browser needed)
- Eliminated breakage, since JSON APIs rarely change structure
## Pro Tips

- Check the Network tab in DevTools; filter by XHR/Fetch to see which APIs the frontend calls
- Look for `/api/`, `/_next/data/`, and `/wp-json/`; these are the most common patterns
- Check `robots.txt`; it often reveals API paths the site doesn't want crawled (but which are technically public)
- Try adding `.json` to any URL; many frameworks return JSON when you do
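The `robots.txt` tip is easy to automate, since `Disallow` and `Allow` lines frequently point straight at API prefixes. A small sketch (the `apiPathsFromRobots` helper name is my own):

```javascript
// Mine robots.txt text for API-looking paths.
// Pure string parsing, so it works on any robots.txt you've fetched.
function apiPathsFromRobots(robotsTxt) {
  const paths = [];
  for (const line of robotsTxt.split("\n")) {
    // Match both "Disallow:" and "Allow:" rules and capture the path.
    const match = line.match(/^\s*(?:Dis)?allow:\s*(\S+)/i);
    if (match && /api|json|graphql/i.test(match[1])) {
      paths.push(match[1]);
    }
  }
  return [...new Set(paths)]; // dedupe while preserving order
}

// Example:
const sample = [
  "User-agent: *",
  "Disallow: /admin/",
  "Disallow: /api/internal/",
  "Allow: /api/public/"
].join("\n");
console.log(apiPathsFromRobots(sample)); // → ["/api/internal/", "/api/public/"]
```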
## Want More?
I maintain a curated list of 60+ free APIs and scraping tools that never break. Star it if you find it useful!
Have you found any interesting hidden APIs? Share in the comments!