Building scrapers taught me that a lot of scraping is unnecessary. A surprising amount of valuable public data ships as clean JSON behind endpoints with no key and no signup. Here are the ones I reach for most.
SEC EDGAR
The US securities regulator publishes every corporate filing as JSON. The path data.sec.gov/submissions/CIK##########.json, where the # placeholders stand for the company's CIK zero-padded to 10 digits, gives a company's entire filing history, and efts.sec.gov runs full-text search across filings. No key. The only rules are a fair-use rate limit and a descriptive user agent. Insider trades, 8-K events, fund holdings, all there.
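Here is a minimal sketch of that request in Python, using only the standard library. The function name, the sample CIK, and the user agent string are placeholders of mine, so swap in your own app name and contact address. It builds the request without sending it.

```python
import urllib.request

# Assumption: placeholder identity. Use your own app name and contact address.
USER_AGENT = "example-research-bot admin@example.com"


def edgar_submissions_request(cik):
    """Build (but do not send) a request for one company's filing history."""
    # The ten '#' characters in the path are the CIK, zero-padded to 10 digits.
    padded = str(int(cik)).zfill(10)
    url = f"https://data.sec.gov/submissions/CIK{padded}.json"
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


req = edgar_submissions_request(320193)  # sample CIK
print(req.full_url)
```

To actually fetch it, pass `req` to `urllib.request.urlopen` and decode the body with `json.load`, keeping the request rate low.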
openFDA
openFDA exposes drug approvals, recalls, adverse events, and device clearances at api.fda.gov. One gotcha that cost me an hour: drug sponsor names are case-sensitive and want uppercase, while device applicant names are not. A 404 just means an empty result set, not an error.
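Both gotchas fit in a small wrapper. This is a sketch under assumptions: the helper names are mine, and `sponsor_name` on the `drug/drugsfda` endpoint is the field I am assuming holds the sponsor, so check the field reference before relying on it. The `opener` parameter exists only so the function can be exercised without touching the network.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def sponsor_query(name):
    # Assumption: 'sponsor_name' is the sponsor field on drug/drugsfda.
    # Uppercase the value, since drug sponsor names are case-sensitive.
    return f'sponsor_name:"{name.upper()}"'


def openfda_search(endpoint, search, limit=5, opener=urllib.request.urlopen):
    """Return the 'results' list for a query, or [] when nothing matches."""
    query = urllib.parse.urlencode({"search": search, "limit": limit})
    url = f"https://api.fda.gov/{endpoint}.json?{query}"
    try:
        with opener(url) as resp:
            return json.load(resp).get("results", [])
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # A 404 here means "no matches", so report an empty result set.
            return []
        raise  # anything else is a real error
```

A call would look like `openfda_search("drug/drugsfda", sponsor_query("pfizer"))`, where the sponsor name is just an example value.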
USAspending
Every US federal contract and grant is queryable at api.usaspending.gov. You POST a JSON filter and get awards with amounts, recipients, and dates. If you want new awards rather than old multi-year giants, filter by date signed, not the default action date.
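A rough sketch of such a POST body follows. Treat the details as assumptions to verify against the API docs: the `/api/v2/search/spending_by_award/` path, the `date_type` key inside `time_period`, the contract award type codes, and the field names are what I believe the API accepts, and the function name is my own. The request is built but not sent.

```python
import json
import urllib.request

# Assumption: the award search endpoint path.
SEARCH_URL = "https://api.usaspending.gov/api/v2/search/spending_by_award/"


def new_awards_request(start_date, end_date, limit=10):
    """Build a POST request for awards signed within a date window."""
    payload = {
        "filters": {
            # Assumption: A-D are the contract award type codes.
            "award_type_codes": ["A", "B", "C", "D"],
            "time_period": [{
                "start_date": start_date,
                "end_date": end_date,
                # Filter on signing date instead of the default action date.
                "date_type": "date_signed",
            }],
        },
        # Assumption: column names the endpoint accepts.
        "fields": ["Award ID", "Recipient Name", "Award Amount", "Start Date"],
        "limit": limit,
        "page": 1,
    }
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = new_awards_request("2024-01-01", "2024-01-31")
```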
npm and PyPI
The npm registry search at registry.npmjs.org/-/v1/search returns packages with maintainers and metadata. On the Python side, pypi.org/simple is the entire package index as one document, and pypi.org/pypi/{package}/json gives structured metadata per project. No key, very stable, great for mapping an ecosystem.
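For mapping an ecosystem, the useful pieces are the two URLs and the maintainer list inside each npm search hit. The sketch below assumes the search response has an `objects` list whose entries carry a `package` with `name` and `maintainers`, which is how I understand the registry's output. The helper names are mine.

```python
import urllib.parse


def npm_search_url(text, size=20):
    """URL for an npm registry search."""
    query = urllib.parse.urlencode({"text": text, "size": size})
    return f"https://registry.npmjs.org/-/v1/search?{query}"


def pypi_json_url(package):
    """URL for one PyPI project's structured metadata."""
    return f"https://pypi.org/pypi/{urllib.parse.quote(package)}/json"


def npm_maintainers(search_response):
    """Map package name -> maintainer usernames from a decoded search response."""
    result = {}
    for obj in search_response.get("objects", []):
        pkg = obj["package"]
        result[pkg["name"]] = [m["username"] for m in pkg.get("maintainers", [])]
    return result
```

Fetch either URL with any HTTP client, decode the JSON, and pass the npm response to `npm_maintainers` to get a package-to-people map.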
A few more worth a look
ClinicalTrials.gov has a clean v2 API for trials. Google Patents has an undocumented XHR query endpoint that returns JSON. Hacker News ships a Firebase API. Wikipedia and OpenStreetMap both have generous public endpoints.
Why this matters
Half the time I see someone reaching for a paid data vendor or spinning up a headless browser, the data is already sitting behind one of these. Read the docs slowly, send a polite user agent, respect the rate limits, and you skip a lot of pain.
These endpoints quietly power a good chunk of the actors I publish on Apify at https://apify.com/scrapemint, and I keep notes on new ones in the Discord at https://discord.gg/Ed2VNSHbr.
What is your favorite keyless API that more people should know about?