SEC EDGAR for Developers: The Free Fundamentals API Hiding in Plain Sight

You finished the comparison shopping between Polygon and Alpha
Vantage, maybe even signed up for a free tier of each. Now you
want to pull fundamentals — revenue, earnings, free cash flow —
for your screener.

Don't reach for the paid tier just yet. The SEC publishes the
filing history and reported XBRL financial data of every public
US company through a free, well-structured JSON API at
data.sec.gov. It needs no API key or signup, and the data is
straight from the source.

What EDGAR actually exposes

Three endpoints do most of the work. In each, {cik} is the CIK zero-padded to 10 digits:

/submissions/CIK{cik}.json: every filing a company has made, with dates, accession numbers, and the primary document names you need to build links to the documents.
/api/xbrl/companyfacts/CIK{cik}.json: every reported XBRL fact for that company, across all filings, indexed by concept (Revenues, NetIncomeLoss, Assets, etc.) and unit.
/api/xbrl/companyconcept/CIK{cik}/us-gaap/{concept}.json: just one concept across all reported periods.
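
A small helper keeps the URL building in one place. This is only a sketch: the function names (pad_cik, submissions_url, companyfacts_url, companyconcept_url) are made up for illustration and are not part of any SEC library. It assumes the CIK is written as ten digits with leading zeros, as in CIK0000320193.

```python
BASE = "https://data.sec.gov"


def pad_cik(cik) -> str:
    """Return the CIK as a 10-digit, zero-padded string (accepts int or str)."""
    return str(int(cik)).zfill(10)


def submissions_url(cik) -> str:
    """Filing history for one company."""
    return f"{BASE}/submissions/CIK{pad_cik(cik)}.json"


def companyfacts_url(cik) -> str:
    """All XBRL facts for one company."""
    return f"{BASE}/api/xbrl/companyfacts/CIK{pad_cik(cik)}.json"


def companyconcept_url(cik, concept, taxonomy="us-gaap") -> str:
    """One concept for one company across all reported periods."""
    return (
        f"{BASE}/api/xbrl/companyconcept/"
        f"CIK{pad_cik(cik)}/{taxonomy}/{concept}.json"
    )
```

Calling companyconcept_url(320193, "Revenues") produces the same URL used for Apple later in this post.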

The companyfacts endpoint is the workhorse. One JSON request
and you have every quarterly and annual fact the company has
filed with the SEC since they switched to XBRL in 2009.
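
To give a feel for the shape of a companyfacts response, here is a hedged sketch of pulling one concept out of it. It assumes the nesting facts → taxonomy → concept → units → unit → list of facts; the helper names (facts_for, list_concepts) are hypothetical, and the sample payload is invented and heavily trimmed so the code runs offline.

```python
def facts_for(companyfacts, concept, unit="USD", taxonomy="us-gaap"):
    """Return the list of fact dicts for one concept and unit, or [] if absent."""
    return (
        companyfacts.get("facts", {})
        .get(taxonomy, {})
        .get(concept, {})
        .get("units", {})
        .get(unit, [])
    )


def list_concepts(companyfacts, taxonomy="us-gaap"):
    """Return the concept names this issuer has reported, sorted by name."""
    return sorted(companyfacts.get("facts", {}).get(taxonomy, {}))


# Invented, trimmed payload in the assumed layout (values are not real data).
sample = {
    "entityName": "Hypothetical Demo Co",
    "facts": {
        "us-gaap": {
            "Assets": {
                "units": {
                    "USD": [
                        {"end": "2023-09-30", "val": 1000,
                         "form": "10-K", "filed": "2023-11-03"},
                    ]
                }
            },
            "NetIncomeLoss": {"units": {"USD": []}},
        }
    },
}

for fact in facts_for(sample, "Assets"):
    print(fact["end"], fact["form"], fact["val"])
```

Returning an empty list for a missing concept, rather than raising, makes it easy to probe several candidate concept names for the same issuer.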

The annoying part: CIK lookup

EDGAR keys companies by CIK (Central Index Key), not ticker.
You'll need to maintain or fetch a ticker → CIK map. The SEC
publishes one:

curl -A 'your-name you@example.com' \
  https://www.sec.gov/files/company_tickers.json

The User-Agent header is required: EDGAR asks every client to
identify itself and may block requests that don't. The SEC's
fair-access limit is 10 requests/second per client; respect it.
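
The two chores above, turning the SEC file into a lookup table and pacing your requests, can be sketched as follows. Treat it as an illustration under assumptions: it assumes company_tickers.json is an object whose values carry cik_str and ticker fields, and build_ticker_map and Throttle are hypothetical names, not SEC-provided code. The second sample row is invented.

```python
import time


def build_ticker_map(payload):
    """Map upper-cased ticker -> 10-digit, zero-padded CIK string."""
    return {
        row["ticker"].upper(): str(row["cik_str"]).zfill(10)
        for row in payload.values()
    }


class Throttle:
    """Space calls at least 1/max_per_second seconds apart."""

    def __init__(self, max_per_second=10.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Sleep just long enough to honor the minimum interval."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Sample in the assumed layout; the second row is made up.
sample_tickers = {
    "0": {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."},
    "1": {"cik_str": 1234, "ticker": "demo", "title": "Hypothetical Demo Co"},
}
ticker_to_cik = build_ticker_map(sample_tickers)
print(ticker_to_cik["AAPL"])
```

Call throttle.wait() immediately before each HTTP request. Passing a lower max_per_second than the published ceiling leaves headroom if you run more than one script at a time.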

A minimal example

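Here's the smallest useful thing, pulling the four most recent
quarterly revenues Apple (CIK 0000320193) reported in its 10-Q
filings. Fiscal Q4 never appears in a 10-Q (it is folded into
the 10-K's full-year figure), so 10-Qs cover three quarters per
fiscal year:

from datetime import date

import requests

headers = {'User-Agent': 'side-project you@example.com'}
url = (
    'https://data.sec.gov/api/xbrl/companyconcept/'
    'CIK0000320193/us-gaap/Revenues.json'
)

r = requests.get(url, headers=headers, timeout=30)
r.raise_for_status()  # fail loudly on 403/404 instead of parsing an error page
data = r.json()

# facts are grouped by unit of measure; revenue is reported in USD
usd = data['units']['USD']

def is_quarter(f):
    # 10-Qs also carry year-to-date figures; keep ~3-month periods only
    days = (date.fromisoformat(f['end']) - date.fromisoformat(f['start'])).days
    return f.get('form') == '10-Q' and 80 <= days <= 100

# a period reappears as a prior-year comparative in later filings;
# keep one fact per period end (the last one listed)
by_end = {f['end']: f for f in usd if is_quarter(f)}
for end in sorted(by_end, reverse=True)[:4]:
    print(f"{end}: ${by_end[end]['val']:,}")

# if the newest dates look stale, the issuer has probably moved to a
# different revenue concept (see "Concept fragmentation" below)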

That's it. Free, structured, official.

When EDGAR wins

No rate-limit drama for any volume a side project can generate.
Authoritative — straight from filings. No vendor between you and the company's own numbers.
Historical depth since 2009 for most large filers, earlier in some cases.

Where EDGAR struggles

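Concept fragmentation. Companies don't all use the same XBRL concept for the same thing. Revenue alone shows up as Revenues, SalesRevenueNet, RevenueFromContractWithCustomerExcludingAssessedTax, or a company-specific extension, and a single issuer can switch concepts from one year to the next. Real cleanup work.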
Restated filings. When a company restates an earlier quarter, EDGAR contains both the original and the restated values. Your code has to decide which one is "the truth" for backtest purposes.
Calendar mismatch. Companies report on different fiscal calendars. You can't naively compare one issuer's Q1 ending in December to another's Q1 ending in September.
No price data. EDGAR is filings, not market data. You still need Polygon, Alpha Vantage, or similar for OHLC.
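
Two of these problems, restatements and calendar mismatch, can be sketched in a few lines. This is one possible approach under stated assumptions, not the canonical one: it assumes each fact carries ISO-formatted end and filed dates (plus start for duration facts), and pick_per_period and calendar_quarter are hypothetical helper names.

```python
from datetime import date


def pick_per_period(facts, prefer="latest"):
    """Collapse duplicate facts for the same period into one fact per period.

    prefer="first" keeps the earliest-filed value (what was known at the time,
    which is usually what a backtest wants); prefer="latest" keeps the most
    recently filed value (which reflects restatements).
    """
    if prefer not in ("first", "latest"):
        raise ValueError("prefer must be 'first' or 'latest'")
    chosen = {}
    for fact in facts:
        key = (fact.get("start"), fact["end"])
        current = chosen.get(key)
        if current is None:
            chosen[key] = fact
        # ISO dates compare correctly as plain strings
        elif prefer == "first" and fact["filed"] < current["filed"]:
            chosen[key] = fact
        elif prefer == "latest" and fact["filed"] > current["filed"]:
            chosen[key] = fact
    return sorted(chosen.values(), key=lambda f: f["end"])


def calendar_quarter(end):
    """Label a period-end date with its calendar quarter, e.g. '2023Q4'."""
    d = date.fromisoformat(end)
    return f"{d.year}Q{(d.month - 1) // 3 + 1}"
```

Bucketing by calendar quarter puts a fiscal Q1 that ends in late December and a fiscal Q1 that ends in September into different buckets (Q4 and Q3), so issuers are compared over the same stretch of the calendar rather than by fiscal label.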

A reasonable production setup

For a magic-formula-style screener:

Maintain a local cik_map table (refresh weekly from SEC's company_tickers.json).
For each ticker in your universe, fetch companyfacts once and cache. Refresh on a quarterly cadence — fundamentals don't change daily.
Normalize concepts to your own internal names (revenues, net_income, total_assets, etc.) with a hand-curated mapping that tolerates fragmentation.
Get prices from your paid or free price-data API.
Run your ranking once a week — it's cheaper than the daily refresh cadence most tutorials suggest.
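
The fetch-once-and-cache step could look something like the sketch below. It is a minimal file-based cache under assumptions of my own: cached_json is a hypothetical helper, the 90-day default stands in for "quarterly", and the fetch argument is any zero-argument function that returns parsed JSON, such as your own wrapper around the companyfacts request.

```python
import json
import time
from pathlib import Path


def cached_json(path, fetch, max_age_days=90, now=time.time):
    """Return JSON from `path` if it is fresh enough, else call fetch() and store it."""
    path = Path(path)
    if path.exists():
        age_days = (now() - path.stat().st_mtime) / 86400
        if age_days < max_age_days:
            return json.loads(path.read_text(encoding="utf-8"))
    data = fetch()  # only hit the network when the cache is missing or stale
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data), encoding="utf-8")
    return data
```

One file per CIK (for example cache/CIK0000320193.json) keeps refreshes independent. The same helper with max_age_days=7 would cover the weekly cik_map refresh.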

The hardest part of building a fundamentals pipeline is not the
API but the concept normalization. Budget more time for
"decide what counts as revenue across 500 issuers" than for
"wire up the HTTP request." Get it wrong and your screener
silently misranks the issuers whose concepts your mapping
missed.
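
As a starting point for that mapping, here is a hedged sketch that tries several candidate concepts per internal name and merges them by period. The candidate lists are illustrative and far from complete, the priority order is an assumption, and normalize is a hypothetical helper that expects the companyfacts layout (facts → us-gaap → concept → units → USD).

```python
# Internal name -> us-gaap concepts to try, in priority order (illustrative).
CONCEPT_CANDIDATES = {
    "revenues": [
        "Revenues",
        "RevenueFromContractWithCustomerExcludingAssessedTax",
        "SalesRevenueNet",
    ],
    "net_income": ["NetIncomeLoss"],
    "total_assets": ["Assets"],
}


def normalize(companyfacts, mapping=CONCEPT_CANDIDATES, unit="USD"):
    """Return {internal_name: [facts sorted by period end]} for one issuer.

    Candidates are merged by period because an issuer can change concepts
    over time. When two candidates cover the same period, the one listed
    first wins. An empty list means no candidate matched, which is worth
    logging rather than ignoring.
    """
    gaap = companyfacts.get("facts", {}).get("us-gaap", {})
    out = {}
    for name, candidates in mapping.items():
        merged = {}
        for concept in candidates:
            for fact in gaap.get(concept, {}).get("units", {}).get(unit, []):
                key = (fact.get("start"), fact["end"])
                # keep the first fact seen per period, tagged with its source
                merged.setdefault(key, dict(fact, concept=concept))
        out[name] = sorted(merged.values(), key=lambda f: f["end"])
    return out
```

Keeping the source concept on every fact makes it possible to audit later which tag fed each number. Issuers that come back with an empty list are the ones to inspect by hand.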

Closing note

The fundamentals data that paid APIs charge for is largely a
cleaned-up, normalized version of what EDGAR already gives you
for free. If your project is willing to do the cleanup, EDGAR is
the better foundation. If you'd rather pay a vendor to handle the
normalization, that's also a reasonable choice. Either way,
knowing the raw source exists changes how you think about the
cost of building.

Originally published at pickuma.com.