DEV Community

NexGenData

Posted on • Originally published at thenextgennexus.com

Free Crunchbase Alternative: Company + Funding Data from Official Sources (No API Key)

If you've priced Crunchbase or PitchBook lately, you know company and funding data is mostly locked behind expensive seats. But a surprising amount of it is public and free — you just have to pull it from the primary sources instead of a reseller. This guide shows how to assemble a company profile (legal identity + real funding rounds + financials) from the SEC and the global LEI system, with no API key and no scraping of gated sites.

1. Funding rounds → SEC Form D

When a private US company raises a round under Regulation D (most venture and private placements), it files a Form D with the SEC. It's a public filing, and it contains the parts you actually want: total offering amount, amount sold, industry, and date.

Find a company's Form D filings via EDGAR full-text search, then pull the offering amounts from the filing's XML:

# 1) find Form D filings for an issuer
curl -s 'https://efts.sec.gov/LATEST/search-index?q=%22Databricks%22&forms=D' \
  -H 'User-Agent: yourname you@example.com'

# 2) each hit has a CIK + accession; fetch the Form D primary_doc.xml
curl -s 'https://www.sec.gov/Archives/edgar/data/<CIK>/<ACCESSION_NODASHES>/primary_doc.xml' \
  -H 'User-Agent: yourname you@example.com'

The XML carries <totalOfferingAmount>, <totalAmountSold>, and <industryGroupType>. A recent Databricks Form D, for example, discloses an offering north of $1B — real funding data, filed by the company, free to read.
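
As a rough illustration of step 2, here is a minimal Python sketch that builds the primary_doc.xml URL and pulls those three fields out of the XML. The helper names and the sample document are hypothetical, and the sample is heavily trimmed with invented values. The parser matches on local tag names so it works whether or not the real file declares a namespace, and it leaves non-numeric amounts (Form D allows an "Indefinite" offering amount) as strings.

```python
import xml.etree.ElementTree as ET

def primary_doc_url(cik, accession):
    """Build the Form D XML URL from a CIK and a dashed accession number."""
    return (
        "https://www.sec.gov/Archives/edgar/data/"
        f"{int(cik)}/{accession.replace('-', '')}/primary_doc.xml"
    )

def _local(tag):
    """Drop an XML namespace prefix such as '{uri}name' if one is present."""
    return tag.rsplit("}", 1)[-1]

def parse_form_d(xml_text):
    """Return issuer name, industry, and offering amounts from Form D XML."""
    wanted = {"entityName", "industryGroupType",
              "totalOfferingAmount", "totalAmountSold"}
    out = {}
    for el in ET.fromstring(xml_text).iter():
        name = _local(el.tag)
        # keep the first occurrence of each field
        if name in wanted and name not in out and el.text:
            out[name] = el.text.strip()
    for key in ("totalOfferingAmount", "totalAmountSold"):
        value = out.get(key)
        # numeric amounts become ints; anything else (e.g. "Indefinite") is kept as-is
        out[key] = int(value) if value and value.isdigit() else value
    return out

# Hypothetical, trimmed sample; real filings carry many more elements.
SAMPLE_FORM_D = """<edgarSubmission>
  <primaryIssuer>
    <cik>0001234567</cik>
    <entityName>Example Labs, Inc.</entityName>
  </primaryIssuer>
  <offeringData>
    <industryGroup>
      <industryGroupType>Other Technology</industryGroupType>
    </industryGroup>
    <offeringSalesAmounts>
      <totalOfferingAmount>5000000</totalOfferingAmount>
      <totalAmountSold>3250000</totalAmountSold>
    </offeringSalesAmounts>
  </offeringData>
</edgarSubmission>"""

if __name__ == "__main__":
    print(primary_doc_url("0001234567", "0001234567-24-000001"))
    print(parse_form_d(SAMPLE_FORM_D))
```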

Two honest caveats (these matter, and most scrapers get them wrong):

  • Match the issuer name strictly. Full-text search returns anything mentioning the term. Stripe Milton LLC is not Stripe; a fund named after a startup is not the startup. Normalize and require an exact legal-name match, and exclude SPV/fund vehicles.
  • Coverage is partial. Plenty of famous startups raise through structures that never file a Form D under their own name. Form D is excellent where it exists — don't treat its absence as "no funding."
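
One way to sketch the strict-matching caveat in Python: normalize both names, strip trailing legal suffixes, and accept only an exact match. The suffix and vehicle word lists below are illustrative assumptions, not an exhaustive rule set, and `classify_hit` is a hypothetical helper name.

```python
import re

# Assumed, non-exhaustive lists; tune them against real search hits.
LEGAL_SUFFIXES = {"inc", "incorporated", "corp", "corporation",
                  "co", "company", "llc", "ltd", "lp"}
VEHICLE_HINTS = {"fund", "spv", "investors", "partners", "series", "trust"}

def normalize(name):
    """Lowercase, drop punctuation, and strip trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def classify_hit(hit_name, target_name):
    """Label a search hit as 'issuer', 'vehicle', or 'other'."""
    hit, target = normalize(hit_name), normalize(target_name)
    if hit == target:
        return "issuer"          # exact legal-name match after normalization
    if set(hit.split()) & VEHICLE_HINTS:
        return "vehicle"         # looks like a fund or SPV named after the target
    return "other"               # merely mentions the term

if __name__ == "__main__":
    for name in ("Stripe, Inc.", "Stripe Milton LLC",
                 "Example Ventures Fund I, L.P."):
        print(name, "->", classify_hit(name, "Stripe"))
```

Only hits labeled "issuer" should feed the funding profile; the other two labels are mainly useful for logging what was discarded.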

Always send a declared User-Agent with contact info; the SEC's fair-access policy requires it.
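
A minimal Python sketch of polite EDGAR access, assuming a fixed pause between requests. The 0.2-second interval is an assumption chosen to stay well under the SEC's published limit of 10 requests per second, and `Throttle`, `build_request`, and `fetch` are hypothetical helper names.

```python
import time
import urllib.request

USER_AGENT = "yourname you@example.com"  # replace with your own contact details
MIN_INTERVAL = 0.2  # assumed pause in seconds between requests

class Throttle:
    """Enforce a minimum interval between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now

_throttle = Throttle(MIN_INTERVAL)

def build_request(url):
    """Attach the declared User-Agent to every request."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})

def fetch(url):
    """Fetch a URL after waiting out the throttle; returns raw bytes."""
    _throttle.wait()
    with urllib.request.urlopen(build_request(url), timeout=30) as resp:
        return resp.read()
```

The clock and sleep functions are injectable so the throttle can be exercised in tests without real delays.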

2. Legal identity → GLEIF (the LEI system)

The Global Legal Entity Identifier Foundation publishes legal-entity data as fully open data — legal name, HQ country, registered address, status. Resolve a brand name to an entity:

# fuzzy-match a name to candidate LEIs
curl -s 'https://api.gleif.org/api/v1/fuzzycompletions?field=entity.legalName&q=Anthropic'

# then fetch the full record
curl -s 'https://api.gleif.org/api/v1/lei-records/<LEI>'

Caveat: common brand names collide (there can be several "Stripe" entities in different countries), so confirm the country/jurisdiction before trusting a match — ideally cross-check against a US SEC filing.
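
The jurisdiction check can be sketched in Python roughly as follows. It assumes each candidate is the `data` object of a lei-records response, with the entity fields nested under `attributes.entity` as I understand the GLEIF API. The sample records and their LEIs are invented placeholders, and `pick_entity` is a hypothetical helper that refuses to guess when the match is ambiguous.

```python
def summarize(record):
    """Flatten the fields needed for disambiguation from one LEI record."""
    entity = record["attributes"]["entity"]
    return {
        "lei": record["id"],
        "name": entity["legalName"]["name"],
        "country": entity["legalAddress"]["country"],
        "jurisdiction": entity.get("jurisdiction"),
        "status": entity.get("status"),
    }

def pick_entity(records, country):
    """Return the single active entity in `country`, or None if unclear."""
    matches = [
        s for s in map(summarize, records)
        if s["country"] == country and s["status"] == "ACTIVE"
    ]
    return matches[0] if len(matches) == 1 else None

# Invented sample records; the LEIs are placeholders, not real identifiers.
SAMPLE_RECORDS = [
    {"id": "EXAMPLELEI0000000001", "attributes": {"entity": {
        "legalName": {"name": "Example Payments, Inc."},
        "legalAddress": {"country": "US"},
        "jurisdiction": "US-DE", "status": "ACTIVE"}}},
    {"id": "EXAMPLELEI0000000002", "attributes": {"entity": {
        "legalName": {"name": "Example Payments Europe Limited"},
        "legalAddress": {"country": "IE"},
        "jurisdiction": "IE", "status": "ACTIVE"}}},
    {"id": "EXAMPLELEI0000000003", "attributes": {"entity": {
        "legalName": {"name": "Example Payments Holdings Ltd"},
        "legalAddress": {"country": "IE"},
        "jurisdiction": "IE", "status": "ACTIVE"}}},
]

if __name__ == "__main__":
    print(pick_entity(SAMPLE_RECORDS, "US"))  # one active US entity
    print(pick_entity(SAMPLE_RECORDS, "IE"))  # two candidates, so None
```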

3. Industry + financials → SEC EDGAR

For any entity that files with the SEC and therefore has a CIK (public companies and Form D filers alike), the submissions API gives you SIC industry, state of incorporation, business address, former names, and, for listed companies, tickers:

curl -s 'https://data.sec.gov/submissions/CIK0000320193.json' -H 'User-Agent: yourname you@example.com'
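
A short Python sketch of reading that response, assuming the top-level field names I understand the submissions JSON to use (`sic`, `sicDescription`, `stateOfIncorporation`, `tickers`, `formerNames`, `addresses`). The sample document is a trimmed illustration rather than a full response, and the helper names are hypothetical. Note that the URL expects the CIK zero-padded to 10 digits.

```python
def submissions_url(cik):
    """Build the submissions URL; the CIK is zero-padded to 10 digits."""
    return f"https://data.sec.gov/submissions/CIK{int(cik):010d}.json"

def profile_from_submissions(doc):
    """Pick the identity and industry fields out of a submissions document."""
    return {
        "name": doc.get("name"),
        "sic": doc.get("sic"),
        "industry": doc.get("sicDescription"),
        "state_of_incorporation": doc.get("stateOfIncorporation"),
        "tickers": doc.get("tickers", []),
        "former_names": [f["name"] for f in doc.get("formerNames", [])],
        "business_address": doc.get("addresses", {}).get("business"),
    }

# Trimmed, illustrative sample; a real response has many more fields.
SAMPLE_SUBMISSIONS = {
    "name": "Example Corp",
    "sic": "7372",
    "sicDescription": "Services-Prepackaged Software",
    "stateOfIncorporation": "DE",
    "tickers": [],
    "formerNames": [{"name": "Example Labs Inc"}],
    "addresses": {"business": {"city": "San Francisco", "stateOrCountry": "CA"}},
}

if __name__ == "__main__":
    print(submissions_url(320193))
    print(profile_from_submissions(SAMPLE_SUBMISSIONS))
```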

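For public companies, the XBRL company-concept API returns the numbers as reported in filings (Apple's latest annual revenue comes back as ~$416B). Revenue tags vary by filer and period, so if Revenues looks stale for a company, check a related tag such as RevenueFromContractWithCustomerExcludingAssessedTax: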

curl -s 'https://data.sec.gov/api/xbrl/companyconcept/CIK0000320193/us-gaap/Revenues.json' \
  -H 'User-Agent: yourname you@example.com'
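
The response lists every reported fact for the concept, including quarterly figures and prior-year comparatives, so picking "the latest annual value" takes a little filtering. Below is a hedged Python sketch that keeps facts from 10-K filings spanning roughly one year and returns the most recent. The 350 to 380 day window is an assumption, the sample values are invented, and `latest_annual` is a hypothetical helper.

```python
from datetime import date

def latest_annual(concept, unit="USD"):
    """Return the most recent full-year fact from a company-concept document."""
    annual = []
    for fact in concept.get("units", {}).get(unit, []):
        if fact.get("form") != "10-K" or "start" not in fact:
            continue
        days = (date.fromisoformat(fact["end"])
                - date.fromisoformat(fact["start"])).days
        if 350 <= days <= 380:   # assumed window for a fiscal year
            annual.append(fact)
    if not annual:
        return None
    # ISO dates sort correctly as strings; later filings win ties
    return max(annual, key=lambda f: (f["end"], f["filed"]))

# Invented sample in the shape of a company-concept response.
SAMPLE_CONCEPT = {
    "tag": "Revenues",
    "units": {"USD": [
        {"start": "2021-01-01", "end": "2021-12-31", "val": 100,
         "form": "10-K", "filed": "2022-02-15"},
        {"start": "2022-01-01", "end": "2022-12-31", "val": 120,
         "form": "10-K", "filed": "2023-02-15"},
        {"start": "2022-10-01", "end": "2022-12-31", "val": 35,
         "form": "10-K", "filed": "2023-02-15"},
        {"start": "2023-01-01", "end": "2023-03-31", "val": 30,
         "form": "10-Q", "filed": "2023-05-01"},
    ]},
}

if __name__ == "__main__":
    print(latest_annual(SAMPLE_CONCEPT))
```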

Putting it together

Stitch those three and you have a legitimate, official-source company profile — identity, funding signals, and financials — without paying for Crunchbase and without scraping anything gated. The hard part is the glue: entity resolution, strict issuer matching, XML parsing, and rate-limit-friendly EDGAR access.
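
As a rough picture of the final stitching step, the Python sketch below merges the three sources into one record. The shape of the profile and the `build_profile` name are hypothetical. The one deliberate choice is that a missing Form D is recorded as "unknown" rather than as zero funding, in line with the coverage caveat in section 1.

```python
def build_profile(identity, form_d_filings, edgar):
    """Combine GLEIF identity, Form D filings, and EDGAR data into one dict."""
    return {
        "identity": identity,            # e.g. a summarized LEI record, or None
        "funding": {
            # absence of a Form D is not evidence of no funding
            "status": "form_d_found" if form_d_filings else "unknown",
            "filings": list(form_d_filings),
        },
        "edgar": edgar,                  # e.g. industry and reported financials
    }

if __name__ == "__main__":
    profile = build_profile(
        identity={"name": "Example Labs, Inc.", "country": "US"},
        form_d_filings=[],
        edgar={"industry": "Services-Prepackaged Software"},
    )
    print(profile["funding"]["status"])
```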

If you'd rather skip the plumbing, the Company Data Aggregator actor does exactly this in one call — give it a company name or domain and it returns GLEIF legal identity, SEC Form D funding signals, EDGAR industry/financials, and a domain/tech profile. No API key.

All of this is informational, official-source data — not investment advice. Build responsibly, declare your User-Agent, and respect the SEC's rate limits.
