DEV Community

Ava Torres

How PE Firms Can Automate Deal Screening with Free SEC, State, and Federal Data

A friend at a mid-market PE shop told me their analysts spend 6-8 hours per target company just pulling basic background data before the first IC meeting. Business registration status, SEC filings, federal contract history, regulatory violations, OSHA citations, nonprofit affiliations.

All public data. All available for free. All being pulled manually from 8+ different government websites.

That's not due diligence. That's data entry.

What PE Deal Screening Actually Requires

Before you even model the financials, you need answers to:

  1. Is the entity legally active? Check Secretary of State business registrations across all states where the target operates
  2. Any SEC enforcement actions or material filings? 10-K annual reports, 8-K material events, enforcement proceedings
  3. Federal contract exposure? If the target has government revenue, you need to know the contract details, performance history, and whether they're on any exclusion lists
  4. Regulatory risk? OSHA violations, FDA warning letters, NHTSA recalls, EPA enforcement
  5. Litigation risk? Federal court docket searches for pending cases
  6. Financial health signals? IRS 990 filings for nonprofit targets, SEC quarterly reports for public companies

Most PE firms pay $30-80K/year for platforms that aggregate some of this. But the underlying data is free -- it's published by the US government.

The Free Data Stack

Here are the actual data sources, with APIs or scrapers that make them programmatically accessible:

Secretary of State Business Registrations

Every state publishes business entity data -- formation date, registered agent, status, officers. This is your first check on whether a company is legally active.

The problem: 50 different state portals, 50 different interfaces, zero standardization.

I built scrapers for the top states by entity count.

Search by company name, get back registration status, formation date, registered agent, and filing history. Run them on a schedule to monitor status changes.
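Because every state portal words things differently, it helps to map each result into one common shape before comparing across states. Here is a minimal sketch of that normalization step. The status vocabularies and the raw field names (`name`, `status`, `formation_date`, `registered_agent`) are assumptions for illustration, not the actual output schema of any particular state scraper.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed vocabularies: real state portals use many more variants than these.
ACTIVE_STATUSES = {"active", "good standing", "in existence"}
INACTIVE_STATUSES = {"dissolved", "revoked", "forfeited", "inactive", "withdrawn"}


@dataclass
class EntityRecord:
    state: str
    name: str
    status: str  # normalized to "active", "inactive", or "unknown"
    formation_date: Optional[str]
    registered_agent: Optional[str]


def normalize_status(raw: str) -> str:
    """Collapse a state-specific status string into one of three buckets."""
    cleaned = raw.strip().lower()
    if cleaned in ACTIVE_STATUSES:
        return "active"
    if cleaned in INACTIVE_STATUSES:
        return "inactive"
    # Anything unrecognized is flagged for a human instead of being guessed.
    return "unknown"


def normalize_record(state: str, raw: dict) -> EntityRecord:
    """Map one raw scraper result (hypothetical field names) to EntityRecord."""
    return EntityRecord(
        state=state,
        name=raw.get("name", "").strip(),
        status=normalize_status(raw.get("status", "")),
        formation_date=raw.get("formation_date"),
        registered_agent=raw.get("registered_agent"),
    )
```

Treating unrecognized statuses as "unknown" is deliberate: in a screening context a wrong "active" is worse than a manual follow-up.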

SEC EDGAR Filings

The SEC publishes every filing made by public companies and regulated entities. Full-text search across 10-K, 10-Q, 8-K, and more.

SEC EDGAR Company Filings Search -- $3.49/1K results

Search by company name, ticker, CIK number, or filing type. Get filing dates, document URLs, and descriptions. Set up alerts for material event filings (8-K) on portfolio companies.
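For a sense of what the underlying data looks like, the SEC also serves a company's filing history as JSON at `data.sec.gov/submissions/`, keyed by a zero-padded 10-digit CIK. The sketch below builds that URL and filters an already-downloaded response for specific form types. It assumes the `filings.recent` layout of parallel arrays (`form`, `filingDate`, `accessionNumber`) and leaves the HTTP request out; the SEC asks for a descriptive User-Agent header on those requests. The scraper listed above may work differently, and the function names here are hypothetical.

```python
def submissions_url(cik: int) -> str:
    """Build the SEC submissions URL; the CIK is zero-padded to 10 digits."""
    return f"https://data.sec.gov/submissions/CIK{cik:010d}.json"


def recent_filings(submissions: dict, forms: set) -> list:
    """Return recent filings whose form type is in `forms`.

    `submissions` is the parsed JSON document; its `filings.recent` section
    is assumed to hold parallel arrays with one entry per filing.
    """
    recent = submissions["filings"]["recent"]
    rows = zip(recent["form"], recent["filingDate"], recent["accessionNumber"])
    return [
        {"form": form, "filed": filed, "accession": accession}
        for form, filed, accession in rows
        if form in forms
    ]
```

Calling `recent_filings(data, {"8-K"})` on a portfolio company's parsed submissions document gives the list you would compare against the previous run to raise a material-event alert.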

Federal Contract Data (USASpending)

USASpending.gov is the government's official public record of federal spending. If your target has government revenue, this tells you how much was awarded, by which agencies, and whether contracts are active or completed.

USASpending Federal Spending Search -- $3.49/1K results

Search by recipient name, NAICS code, awarding agency, or contract amount. Critical for targets where government revenue is a material percentage.
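To judge whether government revenue is material, total the awards and compare them with the target's revenue for the same period. A rough sketch, assuming each award record has already been reduced to an `awarding_agency` and an `amount` (hypothetical field names, not the scraper's actual schema):

```python
from collections import defaultdict


def obligations_by_agency(awards: list) -> dict:
    """Sum award amounts per awarding agency."""
    totals = defaultdict(float)
    for award in awards:
        totals[award["awarding_agency"]] += award["amount"]
    return dict(totals)


def government_revenue_share(awards: list, annual_revenue: float) -> float:
    """Fraction of annual revenue represented by the listed awards.

    Only meaningful when the awards and the revenue figure cover the same
    fiscal year; multi-year contract values would overstate the share.
    """
    if annual_revenue <= 0:
        raise ValueError("annual_revenue must be positive")
    return sum(award["amount"] for award in awards) / annual_revenue
```

For example, $3.5M in same-year awards against $14M of revenue is a 25% share, which most buyers would treat as material concentration.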

SAM.gov Federal Contracts & Grants

SAM.gov is the primary database for federal contract opportunities and entity registrations. If a company does business with the federal government, they're registered here.

SAM.gov Federal Contracts Search -- $3.49/1K results

Search active and archived contract opportunities by keyword, NAICS code, agency, or set-aside type.

OSHA Violations

OSHA publishes inspection results and violations for every workplace they've inspected. If your target has manufacturing, construction, or warehouse operations, this is a material risk factor.

OSHA Workplace Inspections Search -- $3.49/1K results

Search by company name or establishment. Returns inspection dates, violation types, penalties, and current status.

FDA Drug and Device Data

For healthcare-adjacent targets, FDA publishes adverse event reports, drug recalls, and device complaints.

OpenFDA Drug Adverse Events & Recalls -- $3.49/1K results

Search by drug name, manufacturer, or event type.

IRS 990 Nonprofit Filings

If the target is a nonprofit or has nonprofit affiliates, IRS 990 filings reveal revenue, expenses, executive compensation, and program details.

IRS 990 Nonprofit Filings Search -- $3.49/1K results

Search by organization name, EIN, state, or filing year.

Federal Register

Track regulatory changes that could affect your target's industry. The Federal Register publishes every proposed rule, final rule, and executive order.

Federal Register Search -- $3.49/1K results

Search by keyword, agency, or document type. Set up a schedule to monitor regulatory risk.

Building an Automated Screening Pipeline

Here's how I'd set this up for a PE firm doing 5-10 deal screenings per month:

Step 1: Company intake. When a new target enters the pipeline, trigger searches across all relevant data sources using the company name.

Step 2: Automated data pull. Run SOS searches in relevant states, SEC EDGAR search, USASpending lookup, OSHA search, and any industry-specific checks (FDA for healthcare, NHTSA for automotive).
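Steps 1 and 2 amount to turning one intake record into a list of searches. Here is a minimal sketch of that planning step. The source identifiers (`sos`, `sec_edgar`, and so on) and the task dictionary shape are made up for illustration; in practice each task would be handed to whichever scraper or API client you use.

```python
from typing import Optional

# Checks run for every target, beyond the per-state SOS lookups.
CORE_SOURCES = ["sec_edgar", "usaspending", "osha"]

# Industry-specific extras, following the examples in the step above.
INDUSTRY_SOURCES = {
    "healthcare": ["fda"],
    "automotive": ["nhtsa"],
}


def plan_searches(company_name: str, states: list,
                  industry: Optional[str] = None) -> list:
    """Return one search task per state SOS lookup, core source, and extra."""
    tasks = [
        {"source": "sos", "state": state, "query": company_name}
        for state in states
    ]
    for source in CORE_SOURCES + INDUSTRY_SOURCES.get(industry, []):
        tasks.append({"source": source, "query": company_name})
    return tasks
```

Keeping the plan as plain data makes it easy to log exactly which checks were run for a given deal.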

Step 3: Structured output. Each scraper returns structured JSON. Aggregate into a single deal screening report.
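The aggregation in Step 3 can be as simple as nesting each source's records under one report object. A sketch under the assumption that results arrive as a dictionary mapping a source name to a list of JSON-like records; the report layout is illustrative, not a standard format.

```python
import json
from datetime import date


def build_report(company_name: str, results: dict, as_of: date) -> dict:
    """Combine per-source result lists into a single screening report."""
    return {
        "company": company_name,
        "as_of": as_of.isoformat(),
        "sources": {
            source: {"count": len(records), "records": records}
            for source, records in sorted(results.items())
        },
        # Sources that returned anything, so reviewers know where to look.
        "sources_with_hits": sorted(
            source for source, records in results.items() if records
        ),
    }


def report_to_json(report: dict) -> str:
    """Serialize the report for storage or hand-off to the deal team."""
    return json.dumps(report, indent=2)
```

A source with zero records still appears in the report with `count: 0`, which distinguishes "checked and clean" from "never checked".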

Step 4: Ongoing monitoring. Set up scheduled runs on portfolio companies. Get alerts when new SEC filings appear, OSHA violations are recorded, or business registration status changes.
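Monitoring comes down to comparing the latest run against the previous one. A minimal sketch of that diff, assuming each run is stored as a dictionary of records keyed by a stable identifier (for example a state plus entity number, or a filing accession number). The key scheme and alert shape are hypothetical.

```python
def detect_changes(previous: dict, current: dict) -> list:
    """Compare two snapshots and return alerts for new or changed records."""
    alerts = []
    for key, record in current.items():
        if key not in previous:
            # A record we have not seen before, e.g. a new filing or citation.
            alerts.append({"type": "new", "key": key, "record": record})
        elif record != previous[key]:
            old = previous[key]
            changed = sorted(
                field
                for field in record.keys() | old.keys()
                if record.get(field) != old.get(field)
            )
            alerts.append({"type": "changed", "key": key, "fields": changed})
    return alerts
```

This version ignores records that disappear between runs; if a vanished registration matters to you, add a third pass over keys present only in `previous`.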

Total cost per screening: Under $10 in API credits for a comprehensive multi-source check. Compare that to $50K/year for a data platform that covers maybe 60% of the same sources.
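The per-screening figure follows from the listed rate of $3.49 per 1,000 results. Here is a quick sketch of the arithmetic, assuming every source is billed at that rate (the post does not state a price for the Secretary of State scrapers) and using made-up result counts per source.

```python
PRICE_PER_1K_RESULTS = 3.49  # USD, the rate listed for the scrapers above


def screening_cost(results_per_source: dict) -> float:
    """Cost in USD of one screening, given result counts per source."""
    total_results = sum(results_per_source.values())
    return total_results / 1000 * PRICE_PER_1K_RESULTS


def max_results_for_budget(budget_usd: float) -> int:
    """Largest whole number of results that fits within the budget."""
    return int(budget_usd / PRICE_PER_1K_RESULTS * 1000)


# Hypothetical screening: eight sources returning 200 results each.
example_counts = {f"source_{i}": 200 for i in range(8)}
example_cost = screening_cost(example_counts)  # 1,600 results, roughly $5.58
```

At this rate a $10 budget covers about 2,865 results, so a single screening stays under $10 as long as the searches are reasonably targeted.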

The Real Value

The point isn't that free data is better than Bloomberg or PitchBook. Those platforms add layers of analysis, historical financials, and deal comps that raw government data doesn't provide.

The point is that the first 80% of deal screening -- the "is this company real, is it in good standing, does it have regulatory skeletons" part -- can be automated for nearly nothing. Your analysts should be spending their time on financial modeling and strategic analysis, not copying business registration numbers from state websites.

Every hour an analyst spends on data entry is an hour they're not spending on analysis that actually drives investment decisions.


All the scrapers mentioned above are available on Apify with pay-per-result pricing. No subscriptions, no minimums.
