DEV Community

Ava Torres


9 Free Public Records Tools Every Journalist Should Bookmark

There is more investigative data available to the public than most journalists ever use. Federal agencies publish company filings, environmental violations, workplace safety records, federal contracts, and product recalls -- all of it free, all of it searchable. The problem is not access. The problem is that the portals are slow, the search interfaces are terrible, and pulling anything useful at scale requires clicking through paginated results until your eyes go numb.

I spend a lot of time automating public records research. Over the past year, I've built scrapers for most of the databases below -- partly because I needed the data, partly because the official search interfaces are genuinely painful. Here are the nine I come back to most.


1. Secretary of State Business Filings

What it has: Corporate registration records for businesses in every US state -- registered agent, formation date, entity type, status (active/dissolved/revoked), and sometimes officer names and addresses.

Investigative use case: Tracing shell company networks. A subject forms three LLCs with different names but the same registered agent. Each state portal shows one piece. Cross-referencing across states manually takes days. Automated, it takes minutes.

Tool: US Business Entity Search -- currently covers NY, TX, and CA, with more states being added. Feed it a name, get structured records back.
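
The cross-referencing step is mostly a grouping problem once the records sit in one place. Here is a minimal sketch, assuming each record is a dict with `state`, `entity_name`, and `registered_agent` keys. Those field names and the sample records are hypothetical, not the tool's actual output schema.

```python
from collections import defaultdict

def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())

def group_by_agent(records):
    """Return registered agents that appear on more than one entity."""
    groups = defaultdict(list)
    for rec in records:
        key = normalize(rec["registered_agent"])
        groups[key].append((rec["state"], rec["entity_name"]))
    return {agent: ents for agent, ents in groups.items() if len(ents) > 1}

# Hypothetical records merged from three state portals.
records = [
    {"state": "NY", "entity_name": "Harbor Point Holdings LLC",
     "registered_agent": "J. Doe & Associates, Inc."},
    {"state": "TX", "entity_name": "HP Ventures LLC",
     "registered_agent": "J Doe & Associates Inc"},
    {"state": "CA", "entity_name": "Sunset Bakery LLC",
     "registered_agent": "Pacific Agents Corp."},
]
shared = group_by_agent(records)
```

Normalizing before grouping matters because the same agent is often punctuated differently from one state to the next. Large commercial agents will show up on thousands of unrelated entities, so treat a match as a lead, not a finding.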


2. SEC EDGAR

What it has: Every public company filing since the early 1990s -- 10-Ks, 10-Qs, 8-Ks, proxy statements, beneficial ownership disclosures, and more. EDGAR is genuinely one of the most underused investigative databases in existence.

Investigative use case: Following money through executive compensation, related-party transactions, and undisclosed conflicts. The proxy statement alone (DEF 14A) contains more embarrassing information per page than almost any other public document.

Tool: SEC EDGAR Company Filings -- search by company name or CIK, filter by filing type, pull structured results.
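
If you would rather hit EDGAR directly, the SEC publishes a JSON feed of each company's filings at `data.sec.gov/submissions/`, keyed by a zero-padded 10-digit CIK. The sketch below builds that URL and filters the feed's parallel arrays by form type. It is an illustration, not the tool's implementation, and the sample payload is trimmed and made up.

```python
import json
import urllib.request

def submissions_url(cik) -> str:
    """EDGAR expects the CIK zero-padded to 10 digits."""
    return f"https://data.sec.gov/submissions/CIK{int(cik):010d}.json"

def filter_filings(submissions: dict, form_type: str):
    """Pick filings of one form type out of the 'recent' parallel arrays."""
    recent = submissions["filings"]["recent"]
    rows = zip(recent["form"], recent["filingDate"], recent["accessionNumber"])
    return [{"form": f, "filed": d, "accession": a} for f, d, a in rows if f == form_type]

def fetch_submissions(cik, user_agent: str) -> dict:
    """Network call. The SEC asks for a User-Agent that identifies you."""
    req = urllib.request.Request(submissions_url(cik), headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

# Trimmed, made-up payload in the same shape as the real feed.
sample = {"filings": {"recent": {
    "form": ["10-K", "DEF 14A", "8-K"],
    "filingDate": ["2024-02-01", "2024-03-15", "2024-04-02"],
    "accessionNumber": ["0000000000-24-000001", "0000000000-24-000002", "0000000000-24-000003"],
}}}
proxies = filter_filings(sample, "DEF 14A")
```

In real use you would call `fetch_submissions(cik, "Your Name you@example.com")` and pass the result to `filter_filings`.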


3. IRS 990 Nonprofit Filings

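What it has: Annual financial disclosures for most US tax-exempt organizations -- revenue, expenses, executive compensation, grants paid, and contractor payments. Required by law. All public. (Churches are exempt from filing, and the smallest organizations file only the abbreviated 990-N.)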

Investigative use case: Executive pay at nonprofits claiming financial hardship. Grants flowing between organizations with overlapping board members. Foundations that spend 90% of donations on "administrative costs." The 990 answers all of these questions if you know where to look.

Tool: IRS 990 Nonprofit Search -- search by organization name or EIN, get structured financial data without touching the IRS portal.
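
The overlapping-board question reduces to set intersection once you have officer lists for each organization. A rough sketch, assuming you have already extracted the names into a dict of organization to list of names. The organizations and names below are invented.

```python
from itertools import combinations

def shared_board_members(boards):
    """Return {(org_a, org_b): [names]} for every pair of orgs sharing a member."""
    cleaned = {org: {n.strip().lower() for n in names} for org, names in boards.items()}
    overlaps = {}
    for org_a, org_b in combinations(sorted(cleaned), 2):
        common = cleaned[org_a] & cleaned[org_b]
        if common:
            overlaps[(org_a, org_b)] = sorted(common)
    return overlaps

# Invented example data.
boards = {
    "Riverside Foundation": ["Pat Lee", "Sam Ortiz", "Dana Wu"],
    "Lakeview Charities": ["sam ortiz ", "Chris Bell"],
    "Hilltop Fund": ["Alex Kim"],
}
overlaps = shared_board_members(boards)
```

Exact string matching will miss "Samuel Ortiz" versus "Sam Ortiz", so expect to review near-matches by hand.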


4. EPA Toxic Release Inventory

What it has: Annual reports from industrial facilities on toxic chemical releases -- what was released, how much, into which environmental medium (air, water, land). Covers over 650 chemicals across thousands of facilities nationwide.

Investigative use case: Environmental justice reporting. Which facilities release the most toxics? Which communities are downwind? How have releases changed over time after a facility claimed to be "cleaning up"? The TRI is the starting point for almost every serious environmental investigation.

Tool: EPA TRI Toxic Release Search -- query by chemical, facility, state, or year. Structured output ready for analysis.
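
Checking a "cleaning up" claim is a matter of totaling releases per year and comparing. A small sketch, assuming rows with `year` and `pounds_released` keys. Those are placeholder names for whatever your TRI export uses, and the numbers are invented.

```python
from collections import defaultdict

def yearly_totals(rows):
    """Sum reported releases per reporting year."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["year"]] += row["pounds_released"]
    return dict(sorted(totals.items()))

def percent_change(totals):
    """Year-over-year change in percent, keyed by the later year."""
    years = list(totals)
    return {
        later: round((totals[later] - totals[earlier]) / totals[earlier] * 100, 1)
        for earlier, later in zip(years, years[1:])
        if totals[earlier]
    }

# Invented rows for one facility (two chemicals in 2019).
rows = [
    {"year": 2019, "pounds_released": 1200.0},
    {"year": 2019, "pounds_released": 300.0},
    {"year": 2020, "pounds_released": 1650.0},
    {"year": 2021, "pounds_released": 1980.0},
]
totals = yearly_totals(rows)
changes = percent_change(totals)
```

Summing pounds across chemicals is a blunt measure, since toxicity per pound varies widely. It is a reasonable first pass before breaking the totals out by chemical.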


5. OSHA Inspections

What it has: Every OSHA workplace inspection since the 1970s -- inspection date, establishment, industry code, violations cited, penalty amounts, and whether the citation was contested or reduced.

Investigative use case: A company claims its safety record is exemplary. OSHA says otherwise: three willful violations in four years, each penalty negotiated down to a fraction of the original. This database tells you which employers are repeat offenders and what their violations actually cost them.

Tool: OSHA Inspection Search -- filter by employer name, state, industry, or date range.
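
The "negotiated down to a fraction" pattern is easy to quantify if each violation record carries both the originally proposed penalty and the current one. A sketch under that assumption, with hypothetical field names (`type`, `initial_penalty`, `current_penalty`) and invented figures:

```python
def penalty_summary(violations):
    """Total proposed vs. current penalties and count willful citations."""
    initial = sum(v["initial_penalty"] for v in violations)
    current = sum(v["current_penalty"] for v in violations)
    willful = sum(1 for v in violations if v["type"].lower() == "willful")
    reduction = round((initial - current) / initial * 100, 1) if initial else 0.0
    return {
        "initial_total": initial,
        "current_total": current,
        "reduction_pct": reduction,
        "willful_count": willful,
    }

# Invented violations for one employer.
violations = [
    {"type": "Willful", "initial_penalty": 70000, "current_penalty": 21000},
    {"type": "Serious", "initial_penalty": 50000, "current_penalty": 15000},
    {"type": "Willful", "initial_penalty": 30000, "current_penalty": 9000},
]
summary = penalty_summary(violations)
```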


6. FDA Recalls

What it has: Every FDA recall since the early 2000s -- product name, company, recall reason, classification (Class I is the most serious), affected lots, and distribution scope.

Investigative use case: A food manufacturer has had four Class I recalls in three years. Each one was covered individually as a brief news item. Nobody connected them. FDA Recalls is how you connect them.

Tool: FDA Recall Search -- search by company, product type, or date range. Structured output with classification and reason fields.
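
For food recalls specifically, the openFDA enforcement endpoint can answer the "connect them" question directly. The sketch below builds a query URL and tallies Class I recalls by year from the returned records. It relies on openFDA's `recalling_firm`, `classification`, and `recall_initiation_date` (formatted `YYYYMMDD`) fields; the firm name and sample records are invented.

```python
from collections import Counter
from urllib.parse import urlencode

BASE = "https://api.fda.gov/food/enforcement.json"

def recall_query_url(firm: str, classification: str = "Class I", limit: int = 100) -> str:
    """Build an openFDA search URL for one firm and one recall class."""
    search = f'recalling_firm:"{firm}" AND classification:"{classification}"'
    return BASE + "?" + urlencode({"search": search, "limit": limit})

def class_one_by_year(results):
    """Count Class I recalls per year of initiation."""
    return Counter(
        r["recall_initiation_date"][:4]
        for r in results
        if r["classification"] == "Class I"
    )

url = recall_query_url("Acme Foods")

# Invented records shaped like the 'results' array in the API response.
results = [
    {"classification": "Class I", "recall_initiation_date": "20210412"},
    {"classification": "Class II", "recall_initiation_date": "20210901"},
    {"classification": "Class I", "recall_initiation_date": "20220118"},
    {"classification": "Class I", "recall_initiation_date": "20220730"},
]
by_year = class_one_by_year(results)
```

Firm names are not standardized in the data, so it is worth searching a few spelling variants before concluding a company has a clean record.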


7. NHTSA Complaints

What it has: Consumer complaints filed with the National Highway Traffic Safety Administration about vehicle defects -- make, model, year, component, failure description, and whether a crash or injury was involved.

Investigative use case: An automaker denies knowledge of a defect before a recall. The NHTSA complaint database shows 340 consumer reports of the same failure going back two years before the recall was issued. This is how you establish a timeline.

Tool: NHTSA Complaints Search -- query by make, model, year, or component. Returns structured complaint records.
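
Establishing the timeline means counting complaints about one component that were filed before the recall date. A sketch of that filter follows. It assumes records carry `odiNumber`, `components`, and a `dateComplaintFiled` string in `MM/DD/YYYY` form, which is how I understand NHTSA's public complaints API to return them; verify against a live response before relying on it. The sample records are invented.

```python
from datetime import date, datetime

def complaints_before(complaints, recall_date: date, component: str):
    """Return (filed_date, odi_number) for matching complaints filed before the recall."""
    matches = []
    for c in complaints:
        filed = datetime.strptime(c["dateComplaintFiled"], "%m/%d/%Y").date()
        if filed < recall_date and component.lower() in c["components"].lower():
            matches.append((filed, c["odiNumber"]))
    return sorted(matches)

# Invented complaint records.
complaints = [
    {"odiNumber": 1001, "dateComplaintFiled": "03/14/2021", "components": "FUEL SYSTEM, GASOLINE"},
    {"odiNumber": 1002, "dateComplaintFiled": "11/02/2021", "components": "AIR BAGS"},
    {"odiNumber": 1003, "dateComplaintFiled": "06/30/2023", "components": "FUEL SYSTEM, GASOLINE"},
]
early = complaints_before(complaints, date(2023, 1, 15), "fuel system")
```

The earliest entry in the sorted result is your "first known report" date.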


8. USASpending

What it has: Every federal contract, grant, and loan -- recipient, awarding agency, amount, performance period, place of performance, and NAICS code. The most complete picture of where federal money actually goes.

Investigative use case: Tracking contracts to politically connected firms. A contractor with no employees and a residential address wins $4 million in no-bid contracts from an agency whose director came from that same firm's parent company. USASpending is how you find the pattern.

Tool: USASpending.gov Search -- filter by recipient, agency, award type, date, or amount. Structured output with full award details.
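
USASpending also exposes a public API. Award searches go through a POST to `/api/v2/search/spending_by_award/` with a JSON body of filters. The sketch below builds a body for contract awards to one recipient. The recipient name is a placeholder, and the field list is one reasonable selection, not the only one.

```python
import json
import urllib.request

ENDPOINT = "https://api.usaspending.gov/api/v2/search/spending_by_award/"

def award_search_payload(recipient: str, start: str, end: str, limit: int = 50) -> dict:
    """Request body for contract awards to a recipient within a date range."""
    return {
        "filters": {
            "recipient_search_text": [recipient],
            "award_type_codes": ["A", "B", "C", "D"],  # contract award types
            "time_period": [{"start_date": start, "end_date": end}],
        },
        "fields": ["Award ID", "Recipient Name", "Award Amount", "Awarding Agency"],
        "sort": "Award Amount",
        "order": "desc",
        "limit": limit,
        "page": 1,
    }

def search_awards(payload: dict) -> list:
    """Network call: POST the payload and return the 'results' list."""
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)["results"]

payload = award_search_payload("Example Consulting LLC", "2022-10-01", "2023-09-30")
```

Sorting by amount in descending order puts the largest awards first, which is usually where a review starts.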


9. FEMA Disaster Declarations

What it has: Every presidentially declared disaster and emergency since 1953 -- declaration type, state, county, incident type, declaration date, and program authorizations (which federal assistance programs were activated).

Investigative use case: Disaster response equity and delay. Which counties waited longest for declarations after comparable events? Which types of disasters consistently trigger faster federal response? FEMA data is the starting point for any serious investigation into emergency management failures.

Tool: FEMA Disaster Search -- filter by state, incident type, date range, or declaration type.
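
The delay question can be approximated as the gap between when an incident began and when it was declared. A sketch, assuming records shaped like OpenFEMA's Disaster Declarations Summaries dataset (`state`, `incidentBeginDate`, `declarationDate` as ISO timestamps). The sample records are invented.

```python
from datetime import date
from statistics import median

def declaration_delay_days(rec) -> int:
    """Days from incident start to declaration, using the date part of each timestamp."""
    begin = date.fromisoformat(rec["incidentBeginDate"][:10])
    declared = date.fromisoformat(rec["declarationDate"][:10])
    return (declared - begin).days

def median_delay_by_state(records):
    """Median declaration delay per state."""
    by_state = {}
    for rec in records:
        by_state.setdefault(rec["state"], []).append(declaration_delay_days(rec))
    return {state: median(days) for state, days in sorted(by_state.items())}

# Invented records.
records = [
    {"state": "TX", "incidentBeginDate": "2017-08-23T00:00:00.000Z",
     "declarationDate": "2017-08-25T00:00:00.000Z"},
    {"state": "TX", "incidentBeginDate": "2021-02-11T00:00:00.000Z",
     "declarationDate": "2021-02-15T00:00:00.000Z"},
    {"state": "LA", "incidentBeginDate": "2016-08-11T00:00:00.000Z",
     "declarationDate": "2016-08-21T00:00:00.000Z"},
]
delays = median_delay_by_state(records)
```

That dataset lists one row per designated area, so deduplicate by disaster number first if you want each declaration counted once.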


Building a Monitoring Pipeline

One-off lookups are useful. Ongoing monitoring is where this becomes powerful.

Most investigative stories are not breaking news -- they are patterns that develop over months. A facility that files a new toxic release report every year. A contractor that keeps winning federal awards despite poor performance ratings. An LLC that gets dissolved and re-registered under a slightly different name.

You can build automated pipelines that pull from these databases on a schedule, store the results, and alert you when something new matches your criteria. All of the tools above are built on Apify, which has a built-in scheduler and webhook support. Set a daily run on OSHA inspections filtered to your target employer, pipe the results to a spreadsheet or Slack channel, and you have a passive monitoring system that catches new violations without any manual checking.
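
Whatever scheduler you use, the core of the alerting step is remembering which records you have already seen and surfacing only the new ones. Here is a platform-agnostic sketch using a JSON file as state. The `id` key and the sample records are hypothetical; in practice you would key on an inspection number, award ID, or similar.

```python
import json
import tempfile
from pathlib import Path

def find_new(records, seen_path, key="id"):
    """Return records whose key is not yet in the state file, then update the file."""
    path = Path(seen_path)
    seen = set(json.loads(path.read_text())) if path.exists() else set()
    new = [r for r in records if r[key] not in seen]
    path.write_text(json.dumps(sorted(seen | {r[key] for r in new})))
    return new

# Simulate two scheduled runs against a temporary state file.
state_file = Path(tempfile.mkdtemp()) / "seen_ids.json"
monday = [{"id": "insp-1"}, {"id": "insp-2"}]
tuesday = [{"id": "insp-2"}, {"id": "insp-3"}]
first_run = find_new(monday, state_file)    # everything is new on the first run
second_run = find_new(tuesday, state_file)  # only the unseen record comes back
```

Anything `find_new` returns is what you would forward to Slack or a spreadsheet. An empty list means nothing changed since the last run.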

That is the actual value here -- not faster one-time lookups, but persistent automated coverage of sources that most newsrooms check manually, infrequently, or not at all.

The data exists. The portals just make it hard enough that most people give up. Automate past the friction, and you end up with coverage nobody else has.
