Phase I Environmental Site Assessments are not glamorous work. You spend hours cross-referencing EPA Envirofacts, OSHA inspection records, state environmental databases, and historical maps — most of it manual, most of it repetitive, and all of it subject to the same deadline pressure as every other deliverable.
I have been automating data pipelines for environmental consulting firms for a while now, and the Phase I workflow is one of the clearest cases where scripted data pulls replace hours of copy-paste work. Here is what I actually do.
The Data Sources That Matter
A standard Phase I ESA under ASTM E1527-21 requires checking a specific set of federal and state databases. For the federal side, the two heaviest hitters are:
- EPA Envirofacts / ECHO — enforcement and compliance history, facility-level discharge data, Superfund proximity, RCRA hazardous waste handlers
- OSHA inspection records — particularly useful for industrial properties; a history of serious citations tells you a lot about how a facility was operated
Both databases have search interfaces designed for single lookups, not batch assessments. If you are doing five properties a week, that is five separate manual sessions per database. Multiply by the number of databases on a standard EDR list and you are burning a full day just on records searches.
Pulling EPA Enforcement Data Programmatically
EPA ECHO has a REST API, but it is genuinely painful to work with if you are not comfortable constructing multi-parameter queries. You need to first query for a facility identifier, handle pagination, parse nested compliance history objects, and normalize across the NPDES, CAA, RCRA, and SDWA program areas. Doable, but several hundred lines of code before you have something production-usable.
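To give a sense of what that involves, here is a minimal sketch of the pagination loop alone. The `fetch_page` callable and the page shape (a list of record dicts, empty once the results run out) are assumptions made for illustration; they are not the actual ECHO response format, and a real client would still need the facility lookup and per-program normalization on top of this.

```python
from typing import Callable, Iterator


def iter_records(
    fetch_page: Callable[[int], list[dict]], max_pages: int = 100
) -> Iterator[dict]:
    """Yield records page by page until an empty page or max_pages is hit.

    fetch_page is a hypothetical callable: given a 1-based page number it
    returns that page's records as a list of dicts (empty when exhausted).
    """
    for page_number in range(1, max_pages + 1):
        records = fetch_page(page_number)
        if not records:
            return
        yield from records


# Stand-in for a real HTTP call, so the sketch runs offline.
FAKE_PAGES = {1: [{"id": "A"}, {"id": "B"}], 2: [{"id": "C"}]}


def fake_fetch(page_number: int) -> list[dict]:
    return FAKE_PAGES.get(page_number, [])


all_records = list(iter_records(fake_fetch))
```

Passing the fetch function in as a parameter keeps the loop testable without network access; the `max_pages` cap is a guard against an endpoint that never returns an empty page.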
The faster path is using a pre-built actor that handles pagination and normalization for you. The EPA enforcement data actor on Apify takes a facility name, address, or EPA facility ID and returns structured compliance records — violations, enforcement actions, penalties, and inspection history.
Example input:
{
  "facilityName": "Acme Manufacturing",
  "state": "OH",
  "maxResults": 50
}
You get back records with violation dates, regulatory program, enforcement response type, and penalty amounts — exactly what goes into the regulatory agency records section of a Phase I report.
OSHA Inspection Records for Industrial Properties
OSHA inspection data is underused in Phase I work. An industrial property with a pattern of serious or willful citations for chemical handling, storage tank violations, or respiratory hazard controls is worth scrutinizing more closely — even if the EPA record looks clean. The two data sources tell different stories: EPA tracks environmental discharge and waste, OSHA tracks how the facility was actually operated day-to-day.
OSHA has a public enforcement portal, but it is designed for single-facility lookups with no bulk export. The OSHA inspection actor returns structured inspection history — inspection dates, activity type, citation counts by severity level, violation descriptions, and penalty totals — with a single API call:
curl -X POST https://api.apify.com/v2/acts/pink_comic~osha-inspections/run-sync-get-dataset-items \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"establishmentName": "Acme Manufacturing", "state": "OH"}'
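The same request can be made from Python with only the standard library. This is a sketch that mirrors the curl command above: the URL, headers, and input fields are copied from it, while the function names and the assumption that the response body is a JSON array of records are mine. Building the request is kept separate from sending it so the payload can be inspected without a network call.

```python
import json
import urllib.request

APIFY_URL = (
    "https://api.apify.com/v2/acts/pink_comic~osha-inspections/"
    "run-sync-get-dataset-items"
)


def build_request(
    token: str, establishment_name: str, state: str
) -> urllib.request.Request:
    """Build the same POST request as the curl command, without sending it."""
    payload = {"establishmentName": establishment_name, "state": state}
    return urllib.request.Request(
        APIFY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def fetch_inspections(token: str, establishment_name: str, state: str) -> list:
    """Send the request and parse the JSON body (this makes a network call).

    Assumes the response is a JSON array of inspection records.
    """
    request = build_request(token, establishment_name, state)
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)


# Inspect the request locally; call fetch_inspections() with a real token to run it.
request = build_request("YOUR_APIFY_TOKEN", "Acme Manufacturing", "OH")
```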
A Practical Phase I Records Workflow
Here is how I structure an automated records pull for a Phase I assessment:
Step 1: Geocode the subject property. Get lat/lng for the address. EPA ECHO supports radius searches from coordinates, which is more reliable than name matching when the facility has changed ownership or operating names.
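Step 2: Pull EPA ECHO data for the property and the surrounding area. You want the subject property's compliance history plus any Superfund sites and RCRA facilities nearby. A 0.5-mile radius is a reasonable starting point, but ASTM E1527-21 sets a different minimum search distance for each database type (NPL sites, for example, are searched out to 1.0 mile), and a standard EDR report follows those distances. Replicate that logic in your query parameters. Underground storage tank records are generally maintained by state programs, so expect to pull those separately.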
Step 3: Pull OSHA inspection history for the facility name. Match on name and state. You will get false positives if the name is common — filter by NAICS code or zip code to narrow it down.
Step 4: Flag and summarize. Enforcement actions in the last 5 years, any open violations, penalties over $10K, active RCRA corrective action status — these trigger a closer review note in the Phase I narrative.
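Steps 2 through 4 can be sketched as three small functions. Everything about the record shape here is an assumption for illustration: the field names (`lat`, `lon`, `zip`, `naics`, `action_date`, `penalty`, `open`, `rcra_corrective_action`) are hypothetical and will need mapping to whatever your data source actually returns. The thresholds come straight from Step 4.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8


def miles_between(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula)."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (
        sin(dlat / 2) ** 2
        + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    )
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def within_radius(facilities, lat, lon, radius_miles=0.5):
    """Step 2: keep facilities within radius_miles of the subject property."""
    return [
        f for f in facilities
        if miles_between(lat, lon, f["lat"], f["lon"]) <= radius_miles
    ]


def narrow_matches(inspections, zip_code=None, naics_prefix=None):
    """Step 3: drop name-only matches that disagree on zip or NAICS prefix."""
    kept = []
    for rec in inspections:
        if zip_code and rec.get("zip") != zip_code:
            continue
        if naics_prefix and not str(rec.get("naics", "")).startswith(naics_prefix):
            continue
        kept.append(rec)
    return kept


def review_flags(record, today):
    """Step 4: list the reasons a record needs a closer review note."""
    flags = []
    action_date = record.get("action_date")
    if action_date and (today - action_date).days <= 5 * 365:
        flags.append("enforcement action in last 5 years")
    if record.get("open"):
        flags.append("open violation")
    if record.get("penalty", 0) > 10_000:
        flags.append("penalty over $10K")
    if record.get("rcra_corrective_action"):
        flags.append("active RCRA corrective action")
    return flags


example = {"action_date": date(2023, 6, 1), "penalty": 25_000, "open": False}
example_flags = review_flags(example, today=date(2025, 1, 1))
# example_flags == ["enforcement action in last 5 years", "penalty over $10K"]
```

Pass a different `radius_miles` per database type rather than hard-coding 0.5 everywhere. Returning the flag reasons as plain strings makes them easy to paste into the review notes for the narrative.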
With this pipeline handling steps 1 through 3 automatically, a typical Phase I records section goes from 2-3 hours of manual searching to about 20-30 minutes of review and narrative writing.
What This Does Not Replace
Automated data pulls handle the federal records component of Phase I well. They do not replace:
- Site reconnaissance and physical inspection
- State agency records without accessible APIs
- Historical aerial photo and Sanborn map review
- Interviews with current and former owners or operators
- Local fire department records
Those still require human judgment and site-specific context. But the federal records search component is almost entirely automatable, and in my experience it is one of the most time-consuming desk tasks in the Phase I scope when done manually.
The Business Case
If you bill Phase I assessments at a flat rate and records search takes 3 hours, automating it to 30 minutes recovers 2.5 hours of staff time per project. At 10 Phase Is a month, that is 25 hours, or roughly 15% of one full-time employee's month (assuming about 160 working hours), recovered from that single task.
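To run the same arithmetic with your own numbers, a small helper is enough. The function name and parameters are mine; the inputs below are the figures from the paragraph above.

```python
def hours_recovered(manual_hours, automated_hours, projects_per_month):
    """Staff hours saved per month on the records search."""
    return (manual_hours - automated_hours) * projects_per_month


monthly = hours_recovered(manual_hours=3.0, automated_hours=0.5, projects_per_month=10)
print(monthly)       # 25.0 hours per month
print(monthly * 12)  # 300.0 hours per year
```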
For consulting firms doing volume work, that margin improvement compounds fast. For individual consultants billing hourly, it means you can take on more projects or reduce turnaround time without extending hours.
The EPA and OSHA actors are available on Apify and can be called from Python, JavaScript, or directly via the REST API. The JSON output is structured for downstream processing — drop it into a database, a report template, or a workflow automation tool without reformatting.
If you are integrating this into an existing Phase I workflow tool or environmental management system, drop a comment below — happy to talk through specific integration patterns.