Building permit data is public record in most US cities, but accessing it at scale is genuinely painful. The data exists - it's just buried in inconsistent government APIs whose schemas, field types, and date formats differ from city to city.
The use case I was solving for: contractors doing lead gen. A solar installer wants to know who just pulled a solar permit in their area. A roofing company wants fresh leads. The data is all public - it just isn't usable in bulk without writing against each city's API individually.
What I built
An Apify actor that normalises permit data across 36 US cities (Chicago, NYC, LA, Seattle, Miami, Columbus, Washington DC, and more) into a single clean schema. Launched on the Apify marketplace at $1.50/1,000 permits.
What the normalisation problem actually looked like
Cities use three completely different open data platforms - Socrata, ArcGIS, and Accela - each with different APIs, pagination logic, and authentication. Then within each platform, every city has its own field naming conventions. "Permit type" means something different in every city. Date fields are sometimes ISO strings, sometimes Unix timestamps, sometimes just free text.
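To make the date problem concrete, here is a minimal Python sketch of how the three shapes (ISO strings, Unix timestamps, free text) could be collapsed into one ISO date. It is an illustration, not the actor's actual code: the function name `normalise_issue_date`, the list of fallback formats, and the seconds-versus-milliseconds threshold are all assumptions.

```python
from datetime import datetime, timezone
from typing import Optional, Union

# Hypothetical fallback formats; a real city list would grow one entry at a time.
FALLBACK_FORMATS = ("%m/%d/%Y", "%m/%d/%y", "%B %d, %Y", "%Y%m%d")


def normalise_issue_date(raw: Union[str, int, float, None]) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None when it cannot be parsed."""
    if raw is None or raw == "":
        return None

    # Numeric values are treated as Unix timestamps. Assumption: anything
    # above 1e11 is in milliseconds, anything below is in seconds.
    if isinstance(raw, (int, float)):
        seconds = raw / 1000 if raw > 1e11 else raw
        return datetime.fromtimestamp(seconds, tz=timezone.utc).date().isoformat()

    text = raw.strip()

    # ISO 8601 strings, with a trailing "Z" rewritten for older Python versions.
    try:
        return datetime.fromisoformat(text.replace("Z", "+00:00")).date().isoformat()
    except ValueError:
        pass

    # Free text: try each known format, give up quietly if none match.
    for fmt in FALLBACK_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None
```

Returning `None` instead of raising keeps one unparseable record from failing a whole city run; the record can still be emitted with an empty date.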
I added a flag (issueDateIsString) for cities that store dates as strings so the query layer knows not to try date arithmetic on them. That kind of one-off fix exists for almost every city.
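A rough sketch of how a flag like that could steer the query layer, assuming a SoQL-style `$where` filter. Only the idea behind `issueDateIsString` comes from the actor; the `CityConfig` class, the field names, and the `build_where_clause` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CityConfig:
    name: str
    issue_date_field: str               # hypothetical city-specific column name
    issue_date_is_string: bool = False  # mirrors the issueDateIsString flag


def build_where_clause(city: CityConfig, start_iso: str, end_iso: str) -> Optional[str]:
    """Build a server-side date filter, or None when the city stores dates as text."""
    if city.issue_date_is_string:
        # Comparing text columns as dates is unreliable, so skip the
        # server-side filter and let the caller filter after parsing.
        return None
    field = city.issue_date_field
    return f"{field} >= '{start_iso}' AND {field} < '{end_iso}'"


typed_city = CityConfig(name="city_a", issue_date_field="issue_date")
text_city = CityConfig(name="city_b", issue_date_field="issued", issue_date_is_string=True)

typed_filter = build_where_clause(typed_city, "2025-11-01", "2026-01-01")
text_filter = build_where_clause(text_city, "2025-11-01", "2026-01-01")
```

When the helper returns `None`, the caller would pull a wider window and apply the date range client-side after parsing each string.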
What the data actually looks like
I ran the actor against Chicago for Nov–Dec 2025. Results:
- 2,755 permits in 2 months
- $783M in declared project value
- 205 solar installs - each with address, project value ($7k–$26k), contractor name and license type, and exact issue date
- 394 roofing permits with the same fields attached
Every record comes out in the same schema regardless of which city it came from. That's the whole point.
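The "same schema everywhere" idea can be sketched as a per-city field map applied to each raw record. Everything here is illustrative: the city keys, the source column names, and the list of common fields are made up for the example and are not the actor's real schema.

```python
from typing import Any, Dict

# Hypothetical common schema; missing values are filled with None.
COMMON_FIELDS = (
    "permit_id", "permit_type", "address",
    "project_value", "contractor_name", "issue_date",
)

# Hypothetical source-to-common field maps, one per city.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "city_a": {
        "permit_num": "permit_id",
        "work_type": "permit_type",
        "street_address": "address",
        "est_cost": "project_value",
        "contractor": "contractor_name",
        "issued": "issue_date",
    },
    "city_b": {  # this city publishes no contractor field
        "PERMITNO": "permit_id",
        "PERMIT_CLASS": "permit_type",
        "SITE_ADDR": "address",
        "VALUATION": "project_value",
        "ISSUEDDATE": "issue_date",
    },
}


def to_common_schema(city: str, raw: Dict[str, Any]) -> Dict[str, Any]:
    """Rename a raw record's fields so every city yields the same keys."""
    record: Dict[str, Any] = {field: None for field in COMMON_FIELDS}
    record["city"] = city
    for source_name, target_name in FIELD_MAPS[city].items():
        if source_name in raw:
            record[target_name] = raw[source_name]
    return record


a = to_common_schema("city_a", {"permit_num": "A-1", "contractor": "Acme Solar"})
b = to_common_schema("city_b", {"PERMITNO": "B-9", "VALUATION": 12000})
```

Both records come back with identical keys. Fields a city does not publish stay `None` rather than disappearing, so downstream code never has to branch on the city.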
What I learned about data quality
- Chicago updates daily. LA's data is 12–24 months behind. Same product category, completely different freshness.
- Some cities include contractor names and license numbers. Others publish nothing useful for lead gen.
- The Socrata, ArcGIS, and Accela APIs are all publicly documented - but almost none of the city-specific field names are. That reverse-engineering work is where most of the time went.
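The freshness gap is easy to measure once dates are normalised. A small sketch, assuming issue dates are already YYYY-MM-DD strings; the function name `freshness_lag_days` is hypothetical.

```python
from datetime import date
from typing import Iterable, Optional


def freshness_lag_days(issue_dates: Iterable[Optional[str]], today: date) -> Optional[int]:
    """Days between the newest issue date in a feed and the reference date."""
    parsed = [date.fromisoformat(d) for d in issue_dates if d]
    if not parsed:
        return None  # nothing parseable, so freshness is unknown
    return (today - max(parsed)).days


# A feed whose newest permit is from 30 Dec, checked on 2 Jan.
lag = freshness_lag_days(["2025-11-01", None, "2025-12-30"], date(2026, 1, 2))
# lag == 3
```

Running this per city on a schedule gives a number to show buyers, so a feed that is a year or more behind is visible before they pay for it.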
Distribution via Apify marketplace
Apify has a marketplace with real search traffic from developers and data buyers. Pricing is pay-per-use, which aligns well with the use case - buyers only pay for what they pull.
One thing I've been thinking about since launch: the buyers on Apify skew toward developers building lead gen tools, not the contractors themselves. The contractor may be the end beneficiary, but the person paying for the data pipeline is usually someone building on top of it. That changes the messaging somewhat - worth thinking about if you're building anything similar.
Happy to answer questions about the Apify marketplace, government open data APIs, or the normalisation approach.
Actor: https://apify.com/handstands.io/us-building-permit-scraper