Real estate wholesaling is a data game. You need to find distressed properties before anyone else does. Foreclosure filings, delinquent tax records, code violations, 311 complaints. These are all public records. Most wholesalers check them manually. I automated it.
## The System: Crawl OS
Crawl OS is a set of five Supabase Edge Functions that scrape 14 Texas counties every night. Each function handles a different data source:
- Foreclosure filings from county clerk websites
- Delinquent tax records from county tax assessor portals
- Code violations from city code enforcement databases
- 311 complaints from municipal service request systems
- Normalization function that standardizes addresses and deduplicates across sources
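To make the scraping step concrete, here is a minimal sketch of the parsing half of one scraper. The record shape, field names, and three-column table layout are assumptions for illustration, not the actual Crawl OS code; many county clerk sites really do serve plain HTML tables, so a regex pass over the cells is often all you need when there is no API.

```typescript
// Hypothetical shape of one foreclosure record; real county
// tables vary in column order and naming.
interface ForeclosureRecord {
  caseNumber: string;
  address: string;
  saleDate: string;
}

// Pull <td> cells out of each <tr> in a raw HTML table.
// Rows without at least three data cells (headers, spacers) are skipped.
function parseForeclosureRows(html: string): ForeclosureRecord[] {
  const records: ForeclosureRecord[] = [];
  const rowRe = /<tr>(.*?)<\/tr>/gs;
  const cellRe = /<td>(.*?)<\/td>/gs;
  for (const row of html.matchAll(rowRe)) {
    const cells = [...row[1].matchAll(cellRe)].map((m) => m[1].trim());
    if (cells.length < 3) continue; // header or malformed row
    records.push({
      caseNumber: cells[0],
      address: cells[1],
      saleDate: cells[2],
    });
  }
  return records;
}
```

In a real scraper you would fetch the page inside the edge function, run a parser like this, and insert the results into a staging table.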
## The Pipeline
Each night at 2 AM Central, pg_cron triggers the scraping functions. They hit the county websites, parse the HTML or API responses, extract the relevant records, and insert them into staging tables.
The normalization function runs at 3 AM. It standardizes addresses (matching "123 Main St" with "123 Main Street"), geocodes new properties, and merges records from different sources into a single lead record.
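The address-matching part of that step can be sketched as a pure function. The suffix table and the exact rules here are assumptions about what a normalizer like this would do, not the production code:

```typescript
// Common street-suffix abbreviations; a real normalizer would use the
// full USPS suffix table. This subset is illustrative.
const SUFFIXES: Record<string, string> = {
  street: "st", avenue: "ave", boulevard: "blvd",
  drive: "dr", lane: "ln", road: "rd", court: "ct",
};

// Lowercase, strip punctuation, collapse whitespace, and abbreviate
// suffixes so "123 Main Street" and "123 Main St." compare equal.
function normalizeAddress(raw: string): string {
  return raw
    .toLowerCase()
    .replace(/[.,#]/g, "")
    .split(/\s+/)
    .map((word) => SUFFIXES[word] ?? word)
    .join(" ")
    .trim();
}
```

Once every source's addresses pass through the same normalizer, deduplication is a simple equality check (or a join key) on the normalized string.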
At 4 AM, the scoring function runs. Each lead gets a score based on:
- Number of distress signals (foreclosure + tax delinquent = higher score)
- Property value (from county appraisal data)
- Days since first distress signal
- Neighborhood trend data
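A toy version of that scoring pass might look like this. The weights, thresholds, and field names are made up for illustration (and the neighborhood-trend factor is omitted for brevity); the point is that stacked distress signals dominate the score:

```typescript
// Hypothetical lead shape; real leads carry more fields.
interface Lead {
  signals: string[];          // e.g. ["foreclosure", "tax_delinquent"]
  appraisedValue: number;     // from county appraisal data
  daysSinceFirstSignal: number;
}

function scoreLead(lead: Lead): number {
  let score = lead.signals.length * 25;            // stacked signals score highest
  if (lead.appraisedValue > 100_000) score += 10;  // enough value to be worth a call
  if (lead.daysSinceFirstSignal < 30) score += 15; // fresh distress is warmer
  return score;
}
```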
By 5 AM, the dashboard shows fresh scored leads. The AI outreach agent starts making calls at 9 AM.
## The Numbers
14 counties. Roughly 125 scored leads per nightly run. The leads feed into Load Bearing Capital's wholesale pipeline at loadbearingcapitaltx.com.
## What I Learned
County websites are terrible. Different formats, different update schedules, different levels of data quality. Some counties have APIs. Most have HTML tables from 2004. You end up writing a scraper for each one, and you handle failures gracefully, because at least 2 of the 14 will be down on any given night.
Isolation matters. Each county gets its own pg_cron job. If Harris County's website is down, it does not block the Galveston County scrape. Failures are logged and retried the next night.
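The same isolation idea applies at the application level if several scrapes run in one process. A sketch, with placeholder county names and a placeholder `scrape` function: each scrape settles independently, so one county's outage cannot block the rest.

```typescript
// Run every county scrape independently and report which succeeded.
// `scrape` is a placeholder for a per-county scraper that resolves to
// the number of rows it inserted.
async function scrapeAllCounties(
  counties: string[],
  scrape: (county: string) => Promise<number>,
): Promise<{ succeeded: string[]; failed: string[] }> {
  // allSettled never rejects: a failed county becomes a "rejected"
  // entry instead of taking down the whole run.
  const results = await Promise.allSettled(counties.map((c) => scrape(c)));
  const succeeded: string[] = [];
  const failed: string[] = [];
  results.forEach((r, i) => {
    if (r.status === "fulfilled") succeeded.push(counties[i]);
    else failed.push(counties[i]); // log it; tomorrow's run retries
  });
  return { succeeded, failed };
}
```

Separate pg_cron jobs per county give you the same property at the scheduler level, plus independent retry timing.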
## The Legal Part
All of this data is public record. County governments publish it specifically so citizens can access it. Scraping public records is legal. Using that data to contact property owners with legitimate purchase offers is legal. Just do not pretend to be a government agency and you are fine.
## Build It Yourself
If you are in real estate and you are still checking county websites manually, you are giving your competition a 12-hour head start every single day. Automate it.