DEV Community

lulzasaur


I Replaced 5 State Website Scrapers with One API Call

If you've ever had to verify a contractor or nurse license, you know the pain. Each state has its own website, its own search form, its own output format. California's CSLB looks nothing like Texas's TDLR, which looks nothing like Florida's DBPR.

I was building a compliance tool for a staffing platform and kept writing throwaway scrapers for each state. After the third one broke because a state board redesigned their site, I just built an API that normalizes all of them.

Here's the before/after.

The Manual Way (Per State)

For California contractors, you'd go to the CSLB site, fill out a form, and parse the HTML response. For Texas, it's TDLR with a completely different flow. Florida? DBPR, another UI.

Each lookup takes 5-10 minutes to do manually. If you're verifying 50 contractors for a construction company, that's 4+ hours of tab-switching (50 lookups at 5 minutes each is already 250 minutes).

And if you're doing it programmatically? You need a separate scraper for each state, each with its own selectors, error handling, and maintenance burden.

The API Way

One REST call, same JSON schema regardless of state:

curl "https://your-api/contractor-license/verify?state=CA&licenseNumber=1096738"
Response:
{
  "results": [
    {
      "state": "CA",
      "licenseNumber": "1096738",
      "businessName": "Pacific Construction Inc",
      "personName": "John Smith",
      "licenseStatus": "Active",
      "licenseType": "General Building Contractor",
      "expirationDate": "2026-12-31",
      "city": "Los Angeles"
    }
  ],
  "count": 1,
  "processingMs": 4523
}

Same endpoint for Texas, this time searching by name instead of license number:

curl "https://your-api/contractor-license/verify?state=TX&name=Smith"

Same JSON schema. Same field names. No per-state logic in your app.
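Because only the query parameters change between states, the URL construction can live in one small helper. This is a minimal sketch: `buildVerifyUrl` and `BASE_URL` are names made up for illustration, and `https://your-api` is the same placeholder host used in the curl examples above.

```javascript
// Placeholder host, taken from the curl examples; swap in the real base URL.
const BASE_URL = 'https://your-api';

// Build a verify URL for a license kind ("contractor" or "nurse").
// `query` holds whichever search fields you have, e.g. { state, licenseNumber }.
function buildVerifyUrl(kind, query) {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(query)) {
    // Skip empty fields so they don't end up as "name=undefined" in the URL.
    if (value !== undefined && value !== null && value !== '') {
      params.set(key, String(value));
    }
  }
  return `${BASE_URL}/${kind}-license/verify?${params}`;
}

console.log(buildVerifyUrl('contractor', { state: 'CA', licenseNumber: '1096738' }));
console.log(buildVerifyUrl('contractor', { state: 'TX', name: 'Smith' }));
```

The per-state difference is reduced to data (the `query` object), so the calling code has no state-specific branches.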

Nurse Licenses Too

The same API handles nurse credential verification:

curl "https://your-api/nurse-license/verify?state=FL&lastName=Johnson"
Response:
{
  "results": [
    {
      "state": "FL",
      "licenseNumber": "RN9876543",
      "personName": "Sarah Johnson",
      "licenseStatus": "Active - Clear",
      "licenseType": "Registered Nurse",
      "expirationDate": "2027-04-30"
    }
  ]
}

This matters for healthcare staffing platforms and credentialing departments that need to verify both trade licenses (contractors, electricians) and healthcare credentials (nurses, therapists) — normally two completely separate verification workflows.
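One detail worth noticing in the two sample responses: the contractor record reports `"Active"` while the nurse record reports `"Active - Clear"`, so a strict equality check on `licenseStatus` would treat the nurse as inactive. A possible way to handle both is a small predicate like the one below. `isActive` is a hypothetical helper, and it assumes that every status beginning with "Active" should count as active, which may be too permissive if a board appends a restriction to the status text.

```javascript
// Treat any status that starts with "active" (case-insensitive) as active,
// so both "Active" and "Active - Clear" from the sample responses pass.
// Assumption: no board prefixes a disqualifying status with "Active".
function isActive(licenseStatus) {
  if (typeof licenseStatus !== 'string') return false;
  return licenseStatus.trim().toLowerCase().startsWith('active');
}

console.log(isActive('Active'));         // contractor sample
console.log(isActive('Active - Clear')); // nurse sample
console.log(isActive('Expired'));
```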

Time Savings Breakdown

| Task | Manual | API |
|---|---|---|
| Single license lookup | 5-10 min | 3-5 sec |
| 50 contractor verifications | 4+ hours | 2 minutes (batch) |
| Adding a new state | Build new scraper (days) | Already supported |
| Handling site redesign | Fix broken scraper | API maintainer handles it |

The real value isn't the individual lookup speed — it's the maintenance burden. State websites change their markup regularly. Someone has to keep those scrapers working.

Building a Compliance Workflow

Here's a simple Node.js script that checks license status for a list of contractors:

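// Requires Node.js 18+ (built-in fetch); save as an ES module (.mjs) for top-level await.
const contractors = [
  { state: 'CA', licenseNumber: '1096738' },
  { state: 'TX', name: 'Smith Construction' },
  { state: 'FL', licenseNumber: 'CBC1234567' },
];

// Look up every contractor in parallel.
const results = await Promise.all(
  contractors.map(async (c) => {
    const params = new URLSearchParams(c);
    const res = await fetch(
      `https://your-api/contractor-license/verify?${params}`,
      { headers: { 'X-Api-Key': process.env.API_KEY } }
    );
    // Fail loudly on HTTP errors instead of parsing an error body as results.
    if (!res.ok) {
      throw new Error(`Lookup failed for ${params}: HTTP ${res.status}`);
    }
    return res.json();
  })
);

// Flag every license whose status is anything other than "Active".
const inactive = results
  .flatMap((r) => r.results ?? [])
  .filter((r) => r.licenseStatus !== 'Active');

if (inactive.length > 0) {
  console.log('⚠️ Expired/inactive licenses:', inactive);
  // Send alert to compliance team
}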

Run it on a daily cron job and it flags any license whose status is no longer active. Total code: a couple dozen lines.
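The script above only looks at `licenseStatus`, so it reacts after a license has lapsed. To get a warning ahead of time, the `expirationDate` field can be compared against a cutoff. The following is a sketch under the assumption that `expirationDate` is always a `YYYY-MM-DD` string, as in the sample responses; `expiringWithin` is a hypothetical helper, not part of the API.

```javascript
// Return licenses whose expirationDate falls on or before `now + days`
// (licenses that have already expired are included too).
// Assumes expirationDate is "YYYY-MM-DD", as in the sample responses.
function expiringWithin(licenses, days, now = new Date()) {
  const cutoff = now.getTime() + days * 24 * 60 * 60 * 1000;
  return licenses.filter((lic) => {
    const expires = Date.parse(lic.expirationDate); // date-only strings parse as UTC midnight
    // Skip records with a missing or unparseable date rather than guessing.
    return !Number.isNaN(expires) && expires <= cutoff;
  });
}

// Sample records copied from the responses shown earlier in the post.
const sample = [
  { licenseNumber: '1096738', expirationDate: '2026-12-31' },
  { licenseNumber: 'RN9876543', expirationDate: '2027-04-30' },
];
const soon = expiringWithin(sample, 30, new Date('2026-12-15T00:00:00Z'));
console.log(soon.map((lic) => lic.licenseNumber)); // [ '1096738' ]
```

Passing `now` as a parameter keeps the function easy to test with a fixed date; in the cron job you would simply omit it.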

What States Are Supported?

Currently: California (CSLB), Texas (TDLR), Florida (DBPR), and New York (NYC Open Data) for contractors. Florida (DOH) and New York (NYSED) for nurses.

More states are being added — the normalization layer means new states slot in without changing your integration code.
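The post doesn't show how the normalization layer is built, but the general idea of "new states slot in" can be illustrated with an adapter registry: one small function per source that maps its raw record into the shared schema. Everything in this sketch is hypothetical, including the `adapters` map, the `normalize` function, and the raw field names (`lic_no`, `status_text`, `LicenseNumber`, `Status`), which are invented and not taken from any state board.

```javascript
// Hypothetical adapters, one per source. The raw field names are invented
// for illustration and do not reflect any real state board's data.
const adapters = new Map([
  ['CA', (raw) => ({
    state: 'CA',
    licenseNumber: String(raw.lic_no),
    licenseStatus: raw.status_text,
  })],
  ['TX', (raw) => ({
    state: 'TX',
    licenseNumber: String(raw.LicenseNumber),
    licenseStatus: raw.Status,
  })],
]);

// Convert one raw record into the shared schema, or fail for unknown states.
function normalize(state, raw) {
  const adapter = adapters.get(state);
  if (!adapter) {
    throw new Error(`No adapter registered for state: ${state}`);
  }
  return adapter(raw);
}

// Supporting a new state means registering one more adapter;
// callers of normalize() do not change.
console.log(normalize('CA', { lic_no: 1096738, status_text: 'Active' }));
```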

I built this because I needed it for my own projects. It's on RapidAPI if you want to try it — free tier included.
