DEV Community

Ava Torres


How to Search Secretary of State Business Filings Programmatically (Multi-State)

If you work in KYC, compliance, sales, or due diligence, you've probably spent time on Secretary of State websites manually searching for business entities. The experience is universally terrible -- each state has a different system, many run on 2000s-era stacks like ASP.NET, and none of them talk to each other.

There's no national business entity database. Each of the 50 states maintains its own registry independently. If you need to verify that "Acme LLC" is registered and active, you have to know which state it's registered in and search that state's specific portal.

Here's what I've learned building tools to automate this across multiple states.

The state-by-state landscape

California

  • Portal: bizfileonline.sos.ca.gov
  • Searchable by: entity name, entity number
  • Data available: entity name, number, type, status, formation date, jurisdiction, agent info
  • API: CA has an unofficial API that returns JSON. It's the cleanest of any state.

Texas

  • Portal: SOSDirect (sos.state.tx.us)
  • Searchable by: entity name, entity number, registered agent, officer/director name
  • Data available: entity name, type, status, formation date, registered agent, officers
  • API: No public API. The portal is a classic ASP.NET form with ViewState.
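
For readers who haven't scripted an ASP.NET WebForms page before, here is a minimal sketch of the ViewState round trip: read the hidden inputs from the search page, then send them back with your own form values. `__VIEWSTATE` and `__EVENTVALIDATION` are standard WebForms hidden fields, but the sample markup and the `txtEntityName` field name are assumptions for illustration, not the real SOSDirect form.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_postback(form_html, user_fields):
    """Merge the page's hidden state fields with our own form values."""
    parser = HiddenFieldParser()
    parser.feed(form_html)
    payload = dict(parser.fields)
    payload.update(user_fields)
    return urlencode(payload)


# Stand-in for a fetched search page; the real markup will differ.
SAMPLE_FORM = """
<form method="post">
  <input type="hidden" name="__VIEWSTATE" value="dDwtMTIz" />
  <input type="hidden" name="__EVENTVALIDATION" value="AbC123" />
  <input type="text" name="txtEntityName" />
</form>
"""

# "txtEntityName" is a hypothetical field name.
body = build_postback(SAMPLE_FORM, {"txtEntityName": "ACME HOLDINGS LLC"})
print(body)
```

The resulting string is what you would send as the POST body, using the same session that fetched the form.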

New York

  • Portal: appext20.dos.ny.gov/corp_public
  • Searchable by: entity name, DOS ID
  • Data available: entity name, type, status, jurisdiction, filing date, county, process address
  • API: No public API. Returns HTML tables.
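
When a portal only returns HTML tables, the parsing step can stay in the standard library. The sketch below turns a simple table into a list of dicts keyed by the header row. The sample table is made up for illustration, and a real results page will likely need extra handling for nested tags, pagination, and detail links.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse internal whitespace so values compare cleanly.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def table_to_records(html):
    """Use the first row as headers and return the rest as dicts."""
    parser = TableParser()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Made-up sample; column names are assumptions, not the portal's real headers.
SAMPLE_TABLE = """
<table>
  <tr><th>Entity Name</th><th>DOS ID</th><th>Status</th></tr>
  <tr><td>ACME HOLDINGS LLC</td><td>1234567</td><td>Active</td></tr>
</table>
"""

records = table_to_records(SAMPLE_TABLE)
print(records)
```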

Florida

  • Portal: search.sunbiz.org
  • Searchable by: entity name, officer/agent name, document number
  • Data available: entity name, type, status, filing date, officers, registered agent, annual reports
  • API: No API, but FL publishes bulk data via SFTP (weekly). This is the exception.

The common problems

1. No standard schema. California calls it "Entity Number", Texas calls it "Filing Number", New York calls it "DOS ID", Florida calls it "Document Number". Same concept, different field name, different format.
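
A minimal sketch of how that mismatch can be papered over: one lookup table from state to the label it uses, and a function that emits a common `entityNumber` key. The labels come from the paragraph above. The raw record shape and the function name are assumptions for illustration.

```python
# State-specific label for the registry identifier, as described above.
ID_FIELD_BY_STATE = {
    "CA": "Entity Number",
    "TX": "Filing Number",
    "NY": "DOS ID",
    "FL": "Document Number",
}


def normalize_identifier(state, raw_record):
    """Map a state's own ID field onto a shared 'entityNumber' key."""
    field = ID_FIELD_BY_STATE[state]
    value = str(raw_record[field]).strip()
    # Keep the original label so the value can be traced back to its source.
    return {"jurisdiction": state, "entityNumber": value, "sourceField": field}


# Hypothetical raw record, shaped like a scraped row.
ny_row = {"Entity Name": "ACME HOLDINGS LLC", "DOS ID": " 1234567 "}
print(normalize_identifier("NY", ny_row))
```

The identifier is kept as a string on purpose: formats differ by state, and casting to an integer would drop any leading zeros or letter prefixes.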

2. Anti-bot protection is increasing. As of 2026, Florida and several other states have added Cloudflare or Akamai protection. Texas and New York still work with direct HTTP requests. California's unofficial API is the easiest to work with.

3. Search matching is inconsistent. Some states return exact matches only. Others return wildcard matches that bury the entity you're looking for under hundreds of results. Name formatting (LLC vs L.L.C. vs Limited Liability Company) causes missed matches.
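
One way to reduce the formatting misses is to compare names on a normalized key rather than the raw string. This is a rough sketch, not a complete entity-name matcher: the suffix list is a small illustrative subset, and spaced forms like "L. L. C." are not handled.

```python
import re

# Long-form suffixes mapped to a short canonical form (illustrative subset).
LONG_FORMS = {
    "LIMITED LIABILITY COMPANY": "LLC",
    "INCORPORATED": "INC",
    "CORPORATION": "CORP",
}


def normalize_name(name):
    """Build a comparison key: uppercase, no periods/commas, short suffixes."""
    text = name.upper()
    text = re.sub(r"[.,]", "", text)          # "L.L.C." -> "LLC"
    text = re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace
    for long_form, short in LONG_FORMS.items():
        text = re.sub(rf"\b{long_form}\b", short, text)
    return text


variants = [
    "Acme Holdings LLC",
    "Acme Holdings, L.L.C.",
    "ACME HOLDINGS LIMITED LIABILITY COMPANY",
]
keys = {normalize_name(v) for v in variants}
print(keys)
```

Use the key only for matching and keep the name exactly as the state reports it in your stored record.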

4. Rate limits are invisible. No state documents its rate limits. Send requests too fast and you get soft-blocked (empty results or 503 errors) with no message explaining why.
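
Since the only signal is an empty result set or a 503, a cautious client can treat both as a possible soft block and back off before retrying. The sketch below assumes a hypothetical `fetch(query)` callable that returns `(status_code, results)`. The attempt count and delays are guesses, not values any state has published.

```python
import random
import time


def looks_soft_blocked(status_code, results):
    """Treat a 503, or a 200 with no rows, as a possible soft block."""
    return status_code == 503 or (status_code == 200 and not results)


def fetch_with_backoff(fetch, query, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Retry with exponential backoff plus jitter; return [] if all attempts fail."""
    for attempt in range(max_attempts):
        status_code, results = fetch(query)
        if not looks_soft_blocked(status_code, results):
            return results
        if attempt < max_attempts - 1:
            # 2s, 4s, 8s... plus up to 1s of jitter to avoid a fixed rhythm.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            sleep(delay)
    return []
```

An empty result can also mean the entity simply doesn't exist, so this trades some extra requests for fewer false "not found" answers. Passing `sleep` as a parameter keeps the function easy to test without real waiting.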

Building it yourself vs. using existing tools

If you need to search one state occasionally, the web portal is fine. If you need to:

  • Search across multiple states in a single workflow
  • Do bulk entity verification (KYC/compliance)
  • Monitor entity status changes over time
  • Feed results into a CRM or database automatically

...then you need programmatic access.
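
To make the monitoring case concrete: once lookups are scripted, tracking status changes can be a diff between two snapshots. This sketch assumes each record carries `jurisdiction`, `entityNumber`, and `status` keys, matching the normalized shape used in this article. The "Dissolved" status value is illustrative.

```python
def status_changes(previous, current):
    """Report entities whose status differs between two snapshots."""

    def index(records):
        # Entity numbers are only unique within a state, so key on both.
        return {(r["jurisdiction"], r["entityNumber"]): r["status"] for r in records}

    before, after = index(previous), index(current)
    return [
        {
            "jurisdiction": key[0],
            "entityNumber": key[1],
            "from": before[key],
            "to": after[key],
        }
        for key in sorted(before.keys() & after.keys())
        if before[key] != after[key]
    ]


last_week = [{"jurisdiction": "California", "entityNumber": "202312345678", "status": "Active"}]
today = [{"jurisdiction": "California", "entityNumber": "202312345678", "status": "Dissolved"}]
changes = status_changes(last_week, today)
print(changes)
```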

I built individual actors for each state that handle the portal differences, anti-bot measures, and output normalization.

Each one takes a search query and returns normalized JSON:

{
  "entityName": "ACME HOLDINGS LLC",
  "entityNumber": "202312345678",
  "entityType": "Limited Liability Company",
  "status": "Active",
  "formationDate": "2023-05-15",
  "jurisdiction": "California",
  "registeredAgent": "CT Corporation System",
  "principalAddress": "123 Market St, San Francisco, CA 94105"
}
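
With every state returning the same shape, a multi-state lookup becomes a fan-out over per-state search functions. The sketch below is not how the actors themselves are invoked. It assumes you have wrapped each state behind a hypothetical callable that takes a query and returns a list of normalized dicts, and it uses stub functions in place of real lookups.

```python
from concurrent.futures import ThreadPoolExecutor


def search_all_states(searchers, query):
    """Run every per-state searcher; collect failures instead of raising."""

    def run(item):
        state, search = item
        try:
            return state, search(query), None
        except Exception as exc:
            return state, [], str(exc)

    with ThreadPoolExecutor(max_workers=4) as pool:
        outcomes = list(pool.map(run, searchers.items()))

    records = [rec for _, recs, _ in outcomes for rec in recs]
    errors = {state: err for state, _, err in outcomes if err}
    return records, errors


# Stubs standing in for real per-state lookups (hypothetical).
def fake_ca_search(query):
    return [{
        "entityName": "ACME HOLDINGS LLC",
        "entityNumber": "202312345678",
        "status": "Active",
        "jurisdiction": "California",
    }]


def fake_tx_search(query):
    raise TimeoutError("portal did not respond")


records, errors = search_all_states(
    {"CA": fake_ca_search, "TX": fake_tx_search}, "acme holdings"
)
print(records)
print(errors)
```

Returning errors alongside results matters for verification work: "Texas failed" and "no match in Texas" are very different answers.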

Why fragmentation is the real problem (and the opportunity)

The reason no one has built a clean multi-state business entity search is that it's 50 separate engineering problems. Each state has different:

  • Web technology (ASP.NET, PHP, Java servlets, React)
  • Search parameters and result formats
  • Anti-bot measures
  • Data fields and naming conventions
  • Update frequency and data freshness

Services like OpenCorporates, Cobalt Intelligence, and LexisNexis aggregate this data but charge enterprise pricing ($500-5000/month). If you need occasional lookups or want to build your own pipeline, the per-state tools above cost pennies per search.

What's coming

I'm expanding to cover all 50 states. The four covered above (CA, TX, NY, FL) are done. Next: IL, PA, OH, GA, NJ. The goal is programmatic access to every state's business entity registry with normalized output.

If you're building KYC/compliance tooling and need specific states, drop a comment -- I'll prioritize based on demand.


I build automation tools for public records data. State SOS portals are my specialty -- and my recurring nightmare.
