DEV Community

Sachh Moka

I built a REST API for Australian business data - here's how

If you've ever tried to programmatically look up an Australian Business Number, you've probably met the ABR's SOAP/XML web service. If you've tried to get ASIC company details or ACNC charity records, you've discovered they don't have an API at all — just weekly CSV dumps.

I built OzRegAPI to solve this. It's a REST/JSON API that wraps all three government data sources into one interface. Here's how I built it, what decisions I made, and what I learned shipping my first paid API product.

The problem

Australia has three separate business registries:

  • ABR (Australian Business Register) — ABN lookups, entity details, GST status. Has an API, but it's SOAP/XML with a WSDL you need to parse.
  • ASIC (Australian Securities and Investments Commission) — Company registrations, types, directors. No API. They publish a ~400MB CSV file weekly.
  • ACNC (Australian Charities and Not-for-profits Commission) — Charity records. Also no API. Also a weekly CSV.

If you're building accounting software, a customer onboarding flow, or any compliance tool in Australia, you probably need data from at least two of these. That means writing a SOAP client for ABR, a CSV ingestion pipeline for ASIC and ACNC, some kind of cross-referencing layer, and your own caching.

I figured other developers had the same problem.

The stack

I kept it deliberately simple:

  • Python + FastAPI — fast to build, auto-generates OpenAPI specs, async support
  • SQLite — ASIC (3.9M companies) and ACNC (64K charities) loaded from CSV into a single DB file, baked into the Docker image
  • zeep — Python SOAP client for talking to ABR
  • Google Cloud Run — scale to zero, pay nothing when idle
  • RapidAPI — handles billing, API keys, rate limiting, and marketplace distribution

No Redis, no Postgres, no Kubernetes. The whole thing runs as a single container.

What it does

16 endpoints across three tiers:

Free (200 requests/day):

  • ABN lookup — pass an ABN, get entity name, status, type, GST
  • ACN lookup — ASIC company details
  • Name search — find businesses by name
  • ABN validation — checksum + active status check

Pro ($9.99/month):

  • /business/{identifier} — the main endpoint. Pass an ABN or ACN, get a combined response from all three registries. Entity details, GST registration dates, ASIC company classification, ACNC charity data, name history, all in one call.
  • Batch lookup — up to 100 ABNs in one request

Ultra ($29.99/month):

  • Filter by postcode, ABN status, GST registration, charity type
  • Monitor new registrations by month
  • Track record changes since a date
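The checksum half of the ABN validation endpoint can be done entirely offline, because the ABN check is a published modulus-89 algorithm: subtract 1 from the first digit, multiply each digit by a fixed weight, and check that the sum divides evenly by 89. Here is a minimal standalone sketch of that check. The function name is made up for illustration and this is not necessarily how OzRegAPI implements it; the "active status" half still needs a registry lookup.

```python
# Weights applied to the 11 ABN digits, left to right.
ABN_WEIGHTS = (10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19)


def is_valid_abn(abn: str) -> bool:
    """Return True if the string passes the ABN modulus-89 checksum.

    Hypothetical helper: it only checks the number's format and checksum,
    not whether the ABN is currently active.
    """
    compact = abn.replace(" ", "")
    # Require exactly 11 ASCII digits once spaces are removed.
    if len(compact) != 11 or not (compact.isascii() and compact.isdigit()):
        return False
    digits = [int(ch) for ch in compact]
    digits[0] -= 1  # the algorithm subtracts 1 from the leading digit
    total = sum(weight * digit for weight, digit in zip(ABN_WEIGHTS, digits))
    return total % 89 == 0
```

Running the checksum first means malformed input can be rejected without spending an ABR call on it.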

Decisions I'd make again

SQLite over Postgres. The ASIC and ACNC data is read-only and refreshed weekly. SQLite handles 3.9M rows fine, queries take <10ms, and baking the DB into the Docker image means zero infrastructure. No managed database, no connection pooling, no credentials to manage.
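To make the read-only setup concrete, here is a rough sketch of opening a baked-in SQLite file in read-only mode and running a lookup. The table name, column names, and sample row are all invented for illustration and are not OzRegAPI's real schema.

```python
import sqlite3
from pathlib import Path
from typing import Optional


def build_demo_db(db_path: Path) -> None:
    """Create a tiny stand-in database (hypothetical schema, fake data)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE companies (acn TEXT PRIMARY KEY, name TEXT, status TEXT)"
    )
    conn.execute(
        "INSERT INTO companies VALUES (?, ?, ?)",
        ("000000019", "EXAMPLE PTY LTD", "REGD"),
    )
    conn.commit()
    conn.close()


def open_readonly(db_path: Path) -> sqlite3.Connection:
    """Open the database with mode=ro so any write attempt fails."""
    uri = db_path.resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    conn.row_factory = sqlite3.Row  # rows behave like mappings
    return conn


def find_company(conn: sqlite3.Connection, acn: str) -> Optional[dict]:
    """Look up one company by ACN using a parameterised query."""
    row = conn.execute(
        "SELECT acn, name, status FROM companies WHERE acn = ?", (acn,)
    ).fetchone()
    return dict(row) if row is not None else None
```

Opening with `mode=ro` is a cheap safety net: if a bug ever tries to write to the shipped file, SQLite raises an error instead of silently changing the data.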

Scale-to-zero on Cloud Run. Early-stage APIs get very little traffic. Paying for an always-on server while waiting for your first subscriber is wasteful. With scale-to-zero, Cloud Run bills nothing for compute when nobody's calling.

RapidAPI for distribution. Yes, they take 25%. But they handle billing, API key management, rate limiting, usage analytics, and put you in front of 200K+ developers. Building all of that yourself takes longer than building the API.

Three pricing tiers. Free tier is the sales funnel — developers test before they buy. Pro is the actual product — enriched data and batch operations. Ultra is for power users who need bulk filtering and monitoring. The free tier converts to paid when someone moves from "evaluating" to "integrating into production."

Decisions I'm still thinking about

Baking the database into the Docker image means every data refresh requires a full rebuild and redeploy. At weekly refresh cadence this is fine. If I ever need daily refreshes, I'll move the DB to a Cloud Storage mount.

No server-side rate limiting. RapidAPI handles per-user rate limits. I added a concurrency semaphore on ABR calls (max 5 concurrent) to protect the government API quota, but there's no per-IP limiting on the backend itself. If I add a direct billing channel later, I'll need to add this.
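The concurrency cap on ABR calls can be sketched with `asyncio.Semaphore`. This is an illustration of the idea rather than the actual OzRegAPI code: `ConcurrencyLimiter`, `fake_abr_lookup`, and `lookup_many` are hypothetical names, and the fake lookup just sleeps instead of calling the real SOAP service (a synchronous SOAP client would need to run in a thread or use an async transport).

```python
import asyncio
from typing import Awaitable, Callable, List

MAX_CONCURRENT_ABR_CALLS = 5  # the cap mentioned above


class ConcurrencyLimiter:
    """Allow at most `limit` wrapped calls to run at the same time."""

    def __init__(self, limit: int) -> None:
        self._semaphore = asyncio.Semaphore(limit)
        self.in_flight = 0
        self.peak = 0  # highest concurrency observed, useful for testing

    async def run(self, make_call: Callable[[], Awaitable[dict]]) -> dict:
        async with self._semaphore:  # waits here once the cap is reached
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            try:
                return await make_call()
            finally:
                self.in_flight -= 1


async def fake_abr_lookup(abn: str) -> dict:
    """Stand-in for an upstream call; sleeps briefly instead of doing I/O."""
    await asyncio.sleep(0.01)
    return {"abn": abn}


async def lookup_many(abns: List[str], limiter: ConcurrencyLimiter) -> List[dict]:
    """Start every lookup at once; the limiter decides how many actually run."""
    calls = (limiter.run(lambda a=abn: fake_abr_lookup(a)) for abn in abns)
    return await asyncio.gather(*calls)
```

Note that a semaphore only bounds how many upstream calls are in flight at once. It does not limit requests per caller, which is why per-IP limiting would still be a separate piece of work.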

What I learned

The government data is messier than you'd expect. The ASIC CSV is tab-delimited (not comma), uses different column names than you'd guess, and has multiple rows per company (current name + historical names). The ABR SOAP responses nest entity names inside a _value_1 array and use different field structures for companies vs sole traders. I spent more time parsing quirky government data than writing API endpoints.
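As a rough illustration of the ASIC quirks, the sketch below reads a tab-delimited file and folds the multiple rows per company into one record with a current name and a list of former names. The column headers and sample rows are assumptions made up for this example; check the real file's header row before relying on them.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, TextIO

# Fake sample data with hypothetical column names, tab-delimited like the ASIC file.
SAMPLE_TSV = (
    "ACN\tCompany Name\tCurrent Name Indicator\n"
    "000000019\tEXAMPLE HOLDINGS PTY LTD\tY\n"
    "000000019\tEXAMPLE TRADING PTY LTD\t\n"
    "000000027\tSAMPLE WIDGETS PTY LTD\tY\n"
)


def group_company_names(tsv_file: TextIO) -> Dict[str, dict]:
    """Collapse one-row-per-name input into one record per ACN."""
    reader = csv.DictReader(tsv_file, delimiter="\t")  # tabs, not commas
    companies: Dict[str, dict] = defaultdict(
        lambda: {"current_name": None, "former_names": []}
    )
    for row in reader:
        record = companies[row["ACN"]]
        if row["Current Name Indicator"] == "Y":
            record["current_name"] = row["Company Name"]
        else:
            record["former_names"].append(row["Company Name"])
    return dict(companies)


grouped = group_company_names(io.StringIO(SAMPLE_TSV))
```

Grouping at load time means the API can return name history from a single lookup instead of scanning duplicate rows per request.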

Stress testing found real bugs. I ran 175 edge case tests (SQL injection, XSS, unicode, boundary values) and caught that sole trader ABNs returned empty names — the ABR stores individual names in legalName fields instead of mainName. Would have been a bad look if a customer found that.
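The fix for the sole trader bug boils down to a fallback: use the organisation name when there is one, otherwise build a name from the individual's legal name. The sketch below shows that logic on plain dictionaries. The field names (`organisationName`, `givenName`, `otherGivenName`, `familyName`) and the flat shape are simplifying assumptions; real SOAP responses are nested differently, as described above.

```python
from typing import Any, Mapping, Optional


def display_name(entity: Mapping[str, Any]) -> Optional[str]:
    """Pick a display name, falling back to the individual's legal name.

    Hypothetical helper working on a simplified dict, not a raw ABR response.
    """
    main = entity.get("mainName") or {}
    if main.get("organisationName"):
        return main["organisationName"]

    # Sole traders: join whichever name parts are present.
    legal = entity.get("legalName") or {}
    parts = [legal.get(key) for key in ("givenName", "otherGivenName", "familyName")]
    full_name = " ".join(part.strip() for part in parts if part and part.strip())
    return full_name or None
```

A test that feeds both shapes through the same function is enough to catch a regression of the empty-name bug.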

Security auditing was worth it. I ran three parallel audits (security, code quality, spec compliance) before deploying. Found 25 issues including a timing-attack-vulnerable API key comparison, a Docker container running as root, and filter endpoints that weren't caching responses. All fixed before launch.
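For the timing-attack issue, the usual fix in Python is to compare secrets with `hmac.compare_digest` instead of `==`, so the comparison time does not depend on how many leading characters match. A minimal sketch, with a made-up function name and not necessarily the exact change made in OzRegAPI:

```python
import hmac


def api_key_matches(presented: str, expected: str) -> bool:
    """Compare two secrets in a way that resists timing attacks.

    Hypothetical helper: `presented` is whatever key or shared secret
    arrived with the request, `expected` is the value the server holds.
    """
    # Encode to bytes so non-ASCII input is handled instead of raising TypeError.
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

A plain `presented == expected` can return sooner the earlier the strings differ, which is the information leak the audit flagged.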

The listing copy matters less than you think. I rewrote the RapidAPI description three times trying to make it "sell." What actually matters is that the endpoints work, the responses are fast, and the documentation is accurate. Developers evaluate APIs by testing them, not reading marketing copy.

The numbers

  • 16 endpoints across 3 tiers
  • 149 unit tests + 175 edge case tests + 24 integration tests against live ABR
  • Load tested to 50 concurrent users with zero failures
  • p95 latency: <10ms cached, ~500-2000ms uncached (ABR SOAP overhead)
  • Infrastructure cost: $0 at current traffic (Cloud Run free tier)
  • Time to build: about a week of focused work

Try it

OzRegAPI is live on RapidAPI with a free tier: https://rapidapi.com/sachhm/api/ozregapi

Quick test:

curl -H "X-RapidAPI-Key: YOUR_KEY" \
     -H "X-RapidAPI-Host: ozregapi.p.rapidapi.com" \
     "https://ozregapi.p.rapidapi.com/abn/49004028077"

That returns BHP's full business record as clean JSON.

If you're building something with Australian business data and have feedback on what endpoints or features would be useful, I'd genuinely like to hear it.


Built with Python, FastAPI, and too many hours reading SOAP documentation.
