Omar Eldeeb

Posted on • Originally published at datatooly.xyz

Read Company Hiring Signals From Public Job Board APIs (with code)

A company's open roles are the most honest document it publishes. The careers page is marketing; the job board is the budget. If you learn to read company hiring signals straight from the open requisitions, you can infer where a business is investing months before it shows up in a press release.

And the best part for developers: most of the data is sitting behind public, no-auth JSON APIs. The applicant tracking systems (ATS) that power those careers pages — Greenhouse, Lever, Ashby, SmartRecruiters — expose job boards as plain endpoints. You can fetch them, parse them, and classify the role mix yourself.

This article shows you how to do that, with a snippet that actually runs in a browser console.

Why open roles encode strategy

Headcount is the clearest expression of intent a company has. Every requisition is a funded decision someone fought for in a planning meeting. So the mix of roles, not just the count, tells a story:

  • A wave of Account Executives and Sales Engineers → they have a product that works and are pouring fuel on go-to-market. Likely just raised, or hitting a revenue inflection.
  • A spike in backend / infra / platform engineers → scaling pains. The thing is growing faster than the architecture can handle.
  • New "AI", "ML", or "Applied Scientist" titles where there were none → a strategic bet that didn't exist last quarter.
  • Roles concentrated in a new city or country → geographic expansion. A "Country Manager, Germany" is a market-entry announcement disguised as a job post.
  • Recruiters and People Ops hiring → they expect to hire a lot soon. Recruiting hires are often a leading indicator of broader expansion.
  • First Compliance / Legal / Finance leadership → maturing toward a fundraise, audit, or exit.

This is exactly the kind of intelligence that sales teams pay for under the label "hiring intent" or "buying signals." You can derive a useful slice of it yourself.

The data source: public ATS job boards

Greenhouse runs a dedicated read-only API for board content. The shape is dead simple:

GET https://boards-api.greenhouse.io/v1/boards/{board_token}/jobs

The board_token is usually the company's slug — stripe, airbnb, etc. No API key, no OAuth, no header dance. It returns 200 OK with Content-Type: application/json and, crucially for front-end code, Access-Control-Allow-Origin: *. That wildcard CORS header means the request genuinely succeeds from a browser on any origin — you can paste the fetch below straight into DevTools and it works.

Here's the response shape (illustrative values — run it yourself for live data), so you know what you're parsing:

{
  "jobs": [
    {
      "id": 1234567,
      "title": "Account Executive, Enterprise",
      "updated_at": "2026-05-20T16:58:18-04:00",
      "location": { "name": "San Francisco, CA" },
      "absolute_url": "https://example.com/jobs/search?gh_jid=1234567"
    }
  ]
}

Each job gives you title, location.name, updated_at, and a link. That's all you need to map role mix to intent.

Fetch + classify in ~50 lines

Below is a self-contained function. It pulls a board, buckets each role into a category by keyword, and returns a sorted intent profile plus a naive "primary signal." Drop it in your browser console with any Greenhouse board token.

// Order matters: classify() returns the first matching bucket, so the
// narrower ai_ml patterns sit before the broad engineering ones.
const SIGNALS = {
  sales:         /\b(account executive|ae|sales|business development|bdr|sdr|revenue)\b/i,
  marketing:     /\b(marketing|growth|demand gen|content|brand|seo)\b/i,
  ai_ml:         /\b(machine learning|ml engineer|applied scientist|ai|research scientist|nlp)\b/i,
  engineering:   /\b(engineer(?:ing)?|developer|sre|devops|infrastructure|platform|backend|frontend)\b/i,
  product:       /\b(product manager|pm|product designer|ux|ui designer)\b/i,
  recruiting:    /\b(recruiter|talent|people ops|hr business partner)\b/i,
  finance_legal: /\b(finance|accounting|controller|legal|counsel|compliance)\b/i,
  support:       /\b(support|customer success|csm|implementation|onboarding)\b/i,
};

function classify(title) {
  for (const [label, re] of Object.entries(SIGNALS)) {
    if (re.test(title)) return label;
  }
  return "other";
}

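async function hiringSignals(boardToken) {
  // encodeURIComponent guards against tokens containing "/" or "?".
  const url = `https://boards-api.greenhouse.io/v1/boards/${encodeURIComponent(boardToken)}/jobs`;
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Board "${boardToken}" returned ${res.status}`);
  // Default to an empty list if the payload has no "jobs" key.
  const { jobs = [] } = await res.json();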

  const counts = {};
  const byCity = {};
  for (const job of jobs) {
    const cat = classify(job.title);
    counts[cat] = (counts[cat] || 0) + 1;
    const city = job.location?.name || "Unknown";
    byCity[city] = (byCity[city] || 0) + 1;
  }

  const profile = Object.entries(counts).sort((a, b) => b[1] - a[1]);
  const topCities = Object.entries(byCity).sort((a, b) => b[1] - a[1]).slice(0, 5);

  return {
    board: boardToken,
    totalRoles: jobs.length,
    roleMix: profile,
    topLocations: topCities,
    primarySignal: profile[0]?.[0],
  };
}

// Try it:
hiringSignals("stripe").then(console.log).catch(console.error);

Run it and you get something shaped like this (example numbers — boards change daily, so your run will differ):

{
  board: "stripe",
  totalRoles: 470,
  roleMix: [["engineering", 160], ["sales", 70], ["product", 40], ...],
  topLocations: [["San Francisco, CA", 64], ["New York, NY", 38], ...],
  primarySignal: "engineering"
}

Note primarySignal is just roleMix[0][0] — the highest-count category — and classify() files each title under its first matching pattern, so treat both as a rough first read, not gospel. From there, the interesting analysis isn't the snapshot — it's the delta. Save today's roleMix and diff it next week. A category that jumps from 3 to 18 roles is the signal. A new city appearing in topLocations is the signal. Absolute counts are noisy; changes are where intent lives.
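
Here is a minimal sketch of that week-over-week diff. It assumes you have stored two roleMix arrays in the [category, count] shape that hiringSignals() returns; the function name diffRoleMix and the sample snapshots are made up for illustration, and where you persist the snapshots (a file, localStorage, a database) is up to you.

```javascript
// Compare two roleMix snapshots (arrays of [category, count] pairs)
// and report every category whose count changed.
function diffRoleMix(previous, current) {
  const before = Object.fromEntries(previous);
  const after = Object.fromEntries(current);
  const categories = new Set([...Object.keys(before), ...Object.keys(after)]);

  const changes = [];
  for (const category of categories) {
    const was = before[category] || 0;
    const now = after[category] || 0;
    if (was === now) continue; // unchanged categories are not interesting
    changes.push({ category, was, now, delta: now - was, isNew: was === 0 });
  }
  // Largest absolute movement first.
  return changes.sort((a, b) => Math.abs(b.delta) - Math.abs(a.delta));
}

// Hypothetical snapshots, not real data.
const lastWeek = [["engineering", 40], ["sales", 3], ["support", 5]];
const thisWeek = [["engineering", 41], ["sales", 18], ["ai_ml", 2], ["support", 5]];
console.log(diffRoleMix(lastWeek, thisWeek));
```

With these sample snapshots, sales comes out on top (3 to 18), followed by ai_ml flagged as isNew because it crossed from zero. The same function works on topLocations, since it has the same pair shape.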

Sharpen the read

A few things to layer on once the basics work:

  • Weight by recency. Roles with a fresh updated_at reflect current priorities more than ones that have sat open for months. A simple first pass is to filter to roles updated in the last 30 days.
  • Watch for firsts. The first role in a category (first "Enterprise AE", first "Solutions Architect") often matters more than the tenth. Track which categories crossed from zero.
  • Seniority skew. A batch of "Head of" / "Director" / "VP" postings signals a layer being built out — usually ahead of an org's scaling phase.
  • Cross-reference with funding. Sales-and-marketing hiring spikes that line up with a recent raise are the strongest go-to-market-expansion tell.
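
Two of these layers are easy to sketch against the Greenhouse job shape shown earlier. The helper names recentJobs and seniorityShare, and the SENIOR pattern, are illustrative choices rather than anything the API provides, so tune them to your own data.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Keep only jobs whose updated_at falls within the last `days` days.
// `now` is injectable so the cutoff can be pinned in tests.
function recentJobs(jobs, days = 30, now = Date.now()) {
  const cutoff = now - days * DAY_MS;
  return jobs.filter((job) => {
    const updated = Date.parse(job.updated_at);
    // Jobs with a missing or unparseable date are dropped.
    return !Number.isNaN(updated) && updated >= cutoff;
  });
}

// Illustrative pattern for leadership titles.
const SENIOR = /\b(head of|director|vp|vice president|chief)\b/i;

// Share of postings with a leadership-level title, between 0 and 1.
function seniorityShare(jobs) {
  if (jobs.length === 0) return 0;
  const senior = jobs.filter((job) => SENIOR.test(job.title)).length;
  return senior / jobs.length;
}
```

Run recentJobs() on the jobs array before counting categories, and compare seniorityShare() between snapshots rather than reading it as an absolute number.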

The regexes above are deliberately simple. Real titles are messy ("Staff Software Engineer, Payments Risk Platform"), and a keyword bucket will misfile some. For anything beyond exploration, an LLM classifier that labels each title against your taxonomy is far more robust than brittle patterns. Still, start with regex to understand your data.
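
One cheap way to see where keyword buckets struggle is to list the titles that match no bucket or more than one. The sketch below does that with a trimmed-down, illustrative taxonomy (not the full table from earlier); classifyAll and ambiguousTitles are hypothetical helper names.

```javascript
// Return every category whose pattern matches, instead of only the first.
function classifyAll(title, patterns) {
  return Object.entries(patterns)
    .filter(([, re]) => re.test(title))
    .map(([label]) => label);
}

// Titles matching zero or several buckets are the ones worth a manual look.
function ambiguousTitles(titles, patterns) {
  return titles
    .map((title) => ({ title, labels: classifyAll(title, patterns) }))
    .filter(({ labels }) => labels.length !== 1);
}

// A small illustrative taxonomy.
const PATTERNS = {
  sales: /\b(account executive|sales)\b/i,
  engineering: /\b(engineer|platform|backend)\b/i,
  ai_ml: /\b(machine learning|applied scientist)\b/i,
};

const sample = [
  "Staff Software Engineer, Payments Risk Platform",
  "Sales Engineer",
  "Machine Learning Engineer",
  "Office Manager",
  "Account Executive, Enterprise",
];
console.log(ambiguousTitles(sample, PATTERNS));
```

On this sample, "Sales Engineer" and "Machine Learning Engineer" each land in two buckets and "Office Manager" lands in none. Those are the titles to review by hand or pass to a stronger classifier.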

Want to eyeball one company right now?

If you just want to point at a single Greenhouse board and see the role mix without writing code, there's a free browser tool that runs the same idea live: datatooly.xyz/company-hiring-signals (disclosure: I built it, and the Apify actor mentioned later). It fetches the public board client-side (thanks to that wildcard CORS header) and renders the intent breakdown. Good for a quick check on one prospect.

The other ATS platforms (and the hard one)

Greenhouse is the easiest, but it's not alone. Several major ATS platforms expose public job boards:

  • Lever: https://api.lever.co/v0/postings/{company}?mode=json returns a plain JSON array with text, categories.team, categories.location, and hostedUrl.
  • Ashby: a public posting API keyed by job-board name, which also sends Access-Control-Allow-Origin: *.
  • SmartRecruiters: a public postings endpoint per company.

Endpoint shapes and headers drift over time, so test each ATS before depending on it in production.

Each has a slightly different response shape, so you'd normalize them into one schema (title, location, team, updated date, url) before classifying.
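
A rough sketch of that normalization for two of the sources follows. The Greenhouse fields come from the response shape shown earlier, and the Lever fields are the ones listed above, with one assumption: that each Lever posting also carries a createdAt value as a millisecond epoch timestamp. Check that against a live response before relying on it. The function names and the output field names are illustrative.

```javascript
// Map a Greenhouse job onto a common record.
function fromGreenhouse(job) {
  return {
    source: "greenhouse",
    title: job.title,
    location: job.location?.name ?? null,
    team: null, // not part of the response shape shown earlier
    updatedAt: job.updated_at ?? null,
    url: job.absolute_url ?? null,
  };
}

// Map a Lever posting onto the same record.
// Assumption: `createdAt` is a millisecond epoch timestamp.
function fromLever(posting) {
  return {
    source: "lever",
    title: posting.text,
    location: posting.categories?.location ?? null,
    team: posting.categories?.team ?? null,
    updatedAt: typeof posting.createdAt === "number"
      ? new Date(posting.createdAt).toISOString()
      : null,
    url: posting.hostedUrl ?? null,
  };
}
```

Once every source is mapped to the same record, the classifier only ever has to read the title field, and a new ATS means one new mapper instead of changes throughout the pipeline.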

Then there's Workday, which is the genuinely hard one. Workday tenants serve postings through a per-tenant CXS endpoint that you have to discover, and pagination is done via POST with an offset body rather than a clean GET — no friendly wildcard CORS, no single base URL. A meaningful share of large enterprises run on Workday, so any "company hiring signals" pipeline that ignores it has a blind spot exactly where the biggest budgets are.
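
The pagination part can be sketched generically. The loop below keeps requesting pages by offset until it has everything; the page fetcher is passed in, so the loop itself does not depend on any particular endpoint. The makePostPager helper is hypothetical: its URL, request body fields, and response keys (items, total) are assumptions for illustration, not Workday's actual contract, so inspect your target tenant's network traffic for the real shape.

```javascript
// Generic offset pagination: call fetchPage(offset, limit) until all
// items are collected. fetchPage must resolve to { items, total }.
async function fetchAllByOffset(fetchPage, limit = 20, maxPages = 500) {
  const all = [];
  for (let page = 0; page < maxPages; page++) {
    const { items, total } = await fetchPage(all.length, limit);
    all.push(...items);
    // Stop on an empty page or once the reported total is reached.
    if (items.length === 0 || all.length >= total) break;
  }
  return all;
}

// Hypothetical page fetcher for a POST-with-offset endpoint.
// The body fields and response keys here are assumptions.
function makePostPager(url) {
  return async (offset, limit) => {
    const res = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ offset, limit }),
    });
    if (!res.ok) throw new Error(`Page at offset ${offset} returned ${res.status}`);
    const data = await res.json();
    return { items: data.items ?? [], total: data.total ?? 0 };
  };
}
```

Because there is no wildcard CORS header, a loop like this belongs on a server or in a Node 18+ script (where fetch is built in), not in the browser console. The maxPages cap is a safety net in case a response reports a total it never reaches.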

Doing this at scale

Reading one board by hand is a five-minute task. Tracking 25,000+ companies, normalizing four-plus ATS schemas (including the Workday pagination dance), running an AI classifier over messy titles, and diffing week-over-week to fire alerts when a category spikes — that's a data pipeline, not a console snippet.

If you'd rather not build and maintain all of that, the ATS Hiring-Intent Scraper on Apify does the heavy lifting: it pulls across the major ATS platforms, classifies role mix into intent categories, and is built for running on a schedule so you catch the changes rather than just snapshots. Useful if hiring signals feed a sales or research workflow and you need them reliably, not as a one-off.

But for learning the concept and prototyping on a handful of targets, the fetch-and-classify snippet above is all you need — and it's a genuinely fun afternoon of code.


One honest note: these endpoints are public because companies want their jobs found, but they're meant for candidates, not bulk harvesting. Keep request rates polite, cache aggressively, respect each platform's Terms of Service and robots.txt, and don't republish personal data. Read the strategy, not the people.
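
If you loop over several boards, a small wrapper can handle the "polite and cached" part. This is a minimal in-memory sketch; politeLoader and its delayMs and ttlMs options are hypothetical names, the defaults are arbitrary, and it assumes you await each call in sequence rather than firing requests concurrently.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Wrap a loader so each key is fetched at most once per `ttlMs`, and
// real fetches are spaced at least `delayMs` apart.
// Intended for sequential use: await each call before making the next.
function politeLoader(load, { delayMs = 1000, ttlMs = 60 * 60 * 1000 } = {}) {
  const cache = new Map(); // key -> { value, fetchedAt }
  let lastFetch = 0;

  return async function get(key) {
    const hit = cache.get(key);
    if (hit && Date.now() - hit.fetchedAt < ttlMs) return hit.value;

    // Wait out whatever remains of the minimum gap since the last fetch.
    const wait = lastFetch + delayMs - Date.now();
    if (wait > 0) await sleep(wait);
    lastFetch = Date.now();

    const value = await load(key);
    cache.set(key, { value, fetchedAt: Date.now() });
    return value;
  };
}
```

Wrapping the earlier function looks like const getSignals = politeLoader(hiringSignals); after that, repeated calls for the same board token within the TTL are served from the cache without touching the network.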
