DEV Community

Freshactors

How to scrape Workable jobs data with Python (no API key needed)

Tens of thousands of companies run their careers page on Workable, and every one of those boards is backed by a public JSON endpoint — no API key, no login, no headless browser required. In this tutorial we'll pull a company's full job board (titles, locations, employment type, remote flags, and the complete description for every role) as clean structured JSON, in a few lines of Python.

Why not just fetch() the board yourself?

You can. Workable's public widget API lives at:

https://apply.workable.com/api/v1/widget/accounts/{shortcode}?details=true

Pass details=true and a single GET returns the entire board with full descriptions inline — which is genuinely convenient. The catch is the usual one for an undocumented endpoint: the moment Workable reshapes a field or the response, your parser silently returns nothing, and you find out when your dashboard goes blank. And if you're tracking many companies, you're now maintaining that parser forever.
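For comparison, here is a minimal sketch of the do-it-yourself route, using only the standard library. The URL is the one quoted above; the top-level "jobs" key is an assumption about the response shape (the endpoint is undocumented), and the helper names are made up for illustration. The point of extract_jobs is to fail loudly instead of silently returning nothing when the response changes.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

WIDGET_URL = "https://apply.workable.com/api/v1/widget/accounts/{shortcode}?details=true"


def board_url(shortcode: str) -> str:
    """Build the public widget URL for one company shortcode."""
    return WIDGET_URL.format(shortcode=quote(shortcode.strip(), safe=""))


def fetch_board(shortcode: str, timeout: float = 30.0) -> dict:
    """GET the board and decode the JSON body (performs a network request)."""
    with urlopen(board_url(shortcode), timeout=timeout) as response:
        return json.load(response)


def extract_jobs(board: dict) -> list:
    """Return the postings list, raising if the expected key has disappeared.

    ASSUMPTION: postings live under a top-level "jobs" key. Adjust this
    after inspecting a real response.
    """
    if "jobs" not in board:
        raise KeyError(f'no "jobs" key in response; got keys {sorted(board)}')
    return board["jobs"]
```

Calling extract_jobs(fetch_board("pearltalent")) would exercise the whole path. Everything past that point (field names, null handling, per-company error handling) is the parser you would then own.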

A cleaner path: hand a list of company shortcodes to an Apify actor that already normalizes the board into a stable schema (the same one the Greenhouse & Lever scraper emits) and just consume the output. Here's how with the Workable Jobs Scraper.

Step 1 — Install the Apify client

pip install apify-client

Grab your Apify API token from Settings → Integrations in the Apify Console, and read it from an environment variable so it never lands in source control:

export APIFY_TOKEN="apify_api_xxx"

Step 2 — Run the actor with a list of companies

The actor takes a companies array of Workable shortcodes (pearltalent) or board URLs (https://apply.workable.com/pearltalent/). The shortcode is the slug in the company's apply.workable.com careers URL.
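The actor accepts both forms, so this step is optional. But if your own list mixes shortcodes and URLs and you want to drop duplicates before paying for a board twice, a small normalizer along these lines might help. It's a sketch: to_shortcode and dedupe_companies are hypothetical helpers, not part of the actor or the Apify client, and they only understand board URLs of the form shown above (not individual job links).

```python
from urllib.parse import urlparse


def to_shortcode(entry: str) -> str:
    """Reduce a shortcode or an apply.workable.com board URL to the bare shortcode."""
    entry = entry.strip()
    if "://" not in entry:
        # Already a shortcode; tolerate stray slashes.
        return entry.strip("/")
    # For a board URL the shortcode is the first path segment.
    path_parts = [part for part in urlparse(entry).path.split("/") if part]
    if not path_parts:
        raise ValueError(f"no shortcode found in {entry!r}")
    return path_parts[0]


def dedupe_companies(entries):
    """Normalize every entry and drop repeats, keeping first-seen order."""
    return list(dict.fromkeys(to_shortcode(entry) for entry in entries))
```

With the list cleaned up, the run itself looks like this: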

import os
from apify_client import ApifyClient

client = ApifyClient(os.environ["APIFY_TOKEN"])

run_input = {
    "companies": ["pearltalent", "https://apply.workable.com/walletconnect/"],
    "includeDescription": True,
    "maxJobsPerCompany": 500,
}

# Blocks until the run finishes, then returns run metadata.
run = client.actor("freshactors/workable-jobs-scraper").call(run_input=run_input)

print("Run status:", run["status"])
print("Dataset id:", run["defaultDatasetId"])

.call() is synchronous — it waits for the run to complete and hands you the run object, including the defaultDatasetId where results land.
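A finished run is not necessarily a successful one, so it can be worth checking the status before reading the dataset. The guard below is a sketch: require_success is a hypothetical helper, and it assumes the run object is a plain dict with the "status" and "defaultDatasetId" keys used above, with "SUCCEEDED" as the status of a clean run.

```python
def require_success(run):
    """Return the dataset id of a finished run, or raise if it did not succeed."""
    if run is None:
        # Defensive: treat a missing run object as a failure.
        raise RuntimeError("the actor run returned no run object")
    status = run.get("status")
    if status != "SUCCEEDED":
        raise RuntimeError(f"actor run ended with status {status!r}")
    return run["defaultDatasetId"]
```

You could then write dataset_id = require_success(run) and pass dataset_id to client.dataset(...) in the steps that follow.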

Step 3 — Read the normalized output

Every posting comes back in the same shape. Fields Workable doesn't provide are null, never missing keys, so your downstream code can rely on the schema:

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(
        f'{item["company"]:<14} '
        f'{item["title"]}  '
        f'({item.get("workplaceType") or "n/a"}, {item.get("location") or "n/a"})'
    )

A single record looks like this:

{
  "_type": "job",
  "_schemaVersion": "1.0",
  "_source": "workable",
  "company": "walletconnect",
  "jobId": "2EB3D29E9B",
  "title": "Product Manager - Merchant Experience",
  "department": "Product",
  "team": null,
  "location": "United Kingdom",
  "allLocations": ["United Kingdom", "Portugal"],
  "workplaceType": "remote",
  "commitment": null,
  "country": "United Kingdom",
  "url": "https://apply.workable.com/j/2EB3D29E9B",
  "applyUrl": "https://apply.workable.com/j/2EB3D29E9B/apply",
  "postedAt": "2026-04-27",
  "updatedAt": null,
  "descriptionText": "About the role...",
  "_scrapedAt": "2026-06-02T16:40:46.908Z"
}

This is the same record shape the Greenhouse & Lever scraper emits — so if you're already pulling ATS data, Workable companies slot into the exact same pipeline with no special-casing.
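If your pipeline depends on that shape, a cheap guard at the ingestion boundary can confirm it. The sketch below takes its key list from the sample record above, not from a published schema, so treat EXPECTED_KEYS as an assumption. missing_keys is a hypothetical helper, and you may want to drop "descriptionText" from the set for runs with includeDescription turned off.

```python
# Keys copied from the sample record shown above (an assumption, not an
# official schema definition).
EXPECTED_KEYS = frozenset({
    "_type", "_schemaVersion", "_source", "company", "jobId", "title",
    "department", "team", "location", "allLocations", "workplaceType",
    "commitment", "country", "url", "applyUrl", "postedAt", "updatedAt",
    "descriptionText", "_scrapedAt",
})


def missing_keys(record, expected=EXPECTED_KEYS):
    """Return the expected keys absent from a record, sorted for stable output.

    A key whose value is None counts as present: null is how the feed marks
    data that Workable does not provide.
    """
    return sorted(set(expected).difference(record))
```

An empty list means the record has every expected key. Anything else is a signal to check the _schemaVersion field before trusting the data.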

Step 4 — A practical filter (remote roles, posted recently)

Say you only want remote roles posted in the last two weeks. With one schema, the filter is trivial:

from datetime import datetime, timedelta, timezone

cutoff = (datetime.now(timezone.utc) - timedelta(days=14)).date().isoformat()
remote_recent = []

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item.get("workplaceType") != "remote":
        continue
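    # ISO dates (YYYY-MM-DD) sort lexicographically, so comparing strings is safe;
    # a null postedAt becomes "" and is excluded.
    if (item.get("postedAt") or "") >= cutoff: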
        remote_recent.append(item)

print(f"{len(remote_recent)} remote roles posted in the last 14 days")

Step 5 — Lighter, faster runs

Two knobs control cost and speed:

  • includeDescription: false drops the full descriptionText from each record — handy when you only need titles, departments, and locations for a hiring-signal dashboard.
  • maxJobsPerCompany caps postings per company (1–5000), so a 1,000-role employer doesn't dominate your run.

run_input = {
    "companies": ["pearltalent", "gbg", "walletconnect"],
    "includeDescription": False,   # metadata only
    "maxJobsPerCompany": 200,
}

Prefer Node.js?

Same actor, same input, the JavaScript client:

npm install apify-client

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

const run = await client.actor('freshactors/workable-jobs-scraper').call({
    companies: ['pearltalent', 'https://apply.workable.com/walletconnect/'],
    includeDescription: true,
    maxJobsPerCompany: 500,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const job of items) {
    console.log(`${job.company.padEnd(14)} ${job.title} (${job.workplaceType ?? 'n/a'})`);
}

What about cost?

It's pay-per-event: $0.02 per company board fetched and $0.0005 per job posting returned. So 5 companies returning 100 postings total is 5 × $0.02 + 100 × $0.0005 = $0.15. No subscription — you pay for what you pull, and one request returns the whole board including descriptions, so runs stay cheap.
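To budget a run before launching it, the same arithmetic fits in a small estimator. This is a sketch using the two prices quoted above. estimate_cost is a hypothetical helper, and Decimal is used so the cents add up exactly instead of picking up floating-point noise.

```python
from decimal import Decimal

# Prices as quoted in this article.
PRICE_PER_BOARD = Decimal("0.02")
PRICE_PER_POSTING = Decimal("0.0005")


def estimate_cost(boards: int, postings: int) -> Decimal:
    """Estimated run cost in USD for a number of boards and returned postings."""
    return boards * PRICE_PER_BOARD + postings * PRICE_PER_POSTING


def worst_case_cost(boards: int, max_jobs_per_company: int) -> Decimal:
    """Upper bound when every board returns the full maxJobsPerCompany cap."""
    return estimate_cost(boards, boards * max_jobs_per_company)
```

estimate_cost(5, 100) reproduces the $0.15 figure above, and worst_case_cost(3, 200) gives an upper bound of $0.36 for the three-company, 200-posting cap from Step 5.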

Why use the actor instead of hitting the board directly?

You can curl that public endpoint yourself. The reason to use the actor is maintenance: it normalizes the board into a stable schema (shared with Greenhouse + Lever), isolates per-company failures so one dead board never kills the run, and is monitored by a daily canary so a silent change to Workable's response doesn't quietly empty your pipeline. That operational reliability is the whole point.

If you want a clean Workable jobs feed without owning a parser, the actor is here: Workable Jobs Scraper on Apify. Run it on a schedule, point it at your target companies, and consume one normalized JSON feed.

Happy scraping.
