Freshactors

Posted on Jun 2

How to scrape Workable jobs data with Python (no API key needed)

#webscraping #api #python #tutorial

Tens of thousands of companies run their careers page on Workable, and every one of those boards is backed by a public JSON endpoint — no API key, no login, no headless browser required. In this tutorial we'll pull a company's full job board (titles, locations, employment type, remote flags, and the complete description for every role) as clean structured JSON, in a few lines of Python.

Why not just `fetch()` the board yourself?

You can. Workable's public widget API lives at:

https://apply.workable.com/api/v1/widget/accounts/{shortcode}?details=true

Pass details=true and a single GET returns the entire board with full descriptions inline — which is genuinely convenient. The catch is the usual one for an undocumented endpoint: the moment Workable reshapes a field or the response, your parser silently returns nothing, and you find out when your dashboard goes blank. And if you're tracking many companies, you're now maintaining that parser forever.
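For comparison, here is a minimal sketch of the do-it-yourself route using only the standard library. The endpoint is the one quoted above; the top-level "jobs" key is an assumption about an undocumented response, and the helper names are made up for this example. The point of extract_jobs is to fail loudly instead of silently returning nothing when the shape changes.

```python
import json
import urllib.request

# Endpoint quoted in the article; {shortcode} is the company's Workable slug.
WIDGET_URL = "https://apply.workable.com/api/v1/widget/accounts/{shortcode}?details=true"


def widget_url(shortcode: str) -> str:
    """Build the public widget URL for one company."""
    return WIDGET_URL.format(shortcode=shortcode)


def fetch_board(shortcode: str, timeout: float = 30.0) -> dict:
    """GET the board and decode the JSON body."""
    request = urllib.request.Request(
        widget_url(shortcode), headers={"Accept": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def extract_jobs(payload: dict) -> list:
    """Return the job list, raising if the assumed shape is not there.

    ASSUMPTION: the response carries a top-level "jobs" list. The endpoint
    is undocumented, so verify this against a real response first.
    """
    jobs = payload.get("jobs")
    if not isinstance(jobs, list):
        raise ValueError("unexpected response shape: no 'jobs' list found")
    return jobs


# Usage (performs a network request):
#   jobs = extract_jobs(fetch_board("pearltalent"))
```

Even with the guard, this only tells you that the shape changed. Mapping the new shape back onto your own fields is still on you.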

A cleaner path: hand a list of company shortcodes to an actor that already normalizes the board into a stable schema — the same schema as Greenhouse and Lever — and just consume the output. Here's how with the Workable Jobs Scraper.

Step 1 — Install the Apify client

pip install apify-client

Grab your Apify API token from Settings → Integrations in the Apify Console, and read it from an environment variable so it never lands in source control:

export APIFY_TOKEN="apify_api_xxx"

Step 2 — Run the actor with a list of companies

The actor takes a companies array of Workable shortcodes (pearltalent) or board URLs (https://apply.workable.com/pearltalent/). The shortcode is the slug in the company's apply.workable.com careers URL.
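The actor accepts either form, so no conversion is needed before a run. If you keep your own company list and want a single canonical key per company, a small helper along these lines can reduce a board URL to its shortcode. This is a sketch: to_shortcode is a hypothetical name, and it assumes board URLs always follow the apply.workable.com/{shortcode}/ pattern described above.

```python
from urllib.parse import urlparse


def to_shortcode(company: str) -> str:
    """Reduce a shortcode or board URL to the bare shortcode."""
    value = company.strip()
    if "://" not in value:
        # Already a bare shortcode such as "pearltalent".
        return value.strip("/")

    parsed = urlparse(value)
    segments = [part for part in parsed.path.split("/") if part]
    if parsed.netloc != "apply.workable.com" or not segments:
        raise ValueError(f"not a Workable board URL: {company!r}")
    if segments[0] == "j":
        # /j/<jobId> links point at a single posting, not a company board.
        raise ValueError(f"job URL, not a board URL: {company!r}")
    return segments[0]
```

For example, to_shortcode("https://apply.workable.com/walletconnect/") and to_shortcode("walletconnect") both yield "walletconnect", which makes de-duplicating a mixed list straightforward.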

import os
from apify_client import ApifyClient

client = ApifyClient(os.environ["APIFY_TOKEN"])

run_input = {
    "companies": ["pearltalent", "https://apply.workable.com/walletconnect/"],
    "includeDescription": True,
    "maxJobsPerCompany": 500,
}

# Blocks until the run finishes, then returns run metadata.
run = client.actor("freshactors/workable-jobs-scraper").call(run_input=run_input)

print("Run status:", run["status"])
print("Dataset id:", run["defaultDatasetId"])

.call() is synchronous — it waits for the run to complete and hands you the run object, including the defaultDatasetId where results land.
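Because a run can finish without succeeding, it is worth checking the status before reading the dataset. The sketch below assumes the run object is a plain dict as in the snippet above, and that a successful run reports the status "SUCCEEDED" (Apify's other terminal statuses include "FAILED", "TIMED-OUT", and "ABORTED"). require_succeeded is a hypothetical helper name, not part of the client.

```python
from typing import Optional


def require_succeeded(run: Optional[dict]) -> str:
    """Return the dataset id of a successful run, or raise with the status."""
    if run is None:
        # Some client versions type the return value of .call() as optional.
        raise RuntimeError("actor run returned no run object")

    status = run.get("status")
    if status != "SUCCEEDED":
        raise RuntimeError(f"actor run ended with status {status!r}")

    return run["defaultDatasetId"]


# Usage with the run from the snippet above:
#   dataset_id = require_succeeded(run)
```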

Step 3 — Read the normalized output

Every posting comes back in the same shape. Fields Workable doesn't provide come back as null (None once parsed in Python) rather than as missing keys, so your downstream code can rely on the schema:

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(
        f'{item["company"]:<14} '
        f'{item["title"]}  '
        f'({item.get("workplaceType") or "n/a"}, {item.get("location") or "n/a"})'
    )

A single record looks like this:

{
  "_type": "job",
  "_schemaVersion": "1.0",
  "_source": "workable",
  "company": "walletconnect",
  "jobId": "2EB3D29E9B",
  "title": "Product Manager - Merchant Experience",
  "department": "Product",
  "team": null,
  "location": "United Kingdom",
  "allLocations": ["United Kingdom", "Portugal"],
  "workplaceType": "remote",
  "commitment": null,
  "country": "United Kingdom",
  "url": "https://apply.workable.com/j/2EB3D29E9B",
  "applyUrl": "https://apply.workable.com/j/2EB3D29E9B/apply",
  "postedAt": "2026-04-27",
  "updatedAt": null,
  "descriptionText": "About the role...",
  "_scrapedAt": "2026-06-02T16:40:46.908Z"
}

This is the same record shape the Greenhouse & Lever scraper emits — so if you're already pulling ATS data, Workable companies slot into the exact same pipeline with no special-casing.
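If you want your pipeline to notice a schema change rather than trust it blindly, a cheap check is to compare each record's keys against the set you expect. The key list below is copied from the sample record above; treating it as the complete schema is an assumption, and missing_keys is a hypothetical helper. Note that a run with includeDescription set to false may legitimately omit descriptionText.

```python
# Keys taken from the sample record shown in the article.
EXPECTED_KEYS = frozenset({
    "_type", "_schemaVersion", "_source", "company", "jobId", "title",
    "department", "team", "location", "allLocations", "workplaceType",
    "commitment", "country", "url", "applyUrl", "postedAt", "updatedAt",
    "descriptionText", "_scrapedAt",
})


def missing_keys(item: dict, expected: frozenset = EXPECTED_KEYS) -> list:
    """Return the expected keys that are absent from a record, sorted.

    A key whose value is None counts as present: the schema promises
    null values, not missing keys.
    """
    return sorted(expected - item.keys())


# Usage with the dataset loop from Step 3:
#   for item in client.dataset(run["defaultDatasetId"]).iterate_items():
#       absent = missing_keys(item)
#       if absent:
#           print("schema drift in", item.get("jobId"), absent)
```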

Step 4 — A practical filter (remote roles, posted recently)

Say you only want remote roles posted in the last two weeks. With one schema, the filter is trivial:

from datetime import datetime, timedelta, timezone

cutoff = (datetime.now(timezone.utc) - timedelta(days=14)).date().isoformat()
remote_recent = []

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if item.get("workplaceType") != "remote":
        continue
    if (item.get("postedAt") or "") >= cutoff:
        remote_recent.append(item)

print(f"{len(remote_recent)} remote roles posted in the last 14 days")
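The loop above compares ISO date strings directly, which works as long as postedAt is always a YYYY-MM-DD string. A slightly more defensive variant, sketched below, parses the date and skips records it cannot read. is_recent_remote is a hypothetical helper, and passing today explicitly is a design choice that keeps the filter easy to test.

```python
from datetime import date, timedelta


def is_recent_remote(item: dict, today: date, days: int = 14) -> bool:
    """True for remote roles posted within the last `days` days."""
    if item.get("workplaceType") != "remote":
        return False

    raw = item.get("postedAt")
    if not isinstance(raw, str) or not raw:
        return False  # null or unexpected type: treat as not recent

    try:
        # Keep only the date part in case a full timestamp ever appears.
        posted = date.fromisoformat(raw[:10])
    except ValueError:
        return False  # unparseable date: skip rather than crash

    return posted >= today - timedelta(days=days)


# Usage with the dataset loop from this step:
#   from datetime import datetime, timezone
#   today = datetime.now(timezone.utc).date()
#   remote_recent = [i for i in items if is_recent_remote(i, today)]
```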

Step 5 — Lighter, faster runs

Two knobs control cost and speed:

includeDescription: false drops the full descriptionText from each record — handy when you only need titles, departments, and locations for a hiring-signal dashboard.
maxJobsPerCompany caps postings per company (1–5000), so a 1,000-role employer doesn't dominate your run.

run_input = {
    "companies": ["pearltalent", "gbg", "walletconnect"],
    "includeDescription": False,   # metadata only
    "maxJobsPerCompany": 200,
}

Prefer Node.js?

Same actor, same input, the JavaScript client:

npm install apify-client

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

const run = await client.actor('freshactors/workable-jobs-scraper').call({
    companies: ['pearltalent', 'https://apply.workable.com/walletconnect/'],
    includeDescription: true,
    maxJobsPerCompany: 500,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const job of items) {
    console.log(`${job.company} — ${job.title} (${job.workplaceType ?? 'n/a'})`);
}

What about cost?

It's pay-per-event: $0.02 per company board fetched and $0.0005 per job posting returned. So 5 companies returning 100 postings total is 5 × $0.02 + 100 × $0.0005 = $0.15. No subscription — you pay for what you pull, and one request returns the whole board including descriptions, so runs stay cheap.
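The arithmetic above is easy to wrap in a helper for budgeting a run in advance. This sketch uses the two prices quoted in this section and Decimal to avoid floating-point rounding; the function names are made up for the example, and the worst case assumes every company hits the maxJobsPerCompany cap.

```python
from decimal import Decimal

# Prices quoted in the article (USD).
PRICE_PER_BOARD = Decimal("0.02")
PRICE_PER_POSTING = Decimal("0.0005")


def estimate_cost(boards: int, postings: int) -> Decimal:
    """Cost of a run that fetched `boards` boards and returned `postings` jobs."""
    return boards * PRICE_PER_BOARD + postings * PRICE_PER_POSTING


def worst_case_cost(boards: int, max_jobs_per_company: int) -> Decimal:
    """Upper bound, assuming every board returns the full per-company cap."""
    return estimate_cost(boards, boards * max_jobs_per_company)


# The article's example: 5 boards, 100 postings in total.
assert estimate_cost(5, 100) == Decimal("0.15")
```

With the Step 5 input (3 companies, cap of 200), worst_case_cost(3, 200) comes to $0.36.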

Why use the actor instead of hitting the board directly?

You can curl that public endpoint yourself. The reason to use the actor is maintenance: it normalizes the board into a stable schema (shared with Greenhouse + Lever), isolates per-company failures so one dead board never kills the run, and is monitored by a daily canary so a silent change to Workable's response doesn't quietly empty your pipeline. That operational reliability is the whole point.

If you want a clean Workable jobs feed without owning a parser, the actor is here: Workable Jobs Scraper on Apify. Run it on a schedule, point it at your target companies, and consume one normalized JSON feed.

Happy scraping.

DEV Community
