Freshactors

Posted on Jun 2

How to scrape Recruitee jobs data (with salary) in Python — no API key

#webscraping #python #api #tutorial

Recruitee powers the careers sites of thousands of companies (bunq, Channable, Vandebron, and many more), and every board is backed by a public Offers API — no API key, no login, no headless browser. The best part: a single request returns every offer with its full description, requirements, and — unusually for an ATS — salary. In this tutorial we'll pull a company's whole job board as clean structured JSON in a few lines of Python.

The endpoint

Recruitee exposes one endpoint per company subdomain:

https://{company}.recruitee.com/api/offers/

One GET returns the entire board — metadata, HTML description, requirements, and a salary object — in a single response. No detail call, no pagination. That's it.
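For reference, a direct call could look roughly like the sketch below, using only the standard library. The helper names are hypothetical, and the top-level "offers" key and the "title" field are assumptions about the response shape, so check them against a live response before relying on them.

```python
import json
from urllib.request import Request, urlopen


def offers_url(company: str) -> str:
    """Build the Offers API URL for a company subdomain (for example "bunq")."""
    return f"https://{company}.recruitee.com/api/offers/"


def extract_offers(payload: dict) -> list:
    """Return the list of offers from a decoded response.

    Assumption: the offers sit under a top-level "offers" key.
    """
    return payload.get("offers") or []


def fetch_offers(company: str, timeout: float = 30.0) -> list:
    """GET the whole board in one request and return its offers."""
    request = Request(offers_url(company), headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return extract_offers(json.load(response))


# Usage (performs a network request):
# for offer in fetch_offers("bunq"):
#     print(offer.get("title"))  # "title" is an assumed field name
```

This only fetches and decodes; HTML stripping, status filtering, and salary flattening would still be up to you.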

So why not just requests.get() it yourself? You can — but then you own the parser: stripping the HTML, normalizing employment-type codes like fulltime_permanent, filtering out draft/closed offers, flattening the salary object, and fixing it the day a field name changes and your pipeline goes empty. A cleaner path: hand a list of company identifiers to an actor that returns one stable schema — the same schema as Greenhouse, Lever, Workable, and SmartRecruiters. Here's how with the Recruitee Jobs Scraper.

Step 1 — Install the Apify client

pip install apify-client

Copy your Apify API token from the Console (Settings → Integrations) and export it as an environment variable so it stays out of your code:

export APIFY_TOKEN="apify_api_xxx"

Step 2 — Run the actor with a list of companies

companies accepts identifiers (bunq) or board URLs (https://bunq.recruitee.com). The identifier is the subdomain in the company's {name}.recruitee.com careers URL.
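The actor accepts both forms as-is, so no conversion is required. If you want to deduplicate a mixed list on your side, or reuse the identifier elsewhere, a small helper along these lines could reduce either form to the subdomain. The function name is hypothetical, and it assumes boards live on {name}.recruitee.com as described above.

```python
from urllib.parse import urlparse


def company_identifier(value: str) -> str:
    """Return the Recruitee subdomain for an identifier or a board URL."""
    value = value.strip()
    if "://" not in value:
        # Already a bare identifier such as "bunq".
        return value.lower()
    host = urlparse(value).hostname or ""
    suffix = ".recruitee.com"
    if not host.endswith(suffix):
        raise ValueError(f"not a recruitee.com board URL: {value!r}")
    return host[: -len(suffix)]


# Deduplicate a mixed list while keeping the first-seen order.
mixed = ["bunq", "https://channable.recruitee.com", "https://bunq.recruitee.com"]
unique = list(dict.fromkeys(company_identifier(v) for v in mixed))
```

Custom career domains (like the careers.bunq.com URLs in the output) would raise ValueError here, which is deliberate: the subdomain cannot be derived from them.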

import os
from apify_client import ApifyClient

client = ApifyClient(os.environ["APIFY_TOKEN"])

run_input = {
    "companies": ["bunq", "https://channable.recruitee.com"],
    "includeDescription": True,
    "maxJobsPerCompany": 500,
}

run = client.actor("freshactors/recruitee-jobs-scraper").call(run_input=run_input)
print("Dataset id:", run["defaultDatasetId"])

Step 3 — Read the normalized output (including salary)

Every offer comes back in the same shape. When Recruitee lacks a field, the key is still present with a null value, never omitted:

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    # salary is null (None in Python) when the offer lists no pay
    pay = item.get("salary")
    pay_str = f'{pay["min"]}–{pay["max"]} {pay["currency"]}' if pay else "n/a"
    # location can be null too, so fall back to "n/a"
    print(f'{item["company"]:<12} {item["title"]}  ({item.get("location") or "n/a"})  pay: {pay_str}')

A single record:

{
  "_type": "job",
  "_schemaVersion": "1.0",
  "_source": "recruitee",
  "company": "bunq",
  "jobId": "2620732",
  "title": "Senior Backend Engineer",
  "department": "Engineering",
  "location": "Amsterdam, Netherlands",
  "workplaceType": "hybrid",
  "commitment": "Full-time",
  "country": "NL",
  "url": "https://careers.bunq.com/o/senior-backend-engineer",
  "applyUrl": "https://careers.bunq.com/o/senior-backend-engineer/c/new",
  "postedAt": "2026-05-29T09:45:21.000Z",
  "salary": { "min": 65000, "max": 90000, "period": "year", "currency": "EUR" },
  "descriptionText": "About the role... Requirements...",
  "_scrapedAt": "2026-06-02T18:00:00.000Z"
}

This is the same record shape the Greenhouse & Lever, Workable, and SmartRecruiters scrapers emit — plus a Recruitee-only salary object — so Recruitee companies drop into the same pipeline with zero special-casing.
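As an illustration of consuming that shape, here is a minimal sketch that flattens records into CSV rows for a spreadsheet or dashboard. The column names and helper names are hypothetical choices; the record keys are the ones shown in the sample record.

```python
import csv
import io

COLUMNS = [
    "company", "title", "location",
    "salary_min", "salary_max", "salary_currency", "salary_period",
]


def to_row(item: dict) -> dict:
    """Flatten one record; a null salary becomes empty salary columns."""
    salary = item.get("salary") or {}
    return {
        "company": item.get("company"),
        "title": item.get("title"),
        "location": item.get("location"),
        "salary_min": salary.get("min"),
        "salary_max": salary.get("max"),
        "salary_currency": salary.get("currency"),
        "salary_period": salary.get("period"),
    }


def to_csv(items) -> str:
    """Render an iterable of records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    for item in items:
        writer.writerow(to_row(item))  # None values are written as empty cells
    return buffer.getvalue()
```

In practice you would pass it the dataset iterator from Step 3, for example to_csv(client.dataset(run["defaultDatasetId"]).iterate_items()).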

Step 4 — Lighter output

Descriptions and requirements arrive in the same request, so including them costs nothing extra. If you only need metadata (titles, departments, locations, salary), say for a comp-benchmarking or hiring-signal dashboard, set includeDescription to False to trim the payload:

run_input = {
    "companies": ["bunq", "vandebron"],
    "includeDescription": False,   # smaller records; same cost & speed
    "maxJobsPerCompany": 200,
}

maxJobsPerCompany (1–5000) caps volume so a large employer doesn't dominate your run.
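If you build the input programmatically, a small guard can catch an out-of-range cap before the run starts. This is a sketch with a hypothetical helper name; the 1–5000 range is the one stated above, and the three keys are the ones used in this tutorial.

```python
def build_run_input(companies, include_description=True, max_jobs_per_company=500):
    """Assemble the actor input, rejecting values outside the documented range."""
    companies = list(companies)
    if not companies:
        raise ValueError("companies must contain at least one identifier or URL")
    if not 1 <= max_jobs_per_company <= 5000:
        raise ValueError("maxJobsPerCompany must be between 1 and 5000")
    return {
        "companies": companies,
        "includeDescription": bool(include_description),
        "maxJobsPerCompany": int(max_jobs_per_company),
    }


run_input = build_run_input(["bunq", "vandebron"], include_description=False,
                            max_jobs_per_company=200)
```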

Prefer Node.js?

npm install apify-client

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

const run = await client.actor('freshactors/recruitee-jobs-scraper').call({
    companies: ['bunq', 'https://channable.recruitee.com'],
    includeDescription: true,
    maxJobsPerCompany: 500,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const job of items) console.log(`${job.company} — ${job.title} (${job.salary ? job.salary.currency : 'no pay listed'})`);

What about cost?

Pay-per-event: $0.02 per company board fetched and $0.0005 per job posting returned. So 5 companies returning 100 postings total is 5 × $0.02 + 100 × $0.0005 = $0.15 — descriptions and salary included. No subscription.
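The same arithmetic as a small estimator, in case you want to budget a run up front. The prices are the two quoted above; the function name is hypothetical, and Decimal is used only to avoid floating-point rounding noise in the total.

```python
from decimal import Decimal

PRICE_PER_BOARD = Decimal("0.02")   # per company board fetched
PRICE_PER_JOB = Decimal("0.0005")   # per job posting returned


def estimate_cost(boards: int, jobs: int) -> Decimal:
    """Estimated run cost in USD for a number of boards and returned postings."""
    return boards * PRICE_PER_BOARD + jobs * PRICE_PER_JOB


# The example above: 5 companies returning 100 postings in total.
example_cost = estimate_cost(5, 100)
```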

Why use the actor instead of the API directly?

You can call the Offers API yourself. The reason to use the actor is maintenance: it filters to published offers, normalizes everything into one schema (shared with Greenhouse/Lever/Workable/SmartRecruiters), flattens the salary object, isolates per-company failures, and is monitored by a daily canary so a silent API change doesn't quietly empty your pipeline.
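To make "isolates per-company failures" concrete, this is roughly what you would write yourself if you called the API directly. It is a sketch of the general pattern, not the actor's implementation; fetch stands for any function you supply that returns a company's offers or raises.

```python
def fetch_all(companies, fetch):
    """Fetch each board independently so one failure does not abort the rest.

    Returns (results, errors): offers keyed by company, and error messages
    keyed by the companies that failed.
    """
    results, errors = {}, {}
    for company in companies:
        try:
            results[company] = fetch(company)
        except Exception as exc:  # broad on purpose: record it and move on
            errors[company] = f"{type(exc).__name__}: {exc}"
    return results, errors
```

A board that is renamed or taken offline then shows up in errors while the other companies still return data.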

The actor is here: Recruitee Jobs Scraper on Apify. Point it at your target companies and consume one normalized JSON feed.

Happy scraping.

DEV Community