Personio is the ATS standard of the German-speaking SMB world — thousands of companies in Germany, Austria, and Switzerland run their careers page on a {tenant}.jobs.personio.de portal. Here's the part most people miss: every one of those portals serves a public XML feed of its published positions — no API key, no login, no headless browser. One GET returns the whole board with departments, seniority levels, and full descriptions. In this tutorial we'll pull a company's job board as clean structured JSON in a few lines of Python.
The endpoint
Every Personio career portal serves its feed at the explicit /xml path:
GET https://{tenant}.jobs.personio.de/xml
GET https://{tenant}.jobs.personio.de/xml?language=en (optional localization)
The response is a <workzag-jobs> document with one <position> per job — id, office, department, name, employmentType, seniority, schedule, createdAt, and labeled description sections as CDATA HTML.
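For readers who want to see what the do-it-yourself route involves, here is a minimal sketch of reading that document with Python's standard library. It assumes each field listed above is a direct child element of `<position>`; the helper names `feed_url` and `parse_feed` and the sample XML are made up for illustration and are not real feed output.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Field names taken from the list above; anything missing becomes None.
FIELDS = ("id", "office", "department", "name", "employmentType",
          "seniority", "schedule", "createdAt")


def feed_url(tenant, language=None):
    """Build the feed URL for a tenant, with the optional language parameter."""
    url = f"https://{tenant}.jobs.personio.de/xml"
    return f"{url}?{urlencode({'language': language})}" if language else url


def parse_feed(xml_data):
    """Return one dict per <position>; absent or empty elements map to None."""
    root = ET.fromstring(xml_data)
    return [
        {field: (pos.findtext(field) or "").strip() or None for field in FIELDS}
        for pos in root.iter("position")
    ]


# Made-up sample document for illustration only.
SAMPLE = """<workzag-jobs>
  <position>
    <id>1</id>
    <office>Berlin</office>
    <department>Engineering</department>
    <name>Backend Developer (m/w/d)</name>
    <employmentType>permanent</employmentType>
    <seniority>experienced</seniority>
    <schedule>full-time</schedule>
    <createdAt>2024-01-15T09:00:00+00:00</createdAt>
  </position>
  <position>
    <id>2</id>
    <name>Working Student</name>
  </position>
</workzag-jobs>"""

jobs = parse_feed(SAMPLE)
print(jobs[0]["name"], "/", jobs[1]["department"])
```

To try it against a live portal, pass the response body instead of `SAMPLE`, for example `requests.get(feed_url("lanch"), timeout=30).content`. The description sections are deliberately left out here because their exact nesting is not spelled out above.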
So why not just requests.get() it yourself? You can — but then you own the parser: handling CDATA sections, stripping the HTML, decoding entities, splitting multi-office strings, and fixing it the day the feed shape shifts and your pipeline goes quietly empty. A cleaner path: hand a list of tenants to an actor that returns one stable schema — the same schema as Greenhouse, Lever, Workable, SmartRecruiters, Recruitee, and Teamtailor. Here's how with the Personio Jobs Scraper.
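To give a feel for the parser work mentioned above, here is a rough sketch of just the HTML-stripping and entity-decoding part, using only the standard library. The `html_to_text` helper and its list of block-level tags are assumptions chosen for illustration; this is not how the actor does it.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text content, inserting line breaks around block-level tags."""

    # Assumed set of tags that should start a new line in the plain text.
    BLOCK_TAGS = {"p", "br", "li", "div", "ul", "ol", "h1", "h2", "h3", "h4"}

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; before handle_data.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(fragment):
    """Strip tags, decode entities, and collapse whitespace line by line."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


print(html_to_text("<p>Über uns &amp; Team</p><ul><li>Python</li><li>SQL</li></ul>"))
```

Even this small piece has judgment calls (which tags break lines, how to treat non-breaking spaces), which is the kind of upkeep the paragraph above is pointing at.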
Step 1 — Install the Apify client
pip install apify-client
Export your Apify API token (Console → Settings → Integrations) as an environment variable so the script can read it:
export APIFY_TOKEN="apify_api_xxx"
Step 2 — Run the actor with a list of tenants
companies accepts tenant subdomains (lanch) or {tenant}.jobs.personio.de URLs.
import os
from apify_client import ApifyClient
client = ApifyClient(os.environ["APIFY_TOKEN"])
run_input = {
    "companies": ["teamative", "https://lanch.jobs.personio.de"],
    "includeDescription": True,
    "maxJobsPerCompany": 100,
}
run = client.actor("freshactors/personio-jobs-scraper").call(run_input=run_input)
print("Dataset id:", run["defaultDatasetId"])
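Because `companies` takes both bare subdomains and portal URLs, the same tenant can end up in a list twice under different spellings. A small sketch of normalizing entries on your side before a run follows; `tenant_slug` is a hypothetical helper, not part of the Apify client or the actor.

```python
from urllib.parse import urlparse

SUFFIX = ".jobs.personio.de"


def tenant_slug(value):
    """Return the tenant subdomain for a bare slug or a portal URL."""
    value = value.strip().lower()
    # urlparse only fills in hostname when a scheme is present, so add one.
    host = urlparse(value if "://" in value else f"https://{value}").hostname or ""
    if host.endswith(SUFFIX):
        return host[: -len(SUFFIX)]
    return host


raw = ["teamative", "https://lanch.jobs.personio.de", "Lanch"]
# dict.fromkeys keeps first-seen order while dropping duplicates.
unique_tenants = list(dict.fromkeys(tenant_slug(entry) for entry in raw))
print(unique_tenants)
```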
Step 3 — Read the normalized output (departments + seniority included)
Every position comes back in the same shape, with null (never missing keys) where Personio's feed lacks a field:
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f'{item["company"]:<11} {item["title"]} [{item.get("department") or "n/a"} / {item.get("seniority") or "n/a"}]')
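Since every key is always present and empty values are null, simple aggregations need no special-casing. As a sketch, here is one way to count postings per department or seniority level; `count_by` is an illustrative helper and the sample items are made up.

```python
from collections import Counter


def count_by(items, key):
    """Count records per value of `key`, grouping null values under 'n/a'."""
    return Counter(item.get(key) or "n/a" for item in items)


# Made-up records in the output shape, trimmed to the fields used here.
sample_items = [
    {"company": "teamative", "department": "Marketing", "seniority": "experienced"},
    {"company": "teamative", "department": "Marketing", "seniority": None},
    {"company": "lanch", "department": None, "seniority": "entry-level"},
]

for department, total in count_by(sample_items, "department").most_common():
    print(f"{department:<12} {total}")
```

With a real run, pass `list(client.dataset(run["defaultDatasetId"]).iterate_items())` in place of `sample_items`.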
A single record (a real one, from teamative's portal):
{
  "_type": "job",
  "_schemaVersion": "1.0",
  "_source": "personio",
  "company": "teamative",
  "jobId": "2623782",
  "title": "Initiativbewerbung (m/w/d)",
  "department": "Marketing",
  "seniority": "experienced",
  "location": "DE - Stuttgart",
  "allLocations": ["DE - Stuttgart"],
  "commitment": "Full-or-part-time",
  "url": "https://teamative.jobs.personio.de/job/2623782",
  "applyUrl": "https://teamative.jobs.personio.de/job/2623782",
  "postedAt": "2026-05-05T08:58:57.000Z",
  "descriptionText": "Über uns:\nteamative bietet Beratung, Entwicklung und... (labeled sections, clean text)",
  "_scrapedAt": "2026-06-10T12:34:05.149Z"
}
This is the same record shape our Greenhouse & Lever, Workable, SmartRecruiters, Recruitee, and Teamtailor scrapers emit — plus Personio's department and seniority, segmentation fields most ATS feeds don't expose.
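The `postedAt` field is an ISO 8601 timestamp with a trailing `Z`, which makes recency filters straightforward. Below is a minimal sketch for keeping only recent postings; `parse_timestamp` and `posted_within` are hypothetical helper names, and the `Z` replacement is there because `datetime.fromisoformat` only accepts that suffix from Python 3.11 onward.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value):
    """Parse a timestamp such as '2026-05-05T08:58:57.000Z' as an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def posted_within(items, days, now=None):
    """Return the records whose postedAt falls within the last `days` days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    return [
        item for item in items
        if item.get("postedAt") and parse_timestamp(item["postedAt"]) >= cutoff
    ]


record = {"title": "Initiativbewerbung (m/w/d)", "postedAt": "2026-05-05T08:58:57.000Z"}
reference = datetime(2026, 6, 10, tzinfo=timezone.utc)
print(len(posted_within([record], days=60, now=reference)))
```

Records with a null `postedAt` are skipped rather than treated as recent, which seems like the safer default for a hiring-signal dashboard.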
Step 4 — Localized or lighter output
Want English titles/descriptions where the company maintains them? Pass a language code. Only need metadata for a hiring-signal dashboard? Drop the descriptions:
run_input = {
    "companies": ["teamative", "lanch"],
    "language": "en",             # localized where provided
    "includeDescription": False,  # smaller records; same cost & speed
    "maxJobsPerCompany": 100,
}
Prefer Node.js?
npm install apify-client
import { ApifyClient } from 'apify-client';
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
const run = await client.actor('freshactors/personio-jobs-scraper').call({
  companies: ['teamative', 'https://lanch.jobs.personio.de'],
  includeDescription: true,
  maxJobsPerCompany: 100,
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const job of items) {
  console.log(`${job.company} — ${job.title} [${job.department ?? 'n/a'} / ${job.seniority ?? 'n/a'}]`);
}
What about cost?
Pay-per-event: $0.02 per company portal fetched and $0.0005 per job posting returned. So 5 companies returning 100 postings total is 5 × $0.02 + 100 × $0.0005 = $0.15 — departments, seniority, and full descriptions included. No subscription.
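The same arithmetic as a small budgeting helper, using the two prices quoted above. `estimate_cost` is an illustrative name, and `Decimal` is used so that fractions of a cent do not pick up floating-point noise.

```python
from decimal import Decimal

PRICE_PER_COMPANY = Decimal("0.02")  # per company portal fetched
PRICE_PER_JOB = Decimal("0.0005")    # per job posting returned


def estimate_cost(companies, jobs):
    """Estimated run cost in USD for a number of portals and returned postings."""
    return companies * PRICE_PER_COMPANY + jobs * PRICE_PER_JOB


print(estimate_cost(5, 100))
```

For the example above (5 companies, 100 postings) this comes out equal to $0.15.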
Why use the actor instead of the feed directly?
You can parse the XML yourself. The reason to use the actor is maintenance: it normalizes everything into one schema (shared with our five other ATS scrapers), handles CDATA/entities/multi-office strings, isolates per-company failures (an unknown tenant never kills your run), and is monitored by a daily canary — so a silent feed change doesn't quietly empty your pipeline.
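If you do go the do-it-yourself route, per-company failure isolation is one of the pieces you would need to build. The sketch below shows the general idea only and is not the actor's implementation; `fetch_all` and the injected `fetch` callable are hypothetical.

```python
def fetch_all(tenants, fetch):
    """Call fetch(tenant) for each tenant; collect failures instead of raising."""
    results, errors = {}, {}
    for tenant in tenants:
        try:
            results[tenant] = fetch(tenant)
        except Exception as exc:  # one bad tenant should not stop the rest
            errors[tenant] = str(exc)
    return results, errors


def fake_fetch(tenant):
    """Stand-in for a real HTTP fetch, used only to demonstrate the behavior."""
    if tenant == "does-not-exist":
        raise LookupError("unknown tenant")
    return [f"{tenant}-job-1"]


results, errors = fetch_all(["teamative", "does-not-exist", "lanch"], fake_fetch)
print(sorted(results), errors)
```

Passing the fetch function in as an argument keeps the loop testable without network access; in practice it would wrap an HTTP call with a timeout.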
The actor is here: Personio Jobs Scraper on Apify. Point it at your target companies and consume one normalized JSON feed.
Happy scraping.