Abdulswamad Rama

Posted on Jun 1

Gemini called it a public API. Careerjet's registration portal disagreed.

#ai #python #webdev #api

I was building something that needed job listings. Gemini flagged Careerjet as a solid option:

"Careerjet has a public API — structured data, no scraping, Kenya support."

Four requirements it didn't mention.

What I built first

The old Careerjet affiliate endpoint (public.api.careerjet.net/search) responds to unauthenticated requests. You pass an affid parameter — originally used so Careerjet could track clicks and split revenue with partner sites. I passed a placeholder and moved on.

from urllib.parse import urlencode

params = urlencode({
    "keywords": "software engineer",
    "location": "Kenya",
    "locale_code": "en_KE",
    "affid": "your-affid",       # placeholder; I planned to "sort this out later"
    "pagesize": 100,
})
url = f"http://public.api.careerjet.net/search?{params}"

It returned JSON. I wrote a parser around it. I shipped the file.

This would have worked right up until it didn't: Careerjet deprecates the old endpoint or tightens its auth requirements, or (more likely) nothing breaks visibly, traffic never gets properly attributed, and whatever publisher agreement exists is quietly violated. But the response came back clean, so I moved on.

Then I actually registered

Out of curiosity, I went to Careerjet's publisher portal. What I found:

An API key tied to my registered domain
A mandatory IP whitelist — I had to declare which server IPs were allowed to call the API before anything worked
HTTP Basic auth on every request: API key as the username, empty string as the password

The real endpoint is https://search.api.careerjet.net/v4/query. Note the v4, the https, the different subdomain, and the complete absence of an affid param.
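
Those differences are mechanical enough to check in code. A minimal sketch, assuming you want a guard that flags a legacy-style URL before any request goes out; the function name uses_legacy_conventions is hypothetical, and the checks encode only the differences listed above:

```python
from urllib.parse import parse_qs, urlsplit

# Host of the old affiliate endpoint, as described in this post.
LEGACY_HOST = "public.api.careerjet.net"


def uses_legacy_conventions(url: str) -> bool:
    """Return True if the URL shows any sign of the old affiliate endpoint."""
    parts = urlsplit(url)
    # keep_blank_values=True so that "affid=" with no value is still detected.
    query = parse_qs(parts.query, keep_blank_values=True)
    return (
        parts.hostname == LEGACY_HOST
        or parts.scheme != "https"
        or "affid" in query
    )
```

Running a check like this in a test suite is one way to catch a stale URL that was copied from an older snippet.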

"Public" in Careerjet's vocabulary means available to any publisher who registers, without an approval process. Not unauthenticated. The AI had collapsed two different meanings into one confident sentence.

The corrected implementation

import base64
import urllib.request
import json
from urllib.parse import urlencode

API_URL = "https://search.api.careerjet.net/v4/query"

def fetch_jobs(keywords: str, location: str, api_key: str, user_ip: str) -> dict:
    params = urlencode({
        "locale_code": "en_KE",
        "keywords": keywords,
        "location": location,
        "sort": "date",
        "page_size": 100,
        "user_ip": user_ip,
        "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    })

    # Basic auth: base64(api_key + ":")  — password is empty string
    credentials = base64.b64encode(f"{api_key}:".encode()).decode()

    req = urllib.request.Request(f"{API_URL}?{params}")
    req.add_header("Authorization", f"Basic {credentials}")
    req.add_header("Referer", "https://your-site.com/jobs/")

    with urllib.request.urlopen(req, timeout=20) as resp:
        return json.loads(resp.read())
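
The Authorization header built inside fetch_jobs can be checked offline before any request is sent. A small sketch, assuming nothing beyond standard HTTP Basic auth; the helper names and the "demo-key" value are made up for illustration:

```python
import base64


def basic_auth_header(api_key: str) -> str:
    """Build the Authorization value: API key as username, empty password."""
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    return f"Basic {token}"


def decode_basic_auth(header_value: str) -> tuple[str, str]:
    """Reverse the encoding to confirm what the server will receive."""
    scheme, token = header_value.split(" ", 1)
    if scheme != "Basic":
        raise ValueError(f"unexpected auth scheme: {scheme}")
    username, _, password = base64.b64decode(token).decode().partition(":")
    return username, password


header = basic_auth_header("demo-key")  # "demo-key" is a placeholder, not a real key
print(decode_basic_auth(header))        # ('demo-key', '')
```

The round trip confirms that the key lands in the username slot and the password really is empty.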

Three things the AI's description had entirely omitted:

Basic auth with an empty password. The format is base64("your-api-key-here" + ":"). Username is the key, password is blank. Standard HTTP Basic auth — but you'd only know to look for it in the actual docs.

user_ip and user_agent are required fields. This one is the most interesting. The v4 API was designed for publishers embedding job search in their website — a real user types a query, clicks a button, the API fires. Careerjet wants the user's IP and browser string for analytics.

My use case: a cron job, no browser, no user, runs at 23:00 UTC. There is no user IP to pass. The spec says "the IP of the user whose action triggered the API call." The nightly cron is the trigger. I pass the server's outbound IP (203.0.113.1 in these examples). The API accepts it.

IP whitelisting is mandatory before anything works. In the publisher dashboard there's a text box — up to 8 IP addresses. Your server's outbound IP must be there or every request returns 403 regardless of correct auth headers. Register it first, then deploy.
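
Because a missing whitelist entry looks the same as any other 403, it can help to turn status codes into hints at the call site. A hedged sketch: the 403 hint follows the behavior described above, the 401 hint is an assumption about how a bad key would surface, and explain_status and call_with_hints are hypothetical helper names:

```python
import urllib.error


def explain_status(code: int) -> str:
    """Map an HTTP status code to a likely cause for this integration."""
    if code == 403:
        return ("HTTP 403: check that this server's outbound IP is on the "
                "whitelist in the publisher dashboard")
    if code == 401:
        # Assumption: a wrong or missing API key is reported as 401.
        return "HTTP 401: check the API key used for Basic auth"
    return f"HTTP {code}: unexpected response"


def call_with_hints(fetch, *args, **kwargs):
    """Call fetch(...) and re-raise HTTP errors with a readable hint attached."""
    try:
        return fetch(*args, **kwargs)
    except urllib.error.HTTPError as err:
        raise RuntimeError(explain_status(err.code)) from err
```

With the function from the corrected implementation, the call would look like call_with_hints(fetch_jobs, "software engineer", "Kenya", api_key, user_ip).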

Handle the response type

The response's type field is either "JOBS" or "LOCATIONS". A LOCATIONS response comes back when your location param is ambiguous:

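def extract_jobs(data: dict) -> list:
    """Return the job list, or an empty list for a non-JOBS response."""
    if data.get("type") != "JOBS":
        # LOCATIONS response: Careerjet couldn't resolve the location
        print(f"Careerjet returned no jobs: {data.get('message')}")
        return []
    return data.get("jobs", [])

data = fetch_jobs("software engineer", "Kenya", api_key="your-api-key-here", user_ip="203.0.113.1")
jobs = extract_jobs(data)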

"Kenya" and "Nairobi" both resolve cleanly. Vaguer strings may not.

Each job object has title, company, locations, description (excerpt), salary, and url. The salary field comes back as a human-readable string like "KES 80,000 – 120,000 per month" — more useful than trying to parse salary out of a scraped description.
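
To keep downstream code from depending on raw dictionaries, the fields can be copied into a small record type. A sketch under stated assumptions: the field names are the ones listed above, every field is treated as a string (locations may have a different shape in practice), and Job and to_job are hypothetical names:

```python
from dataclasses import dataclass, fields


@dataclass
class Job:
    title: str
    company: str
    locations: str  # assumption: treated as a plain string here
    description: str
    salary: str
    url: str


def to_job(raw: dict) -> Job:
    """Copy the known fields, defaulting any missing or null one to ''."""
    return Job(**{f.name: str(raw.get(f.name) or "") for f in fields(Job)})


# Sample values are invented for illustration; only the salary format
# mirrors the example in this post.
sample = {
    "title": "Backend Engineer",
    "salary": "KES 80,000 – 120,000 per month",
    "url": "https://example.com/job/1",
}
job = to_job(sample)
```

Defaulting to an empty string keeps a listing with a missing company or salary from raising a KeyError during a nightly run.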

What the AI got wrong

It wasn't lying. The old affiliate endpoint at public.api.careerjet.net is real, responds to requests, and technically works. The description "public API, no scraping" isn't false — it's just about a different version of the API than the one publishers are supposed to use.

The failure mode is: the AI described behavior that was accurate at some point, for some use case, and I didn't verify which use case it was describing. "Public API" is a phrase that means something specific to the developer who wrote the docs and something looser to a model that's synthesizing from across the web.

The fix took twenty minutes — register, whitelist the IP, swap the endpoint, add the auth header. The registration is instant, no approval needed. The documentation is clear.

The cost of not checking: an integration that works in development, silently misattributes traffic, and breaks when the old endpoint eventually goes away.
