DEV Community

agenthustler

How to Scrape LinkedIn Job Listings in 2026 (Python + Public API, No Login Required)

LinkedIn is one of the largest job boards in the world, but it doesn't offer a free public API for job listings. The good news? You don't need one. LinkedIn exposes a public guest endpoint that serves job data without authentication.

In this guide, I'll show you how to scrape LinkedIn job listings in 2026 using Python — legally, efficiently, and without logging in.

Skip the Setup — Use Our Ready-Made Scraper

Building and maintaining a LinkedIn scraper takes days of debugging proxies, parsing HTML, and handling rate limits. Our LinkedIn Jobs Scraper is production-ready: 56+ users, anti-detection built in, structured JSON output, and 5,000 results per run on the free plan.

Try it free →


How LinkedIn's Public Jobs Endpoint Works

LinkedIn serves job listings to non-logged-in visitors through a guest-facing API. When you visit a LinkedIn job search page without being signed in, your browser hits endpoints under linkedin.com/jobs-guest/. These return HTML that can be parsed for structured job data.

The two key endpoints:

  • Job search: https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search?keywords={query}&location={location}&start={offset}
  • Job details: https://www.linkedin.com/jobs-guest/jobs/api/jobPosting/{job_id}

No API key. No OAuth. No login. These are public pages LinkedIn serves to search engines and anonymous visitors.
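For illustration, the mapping from a public job URL to the guest details endpoint can be expressed as a small helper. `details_endpoint` is a name introduced here for the sketch, and the job ID below is made up:

```python
import re

def details_endpoint(job_url):
    """Map a public /jobs/view/ URL to its jobs-guest details endpoint."""
    # Job URLs end in a numeric ID, optionally preceded by a title slug
    match = re.search(r"/jobs/view/(?:[^/]*-)?(\d+)", job_url)
    if not match:
        return None
    return f"https://www.linkedin.com/jobs-guest/jobs/api/jobPosting/{match.group(1)}"

print(details_endpoint("https://www.linkedin.com/jobs/view/3812345678"))
# https://www.linkedin.com/jobs-guest/jobs/api/jobPosting/3812345678
```

The same helper handles slugged URLs like `/jobs/view/python-developer-at-acme-3812345678`, since the regex keeps only the trailing digits.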

Query Parameters You Can Use

The search endpoint supports several useful parameters:

| Parameter | Example | Description |
| --- | --- | --- |
| keywords | python+developer | Job title or skill keywords |
| location | United+States | Geographic filter |
| start | 25 | Pagination offset (increments of 25) |
| f_TPR | r86400 | Time posted: last 24h (r86400), week (r604800), month (r2592000) |
| f_E | 2 | Experience level: 1=Internship, 2=Entry, 3=Associate, 4=Mid-Senior, 5=Director, 6=Executive |
| f_JT | F | Job type: F=Full-time, P=Part-time, C=Contract, T=Temporary, I=Internship |
| f_WT | 2 | Workplace type: 1=On-site, 2=Remote, 3=Hybrid |

These parameters let you build very targeted job searches without any authentication.
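As a sketch, these filters can be combined into a single query string with the standard library. `build_filtered_search` is a hypothetical helper, and the filter values chosen (last 24 hours, entry level, full-time, remote) are just an illustration:

```python
from urllib.parse import urlencode

BASE = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search"

def build_filtered_search(keywords, location, **filters):
    """Combine keywords/location with LinkedIn filter params (f_TPR, f_E, f_JT, f_WT)."""
    params = {"keywords": keywords, "location": location, "start": 0}
    params.update(filters)
    # urlencode percent-encodes spaces and special characters for us
    return f"{BASE}?{urlencode(params)}"

# Remote, full-time, entry-level roles posted in the last 24 hours
url = build_filtered_search(
    "python developer", "United States",
    f_TPR="r86400", f_E="2", f_JT="F", f_WT="2",
)
print(url)
```

Passing the filters as keyword arguments keeps the helper open-ended, so any additional `f_*` parameter LinkedIn supports can be added without changing the function.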

Is This Legal?

Scraping publicly accessible data is generally considered legal in the US, especially after hiQ Labs v. LinkedIn, where the Ninth Circuit held that scraping publicly available data likely does not violate the Computer Fraud and Abuse Act. (A later district-court ruling went against hiQ on breach-of-contract claims before the case settled, so treat this as guidance rather than a blanket green light.) That said:

  • Only scrape public endpoints (no login required)
  • Respect robots.txt and rate limits
  • Don't scrape personal profile data — stick to job listings
  • Don't hammer their servers — add delays between requests

This guide only uses public, unauthenticated endpoints.
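If you want to check a path against robots.txt programmatically, Python's standard library ships a parser. The rules string below is illustrative only — fetch the real file from linkedin.com/robots.txt and check against LinkedIn's actual directives:

```python
from urllib import robotparser

def is_allowed(url, robots_txt, user_agent="*"):
    """Check a URL against robots.txt rules using the stdlib parser."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(user_agent, url)

# Made-up rules for demonstration, not LinkedIn's real robots.txt
rules = """User-agent: *
Disallow: /jobs-guest/
"""

print(is_allowed("https://example.com/jobs-guest/jobs/api/x", rules))  # False
print(is_allowed("https://example.com/jobs/view/123", rules))          # True
```

In a real scraper you would fetch the live file once at startup (`rp.set_url(...)` plus `rp.read()` does both steps) and consult it before each new path.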

Scraping LinkedIn Job Listings with Python

Step 1: Search for Jobs

import requests
from bs4 import BeautifulSoup
import time

def search_linkedin_jobs(keywords, location, num_jobs=25):
    jobs = []
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                      "AppleWebKit/537.36 (KHTML, like Gecko) "
                      "Chrome/120.0.0.0 Safari/537.36"
    }

    for start in range(0, num_jobs, 25):
        url = (
            "https://www.linkedin.com/jobs-guest/jobs/api/"
            "seeMoreJobPostings/search"
        )
        # Let requests URL-encode spaces and special characters in the query
        params = {"keywords": keywords, "location": location, "start": start}

        response = requests.get(url, headers=headers, params=params, timeout=15)
        if response.status_code != 200:
            print(f"Got status {response.status_code}, stopping.")
            break

        soup = BeautifulSoup(response.text, "html.parser")
        job_cards = soup.find_all("div", class_="base-card")

        for card in job_cards:
            title_el = card.find("h3", class_="base-search-card__title")
            company_el = card.find("h4", class_="base-search-card__subtitle")
            location_el = card.find("span", class_="job-search-card__location")
            link_el = card.find("a", class_="base-card__full-link")

            jobs.append({
                "title": title_el.text.strip() if title_el else None,
                "company": company_el.text.strip() if company_el else None,
                "location": location_el.text.strip() if location_el else None,
                "url": link_el["href"].split("?")[0] if link_el else None,
            })

        time.sleep(2)  # Be respectful

    return jobs

# Example usage
results = search_linkedin_jobs("python developer", "United States", num_jobs=50)
for job in results[:5]:
    print(f"{job['title']} at {job['company']} - {job['location']}")

Step 2: Get Job Details

Each job listing has a numeric ID in its URL. Use it to fetch the full description:

def get_job_details(job_id):
    url = f"https://www.linkedin.com/jobs-guest/jobs/api/jobPosting/{job_id}"
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                      "AppleWebKit/537.36 (KHTML, like Gecko) "
                      "Chrome/120.0.0.0 Safari/537.36"
    }

    response = requests.get(url, headers=headers, timeout=15)
    if response.status_code != 200:
        return None

    soup = BeautifulSoup(response.text, "html.parser")

    description_el = soup.find("div", class_="show-more-less-html__markup")
    criteria = soup.find_all("li", class_="description__job-criteria-item")

    details = {
        "description": description_el.text.strip() if description_el else None,
        "criteria": {}
    }

    for item in criteria:
        label = item.find("h3")
        value = item.find("span")
        if label and value:
            details["criteria"][label.text.strip()] = value.text.strip()

    return details

# Extract job ID from URL and fetch details
job_url = "https://www.linkedin.com/jobs/view/3812345678"
job_id = job_url.rstrip("/").split("/")[-1]
details = get_job_details(job_id)
if details:
    print(details["criteria"])
    print(details["description"][:500])

Step 3: Full Pipeline — Search, Extract, and Save

Connecting search to detail extraction with proper error handling:

import csv
import json

def extract_job_id(url):
    if not url:
        return None
    parts = url.rstrip("/").split("/")
    return parts[-1] if parts[-1].isdigit() else None

def full_pipeline(keywords, location, num_jobs=25, output="linkedin_jobs"):
    results = search_linkedin_jobs(keywords, location, num_jobs=num_jobs)
    enriched = []

    for i, job in enumerate(results):
        job_id = extract_job_id(job["url"])
        if job_id:
            details = get_job_details(job_id)
            if details:
                job["description"] = details["description"]
                job.update(details["criteria"])
            print(f"[{i+1}/{len(results)}] {job['title']} at {job['company']}")
            time.sleep(2)
        enriched.append(job)

    # Save as JSON
    with open(f"{output}.json", "w", encoding="utf-8") as f:
        json.dump(enriched, f, indent=2, ensure_ascii=False)

    # Save as CSV
    if enriched:
        keys = set()
        for j in enriched:
            keys.update(j.keys())
        with open(f"{output}.csv", "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=sorted(keys))
            writer.writeheader()
            writer.writerows(enriched)

    print(f"Saved {len(enriched)} jobs to {output}.json and {output}.csv")
    return enriched

# Run the full pipeline
jobs = full_pipeline("data engineer", "Remote", num_jobs=50)

Handling Rate Limits and Blocks

LinkedIn will start returning 429 errors if you scrape too fast. A few practical tips:

  1. Add delays: 2-3 seconds between requests minimum
  2. Rotate User-Agents: Use a pool of realistic browser UA strings
  3. Use proxy rotation: Essential for any serious volume
  4. Implement exponential backoff: Double your wait time after each 429 error
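Tips 1 and 4 together can be sketched as a small retry wrapper. This is a minimal example assuming `requests`; `get_with_backoff` and its delay schedule are my own illustration, not an official pattern:

```python
import time
import requests

def backoff_delays(base=2.0, retries=5):
    """Exponential delay schedule: 2s, 4s, 8s, 16s, 32s."""
    return [base * (2 ** i) for i in range(retries)]

def get_with_backoff(url, headers=None, retries=5, base=2.0):
    """GET that retries on HTTP 429, doubling the wait after each attempt."""
    for delay in backoff_delays(base, retries):
        resp = requests.get(url, headers=headers, timeout=15)
        if resp.status_code != 429:
            return resp
        time.sleep(delay)  # rate-limited: back off before the next try
    return None  # still rate-limited after all retries
```

Returning `None` after exhausting the retries lets the caller decide whether to skip the page or abort the whole run.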

User-Agent Rotation

import random

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/131.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_0) AppleWebKit/537.36 Chrome/131.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Chrome/131.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:132.0) Gecko/20100101 Firefox/132.0",
]

def get_random_headers():
    return {"User-Agent": random.choice(USER_AGENTS)}

Proxy Solutions for LinkedIn Scraping

For anything beyond light testing, you will want rotating proxies. I have had good results with ScraperAPI — it handles proxy rotation, retries, and CAPTCHAs automatically. You just prefix your target URL:

# Using ScraperAPI for proxy rotation
from urllib.parse import quote_plus

def search_with_proxy(keywords, location):
    target_url = (
        "https://www.linkedin.com/jobs-guest/jobs/api/"
        f"seeMoreJobPostings/search?keywords={quote_plus(keywords)}"
        f"&location={quote_plus(location)}&start=0"
    )

    # Encode the target URL so its own ? and & survive as a single parameter
    api_url = (
        "http://api.scraperapi.com"
        "?api_key=YOUR_SCRAPERAPI_KEY"
        f"&url={quote_plus(target_url)}"
    )

    response = requests.get(api_url, timeout=60)
    return response.text

ScraperAPI offers a free tier with 5,000 requests, which is enough to test your pipeline. For production scraping, their plans handle the IP rotation and retry logic so you don't have to.

The DIY Approach vs. a Managed Scraper

Building your own LinkedIn scraper is educational, but maintaining it is another story: LinkedIn changes its HTML structure every few weeks, rate limits shift without notice, and proxy management becomes a full-time job at scale.

Here's how the approaches compare:

| Factor | DIY Python Script | Managed Scraper (Apify) |
| --- | --- | --- |
| Setup time | Hours to days | Minutes |
| Maintenance | Constant (HTML changes break parsers) | Handled by the provider |
| Proxy management | You handle it | Built in |
| Output format | Whatever you build | Structured JSON, CSV, Excel |
| Scheduling | Cron jobs, manual | Built-in scheduling |
| Anti-detection | You implement | Pre-built |
| Cost | Free (+ proxy costs) | Free tier available |

Production-Ready Solution: LinkedIn Jobs Scraper on Apify

If you need reliable LinkedIn job data without the maintenance headache, I built a LinkedIn Jobs Scraper on Apify that handles all of the above. It uses the same public endpoints covered in this guide, with:

  • Anti-detection: Built-in proxy rotation and request throttling
  • Structured output: Clean JSON with title, company, location, description, salary, and more
  • Scheduling: Set it to run daily/weekly and get fresh data automatically
  • Scale: Scrape thousands of jobs in a single run
  • Free tier: 5,000 results per run at no cost

It's used by 56+ recruiters, researchers, and data teams who need LinkedIn job data without building infrastructure.

Try the LinkedIn Jobs Scraper free →

Storing Results

For anything beyond quick scripts, save to a structured format:

import csv

def save_to_csv(jobs, filename="linkedin_jobs.csv"):
    if not jobs:
        return
    keys = jobs[0].keys()
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=keys)
        writer.writeheader()
        writer.writerows(jobs)
    print(f"Saved {len(jobs)} jobs to {filename}")

results = search_linkedin_jobs("machine learning", "New York", num_jobs=50)
save_to_csv(results)

Key Takeaways

  • LinkedIn's jobs-guest endpoints are public and don't require authentication
  • You can search jobs and fetch full descriptions with basic Python (requests + BeautifulSoup)
  • Use query parameters (f_TPR, f_E, f_JT, f_WT) to target specific job types and experience levels
  • Respect rate limits — add delays, rotate user agents, and consider a proxy service like ScraperAPI for volume
  • For a production-ready, maintenance-free solution, use the LinkedIn Jobs Scraper on Apify — free tier included
  • Always scrape responsibly — public data only, reasonable request rates, no personal data

Happy scraping.

Pro tip: For reliable proxy rotation and residential IPs, check out ThorData — they offer competitive rates for web scraping at scale.

