LinkedIn is one of the richest sources of job market data — millions of listings updated daily across every industry and geography. Whether you're building an HR tech tool, researching salary trends, or aggregating job boards, scraping LinkedIn job listings is a high-value skill.
The good news? LinkedIn exposes a public guest API for job listings that requires no login or authentication. In this guide, I'll show you exactly how to use it with Python.
Why Scrape LinkedIn Jobs?
- Market research: Track hiring trends by industry, location, or company size
- Competitive intelligence: Monitor what roles your competitors are filling
- Job board aggregation: Feed listings into your own platform
- Salary analysis: Combine with other data sources for compensation benchmarking
- Academic research: Study labor market dynamics at scale
The LinkedIn Jobs Guest API
LinkedIn serves job listings to non-logged-in visitors through a public-facing endpoint. This is the same data you see when you Google "software engineer jobs LinkedIn" and click through without signing in.
The base endpoint:
https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search
Key Parameters
| Parameter | Description | Example |
|---|---|---|
| `keywords` | Job title or search terms | `python developer` |
| `location` | City, state, or country | `San Francisco, CA` |
| `geoId` | LinkedIn geo identifier | `103644278` (US) |
| `f_TPR` | Time posted filter | `r86400` (past 24h) |
| `f_E` | Experience level | `2` (entry), `3` (associate) |
| `start` | Pagination offset | `0`, `25`, `50`... |
| `f_C` | Company ID filter | `1441` (Google) |
Basic Scraper: Fetching Job Listings
Let's build a practical scraper step by step.
import requests
from bs4 import BeautifulSoup
import time
import json
def scrape_linkedin_jobs(keywords, location, num_pages=3):
    """Scrape LinkedIn guest job listings for a keyword/location search.

    Args:
        keywords: Job title or search terms, e.g. "python developer".
        location: City, state, or country string.
        num_pages: Number of 25-result pages to fetch.

    Returns:
        A list of dicts with keys: title, company, location, url, posted.
    """
    base_url = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search"
    # Hoisted out of the loop: the headers never change between pages.
    headers = {
        "User-Agent": (
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
            "AppleWebKit/537.36 (KHTML, like Gecko) "
            "Chrome/120.0.0.0 Safari/537.36"
        )
    }
    all_jobs = []
    for page in range(num_pages):
        params = {
            "keywords": keywords,
            "location": location,
            "start": page * 25,  # LinkedIn paginates in steps of 25
            "f_TPR": "r604800",  # Past week
        }
        # timeout keeps a stalled connection from hanging the scraper forever
        response = requests.get(base_url, params=params, headers=headers, timeout=15)
        if response.status_code != 200:
            print(f"Page {page}: HTTP {response.status_code}")
            break
        soup = BeautifulSoup(response.text, "html.parser")
        job_cards = soup.find_all("div", class_="base-card")
        for card in job_cards:
            title_el = card.find("h3", class_="base-search-card__title")
            company_el = card.find("h4", class_="base-search-card__subtitle")
            location_el = card.find("span", class_="job-search-card__location")
            link_el = card.find("a", class_="base-card__full-link")
            date_el = card.find("time")
            all_jobs.append({
                "title": title_el.text.strip() if title_el else None,
                "company": company_el.text.strip() if company_el else None,
                "location": location_el.text.strip() if location_el else None,
                # Drop the tracking query string from the job URL
                "url": link_el["href"].split("?")[0] if link_el else None,
                "posted": date_el["datetime"] if date_el else None,
            })
        print(f"Page {page + 1}: Found {len(job_cards)} jobs")
        time.sleep(2)  # Be respectful with rate limiting
    return all_jobs
# Usage: pull five pages of data-engineer listings in New York and persist them.
jobs = scrape_linkedin_jobs("data engineer", "New York, NY", num_pages=5)
print(f"\nTotal jobs found: {len(jobs)}")
with open("linkedin_jobs.json", "w") as out_file:
    json.dump(jobs, out_file, indent=2)
Extracting Job Details
Each listing has a detail page you can fetch for full descriptions:
def get_job_details(job_url):
    """Fetch a single job's detail page and extract its description and criteria.

    Args:
        job_url: Canonical URL of a LinkedIn job posting.

    Returns:
        Dict with "description" (plain text, or None if not found) and
        "criteria" (dict mapping criterion headers to their values).
    """
    headers = {
        "User-Agent": (
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
            "AppleWebKit/537.36 (KHTML, like Gecko) "
            "Chrome/120.0.0.0 Safari/537.36"
        )
    }
    # timeout keeps one stalled detail-page request from blocking the whole run
    response = requests.get(job_url, headers=headers, timeout=15)
    soup = BeautifulSoup(response.text, "html.parser")
    description = soup.find("div", class_="show-more-less-html__markup")
    criteria = soup.find_all("li", class_="description__job-criteria-item")
    details = {
        "description": description.get_text(strip=True) if description else None,
        "criteria": {},
    }
    for item in criteria:
        header = item.find("h3")
        value = item.find("span")
        if header and value:
            details["criteria"][header.text.strip()] = value.text.strip()
    return details
Advanced Filtering
LinkedIn's guest API supports powerful filtering. Here's how to combine filters:
def search_with_filters(
    keywords,
    location,
    experience_level=None,
    job_type=None,
    posted_within=None,
    company_id=None,
    remote=False
):
    """Build a LinkedIn guest-API params dict from high-level filter options.

    Filter codes:
        experience_level: 1=Internship, 2=Entry, 3=Associate,
                          4=Mid-Senior, 5=Director, 6=Executive
        job_type: F=Full-time, P=Part-time, C=Contract, T=Temporary, I=Internship
        posted_within: r86400=24h, r604800=week, r2592000=month
    """
    params = {"keywords": keywords, "location": location}
    # Map each optional filter onto its LinkedIn query-string key,
    # skipping anything the caller left unset.
    optional_filters = {
        "f_E": experience_level,
        "f_JT": job_type,
        "f_TPR": posted_within,
        "f_C": company_id,
    }
    params.update({key: value for key, value in optional_filters.items() if value})
    if remote:
        params["f_WT"] = "2"
    return params
# Example: Remote senior Python jobs posted in last 24 hours
filters = search_with_filters(
    keywords="python developer",
    location="United States",
    experience_level="4",    # Mid-Senior
    job_type="F",            # Full-time
    posted_within="r86400",  # past 24 hours
    remote=True,
)
Handling Pagination at Scale
When scraping thousands of listings, you need robust pagination:
def paginate_all_results(keywords, location, max_results=500):
    """Page through LinkedIn guest search results until max_results or exhaustion.

    Retries on HTTP 429 (rate limit) after a 60-second wait, and stops after
    three consecutive empty pages, which signals the end of the result set.

    Args:
        keywords: Search terms.
        location: Location string.
        max_results: Upper bound on the number of jobs returned.

    Returns:
        List of job dicts (title, company, location, url),
        at most max_results long.
    """
    base_url = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search"
    headers = {"User-Agent": "Mozilla/5.0 (compatible; JobBot/1.0)"}
    all_jobs = []
    start = 0
    consecutive_empty = 0
    while len(all_jobs) < max_results:
        params = {
            "keywords": keywords,
            "location": location,
            "start": start,
        }
        response = requests.get(base_url, params=params, headers=headers, timeout=15)
        if response.status_code == 429:
            print("Rate limited. Waiting 60 seconds...")
            time.sleep(60)
            continue  # retry the same page
        if response.status_code != 200:
            # Bug fix: previously any non-429 error (403, 999, ...) fell through
            # and its error page was parsed as if it held job cards. A hard
            # error should end the crawl instead.
            print(f"HTTP {response.status_code} at start={start}; stopping.")
            break
        soup = BeautifulSoup(response.text, "html.parser")
        cards = soup.find_all("div", class_="base-card")
        if not cards:
            consecutive_empty += 1
            if consecutive_empty >= 3:
                break  # three empty pages in a row: results exhausted
            time.sleep(5)
            start += 25
            continue
        consecutive_empty = 0
        for card in cards:
            title = card.find("h3", class_="base-search-card__title")
            company = card.find("h4", class_="base-search-card__subtitle")
            loc = card.find("span", class_="job-search-card__location")
            link = card.find("a", class_="base-card__full-link")
            all_jobs.append({
                "title": title.text.strip() if title else "",
                "company": company.text.strip() if company else "",
                "location": loc.text.strip() if loc else "",
                "url": link["href"].split("?")[0] if link else "",
            })
        start += 25
        time.sleep(2)
    # Bug fix: a full final page could overshoot max_results; trim the excess
    # so the documented bound actually holds.
    return all_jobs[:max_results]
Dealing with Rate Limits and Blocks
LinkedIn will throttle or block you if you hit their servers too aggressively. Here are practical strategies:
1. Rotate User Agents
import random

# A small pool of desktop browser user agents to rotate between requests.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36",
]

# Pick a fresh agent for each request so traffic looks less uniform.
headers = {"User-Agent": random.choice(USER_AGENTS)}
2. Use a Proxy Aggregator
For production workloads, a proxy rotation service is essential. ScrapeOps is a proxy aggregator that routes your requests through multiple proxy providers, automatically finding the cheapest working proxy for each request:
def scrape_with_proxy(url, params):
    """Route a request through the ScrapeOps proxy aggregator.

    The target URL and its query parameters are combined into one fully
    qualified URL using requests' prepared-request machinery, then handed
    to the proxy endpoint.

    Args:
        url: Target URL to scrape.
        params: Query parameters for the target URL.

    Returns:
        The proxy's requests.Response object.
    """
    proxy_params = {
        "api_key": "YOUR_SCRAPEOPS_KEY",
        # Encode url + params into a single URL string for the proxy
        "url": requests.Request("GET", url, params=params).prepare().url,
    }
    # Proxied requests can be slow; use a generous timeout rather than hang forever
    response = requests.get(
        "https://proxy.scrapeops.io/v1/",
        params=proxy_params,
        timeout=60,
    )
    return response
3. Managed Scraping API
If you'd rather skip the infrastructure altogether, ScraperAPI handles proxy rotation, CAPTCHAs, and retries for you:
def scrape_with_scraperapi(url):
    """Fetch a URL through ScraperAPI's managed scraping endpoint."""
    payload = {"api_key": "YOUR_SCRAPERAPI_KEY", "url": url}
    return requests.get("https://api.scraperapi.com", params=payload)
Saving Results to CSV
import csv
def save_to_csv(jobs, filename="linkedin_jobs.csv"):
    """Write a list of job dicts to a CSV file.

    Column order comes from the keys of the first job dict; all jobs are
    assumed to share the same keys. Does nothing for an empty list.

    Args:
        jobs: List of flat dicts, one per job.
        filename: Destination CSV path.
    """
    if not jobs:
        return
    fieldnames = jobs[0].keys()
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(jobs)
    # Bug fix: the message previously printed a literal placeholder
    # instead of interpolating the destination filename.
    print(f"Saved {len(jobs)} jobs to {filename}")
Legal and Ethical Considerations
- Public data only: This guide covers publicly accessible job listings — the same data visible to any Google searcher without a LinkedIn account
- Respect robots.txt: Check LinkedIn's robots.txt and honor crawl-delay directives
- Rate limit yourself: Add delays between requests (2-5 seconds minimum)
- Don't scrape profiles: Personal profile data has different legal implications than public job posts
- Check LinkedIn's ToS: Terms change — review them periodically
- GDPR considerations: If you're storing data about EU individuals, ensure compliance
Complete Working Example
Here's the full script you can run right now:
import requests
from bs4 import BeautifulSoup
import json
import time
def scrape_linkedin_jobs(keywords, location, num_pages=5):
    """Scrape up to num_pages * 25 public LinkedIn job listings.

    Args:
        keywords: Search terms, e.g. "software engineer".
        location: Location string, e.g. "San Francisco, CA".
        num_pages: Number of 25-result pages to request.

    Returns:
        List of dicts with title, company, location, and url keys.
    """
    base_url = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search"
    # Identical for every page, so build the headers once up front.
    headers = {
        "User-Agent": (
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
            "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
        )
    }
    all_jobs = []
    for page in range(num_pages):
        params = {
            "keywords": keywords,
            "location": location,
            "start": page * 25,   # 25 results per page
            "f_TPR": "r604800",   # posted within the past week
        }
        # timeout keeps a dead connection from hanging the script
        response = requests.get(base_url, params=params, headers=headers, timeout=15)
        if response.status_code != 200:
            break
        soup = BeautifulSoup(response.text, "html.parser")
        for card in soup.find_all("div", class_="base-card"):
            title = card.find("h3", class_="base-search-card__title")
            company = card.find("h4", class_="base-search-card__subtitle")
            loc = card.find("span", class_="job-search-card__location")
            link = card.find("a", class_="base-card__full-link")
            all_jobs.append({
                "title": title.text.strip() if title else "",
                "company": company.text.strip() if company else "",
                "location": loc.text.strip() if loc else "",
                "url": link["href"].split("?")[0] if link else "",
            })
        time.sleep(2)  # polite delay between pages
    return all_jobs
if __name__ == "__main__":
    results = scrape_linkedin_jobs("software engineer", "San Francisco, CA")
    with open("linkedin_jobs.json", "w") as out:
        json.dump(results, out, indent=2)
    print(f"Scraped {len(results)} jobs")
Wrapping Up
LinkedIn's public jobs API is a goldmine for job market data. The key principles:
- Use the guest API — no login needed for public job listings
- Paginate with the `start` parameter — increment by 25
- Filter aggressively — use `f_TPR`, `f_E`, `f_JT` to narrow results
- Respect rate limits — add delays, rotate user agents
- Use proxies for scale — ScrapeOps for proxy aggregation or ScraperAPI for fully managed scraping
Happy scraping!
Top comments (0)