agenthustler

How to Scrape LinkedIn in 2026: Jobs, Profiles, and Company Data

LinkedIn is the largest professional network with 1B+ members, and its data is gold for recruiters, market researchers, and job seekers. But scraping it has a reputation for being difficult.

Good news: LinkedIn's public jobs and company pages are accessible without authentication in 2026. Here's exactly how to do it.

Is Scraping LinkedIn Legal?

Yes, for public data. In the long-running hiQ Labs v. LinkedIn case, the Ninth Circuit held in 2022 that scraping data that is publicly accessible (i.e., not behind a login) does not violate the Computer Fraud and Abuse Act. LinkedIn's public job listings, company pages, and public profiles are fair game.

What you cannot do:

  • Log in with fake accounts to access private data
  • Scrape data behind authentication walls
  • Violate GDPR by collecting EU personal data without lawful basis

Stick to public endpoints and you're on solid legal ground.

LinkedIn's Public Jobs API

LinkedIn exposes job listings through a public-facing endpoint that requires no authentication:

https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search?keywords={query}&location={location}&start={offset}

Each request returns 25 job cards with: title, company, location, posting date, and a link to the full listing.

Python Example

import requests
from bs4 import BeautifulSoup

def scrape_linkedin_jobs(keyword, location, num_pages=3):
    jobs = []
    for page in range(num_pages):
        resp = requests.get(
            "https://www.linkedin.com/jobs-guest/jobs/api/"
            "seeMoreJobPostings/search",
            params={
                "keywords": keyword,   # requests URL-encodes these for you
                "location": location,
                "start": page * 25,    # 25 job cards per page
            },
            headers={"User-Agent": "Mozilla/5.0"},
            timeout=10,
        )
        if resp.status_code != 200:  # 429 means you've been rate-limited
            break
        soup = BeautifulSoup(resp.text, "html.parser")

        for card in soup.find_all("div", class_="base-card"):
            title = card.find("h3", class_="base-search-card__title")
            company = card.find("h4", class_="base-search-card__subtitle")
            location_el = card.find("span", class_="job-search-card__location")
            link = card.find("a", class_="base-card__full-link")

            jobs.append({
                "title": title.text.strip() if title else None,
                "company": company.text.strip() if company else None,
                "location": location_el.text.strip() if location_el else None,
                "url": link["href"] if link else None,
            })
    return jobs

results = scrape_linkedin_jobs("python developer", "United States")
print(f"Found {len(results)} jobs")
for job in results[:3]:
    print(f"  {job['title']} at {job['company']} - {job['location']}")

Sample Output

Found 75 jobs
  Senior Python Developer at Google - Mountain View, CA
  Python Engineer at Stripe - San Francisco, CA
  Backend Developer (Python) at Shopify - Remote

Scaling with Proxies

The public endpoint works great for small queries, but LinkedIn rate-limits aggressively. After 50-100 requests from the same IP, you'll start getting 429 responses.

For production scraping, you need proxy rotation. ScrapeOps works well here — their smart routing handles LinkedIn's rate limiting by distributing requests across multiple providers.
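If you'd rather wire up rotation yourself, the core idea is just round-robin selection over a proxy pool. Here's a minimal sketch — the proxy URLs are placeholders, so substitute whatever endpoints your provider gives you:

```python
import itertools

import requests

# Hypothetical proxy pool -- replace with your provider's endpoints.
PROXIES = [
    "http://user:pass@proxy1.example.com:8000",
    "http://user:pass@proxy2.example.com:8000",
    "http://user:pass@proxy3.example.com:8000",
]

_pool = itertools.cycle(PROXIES)

def next_proxy():
    """Return the next proxy in round-robin order."""
    return next(_pool)

def get_with_rotation(url, **kwargs):
    """Issue a GET through the next proxy in the pool."""
    proxy = next_proxy()
    return requests.get(
        url,
        proxies={"http": proxy, "https": proxy},
        timeout=10,
        **kwargs,
    )
```

A managed service adds retries, bans tracking, and provider failover on top of this, which is why it's worth paying for at scale — but the rotation primitive itself is this simple.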

The Easy Way: Use a Pre-Built Actor

If you don't want to build and maintain your own scraper, I've published a ready-to-use LinkedIn Jobs Scraper on Apify that handles all of this for you:

  • No code required — configure via UI or API
  • Built-in proxy rotation — handles rate limits automatically
  • Structured JSON output — title, company, location, salary, description
  • Scales to thousands of jobs — pagination and retries built in

Sample Input

{
  "searchKeywords": "data engineer",
  "location": "New York",
  "maxResults": 100
}

Sample Output

[
  {
    "title": "Senior Data Engineer",
    "company": "JPMorgan Chase",
    "location": "New York, NY",
    "postedDate": "2 days ago",
    "applicants": "142 applicants",
    "url": "https://www.linkedin.com/jobs/view/3847291023"
  }
]

Try it free: apify.com/cryptosignals/linkedin-jobs-scraper
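You can also drive the actor programmatically with Apify's official Python client. A sketch, assuming the input fields shown above and an `APIFY_TOKEN` environment variable holding your API token:

```python
import os

# Input matching the actor's schema shown above.
run_input = {
    "searchKeywords": "data engineer",
    "location": "New York",
    "maxResults": 100,
}

def fetch_jobs(run_input):
    """Run the actor and return its dataset items.

    Requires `pip install apify-client` and an APIFY_TOKEN env var.
    """
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor("cryptosignals/linkedin-jobs-scraper").call(
        run_input=run_input
    )
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())
```

Each item comes back as structured JSON like the sample output above, so the results drop straight into a dataframe or database.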

Company Data

LinkedIn company pages are also public. You can extract:

  • Company name, industry, size, and headquarters
  • Employee count and growth trends
  • Recent job postings
  • Company description and specialties

The URL pattern is https://www.linkedin.com/company/{slug}/ and the data is rendered in the initial HTML for most fields.
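A minimal sketch of pulling basic fields from that initial HTML. LinkedIn's class names change often, so this parses the Open Graph meta tags instead, which tend to be more stable — still an assumption you should verify against the live markup:

```python
import requests
from bs4 import BeautifulSoup

def company_url(slug):
    """Build the public company-page URL from its slug."""
    return f"https://www.linkedin.com/company/{slug}/"

def parse_company(html):
    """Extract basic company fields from the page's initial HTML."""
    soup = BeautifulSoup(html, "html.parser")

    def meta(prop):
        tag = soup.find("meta", property=prop)
        return tag["content"] if tag else None

    return {
        "name": meta("og:title"),
        "description": meta("og:description"),
    }

def scrape_company(slug):
    resp = requests.get(
        company_url(slug),
        headers={"User-Agent": "Mozilla/5.0"},
        timeout=10,
    )
    return parse_company(resp.text)
```

Fields like employee count and specialties sit deeper in the page markup and need selectors you'll have to discover in your browser's dev tools.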

Tips for Reliable LinkedIn Scraping

  1. Rotate User-Agents — LinkedIn checks this. Rotate between 10+ real browser UAs.
  2. Add random delays — 2-5 seconds between requests minimum. LinkedIn's rate limiter is aggressive.
  3. Use residential proxies — Datacenter IPs get blocked quickly on LinkedIn.
  4. Cache aggressively — Job listings don't change often. Cache for 24h to reduce requests.
  5. Respect robots.txt — LinkedIn allows /jobs-guest/ in their robots.txt. Stay within allowed paths.
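Tips 1, 2, and 4 combine naturally into one fetch helper. A sketch — the UA strings are examples to swap for current real ones, and `fetcher` is whatever function actually makes the request:

```python
import random
import time

# Example desktop UA strings -- swap in current real browser UAs.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
]

_cache = {}  # url -> (fetched_at, body)
CACHE_TTL = 24 * 3600  # listings change slowly; 24h cache cuts requests

def cached_fetch(url, fetcher, now=time.time, sleep=time.sleep):
    """Serve from cache when fresh; otherwise fetch with a random UA,
    cache the body, and pause 2-5s before the next live request."""
    hit = _cache.get(url)
    if hit and now() - hit[0] < CACHE_TTL:
        return hit[1]

    headers = {"User-Agent": random.choice(USER_AGENTS)}
    body = fetcher(url, headers)  # e.g. requests.get(url, headers=...).text
    _cache[url] = (now(), body)
    sleep(random.uniform(2, 5))  # random delay only on cache misses
    return body
```

Injecting `now` and `sleep` as parameters keeps the helper easy to unit-test without real clocks or delays.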

What You Can Build

  • Job market dashboards — Track hiring trends by role, company, or location
  • Salary research tools — Aggregate salary data from job descriptions
  • Competitor intelligence — Monitor who your competitors are hiring
  • Recruitment automation — Alert when matching jobs are posted
  • Market research — Analyze demand for specific skills over time

LinkedIn's public data is one of the most valuable datasets for anyone in recruiting, HR tech, or market research. The tools exist — the LinkedIn Jobs Scraper handles it out of the box, or build your own with the code above.

Have questions about LinkedIn scraping? Drop them in the comments.
