DEV Community

agenthustler

How to Scrape Indeed in 2026: Job Listings, Salaries, and Company Reviews

Indeed is the largest job aggregator with over 300 million unique visitors per month. Whether you are building a job board aggregator, analyzing salary trends, or monitoring hiring activity in specific industries, Indeed data is incredibly valuable.

This guide shows you how to extract job listings, salary data, and company reviews from Indeed in 2026 — with working code, honest limitations, and practical workarounds.

What Data Can You Get from Indeed?

Indeed exposes several types of data:

  • Job listings: title, company, location, salary (when posted), description, date posted
  • Salary data: reported salaries by job title and location
  • Company reviews: ratings, pros/cons, CEO approval
  • Job trends: hiring velocity by industry and region

The most common use case is job listing aggregation — pulling thousands of postings for a specific role or location and analyzing salary ranges, required skills, or remote work availability.

Method 1: Scraping Indeed Search Results with Requests

Indeed renders its search results server-side, which means you can get listing data with plain HTTP requests. No browser automation is needed for the initial results.

# Implementation is proprietary (that IS the moat).
# Skip the build — use our ready-made Apify actor:
# see the CTA below for the link (fpr=yw6md3).
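
Here is a minimal sketch of the plain-HTTP approach, not Indeed's documented interface. It assumes the search URL accepts `q`, `l`, and `start` query parameters, that results come 10 per page, and that each result link carries a `data-jk` attribute holding the job key. All three are assumptions about markup that changes often, so verify them against a live page. The function names and the User-Agent string are placeholders. Parsing uses the standard library, and `requests` is only imported when you actually fetch.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urlencode

BASE_URL = "https://www.indeed.com/jobs"
RESULTS_PER_PAGE = 10  # assumption: adjust if the site pages differently


def build_search_url(query: str, location: str, page: int = 0) -> str:
    """Build a search URL for a zero-based results page."""
    params = {"q": query, "l": location, "start": page * RESULTS_PER_PAGE}
    return f"{BASE_URL}?{urlencode(params)}"


class JobKeyParser(HTMLParser):
    """Collect unique job keys from <a data-jk="..."> tags (assumed markup)."""

    def __init__(self) -> None:
        super().__init__()
        self.job_keys: List[str] = []

    def handle_starttag(self, tag, attrs):
        job_key = dict(attrs).get("data-jk")
        if tag == "a" and job_key and job_key not in self.job_keys:
            self.job_keys.append(job_key)


def parse_job_keys(html: str) -> List[str]:
    """Return job keys in the order they appear in the page."""
    parser = JobKeyParser()
    parser.feed(html)
    return parser.job_keys


def fetch_search_page(query: str, location: str, page: int = 0) -> str:
    """Download one results page. Requires the third-party requests package."""
    import requests  # imported here so the parsing helpers work without it

    headers = {"User-Agent": "Mozilla/5.0 (compatible; example-scraper/0.1)"}  # placeholder
    response = requests.get(build_search_url(query, location, page),
                            headers=headers, timeout=30)
    response.raise_for_status()
    return response.text
```

Keeping URL building and parsing separate from the network call makes it easy to test your selectors against a saved HTML file before sending any requests.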

Getting Full Job Descriptions

The search results only show a snippet. To get the full job description, you need to fetch each individual job page:

# Implementation is proprietary (that IS the moat).
# Skip the build — use our ready-made Apify actor:
# see the CTA below for the link (fpr=yw6md3).
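
One way to sketch the per-job step, under stated assumptions: the job page lives at `/viewjob?jk=<job_key>` and the description sits inside an element with the id `jobDescriptionText`. Both are guesses about the current markup rather than anything Indeed guarantees, and the helper names are made up for illustration. The parser below pulls the text out of HTML you have already downloaded.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urlencode

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


def job_url(job_key: str) -> str:
    """Build the URL of an individual job page (assumed URL pattern)."""
    return f"https://www.indeed.com/viewjob?{urlencode({'jk': job_key})}"


class DescriptionParser(HTMLParser):
    """Collect the text inside the element whose id matches target_id."""

    def __init__(self, target_id: str = "jobDescriptionText") -> None:
        super().__init__()
        self.target_id = target_id
        self.depth = 0  # 0 means we are outside the target element
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.depth and text:
            self.chunks.append(text)


def extract_description(html: str) -> str:
    """Return the description text, one line per text node."""
    parser = DescriptionParser()
    parser.feed(html)
    return "\n".join(parser.chunks)
```

The depth counter expects properly closed tags. If the pages you get contain unclosed `<p>` or `<li>` elements, a forgiving parser such as BeautifulSoup is a safer choice.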

Method 2: Playwright for Dynamic Content

Indeed has been progressively moving to client-side rendering for some features. If the requests approach starts returning incomplete data, Playwright is your fallback:

# Implementation is proprietary (that IS the moat).
# Skip the build — use our ready-made Apify actor:
# see the CTA below for the link (fpr=yw6md3).
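
A minimal Playwright sketch of that fallback, assuming Playwright for Python and its Chromium build are installed (`pip install playwright`, then `playwright install chromium`). The wait selector `a[data-jk]` is an assumption about Indeed's markup, and `render_page` is a hypothetical helper name.

```python
DEFAULT_WAIT_SELECTOR = "a[data-jk]"  # assumption: result links carry data-jk


def render_page(url: str,
                wait_selector: str = DEFAULT_WAIT_SELECTOR,
                timeout_ms: int = 15000) -> str:
    """Load a page in headless Chromium and return the rendered HTML."""
    # Imported lazily so this module loads even where Playwright is missing.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, wait_until="domcontentloaded")
            # Block until client-side rendering has produced at least one result.
            page.wait_for_selector(wait_selector, timeout=timeout_ms)
            return page.content()
        finally:
            browser.close()
```

The returned HTML can go through the same parsing code you use for the requests approach, so the browser only replaces the download step.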

Scraping Indeed Salary Data

Indeed has a dedicated salary section that aggregates reported salaries by job title and location. This data is useful for market research:

# Implementation is proprietary (that IS the moat).
# Skip the build — use our ready-made Apify actor:
# see the CTA below for the link (fpr=yw6md3).
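
Salary figures usually arrive as display strings, so market research needs them as numbers first. The sketch below assumes strings shaped like "$85,000 - $110,000 a year" or "$45 an hour" and annualizes them with assumed multipliers (2,080 working hours and 260 working days per year). Both the string shapes and the multipliers are illustrative assumptions, so adjust them to whatever you actually scrape.

```python
import re
from typing import Optional, Tuple

# Assumed conversion factors to a yearly figure.
PERIOD_MULTIPLIER = {"hour": 2080, "day": 260, "week": 52, "month": 12, "year": 1}

# A dollar amount such as $85,000 or $45.50, optionally followed by k/K.
AMOUNT_RE = re.compile(r"\$\s*(\d[\d,]*(?:\.\d+)?)([kK]\b)?")
PERIOD_RE = re.compile(r"\b(hour|day|week|month|year)\b", re.IGNORECASE)


def parse_salary(text: str) -> Optional[Tuple[float, float]]:
    """Return (low, high) as yearly amounts, or None if no amount is found."""
    amounts = []
    for number, kilo in AMOUNT_RE.findall(text):
        value = float(number.replace(",", ""))
        amounts.append(value * 1000 if kilo else value)
    if not amounts:
        return None

    period = PERIOD_RE.search(text)
    # Strings without a stated period are treated as yearly figures.
    multiplier = PERIOD_MULTIPLIER[period.group(1).lower()] if period else 1
    return (min(amounts) * multiplier, max(amounts) * multiplier)
```

A single figure comes back as a range with equal low and high values, which keeps downstream code from needing a special case.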

Scraping Company Reviews

Indeed company review pages contain ratings, written reviews, and pros/cons:

# Implementation is proprietary (that IS the moat).
# Skip the build — use our ready-made Apify actor:
# see the CTA below for the link (fpr=yw6md3).
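
Once reviews are extracted, a small record type keeps them consistent. This sketch assumes review pages live at `/cmp/<company-slug>/reviews`, where the slug is whatever appears in the company's Indeed URL, and that you have already parsed each review into the fields below. The `Review` class and `summarize_reviews` function are hypothetical names, not part of any Indeed API.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List
from urllib.parse import quote


@dataclass
class Review:
    rating: int  # star rating, assumed to be 1 through 5
    title: str
    text: str
    pros: str = ""
    cons: str = ""


def reviews_url(company_slug: str) -> str:
    """Build the review page URL for a company (assumed URL pattern)."""
    return f"https://www.indeed.com/cmp/{quote(company_slug)}/reviews"


def summarize_reviews(reviews: List[Review]) -> Dict:
    """Return the review count, average rating, and a count per star value."""
    if not reviews:
        return {"count": 0, "average_rating": None, "by_stars": {}}
    stars = Counter(review.rating for review in reviews)
    return {
        "count": len(reviews),
        "average_rating": round(mean(r.rating for r in reviews), 2),
        "by_stars": dict(sorted(stars.items())),
    }
```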

Handling Anti-Bot Protection

Indeed's bot detection is moderate compared to sites like LinkedIn or Zillow. Here is what you will face:

  1. Rate limiting: More than ~1 request per second from the same IP will trigger CAPTCHAs
  2. Cookie requirements: Sessions without proper cookies get flagged
  3. CAPTCHA walls: Automated requests eventually hit a CAPTCHA page
  4. IP blocking: Persistent scraping from datacenter IPs gets blocked within minutes

Solutions That Work

For small-scale scraping (under 500 pages): Add delays (2-3 seconds between requests), rotate User-Agent strings, and use a residential proxy.
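
The small-scale advice can be sketched as a loop that waits a random 2 to 3 seconds between requests and picks a new User-Agent each time. The User-Agent strings are examples only, and `fetch_politely` is a hypothetical helper. The actual download is passed in as a function, so you can plug in `requests` with your proxy settings or a stub for testing.

```python
import random
import time
from typing import Callable, Dict, Iterable, List

# Example User-Agent strings; replace them with current ones.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


def fetch_politely(urls: Iterable[str],
                   fetch: Callable[[str, Dict[str, str]], str],
                   min_delay: float = 2.0,
                   max_delay: float = 3.0,
                   sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Fetch each URL in order, pausing a random interval between requests."""
    pages = []
    for index, url in enumerate(urls):
        if index:  # no need to wait before the first request
            sleep(random.uniform(min_delay, max_delay))
        headers = {"User-Agent": random.choice(USER_AGENTS)}
        pages.append(fetch(url, headers))
    return pages
```

With `requests`, the `fetch` argument could be `lambda url, headers: requests.get(url, headers=headers, proxies=proxies, timeout=30).text`, where `proxies` is the dictionary your residential proxy provider tells you to use.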

For production-scale scraping: Use a scraping API that handles anti-bot measures automatically. ScraperAPI works well for Indeed because its proxy pool and header rotation take care of the CAPTCHA problem:

# Implementation is proprietary (that IS the moat).
# Skip the build — use our ready-made Apify actor:
# see the CTA below for the link (fpr=yw6md3).
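
As a rough sketch of how a scraping API slots in: you send the target URL to the provider's endpoint and get the page HTML back. The endpoint and the `api_key` and `url` parameter names below follow the pattern ScraperAPI has documented, but treat them as assumptions and confirm them against the provider's current docs. `YOUR_API_KEY` and the function names are placeholders.

```python
from urllib.parse import urlencode

SCRAPERAPI_ENDPOINT = "https://api.scraperapi.com/"  # assumption: verify in the docs


def build_scraperapi_url(api_key: str, target_url: str) -> str:
    """Wrap a target URL in a scraping API request URL."""
    return f"{SCRAPERAPI_ENDPOINT}?{urlencode({'api_key': api_key, 'url': target_url})}"


def fetch_via_scraperapi(api_key: str, target_url: str) -> str:
    """Fetch a page through the API. Requires the third-party requests package."""
    import requests  # imported here so the URL helper works without it

    # Proxied requests can be slow, so allow a generous timeout.
    response = requests.get(build_scraperapi_url(api_key, target_url), timeout=70)
    response.raise_for_status()
    return response.text
```

Because the target URL is passed as a query parameter, it has to be percent-encoded, which `urlencode` handles here.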

For comparing scraping API options (pricing, success rates, speed), ScrapeOps maintains a helpful proxy comparison dashboard that benchmarks different providers against popular targets including Indeed.

Exporting to Structured Data

Here is a function that deduplicates, cleans, and exports scraped Indeed data:

import csv
import json
from datetime import datetime

def export_jobs(jobs: list, fmt: str = "csv"):
    """Export scraped jobs to CSV or JSON."""

    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")

    # Clean the data
    clean_jobs = []
    seen_keys = set()

    for job in jobs:
        key = job.get("job_key")
        if key and key in seen_keys:
            continue
        if key:
            seen_keys.add(key)

        salary = job.get("salary", "")
        if salary:
            salary = salary.replace("\xa0", " ").strip()

        # `or ""` guards against fields that are present but set to None
        clean_jobs.append({
            "title": (job.get("title") or "").strip(),
            "company": (job.get("company") or "").strip(),
            "location": (job.get("location") or "").strip(),
            "salary": salary or "Not disclosed",
            "posted": job.get("posted", ""),
            "job_key": key,
            "scraped_at": datetime.now().isoformat()
        })

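    if fmt == "csv":
        filename = f"indeed_jobs_{timestamp}.csv"
        # Explicit field names so an empty result still writes a header row
        fieldnames = ["title", "company", "location", "salary",
                      "posted", "job_key", "scraped_at"]
        with open(filename, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(clean_jobs)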
    else:
        filename = f"indeed_jobs_{timestamp}.json"
        with open(filename, "w", encoding="utf-8") as f:
            json.dump(clean_jobs, f, indent=2, ensure_ascii=False)

    print(f"Exported {len(clean_jobs)} jobs to {filename}")
    return filename

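# Hypothetical sample record; in practice `jobs` comes from your scraping step.
jobs = [{"title": "Data Engineer", "company": "Acme", "location": "Remote",
         "salary": "$120,000\xa0a year", "posted": "2 days ago", "job_key": "abc123"}]
export_jobs(jobs, fmt="csv")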

Legal and Ethical Notes

Indeed's robots.txt disallows most scraping paths. Their ToS explicitly prohibits automated data collection. That said:

  • Public job listings are widely considered factual, non-copyrightable data
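  • The Ninth Circuit's decision in hiQ v. LinkedIn suggests that scraping publicly accessible data is not a CFAA violation, though hiQ was later found to have breached LinkedIn's user agreement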
  • However, commercial redistribution of Indeed data could trigger a civil lawsuit
  • Indeed actively sends cease-and-desist letters to scrapers that redistribute their data

Best practices: Scrape at respectful rates, do not redistribute raw data, use the data for analysis and insights rather than building a competing job board, and consider whether Indeed's official APIs or partner programs might serve your needs.

The Easy Path: Pre-Built Job Scrapers

If you need job data without maintaining scraping infrastructure, the Apify Store has ready-to-use job scrapers including our Glassdoor Jobs and Reviews Scraper for a complementary data source. These handle proxy rotation, anti-bot bypasses, and structured output — you just define your search parameters and get clean JSON or CSV output.

For job market analysis, combining data from multiple sources (Indeed + Glassdoor + LinkedIn) gives you the most complete picture of salary ranges and hiring trends.


Building something cool with job data? Drop a comment below or check out our other scrapers on the Apify Store.
