agenthustler

ZipRecruiter Scraping: Extract Job Listings and Salary Data

ZipRecruiter is one of the largest job marketplaces in the US, with millions of active job listings, salary estimates, and employer profiles. Scraping ZipRecruiter gives you access to real-time labor market data — job titles, compensation ranges, required skills, company information, and application trends.

In this comprehensive guide, we'll cover how ZipRecruiter structures its data, what's available for extraction, and how to build a production-ready scraper using JavaScript and the Apify platform. Whether you're building a job aggregator, analyzing salary trends, or researching hiring patterns, this guide has you covered.

Why Scrape ZipRecruiter?

ZipRecruiter aggregates job listings from hundreds of thousands of employers. Each listing contains valuable structured data:

  • Job titles and descriptions with detailed requirements
  • Salary estimates — both employer-provided and ZipRecruiter's AI-generated ranges
  • Company profiles with ratings, size, industry, and culture info
  • Location data including remote/hybrid/on-site designations
  • Application metrics like "one-click apply" availability
  • Skills and qualifications extracted from job descriptions
  • Posted dates and urgency signals ("hiring urgently", "few applicants")

This data powers workforce analytics, compensation benchmarking, talent market research, and job aggregation platforms.

Understanding ZipRecruiter's Structure

ZipRecruiter organizes content around search results, job detail pages, company profiles, and salary guides.

Search Results Pages

The primary entry point is the search interface:

https://www.ziprecruiter.com/jobs-search?search=software+engineer&location=San+Francisco

Key URL parameters:

  • search — job title or keywords
  • location — city, state, or "Remote"
  • radius — search radius in miles
  • days — posted within N days
  • refine_by_salary — minimum salary filter

Search results are paginated and typically show 20 jobs per page.
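
These parameters can be combined programmatically. Here's a small URL-builder sketch — the parameter names come from the example URL above, and ZipRecruiter may change them at any time:

```javascript
// Build a ZipRecruiter search URL from filter options.
// Parameter names match the observed URL format; treat them as
// assumptions that may change without notice.
function buildSearchUrl({ keyword, location = '', radius, days, minSalary }) {
    const params = new URLSearchParams({ search: keyword, location });
    if (radius) params.set('radius', String(radius));
    if (days) params.set('days', String(days));
    if (minSalary) params.set('refine_by_salary', String(minSalary));
    return `https://www.ziprecruiter.com/jobs-search?${params.toString()}`;
}

const url = buildSearchUrl({
    keyword: 'software engineer',
    location: 'San Francisco',
    days: 7,
});
```

`URLSearchParams` handles the encoding (spaces become `+`), so keywords with special characters are safe to pass through.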

Job Detail Pages

Each job has a dedicated page with the full listing:

https://www.ziprecruiter.com/c/CompanyName/Job/Job-Title/-in-City,ST?jid=ABC123

These pages contain the complete job description, requirements, salary info, company details, and application options.
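
The `jid` query parameter uniquely identifies a listing, which makes it useful for deduplication. A quick sketch of pulling it out (assuming the URL shape shown above):

```javascript
// Extract the job ID (jid) from a ZipRecruiter job detail URL.
// Assumes the ?jid= query parameter seen in the example URL format.
function extractJobId(jobUrl) {
    try {
        return new URL(jobUrl).searchParams.get('jid');
    } catch {
        return null; // malformed URL
    }
}

const jid = extractJobId(
    'https://www.ziprecruiter.com/c/CompanyName/Job/Job-Title/-in-City,ST?jid=ABC123'
);
// jid === 'ABC123'
```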

Company Profile Pages

Employer profiles aggregate all of a company's listings:

https://www.ziprecruiter.com/c/CompanyName/Jobs

Salary Pages

ZipRecruiter also has dedicated salary information pages:

https://www.ziprecruiter.com/Salaries/Software-Engineer-Salary

These show national averages, state-by-state breakdowns, and related job titles.

Setting Up the Scraper

We'll build this scraper using Apify's Crawlee framework with Puppeteer for browser automation.

import { PuppeteerCrawler, Dataset } from 'crawlee';
import { Actor } from 'apify';

await Actor.init();

const input = await Actor.getInput() ?? {};
const {
    searchQueries = [
        { keyword: 'software engineer', location: 'Remote' }
    ],
    maxResults = 100,
    scrapeSalaryPages = true,
    scrapeCompanyProfiles = false,
} = input;

const crawler = new PuppeteerCrawler({
    maxConcurrency: 2,
    maxRequestsPerMinute: 10,
    navigationTimeoutSecs: 60,
    requestHandlerTimeoutSecs: 120,

    launchContext: {
        launchOptions: {
            headless: true,
            args: [
                '--no-sandbox',
                '--disable-setuid-sandbox',
                '--disable-blink-features=AutomationControlled',
            ],
        },
    },

    async requestHandler({ request, page, enqueueLinks, log }) {
        const { label } = request.userData;

        switch (label) {
            case 'SEARCH':
                await handleSearchResults(page, enqueueLinks, log, request);
                break;
            case 'JOB_DETAIL':
                await handleJobDetail(page, log);
                break;
            case 'COMPANY':
                await handleCompanyProfile(page, log);
                break;
            case 'SALARY':
                await handleSalaryPage(page, log);
                break;
        }
    },

    async failedRequestHandler({ request, log }) {
        log.error(`Failed: ${request.url}`);
        await Dataset.pushData({
            type: 'error',
            url: request.url,
            errors: request.errorMessages,
        });
    },
});

Scraping Search Results

Search results are your starting point for discovering job listings. Here's how to extract the data from search result pages:

async function handleSearchResults(page, enqueueLinks, log, request) {
    log.info(`Processing search results: ${request.url}`);

    // Wait for job cards to load
    await page.waitForSelector(
        '[class*="job_result"], [data-testid="job-card"]',
        { timeout: 15000 }
    );

    // Extract job summaries from search results
    const jobs = await page.evaluate(() => {
        const jobCards = document.querySelectorAll(
            '[class*="job_result"], [data-testid="job-card"]'
        );

        return Array.from(jobCards).map(card => {
            const titleEl = card.querySelector(
                'a[class*="job_link"], [data-testid="job-title"] a'
            );
            const companyEl = card.querySelector(
                '[class*="company_name"], [data-testid="company-name"]'
            );
            const locationEl = card.querySelector(
                '[class*="location"], [data-testid="job-location"]'
            );
            const salaryEl = card.querySelector(
                '[class*="salary"], [data-testid="salary-estimate"]'
            );
            const snippetEl = card.querySelector(
                '[class*="snippet"], [class*="description"]'
            );
            const postedEl = card.querySelector(
                '[class*="posted"], time'
            );
            const urgencyEl = card.querySelector(
                '[class*="urgent"], [class*="hiring"]'
            );
            const easyApplyEl = card.querySelector(
                '[class*="one_click"], [class*="easy-apply"]'
            );

            return {
                title: titleEl?.textContent?.trim() || '',
                url: titleEl?.href || '',
                company: companyEl?.textContent?.trim() || '',
                location: locationEl?.textContent?.trim() || '',
                salary: salaryEl?.textContent?.trim() || null,
                snippet: snippetEl?.textContent?.trim() || '',
                postedDate: postedEl?.textContent?.trim() || '',
                isUrgent: !!urgencyEl,
                hasOneClickApply: !!easyApplyEl,
            };
        });
    });

    log.info(`Found ${jobs.length} jobs on search page`);

    // Save summary data
    for (const job of jobs) {
        await Dataset.pushData({
            type: 'search_result',
            ...job,
            searchQuery: request.userData.keyword,
            scrapedAt: new Date().toISOString(),
        });
    }

    // Enqueue job detail pages
    await enqueueLinks({
        selector: 'a[class*="job_link"], [data-testid="job-title"] a',
        userData: { label: 'JOB_DETAIL' },
    });

    // Handle pagination
    await enqueueLinks({
        selector: 'a[aria-label="Next"], [class*="next-page"] a',
        userData: {
            label: 'SEARCH',
            keyword: request.userData.keyword,
        },
    });
}

Extracting Job Details

The job detail page is where you get the full picture — complete descriptions, requirements, benefits, and salary data:

async function handleJobDetail(page, log) {
    log.info(`Processing job detail: ${page.url()}`);

    await page.waitForSelector(
        '[class*="job_description"], [data-testid="job-description"]',
        { timeout: 15000 }
    );

    const jobData = await page.evaluate(() => {
        // Job title and basic info
        const title = document.querySelector(
            'h1, [class*="job-title"]'
        )?.textContent?.trim();
        const company = document.querySelector(
            '[class*="company_name"], [data-testid="company-name"]'
        )?.textContent?.trim();
        const location = document.querySelector(
            '[class*="location"], [data-testid="location"]'
        )?.textContent?.trim();

        // Salary information
        const salarySection = document.querySelector(
            '[class*="salary"], [data-testid="salary-section"]'
        );
        const salaryText = salarySection?.textContent?.trim() || '';
        const salaryMatch = salaryText.match(
            /\$([\d,]+)(?:\s*(?:-|–|to)\s*\$([\d,]+))?/
        );
        const salary = {
            raw: salaryText,
            min: salaryMatch?.[1]?.replace(/,/g, '') 
                ? parseInt(salaryMatch[1].replace(/,/g, '')) 
                : null,
            max: salaryMatch?.[2]?.replace(/,/g, '') 
                ? parseInt(salaryMatch[2].replace(/,/g, '')) 
                : null,
            type: salaryText.toLowerCase().includes('year') 
                ? 'annual' 
                : salaryText.toLowerCase().includes('hour') 
                    ? 'hourly' 
                    : 'unknown',
            isEstimate: salaryText.toLowerCase().includes('estimated'),
        };

        // Full job description
        const descriptionEl = document.querySelector(
            '[class*="job_description"], [data-testid="job-description"]'
        );
        const fullDescription = descriptionEl?.textContent?.trim() || '';
        const descriptionHtml = descriptionEl?.innerHTML || '';

        // Extract requirements / qualifications
        const requirementsList = [];
        const listItems = descriptionEl?.querySelectorAll('li') || [];
        listItems.forEach(li => {
            requirementsList.push(li.textContent.trim());
        });

        // Employment type
        const employmentType = document.querySelector(
            '[class*="employment-type"], [class*="job_type"]'
        )?.textContent?.trim();

        // Benefits
        const benefits = Array.from(
            document.querySelectorAll(
                '[class*="benefit"] li, [data-testid="benefit-item"]'
            )
        ).map(el => el.textContent.trim());

        // Application info
        const hasOneClickApply = !!document.querySelector(
            '[class*="one_click_apply"], [data-testid="easy-apply-btn"]'
        );
        const applyUrl = document.querySelector(
            'a[class*="apply"], [data-testid="apply-button"]'
        )?.href;

        // Posted info
        const postedDate = document.querySelector(
            '[class*="posted"], time[datetime]'
        )?.getAttribute('datetime') || document.querySelector(
            '[class*="posted"]'
        )?.textContent?.trim();

        // Workplace type
        const isRemote = [location, title, fullDescription].some(
            text => text?.toLowerCase().includes('remote')
        );
        const isHybrid = [location, fullDescription].some(
            text => text?.toLowerCase().includes('hybrid')
        );

        return {
            type: 'job_detail',
            title,
            company,
            location,
            salary,
            fullDescription,
            requirements: requirementsList,
            employmentType,
            benefits,
            hasOneClickApply,
            applyUrl,
            postedDate,
            workplaceType: isRemote ? 'remote' : isHybrid ? 'hybrid' : 'on-site',
            url: window.location.href,
            scrapedAt: new Date().toISOString(),
        };
    });

    await Dataset.pushData(jobData);

    log.info(
        `Extracted: ${jobData.title} at ${jobData.company}` +
        ` | Salary: ${jobData.salary.raw || 'Not listed'}` +
        ` | ${jobData.workplaceType}`
    );
}

Extracting Salary Data

ZipRecruiter's salary pages provide aggregated compensation data that's invaluable for market research:

async function handleSalaryPage(page, log) {
    log.info(`Processing salary page: ${page.url()}`);

    await page.waitForSelector(
        '[class*="salary-info"], [data-testid="salary-data"]',
        { timeout: 15000 }
    );

    const salaryData = await page.evaluate(() => {
        const jobTitle = document.querySelector(
            'h1'
        )?.textContent?.trim()?.replace(/Salary$/i, '').trim();

        // National average
        const nationalAvg = document.querySelector(
            '[class*="national-avg"], [data-testid="national-average"]'
        )?.textContent?.trim();

        // Salary range
        const rangeSection = document.querySelector(
            '[class*="salary-range"]'
        );
        const percentiles = {};
        const percentileEls = rangeSection?.querySelectorAll(
            '[class*="percentile"]'
        ) || [];
        percentileEls.forEach(el => {
            const label = el.querySelector(
                '[class*="label"]'
            )?.textContent?.trim();
            const value = el.querySelector(
                '[class*="value"]'
            )?.textContent?.trim();
            if (label && value) {
                percentiles[label.toLowerCase()] = value;
            }
        });

        // State-by-state data
        const stateData = Array.from(
            document.querySelectorAll(
                '[class*="state-row"], [data-testid="state-salary"]'
            )
        ).map(row => ({
            state: row.querySelector(
                '[class*="state-name"]'
            )?.textContent?.trim(),
            avgSalary: row.querySelector(
                '[class*="salary"]'
            )?.textContent?.trim(),
            jobCount: row.querySelector(
                '[class*="job-count"]'
            )?.textContent?.trim(),
        }));

        // Related job titles
        const relatedTitles = Array.from(
            document.querySelectorAll(
                '[class*="related-title"], [data-testid="related-job"]'
            )
        ).map(el => ({
            title: el.querySelector('a, [class*="title"]')
                ?.textContent?.trim(),
            salary: el.querySelector('[class*="salary"]')
                ?.textContent?.trim(),
            url: el.querySelector('a')?.href,
        }));

        return {
            type: 'salary_data',
            jobTitle,
            nationalAverage: nationalAvg,
            percentiles,
            stateBreakdown: stateData,
            relatedTitles,
            url: window.location.href,
            scrapedAt: new Date().toISOString(),
        };
    });

    await Dataset.pushData(salaryData);
    log.info(
        `Salary data for "${salaryData.jobTitle}":` +
        ` National avg ${salaryData.nationalAverage}`
    );
}

Employer Profile Extraction

Company profiles reveal hiring patterns, culture, and organizational data:

async function handleCompanyProfile(page, log) {
    log.info(`Processing company profile: ${page.url()}`);

    const companyData = await page.evaluate(() => {
        const name = document.querySelector(
            'h1, [class*="company-name"]'
        )?.textContent?.trim();

        const rating = parseFloat(
            document.querySelector(
                '[class*="company-rating"] [class*="value"]'
            )?.textContent
        ) || null;

        const reviewCount = parseInt(
            document.querySelector(
                '[class*="review-count"]'
            )?.textContent?.replace(/[^0-9]/g, '')
        ) || 0;

        const overview = document.querySelector(
            '[class*="company-overview"], [class*="about"]'
        )?.textContent?.trim();

        const details = {};
        const detailRows = document.querySelectorAll(
            '[class*="company-detail"] [class*="row"]'
        );
        detailRows.forEach(row => {
            const label = row.querySelector(
                '[class*="label"]'
            )?.textContent?.trim();
            const value = row.querySelector(
                '[class*="value"]'
            )?.textContent?.trim();
            if (label && value) {
                details[label.toLowerCase().replace(/\s+/g, '_')] = value;
            }
        });

        // Active job count
        const activeJobs = parseInt(
            document.querySelector(
                '[class*="job-count"], [class*="active-jobs"]'
            )?.textContent?.replace(/[^0-9]/g, '')
        ) || 0;

        return {
            type: 'company_profile',
            name,
            rating,
            reviewCount,
            overview,
            details,
            activeJobs,
            url: window.location.href,
            scrapedAt: new Date().toISOString(),
        };
    });

    await Dataset.pushData(companyData);
    log.info(`Company: ${companyData.name} | Rating: ${companyData.rating} | Jobs: ${companyData.activeJobs}`);
}

Complete Actor Configuration

Here's how to wire everything together into a complete Apify actor:

import { Actor } from 'apify';
import { PuppeteerCrawler, Dataset } from 'crawlee';

await Actor.init();

const input = await Actor.getInput() ?? {};
const {
    searchQueries = [
        { keyword: 'software engineer', location: 'Remote' }
    ],
    maxPages = 5,
    scrapeSalaryPages = true,
    scrapeCompanyProfiles = false,
} = input;

// Build start URLs from search queries
const startUrls = searchQueries.map(q => ({
    url: `https://www.ziprecruiter.com/jobs-search?search=${
        encodeURIComponent(q.keyword)
    }&location=${encodeURIComponent(q.location || '')}`,
    userData: {
        label: 'SEARCH',
        keyword: q.keyword,
        page: 1,
    },
}));

// Add salary page URLs if requested
if (scrapeSalaryPages) {
    for (const q of searchQueries) {
        const slug = q.keyword.split(' ')
            .map(w => w.charAt(0).toUpperCase() + w.slice(1))
            .join('-');
        startUrls.push({
            url: `https://www.ziprecruiter.com/Salaries/${slug}-Salary`,
            userData: { label: 'SALARY' },
        });
    }
}

const proxyConfiguration = await Actor.createProxyConfiguration({
    groups: ['RESIDENTIAL'],
});

const crawler = new PuppeteerCrawler({
    maxConcurrency: 2,
    maxRequestsPerMinute: 10,
    proxyConfiguration,
    navigationTimeoutSecs: 60,

    launchContext: {
        launchOptions: {
            headless: true,
            args: [
                '--no-sandbox',
                '--disable-blink-features=AutomationControlled',
            ],
        },
    },

    async requestHandler({ request, page, enqueueLinks, log }) {
        // Add random delay for politeness
        await new Promise(r => 
            setTimeout(r, 1500 + Math.random() * 3000)
        );

        const { label } = request.userData;

        switch (label) {
            case 'SEARCH':
                await handleSearchResults(
                    page, enqueueLinks, log, request
                );
                break;
            case 'JOB_DETAIL':
                await handleJobDetail(page, log);
                if (scrapeCompanyProfiles) {
                    const companyLink = await page.$(
                        'a[href*="/c/"][class*="company"]'
                    );
                    if (companyLink) {
                        const href = await companyLink.evaluate(
                            el => el.href
                        );
                        await enqueueLinks({
                            urls: [href],
                            userData: { label: 'COMPANY' },
                        });
                    }
                }
                break;
            case 'COMPANY':
                await handleCompanyProfile(page, log);
                break;
            case 'SALARY':
                await handleSalaryPage(page, log);
                break;
        }
    },

    async failedRequestHandler({ request, log }) {
        log.error(`Failed: ${request.url}`);
    },
});

await crawler.run(startUrls);

// Log summary statistics
const { items } = await Dataset.getData();
const jobCount = items.filter(i => i.type === 'job_detail').length;
const salaryCount = items.filter(i => i.type === 'salary_data').length;
const companyCount = items.filter(i => i.type === 'company_profile').length;

console.log('Scraping complete!');
console.log(`Jobs: ${jobCount} | Salary pages: ${salaryCount} | Companies: ${companyCount}`);

await Actor.exit();

One-Click Apply Detection

ZipRecruiter's "one-click apply" feature is a valuable signal — jobs with this option typically get more applicants, and the data about which jobs offer it is useful for job seekers and recruiters alike:

async function analyzeApplicationOptions(page) {
    return await page.evaluate(() => {
        const applyButton = document.querySelector(
            '[class*="apply"], [data-testid="apply-button"]'
        );
        const buttonText = applyButton?.textContent?.trim() || '';

        return {
            hasOneClickApply: buttonText.toLowerCase().includes('1-click')
                || buttonText.toLowerCase().includes('one click')
                || buttonText.toLowerCase().includes('easy apply'),
            hasExternalApply: !!document.querySelector(
                'a[class*="external"], [data-testid="external-apply"]'
            ),
            applicationUrl: applyButton?.closest('a')?.href
                || applyButton?.getAttribute('data-url') || null,
            applicantCount: document.querySelector(
                '[class*="applicant"], [class*="applied"]'
            )?.textContent?.trim() || null,
        };
    });
}

Data Processing and Analysis

Once you've collected the data, here's how to process it for insights:

// Post-processing: salary analysis
function analyzeSalaryTrends(jobs) {
    const salaryJobs = jobs.filter(
        j => j.type === 'job_detail' && j.salary?.min
    );

    if (salaryJobs.length === 0) return null;

    const salaries = salaryJobs.map(j => ({
        title: j.title,
        company: j.company,
        min: j.salary.min,
        max: j.salary.max || j.salary.min,
        mid: j.salary.max
            ? (j.salary.min + j.salary.max) / 2
            : j.salary.min,
        isRemote: j.workplaceType === 'remote',
        type: j.salary.type,
    }));

    // Convert hourly to annual for comparison
    const annualized = salaries.map(s => ({
        ...s,
        annualMid: s.type === 'hourly' ? s.mid * 2080 : s.mid,
    }));

    const avgSalary = annualized.reduce(
        (sum, s) => sum + s.annualMid, 0
    ) / annualized.length;

    const remoteSalaries = annualized.filter(s => s.isRemote);
    const onsiteSalaries = annualized.filter(s => !s.isRemote);

    return {
        totalJobsWithSalary: salaryJobs.length,
        averageAnnualSalary: Math.round(avgSalary),
        remoteAvg: remoteSalaries.length > 0
            ? Math.round(
                remoteSalaries.reduce(
                    (sum, s) => sum + s.annualMid, 0
                ) / remoteSalaries.length
            )
            : null,
        onsiteAvg: onsiteSalaries.length > 0
            ? Math.round(
                onsiteSalaries.reduce(
                    (sum, s) => sum + s.annualMid, 0
                ) / onsiteSalaries.length
            )
            : null,
        topPaying: annualized
            .sort((a, b) => b.annualMid - a.annualMid)
            .slice(0, 10)
            .map(s => ({
                title: s.title,
                company: s.company,
                salary: `$${s.annualMid.toLocaleString()}`,
            })),
    };
}

Handling Anti-Bot Protections

ZipRecruiter uses Cloudflare and other anti-bot protections. Key strategies:

// Browser fingerprint evasion
async function setupStealthBrowser(page) {
    // Override navigator properties
    await page.evaluateOnNewDocument(() => {
        Object.defineProperty(navigator, 'webdriver', {
            get: () => false,
        });

        // Non-empty plugins array: headless Chrome reports zero
        // plugins, which is a common bot signal
        Object.defineProperty(navigator, 'plugins', {
            get: () => [1, 2, 3, 4, 5],
        });
    });

    // Set realistic viewport with variation
    await page.setViewport({
        width: 1280 + Math.floor(Math.random() * 300),
        height: 800 + Math.floor(Math.random() * 200),
        deviceScaleFactor: 1,
    });

    // Set realistic user agent
    await page.setUserAgent(
        'Mozilla/5.0 (Windows NT 10.0; Win64; x64) ' +
        'AppleWebKit/537.36 (KHTML, like Gecko) ' +
        'Chrome/120.0.0.0 Safari/537.36'
    );
}

Practical Use Cases

1. Salary Benchmarking Tool

Build a tool that compares salaries across companies, locations, and experience levels:

// Group jobs by location and calculate salary stats
function benchmarkByLocation(jobs) {
    const byLocation = {};
    jobs.forEach(job => {
        const loc = job.location || 'Unknown';
        if (!byLocation[loc]) byLocation[loc] = [];
        if (job.salary?.min) byLocation[loc].push(job.salary);
    });

    return Object.entries(byLocation).map(([location, salaries]) => ({
        location,
        jobCount: salaries.length,
        avgMin: Math.round(
            salaries.reduce((s, sal) => s + sal.min, 0) / salaries.length
        ),
        avgMax: Math.round(
            salaries.reduce(
                (s, sal) => s + (sal.max || sal.min), 0
            ) / salaries.length
        ),
    })).sort((a, b) => b.avgMax - a.avgMax);
}

2. Skills Demand Tracker

Extract and analyze which skills appear most frequently in job listings:

function analyzeSkillsDemand(jobs) {
    const skillKeywords = [
        'python', 'javascript', 'react', 'node.js', 'aws',
        'docker', 'kubernetes', 'sql', 'typescript', 'go',
        'rust', 'java', 'c++', 'machine learning', 'ai',
        'data science', 'devops', 'cloud', 'agile', 'scrum',
    ];

    const skillCounts = {};
    const skillSalaries = {};

    jobs.forEach(job => {
        if (!job.fullDescription) return;
        const desc = job.fullDescription.toLowerCase();

        skillKeywords.forEach(skill => {
            if (desc.includes(skill)) {
                skillCounts[skill] = (skillCounts[skill] || 0) + 1;
                if (job.salary?.min) {
                    if (!skillSalaries[skill]) skillSalaries[skill] = [];
                    skillSalaries[skill].push(
                        job.salary.max 
                            ? (job.salary.min + job.salary.max) / 2 
                            : job.salary.min
                    );
                }
            }
        });
    });

    return Object.entries(skillCounts)
        .map(([skill, count]) => ({
            skill,
            demand: count,
            avgSalary: skillSalaries[skill]
                ? Math.round(
                    skillSalaries[skill].reduce((a, b) => a + b, 0) 
                    / skillSalaries[skill].length
                )
                : null,
        }))
        .sort((a, b) => b.demand - a.demand);
}

3. Hiring Trend Monitor

Track how job posting volumes change over time for specific roles:

function trackHiringTrends(jobs) {
    const byWeek = {};
    jobs.forEach(job => {
        if (!job.postedDate) return;
        const date = new Date(job.postedDate);
        const weekStart = new Date(date);
        weekStart.setDate(date.getDate() - date.getDay());
        const weekKey = weekStart.toISOString().split('T')[0];

        if (!byWeek[weekKey]) {
            byWeek[weekKey] = { total: 0, remote: 0, withSalary: 0 };
        }
        byWeek[weekKey].total++;
        if (job.workplaceType === 'remote') byWeek[weekKey].remote++;
        if (job.salary?.min) byWeek[weekKey].withSalary++;
    });

    return Object.entries(byWeek)
        .sort(([a], [b]) => a.localeCompare(b))
        .map(([week, data]) => ({
            week,
            ...data,
            remotePercentage: Math.round(
                (data.remote / data.total) * 100
            ),
        }));
}

Legal and Ethical Considerations

When scraping ZipRecruiter, follow these guidelines:

  • Respect robots.txt and rate limit your requests
  • Don't scrape personal data like applicant information or resumes
  • Use reasonable concurrency — 2-3 concurrent browsers maximum
  • Add delays between requests (2-5 seconds minimum)
  • Cache results to avoid re-scraping the same listings
  • Don't republish job listings verbatim — use the data for analysis
  • Consider their API — ZipRecruiter offers partner APIs for some use cases
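
The delay and caching guidelines above can be sketched as two small helpers — a jittered wait plus an in-memory seen-set (the names here are illustrative, not part of Crawlee or any ZipRecruiter API):

```javascript
// Jittered delay: a random wait between min and max milliseconds,
// so requests don't fire at a machine-regular cadence.
function politeDelayMs(min = 2000, max = 5000) {
    return min + Math.floor(Math.random() * (max - min));
}

// Simple in-memory dedupe cache keyed by job URL, so the same
// listing isn't re-scraped within a session.
const seen = new Set();
function shouldScrape(jobUrl) {
    if (seen.has(jobUrl)) return false;
    seen.add(jobUrl);
    return true;
}
```

For runs that span days, persist the seen-set (e.g. in a key-value store) instead of keeping it in memory.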

Conclusion

ZipRecruiter scraping gives you access to one of the most comprehensive job market datasets available. With the Puppeteer-based Apify actor approach outlined here, you can extract job listings, salary estimates, company profiles, and application data at scale.

The key to success is building a respectful scraper that handles pagination correctly, extracts structured salary data from various formats, and identifies valuable metadata like one-click apply availability and workplace type. Combined with the analysis functions shown above, you can build powerful workforce analytics tools.

Start with a focused search query, validate your extraction against a few listings manually, and then scale up. The Apify platform handles the hard infrastructure problems — proxy rotation, browser management, retries — so you can focus on extracting insights from the data.

Whether you're building a salary benchmarking tool, tracking hiring trends, or powering a job aggregation platform, this approach gives you a solid foundation for production-grade ZipRecruiter data extraction.
