Comparably has become one of the most important platforms for understanding company culture, compensation, and employee sentiment. With detailed culture scores, CEO approval ratings, salary benchmarks, and diversity metrics, Comparably offers a data-rich view into what it's really like to work at thousands of companies.
In this guide, we'll explore how to scrape Comparably for company culture data, employee reviews, salary information, and more — including practical code examples and how to scale with Apify.
Why Scrape Comparably?
Comparably's data serves multiple use cases across industries:
- Recruiting and HR: Benchmark your company culture against competitors to improve talent acquisition
- Job seekers: Aggregate and compare company data across multiple employers at once
- Investors: Assess company health through employee sentiment — a leading indicator of performance
- Academic research: Study workplace culture trends, compensation patterns, and diversity initiatives
- Competitive intelligence: Understand how rival companies are perceived by their employees
- Consulting: Provide data-backed culture assessments to enterprise clients
Where sites such as Glassdoor lean heavily on free-text reviews, Comparably structures much of its data into quantified scores and rankings, which makes it well suited to systematic extraction and analysis.
Understanding Comparably's Data Structure
Comparably organizes company information across several distinct sections, each containing valuable structured data.
Company Overview Pages
Every company profile on Comparably includes:
- Company name and logo
- Industry and company size (employee count range)
- Headquarters location
- Founded year
- Website URL
- Overall culture score (letter grade A+ through F)
- CEO name and approval rating
- Recent awards (Best Company Culture, Best CEO, etc.)
Culture Scores
Comparably breaks culture into quantifiable dimensions:
- Overall Culture Score: Aggregate rating from A+ to F
- Compensation: How fairly employees feel they're paid
- Leadership: Trust and confidence in executive team
- Work-Life Balance: Flexibility and personal time
- Professional Development: Growth opportunities and mentorship
- Perks & Benefits: Non-salary compensation quality
- Operational Efficiency: How well the company runs day-to-day
- CEO Rating: Approval percentage for the chief executive
- Diversity Score: Inclusivity rating from underrepresented groups
- Gender Score: Rating specifically from women employees
- Retention: How likely employees are to stay
Each dimension gets its own letter grade, and the scores are further broken down by department, seniority level, gender, and ethnicity.
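Letter grades are awkward to sort or average directly, so it can help to map them onto a numeric scale first. The sketch below assumes a conventional 4.3-point mapping (A+ = 4.3 down to F = 0). Both that scale and the helper names `gradeToPoints` and `sortGradesBestFirst` are illustrative choices, not something Comparably publishes.

```javascript
// Hypothetical mapping from letter grades to points (assumed 4.3-point scale).
const GRADE_POINTS = new Map([
  ['A+', 4.3], ['A', 4.0], ['A-', 3.7],
  ['B+', 3.3], ['B', 3.0], ['B-', 2.7],
  ['C+', 2.3], ['C', 2.0], ['C-', 1.7],
  ['D+', 1.3], ['D', 1.0], ['D-', 0.7],
  ['F', 0.0],
]);

// Convert a scraped grade string to points; returns null for unknown input.
function gradeToPoints(grade) {
  const key = String(grade ?? '').trim().toUpperCase();
  return GRADE_POINTS.has(key) ? GRADE_POINTS.get(key) : null;
}

// Return a copy sorted best-first, with unparseable values pushed to the end.
function sortGradesBestFirst(grades) {
  return [...grades].sort((a, b) => {
    const pointsA = gradeToPoints(a) ?? -1;
    const pointsB = gradeToPoints(b) ?? -1;
    return pointsB - pointsA;
  });
}

console.log(sortGradesBestFirst(['B', 'A+', 'C-', 'A']));
```

Converting to numbers early also makes it easier to average a dimension across departments or to chart changes over time.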
Employee Reviews
Individual reviews on Comparably follow a structured format:
- Overall rating (1-5 scale)
- Department the reviewer works in
- Job title (anonymized to level)
- Tenure at the company
- Pros and cons (free text)
- Specific dimension ratings
- Demographic information (optional, aggregated)
Salary Data
Comparably hosts detailed compensation information:
- Average salary by job title
- Salary ranges (low, median, high)
- Total compensation including bonuses and equity
- Salary satisfaction scores
- Comparisons to industry averages
- Salary by experience level
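Scraped salary figures usually arrive as display strings rather than numbers, so a small parser is useful before any comparison. This is a minimal sketch that assumes formats like "$120,000", "$95K", or "$1.2M"; the real page may format values differently, and `parseSalary` is a hypothetical helper name.

```javascript
// Parse a display string such as "$120,000" or "$95K" into a whole number.
// Returns null when no numeric value can be recovered.
function parseSalary(text) {
  const match = String(text ?? '')
    .replace(/,/g, '')                      // drop thousands separators
    .match(/(\d+(?:\.\d+)?)([kKmM])?/);     // number plus optional K/M suffix
  if (!match) return null;

  const value = parseFloat(match[1]);
  const suffix = (match[2] || '').toLowerCase();
  const multiplier = suffix === 'k' ? 1e3 : suffix === 'm' ? 1e6 : 1;
  return Math.round(value * multiplier);
}

console.log(parseSalary('$120,000'));
console.log(parseSalary('$95K'));
console.log(parseSalary('not disclosed'));
```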
Interview Questions
Comparably also collects crowdsourced interview data:
- Common interview questions by role
- Interview difficulty rating
- Interview process duration
- Offer acceptance rates
Technical Architecture of Comparably
Before building scrapers, understanding Comparably's technical setup is essential.
Page Rendering
Comparably uses a hybrid rendering approach. Some pages are server-side rendered (SSR) with initial data embedded in the HTML, while interactive elements load additional data through API calls. This means you can often extract core data with simple HTTP requests, but richer datasets may require JavaScript rendering.
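When a server-rendered page embeds its initial data, that data often sits in a JSON script tag, which is more stable to parse than visual markup. The sketch below assumes tags of type `application/json` or `application/ld+json`; whether Comparably uses either is an assumption you should confirm in the page source. The function name and the sample HTML are hypothetical.

```javascript
// Collect every JSON payload embedded in <script type="application/(ld+)json"> tags.
function extractEmbeddedJson(html) {
  const pattern =
    /<script\b[^>]*type=["']application\/(?:ld\+)?json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const results = [];
  for (const match of html.matchAll(pattern)) {
    try {
      results.push(JSON.parse(match[1]));
    } catch {
      // Skip script blocks whose content is not valid JSON
    }
  }
  return results;
}

// Made-up sample markup for illustration only
const sampleHtml = `
<html><head>
<script type="application/ld+json">{"@type":"Organization","name":"Example Co"}</script>
<script type="application/json">not json</script>
</head></html>`;

console.log(extractEmbeddedJson(sampleHtml));
```

If nothing useful is embedded, fall back to selector-based parsing or to a headless browser for the sections that load through API calls.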
URL Structure
Comparably uses clean, predictable URLs:
https://www.comparably.com/companies/{company-slug}
https://www.comparably.com/companies/{company-slug}/culture
https://www.comparably.com/companies/{company-slug}/salary
https://www.comparably.com/companies/{company-slug}/reviews
https://www.comparably.com/companies/{company-slug}/ceo
https://www.comparably.com/companies/{company-slug}/diversity
https://www.comparably.com/companies/{company-slug}/interviews
This predictable structure makes it straightforward to construct URLs programmatically.
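A small helper can turn the URL pattern above into a map of section URLs for one company. This sketch uses only the sections listed above; `buildCompanyUrls` and the `overview` key are illustrative names.

```javascript
const BASE_URL = 'https://www.comparably.com';
// Empty string stands for the overview page, which has no path suffix
const SECTIONS = ['', 'culture', 'salary', 'reviews', 'ceo', 'diversity'];

// Build a { sectionName: url } map for one company slug.
function buildCompanyUrls(companySlug, sections = SECTIONS) {
  const base = `${BASE_URL}/companies/${encodeURIComponent(companySlug)}`;
  const urls = {};
  for (const section of sections) {
    urls[section || 'overview'] = section ? `${base}/${section}` : base;
  }
  return urls;
}

console.log(buildCompanyUrls('google'));
```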
Rate Limiting and Anti-Bot Measures
Comparably implements standard protections:
- IP-based rate limiting
- CAPTCHA challenges for suspicious traffic patterns
- Cloudflare protection layer
- Session-based tracking
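Before parsing a response, it helps to check whether you received a real page or a block page. The classifier below is a heuristic sketch: the status codes and body markers it checks (a "Just a moment" title, the word "captcha", a `cf-mitigated: challenge` header) are common signs of a Cloudflare or CAPTCHA interstitial, but they are assumptions rather than anything specific to Comparably.

```javascript
// Classify a response as 'ok', 'rate_limited', 'challenge', or 'error'.
// `headers` is expected to use lower-case keys, as Node HTTP clients provide.
function classifyResponse(statusCode, body = '', headers = {}) {
  if (statusCode === 429) return 'rate_limited';

  const text = String(body);
  const looksLikeChallenge =
    headers['cf-mitigated'] === 'challenge' ||
    /<title>\s*Just a moment/i.test(text) ||
    /captcha/i.test(text);

  // Challenge pages are typically served with 403 or 503
  if ((statusCode === 403 || statusCode === 503) && looksLikeChallenge) {
    return 'challenge';
  }
  if (statusCode >= 200 && statusCode < 300) return 'ok';
  return 'error';
}

console.log(classifyResponse(403, '<title>Just a moment...</title>'));
```

A 'challenge' result is a signal to slow down or stop, not something to retry in a tight loop.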
Extracting Company Culture Scores
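Let's start with extracting the core culture data. Here's a practical approach using Node.js, with axios for HTTP requests and Cheerio for HTML parsing. The CSS selectors are illustrative, so check them against the live markup before relying on them: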
const axios = require('axios');
const cheerio = require('cheerio');

class ComparablyScraper {
  constructor() {
    this.baseUrl = 'https://www.comparably.com';
    // Browser-like headers sent with every request
    this.headers = {
      'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) ' +
        'AppleWebKit/537.36 (KHTML, like Gecko) ' +
        'Chrome/120.0.0.0 Safari/537.36',
      'Accept': 'text/html,application/xhtml+xml',
      'Accept-Language': 'en-US,en;q=0.9',
    };
  }

  // Fetch the overview page and pull out the headline fields.
  // The selectors are illustrative; verify them against the live markup.
  async getCompanyOverview(companySlug) {
    const url = `${this.baseUrl}/companies/${companySlug}`;
    const { data: html } = await axios.get(url, {
      headers: this.headers,
    });
    const $ = cheerio.load(html);

    return {
      name: $('h1.company-name').text().trim(),
      industry: $('.company-industry').text().trim(),
      size: $('.company-size').text().trim(),
      location: $('.company-location').text().trim(),
      overallScore: $('.culture-score .grade').text().trim(),
      ceoName: $('.ceo-name').text().trim(),
      ceoApproval: $('.ceo-approval-rate').text().trim(),
      website: $('a.company-website').attr('href'),
      description: $('.company-description').text().trim(),
    };
  }

  // Fetch the culture page and collect one { grade, score } entry per dimension
  async getCultureScores(companySlug) {
    const url = `${this.baseUrl}/companies/${companySlug}/culture`;
    const { data: html } = await axios.get(url, {
      headers: this.headers,
    });
    const $ = cheerio.load(html);

    const scores = {};
    $('.culture-dimension').each((_, el) => {
      const dimension = $(el).find('.dimension-name').text().trim();
      const grade = $(el).find('.dimension-grade').text().trim();
      const score = $(el).find('.dimension-score').text().trim();
      if (dimension) {
        // "Work-Life Balance" becomes the key "work-life_balance"
        scores[dimension.toLowerCase().replace(/\s+/g, '_')] = {
          grade,
          score: parseFloat(score) || null,
        };
      }
    });

    return {
      companySlug,
      overallGrade: $('.overall-culture-grade').text().trim(),
      dimensions: scores,
      totalResponses: parseInt(
        $('.response-count').text().replace(/\D/g, ''),
        10
      ) || null,
    };
  }

  // Pause between requests to stay within rate limits
  async delay(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}

// Usage
const scraper = new ComparablyScraper();
scraper.getCompanyOverview('google')
  .then(data => {
    console.log(JSON.stringify(data, null, 2));
  })
  .catch(err => {
    console.error(`Overview scrape failed: ${err.message}`);
  });
Extracting CEO Ratings and Leadership Data
CEO approval is one of Comparably's most-cited metrics. Here's how to extract it:
// Assumes axios and cheerio are required as in the previous example
async function getCEOData(companySlug) {
  const url = `https://www.comparably.com/companies/${companySlug}/ceo`;
  const { data: html } = await axios.get(url, {
    headers: {
      'User-Agent': 'Mozilla/5.0 (compatible; research bot)',
    },
  });
  const $ = cheerio.load(html);

  // Parse the numeric fields up front so missing values become null, not NaN
  const approval = parseFloat(
    $('.approval-percentage').text().replace('%', '')
  );
  const total = parseInt(
    $('.total-ratings').text().replace(/\D/g, ''),
    10
  );

  return {
    ceoName: $('.ceo-profile-name').text().trim(),
    approvalRating: Number.isNaN(approval) ? null : approval,
    totalRatings: Number.isNaN(total) ? null : total,
    sentimentBreakdown: {
      positive: $('.sentiment-positive').text().trim(),
      neutral: $('.sentiment-neutral').text().trim(),
      negative: $('.sentiment-negative').text().trim(),
    },
    topQualities: $('.ceo-quality').map((_, el) =>
      $(el).text().trim()
    ).get(),
    comparisonToIndustry: $('.industry-comparison').text().trim(),
  };
}
Scraping Salary Data
Compensation data is particularly valuable for HR benchmarking:
// Assumes axios and cheerio are required as in the first example
async function getSalaryData(companySlug) {
  const url = `https://www.comparably.com/companies/${companySlug}/salary`;
  const { data: html } = await axios.get(url, {
    headers: {
      'User-Agent': 'Mozilla/5.0 (compatible; research bot)',
    },
  });
  const $ = cheerio.load(html);

  // One entry per job title row; values are kept as display strings
  const salaries = [];
  $('.salary-row').each((_, el) => {
    salaries.push({
      jobTitle: $(el).find('.job-title').text().trim(),
      averageSalary: $(el).find('.avg-salary').text().trim(),
      salaryRange: {
        low: $(el).find('.salary-low').text().trim(),
        high: $(el).find('.salary-high').text().trim(),
      },
      totalCompensation: $(el).find('.total-comp').text().trim(),
      experienceLevel: $(el).find('.experience').text().trim(),
    });
  });

  return {
    companySlug,
    salarySatisfaction: $('.salary-satisfaction-score').text().trim(),
    averageOverallSalary: $('.company-avg-salary').text().trim(),
    comparedToMarket: $('.market-comparison').text().trim(),
    salaries,
  };
}

// Get and display salary data
getSalaryData('microsoft')
  .then(data => {
    console.log(`Salary satisfaction: ${data.salarySatisfaction}`);
    console.log(`\nFirst salaries listed for ${data.companySlug}:`);
    data.salaries.slice(0, 10).forEach(s => {
      console.log(`  ${s.jobTitle}: ${s.averageSalary} (${s.salaryRange.low} - ${s.salaryRange.high})`);
    });
  })
  .catch(err => {
    console.error(`Salary scrape failed: ${err.message}`);
  });
Extracting Diversity Metrics
Diversity data has become increasingly important for corporate accountability:
// Assumes axios and cheerio are required as in the first example
async function getDiversityData(companySlug) {
  const url = `https://www.comparably.com/companies/${companySlug}/diversity`;
  const { data: html } = await axios.get(url, {
    headers: {
      'User-Agent': 'Mozilla/5.0 (compatible; research bot)',
    },
  });
  const $ = cheerio.load(html);

  return {
    diversityScore: $('.diversity-overall-score').text().trim(),
    diversityGrade: $('.diversity-grade').text().trim(),
    genderScore: $('.gender-score').text().trim(),
    ethnicityScore: $('.ethnicity-score').text().trim(),
    dimensions: {
      equalTreatment: $('.equal-treatment-score').text().trim(),
      inclusiveEnvironment: $('.inclusive-environment-score').text().trim(),
      belongingSense: $('.belonging-score').text().trim(),
      diverseManagement: $('.diverse-management-score').text().trim(),
    },
    demographicBreakdown: {
      gender: {
        male: $('.gender-male-pct').text().trim(),
        female: $('.gender-female-pct').text().trim(),
        nonBinary: $('.gender-nb-pct').text().trim(),
      },
    },
    // Free-text themes, collected as arrays of strings
    topPositiveThemes: $('.positive-theme').map((_, el) =>
      $(el).text().trim()
    ).get(),
    areasForImprovement: $('.improvement-area').map((_, el) =>
      $(el).text().trim()
    ).get(),
  };
}
Collecting Interview Questions
Interview data helps job seekers prepare and gives companies competitive intelligence on hiring practices:
// Assumes axios and cheerio are required as in the first example
async function getInterviewData(companySlug) {
  const url = `https://www.comparably.com/companies/${companySlug}/interviews`;
  const { data: html } = await axios.get(url, {
    headers: {
      'User-Agent': 'Mozilla/5.0 (compatible; research bot)',
    },
  });
  const $ = cheerio.load(html);

  // One entry per listed interview question
  const questions = [];
  $('.interview-question').each((_, el) => {
    questions.push({
      question: $(el).find('.question-text').text().trim(),
      role: $(el).find('.question-role').text().trim(),
      department: $(el).find('.question-dept').text().trim(),
      difficulty: $(el).find('.difficulty-rating').text().trim(),
      frequency: $(el).find('.frequency-badge').text().trim(),
    });
  });

  return {
    companySlug,
    overallDifficulty: $('.interview-difficulty-score').text().trim(),
    averageDuration: $('.interview-duration').text().trim(),
    interviewExperience: {
      positive: $('.experience-positive').text().trim(),
      neutral: $('.experience-neutral').text().trim(),
      negative: $('.experience-negative').text().trim(),
    },
    questions,
  };
}
Scaling with Apify
For production-scale Comparably scraping, Apify provides the infrastructure to handle proxy rotation, scheduling, and data storage automatically.
Building a Comparably Actor
Here's how to structure a Comparably scraper as an Apify actor:
const { Actor } = require('apify');
const { CheerioCrawler } = require('crawlee');

Actor.main(async () => {
  // getInput() resolves to null when the run has no input, so fall back to {}
  const input = (await Actor.getInput()) ?? {};
  const {
    companySlugs = [],
    extractCulture = true,
    extractSalary = true,
    extractReviews = true, // accepted as input, but this sketch has no reviews handler yet
    extractDiversity = true,
    maxConcurrency = 3,
  } = input;

  const dataset = await Actor.openDataset();

  const crawler = new CheerioCrawler({
    maxConcurrency,
    requestHandlerTimeoutSecs: 60,
    proxyConfiguration: await Actor.createProxyConfiguration({
      groups: ['RESIDENTIAL'],
    }),

    async requestHandler({ request, $, log }) {
      const { companySlug, dataType } = request.userData;
      log.info(`Scraping ${dataType} for ${companySlug}`);
      const result = { companySlug, dataType };

      // Selectors are illustrative; verify them against the live pages
      switch (dataType) {
        case 'overview':
          result.data = {
            name: $('h1.company-name').text().trim(),
            overallScore: $('.culture-score .grade').text().trim(),
            ceoApproval: $('.ceo-approval-rate').text().trim(),
            industry: $('.company-industry').text().trim(),
            size: $('.company-size').text().trim(),
          };
          break;
        case 'culture': {
          const dimensions = {};
          $('.culture-dimension').each((_, el) => {
            const name = $(el).find('.dimension-name').text().trim();
            const grade = $(el).find('.dimension-grade').text().trim();
            if (name) dimensions[name] = grade;
          });
          result.data = { dimensions };
          break;
        }
        case 'salary': {
          const salaries = [];
          $('.salary-row').each((_, el) => {
            salaries.push({
              title: $(el).find('.job-title').text().trim(),
              salary: $(el).find('.avg-salary').text().trim(),
            });
          });
          result.data = { salaries };
          break;
        }
        case 'diversity':
          result.data = {
            score: $('.diversity-overall-score').text().trim(),
            grade: $('.diversity-grade').text().trim(),
          };
          break;
      }

      await dataset.pushData(result);
    },

    failedRequestHandler({ request, log }) {
      log.error(`Failed: ${request.url} - ${request.userData.dataType}`);
    },
  });

  // Build the request list: one URL per company and enabled section
  const requests = [];
  for (const slug of companySlugs) {
    const sections = [
      { path: '', type: 'overview' },
      ...(extractCulture ? [{ path: '/culture', type: 'culture' }] : []),
      ...(extractSalary ? [{ path: '/salary', type: 'salary' }] : []),
      ...(extractDiversity ? [{ path: '/diversity', type: 'diversity' }] : []),
    ];
    for (const section of sections) {
      requests.push({
        url: `https://www.comparably.com/companies/${slug}${section.path}`,
        userData: {
          companySlug: slug,
          dataType: section.type,
        },
      });
    }
  }

  await crawler.run(requests);
  // `log` only exists inside the handlers above, so use console here
  console.log('Scraping complete!');
});
Running the Actor via API
const { ApifyClient } = require('apify-client');

const client = new ApifyClient({ token: 'YOUR_TOKEN' });

// Start the actor, wait for it to finish, and return the dataset items
async function scrapeCompanies(companies) {
  const run = await client.actor('your-username/comparably-scraper').call({
    companySlugs: companies,
    extractCulture: true,
    extractSalary: true,
    extractReviews: true,
    extractDiversity: true,
  });
  const { items } = await client
    .dataset(run.defaultDatasetId)
    .listItems();
  return items;
}

// Compare tech giants
scrapeCompanies(['google', 'microsoft', 'apple', 'amazon', 'meta'])
  .then(results => {
    // Group by company, then by data type
    const byCompany = {};
    results.forEach(item => {
      if (!byCompany[item.companySlug]) {
        byCompany[item.companySlug] = {};
      }
      byCompany[item.companySlug][item.dataType] = item.data;
    });

    console.log('Company Culture Comparison:');
    Object.entries(byCompany).forEach(([company, data]) => {
      console.log(`\n${company}:`);
      console.log(`  Culture: ${data.overview?.overallScore}`);
      console.log(`  CEO: ${data.overview?.ceoApproval}`);
      console.log(`  Diversity: ${data.diversity?.grade}`);
    });
  })
  .catch(err => {
    console.error(`Actor run failed: ${err.message}`);
  });
Building a Culture Comparison Dashboard
With scraped data, you can build powerful comparison tools:
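// Letter grades from best to worst (assumed scale; adjust to the grades you actually see)
const GRADE_ORDER = [
  'A+', 'A', 'A-', 'B+', 'B', 'B-',
  'C+', 'C', 'C-', 'D+', 'D', 'D-', 'F',
];

// Position of a grade in GRADE_ORDER; unknown grades sort last
function gradeRank(grade) {
  const index = GRADE_ORDER.indexOf(grade);
  return index === -1 ? GRADE_ORDER.length : index;
}

function generateCultureReport(companiesData) {
  const report = {
    generatedAt: new Date().toISOString(),
    companies: [],
    rankings: {},
  };

  for (const [slug, data] of Object.entries(companiesData)) {
    report.companies.push({
      slug,
      name: data.overview?.name || slug,
      overallGrade: data.overview?.overallScore,
      ceoApproval: data.overview?.ceoApproval,
      diversityGrade: data.diversity?.grade,
      topSalaries: data.salary?.salaries?.slice(0, 5),
      cultureDimensions: data.culture?.dimensions,
    });
  }

  // Generate rankings per dimension
  const dimensions = ['overallGrade', 'ceoApproval', 'diversityGrade'];
  dimensions.forEach(dim => {
    report.rankings[dim] = report.companies
      .filter(c => c[dim])
      .sort((a, b) => {
        if (dim === 'ceoApproval') {
          // Percentages: higher is better
          return parseFloat(b[dim]) - parseFloat(a[dim]);
        }
        // Plain string comparison would rank "A" ahead of "A+",
        // so compare positions in the explicit grade order instead
        return gradeRank(a[dim]) - gradeRank(b[dim]);
      })
      .map((c, i) => ({
        rank: i + 1,
        company: c.name,
        value: c[dim],
      }));
  });

  return report;
}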
Use Cases and Practical Applications
HR Benchmarking
Scraping Comparably data for your industry vertical lets HR teams understand where they stand. If your diversity score is a C while competitors average B+, that's actionable intelligence for your DEI initiatives.
Investment Analysis
Employee sentiment on Comparably can act as a leading indicator of financial results. Declining culture scores may signal upcoming talent attrition, reduced productivity, and eventual earnings misses.
Employer Brand Monitoring
Track your company's scores weekly. When culture dimensions drop, you can investigate and address issues before they appear in more public forums like Glassdoor.
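Spotting a drop comes down to comparing two snapshots of the same dimensions. The sketch below assumes each snapshot is a plain object mapping dimension names to letter grades, and that grades follow the usual A+ to F ladder; `detectGradeDrops` and the snapshot shape are hypothetical.

```javascript
// Grades from worst to best, so a higher index means a better grade (assumed ladder).
const GRADE_LADDER = [
  'F', 'D-', 'D', 'D+', 'C-', 'C', 'C+',
  'B-', 'B', 'B+', 'A-', 'A', 'A+',
];

// Return the dimensions whose grade fell by at least `minSteps` between snapshots.
function detectGradeDrops(previous, current, minSteps = 1) {
  const drops = [];
  for (const [dimension, oldGrade] of Object.entries(previous)) {
    const newGrade = current[dimension];
    const oldRank = GRADE_LADDER.indexOf(oldGrade);
    const newRank = GRADE_LADDER.indexOf(newGrade);
    if (oldRank === -1 || newRank === -1) continue; // skip missing or unknown grades
    const steps = oldRank - newRank;
    if (steps >= minSteps) {
      drops.push({ dimension, from: oldGrade, to: newGrade, steps });
    }
  }
  return drops;
}

const lastWeek = { leadership: 'A-', compensation: 'B', retention: 'B+' };
const thisWeek = { leadership: 'B+', compensation: 'B', retention: 'A' };
console.log(detectGradeDrops(lastWeek, thisWeek));
```

Raising `minSteps` filters out single-step movements if weekly noise turns out to be a problem.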
Salary Market Intelligence
Comparably's salary data, when aggregated across an industry, provides clearer compensation benchmarks than individual negotiation. This is valuable for both employers setting competitive offers and candidates evaluating them.
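Aggregating across companies can be as simple as grouping numeric salaries by job title and reporting quartiles. The sketch below assumes records of the form `{ jobTitle, salary }` with `salary` already parsed to a number; the record shape and function names are illustrative.

```javascript
// Linear-interpolated percentile of an ascending-sorted array (p between 0 and 1).
function percentile(sortedValues, p) {
  if (sortedValues.length === 0) return null;
  const index = (sortedValues.length - 1) * p;
  const lower = Math.floor(index);
  const upper = Math.ceil(index);
  const weight = index - lower;
  return sortedValues[lower] * (1 - weight) + sortedValues[upper] * weight;
}

// Group salaries by job title and summarize each group with quartiles.
function summarizeSalaries(records) {
  const byTitle = new Map();
  for (const { jobTitle, salary } of records) {
    if (!Number.isFinite(salary)) continue; // ignore unparsed or missing values
    if (!byTitle.has(jobTitle)) byTitle.set(jobTitle, []);
    byTitle.get(jobTitle).push(salary);
  }

  const summary = {};
  for (const [title, values] of byTitle) {
    values.sort((a, b) => a - b);
    summary[title] = {
      count: values.length,
      p25: percentile(values, 0.25),
      median: percentile(values, 0.5),
      p75: percentile(values, 0.75),
    };
  }
  return summary;
}

// Made-up numbers for illustration only
const sampleRecords = [
  { jobTitle: 'Engineer', salary: 100000 },
  { jobTitle: 'Engineer', salary: 140000 },
  { jobTitle: 'Engineer', salary: 120000 },
  { jobTitle: 'Designer', salary: 90000 },
  { jobTitle: 'Designer', salary: NaN },
];
console.log(summarizeSalaries(sampleRecords));
```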
Ethical and Legal Considerations
Scraping Comparably — like any website — comes with responsibilities:
- Review Terms of Service: Comparably's ToS may restrict automated access. Understand the legal framework in your jurisdiction.
- Rate limit your requests: Never overwhelm the server. Use delays between requests and respect HTTP 429 responses.
- Don't scrape personal information: Focus on aggregate company data, not individual employee identities.
- Cache results: Don't re-scrape unchanged data. Culture scores update infrequently.
- Use data ethically: Don't use scraped data to identify anonymous reviewers or discriminate against companies unfairly.
- Consider official APIs: Check if Comparably offers an API or data partnership that would provide authorized access.
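The caching advice above can be sketched as a small in-memory cache with a time-to-live, wrapped around whatever fetch function you use. `TtlCache` and `cachedFetch` are hypothetical names, and a production setup would more likely persist results to disk or a database.

```javascript
// Minimal in-memory cache whose entries expire after ttlMs milliseconds.
// The clock is injectable so expiry can be tested without waiting.
class TtlCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  // Return the cached value, or undefined if absent or expired
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt > this.ttlMs) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, storedAt: this.now() });
  }
}

// Return a cached page when one is fresh; otherwise call fetchFn and store the result.
async function cachedFetch(cache, url, fetchFn) {
  const hit = cache.get(url);
  if (hit !== undefined) return hit;
  const value = await fetchFn(url);
  cache.set(url, value);
  return value;
}

// Example: keep pages for 24 hours
const pageCache = new TtlCache(24 * 60 * 60 * 1000);
```

Because culture scores update infrequently, a TTL of a day or longer is usually enough to avoid redundant requests.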
Handling Anti-Scraping Protections
When scraping at scale, you'll encounter protections. Here's how to handle them responsibly:
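const { gotScraping } = require('got-scraping');

async function resilientFetch(url, retries = 3) {
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      const response = await gotScraping({
        url,
        headerGeneratorOptions: {
          browsers: ['chrome'],
          operatingSystems: ['windows'],
          devices: ['desktop'],
        },
        timeout: { request: 30000 },
        // Return non-2xx responses instead of throwing, so the status checks below run
        throwHttpErrors: false,
        // Disable got's built-in retries so this loop controls the backoff
        retry: { limit: 0 },
      });

      if (response.statusCode === 200) {
        return response.body;
      }

      if (response.statusCode === 429) {
        // Exponential backoff: 10s, 20s, 40s, ...
        const waitTime = Math.pow(2, attempt) * 5000;
        console.log(`Rate limited. Waiting ${waitTime / 1000}s...`);
        await new Promise(r => setTimeout(r, waitTime));
        continue;
      }

      throw new Error(`HTTP ${response.statusCode}`);
    } catch (error) {
      // Network errors and unexpected statuses: retry with a linear delay
      if (attempt === retries) throw error;
      await new Promise(r => setTimeout(r, attempt * 3000));
    }
  }

  // Every attempt was rate limited
  throw new Error(`Rate limited after ${retries} attempts: ${url}`);
}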
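A server that answers with HTTP 429 may also send a `Retry-After` header, given either as a number of seconds or as an HTTP date. The helper below is a sketch that prefers that header when present and otherwise falls back to exponential backoff with random jitter. The function name, the 5-second base, and the 2-minute cap are assumptions chosen for illustration.

```javascript
// Compute how long to wait before retry number `attempt` (1-based).
// `retryAfter` is the raw Retry-After header value, if the server sent one.
function retryDelayMs(attempt, retryAfter, options = {}) {
  const {
    baseMs = 5000,
    capMs = 120000,
    random = Math.random, // injectable for deterministic tests
    now = Date.now,
  } = options;

  if (typeof retryAfter === 'string' && retryAfter.trim() !== '') {
    // Form 1: delay in seconds, e.g. "30"
    const seconds = Number(retryAfter);
    if (Number.isFinite(seconds) && seconds >= 0) {
      return Math.min(seconds * 1000, capMs);
    }
    // Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    const dateMs = Date.parse(retryAfter);
    if (!Number.isNaN(dateMs)) {
      return Math.min(Math.max(dateMs - now(), 0), capMs);
    }
  }

  // Fallback: random delay in [0, baseMs * 2^attempt), capped at capMs
  const ceiling = Math.min(baseMs * 2 ** attempt, capMs);
  return Math.floor(random() * ceiling);
}

console.log(retryDelayMs(1, '30'));
```

The jitter spreads retries out so that several workers hitting the same limit do not all retry at the same moment.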
Conclusion
Comparably offers a uniquely structured dataset for understanding company culture, compensation, and employee sentiment. Unlike platforms with primarily free-text reviews, Comparably's quantified scores and dimensional breakdowns make it ideal for systematic extraction and analysis.
Whether you're building HR benchmarking tools, conducting investment research, or monitoring your employer brand, the techniques in this guide provide a solid foundation. You can start with simple HTTP-based extraction using Cheerio, then scale to production infrastructure with Apify actors to build comprehensive culture intelligence pipelines.
The key is to focus on the data that drives decisions — culture scores for benchmarking, salary data for compensation planning, diversity metrics for DEI initiatives, and CEO ratings for leadership assessment. Combined with responsible scraping practices and proper data handling, Comparably scraping is a valuable addition to any data collection toolkit.
Remember to always respect the platform's terms of service, implement reasonable rate limiting, and use the extracted data ethically. The goal is to generate insights that help companies improve and help professionals make informed career decisions.