Every company's hiring page is a window into their strategy. When a competitor starts hiring five machine learning engineers, they're building an AI product. When a startup posts for enterprise sales reps across three new cities, they're expanding territory. When a company you sell to posts for a "Procurement Manager," there's budget being allocated.
Job posting intelligence is one of the most underused signals in B2B sales and competitive analysis. In this guide, I'll show you how to build an automated system that monitors job postings across LinkedIn, Indeed, and ZipRecruiter — turning raw hiring data into actionable business intelligence.
Why Job Postings Are Strategic Gold
Before we dive into the technical implementation, let's understand why job data matters:
1. Growth Signals
A company posting 50+ roles in a quarter is growing. If those roles are in engineering, they're building. If they're in sales, they're scaling revenue. If they're in compliance, they might be preparing for an IPO or regulatory change.
2. Technology Stack Reveals
Job descriptions are remarkably specific about technology. A posting for a "Senior Kubernetes Engineer with Terraform experience" tells you exactly what infrastructure they're running. This is invaluable for selling developer tools, cloud services, or consulting.
3. Budget Indicators
Hiring is expensive. A company posting for senior roles with competitive salaries has budget. Companies that are cutting back reduce postings first — often before any public announcement.
4. Timing Signals
The timing of postings reveals urgency. A role reposted three times in two months? They're struggling to fill it and might be open to contractor solutions or tooling that reduces the need for that hire.
Architecture Overview
Here's what we're building:
┌─────────────┐ ┌─────────────┐ ┌──────────────┐
│ LinkedIn │ │ Indeed │ │ ZipRecruiter │
│ Jobs API │ │ Scraper │ │ Scraper │
└──────┬──────┘ └──────┬──────┘ └──────┬───────┘
│ │ │
└──────────┬───────┴───────────────────┘
│
┌───────▼────────┐
│ Data Pipeline │
│ (Normalize + │
│ Deduplicate) │
└───────┬────────┘
│
┌───────▼────────┐
│ Signal Engine │
│ (Detect hiring │
│ patterns) │
└───────┬────────┘
│
┌──────────┼──────────┐
│ │ │
┌────▼───┐ ┌───▼────┐ ┌───▼────┐
│ CRM │ │ Alerts │ │Dashboard│
│ Update │ │ System │ │ & API │
└────────┘ └────────┘ └────────┘
Step 1: Collecting Job Posting Data
The foundation of any intelligence system is reliable data collection. We'll use Apify actors to scrape job boards at scale without managing infrastructure.
LinkedIn Jobs Collection
LinkedIn is the richest source for B2B hiring intelligence. Here's how to set up automated collection using the LinkedIn Jobs Scraper actor on Apify:
import { ApifyClient } from 'apify-client';
const client = new ApifyClient({
token: process.env.APIFY_TOKEN,
});
async function scrapeLinkedInJobs(companies) {
const results = [];
for (const company of companies) {
const run = await client.actor('hMvNSpz3JnHgl5jkh').call({
searchUrl: `https://www.linkedin.com/jobs/search/?keywords=${encodeURIComponent(company)}&location=United%20States&f_TPR=r604800`,
maxItems: 100,
proxy: {
useApifyProxy: true,
apifyProxyGroups: ['RESIDENTIAL'],
},
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
results.push({
company,
jobs: items,
scrapedAt: new Date().toISOString(),
});
}
return results;
}
// Track competitors and key accounts
const watchList = [
'Salesforce',
'HubSpot',
'Snowflake',
'Databricks',
'Stripe',
];
const data = await scrapeLinkedInJobs(watchList);
console.log(`Collected ${data.reduce((sum, d) => sum + d.jobs.length, 0)} job postings`);
Indeed Jobs Collection
Indeed provides broader coverage including roles that never appear on LinkedIn, particularly in mid-market and SMB companies:
async function scrapeIndeedJobs(queries) {
const allResults = [];
for (const query of queries) {
const run = await client.actor('misceres/indeed-scraper').call({
position: query.title,
country: 'US',
location: query.location || '',
maxItems: 200,
parseCompanyDetails: true,
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
allResults.push({
query: query.title,
results: items.map(item => ({
title: item.positionName,
company: item.company,
location: item.location,
description: item.description,
salary: item.salary,
postedAt: item.postedAt,
url: item.url,
})),
});
}
return allResults;
}
// Search for signals that indicate tool/service needs
const queries = [
{ title: 'DevOps Engineer', location: 'Remote' },
{ title: 'Data Engineer Snowflake', location: '' },
{ title: 'Salesforce Administrator', location: '' },
{ title: 'Head of Procurement', location: '' },
];
ZipRecruiter for SMB Coverage
ZipRecruiter captures small and medium businesses that often don't post on LinkedIn:
async function scrapeZipRecruiter(searchTerms) {
const run = await client.actor('epctex/ziprecruiter-scraper').call({
search: searchTerms,
location: 'United States',
maxItems: 150,
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
return items;
}
Step 2: Data Normalization Pipeline
Raw job data from different sources has different schemas. We need a unified format:
function normalizeJobPosting(raw, source) {
return {
id: generateId(raw.company, raw.title, source),
title: cleanTitle(raw.title || raw.positionName),
company: normalizeCompanyName(raw.company),
location: normalizeLocation(raw.location),
description: raw.description || '',
salary: parseSalary(raw.salary || raw.compensation),
source: source,
postedAt: parseDate(raw.postedAt || raw.date),
scrapedAt: new Date().toISOString(),
url: raw.url || raw.link,
// Extracted signals
technologies: extractTechnologies(raw.description),
seniorityLevel: detectSeniority(raw.title),
department: classifyDepartment(raw.title),
isRemote: detectRemote(raw.location, raw.description),
};
}
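The normalizer above leans on a few helpers (generateId, normalizeCompanyName, parseSalary, and friends) that aren't shown. Here are minimal sketches of three of them — treat these as illustrative assumptions, not the only reasonable implementations:

```javascript
import { createHash } from 'node:crypto';

// Stable ID so the same posting hashes identically across runs
function generateId(company, title, source) {
  return createHash('sha1')
    .update(`${company}|${title}|${source}`.toLowerCase())
    .digest('hex')
    .slice(0, 16);
}

// Strip common legal suffixes so "Stripe, Inc." and "Stripe" match
function normalizeCompanyName(name) {
  return (name || '')
    .replace(/,?\s*(inc\.?|llc|ltd\.?|corp\.?|gmbh)$/i, '')
    .trim();
}

// Pull a { min, max } range out of strings like "$120,000 - $150,000 a year"
function parseSalary(raw) {
  if (!raw) return null;
  const numbers = [...raw.matchAll(/\$?([\d,]+)(k)?/gi)]
    .map(m => parseInt(m[1].replace(/,/g, ''), 10) * (m[2] ? 1000 : 1))
    .filter(n => n >= 1000); // ignore stray small numbers like "401(k)"
  if (numbers.length === 0) return null;
  return { min: Math.min(...numbers), max: Math.max(...numbers) };
}
```

The cleanTitle, normalizeLocation, parseDate, and detectRemote helpers follow the same pattern: defensive defaults plus a handful of regexes tuned to the formats each job board actually emits.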
function extractTechnologies(description) {
if (!description) return [];
const techPatterns = {
cloud: ['AWS', 'Azure', 'GCP', 'Google Cloud'],
databases: ['PostgreSQL', 'MongoDB', 'Redis', 'Snowflake', 'BigQuery'],
languages: ['Python', 'JavaScript', 'TypeScript', 'Go', 'Rust', 'Java'],
frameworks: ['React', 'Next.js', 'Django', 'FastAPI', 'Spring Boot'],
devops: ['Kubernetes', 'Docker', 'Terraform', 'Jenkins', 'GitHub Actions'],
data: ['Spark', 'Airflow', 'dbt', 'Kafka', 'Flink'],
crm: ['Salesforce', 'HubSpot', 'Dynamics 365'],
};
const found = [];
for (const [category, techs] of Object.entries(techPatterns)) {
for (const tech of techs) {
// Word-boundary match so "Go" doesn't hit "Google" and "Java" doesn't hit "JavaScript"
const pattern = new RegExp(`\\b${tech.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}\\b`, 'i');
if (pattern.test(description)) {
found.push({ name: tech, category });
}
}
}
return found;
}
function detectSeniority(title) {
const titleLower = title.toLowerCase();
if (titleLower.match(/\b(vp|vice president|director|head of|chief)\b/)) return 'executive';
if (titleLower.match(/\b(senior|sr\.?|lead|principal|staff)\b/)) return 'senior';
if (titleLower.match(/\b(junior|jr\.?|associate|entry)\b/)) return 'junior';
if (titleLower.match(/\b(intern|internship|co-op)\b/)) return 'intern';
return 'mid';
}
function classifyDepartment(title) {
const titleLower = title.toLowerCase();
// Order matters: the first matching pattern wins, so the more specific
// data patterns come before the broad engineering one; otherwise
// "Data Engineer" would match /engineer/ and never be classified as data.
const deptMap = {
data: /data scientist|data engineer|analytics|ml engineer|machine learning/,
engineering: /engineer|developer|architect|devops|sre|platform/,
sales: /sales|account executive|business development|sdr|bdr/,
marketing: /marketing|growth|content|seo|brand/,
product: /product manager|product owner|ux|designer/,
operations: /operations|procurement|supply chain|logistics/,
finance: /finance|accounting|controller|treasury/,
hr: /recruiter|people|talent|human resources/,
};
for (const [dept, pattern] of Object.entries(deptMap)) {
if (titleLower.match(pattern)) return dept;
}
return 'other';
}
Deduplication
The same job often appears across multiple platforms. We need smart deduplication:
function deduplicateJobs(jobs) {
const seen = new Map();
for (const job of jobs) {
// Create a fuzzy key based on company + title + location
const key = [
job.company.toLowerCase().replace(/[^a-z0-9]/g, ''),
job.title.toLowerCase().replace(/[^a-z0-9]/g, '').substring(0, 30),
job.location.toLowerCase().split(',')[0].trim(),
].join('|');
if (seen.has(key)) {
// Keep the one with more data
const existing = seen.get(key);
if ((job.description?.length || 0) > (existing.description?.length || 0)) {
job.sources = [...(existing.sources || [existing.source]), job.source];
seen.set(key, job);
} else {
existing.sources = [...(existing.sources || [existing.source]), job.source];
}
} else {
seen.set(key, job);
}
}
return Array.from(seen.values());
}
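To see why this catches cross-platform duplicates, here is the key construction pulled out on its own (the same logic as above, standalone for illustration):

```javascript
// Same fuzzy key as in deduplicateJobs: normalized company, truncated
// title, and first location segment, so cosmetic differences collapse
function jobKey(job) {
  return [
    job.company.toLowerCase().replace(/[^a-z0-9]/g, ''),
    job.title.toLowerCase().replace(/[^a-z0-9]/g, '').substring(0, 30),
    job.location.toLowerCase().split(',')[0].trim(),
  ].join('|');
}

// The same role scraped from two boards with cosmetic differences...
const fromLinkedIn = { company: 'Stripe, Inc.', title: 'Sr. Data Engineer', location: 'Austin, TX' };
const fromIndeed = { company: 'Stripe Inc', title: 'Sr Data Engineer', location: 'Austin, Texas' };

// ...collapses to one key: "stripeinc|srdataengineer|austin"
console.log(jobKey(fromLinkedIn) === jobKey(fromIndeed)); // true
```

The 30-character title truncation is a deliberate trade-off: it merges postings that differ only in a long trailing qualifier, at the cost of occasionally merging genuinely distinct roles with identical prefixes.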
Step 3: The Signal Engine
This is where raw data becomes intelligence. We're looking for patterns that indicate opportunity:
class HiringSignalEngine {
constructor(historicalData) {
this.history = historicalData; // Previous scrape results
}
analyzeCompany(company, currentJobs) {
const signals = [];
const previousJobs = this.history.filter(j => j.company === company);
const previousCount = previousJobs.length;
const currentCount = currentJobs.length;
// Signal 1: Hiring Surge (skip companies with no prior postings to avoid
// dividing by zero in the percentage below)
if (previousCount > 0 && currentCount > previousCount * 1.5 && currentCount > 5) {
signals.push({
type: 'HIRING_SURGE',
severity: 'high',
message: `${company} increased postings by ${Math.round(((currentCount - previousCount) / previousCount) * 100)}% (${previousCount} → ${currentCount})`,
actionable: true,
});
}
// Signal 2: New Department
const prevDepts = new Set(previousJobs.map(j => j.department));
const currDepts = new Set(currentJobs.map(j => j.department));
const newDepts = [...currDepts].filter(d => !prevDepts.has(d));
if (newDepts.length > 0) {
signals.push({
type: 'NEW_DEPARTMENT',
severity: 'medium',
message: `${company} is hiring in new departments: ${newDepts.join(', ')}`,
departments: newDepts,
});
}
// Signal 3: Executive Hiring
const execRoles = currentJobs.filter(j => j.seniorityLevel === 'executive');
if (execRoles.length > 0) {
signals.push({
type: 'EXECUTIVE_HIRE',
severity: 'high',
message: `${company} is hiring ${execRoles.length} executive roles: ${execRoles.map(j => j.title).join(', ')}`,
roles: execRoles,
});
}
// Signal 4: Technology Shift
const prevTech = new Set(previousJobs.flatMap(j => j.technologies.map(t => t.name)));
const currTech = new Set(currentJobs.flatMap(j => j.technologies.map(t => t.name)));
const newTech = [...currTech].filter(t => !prevTech.has(t));
if (newTech.length >= 2) {
signals.push({
type: 'TECH_SHIFT',
severity: 'medium',
message: `${company} is adopting new technologies: ${newTech.join(', ')}`,
technologies: newTech,
});
}
// Signal 5: Hiring Freeze (inverse signal)
if (currentCount < previousCount * 0.5 && previousCount > 10) {
signals.push({
type: 'HIRING_SLOWDOWN',
severity: 'low',
message: `${company} reduced postings by ${Math.round(((previousCount - currentCount) / previousCount) * 100)}%`,
actionable: false,
});
}
return signals;
}
generateWeeklyReport(companies) {
const report = {
generatedAt: new Date().toISOString(),
totalCompanies: Object.keys(companies).length,
signals: [],
};
for (const [company, jobs] of Object.entries(companies)) {
const companySignals = this.analyzeCompany(company, jobs);
if (companySignals.length > 0) {
report.signals.push({
company,
jobCount: jobs.length,
signals: companySignals,
});
}
}
// Sort by signal severity
report.signals.sort((a, b) => {
const severityOrder = { high: 0, medium: 1, low: 2 };
// Lower order value means higher severity, so the minimum is each
// company's most severe signal
const aTop = Math.min(...a.signals.map(s => severityOrder[s.severity]));
const bTop = Math.min(...b.signals.map(s => severityOrder[s.severity]));
return aTop - bTop;
});
return report;
}
}
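One pattern the engine above doesn't yet cover is the reposted role from earlier (a posting that reappears three times in two months). A sketch of such a signal might look like this — the scan-history shape, threshold, and field names are assumptions to adapt to your own storage:

```javascript
// Flag roles that keep reappearing across scans, a sign the company is
// struggling to fill them. `history` is assumed to be an array of past
// scans, each an array of normalized jobs with { company, title }.
function detectReposts(company, history, minOccurrences = 3) {
  const counts = new Map();
  for (const scan of history) {
    // Dedupe within a scan so one scan can only count a title once
    const titlesThisScan = new Set(
      scan
        .filter(j => j.company === company)
        .map(j => j.title.toLowerCase().trim())
    );
    for (const title of titlesThisScan) {
      counts.set(title, (counts.get(title) || 0) + 1);
    }
  }
  return [...counts.entries()]
    .filter(([, n]) => n >= minOccurrences)
    .map(([title, occurrences]) => ({
      type: 'REPEATED_POSTING',
      severity: 'medium',
      message: `${company} has posted "${title}" in ${occurrences} separate scans`,
    }));
}
```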
Step 4: CRM Integration
Intelligence is only valuable if it reaches the right people at the right time. Here's how to push signals into your CRM:
HubSpot Integration
import Hubspot from '@hubspot/api-client';
const hubspot = new Hubspot.Client({ accessToken: process.env.HUBSPOT_TOKEN });
async function pushSignalToHubspot(signal) {
// Search for the company in HubSpot
const searchResponse = await hubspot.crm.companies.searchApi.doSearch({
filterGroups: [{
filters: [{
propertyName: 'name',
operator: 'CONTAINS_TOKEN',
value: signal.company,
}],
}],
});
if (searchResponse.results.length === 0) {
console.log(`Company ${signal.company} not found in HubSpot, skipping`);
return;
}
const companyId = searchResponse.results[0].id;
// Update company properties
await hubspot.crm.companies.basicApi.update(companyId, {
properties: {
hiring_signal_type: signal.signals.map(s => s.type).join(', '),
hiring_signal_date: new Date().toISOString().split('T')[0],
active_job_count: signal.jobCount.toString(),
hiring_signal_detail: signal.signals.map(s => s.message).join('\n'),
},
});
// Create a note for the sales team
if (signal.signals.some(s => s.severity === 'high')) {
await hubspot.crm.objects.notesApi.create({
properties: {
hs_note_body: formatSignalNote(signal),
hs_timestamp: Date.now(),
},
associations: [{
to: { id: companyId },
types: [{ associationCategory: 'HUBSPOT_DEFINED', associationTypeId: 190 }],
}],
});
}
}
function formatSignalNote(signal) {
let note = `🎯 **Hiring Intelligence Alert** — ${signal.company}\n\n`;
note += `Active Postings: ${signal.jobCount}\n\n`;
for (const s of signal.signals) {
const icon = s.severity === 'high' ? '🔴' : s.severity === 'medium' ? '🟡' : '🟢';
note += `${icon} **${s.type}**: ${s.message}\n`;
}
note += `\n---\n_Auto-generated by Job Posting Intelligence System_`;
return note;
}
Salesforce Integration
import jsforce from 'jsforce';
async function pushToSalesforce(signal) {
const conn = new jsforce.Connection({
loginUrl: process.env.SF_LOGIN_URL,
});
await conn.login(process.env.SF_USERNAME, process.env.SF_PASSWORD + process.env.SF_TOKEN);
// Find the account (escape quotes so the company name can't break or inject into the SOQL string)
const safeName = signal.company.replace(/(['\\])/g, '\\$1');
const accounts = await conn.query(
`SELECT Id, Name FROM Account WHERE Name LIKE '%${safeName}%' LIMIT 1`
);
if (accounts.records.length === 0) return;
const accountId = accounts.records[0].Id;
// Create a Task for the account owner
if (signal.signals.some(s => s.actionable)) {
await conn.sobject('Task').create({
Subject: `Hiring Signal: ${signal.signals[0].type} at ${signal.company}`,
Description: signal.signals.map(s => s.message).join('\n'),
WhatId: accountId,
Priority: signal.signals.some(s => s.severity === 'high') ? 'High' : 'Normal',
Status: 'Not Started',
ActivityDate: new Date(Date.now() + 3 * 86400000).toISOString().split('T')[0],
});
}
}
Step 5: Scheduling and Automation
Set up the entire pipeline to run automatically on Apify's scheduling system:
// main.js — Apify Actor that orchestrates the full pipeline
import { Actor } from 'apify';
await Actor.init();
const input = await Actor.getInput();
const {
watchList = [],
hubspotToken,
slackWebhook,
schedule = 'weekly',
} = input;
// 1. Collect from all sources
console.log('Collecting job postings...');
const linkedInJobs = await scrapeLinkedInJobs(watchList);
const indeedJobs = await scrapeIndeedJobs(
watchList.map(c => ({ title: c, location: '' }))
);
// 2. Normalize and deduplicate
const allJobs = [
...linkedInJobs.flatMap(r => r.jobs.map(j => normalizeJobPosting(j, 'linkedin'))),
...indeedJobs.flatMap(r => r.results.map(j => normalizeJobPosting(j, 'indeed'))),
];
const uniqueJobs = deduplicateJobs(allJobs);
console.log(`${uniqueJobs.length} unique postings after dedup`);
// 3. Load historical data and generate signals
const store = await Actor.openKeyValueStore('job-intelligence-history');
const history = (await store.getValue('previous-scan')) || [];
const engine = new HiringSignalEngine(history);
// Group by company
const byCompany = {};
for (const job of uniqueJobs) {
if (!byCompany[job.company]) byCompany[job.company] = [];
byCompany[job.company].push(job);
}
const report = engine.generateWeeklyReport(byCompany);
// 4. Push to CRM and notify
for (const signal of report.signals) {
if (hubspotToken) await pushSignalToHubspot(signal);
if (slackWebhook) await notifySlack(slackWebhook, signal);
}
// 5. Save current scan as history for next run
await store.setValue('previous-scan', uniqueJobs);
await Actor.pushData(report);
console.log(`Generated ${report.signals.length} signals for ${report.totalCompanies} companies`);
await Actor.exit();
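The orchestrator calls a notifySlack helper that isn't defined above. A minimal version built on Slack's incoming-webhook payload format might look like this (the message layout is my own assumption):

```javascript
// Render one company's signals as Slack mrkdwn text
function formatSlackMessage(signal) {
  const lines = signal.signals.map(s => {
    const icon = s.severity === 'high' ? ':red_circle:'
      : s.severity === 'medium' ? ':large_yellow_circle:'
      : ':large_green_circle:';
    return `${icon} *${s.type}*: ${s.message}`;
  });
  return `*Hiring signals for ${signal.company}* (${signal.jobCount} active postings)\n${lines.join('\n')}`;
}

// Incoming webhooks accept a simple { text } JSON payload via POST
async function notifySlack(webhookUrl, signal) {
  const res = await fetch(webhookUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text: formatSlackMessage(signal) }),
  });
  if (!res.ok) {
    console.error(`Slack notification failed: ${res.status}`);
  }
}
```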
Practical Use Cases
For Sales Teams
Track your ICP companies. When a target account starts hiring for roles related to your product category, that's an intent signal stronger than any website visit. A company hiring three "Data Engineers with Snowflake experience" is probably about to expand their data stack — perfect timing to reach out about complementary tooling.
For Competitive Intelligence
Monitor your competitors' postings weekly. If your main competitor starts aggressively hiring in a new city or for a new product category, you'll know about it weeks before any press release.
For Investors and Analysts
Job posting velocity is a leading indicator of company health. Track portfolio companies, potential investments, or public companies for signals that precede earnings reports.
For Recruiting Firms
Know which companies are struggling to fill roles (reposted 3+ times) — those are warm leads for staffing agencies and recruiting services.
Cost Considerations
Running this system on Apify is surprisingly affordable:
- LinkedIn Jobs scraper: ~$5-10/run for 500 postings
- Indeed scraper: ~$3-5/run for 500 postings
- Weekly schedule for 50 companies: ~$50-80/month
- Apify platform free tier: 100 actor runs/month included
Compare that to commercial job intelligence platforms like Thinknum or Revelio Labs, which charge $10,000-50,000/year. Building your own gives you the same data at a fraction of the cost with full customization.
Next Steps
- Start small: Pick 10 companies you care about and run weekly scrapes
- Build your signal library: Add custom signals relevant to your industry
- Connect to your workflow: CRM, Slack, email — wherever your team lives
- Iterate on accuracy: Track which signals actually correlate with deals closed
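For that last step, even a crude per-type precision count tells you which signals to keep. A sketch, assuming you log one outcome record per signal surfaced to the team (the record shape here is hypothetical):

```javascript
// outcomes: [{ signalType, dealClosed }], one record per surfaced signal,
// annotated after the fact with whether the deal closed
function signalPrecision(outcomes) {
  const stats = {};
  for (const { signalType, dealClosed } of outcomes) {
    stats[signalType] ??= { fired: 0, converted: 0 };
    stats[signalType].fired += 1;
    if (dealClosed) stats[signalType].converted += 1;
  }
  // Attach a precision ratio per signal type
  return Object.fromEntries(
    Object.entries(stats).map(([type, s]) => [
      type,
      { ...s, precision: s.converted / s.fired },
    ])
  );
}
```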
Job posting intelligence won't replace your sales intuition, but it will make sure you never miss a signal hiding in plain sight. The companies that systematically monitor hiring data will consistently outperform those relying on gut feel and Google Alerts.
Want to try this yourself? Check out the Apify marketplace for ready-to-use job scraping actors, or build your own with the Apify SDK.