Why Job Market Data Matters for Builders
If you're building an HR tech product, running a staffing agency, or just trying to understand hiring trends in your industry, you need structured job market data. The problem? Most job boards don't offer public APIs, and the ones that do are expensive or heavily rate-limited.
The good news: job postings are public data. With the right tools, you can pull structured listings from multiple boards and build a real-time intelligence dashboard — no API keys required.
The Architecture
Here's what we're building:
- Data collection — Pull jobs from Indeed, WeWorkRemotely, and other boards on a schedule
- Normalization — Map different schemas into a unified format
- Storage — Push to a database or spreadsheet
- Visualization — Simple dashboard showing trends
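Before diving into each step, here's the shape of that pipeline as a runnable sketch. The collector is stubbed with sample data, and the helper names (`fetch_jobs`, `run_pipeline`) are my own, not part of any library:

```python
# End-to-end shape of the pipeline: collect -> normalize -> store.
# The collection stage is stubbed with sample records here; Steps 1-2 below
# cover the real fetching and field mapping.

def fetch_jobs():
    """Stub collector: returns (raw_record, source_name) pairs."""
    return [
        ({'title': 'Data Engineer', 'company': 'Acme'}, 'indeed'),
        ({'title': 'Backend Developer', 'company': 'Globex'}, 'weworkremotely'),
    ]

def normalize(raw, source):
    """Minimal normalizer; Step 2 shows the full field mapping."""
    return {'title': raw['title'], 'company': raw['company'], 'source': source}

def run_pipeline(store):
    """Storage stage: append normalized rows to any list-like sink."""
    for raw, source in fetch_jobs():
        store.append(normalize(raw, source))
    return store

rows = run_pipeline([])
print(len(rows))  # 2
```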
Step 1: Collecting Data from Multiple Sources
The easiest approach is to use pre-built Apify actors that output structured JSON. For example:
- Indeed Jobs Scraper — returns title, company, salary, location, and description
- WeWorkRemotely Scraper — returns remote-specific listings with category tags
You can trigger these actors via the Apify API or on a cron schedule.
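As a sketch of the API side, here's a minimal caller for Apify's `run-sync-get-dataset-items` endpoint, which starts an actor and returns its dataset items in one request. The actor ID and input fields in the usage comment are placeholders; check each actor's documentation for its actual input schema:

```python
import requests

APIFY_BASE = "https://api.apify.com/v2"

def build_run_url(actor_id: str, token: str) -> str:
    """URL for the synchronous run-and-fetch endpoint."""
    return f"{APIFY_BASE}/acts/{actor_id}/run-sync-get-dataset-items?token={token}"

def run_actor(actor_id: str, token: str, run_input: dict, timeout: int = 300) -> list:
    """Trigger an actor run and return its results as a list of dicts."""
    resp = requests.post(build_run_url(actor_id, token), json=run_input, timeout=timeout)
    resp.raise_for_status()
    return resp.json()

# Example usage (placeholder actor ID, token, and input fields):
# jobs = run_actor("someuser~indeed-scraper", "MY_APIFY_TOKEN",
#                  {"position": "python developer", "country": "US"})
```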
Step 2: Normalize the Data
Each source returns slightly different fields. Create a simple mapping:
def normalize_job(job, source):
    """Map different job board schemas to a common format."""
    if source == 'indeed':
        return {
            'title': job['title'],
            'company': job['company'],
            'location': job.get('location', 'Not specified'),
            'salary': job.get('salary'),
            'url': job['url'],
            'source': 'indeed',
            'posted': job.get('postedAt'),
        }
    elif source == 'weworkremotely':
        return {
            'title': job['title'],
            'company': job['company'],
            'location': 'Remote',
            'salary': job.get('salary'),
            'url': job['url'],
            'source': 'weworkremotely',
            'posted': job.get('date'),
        }
    raise ValueError(f"Unknown source: {source}")
Step 3: Spotting Trends
Once you have a week or two of data, the insights get interesting:
- Salary trends by role — Are Python developer salaries rising or falling in your target market?
- Demand signals — Which job titles are appearing more frequently?
- Remote ratio — What percentage of new listings are remote vs. on-site?
- Company hiring velocity — Which companies are posting the most?
You can use pandas for quick analysis, or push the data to a Google Sheet and use its built-in charting.
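For the pandas route, here's a quick sketch of the remote ratio and demand-signal queries, assuming a list of records in the normalized format from Step 2:

```python
import pandas as pd

# Sample normalized records; in practice, load your collected data here.
jobs = [
    {'title': 'Python Developer', 'location': 'Remote', 'source': 'weworkremotely'},
    {'title': 'Python Developer', 'location': 'Austin, TX', 'source': 'indeed'},
    {'title': 'Data Engineer', 'location': 'Remote', 'source': 'weworkremotely'},
]
df = pd.DataFrame(jobs)

# Remote ratio: share of listings whose location is 'Remote'
remote_ratio = (df['location'] == 'Remote').mean()

# Demand signal: which titles appear most often
top_titles = df['title'].value_counts()

print(f"Remote ratio: {remote_ratio:.0%}")
print(top_titles.head())
```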
Step 4: Making It Actionable
The real value comes when you schedule daily collection runs and track changes over time. Some ideas:
- For recruiters: Get alerts when a target company posts a new role
- For job seekers: Track salary ranges for your target title across multiple boards
- For HR tech builders: Feed this data into your product as a competitive intelligence layer
- For investors: Monitor hiring velocity as a signal for company growth
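The recruiter alert above is just a diff between two daily snapshots. A minimal sketch, assuming normalized records keyed by `url` and a hypothetical `new_postings` helper of my own naming:

```python
def new_postings(today, yesterday, watched_companies=None):
    """Return listings that appeared since the last run, optionally
    filtered to a set of target companies."""
    seen = {job['url'] for job in yesterday}
    fresh = [job for job in today if job['url'] not in seen]
    if watched_companies:
        fresh = [job for job in fresh if job['company'] in watched_companies]
    return fresh

yesterday = [{'url': 'u1', 'company': 'Acme'}]
today = [{'url': 'u1', 'company': 'Acme'}, {'url': 'u2', 'company': 'Globex'}]
alerts = new_postings(today, yesterday, watched_companies={'Globex'})
print(alerts)  # [{'url': 'u2', 'company': 'Globex'}]
```

Wire the result into email or Slack and you have a daily alert feed for target companies.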
Getting Started
The fastest path to a working dashboard:
- Sign up for a free Apify account
- Run the Indeed Jobs Scraper and WeWorkRemotely Scraper with your target keywords
- Export the JSON results and load them into pandas or a spreadsheet
- Schedule daily runs to build a time series
The whole setup takes about 30 minutes, and you'll have a job market intelligence feed of the kind many HR analytics companies charge hundreds of dollars per month for.
What job market data are you tracking? Drop a comment if you've built something similar — I'd love to compare approaches.