Remote job boards post thousands of listings daily across dozens of platforms. Manually checking LinkedIn, Remote.co, We Work Remotely, Remotive, and AngelList is a full-time job. Here's how to automate the aggregation.
Why build a remote jobs aggregator?
The same job is often posted on 3-7 different boards with different salary ranges, different application deadlines, and slightly different requirements. An aggregator lets you:
- Deduplicate across boards (same role, same company)
- Set custom alerts (Python developer, $120K+, async-first)
- Track which companies are growing (consistent hiring = healthy)
- Build lead lists (companies hiring = companies with budget)
The architecture
Job boards → Scrapers → Deduplication → Database → Alerts/API
Each board needs a separate scraper since they all use different HTML structures and anti-bot approaches.
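Because each scraper returns slightly different fields, it helps to normalize everything into one common record before deduplication. Here's a minimal sketch of that schema; the field names are illustrative, not taken from any board's API:

```python
from dataclasses import dataclass, asdict

@dataclass
class JobPosting:
    # Minimal common schema shared by all scrapers
    title: str
    company: str
    url: str
    source: str           # which board the posting came from
    location: str = "Remote"
    salary: str = ""
    posted: str = ""

def normalize(raw: dict, source: str) -> JobPosting:
    """Map a board-specific dict onto the common schema."""
    return JobPosting(
        title=raw.get("title", "") or "",
        company=raw.get("company", "") or "",
        url=raw.get("url", raw.get("link", "")) or "",
        source=source,
        salary=raw.get("salary", "") or "",
        posted=raw.get("posted", raw.get("published", "")) or "",
    )
```

Downstream steps (dedup, storage, alerts) then only ever deal with one shape, no matter how many boards you add.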
Board-by-board approach
LinkedIn Jobs (largest volume)
LinkedIn limits anonymous access but their job search API is partially accessible:
```python
import requests
from bs4 import BeautifulSoup

def _text(card, selector: str):
    """Safely extract stripped text from the first match, or None."""
    el = card.select_one(selector)
    return el.get_text(strip=True) if el else None

def scrape_linkedin_jobs(keywords: str, location: str = "Remote", count: int = 25) -> list:
    url = "https://www.linkedin.com/jobs-guest/jobs/api/seeMoreJobPostings/search"
    params = {
        "keywords": keywords,
        "location": location,
        "start": 0,
        "count": count,
        "f_WT": "2",  # Remote only
    }
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/122.0.0.0",
    }
    response = requests.get(url, params=params, headers=headers)
    if response.status_code != 200:
        return []
    # The guest endpoint returns HTML fragments, so parse them with BeautifulSoup
    soup = BeautifulSoup(response.text, "html.parser")
    jobs = []
    for card in soup.select(".job-search-card"):
        time_el = card.select_one("time")
        link_el = card.select_one("a")
        jobs.append({
            "title": _text(card, ".job-search-card__title"),
            "company": _text(card, ".job-search-card__company-name"),
            "location": _text(card, ".job-search-card__location"),
            "posted": time_el.get("datetime") if time_el else None,
            "url": link_el.get("href") if link_el else None,
        })
    return jobs

jobs = scrape_linkedin_jobs("python developer")
print(f"Found {len(jobs)} LinkedIn jobs")
```
We Work Remotely (cleaner HTML, no auth needed)
```python
import feedparser

def scrape_wwr(category: str = "programming") -> list:
    url = f"https://weworkremotely.com/categories/remote-{category}-jobs.rss"
    feed = feedparser.parse(url)
    return [
        {
            "title": entry.title,
            "company": entry.get("company", ""),
            "url": entry.link,
            "published": entry.get("published", ""),
            "description": entry.get("summary", "")[:500],
        }
        for entry in feed.entries
    ]

wwr_jobs = scrape_wwr("programming")
print(f"Found {len(wwr_jobs)} WWR jobs")
```
Remotive (has an official free API)
```python
import requests

def get_remotive_jobs(category: str = "software-dev") -> list:
    response = requests.get(
        "https://remotive.com/api/remote-jobs",
        params={"category": category, "limit": 50},
    )
    if response.status_code == 200:
        return response.json().get("jobs", [])
    return []
```
AngelList/Wellfound (requires auth)
AngelList's job data requires a session. Use Playwright:
```python
import asyncio
from playwright.async_api import async_playwright

async def scrape_wellfound_jobs(keywords: str) -> list:
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()
        await page.goto(f"https://wellfound.com/jobs?q={keywords}&remote=true")
        await page.wait_for_selector("[data-test='JobListItem']", timeout=10000)
        # Extract fields in the page context; ?. is valid optional chaining in JS
        jobs = await page.evaluate("""
            () => Array.from(document.querySelectorAll('[data-test="JobListItem"]')).map(el => ({
                title: el.querySelector('[data-test="job-title"]')?.innerText,
                company: el.querySelector('[data-test="company-name"]')?.innerText,
                salary: el.querySelector('[data-test="salary"]')?.innerText,
            }))
        """)
        await browser.close()
        return jobs

wellfound_jobs = asyncio.run(scrape_wellfound_jobs("python developer"))
```
Deduplication logic
The same role appears on multiple boards. Deduplicate by company + title similarity:
```python
from difflib import SequenceMatcher

def deduplicate_jobs(jobs: list) -> list:
    unique = []
    for job in jobs:
        is_duplicate = False
        key = f"{job.get('company', '').lower()} {job.get('title', '').lower()}"
        for existing in unique:
            existing_key = f"{existing.get('company', '').lower()} {existing.get('title', '').lower()}"
            similarity = SequenceMatcher(None, key, existing_key).ratio()
            if similarity > 0.85:  # 85% similar = same job
                is_duplicate = True
                break
        if not is_duplicate:
            unique.append(job)
    return unique

# Aggregate and deduplicate
all_jobs = (
    scrape_linkedin_jobs("python developer")
    + scrape_wwr("programming")
    + get_remotive_jobs("software-dev")
)
unique_jobs = deduplicate_jobs(all_jobs)
print(f"Total: {len(all_jobs)} | After dedup: {len(unique_jobs)}")
```
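The "Database" stage of the pipeline can be as simple as SQLite with the job URL as the primary key, which also gives you free dedup across runs. A sketch (the table layout is an assumption, not a fixed schema):

```python
import sqlite3

def save_jobs(jobs: list, db_path: str = "jobs.db") -> int:
    """Insert jobs, skipping URLs already stored. Returns the number of new rows."""
    conn = sqlite3.connect(db_path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS jobs (
            url TEXT PRIMARY KEY,
            title TEXT,
            company TEXT,
            first_seen TEXT DEFAULT CURRENT_TIMESTAMP
        )
    """)
    new_rows = 0
    for job in jobs:
        cur = conn.execute(
            "INSERT OR IGNORE INTO jobs (url, title, company) VALUES (?, ?, ?)",
            (job.get("url", ""), job.get("title", ""), job.get("company", "")),
        )
        new_rows += cur.rowcount  # 0 when the URL was already present
    conn.commit()
    conn.close()
    return new_rows
```

On scheduled runs, a nonzero return value is exactly the "new jobs since last check" set you want to alert on.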
Scheduling and alerts
Run this pipeline on schedule (n8n, cron, or Apify schedule) and filter for your criteria:
```python
import re

def filter_jobs(jobs: list, filters: dict) -> list:
    results = []
    for job in jobs:
        title = (job.get("title") or "").lower()
        salary_text = job.get("salary", "") or ""
        # Keyword filter
        if filters.get("keywords"):
            if not any(kw.lower() in title for kw in filters["keywords"]):
                continue
        # Salary filter (rough): matches "$120,000" and "$120K" style figures
        if filters.get("min_salary"):
            matches = re.findall(r"\$(\d+)\s*([Kk])?", salary_text.replace(",", ""))
            if matches:
                amounts = [int(n) * (1000 if k else 1) for n, k in matches]
                if max(amounts) < filters["min_salary"]:
                    continue
        results.append(job)
    return results

target_jobs = filter_jobs(unique_jobs, {
    "keywords": ["python", "backend", "api"],
    "min_salary": 100000,
})

# Send alerts
for job in target_jobs:
    send_slack_notification(job)  # or email, Telegram, etc.
```
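`send_slack_notification` above is left as a placeholder; one common way to implement it is a Slack incoming webhook. A sketch, with the webhook URL being yours to supply:

```python
import requests

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"  # placeholder

def build_slack_payload(job: dict) -> dict:
    """Format a job dict as a simple Slack message payload."""
    return {
        "text": (
            f"*{job.get('title') or 'Unknown role'}* at {job.get('company') or 'Unknown company'}\n"
            f"{job.get('salary') or 'Salary not listed'}\n"
            f"{job.get('url', '')}"
        )
    }

def send_slack_notification(job: dict) -> None:
    requests.post(SLACK_WEBHOOK_URL, json=build_slack_payload(job), timeout=10)
```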
The pre-built option
The Remote Jobs Aggregator on Apify covers LinkedIn, WWR, Remotive, and Wellfound in one run. Input your keywords and salary filters, get deduplicated results as JSON or webhook push.
116+ production runs. Pay-per-result pricing.
Using this for lead generation
Beyond job hunting, remote job data is powerful for sales:
- Companies hiring remotely at scale = have budget + culture fit for tools
- 5+ backend engineer openings in 3 months = platform rebuild in progress
- "DevOps" + "Kubernetes" + "security" = enterprise security concerns
Filter job postings by your ICP, then use the Contact Info Scraper to get decision-maker contacts from those company websites.
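The "consistent hiring = healthy" signal above falls straight out of the aggregated data. A quick sketch that counts active postings per company (the threshold of 3 is an arbitrary starting point):

```python
from collections import Counter

def hiring_velocity(jobs: list, min_openings: int = 3) -> list:
    """Companies with at least min_openings active postings, most active first."""
    counts = Counter(
        job.get("company", "").strip().lower()
        for job in jobs
        if job.get("company")
    )
    return [(company, n) for company, n in counts.most_common() if n >= min_openings]
```

Run it over a few weeks of stored postings rather than a single scrape and the high-velocity companies surface quickly.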
n8n AI Automation Pack ($39) — 5 production-ready workflows
Skip the setup
Apify Scrapers Bundle — $29 one-time
Includes the Remote Jobs Aggregator and 34 other production scrapers.