Your CRM has a name, an email address, and a company domain. That is it.
Every rep on your team spends the next 30–60 minutes per lead filling in the blanks: job title, company headcount, tech stack, funding stage, recent news. Multiply that by 20 leads per day per rep and you get 10 to 20 hours of research per rep per day, which is more than a working day holds. Your pipeline is not slow because of bad messaging; it is slow because your team's time is vanishing into manual research before the first email goes out.
Commercial enrichment platforms exist to solve this. They also cost more than most lean sales teams can justify.
What Enrichment Platforms Actually Charge
Here is what the current market looks like:
- ZoomInfo: $15,000+/year for a team contract. Occasionally negotiable, never cheap.
- Clearbit: $600–$2,000+/month depending on volume and features. Clearbit is now owned by HubSpot, and pricing has drifted upward.
- Apollo.io: $49–$149/month per seat — arguably the most accessible — but rate-limited on enrichment calls and increasingly aggressive with plan restrictions.
- LinkedIn Sales Navigator: $79–$150/seat/month. Still requires a human to do the actual research inside the tool.
All of these are priced for teams at funded companies. If you are a founder doing outbound yourself, a solo RevOps operator, or an agency running enrichment for a handful of clients, the unit economics do not work.
The data these platforms are selling you is public. LinkedIn job postings, company pages, tech job listings, press coverage — none of it is proprietary. The product is the infrastructure to collect and normalize it at scale. And that infrastructure has gotten cheap enough to build yourself.
What the DIY Pipeline Looks Like
The goal: given a company domain (e.g. acme.io), automatically fill in headcount, industry, hiring signals, tech stack indicators, and recent news — then write those fields back into your CRM.
Three data sources cover most of what you need:
1. LinkedIn Company Page
Company size buckets (1–10, 11–50, 51–200, etc.), industry, headquarters location, recent company updates. This is enough to qualify or disqualify most leads before outreach.
2. Job Postings
What a company is actively hiring tells you what they are building. A company posting for "Stripe integration engineer" or "Salesforce Admin" signals specific tools in their stack. A wave of engineering hires signals growth. A freeze on hiring signals budget pressure.
3. Domain / Public Web
WHOIS data for the domain's registration date, which is a rough proxy for founding year. The company's own /tech-stack, /docs, or /integrations page often lists the tools they support. Google News for press coverage in the last 90 days.
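The size buckets from the company page are strings, so they need parsing before you can qualify on them. Here is a minimal sketch, assuming the bucket arrives as text like "51–200" or "10,001+". The function names and the overlap rule are illustrative, not part of any actor's output schema.

```javascript
// Parse a size bucket such as "51–200", "51-200 employees", or "10,001+".
// Returns { min, max } or null when the format is not recognized.
function parseSizeBucket(bucket) {
  const cleaned = String(bucket).replace(/,/g, "").trim();

  // Range form: two integers separated by a hyphen or an en dash.
  const range = cleaned.match(/^(\d+)\s*[-–]\s*(\d+)/);
  if (range) return { min: Number(range[1]), max: Number(range[2]) };

  // Open-ended form, e.g. "10001+".
  const open = cleaned.match(/^(\d+)\s*\+/);
  if (open) return { min: Number(open[1]), max: Infinity };

  // Unknown format: return null rather than guess a headcount.
  return null;
}

// Hypothetical qualification rule: keep a company when its bucket
// overlaps the target range [minSize, maxSize].
function isInTargetRange(bucket, minSize, maxSize) {
  const parsed = parseSizeBucket(bucket);
  if (!parsed) return false;
  return parsed.max >= minSize && parsed.min <= maxSize;
}
```

For example, `isInTargetRange("11–50", 20, 500)` keeps the lead because the bucket overlaps the target range, while `isInTargetRange("1–10", 20, 500)` drops it.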
The Apify Actor Stack
Rather than writing and maintaining custom scrapers, use Apify's pre-built actors. The three that cover this pipeline:
linkedin-job-scraper
https://apify.com/lanky_quantifier/linkedin-job-scraper
Input a company name or LinkedIn URL. Returns active job listings with title, department, location, and description text. Parse the descriptions for tech keywords: Node.js, React, Salesforce, Stripe, AWS, Kubernetes. Each keyword is a signal about what they run.
const { ApifyClient } = require("apify-client");
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
// Fetch a company's open roles and derive hiring and tech-stack signals.
async function getHiringSignals(companyLinkedInUrl) {
  // Start the actor run and wait for it to finish.
  const run = await client.actor("lanky_quantifier/linkedin-job-scraper").call({
    startUrls: [{ url: companyLinkedInUrl }],
    maxItems: 20,
  });
  // Read the scraped job listings from the run's default dataset.
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  const techKeywords = ["Salesforce", "HubSpot", "Stripe", "AWS", "Node.js",
    "React", "Kubernetes", "Postgres", "Snowflake", "dbt"];
  const signals = new Set();
  // Case-sensitive substring match: simple, but "React" also matches "Reactive".
  items.forEach((job) => {
    techKeywords.forEach((kw) => {
      if (job.description && job.description.includes(kw)) signals.add(kw);
    });
  });
  return {
    activeOpenings: items.length,
    techSignals: [...signals],
    // Unique departments, skipping listings that have none.
    departments: [...new Set(items.map((j) => j.department).filter(Boolean))],
  };
}
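Plain substring matching can over-match ("React" inside "Reactive") and misses case variants like "node.js". Here is a sketch of a stricter matcher that is case-insensitive and refuses matches embedded inside longer words. The function names are illustrative. It is one possible refinement, not something the actor provides.

```javascript
// Escape regex metacharacters so "Node.js" matches a literal dot.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return the keywords that appear in `text` as standalone tokens.
// A match is rejected when it is directly preceded or followed by a
// letter or digit, so "AWS" does not match inside "flaws".
function matchTechKeywords(text, keywords) {
  if (!text) return [];
  return keywords.filter((kw) => {
    const pattern = new RegExp(
      `(?<![A-Za-z0-9])${escapeRegExp(kw)}(?![A-Za-z0-9])`,
      "i"
    );
    return pattern.test(text);
  });
}
```

Inside `getHiringSignals`, the inner loop could then become `matchTechKeywords(job.description, techKeywords).forEach((kw) => signals.add(kw))`.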
website-content-crawler
Input the company's domain. Crawl their /docs, /integrations, or /pricing page for tool mentions. This is faster and more reliable than parsing job descriptions because companies explicitly list their integrations.
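A sketch of how the crawl step could look. Several things here are assumptions to verify against the actor's own documentation: the actor ID `apify/website-content-crawler`, the `maxCrawlPages` input field, and the `text` field on each result. The client is passed in as a parameter, so the signature differs from a one-argument helper.

```javascript
// Build start URLs for the pages most likely to list tools.
// The paths are the ones named in the article; a given site may not have them.
function buildStartUrls(domain, paths = ["/docs", "/integrations", "/pricing"]) {
  const host = domain.replace(/^https?:\/\//, "").replace(/\/+$/, "");
  return paths.map((path) => ({ url: `https://${host}${path}` }));
}

// Return the tools mentioned anywhere in the crawled page text.
function findToolMentions(pages, tools) {
  const combined = pages.map((page) => page.text || "").join("\n");
  return tools.filter((tool) => combined.includes(tool));
}

// `client` is an ApifyClient instance. The actor ID and input fields below
// are assumptions; check the actor's input schema before relying on them.
async function crawlDomainForTechSignals(client, domain, tools) {
  const run = await client.actor("apify/website-content-crawler").call({
    startUrls: buildStartUrls(domain),
    maxCrawlPages: 10,
  });
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  return findToolMentions(items, tools);
}
```

Keeping URL building and text matching as pure functions means both can be tested without spending credits on a real crawl.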
google-serp-scraper
https://apify.com/lanky_quantifier/google-serp-scraper
Query: "acme.io" site:techcrunch.com OR site:venturebeat.com after:2025-01-01
Returns press coverage. A recent funding announcement or product launch is a strong signal for timing outreach.
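The date in that query should not be hard-coded if the job runs nightly. Here is a minimal sketch that builds the same query string with a rolling 90-day window. The function name and parameters are illustrative. How you pass the query to the actor depends on its input schema, which is not shown here.

```javascript
// Build a Google query restricted to the given news sites and to results
// published within the last `windowDays` days (computed in UTC).
function buildNewsQuery(domain, sites, now = new Date(), windowDays = 90) {
  const MS_PER_DAY = 24 * 60 * 60 * 1000;
  const since = new Date(now.getTime() - windowDays * MS_PER_DAY)
    .toISOString()
    .slice(0, 10); // YYYY-MM-DD
  const siteFilter = sites.map((site) => `site:${site}`).join(" OR ");
  return `"${domain}" ${siteFilter} after:${since}`;
}
```

Called on 1 April 2025 with `["techcrunch.com", "venturebeat.com"]`, this produces the query shown above with `after:2025-01-01`.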
Wiring It to Your CRM
Once you have enrichment data, write it back via API. Both HubSpot and Pipedrive expose straightforward REST endpoints for updating company and contact records.
const axios = require("axios");
// Write enrichment fields to a HubSpot contact.
// The property names below are custom properties and must already exist in your HubSpot account.
async function enrichHubSpotContact(contactId, enrichmentData) {
  const properties = {
    company_headcount: enrichmentData.headcount, // dropped from the JSON body when undefined
    tech_stack: (enrichmentData.techSignals || []).join(", "),
    active_job_openings: enrichmentData.activeOpenings,
    recent_funding_news: enrichmentData.recentNews || "",
    enrichment_date: new Date().toISOString().split("T")[0], // YYYY-MM-DD
  };
  await axios.patch(
    `https://api.hubapi.com/crm/v3/objects/contacts/${contactId}`,
    { properties },
    { headers: { Authorization: `Bearer ${process.env.HUBSPOT_TOKEN}` } }
  );
}
For Pipedrive, the equivalent is a PUT to /v1/persons/{id} with custom fields mapped in your account settings.
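A rough sketch of the Pipedrive side, using the built-in `fetch` from Node 18+. It assumes token authentication via the `api_token` query parameter. The `fieldKeys` values are placeholders: Pipedrive identifies each custom field by a generated key that you look up in your own account.

```javascript
// Build the PUT request for a person record. `fieldKeys` maps our field
// names to the custom-field keys from your Pipedrive account (placeholders here).
function buildPipedriveUpdate(personId, enrichment, fieldKeys) {
  const body = {};
  if (enrichment.techSignals?.length) {
    body[fieldKeys.techStack] = enrichment.techSignals.join(", ");
  }
  if (Number.isInteger(enrichment.activeOpenings)) {
    body[fieldKeys.activeOpenings] = enrichment.activeOpenings;
  }
  return {
    method: "PUT",
    url: `https://api.pipedrive.com/v1/persons/${personId}`,
    body,
  };
}

// Send the request. Throws on a non-2xx response.
async function updatePipedrivePerson(request, apiToken) {
  const res = await fetch(
    `${request.url}?api_token=${encodeURIComponent(apiToken)}`,
    {
      method: request.method,
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(request.body),
    }
  );
  if (!res.ok) throw new Error(`Pipedrive update failed: ${res.status}`);
  return res.json();
}
```

Splitting the request builder from the sender keeps the field mapping easy to check before anything is written to the CRM.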
Running It Nightly
Wrap everything in a scheduled job. The simplest setup:
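// enrichment-pipeline.js
// Assumes getHiringSignals and enrichHubSpotContact from the sections above are in scope,
// and that getUnenrichedContacts, crawlDomainForTechSignals, and getRecentNews are helpers you implement.
async function runEnrichmentBatch() {
  // 1. Pull contacts missing enrichment fields from CRM
  const unenrichedContacts = await getUnenrichedContacts();
  for (const contact of unenrichedContacts) {
    try {
      // 2. Gather signals from each source
      const hiring = await getHiringSignals(contact.linkedinUrl);
      const techStack = await crawlDomainForTechSignals(contact.domain);
      const news = await getRecentNews(contact.domain);
      // 3. Merge, de-duplicate the tech signals, and write back
      await enrichHubSpotContact(contact.id, {
        ...hiring,
        techSignals: [...new Set([...hiring.techSignals, ...techStack])],
        recentNews: news[0]?.title || "",
      });
      console.log(`Enriched: ${contact.domain}`);
    } catch (err) {
      // Log and continue so one bad record does not stop the batch.
      console.error(`Failed: ${contact.domain}: ${err.message}`);
    }
  }
}
// Surface unexpected failures (for example, the CRM query itself) with a non-zero exit code.
runEnrichmentBatch().catch((err) => {
  console.error(err);
  process.exitCode = 1;
});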
Schedule with a cron job or GitHub Actions on a nightly trigger. Run it against your 20 newest contacts each night. By morning, your reps open the CRM and the research is already done.
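The batch depends on a `getUnenrichedContacts` helper. Here is one way it might look with HubSpot's CRM search endpoint, selecting contacts that have no `enrichment_date` and returning the newest first. The `company_domain` and `linkedin_company_url` property names are hypothetical custom properties; substitute whatever your account uses.

```javascript
// Build the search body: contacts with no enrichment_date, newest first.
function buildUnenrichedSearch(limit = 20) {
  return {
    filterGroups: [
      {
        filters: [
          { propertyName: "enrichment_date", operator: "NOT_HAS_PROPERTY" },
        ],
      },
    ],
    sorts: [{ propertyName: "createdate", direction: "DESCENDING" }],
    // Hypothetical custom property names; adjust to your account.
    properties: ["email", "company_domain", "linkedin_company_url"],
    limit,
  };
}

// Query HubSpot and map results to the shape the batch loop expects.
// Uses the built-in fetch available in Node 18+.
async function getUnenrichedContacts(token = process.env.HUBSPOT_TOKEN, limit = 20) {
  const res = await fetch(
    "https://api.hubapi.com/crm/v3/objects/contacts/search",
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(buildUnenrichedSearch(limit)),
    }
  );
  if (!res.ok) throw new Error(`HubSpot search failed: ${res.status}`);
  const { results } = await res.json();
  return results.map((contact) => ({
    id: contact.id,
    domain: contact.properties.company_domain,
    linkedinUrl: contact.properties.linkedin_company_url,
  }));
}
```

Because `enrichHubSpotContact` stamps `enrichment_date` on every record it updates, the same filter naturally skips contacts that were already processed on a previous night.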
What This Costs
Running this pipeline at realistic scale:
| Volume | Apify Credits | Monthly Cost |
|---|---|---|
| 100 contacts/month | ~500 credits | ~$5 |
| 500 contacts/month | ~2,500 credits | ~$15 |
| 2,000 contacts/month | ~10,000 credits | ~$30 |
Against $600–$2,000/month for Clearbit or $15,000+/year for ZoomInfo, the math is not close.
The tradeoff is build time (a few hours to wire up) and data freshness (you enrich on-demand vs. querying a pre-built database). For sub-enterprise outbound volumes, the tradeoff is almost always worth it.
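To make the comparison concrete, here is the per-contact arithmetic worked out from the table above. The figures are the article's own estimates, not measured prices.

```javascript
// Monthly cost estimates taken from the table above.
const tiers = [
  { contacts: 100, monthlyCost: 5 },
  { contacts: 500, monthlyCost: 15 },
  { contacts: 2000, monthlyCost: 30 },
];

// Cost per enriched contact in dollars, rounded to three decimals.
function costPerContact({ contacts, monthlyCost }) {
  return Number((monthlyCost / contacts).toFixed(3));
}

const perContact = tiers.map(costPerContact); // [0.05, 0.03, 0.015]

// Commercial baselines quoted earlier in the article.
const clearbitLowEndMonthly = 600;
const zoomInfoMonthly = 15000 / 12; // 1250

// How many times cheaper the largest DIY tier is than Clearbit's low end.
const clearbitMultiple = clearbitLowEndMonthly / tiers[2].monthlyCost; // 20

console.log({ perContact, zoomInfoMonthly, clearbitMultiple });
```

Even at the 2,000-contact tier, the estimate is 20 times below Clearbit's entry price, which leaves plenty of room for the estimate to be off.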
The Setup in Summary
- Pull unenriched contacts from your CRM nightly
- Run linkedin-job-scraper for hiring signals and tech stack clues
- Crawl the company domain for explicit tool mentions
- Run google-serp-scraper for recent press
- Write enriched fields back via HubSpot or Pipedrive API
- Wake up to a CRM your reps can actually use
The research bottleneck that costs 30–60 minutes per lead becomes a background process that costs roughly $30/month or less at the volumes above. Your reps spend that time on the outreach itself, which is the only part of the job that actually moves pipeline.
All Apify actors referenced are available at apify.com/lanky_quantifier. API credits are pay-per-use with no monthly minimum.