DEV Community

AlwaysPrimeDev
I Built a LinkedIn Profile Scraper on Apify for Public Profiles, Company Enrichment, and Lead Research


Public LinkedIn data is still one of the most useful inputs for lead generation, recruiting, founder sourcing, and
market research.

The problem is that many LinkedIn scraping workflows have too much friction:

  • they depend on cookies
  • they break the moment setup is slightly wrong
  • they return shallow profile data
  • they make you wait until the whole run finishes before you can use anything

I wanted something simpler.

So I built a LinkedIn Profile Scraper on Apify that works with public profile URLs, does not require LinkedIn cookies,
and returns structured profile data plus company enrichment and best-effort contact discovery from public company
websites.

What the actor does

You pass in one or more public LinkedIn profile URLs like this:

{
  "profileUrls": [
    "https://www.linkedin.com/in/williamhgates",
    "https://www.linkedin.com/in/satyanadella"
  ]
}
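
Since the input is just a list of URLs, it is worth checking them before a run starts. The following is a minimal Go sketch, not the actor's actual code: the `Input` struct, `parseInput`, and `isPublicProfileURL` are hypothetical names, and the rule "https, a linkedin.com host, and a /in/<slug> path" is an assumption about what counts as a profile URL.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strings"
)

// Input mirrors the JSON shown above. The struct is an assumption made
// for this sketch, not the actor's real input type.
type Input struct {
	ProfileURLs []string `json:"profileUrls"`
}

// isPublicProfileURL reports whether raw looks like an
// https://<something>linkedin.com/in/<slug> URL.
func isPublicProfileURL(raw string) bool {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil || u.Scheme != "https" {
		return false
	}
	host := strings.ToLower(u.Hostname())
	if host != "linkedin.com" && !strings.HasSuffix(host, ".linkedin.com") {
		return false
	}
	if !strings.HasPrefix(u.Path, "/in/") {
		return false
	}
	slug := strings.Trim(strings.TrimPrefix(u.Path, "/in/"), "/")
	return slug != "" && !strings.Contains(slug, "/")
}

// parseInput decodes the input JSON and splits the URLs into accepted
// and rejected lists.
func parseInput(data []byte) ([]string, []string, error) {
	var in Input
	if err := json.Unmarshal(data, &in); err != nil {
		return nil, nil, fmt.Errorf("decode input: %w", err)
	}
	var valid, rejected []string
	for _, raw := range in.ProfileURLs {
		if isPublicProfileURL(raw) {
			valid = append(valid, strings.TrimSpace(raw))
		} else {
			rejected = append(rejected, raw)
		}
	}
	return valid, rejected, nil
}

func main() {
	data := []byte(`{"profileUrls": ["https://www.linkedin.com/in/williamhgates", "https://example.com/in/nobody"]}`)
	valid, rejected, err := parseInput(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("valid:", valid)
	fmt.Println("rejected:", rejected)
}
```

Rejecting malformed URLs up front keeps obviously bad items from consuming requests later in the run.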

For each public profile, the actor can return:

  • full name, headline, summary, location
  • followers and connections
  • current role and company
  • work experience and education
  • recent posts and articles
  • company LinkedIn URL, website, industry, and size
  • best-effort email candidates discovered from public company pages

That makes it useful not only for scraping profiles, but for building enriched lead or research datasets.
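
To make "best-effort email candidates" concrete, here is a minimal Go sketch of one way such discovery could work on page text that has already been downloaded. It is an illustration under stated assumptions, not the actor's implementation: `emailCandidates` is a hypothetical helper, the regular expression is deliberately simple, and filtering by the company's own domain is my assumption about how to keep noise out.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// emailPattern is a deliberately simple matcher, not a full RFC 5322 parser.
var emailPattern = regexp.MustCompile(`[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`)

// emailCandidates returns unique, lowercased addresses found in the given
// page bodies whose host is companyDomain or one of its subdomains.
// The result is sorted so that output is stable between runs.
func emailCandidates(pages []string, companyDomain string) []string {
	domain := strings.ToLower(companyDomain)
	seen := map[string]bool{}
	for _, page := range pages {
		for _, match := range emailPattern.FindAllString(page, -1) {
			addr := strings.ToLower(match)
			host := addr[strings.LastIndex(addr, "@")+1:]
			// Drop third-party addresses and asset names such as "logo@2x.png".
			if host != domain && !strings.HasSuffix(host, "."+domain) {
				continue
			}
			seen[addr] = true
		}
	}
	out := make([]string, 0, len(seen))
	for addr := range seen {
		out = append(out, addr)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Illustrative page bodies; a real run would fetch these over HTTP.
	pages := []string{
		`<a href="mailto:Hello@Acme.com">Contact us</a> or sales@acme.com`,
		`<img src="logo@2x.png"> Press: someone@gmail.com`,
	}
	fmt.Println(emailCandidates(pages, "acme.com"))
}
```

The results are candidates only: an address that appears on a public page is not guaranteed to belong to the person whose profile was scraped.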

Why I built it this way

The main design goal was low-friction enrichment.

Instead of asking users to manage session cookies, I focused on publicly accessible profile pages. Then I extended the
output beyond the profile itself:

  • company details are enriched from public LinkedIn company pages
  • email candidates are discovered from public company website pages like /contact, /about, and /team
  • successful profiles are streamed into the Apify dataset as soon as they finish
  • failed items are kept out of the main result dataset so the output stays clean

Streaming matters more than people think. If you are enriching hundreds of profiles, you usually do not want to
wait for the entire batch before the first usable results appear.
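
The streaming behavior described above can be sketched with a small worker pool in Go. This is a simplified illustration, not the actor's code: `Result`, `Failure`, `run`, and `fakeScrape` are hypothetical names, and the `onSuccess` callback stands in for whatever pushes an item to the Apify dataset.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync"
)

// Result and Failure are hypothetical types for this sketch.
type Result struct {
	URL  string
	Name string
}

type Failure struct {
	URL string
	Err error
}

// run processes urls with a fixed number of workers. Each success is handed
// to onSuccess as soon as it is ready; failures are collected separately and
// returned once all workers are done, so they never reach the main output.
func run(urls []string, workers int, scrape func(string) (Result, error), onSuccess func(Result)) []Failure {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan string)
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex // guards failures and serializes onSuccess calls
		failures []Failure
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for u := range jobs {
				res, err := scrape(u)
				mu.Lock()
				if err != nil {
					failures = append(failures, Failure{URL: u, Err: err})
				} else {
					onSuccess(res) // emitted immediately, not at the end of the batch
				}
				mu.Unlock()
			}
		}()
	}
	for _, u := range urls {
		jobs <- u
	}
	close(jobs)
	wg.Wait()
	return failures
}

// fakeScrape stands in for the real fetch-and-parse step.
func fakeScrape(url string) (Result, error) {
	if strings.Contains(url, "private") {
		return Result{}, errors.New("profile is not public")
	}
	return Result{URL: url, Name: "name for " + url}, nil
}

func main() {
	urls := []string{
		"https://www.linkedin.com/in/williamhgates",
		"https://www.linkedin.com/in/private-example",
		"https://www.linkedin.com/in/satyanadella",
	}
	failures := run(urls, 2, fakeScrape, func(r Result) {
		fmt.Println("pushed:", r.URL)
	})
	fmt.Println("failed items kept aside:", len(failures))
}
```

Calling `onSuccess` under the mutex keeps the sink simple because it never runs concurrently. A real pipeline might prefer a results channel if pushing items is slow.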

Technical notes

The actor is written in Go and uses concurrent workers, retry handling, request timeouts, and proxy support. On the
parsing side, it combines HTML selectors with JSON-LD extraction to get more reliable structured data from public
pages.
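
As a rough illustration of the JSON-LD side, the Go sketch below pulls `application/ld+json` script blocks out of an HTML string and decodes a schema.org `Person`. It uses only the standard library and makes several assumptions: the `Person` struct covers a few schema.org fields I chose for the example, the sample markup is invented, and the `<title>` fallback is only a stand-in for real selector-based parsing. Real pages may use arrays or an `@graph` wrapper, which this sketch skips.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

var (
	ldJSONPattern = regexp.MustCompile(`(?is)<script[^>]*type=["']application/ld\+json["'][^>]*>(.*?)</script>`)
	titlePattern  = regexp.MustCompile(`(?is)<title[^>]*>(.*?)</title>`)
)

// Person holds a few schema.org Person fields. Which fields a given page
// actually exposes is an assumption of this sketch.
type Person struct {
	Type     string `json:"@type"`
	Name     string `json:"name"`
	JobTitle string `json:"jobTitle"`
	WorksFor struct {
		Name string `json:"name"`
	} `json:"worksFor"`
}

// extractPerson returns the first JSON-LD block that decodes cleanly as a
// flat Person object. Blocks that fail to decode are skipped.
func extractPerson(html string) (Person, bool) {
	for _, m := range ldJSONPattern.FindAllStringSubmatch(html, -1) {
		var p Person
		if err := json.Unmarshal([]byte(m[1]), &p); err != nil {
			continue
		}
		if p.Type == "Person" {
			return p, true
		}
	}
	return Person{}, false
}

// extractName prefers JSON-LD and falls back to the <title> text.
func extractName(html string) string {
	if p, ok := extractPerson(html); ok && p.Name != "" {
		return p.Name
	}
	if m := titlePattern.FindStringSubmatch(html); m != nil {
		return strings.TrimSpace(m[1])
	}
	return ""
}

func main() {
	// Invented sample markup for illustration only.
	page := `<html><head><title>Jane Doe | Example</title>
<script type="application/ld+json">{"@type":"Person","name":"Jane Doe","jobTitle":"CTO","worksFor":{"name":"Acme"}}</script>
</head></html>`
	if p, ok := extractPerson(page); ok {
		fmt.Println(p.Name, "/", p.JobTitle, "/", p.WorksFor.Name)
	}
}
```

Trying structured data first and markup second is what makes the combination more robust: when one source changes shape or is missing, the other can still supply a value.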

On the Apify side, I wanted the actor to feel like a production tool, not just a script:

  • minimal input
  • progressive dataset output
  • export to JSON, CSV, or Excel
  • easy connection to webhooks, Make, Zapier, n8n, Airtable, Google Sheets, or a CRM

Good use cases

This actor is a good fit if you need:

  • recruiter snapshots of public profiles
  • enriched prospect data for outbound
  • founder and operator sourcing
  • company and talent mapping
  • quick public-profile research pipelines inside Apify

Compliance note

This actor is intended for publicly visible LinkedIn data only. It is not meant to bypass authentication walls or
access private profile data. As always, make sure your usage complies with applicable rules, laws, and internal
policies.

Final thought

I did not want to build “just another scraper.” I wanted an Apify actor that turns a public LinkedIn profile URL into
usable structured research data with as little setup as possible.

If that matches your workflow, you can try the actor here:

[Click here]

If you want, I can also share the implementation details or write a follow-up post about the parsing and enrichment
pipeline behind it.
