Dumebi Okolo

Posted on Jun 12

How to Build a LinkedIn Outreach Pipeline (Without Getting Your Account Banned)

#typescript #javascript #webdev #tutorial

TL;DR: A LinkedIn outreach pipeline is a background worker that signs in with your own session, opens profiles, sends connection requests and messages on a schedule you control, and can post content straight to your feed. The hard part was staying invisible to LinkedIn's detection. We got to our nineteenth build in about two weeks. Along the way, the session kept dying after three profiles (a device fingerprint mismatch), the stealth layer turned out to be detectable on its own, an authenticated proxy refused to connect, and Chrome froze in ways no timeout caught. This is every failure and the fix that finally held.

We built a LinkedIn marketing pipeline inside Ozigi because our own go-to-market runs on it. I didn't just want it to be another tool; I needed it to send real messages to real people without getting my personal account flagged.
The very first version we built worked for sourcing and reaching three leads, then the session died. The second version got past that and froze instead. This pattern repeated for two weeks and led us from building v1 of our LinkedIn worker to the current version 26.

This article is a cleaned-up version of our build log, written up for educational purposes.
If you are trying to reach people on LinkedIn from code, you will hit most of these walls in roughly this order. I will name the exact failure each time, because "it stopped working" helped me precisely never.

What Does a LinkedIn Outreach Pipeline Actually Do?

A complete LinkedIn outreach pipeline does four jobs: It signs in with your session cookie so LinkedIn sees you, not a script. It opens a lead's profile. It sends a connection request or a message, depending on whether you are already a first-degree connection. And it can publish a post to your feed. The first three are outreach. The fourth is content.
They share the same infrastructure, which matters later.

None of this looks like complicated logic. You click a button, type into a box, press send. But the reason this turned into a two-week build is that LinkedIn does not want a script doing any of it, and it is very good at telling the difference between you, a human, and your code, a bot.

If you are weighing LinkedIn outreach against cold email first, we wrote up the trade-offs in email versus LinkedIn outreach for dev tools, and the wider loop this fits into lives in the go-to-market playbook for small teams.
This article is about the engineering aspect of pipelines.

Why Is LinkedIn So Hard to Build Against?

LinkedIn runs three layers of defense, and when you try to automate LinkedIn, you trip all of them.
The first is the feed model, a 150-billion-parameter system called 360Brew that reads content semantically and demotes anything that smells machine-made. We covered that one in detail in how to make your LinkedIn content stand out in 2026, and it governs the posting side of the pipeline.

The second is a behavioural bot layer (you will see requests to protechts.net and PerimeterX-style beacons fire on every page) that watches how the browser behaves.

The third is the one that we struggled with the most: a device-fingerprint check tied to your session token. LinkedIn binds your login to the machine you logged in from, across something like forty-eight signals. Change the machine while keeping the same session token, even subtly, and the token dies.

Almost none of the bugs we encountered had anything to do with the logic we were building. They were detection bugs wearing a logic costume.

Tools We Used To Build Our LinkedIn Outreach Pipeline

After days of sleepless nights and trying to figure out why our logic was failing, this is the current v26 we have.

Runtime: a long-lived Node worker on Fly.io, polling a queue every 90 seconds.
Browser control: Patchright, a drop-in fork of Playwright, driving a real Google Chrome rather than bundled Chromium.
Queue and state: a linkedin_queue table in Supabase, with statuses (queued, in_progress, done, failed) and an attempt counter.
Scheduling: QStash crons that enqueue work and trigger reply checks.
Network: a residential proxy so the traffic looks like a home connection, fronted by a local wrapper (more on that pain below).

A higher-level view of how this slots into the rest of the product is on the Ozigi architecture page.
For this article, hold onto one detail to help you understand the rest of the flow: the worker keeps a browser alive between polls instead of launching a fresh one each time. That decision is the difference between a session that lives and a session that dies, and it took me far too long to understand why.

The First Wall: Getting Your Worker to Even Boot

Before any of the interesting failures, three boring ones stopped the process cold.

The first time, the worker crashed on startup with this error: "Node.js 20 detected without native WebSocket support. Pass the 'ws' package as a transport."

The Playwright Docker image ships Node 20, which has no global WebSocket, and the Supabase Realtime client throws without one.
The fix was to install ws and pass it as the transport. The problem was that we initialised a Supabase client in three files (index.ts, login.ts, browser.ts). We patched one and watched the same crash reappear from a different code path. The options were to patch all three or none.

The next issue was that Chromium would not launch at all, because the Dockerfile pinned the Playwright image to one version while npm install resolved the package to a newer one.
Each Playwright release stores its browser at a version-specific path inside the image, so a mismatch means the binary the package looked for is not there.
The solution was to pin the image tag and the package version to the exact same number and set PATCHRIGHT_SKIP_BROWSER_DOWNLOAD=1 so the post-install step does not redundantly re-fetch a browser the image already has.

Then, page.goto() started taking thirty to forty-five seconds, or timing out on pages that were clearly already loaded.
The problem was that we set waitUntil: 'networkidle'. LinkedIn keeps long-poll connections and analytics beacons open forever, so the "no network for 500ms" condition that networkidle waits for never arrives. Switching to domcontentloaded fixed it instantly. The Playwright navigation docs spell this out, but it is an easy default to leave in place until it becomes a problem.
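
As a minimal sketch of that change, the helper below navigates with domcontentloaded and an explicit timeout. PageLike is a hypothetical stand-in for the small slice of Playwright's Page used here, so the example runs without a browser; gotoProfile and the 20-second default are illustrative choices, not the exact code from our worker.

```typescript
// Minimal structural type for the part of Playwright's Page this sketch uses.
// (Hypothetical stand-in; a real Playwright or Patchright Page satisfies it.)
interface PageLike {
  goto(
    url: string,
    options?: { waitUntil?: 'load' | 'domcontentloaded' | 'networkidle' | 'commit'; timeout?: number },
  ): Promise<unknown>
}

// Navigate without waiting for the network to go quiet, which LinkedIn never does.
async function gotoProfile(page: PageLike, slug: string, timeoutMs = 20_000): Promise<string> {
  const url = `https://www.linkedin.com/in/${encodeURIComponent(slug)}/`
  await page.goto(url, { waitUntil: 'domcontentloaded', timeout: timeoutMs })
  return url
}
```

Any readiness check after that (for example waiting for a specific button) is better expressed as an explicit wait on that element than as a wait on network silence.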

Why Your LinkedIn Worker Session Dies After Three Profiles

This particular problem was the bug that nearly made us quit the whole approach.

The pattern was maddeningly consistent. The first two or three profile loads in a poll cycle worked perfectly, we got excited and were about to round off our sprint, and then the fourth profile came back as a login redirect or LinkedIn's HTTP 999 "unusual activity" page.
My first attempt at a fix was to take the same li_at cookie, which still worked fine when pasted into my own browser, and attach it to the Playwright session. I restarted the worker and got the same problem every time.

We spent a long time re-injecting the cookie from the database on every poll. Little did we know that the cookie was never the problem. We even kept the page object open between polls. No change. The session still died on the third or fourth request.

After a long time, we found that the actual cause of the persistent bug was a system LinkedIn runs that we like to call APFC, an anti-fingerprint-change check.
When a user logs in, LinkedIn records a device fingerprint built from around forty-eight signals: CPU count, RAM, screen resolution, audio hardware, the behavioural beacons mentioned earlier, and more.
It binds their session token to that fingerprint. Our code called browser.newContext() once per poll cycle.
Now, every newContext() spins up a fresh context with freshly randomised characteristics, so from LinkedIn's side our "logged-in user" was teleporting to a new device every ninety seconds. Two or three of those and the token gets invalidated.

The fix was to stop creating contexts. We moved from chromium.launch() plus per-poll newContext() to a single launchPersistentContext() per user, written to disk and kept alive in a module-level cache across polls.

// browser.ts — one context per user, kept alive across poll cycles
const ctx = await chromium.launchPersistentContext(
  `/data/linkedin-profiles/${userId}`,
  {
    channel: 'chrome',          // real Chrome, not bundled Chromium
    headless: true,
    proxy: { server: `http://127.0.0.1:${localPort}` },
  }
)
contextCache.set(userId, ctx)

The persistent context writes cookies, localStorage, IndexedDB, and the PerimeterX-style tokens to /data/linkedin-profiles/{userId} and reuses them. The fingerprint stops changing because the browser stops being reborn. One context per user, discarded only on explicit expiry or a worker restart. After this landed in v11, sessions stopped dying mid-cycle.

If you take one thing from this section so far, take this: a fresh browser context is a new device as far as LinkedIn is concerned, and a logged-in user does not change devices every ninety seconds.
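
The "one context per user, kept alive" rule can be sketched as a small cache. This is an illustration under assumptions, not our worker's code: createContextCache and its launch parameter are hypothetical names, and in a real worker launch would wrap launchPersistentContext() for that user's profile directory.

```typescript
// Generic per-user cache: one long-lived context per user, launched at most once.
// `launch` is injected so the sketch runs without a browser.
type Launcher<C> = (userId: string) => Promise<C>

function createContextCache<C>(launch: Launcher<C>) {
  const cache = new Map<string, Promise<C>>()

  return {
    // Return the cached context, or start a single launch for this user.
    // Caching the promise means two concurrent callers share one launch.
    get(userId: string): Promise<C> {
      let pending = cache.get(userId)
      if (!pending) {
        pending = launch(userId).catch((err) => {
          cache.delete(userId) // do not keep a failed launch in the cache
          throw err
        })
        cache.set(userId, pending)
      }
      return pending
    },
    // Drop a context on explicit expiry so the next get() relaunches it.
    expire(userId: string): boolean {
      return cache.delete(userId)
    },
  }
}
```

Storing the promise rather than the resolved context is a deliberate choice: it prevents two overlapping polls from launching two browsers for the same user, which would look like two devices.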

Getting Past Bot Detection at the Protocol Level

The persistent context kept the session alive, but the behavioural layer still flagged the odd run as a bot. We were on playwright-extra with the stealth plugin at that point.

The problem with that combination is subtle. The stealth plugin hides the signals that mark a browser as script-driven by injecting JavaScript into the page. The injection mechanism itself (Page.addScriptToEvaluateOnNewDocument over the DevTools Protocol) is a detectable signal. So we patched one pattern and introduced another.

What worked was switching to Patchright, a Playwright fork that patches those patterns at the Chrome DevTools Protocol level, before any page JavaScript runs.
It was a one-line replacement: import { chromium } from 'patchright' and the rest of the Playwright API stays the same.
We paired that with installing real google-chrome-stable in the image instead of relying on bundled Chromium, which gives a more believable TLS handshake.
LinkedIn's edge can fingerprint the handshake itself through JA3 and JA4, and bundled Chromium has a slightly different signature from the Chrome a real person runs.

We also included ignoreDefaultArgs: ['--enable-automation'], so the browser does not announce itself with the banner Chrome normally shows when it is being driven by a script.
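
Put together, the launch options from this section look roughly like the sketch below. The option names (channel, headless, proxy, ignoreDefaultArgs) are standard Playwright launch options that Patchright shares; the buildLaunchOptions helper, its interface, and the port in the usage comment are hypothetical.

```typescript
// Shape of the options this sketch produces (a subset of Playwright's launch options).
interface PersistentLaunchOptions {
  channel: string
  headless: boolean
  proxy: { server: string }
  ignoreDefaultArgs: string[]
}

// Collect the settings discussed in the article in one place.
function buildLaunchOptions(localPort: number): PersistentLaunchOptions {
  return {
    channel: 'chrome', // installed Google Chrome rather than bundled Chromium
    headless: true,
    proxy: { server: `http://127.0.0.1:${localPort}` }, // local loopback endpoint, no credentials
    ignoreDefaultArgs: ['--enable-automation'], // do not pass the automation flag to Chrome
  }
}

// Usage (not executed here; assumes `import { chromium } from 'patchright'`):
//   const ctx = await chromium.launchPersistentContext(profileDir, buildLaunchOptions(8899))
```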

The Proxy Problem Nobody Warns You About

To make the traffic look residential, we routed through a residential proxy. This produced the single most confusing error of the build: ERR_PROXY_AUTH_UNSUPPORTED on every HTTPS request.

The chain of failed attempts here was long, so here it is in order:

Passing the proxy username and password through Playwright's proxy config: Chromium only sends those credentials when the proxy answers with an HTTP 407 challenge. The residential proxy never sends a 407 for tunnelled HTTPS, so the credentials never go out.
Embedding credentials in the proxy URL (http://user:pass@host:port): This threw ERR_INVALID_AUTH_CREDENTIALS, because the sticky-session username format contains characters Chromium rejects inside a URL authority.
URL-encoding the credentials: This worked with curl from inside the same container, and still failed with ERR_SOCKET_NOT_CONNECTED once it went through Chromium.
Switching the proxy to SOCKS5: With this, Playwright launched, then threw Browser does not support socks5 proxy authentication. Chromium has never implemented authenticated SOCKS5. The code path does not exist. The socks5h:// scheme fails the same way.

NOTE: Chromium cannot do authenticated proxies for HTTPS, whether you ask over HTTP CONNECT or SOCKS5. So we stopped asking it to. We run a tiny local wrapper on loopback that accepts an unauthenticated connection from Chromium, opens the authenticated session to the upstream proxy itself, and relays. Chromium points at http://127.0.0.1:<localPort> with no credentials and never knows authentication happened. The SOCKS5 spec (RFC 1928) and the Playwright network docs are both worth a read before you go down this road, because the limitation is not well advertised.
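
To make the wrapper idea concrete, here is a minimal sketch using only Node's net module. It assumes the upstream is an HTTP proxy that accepts a preemptive Proxy-Authorization: Basic header, which is what curl sends and may not hold for every provider. The names injectProxyAuth, startAuthWrapper and Upstream are hypothetical, and this is not the wrapper we run in production. It only adds credentials to the first request on each connection, which is enough for CONNECT tunnels.

```typescript
import * as net from 'node:net'

interface Upstream {
  host: string
  port: number
  user: string
  pass: string
}

// Insert a Proxy-Authorization header directly after the request line of a request head.
function injectProxyAuth(head: string, user: string, pass: string): string {
  const lineEnd = head.indexOf('\r\n')
  if (lineEnd === -1) return head // not a complete request line, leave untouched
  const token = Buffer.from(`${user}:${pass}`).toString('base64')
  return `${head.slice(0, lineEnd)}\r\nProxy-Authorization: Basic ${token}${head.slice(lineEnd)}`
}

// Listen on loopback, add credentials to the first request head, then relay bytes both ways.
function startAuthWrapper(localPort: number, upstream: Upstream): net.Server {
  const server = net.createServer((client) => {
    let remote: net.Socket | undefined
    let buffered = Buffer.alloc(0)
    client.on('error', () => remote?.destroy())

    const onData = (chunk: Buffer) => {
      buffered = Buffer.concat([buffered, chunk])
      const headEnd = buffered.indexOf('\r\n\r\n')
      if (headEnd === -1) return // request head not complete yet, keep buffering
      client.off('data', onData)
      client.pause()

      const head = buffered.subarray(0, headEnd + 4).toString('latin1')
      const rest = buffered.subarray(headEnd + 4)
      const socket = net.connect(upstream.port, upstream.host, () => {
        socket.write(injectProxyAuth(head, upstream.user, upstream.pass), 'latin1')
        if (rest.length > 0) socket.write(rest)
        client.pipe(socket) // browser -> upstream proxy
        socket.pipe(client) // upstream proxy -> browser
      })
      remote = socket
      socket.on('error', () => client.destroy())
    }
    client.on('data', onData)
  })
  return server.listen(localPort, '127.0.0.1') // loopback only, never a public interface
}
```

Binding to 127.0.0.1 matters: the wrapper is an open, unauthenticated proxy by design, so it must not be reachable from outside the machine.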

There was a side effect we didn't anticipate, however: our proxy exit was in Brazil, so LinkedIn served the entire UI in Portuguese. Every button label my code looked for (Message, More actions) came back as Mensagem and Mais ações, and every selector missed. That sent us into the next category of pain.

The Problem With Blind Automation: Clicking Buttons That Do Not Want to Be Clicked

LinkedIn's profile page is a moving target. Class names rotate between deploys, buttons hide behind dropdowns, and the same action lives in three different places depending on your relationship to the person. This section is four bugs that all reduce to "find the right button and actually click it."

The Connect button reported itself invisible: clickConnectButton kept logging "Connect button not found" on profiles where the button was plainly on screen. Playwright's visibility check (offsetParent, non-zero bounding box) returned false for buttons a human could see and click. The fix for this was building a three-layer fallback: look for the button by aria-label first, then scan <button> text inside main, and finally open the "More actions" dropdown and click the item there. That third path needed a real .click() rather than dispatchEvent, because the dropdown renders as a portal and a synthetic event would never reach LinkedIn's React handler.
Open Profiles got skipped as "already connected": Some members let anyone message them without connecting. Those profiles show a Message button and no Connect button, which is exactly what a first-degree connection looks like. My code saw a Message button with no Connect button and concluded we were already connected, so it skipped them. The fix was to check the page for the literal first-degree badge text (1st, plus its localised forms) before assuming a connection. No badge means it is an Open Profile, which we handle as its own flow rather than a skip.
Messages went to the wrong person: The first messaging implementation we tried opened /messaging/thread/new/?recipients=<slug>, which drops you into a compose box with a recipient search prefilled. Pressing Enter confirmed the first typeahead suggestion, and when the slug matched more than one account, that suggestion was sometimes a stranger. People got messages meant for someone else.
We solved this with profile-first messaging: open linkedin.com/in/<slug>/, then click the Message button on that page. The compose window opens pre-addressed to the exact profile you are looking at, and the typeahead never enters the picture.
Compound locators lied about visibility: With Patchright, chaining selectors with .or() had cases where isVisible() disagreed with the screen. A word-boundary regex (/\bmessag\b/) failed to match "message" because there is no boundary inside a word, and a loose /connect/i filter happily matched "Remove Connection." We stopped trusting locator chains for this and dropped to a single DOM scan with exact prefix matching against a label dictionary:

const label = await page.evaluate(() => {
  const MSG = /^(Message|Mensagem|Mensaje|Envoyer|Nachricht|Invia|Stuur)\b/i
  for (const b of document.querySelectorAll('button')) {
    const t = (b.getAttribute('aria-label') ?? b.textContent ?? '').trim()
    if (MSG.test(t)) return t
  }
  return null
}).catch(() => null)

That same dictionary is what fixed the Portuguese problem from the proxy section. We extended every label set to the six most common non-English LinkedIn locales, so More actions also matches Mais ações, Más acciones, Plus d'actions, Weitere Aktionen, Altre azioni, and Meer acties.
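
The matching rule itself is easy to test outside the browser. The sketch below is a hypothetical pure-function version of "exact prefix matching against a label dictionary": the Message and More actions sets are the ones listed in this article, the Connect set is English only because we would rather omit other locales than guess them, and matchesLabel is an illustrative name.

```typescript
// Label sets from the article; extend per locale after checking the live UI.
const MESSAGE_LABELS = ['Message', 'Mensagem', 'Mensaje', 'Envoyer', 'Nachricht', 'Invia', 'Stuur']
const MORE_ACTIONS_LABELS = [
  'More actions', 'Mais ações', 'Más acciones', "Plus d'actions",
  'Weitere Aktionen', 'Altre azioni', 'Meer acties',
]
const CONNECT_LABELS = ['Connect'] // English only in this sketch

// A character counts as part of a word if it is a digit or has distinct
// upper and lower case forms (true for Latin letters, including accented ones).
function isWordChar(ch: string): boolean {
  return (ch >= '0' && ch <= '9') || ch.toLowerCase() !== ch.toUpperCase()
}

// True when `text` starts with one of `labels` and the label ends at a word edge.
function matchesLabel(text: string, labels: readonly string[]): boolean {
  const t = text.trim().toLowerCase()
  return labels.some((label) => {
    const l = label.toLowerCase()
    if (!t.startsWith(l)) return false
    const next = t.charAt(l.length)
    return next === '' || !isWordChar(next)
  })
}
```

With this rule, "Message Jane Doe" matches the Message set, while "Messages" and "Remove Connection" match nothing, which is the behaviour the loose /connect/i filter got wrong.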

One last button bug: after the first message of a cycle, later profiles loaded with the Message button covered. LinkedIn restores the messaging compose overlay from storage when a new page loads in the same context, and it renders on top of the profile action buttons. Detecting it by CSS class broke when LinkedIn renamed the class. We detected it by aria-label instead (/messaging.*overlay|compose message/i) and dismissed it before scanning for buttons.
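
The Open Profile decision from the list above reduces to a small classification once the page has been scanned. This is a sketch with hypothetical names (ProfileSignals, classifyProfile); the default badge list contains only the English "1st" from this article, and the localised forms are left as a parameter rather than guessed.

```typescript
type Relationship = 'first_degree' | 'open_profile' | 'not_connected'

interface ProfileSignals {
  hasMessageButton: boolean
  hasConnectButton: boolean
  badgeText: string | null // text of the degree badge, if one was found on the page
}

// Decide how to treat a profile. The badge is checked first, because a Message
// button without a Connect button is ambiguous on its own.
function classifyProfile(
  signals: ProfileSignals,
  firstDegreeBadges: readonly string[] = ['1st'],
): Relationship {
  const badge = (signals.badgeText ?? '').trim().toLowerCase()
  if (firstDegreeBadges.some((b) => badge === b.toLowerCase())) return 'first_degree'
  // Message without Connect and without the badge: an Open Profile, not a connection.
  if (signals.hasMessageButton && !signals.hasConnectButton) return 'open_profile'
  return 'not_connected'
}
```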

The Silent Bug: a Message That Never Sends

This one deserves its own section because it taught us to distrust as any.

sendLinkedInMessage would run to completion, throw nothing, and send no message. No error, no log, no clue.
The code passed an object reference into page.evaluate() and read the result back off that reference afterward, with an as any cast to keep TypeScript quiet.
That is not how Playwright returns values from evaluate(). The reference was always null at runtime, the cast hid the mistake at compile time, and the send logic quietly took the do-nothing branch every time.

The repair was to read the return value directly instead of through a smuggled reference:

const resolvedMsgLabel = await page.evaluate(() => {
  // ...scan the DOM, return the matched label string
}).catch(() => null)

An as any next to anything crossing the page boundary is a place a bug goes to hide. This landed in v14, and it is the one I would have caught fastest with stricter types and slowest with more logging, which is exactly backwards from where I looked first.
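
The failure is easier to see with the page boundary simulated. In the sketch below, evaluateLike is a hypothetical stand-in for page.evaluate(): it copies the argument on the way in and the return value on the way out, which is the relevant part of how values cross into the page. The two lookups are illustrations, not our original functions.

```typescript
// Copy a value the way a serialization boundary would (JSON is a simplification).
function roundTrip<T>(value: T): T {
  return value === undefined ? value : (JSON.parse(JSON.stringify(value)) as T)
}

// Stand-in for page.evaluate(): the callback only ever sees a copy of `arg`.
async function evaluateLike<A, R>(fn: (arg: A) => R, arg: A): Promise<R> {
  return roundTrip(fn(roundTrip(arg)))
}

// Broken pattern: write the result onto an object passed in, then read it outside.
async function brokenLookup(): Promise<string | null> {
  const holder: { label: string | null } = { label: null }
  await evaluateLike((h) => {
    h.label = 'Message' // mutates the copy inside the "page", not `holder`
  }, holder)
  return holder.label // still null
}

// Fixed pattern: return the value and use what evaluate resolves to.
async function fixedLookup(): Promise<string | null> {
  return evaluateLike(() => 'Message', null).catch(() => null)
}
```

Notice that the broken version type-checks without any cast here only because the holder is typed honestly; it was the as any in the original that allowed the wrong shape to go unnoticed.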

When Chrome Just Freezes During Automations

By v17 the pipeline was fully functional, but then it started hanging, and the hangs were worse than crashes because the worker stayed alive and stopped making progress.

The first frozen point was inside page.evaluate(). We had page.setDefaultTimeout(30_000) set and assumed it covered everything.
It does not cover evaluate(). That call runs over a DevTools Protocol method with no built-in timeout, so when Chrome's renderer is paused or deadlocked, the call waits forever. We wrapped individual evaluates in a Promise.race against a manual timeout as a stopgap.
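
A minimal sketch of that stopgap, with withTimeout as a hypothetical helper name. It rejects when the wrapped call has not settled in time and clears its timer when the call finishes first. It does not cancel the hung call, it only stops waiting for it.

```typescript
// Reject if `work` has not settled within `ms`. The timer is cleared once the
// race settles so a finished call does not leave a stray timer behind.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms)
  })
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer))
}

// Usage (not executed here):
//   const label = await withTimeout(page.evaluate(scanForButton), 10_000, 'evaluate')
```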

Then we hit a deeper version of the same thing. page.close() would hang after a successful send. Closing a page sends a close command over the protocol, and in certain states (a request in flight, an internal alert) Chrome's renderer deadlocks waiting on a lock before it acknowledges. The await never returns. Worse, if our per-job timeout fired during that hang, after the message had sent but before we wrote done to the database, the next startup would reset the item to queued and send it again. Duplicate messages to the same person.
The fix for this shipped in v19:

await Promise.race([
  page.close(),
  new Promise((r) => setTimeout(r, 5_000)),
])

If Chrome does not acknowledge the close in five seconds, we move on and abandon the page. The context stays valid for the next lead.

The part that surprised us most came down to a timeout that never fired. Our per-job timeout was built on Promise.race, and during a real CDP deadlock it stayed silent. A race like that only rescues the awaits it actually wraps, and its setTimeout callback needs the event loop to be turning in order to run. Strictly speaking, a pending await does not block the loop by itself, so the likeliest reading is that the hang sat on a path our race did not cover. Either way, the watchdog has to live outside the per-job promise chain.

So we added a process-level watchdog on setInterval. It uses the same libuv timers as setTimeout, so it is not immune to a loop blocked by synchronous code, but it does not depend on any job's promises settling, and it ends the process outright:

const watchdog = setInterval(() => {
  if (Date.now() - pollStartTime > 8 * 60_000) {
    console.error('[watchdog] poll cycle stuck past 8 min, exiting')
    process.exit(1) // Fly restarts the machine on a non-zero exit
  }
}, 30_000)

When a cycle runs past eight minutes, the worker kills itself and Fly's supervisor restarts it.

That restart created one more problem. Startup cleanup reset every in_progress item back to queued, with no memory of how many times it had been tried. If a single lead was what crashed the worker, the worker would process it, die, restart, reset it, and process it again, forever. We fixed it with a two-tier cleanup and an attempt counter: a job that times out at max attempts is marked failed immediately, and startup only re-queues items that still have attempts left. Everything else gets marked failed instead of looping.
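
The two-tier rule can be written as pure functions over queue rows, which makes it easy to test before wiring it to the database. This is a sketch under assumptions: the statuses come from the linkedin_queue table described earlier, but MAX_ATTEMPTS = 3, the function names, and the choice to increment attempts when a job is claimed are all hypothetical.

```typescript
type QueueStatus = 'queued' | 'in_progress' | 'done' | 'failed'

interface QueueItem {
  id: string
  status: QueueStatus
  attempts: number // assumed to be incremented each time a job is claimed
}

const MAX_ATTEMPTS = 3 // hypothetical cap

// Claiming a job counts as an attempt, so a crash mid-job is still remembered.
function claim(item: QueueItem): QueueItem {
  return { ...item, status: 'in_progress', attempts: item.attempts + 1 }
}

// Tier 1: a job that times out at max attempts fails immediately.
function onJobTimeout(item: QueueItem, maxAttempts = MAX_ATTEMPTS): QueueItem {
  return { ...item, status: item.attempts >= maxAttempts ? 'failed' : 'queued' }
}

// Tier 2: on startup, re-queue only in_progress items with attempts left.
function startupCleanup(items: readonly QueueItem[], maxAttempts = MAX_ATTEMPTS): QueueItem[] {
  return items.map((item): QueueItem => {
    if (item.status !== 'in_progress') return item
    return { ...item, status: item.attempts < maxAttempts ? 'queued' : 'failed' }
  })
}
```

Counting the attempt at claim time rather than at completion is what breaks the crash loop: a lead that kills the worker still leaves a higher counter behind.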

How We Post LinkedIn Content With the Same Engine

The outreach side and the content side run on the same foundation, which is the quiet payoff of all the session work above. Once you have a persistent, fingerprint-stable, proxy-fronted Chrome that LinkedIn trusts as you, posting to your own feed is the same machinery pointed at the composer instead of a profile.

That symmetry is the reason we kept outreach and content in one system rather than two. A connection request and a feed post should sound like the same person, because to the reader they are the same person. We define that voice once as a System Persona, strip the generic model vocabulary with the Banned Lexicon the engine enforces at generation time, and keep a human-in-the-loop edit before anything publishes. The feed model demotes content it reads as machine-made, so posting from a pipeline only helps if what you post does not read like it came from one. The full reasoning behind that constraint engine is in how we stop AI slop in production.

Deployment Lessons From Fly.io

fly deploy was taking three to five minutes a shot. The build context was 1.3 GB, because running deploy from the repo root shipped the entire monorepo (workspace node_modules, the .next cache, assets) to the remote builder before any layer cached. Running fly deploy from workers/linkedin/ instead, where node_modules is small, dropped the context to a few megabytes and the deploy to under thirty seconds.

The other lesson was QStash's ten-schedule cap. Campaign creation started failing once we had a handful of campaigns, because the scheduler created a fresh global "check replies" cron for every campaign instead of reusing the one that already existed. Four campaigns, four identical crons, four wasted slots. The fix was to look up the existing reply schedule and reuse its ID rather than create another.
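
The reuse logic is a plain find-or-create. The sketch below keeps the scheduler behind a hypothetical ScheduleApi interface so it runs without any SDK; a real implementation would back list and create with the QStash client or REST API, and the destination URL in the usage comment is a placeholder.

```typescript
interface Schedule {
  scheduleId: string
  destination: string
  cron: string
}

// Hypothetical minimal interface over whatever scheduler client is in use.
interface ScheduleApi {
  list(): Promise<Schedule[]>
  create(input: { destination: string; cron: string }): Promise<{ scheduleId: string }>
}

// Return the ID of the existing schedule for this destination, creating one only if none exists.
async function ensureReplySchedule(api: ScheduleApi, destination: string, cron: string): Promise<string> {
  const existing = (await api.list()).find((s) => s.destination === destination)
  if (existing) return existing.scheduleId
  const created = await api.create({ destination, cron })
  return created.scheduleId
}

// Usage (not executed here):
//   const id = await ensureReplySchedule(api, 'https://example.test/check-replies', '*/15 * * * *')
```

Two campaigns created at the same moment could still both miss the lookup, so this narrows the problem rather than making creation atomic.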

How To Build A LinkedIn Outreach Pipeline, If You Are Starting From Zero

If we were doing this again from scratch, this is the sequence that would have saved the most time:

Get the worker booting first. Node 20 needs the ws transport for Supabase, and your Playwright image tag and package version have to match exactly. Boring, blocking, do it first.
Use domcontentloaded, never networkidle, on LinkedIn. The page is interactive long before the network goes quiet, and on LinkedIn it never does.
Launch one persistent context per user and keep it alive. This is the whole session-stability game. Do not call newContext() per cycle.
Drive real Chrome through Patchright, not stealth-patched Playwright. Protocol-level patching beats JavaScript injection you can be caught doing.
Solve proxy auth with a loopback wrapper. Chromium will not authenticate a proxy for HTTPS. Stop trying to make it.
Scan the DOM for buttons by exact label, in multiple locales. Trust the live DOM over locator chains, and never assume English.
Put a process-level setInterval watchdog outside the per-job promise chain. A per-job Promise.race only rescues the awaits it wraps, so keep a last-resort timer that can exit the process.
Track attempts and fail loudly. A retry loop with no cap will happily send the same person ten messages.

What This Cost, and the Build Versus Buy Call

This took about two weeks and nineteen builds to reach something I trusted with my own account. Most of that was not writing code.

I am not going to pretend the right move for everyone is to build their own headless Chrome worker and keep it alive on Fly. We built ours because outreach is core to how Ozigi reaches its own users and because I wanted one voice across content and outreach instead of stitching three tools together. If you want LinkedIn outreach and feed posting without maintaining any of the above, that is precisely the part we packaged into the Ozigi GTM engine, and you can watch it run on the live demo before deciding. For a side-by-side on where this sits against the alternatives, our best free GTM tool comparison walks through the field.

Whichever way you go, the engineering reality does not change. LinkedIn is built to tell you from a script, and the work is almost entirely about not giving it a reason to.

Frequently Asked Questions

Why does my LinkedIn session expire after a few requests?
Almost always a device-fingerprint mismatch. LinkedIn binds your session token to the machine you logged in from, across roughly forty-eight signals. If your code creates a new browser context per request, each one looks like a different device, and the token gets invalidated after two or three. Use a single persistent browser context per user and keep it alive between requests so the fingerprint stays constant.

Patchright or Playwright for LinkedIn?
Patchright, in our experience. Stealth plugins for Playwright hide the script-driven signals by injecting JavaScript, and the injection itself is detectable over the DevTools Protocol. Patchright patches those signals at the protocol level before any page script runs, and it is a drop-in replacement for Playwright's chromium import. Pair it with real Google Chrome rather than bundled Chromium for a more believable TLS handshake.

Why do I need a residential proxy, and why is the auth so painful?
A residential proxy makes your traffic look like a home connection instead of a data centre, which matters for a behavioural detection layer. The pain is that Chromium cannot authenticate a proxy for HTTPS traffic, over either HTTP CONNECT or SOCKS5. The working pattern is a small local wrapper on loopback that handles authentication to the upstream proxy itself and exposes an unauthenticated endpoint to Chromium.

Why does Playwright hang on page.goto() for LinkedIn?
Because you are probably using waitUntil: 'networkidle'. LinkedIn keeps long-poll and analytics connections open continuously, so the "no network activity" condition never fires. Switch to domcontentloaded. The page is interactive well before the network settles.

Can the same setup post content to my LinkedIn feed?
Yes. Once you have a trusted, fingerprint-stable browser session, posting to your feed uses the same browser control as outreach, pointed at the composer. The catch is the feed's quality model, which demotes content it reads as machine-made, so the writing has to sound like a person regardless of how it was published.

How many connection requests per day is safe?
Keep daily volume low and let replies pause your sequences. LinkedIn enforces a weekly invitation cap (commonly observed somewhere between 100 and 200 invites a week) and watches for bursty behaviour. Modest, steady daily sends with real follow-ups beat hitting any cap. The point of outreach is replies, not volume, which we cover in the go-to-market playbook.
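
One way to enforce "modest and steady" in code is a rolling budget check before each invite. This is a sketch with hypothetical names, and the limits are deliberately left to the caller: the numbers in the usage comment are illustrative, not a threshold LinkedIn publishes.

```typescript
interface InviteLimits {
  perDay: number
  perWeek: number
}

const DAY_MS = 24 * 60 * 60 * 1000

// Count sends inside a trailing window that ends at `now` (timestamps in ms).
function sentWithin(sentAt: readonly number[], now: number, windowMs: number): number {
  return sentAt.filter((t) => t <= now && now - t < windowMs).length
}

// Allow a send only while both the daily and the weekly rolling budgets have room.
function canSendInvite(sentAt: readonly number[], now: number, limits: InviteLimits): boolean {
  return (
    sentWithin(sentAt, now, DAY_MS) < limits.perDay &&
    sentWithin(sentAt, now, 7 * DAY_MS) < limits.perWeek
  )
}

// Usage (illustrative limits, not executed here):
//   if (canSendInvite(previousSendTimes, Date.now(), { perDay: 15, perWeek: 80 })) { /* send */ }
```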

Will building this get my account banned?
It can, if you act like a script: high volume, no delays, a session that keeps changing devices, or generic spam to strangers. The engineering in this article is about looking like a real person to LinkedIn's detection layers. The behaviour on top of it (who you write to, how often, and what you say) is what actually keeps an account healthy. Review LinkedIn's user agreement and decide what you are comfortable with before you start.

Building outreach as a small team and tired of stitching tools together? Ozigi runs sourcing, scoring, LinkedIn and email outreach, and content in one voice on a free tier with no card required. Questions, or want to compare engineering notes? Reach me at hello@ozigi.app.

DEV Community