DEV Community

agenthustler

How to Scrape LinkedIn Connections for Network Analysis

Understanding your professional network can reveal hidden patterns — clusters of industry contacts, potential introductions, and career trajectory insights. In this tutorial, we'll build a Python tool that turns your LinkedIn connection export into an analyzable network graph.

The Approach: Export + Enrich

LinkedIn lets you export your own connections as CSV via Settings > Data Privacy > Get a copy of your data. We'll parse that export and enrich it with publicly available data.

Setting Up

pip install requests pandas networkx matplotlib

Sign up for ScraperAPI to handle request routing when enriching profile data at scale.

Parsing Your LinkedIn Export

import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt

def load_connections(csv_path):
    # LinkedIn prepends a short "Notes:" preamble before the real header,
    # so skip those rows (3 in my export; verify against yours)
    df = pd.read_csv(csv_path, skiprows=3)
    df.columns = [c.strip() for c in df.columns]
    df["Connected On"] = pd.to_datetime(df["Connected On"], errors="coerce")
    return df

df = load_connections("Connections.csv")
print(f"Total connections: {len(df)}")
print(f"Top companies:\n{df['Company'].value_counts().head(10)}")
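The skiprows=3 value matched my export, but the length of LinkedIn's preamble has varied across export versions. A small illustrative helper (find_header_row is my own name, not part of any library) that locates the header instead of hard-coding an offset:

```python
def find_header_row(lines):
    """Return the index of the line holding the real CSV header.

    LinkedIn exports prepend a short "Notes:" preamble whose length
    has changed over time, so detecting the header row is safer than
    hard-coding skiprows.
    """
    for i, line in enumerate(lines):
        # The connections export's header starts with the First Name column
        if line.startswith("First Name,"):
            return i
    raise ValueError("Header row not found in export")
```

You can then pass the result straight to pandas, e.g. `pd.read_csv(path, skiprows=find_header_row(open(path, encoding="utf-8").read().splitlines()))`.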

Enriching Profiles with Public Data

Route the search requests through ScraperAPI so repeated lookups aren't blocked:

import requests
import time
from urllib.parse import quote_plus

SCRAPER_API_KEY = "YOUR_SCRAPERAPI_KEY"

def enrich_profile(name, company):
    # URL-encode the query so spaces, ampersands, etc. survive the round trip
    query = quote_plus(f"{name} {company} site:linkedin.com/in")
    params = {
        "api_key": SCRAPER_API_KEY,
        "url": f"https://www.google.com/search?q={query}",
        "render": "false",
    }
    resp = requests.get("https://api.scraperapi.com", params=params, timeout=60)
    time.sleep(1.5)  # stay well under rate limits
    return resp.text

for _, row in df.head(5).iterrows():
    html = enrich_profile(
        row["First Name"] + " " + row["Last Name"],
        row["Company"]
    )
    print(f"Fetched data for {row['First Name']} ({len(html)} chars)")
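The raw HTML isn't useful by itself; the piece we actually want is the profile URL. A minimal sketch (regex-based, and assuming standard linkedin.com/in/ slugs — result markup varies, so treat matches as candidates) that pulls profile URLs out of the search results:

```python
import re

def extract_profile_urls(html):
    """Pull linkedin.com/in/... profile URLs out of raw search-result HTML."""
    pattern = r"https?://[a-z]{0,3}\.?linkedin\.com/in/[A-Za-z0-9\-_%]+"
    # Deduplicate while preserving the order results appeared in
    seen = []
    for url in re.findall(pattern, html):
        if url not in seen:
            seen.append(url)
    return seen
```

The first match is usually the best candidate, but verifying it against the connection's company is a sensible sanity check before storing it.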

Building the Network Graph

Group connections by company and create a co-affiliation network:

def build_network(df):
    G = nx.Graph()
    # Use full names so connections who share a first name
    # don't collapse into a single node
    df = df.assign(name=df["First Name"].str.cat(df["Last Name"], sep=" "))
    for company, group in df.groupby("Company"):
        if len(group) < 2:
            continue
        people = group["name"].tolist()
        # Link every pair of connections who share an employer
        for i, p1 in enumerate(people):
            for p2 in people[i + 1:]:
                G.add_edge(p1, p2, company=company)
    return G

G = build_network(df)
print(f"Nodes: {G.number_of_nodes()}, Edges: {G.number_of_edges()}")

degrees = sorted(G.degree(), key=lambda x: x[1], reverse=True)
for name, deg in degrees[:10]:
    print(f"  {name}: {deg} shared affiliations")
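Degree only ranks hub contacts. The "bridge connectors" between clusters show up through betweenness centrality instead — a toy sketch on a hand-built graph (in real use you'd pass the G built above):

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Two tight triangles joined by a single edge: Cat and Dee are the bridges.
toy = nx.Graph()
toy.add_edges_from([
    ("Ana", "Ben"), ("Ben", "Cat"), ("Cat", "Ana"),  # cluster 1
    ("Dee", "Eli"), ("Eli", "Flo"), ("Flo", "Dee"),  # cluster 2
    ("Cat", "Dee"),                                  # the bridge
])

# Betweenness = fraction of shortest paths passing through each node;
# nodes linking otherwise-separate clusters score highest
centrality = nx.betweenness_centrality(toy)
bridges = sorted(centrality, key=centrality.get, reverse=True)[:2]
print(bridges)  # Cat and Dee top the ranking

# Community detection recovers the two clusters
communities = greedy_modularity_communities(toy)
print([sorted(c) for c in communities])
```

On a real co-affiliation graph, high-betweenness names are the people most likely to span industries or employers — often the most valuable contacts for introductions.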

Visualizing Clusters

plt.figure(figsize=(14, 10))
pos = nx.spring_layout(G, k=0.5, seed=42)
nx.draw_networkx(
    G, pos,
    node_size=[G.degree(n) * 50 for n in G.nodes()],
    font_size=7, alpha=0.8, edge_color="#cccccc"
)
plt.title("LinkedIn Network: Co-Company Affiliations")
plt.savefig("network.png", dpi=150)

Temporal Growth Analysis

df["month"] = df["Connected On"].dt.to_period("M")
growth = df.groupby("month").size().cumsum()
growth.plot(figsize=(10, 4), title="Network Growth Over Time")
plt.ylabel("Total Connections")
plt.savefig("growth.png")
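Cumulative growth hides slowdowns. A per-month count of new connections makes momentum shifts visible — here sketched on toy dates standing in for the export's Connected On column:

```python
import pandas as pd

# Toy dates in place of the real "Connected On" column
sample = pd.DataFrame({"Connected On": pd.to_datetime(
    ["2023-01-05", "2023-01-20", "2023-02-10", "2023-04-01"])})

# New connections per calendar month (quiet months simply don't appear)
monthly = sample.groupby(sample["Connected On"].dt.to_period("M")).size()
print(monthly)
```

Plotting `monthly` as a bar chart alongside the cumulative curve shows bursts (conferences, job changes) that the running total smooths away.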

Scaling Up

For monitoring large enrichment jobs, ScrapeOps provides dashboards that track success rates across thousands of requests.
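If you'd rather keep monitoring in-process, a tiny success-rate tracker covers the basics (RequestTracker is a hypothetical helper of my own, not the ScrapeOps API):

```python
class RequestTracker:
    """Minimal local success-rate tracker for an enrichment run."""

    def __init__(self):
        self.ok = 0
        self.failed = 0

    def record(self, success):
        # Call once per request with True/False
        if success:
            self.ok += 1
        else:
            self.failed += 1

    @property
    def success_rate(self):
        total = self.ok + self.failed
        return self.ok / total if total else 0.0

tracker = RequestTracker()
for status in (200, 200, 500, 200):
    tracker.record(status == 200)
print(f"{tracker.success_rate:.0%}")  # 75%
```

Wrapping `enrich_profile` calls with `tracker.record(resp.status_code == 200)` gives you a running health signal without any external dashboard.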

Key Takeaways

  • LinkedIn's CSV export gives you structured connection data to analyze
  • NetworkX reveals hidden clusters and bridge connectors
  • Proxy services like ScraperAPI make enrichment reliable
  • Temporal analysis shows networking momentum and patterns

Your professional network is a dataset waiting to be explored. Start with the export, build the graph, and discover the patterns.


This tutorial uses your own LinkedIn data export. Always respect platform terms of service and rate limits.
