In B2B sales, cold outreach is getting harder. Response rates are dropping because everyone is using the same automated tools to spam the same lists.
The secret to high-converting B2B outreach is intent.
If someone comments on a viral LinkedIn post about "The struggles of managing AWS infrastructure," and you sell an AWS management tool—that person is a warm lead.
In this tutorial, I'll show you how to build a Python script that takes a viral LinkedIn post URL, extracts every person who commented on it, and exports their profile data to a CSV for highly targeted outreach.
The Problem with LinkedIn Scraping
LinkedIn has some of the most aggressive anti-scraping defenses of any major platform. If you try to scrape it directly with Selenium or BeautifulSoup, your account is likely to be restricted, or permanently banned, within hours.
To do this safely, we will use the SociaVault API. It handles all the proxy rotation, headless browser management, and CAPTCHA solving on the backend. You just make a simple API call.
Prerequisites
- Python 3.8+
- `requests` library
- `pandas` library
- A SociaVault API key (get 1,000 free credits at sociavault.com)

```bash
pip install requests pandas
```
Step 1: The Script Setup
Create a file called `linkedin_scraper.py`.
```python
import requests
import pandas as pd
import time

API_KEY = 'your_sociavault_api_key'
BASE_URL = 'https://api.sociavault.com/v1/linkedin'

headers = {
    'Authorization': f'Bearer {API_KEY}',
    'Content-Type': 'application/json'
}
```
Step 2: Extracting the Post ID
LinkedIn URLs look like this:

```
https://www.linkedin.com/posts/username_this-is-the-post-title-activity-7165432109876543210-AbCd
```

The actual Post ID is the 19-digit number (`7165432109876543210`). Let's write a quick helper to extract it.
```python
import re

def extract_post_id(url):
    # Standard share URL: ...-activity-<digits>-<suffix>
    match = re.search(r'activity-(\d+)-', url)
    if match:
        return match.group(1)
    # Handle alternative URL formats (URN style)
    match = re.search(r'urn:li:activity:(\d+)', url)
    if match:
        return match.group(1)
    raise ValueError("Could not find Post ID in URL")
```
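A quick smoke test shows both URL shapes yield the same ID (the helper is repeated here so the snippet runs standalone; the URN-style URL is an assumed example of LinkedIn's alternative format):

```python
import re

# Repeating the helper so this snippet runs on its own
def extract_post_id(url):
    match = re.search(r'activity-(\d+)-', url)
    if match:
        return match.group(1)
    match = re.search(r'urn:li:activity:(\d+)', url)
    if match:
        return match.group(1)
    raise ValueError("Could not find Post ID in URL")

share_url = "https://www.linkedin.com/posts/username_this-is-the-post-title-activity-7165432109876543210-AbCd"
urn_url = "https://www.linkedin.com/feed/update/urn:li:activity:7165432109876543210"

assert extract_post_id(share_url) == "7165432109876543210"
assert extract_post_id(urn_url) == "7165432109876543210"
print("Both URL formats parse to the same Post ID")
```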
Step 3: Fetching the Commenters
Now we'll hit the SociaVault endpoint to get the comments for that specific post.
```python
def get_post_commenters(post_id, max_comments=100):
    print(f"Fetching comments for post {post_id}...")
    leads = []
    try:
        response = requests.get(
            f"{BASE_URL}/post/comments",
            headers=headers,
            params={
                'post_id': post_id,
                'limit': max_comments
            }
        )
        if response.status_code == 200:
            comments = response.json().get('data', [])
            for comment in comments:
                author = comment.get('author', {})
                # We only want real people, not company pages
                if author.get('type') == 'USER':
                    leads.append({
                        'Full Name': author.get('name'),
                        'Headline': author.get('headline'),  # e.g., "CTO at TechCorp"
                        'Profile URL': author.get('profile_url'),
                        'Comment Text': comment.get('text'),
                        'Engagement': comment.get('likes_count')
                    })
            return leads
        else:
            print(f"API Error: {response.text}")
            return []
    except Exception as e:
        print(f"Request failed: {e}")
        return []
```
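One practical note: scraping APIs commonly answer bursts of requests with HTTP 429 (rate limited). A small retry-with-backoff wrapper keeps the script resilient. This is a generic sketch, not part of the SociaVault API; it only assumes the response object exposes `status_code`, as `requests` responses do:

```python
import time

def with_retries(fn, max_attempts=3, base_delay=1.0):
    """Call fn() until it returns a non-429 response, with exponential backoff."""
    response = None
    for attempt in range(max_attempts):
        response = fn()
        if getattr(response, 'status_code', None) != 429:
            return response
        # Rate limited: wait 1s, 2s, 4s, ... before retrying
        time.sleep(base_delay * (2 ** attempt))
    return response  # still rate limited after all attempts
```

You'd wrap the call in `get_post_commenters` like so: `response = with_retries(lambda: requests.get(...))`.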
Step 4: Filtering and Exporting
We don't just want a raw list. We want to filter out people who left low-value comments like "CFBR" (Commenting for better reach) or "Following". We want people who actually engaged with the topic.
```python
def process_leads(leads, output_filename):
    if not leads:
        print("No leads found.")
        return

    df = pd.DataFrame(leads)

    # Guard against missing comment text before the string operations below
    df['Comment Text'] = df['Comment Text'].fillna('')

    # Remove duplicates (if someone commented twice)
    df = df.drop_duplicates(subset=['Profile URL'])

    # Filter out low-value comments (10 characters or fewer)
    df = df[df['Comment Text'].str.len() > 10]

    # Filter out common engagement pod phrases (exact matches, case-insensitive)
    spam_phrases = ['cfbr', 'following', 'great post', 'agree']
    df = df[~df['Comment Text'].str.lower().str.strip().isin(spam_phrases)]

    print(f"\nExtracted {len(df)} high-quality leads!")

    # Show a preview
    for index, row in df.head(3).iterrows():
        print("-" * 50)
        print(f"Name: {row['Full Name']}")
        print(f"Headline: {row['Headline']}")
        print(f"Comment: {row['Comment Text'][:100]}...")

    # Export to CSV
    df.to_csv(output_filename, index=False)
    print(f"\nSaved leads to {output_filename}")


# Run the script
if __name__ == "__main__":
    # Example viral post URL
    target_url = "https://www.linkedin.com/posts/example_post-activity-7165432109876543210-AbCd"
    try:
        post_id = extract_post_id(target_url)
        leads = get_post_commenters(post_id, max_comments=200)
        process_leads(leads, "linkedin_warm_leads.csv")
    except Exception as e:
        print(f"Error: {e}")
```
The Outreach Strategy
Now you have a CSV file named linkedin_warm_leads.csv containing the names, job titles, and profile URLs of people who engaged with a specific topic.
Instead of sending a generic cold message, you can send a highly personalized connection request:
"Hey [Name], saw your comment on [Author]'s post about AWS infrastructure. I completely agree with your point about [Reference their comment]. We actually built a tool that solves exactly that. Open to connecting?"
In my experience, this approach can see acceptance rates in the 40-50% range, compared to the roughly 5% typical of standard cold outreach.
Scale Your Lead Gen
If you want to scale this, you can use SociaVault to automate the entire pipeline. You can search for posts by keyword, extract the commenters, and feed them directly into your CRM via API.
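To illustrate the shape of that pipeline: the keyword-search and CRM endpoints are hypothetical here, so this sketch takes each stage as a plain function you'd wire to the real API calls, and deduplicates leads by profile URL across posts:

```python
def run_pipeline(search_posts, get_commenters, push_to_crm, keyword, max_posts=5):
    """keyword -> post IDs -> commenters -> CRM, deduplicated by profile URL."""
    seen = set()
    pushed = 0
    for post_id in search_posts(keyword)[:max_posts]:
        for lead in get_commenters(post_id):
            url = lead.get('Profile URL')
            if url and url not in seen:
                seen.add(url)
                push_to_crm(lead)   # e.g. a POST to your CRM's contact endpoint
                pushed += 1
    return pushed
```

Here `get_commenters` would be the `get_post_commenters` function from earlier; `search_posts` and `push_to_crm` are whatever keyword-search and CRM integrations you have available.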
Get your free API key at SociaVault.com and start building your intent-based lead machine today.