DEV Community

AMAAN SARFARAZ

Building a LinkedIn Lead Gen Agent That Actually Works

How I built an AI agent to extract premium LinkedIn members using Agno and ConnectSafely.ai - with real code, real numbers, and lessons learned the hard way

Built an AI agent that extracts LinkedIn Premium members from groups in ~90 seconds. No scraping, no browser automation, no account bans. Using Agno (agent framework) + ConnectSafely.ai (compliant LinkedIn API).

Full code on GitHub 👈


The Problem: I Wasted 6 Hours Clicking LinkedIn Profiles

You know that feeling when you need to find decision-makers in a LinkedIn group, and you end up clicking through profiles one by one like it's 2005?

Yeah. That was me last month.

I needed to extract all Premium/Verified members from a 5,000-person LinkedIn group. For context, these are usually:

  • Founders with actual budgets
  • VPs who can say "yes" to deals
  • Verified professionals investing in their careers

Manual approach: 6 hours of clicking → 47 contacts → severe mouse fatigue → existential crisis about my career choices

What I needed:

  • Auto-fetch thousands of members ✅
  • Identify Premium/Verified badges ✅
  • Filter out noise ✅
  • Export to Google Sheets ✅
  • Not get my LinkedIn account banned ✅✅✅

Spoiler: I built it. Let me show you how.


Why Not Just Scrape It?

Short answer: Because I like my LinkedIn account.

Long answer: I tried the "scraping route" first, and it turned out to be a technical dead end for LinkedIn automation:

  • Selectors break with every UI update
  • Rate limiting is aggressive and unpredictable
  • Account bans are permanent
  • Legal gray area (or just illegal)

There had to be a better way.


The Stack: Agno + ConnectSafely.ai

After burning a week on dead ends, I landed on this architecture:

  • 🤖 Agno: AI agent framework
  • 🔗 ConnectSafely.ai: LinkedIn automation API (compliant, stable, actually works)
  • 📊 Google Sheets API: for export and CRM integration

Why Agno?

Most agent frameworks want you to build elaborate multi-agent systems with "Product Manager Agent," "Engineer Agent," "QA Agent" all chatting via prompts.

Cool for demos. Terrible for production.

Agno is different:

  • One agent that orchestrates tools
  • Explicit workflows (no emergent behavior)
  • Tool-first design (agent is the glue, not the worker)

Think of it as hiring one senior dev instead of managing a committee of junior agents.

Why ConnectSafely.ai?

Because I don't want to:

  • Reverse-engineer LinkedIn's private APIs
  • Manage session cookies and CSRF tokens
  • Handle rate limiting logic
  • Debug why profiles stopped loading
  • Get banned

ConnectSafely gives me:

```python
# This just works™
members = connectsafely.fetch_group_members(group_id)
```

No scraping. No browser automation. Just data.


Architecture: Keep It Stupid Simple

Here's the folder structure:

```text
agentic/agno/
├── app.py                     # Entry point
├── agents/
│   └── agent.py               # Agno agent config
│
├── config/
│   ├── agent_setup.py
│   ├── workflows.py
│   └── constants.py
│
├── tools/
│   ├── linkedin/
│   │   ├── fetch_group_members_tool.py
│   │   ├── filter_premium_tool.py
│   │   └── workflow_tool.py
│   │
│   └── googlesheet/
│       ├── sheets_tool.py
│       └── export_tool.py
│
└── README.md
```

Design principle: Each tool does ONE thing. No god classes. No hidden coupling.


The Agent: Configured Once, Used Forever

Here's the agent setup:

```python
from agno import Agent
from agno.llms import Gemini

agent = Agent(
    name="LinkedInPremiumExtractor",
    description=(
        "You extract LinkedIn group members via ConnectSafely.ai, "
        "filter for Premium/Verified users, and export to Google Sheets. "
        "You are precise, efficient, and never scrape."
    ),
    llm=Gemini(model="gemini-3-pro-preview"),
    tools=[
        fetch_group_members_tool,
        filter_premium_verified_tool,
        export_to_sheets_tool,
    ],
    show_tool_calls=True,  # Debug mode
    markdown=True,
)
```

What makes this work:

  • Agent knows exactly what success looks like
  • Tools are explicitly listed (no discovery overhead)
  • LLM is used for orchestration only (not data processing)
  • One agent, zero delegation chaos

Tool #1: Fetch Members (The Right Way)

```python
import os

from connectsafely import ConnectSafelyClient

def fetch_group_members_tool(group_id: str) -> list:
    """
    Fetch all members from a LinkedIn group.
    Handles pagination automatically.
    """
    client = ConnectSafelyClient(api_key=os.getenv("CONNECTSAFELY_API_KEY"))

    response = client.groups.fetch_members(
        group_id=group_id,
        pagination=True,
        include_badges=True,  # We need Premium/Verified status
        timeout=30,
    )

    return response.members
```

What this handles automatically:

  • ✅ Pagination for group members
  • ✅ Rate limiting (respects LinkedIn's limits)
  • ✅ Retries on network failures
  • ✅ Session management

What I don't have to worry about:

  • ❌ CSS selectors
  • ❌ Browser fingerprinting
  • ❌ Cookie management
  • ❌ Getting banned

Tool #2: Filter Premium Members

This tool isolates the filtering logic:

```python
def filter_premium_verified_tool(members: list) -> list:
    """
    Filter members to only Premium/Verified accounts.
    """
    filtered = []

    for member in members:
        is_premium = member.get("isPremium", False)
        is_verified = member.get("isVerified", False)
        has_badge = "premium" in member.get("badges", [])

        if is_premium or is_verified or has_badge:
            filtered.append(member)

    return filtered
```

Why isolate this?

  • Easy to test in isolation
  • Reusable across workflows
  • Simple to extend (add location, industry filters)
  • Clear single responsibility

Want to filter by geography?

```python
if member.get("location") == "San Francisco, CA":
    filtered.append(member)
```

Done.
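Stacking filters is just more conditions. Here's a sketch that layers optional location/industry narrowing on top of the Premium/Verified check (the `location` and `industry` fields are illustrative; real member payloads may name them differently):

```python
def filter_members(members, location=None, industry=None):
    """Premium/Verified filter with optional location/industry narrowing."""
    out = []
    for m in members:
        premium_or_verified = (
            m.get("isPremium", False)
            or m.get("isVerified", False)
            or "premium" in m.get("badges", [])
        )
        if not premium_or_verified:
            continue  # drop free accounts first
        if location and m.get("location") != location:
            continue
        if industry and m.get("industry") != industry:
            continue
        out.append(m)
    return out
```

Each filter stays a pure function over dicts, so adding a criterion never touches the fetch or export code.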


Tool #3: Export to Google Sheets

Final step: getting data into a usable format.

```python
from googleapiclient.discovery import build

def export_to_sheets_tool(members: list, sheet_id: str):
    """
    Export filtered members to Google Sheets.
    Assumes `creds` (Google OAuth credentials) is available from your auth flow.
    """
    service = build('sheets', 'v4', credentials=creds)

    # Prepare data
    rows = [["Name", "Headline", "Profile URL", "Status", "Company"]]

    for m in members:
        # Badge-only members count as Premium too, not just the isPremium flag
        is_premium = m.get("isPremium") or "premium" in m.get("badges", [])
        rows.append([
            m.get("name", "N/A"),
            m.get("headline", "N/A"),
            m.get("profileUrl", "N/A"),
            "Premium" if is_premium else "Verified",
            m.get("company", "N/A"),
        ])

    # Write to sheet
    service.spreadsheets().values().update(
        spreadsheetId=sheet_id,
        range="Sheet1!A1",
        valueInputOption="RAW",
        body={"values": rows}
    ).execute()

    return {"status": "success", "rows": len(rows)}
```

Output is CRM-ready:

  • Clean column headers
  • No manual formatting needed
  • Direct import to HubSpot/Salesforce
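If you'd rather skip the Google OAuth setup for a quick run, the same rows serialize to a CSV that HubSpot/Salesforce import directly. A stdlib-only sketch (this fallback is my addition, not part of the original tool):

```python
import csv
import io

def rows_to_csv(rows):
    """Serialize the header + member rows into one CSV string for CRM import."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()

# Same row shape as export_to_sheets_tool builds (sample data)
rows = [
    ["Name", "Headline", "Profile URL", "Status", "Company"],
    ["Jane Doe", "VP Sales", "https://linkedin.com/in/janedoe", "Premium", "Acme"],
]
csv_text = rows_to_csv(rows)
```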

The Workflow: No Black Box Magic

Here's the beautiful part—the workflow is explicit:

```python
def run_extraction_workflow(group_id: str, sheet_id: str):
    # Step 1: Fetch all members
    print("Fetching members...")
    members = fetch_group_members_tool(group_id)
    print(f"✅ Fetched {len(members)} members")

    # Step 2: Filter Premium/Verified
    print("Filtering for Premium/Verified...")
    premium = filter_premium_verified_tool(members)
    print(f"✅ Found {len(premium)} Premium/Verified members")

    # Step 3: Export to Sheets
    print("Exporting to Google Sheets...")
    result = export_to_sheets_tool(premium, sheet_id)
    print(f"✅ Exported {result['rows']} rows")

    return result
```

No hidden agent chatter.

No task confusion.

Just execution.

You can read the code and know exactly what happens. Try doing that with 5 LLM agents delegating tasks via prompts.


Real Numbers: Production Performance

From actual production runs:

```text
┌─────────────────────────────────────┐
│ Performance Metrics                 │
├─────────────────────────────────────┤
│ Extraction Speed:  ~50 members/sec  │
│ Accuracy:          98% detection    │
│ Max Group Size:    10,000+ members  │
│ API Uptime:        99.9%            │
└─────────────────────────────────────┘
```

Example run:

```text
$ uv run streamlit run App.py

Fetching members...
✅ Fetched 3,847 members (76s)

Filtering for Premium/Verified...
✅ Found 412 Premium/Verified members (10.7% conversion)

Exporting to Google Sheets...
✅ Exported 413 rows (4s)

Total time: 80 seconds
```

That's 412 high-quality leads in under 90 seconds.

Manual approach? 6-8 hours of soul-crushing clicking.


Lessons Learned (The Hard Way)

✅ What Worked

1. Single Agent > Agent Swarm

One agent with good tools beats 5 agents arguing via prompts.

2. APIs > Scraping (Always)

Scraping is technical debt from day one. APIs scale.

3. Tool Isolation = Easy Debugging

When tools do one thing, debugging is trivial. When they're coupled, it's hell.
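Isolation also means every tool gets a plain unit test with fixture dicts, no mocks or API keys. A sketch (the filter is a condensed copy of Tool #2, inlined so the test is self-contained):

```python
def filter_premium_verified_tool(members):
    # Condensed copy of Tool #2 for self-containment
    return [
        m for m in members
        if m.get("isPremium", False)
        or m.get("isVerified", False)
        or "premium" in m.get("badges", [])
    ]

def test_filter_keeps_only_premium_or_verified():
    members = [
        {"name": "A", "isPremium": True},
        {"name": "B", "isVerified": True},
        {"name": "C", "badges": ["premium"]},
        {"name": "D"},  # free account: should be dropped
    ]
    result = filter_premium_verified_tool(members)
    assert [m["name"] for m in result] == ["A", "B", "C"]

test_filter_keeps_only_premium_or_verified()
```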

🔧 What Was Painful

1. Google OAuth Dance

Refresh tokens, scopes, service accounts... took longer than the actual agent code.

2. Agent Prompts Matter More Than You Think

Vague description = vague results. Be surgical with your agent's purpose.

3. Large Groups Need Chunking

5k members? You'll hit memory limits. Process in batches.
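Batching is a ten-line helper. A sketch of the chunking I'd bolt between fetch and filter (the batch size is illustrative, tune it to your memory budget):

```python
def chunked(items, size=500):
    """Yield successive batches so a 5k-member list never sits in memory twice."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# e.g. a 5,000-member group becomes ten 500-member batches,
# each fed to the filter/export tools separately
batches = list(chunked(list(range(5000)), size=500))
```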

🐛 Bugs I Hit (So You Don't Have To)

```python
# Bug 1: Forgot to handle missing badges
# Fix: Use .get() with defaults
badges = member.get("badges", [])

# Bug 2: Rate limits on Sheets API
# Fix: Batch updates, don't write row-by-row
service.spreadsheets().values().batchUpdate(...)

# Bug 3: Agent hallucinating tool parameters
# Fix: Use Pydantic models for strict typing
from pydantic import BaseModel, Field

class FetchMembersInput(BaseModel):
    group_id: str = Field(..., description="LinkedIn group ID")
```

What's Next: Future Roadmap

Planning these upgrades:

```python
# 🔄 CRM Sync
def sync_to_hubspot(members: list):
    # Direct HubSpot/Salesforce integration
    pass

# 🔔 Real-time Monitoring
def watch_new_premium_members(group_id: str):
    # Webhook alerts for new Premium joins
    pass

# 📦 Batch Processing
def process_multiple_groups(group_ids: list):
    # Analyze 10+ groups in parallel
    pass

# 🧠 AI Lead Scoring
def score_leads(members: list):
    # Rank prospects by profile content
    pass
```
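Of these, batch processing is the closest to done: it's mostly a thread pool wrapped around the existing single-group workflow. A sketch with a stubbed per-group function (the stub stands in for the real fetch/filter/export pipeline):

```python
from concurrent.futures import ThreadPoolExecutor

def process_group(group_id):
    # Stub: in the real tool this would run the full extraction workflow
    return {"group_id": group_id, "status": "done"}

def process_multiple_groups(group_ids, max_workers=5):
    """Run the per-group workflow concurrently; output order matches input."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_group, group_ids))

results = process_multiple_groups(["group-a", "group-b", "group-c"])
```

Threads fit here because each group run is I/O-bound (API calls), so the GIL isn't the bottleneck.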

Real-World Use Cases

This is already being used by:

  • Sales teams: Targeting enterprise decision-makers
  • Recruiters: Sourcing verified engineers/designers
  • Marketers: Building ABM target lists
  • Founders: Partnership research and networking
  • Event organizers: Finding keynote speakers

Try It Yourself

Want to build your own version?

📂 GitHub: ConnectSafelyAI/agentic-framework-examples

📚 Docs: connectsafely.ai/docs

🔑 API Keys: https://connectsafely.ai/docs/api

Quick Start

```bash
# Clone the repo
git clone https://github.com/ConnectSafelyAI/agentic-framework-examples
cd extract-linkedin-premium-users-from-linkedin-groups/agentic/agno

# Install dependencies
uv sync

# Set up environment
cp .env.example .env
# Add your ConnectSafely API key

# Run it
uv run streamlit run App.py
```

Resources & Support

📧 Email: support@connectsafely.ai

📖 Documentation: connectsafely.ai/docs

💼 Custom Workflows: Contact for enterprise automation

Follow us for more automation tips:

LinkedIn · YouTube · Instagram · Facebook · X · Bluesky · Mastodon


Key Takeaways

  1. Agent ≠ Complexity: One smart agent beats agent sprawl
  2. Tools > Prompts: Give your agent good tools, not good vibes
  3. APIs > Scraping: Always. No exceptions.
  4. Compliance Matters: Build systems you can actually run in production

Building something cool with AI agents? Drop it in the comments—I'd love to see what you're working on!

Hit me up if you have questions or want to chat about agent architectures. Always down to nerd out about this stuff. 🤓
