DEV Community

AMAAN SARFARAZ
AMAAN SARFARAZ

Posted on

Export LinkedIn Data to Google Sheets with CrewAI

CrewAI makes building AI agents intuitive with its task-oriented approach. In this tutorial, we'll build an agent that searches LinkedIn and exports results to Google Sheets.

What We're Building

A CrewAI agent that:

  • Searches LinkedIn profiles via natural language
  • Exports results to Google Sheets or JSON
  • Remembers context between commands
  • Runs in a Streamlit web interface

Why CrewAI?

CrewAI shines for:

  • Task-Based Design: Think in terms of jobs to be done
  • Built-in Memory: Automatic context management
  • Tool Decorators: Simple @tool syntax
  • Rapid Prototyping: Get agents running fast

Prerequisites

  • Python 3.10+
  • ConnectSafely.ai API token
  • Google Gemini API key
  • Google Sheets credentials (optional)

Project Structure

crewai/
β”œβ”€β”€ App.py              # Streamlit UI
β”œβ”€β”€ crew.py             # CrewAI wrapper
β”œβ”€β”€ workflows.py        # Command handlers
β”œβ”€β”€ pyproject.toml      # Dependencies
β”œβ”€β”€ agents/
β”‚   └── agents.py       # Agent definition
└── tools/
    β”œβ”€β”€ search_people_tool.py
    └── googlesheet/                 # Google Sheets export module
        β”œβ”€β”€ auth.py                  # OAuth authentication
        β”œβ”€β”€ client.py                # Google Sheets API client
        └── export_to_sheets.py      # Export tool
    └── export_to_json_tool.py
Enter fullscreen mode Exit fullscreen mode

Step 1: Setup

cd linkedin-to-sheets-export/agentic/crewai
cp .env.example .env
# Add your API keys to .env
uv sync
Enter fullscreen mode Exit fullscreen mode

Step 2: Create Tools with @tool Decorator

CrewAI uses decorators to define tools:

# tools/search_people_tool.py
import os
import requests
from crewai.tools import tool

@tool("Search LinkedIn People")
def search_people(
    keywords: str,
    location: str = None,
    title: str = None,
    limit: int = 100
) -> dict:
    """
    Search for LinkedIn profiles using ConnectSafely.ai API.

    Args:
        keywords: Search terms like "CEO SaaS" or "Software Engineer"
        location: Geographic filter like "San Francisco" or "United States"
        title: Job title filter like "Head of Growth"
        limit: Maximum results to return (default: 100)

    Returns:
        Dictionary with people list and count
    """
    api_token = os.getenv("CONNECTSAFELY_API_TOKEN")

    response = requests.post(
        "https://api.connectsafely.ai/linkedin/search/people",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        json={
            "keywords": keywords,
            "location": location,
            "title": title,
            "limit": min(limit, 100),
        },
        timeout=60,
    )

    data = response.json()
    if isinstance(data, list):
        data = data[0] if data else {}

    people = data.get("people", [])

    formatted = [{
        "profileUrl": p.get("profileUrl", ""),
        "fullName": f"{p.get('firstName', '')} {p.get('lastName', '')}".strip(),
        "headline": p.get("headline", ""),
        "location": p.get("location", ""),
    } for p in people]

    return {
        "success": True,
        "people": formatted,
        "count": len(formatted)
    }
Enter fullscreen mode Exit fullscreen mode

Step 3: Export Tools

The export tool is modularized into separate files:

# tools/googlesheet/auth.py
import os
import requests

def get_access_token() -> str:
    """Get Google OAuth access token from refresh token."""
    response = requests.post(
        "https://oauth2.googleapis.com/token",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        data={
            "client_id": os.getenv("GOOGLE_CLIENT_ID"),
            "client_secret": os.getenv("GOOGLE_CLIENT_SECRET"),
            "refresh_token": os.getenv("GOOGLE_REFRESH_TOKEN"),
            "grant_type": "refresh_token",
        },
    )
    return response.json()["access_token"]
Enter fullscreen mode Exit fullscreen mode
# tools/googlesheet/client.py
import requests
from .auth import get_access_token

class GoogleSheetsClient:
    """Wrapper for Google Sheets API operations."""

    def create_spreadsheet(self, title: str, sheet_name: str, headers: list):
        # Creates spreadsheet with OAuth authentication
        pass

    def append_data(self, spreadsheet_id: str, sheet_name: str, values: list):
        # Appends rows using OAuth
        pass
Enter fullscreen mode Exit fullscreen mode
# tools/googlesheet/export_to_sheets.py
from crewai.tools import tool
from .client import GoogleSheetsClient

@tool("Export to Google Sheets")
def export_to_sheets(
    people: list,
    spreadsheet_id: str = None,
    spreadsheet_title: str = None,
    sheet_name: str = "LinkedIn People"
) -> dict:
    """Export LinkedIn profiles to Google Sheets using OAuth authentication."""
    client = GoogleSheetsClient()
    # Uses OAuth authentication
    # Handles spreadsheet creation and duplicate detection
Enter fullscreen mode Exit fullscreen mode

Step 4: Define the Agent

# agents/agents.py
from crewai import Agent
from langchain_google_genai import ChatGoogleGenerativeAI

from tools.search_people_tool import search_people
from tools.googlesheet.export_to_sheets import export_to_sheets
from tools.export_to_json_tool import export_to_json

def create_export_agent():
    """Create the LinkedIn Export Agent."""

    llm = ChatGoogleGenerativeAI(
        model="gemini-2.5-pro",
        temperature=0.1,
    )

    return Agent(
        role="LinkedIn Export Specialist",
        goal="Search LinkedIn profiles and export results to Google Sheets or JSON files",
        backstory="""You are an expert at finding professional profiles on LinkedIn
        and organizing the data for export. You understand search parameters,
        can interpret user requests, and efficiently export data to spreadsheets.""",
        tools=[search_people, export_to_sheets, export_to_json],
        llm=llm,
        memory=True,
        verbose=True,
    )
Enter fullscreen mode Exit fullscreen mode

Step 5: Create the Crew Wrapper

# crew.py
from crewai import Crew, Task
from agents.agents import create_export_agent

class LinkedInExportCrew:
    def __init__(self):
        self.agent = create_export_agent()
        self.search_results = []

    def execute(self, command: str, context: str = None) -> dict:
        """Execute a command with the crew."""

        # Build task description
        task_desc = command
        if context:
            task_desc = f"{context}\n\nCurrent request: {command}"

        task = Task(
            description=task_desc,
            expected_output="A summary of the action taken and results",
            agent=self.agent,
        )

        crew = Crew(
            agents=[self.agent],
            tasks=[task],
            verbose=True,
        )

        result = crew.kickoff()

        return {
            "success": True,
            "result": str(result)
        }
Enter fullscreen mode Exit fullscreen mode

Step 6: Build the UI

# App.py
import os
import streamlit as st
from dotenv import load_dotenv
from crew import LinkedInExportCrew

load_dotenv()

st.set_page_config(page_title="LinkedIn Export - CrewAI", page_icon="πŸ“Š")
st.title("πŸ“Š LinkedIn to Sheets Export")
st.caption("Powered by CrewAI & ConnectSafely.ai")

# Initialize
if "crew" not in st.session_state:
    st.session_state.crew = LinkedInExportCrew()

if "messages" not in st.session_state:
    st.session_state.messages = []

if "context" not in st.session_state:
    st.session_state.context = ""

# Examples
with st.expander("Example Commands"):
    st.markdown("""
    - `Search for 50 CEOs in New York`
    - `Find software engineers in San Francisco`
    - `Export results to Google Sheets`
    - `Save as JSON`
    """)

# Chat history
for msg in st.session_state.messages:
    st.chat_message(msg["role"]).write(msg["content"])

# Input
if command := st.chat_input("Enter command..."):
    st.session_state.messages.append({"role": "user", "content": command})
    st.chat_message("user").write(command)

    with st.chat_message("assistant"):
        with st.spinner("CrewAI working..."):
            result = st.session_state.crew.execute(
                command=command,
                context=st.session_state.context
            )

            response = result.get("result", "")
            st.markdown(response)

            st.session_state.messages.append({
                "role": "assistant",
                "content": response
            })

            # Update context
            st.session_state.context += f"\nUser: {command}\nAgent: {response}\n"
Enter fullscreen mode Exit fullscreen mode

Step 7: Run

uv run streamlit run App.py
Enter fullscreen mode Exit fullscreen mode

Key CrewAI Patterns

1. Tool Decorators

The @tool decorator is the cleanest way to define tools:

@tool("Tool Name")
def my_tool(param: str) -> dict:
    """Docstring becomes the tool description."""
    return {"result": "value"}
Enter fullscreen mode Exit fullscreen mode

2. Agent Memory

Enable memory for context retention:

Agent(
    # ...
    memory=True,  # Enables conversation memory
)
Enter fullscreen mode Exit fullscreen mode

3. Task-Based Execution

Each command becomes a task:

task = Task(
    description="What to do",
    expected_output="What success looks like",
    agent=agent,
)
Enter fullscreen mode Exit fullscreen mode

Extending the Agent

Add Multiple Agents

researcher = Agent(role="Researcher", ...)
exporter = Agent(role="Data Exporter", ...)

crew = Crew(
    agents=[researcher, exporter],
    tasks=[research_task, export_task],
    process=Process.sequential,
)
Enter fullscreen mode Exit fullscreen mode

Add Custom Memory

from crewai.memory import LongTermMemory

agent = Agent(
    # ...
    memory=True,
    long_term_memory=LongTermMemory(),
)
Enter fullscreen mode Exit fullscreen mode

Troubleshooting

"Tool not found"

  • Ensure tools are passed to the Agent constructor
  • Check tool function names match what agent expects

Slow execution

  • CrewAI has overhead; consider caching results
  • Use verbose=False in production

Memory issues

  • Context can grow large; implement truncation
  • Clear context periodically

Resources


Questions? Drop them in the comments!

Top comments (0)