Building Deep Agents: A Journey into Autonomous Person Profiling
How I built an AI system that can research anyone on the internet and create comprehensive profiles - and what I learned along the way
The Problem That Started It All
Ever found yourself needing to research someone quickly? Maybe it's a potential business partner, a job candidate, or someone you're about to meet at a conference. You know there's information out there, but finding it all is like trying to drink from a firehose.
That's exactly the problem I set out to solve when I started building Deep Agents - an autonomous AI system that can gather and synthesize public information about any person into a comprehensive, professional profile.
What Deep Agents Actually Does
Imagine having a team of specialized researchers working for you 24/7. That's essentially what Deep Agents is:
Personal Information Agent: Finds basic details like name, location, contact info
Professional Background Agent: Digs up career history, job titles, companies
Academic Background Agent: Researches education, degrees, certifications
Social Presence Agent: Uncovers social media profiles and online activities
Profile Synthesis Agent: Combines everything into a coherent narrative
Report Generation Agent: Creates professional PDF reports
The system runs these agents either sequentially (one after another) or in parallel (all at once) depending on your needs.
The Parallel vs Sequential Dilemma
Here's where things got interesting. Initially, I built everything to run sequentially - one agent after another. It worked, but it was slow. A full profile could take 4-6 minutes.
Then I thought: "What if I ran the search agents in parallel?"
The results were dramatic:
Sequential: 4-6 minutes
Parallel: 1-2 minutes
But it wasn't all sunshine and rainbows. Parallel execution is more complex, uses more resources, and can hit API rate limits faster. It's like the difference between cooking one dish at a time versus running multiple burners simultaneously.
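The pattern behind those numbers is plain asyncio. Here is a minimal, self-contained sketch of the two orchestration styles, with a fake_agent coroutine standing in for a real search agent (the names and sleep times are purely illustrative):

import asyncio
import time

async def fake_agent(name: str, seconds: float) -> str:
    # Placeholder for a real search agent; sleeps to simulate API latency
    await asyncio.sleep(seconds)
    return f"{name} done"

async def sequential() -> list:
    # One agent after another: total time is the sum of all agent times
    return [await fake_agent("personal", 1), await fake_agent("professional", 1)]

async def parallel() -> list:
    # All agents at once: total time is roughly the slowest agent's time
    return await asyncio.gather(fake_agent("personal", 1), fake_agent("professional", 1))

start = time.perf_counter()
asyncio.run(sequential())   # ~2 seconds
print(f"sequential: {time.perf_counter() - start:.1f}s")

start = time.perf_counter()
asyncio.run(parallel())     # ~1 second
print(f"parallel:   {time.perf_counter() - start:.1f}s")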
The Human-in-the-Loop Challenge
One of the trickiest problems was handling cases where multiple people share the same name. My first approach was to just pick the first result, but that felt wrong.
So I built a Human-in-the-Loop system. When the agents find multiple potential matches, they present the options to the user:
Found multiple matches for "Sreeni Ramadurai":
1. Sreeni Ramadurai - CEO at AI.COM (India)
2. Sreeni Ramadurai - Professor at University of Madurai
3. Sreeni Ramadurai - IT Architect
Please select the correct person:
This keeps the human in control while still automating the heavy lifting.
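In the Streamlit app, this amounts to pausing the pipeline and rendering the candidates as a selection widget. A minimal sketch of the idea follows; choose_candidate and the candidate dict shape are illustrative, not the exact code from the app:

from typing import Dict, List, Optional
import streamlit as st

def choose_candidate(candidates: List[Dict]) -> Optional[Dict]:
    # Hypothetical helper: pause the pipeline and let the user disambiguate.
    # Each candidate dict is assumed to carry 'name', 'title', and 'location'.
    labels = [f"{c['name']} - {c['title']} ({c['location']})" for c in candidates]
    choice = st.radio("Found multiple matches. Please select the correct person:", labels)
    if st.button("Confirm selection"):
        return candidates[labels.index(choice)]
    return None  # nothing proceeds until the user confirms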
High-level Architecture
Execution Flow Architecture
Technology Stack
Complete Sequence Flow
Sub-Agents Ecosystem
Implementation
"""
Person Profile Deep Agent with Parallel Sub-Agent Execution
==========================================================
Author: Sreenivasa Ramadurai
Date: August 23, 2025
Description: High-performance parallel execution system for autonomous person profiling with Deep Agents
This implementation demonstrates parallel execution of sub-agents for improved performance.
Sub-agents that can run independently are executed concurrently, while maintaining
sequential execution for dependent operations.
Key Features:
- Parallel execution of independent search agents (up to 4x faster than sequential)
- Sequential execution for dependent operations (synthesis, reporting)
- Thread-safe data collection and aggregation
- Progress tracking for parallel operations
- Error handling for individual sub-agents
- Virtual File System for persistent data storage
- Human-in-the-Loop for name disambiguation
Architecture:
- Personal Info Agent: Gathers basic personal details
- Professional Background Agent: Researches career information
- Academic Background Agent: Collects educational history
- Social Presence Agent: Discovers online presence
- Synthesis Agent: Combines all findings into coherent profile
- Report Agent: Generates professional PDF reports
Performance:
- Sequential: 4-6 minutes per profile
- Parallel: 1-2 minutes per profile
- Thread-safe operations with proper locking
- Graceful error handling and recovery
"""
import os
import json
import asyncio
import threading
from datetime import datetime
from typing import Dict, List, Any, Optional
from concurrent.futures import ThreadPoolExecutor, as_completed
import streamlit as st
import pandas as pd
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from deepagents import create_deep_agent
from dotenv import load_dotenv
from tavily import TavilyClient
# =============================================================================
# ENVIRONMENT SETUP AND CONFIGURATION
# =============================================================================
# Load environment variables from .env file
# Required: OPENAI_API_KEY, TAVILY_API_KEY
load_dotenv()
# Initialize Tavily search client for web search capabilities
# Tavily provides high-quality search results for AI agents
tavily_client = TavilyClient(api_key=os.environ.get("TAVILY_API_KEY"))
# =============================================================================
# VIRTUAL FILE SYSTEM (VFS) FOR PARALLEL OPERATIONS
# =============================================================================
#
# The VFS provides thread-safe, persistent storage for the multi-agent system.
# It stores:
# - Session data: Tracking search sessions and progress
# - Raw data: Intermediate results from each agent
# - Profiles: Final synthesized profiles
# - Files: Additional generated content (reports, etc.)
#
# Key benefits:
# - Thread-safe operations for parallel execution
# - Persistent storage across application restarts
# - Easy debugging and data inspection
# - No external database dependencies
class ParallelPersonProfileVFS:
def __init__(self):
"""
Initialize the Virtual File System with thread-safe storage.
Creates in-memory dictionaries for fast access and a threading lock
for safe concurrent operations. Loads any existing data from disk
to maintain persistence across application restarts.
"""
# In-memory storage for fast access
self.files = {} # General file storage
self.sessions = {} # Session tracking data
self.raw_data = {} # Raw agent findings
self.profiles = {} # Final synthesized profiles
# Threading lock for safe concurrent operations
# Essential for parallel agent execution
self.lock = threading.Lock()
# Load existing data from disk to maintain persistence
self._load_existing_data()
def _load_existing_data(self):
"""
Load existing data from disk files to maintain persistence.
This method scans the local directories (profiles/, sessions/, raw_data/)
and loads any existing JSON files into memory. This ensures that the
VFS maintains data across application restarts and provides continuity
for ongoing operations.
The method handles each directory separately and provides detailed
error reporting for any files that cannot be loaded.
"""
try:
# Load existing profiles from profiles/ directory
# Profiles contain the final synthesized information about people
if os.path.exists("profiles"):
for filename in os.listdir("profiles"):
if filename.endswith('.json'):
file_path = os.path.join("profiles", filename)
try:
with open(file_path, 'r') as f:
data = json.load(f)
# Extract person name from filename (remove .json extension)
person_name = filename.replace('.json', '')
self.profiles[person_name] = data
except Exception as e:
print(f"Error loading profile {filename}: {e}")
# Load existing sessions from sessions/ directory
# Sessions track search operations and their metadata
if os.path.exists("sessions"):
for filename in os.listdir("sessions"):
if filename.endswith('.json'):
file_path = os.path.join("sessions", filename)
try:
with open(file_path, 'r') as f:
data = json.load(f)
# Extract session ID from filename
session_id = filename.replace('.json', '')
self.sessions[session_id] = data
except Exception as e:
print(f"Error loading session {filename}: {e}")
# Load existing raw data from raw_data/ directory
# Raw data contains intermediate findings from individual agents
if os.path.exists("raw_data"):
for filename in os.listdir("raw_data"):
if filename.endswith('.json'):
file_path = os.path.join("raw_data", filename)
try:
with open(file_path, 'r') as f:
data = json.load(f)
# Use filename as key for raw data
key = filename.replace('.json', '')
self.raw_data[key] = data
except Exception as e:
print(f"Error loading raw data {filename}: {e}")
# Report successful data loading
print(f"✅ Loaded existing data: {len(self.profiles)} profiles, {len(self.sessions)} sessions, {len(self.raw_data)} raw data files")
except Exception as e:
print(f"Error loading existing data: {e}")
def save_session(self, session_id: str, person_name: str, search_query: str = ""):
"""
Save session data with thread safety.
Creates a new search session and stores its metadata. Sessions track
the progress of profile generation operations and provide context
for debugging and monitoring.
Args:
session_id: Unique identifier for the session
person_name: Name of the person being researched
search_query: Optional additional search context
"""
with self.lock: # Thread-safe operation
session_data = {
"session_id": session_id,
"person_name": person_name,
"search_query": search_query,
"timestamp": datetime.now().isoformat(),
"status": "active"
}
# Store in memory for fast access
self.sessions[session_id] = session_data
# Persist to disk for durability
self._save_to_file(f"sessions/{session_id}.json", session_data)
def save_raw_data(self, session_id: str, data_type: str, data: Dict):
"""
Save raw data from agents with thread safety.
Stores intermediate findings from individual agents. This data is used
for debugging, analysis, and as input to the synthesis agent.
Args:
session_id: Session identifier for grouping related data
data_type: Type of data (e.g., 'personal_info', 'professional_info')
data: The actual data dictionary from the agent
"""
with self.lock: # Thread-safe operation
# Create unique key for this data entry
key = f"{session_id}_{data_type}"
# Store in memory for fast access
self.raw_data[key] = data
# Persist to disk for durability
self._save_to_file(f"raw_data/{key}.json", data)
def save_profile(self, person_name: str, profile_data: Dict):
"""
Save final synthesized profile with thread safety.
Stores the complete profile after all agents have completed their work
and the synthesis agent has combined all findings.
Args:
person_name: Name of the person (used as filename)
profile_data: Complete profile dictionary with all sections
"""
with self.lock: # Thread-safe operation
# Store in memory for fast access
self.profiles[person_name] = profile_data
# Persist to disk for durability
self._save_to_file(f"profiles/{person_name}.json", profile_data)
def get_profile(self, person_name: str) -> Optional[Dict]:
"""Get profile data"""
return self.profiles.get(person_name)
def write_file(self, file_path: str, content: str) -> str:
"""Write file with thread safety"""
with self.lock:
self.files[file_path] = content
return f"File written: {file_path}"
def read_file(self, file_path: str) -> str:
"""Read file"""
return self.files.get(file_path, f"File not found: {file_path}")
def list_files(self) -> str:
"""List all files"""
return json.dumps(list(self.files.keys()), indent=2)
def get_all_profiles(self) -> Dict:
"""Get all profiles"""
return self.profiles
def get_session_stats(self) -> Dict:
"""Get session statistics"""
return {
"total_sessions": len(self.sessions),
"total_profiles": len(self.profiles),
"total_raw_data": len(self.raw_data),
"total_files": len(self.files)
}
def _save_to_file(self, file_path: str, data: Dict):
"""Internal method to save data to file"""
try:
os.makedirs(os.path.dirname(file_path), exist_ok=True)
with open(file_path, 'w') as f:
json.dump(data, f, indent=2)
except Exception as e:
print(f"Error saving to file {file_path}: {e}")
# Initialize parallel VFS
parallel_profile_vfs = ParallelPersonProfileVFS()
# =============================================================================
# CONTENT PROCESSING AND PARSING FUNCTIONS
# =============================================================================
#
# These functions handle the cleaning and formatting of raw AI agent output.
# AI agents often return content with escape characters, JSON artifacts, and
# other formatting issues that need to be cleaned for proper display.
def parse_agent_result(result) -> str:
"""
Parse and clean agent result to extract readable content.
AI agents often return content wrapped in various formats (ToolMessage,
JSON objects, etc.) with escape characters. This function extracts the
actual content and cleans it for proper display.
Args:
result: Raw result from an AI agent (can be dict, string, or object)
Returns:
Clean, readable string content
"""
try:
if isinstance(result, dict) and "messages" in result:
# Find the last AI message with actual content
for message in reversed(result["messages"]):
if hasattr(message, 'content') and message.content:
content = message.content
# Clean up escape characters and formatting
content = content.replace('\\n', '\n')
content = content.replace('\\"', '"')
content = content.replace('\\u2019', "'")
content = content.replace('\\u201c', '"')
content = content.replace('\\u201d', '"')
content = content.replace('\\u2013', '–')
content = content.replace('\\u2014', '—')
                    # Strip the ToolMessage wrapper (the prefix "ToolMessage(content='" is 21 characters)
                    if content.startswith("ToolMessage(content='") and content.endswith("')"):
                        content = content[len("ToolMessage(content='"):-2]
                    # Unwrap JSON artifacts such as {"content": "..."}
                    if content.startswith('{"') or content.startswith("{'"):
                        try:
                            parsed = json.loads(content)
                            if isinstance(parsed, dict) and 'content' in parsed:
                                content = parsed['content']
                        except (json.JSONDecodeError, ValueError):
                            pass
return content
elif isinstance(message, dict) and message.get("content"):
content = message["content"]
# Clean up escape characters
content = content.replace('\\n', '\n')
content = content.replace('\\"', '"')
content = content.replace('\\u2019', "'")
                    # Strip the ToolMessage wrapper (21-character prefix)
                    if content.startswith("ToolMessage(content='") and content.endswith("')"):
                        content = content[len("ToolMessage(content='"):-2]
return content
# Fallback: convert to string and clean
content = str(result)
content = content.replace('\\n', '\n')
content = content.replace('\\"', '"')
content = content.replace('\\u2019', "'")
        # Strip the ToolMessage wrapper (21-character prefix)
        if content.startswith("ToolMessage(content='") and content.endswith("')"):
            content = content[len("ToolMessage(content='"):-2]
return content
except Exception as e:
return f"Error parsing result: {str(e)}"
# Tools for parallel execution
@tool
def internet_search(query: str) -> str:
"""Search the internet for current information"""
try:
response = tavily_client.search(query, search_depth="advanced", max_results=10)
if isinstance(response, dict) and "results" in response:
results = response["results"]
if isinstance(results, list) and len(results) > 0:
formatted_results = f"Search results for '{query}':\n\n"
for i, result in enumerate(results[:5], 1):
title = result.get('title', 'No title')
content = result.get('content', 'No content')
url = result.get('url', 'No URL')
formatted_results += f"{i}. **{title}**\n"
formatted_results += f" URL: {url}\n"
formatted_results += f" Content: {content[:200]}...\n\n"
return formatted_results
else:
return f"No search results found for '{query}'"
else:
return f"Error in search response for '{query}'"
except Exception as e:
return f"Error during search: {str(e)}"
@tool
def save_finding(session_id: str, data_type: str, finding: str) -> str:
"""Save a finding to the virtual file system"""
try:
data = {
"session_id": session_id,
"data_type": data_type,
"finding": finding,
"timestamp": datetime.now().isoformat()
}
parallel_profile_vfs.save_raw_data(session_id, data_type, data)
return f"Finding saved for {data_type}"
except Exception as e:
return f"Error saving finding: {str(e)}"
@tool
def vfs_write_file(file_path: str, content: str) -> str:
"""Write content to a file in the virtual file system"""
return parallel_profile_vfs.write_file(file_path, content)
@tool
def vfs_read_file(file_path: str) -> str:
"""Read content from a file in the virtual file system"""
return parallel_profile_vfs.read_file(file_path)
# =============================================================================
# PARALLEL AGENT EXECUTION FUNCTIONS
# =============================================================================
#
# These functions execute individual agents in parallel for improved performance.
# Each agent is specialized for a specific type of information gathering.
# The parallel execution allows multiple agents to work simultaneously,
# significantly reducing the total time required for profile generation.
async def execute_personal_info_agent(person_name: str, session_id: str) -> Dict:
"""
Execute personal info agent in parallel.
This agent specializes in gathering basic personal information such as
name variations, location, contact details, and biographical information.
Args:
person_name: Name of the person to research
session_id: Session identifier for data organization
Returns:
Dictionary containing agent results and status
"""
try:
agent = create_deep_agent(
tools=[internet_search, save_finding, vfs_write_file],
instructions=PERSONAL_INFO_PROMPT,
model=ChatOpenAI(model="gpt-4o-mini", temperature=0.2)
)
result = await asyncio.to_thread(
agent.invoke,
{"messages": [{"role": "user", "content": f"Find personal information for {person_name}"}]}
)
# Save results
parallel_profile_vfs.save_raw_data(session_id, "personal_info", {
"agent": "personal_info_agent",
"result": str(result),
"timestamp": datetime.now().isoformat()
})
return {"agent": "personal_info", "status": "success", "result": str(result)}
except Exception as e:
return {"agent": "personal_info", "status": "error", "error": str(e)}
async def execute_professional_background_agent(person_name: str, session_id: str) -> Dict:
"""
Execute professional background agent in parallel.
This agent specializes in gathering career information including current
job titles, company affiliations, career history, and professional achievements.
Args:
person_name: Name of the person to research
session_id: Session identifier for data organization
Returns:
Dictionary containing agent results and status
"""
try:
agent = create_deep_agent(
tools=[internet_search, save_finding, vfs_write_file],
instructions=PROFESSIONAL_INFO_PROMPT,
model=ChatOpenAI(model="gpt-4o-mini", temperature=0.2)
)
result = await asyncio.to_thread(
agent.invoke,
{"messages": [{"role": "user", "content": f"Find professional background for {person_name}"}]}
)
# Save results
parallel_profile_vfs.save_raw_data(session_id, "professional_info", {
"agent": "professional_background_agent",
"result": str(result),
"timestamp": datetime.now().isoformat()
})
return {"agent": "professional_background", "status": "success", "result": str(result)}
except Exception as e:
return {"agent": "professional_background", "status": "error", "error": str(e)}
async def execute_academic_background_agent(person_name: str, session_id: str) -> Dict:
"""
Execute academic background agent in parallel.
This agent specializes in gathering educational information including
institutions attended, degrees obtained, academic achievements, and research.
Args:
person_name: Name of the person to research
session_id: Session identifier for data organization
Returns:
Dictionary containing agent results and status
"""
try:
agent = create_deep_agent(
tools=[internet_search, save_finding, vfs_write_file],
instructions=ACADEMIC_INFO_PROMPT,
model=ChatOpenAI(model="gpt-4o-mini", temperature=0.2)
)
result = await asyncio.to_thread(
agent.invoke,
{"messages": [{"role": "user", "content": f"Find academic background for {person_name}"}]}
)
# Save results
parallel_profile_vfs.save_raw_data(session_id, "academic_info", {
"agent": "academic_background_agent",
"result": str(result),
"timestamp": datetime.now().isoformat()
})
return {"agent": "academic_background", "status": "success", "result": str(result)}
except Exception as e:
return {"agent": "academic_background", "status": "error", "error": str(e)}
async def execute_social_presence_agent(person_name: str, session_id: str) -> Dict:
"""
Execute social presence agent in parallel.
This agent specializes in gathering information about online presence
including social media profiles, public engagements, and digital footprint.
Args:
person_name: Name of the person to research
session_id: Session identifier for data organization
Returns:
Dictionary containing agent results and status
"""
try:
agent = create_deep_agent(
tools=[internet_search, save_finding, vfs_write_file],
instructions=SOCIAL_PRESENCE_PROMPT,
model=ChatOpenAI(model="gpt-4o-mini", temperature=0.2)
)
result = await asyncio.to_thread(
agent.invoke,
{"messages": [{"role": "user", "content": f"Find social presence for {person_name}"}]}
)
# Save results
parallel_profile_vfs.save_raw_data(session_id, "social_info", {
"agent": "social_presence_agent",
"result": str(result),
"timestamp": datetime.now().isoformat()
})
return {"agent": "social_presence", "status": "success", "result": str(result)}
except Exception as e:
return {"agent": "social_presence", "status": "error", "error": str(e)}
async def execute_parallel_search(person_name: str, session_id: str) -> Dict:
"""
Execute all search agents in parallel for maximum performance.
This is the core orchestration function that runs all four specialized
agents simultaneously. The parallel execution provides significant
performance improvements over sequential execution.
Args:
person_name: Name of the person to research
session_id: Session identifier for data organization
Returns:
Dictionary containing results from all agents with success/failure status
"""
st.info("🚀 Starting parallel sub-agent execution...")
# Create tasks for parallel execution
tasks = [
execute_personal_info_agent(person_name, session_id),
execute_professional_background_agent(person_name, session_id),
execute_academic_background_agent(person_name, session_id),
execute_social_presence_agent(person_name, session_id)
]
# Execute all tasks concurrently
results = await asyncio.gather(*tasks, return_exceptions=True)
# Process results
successful_results = []
failed_results = []
for result in results:
if isinstance(result, Exception):
failed_results.append({"error": str(result)})
elif result.get("status") == "success":
successful_results.append(result)
else:
failed_results.append(result)
return {
"successful": successful_results,
"failed": failed_results,
"total_agents": len(tasks),
"successful_agents": len(successful_results),
"failed_agents": len(failed_results)
}
# =============================================================================
# AGENT PROMPTS AND INSTRUCTIONS
# =============================================================================
#
# Each agent has specialized prompts that guide their behavior and focus.
# These prompts are carefully crafted to ensure each agent gathers
# specific types of information efficiently and accurately.
PERSONAL_INFO_PROMPT = """
You are a specialized personal information gathering agent. Your task is to find basic personal details about the given person.
Focus on:
- Full name and variations
- Current location and residence
- Contact information (if publicly available)
- Basic biographical information
- Personal background details
Search for information from:
- Public records
- Professional directories
- News articles
- Public social media profiles
- Company websites
Return your findings in a structured format with clear sections.
"""
PROFESSIONAL_INFO_PROMPT = """
You are a specialized professional background research agent. Your task is to gather comprehensive career information about the given person.
Focus on:
- Current job title and company
- Previous job positions and companies
- Career timeline and progression
- Professional achievements and awards
- Industry expertise and specializations
- Professional memberships and affiliations
Search for information from:
- LinkedIn profiles
- Company websites
- Professional directories
- Industry publications
- Conference presentations
- Patent databases
Return your findings in a structured format with clear sections.
"""
ACADEMIC_INFO_PROMPT = """
You are a specialized academic background research agent. Your task is to gather educational information about the given person.
Focus on:
- Educational institutions attended
- Degrees obtained and fields of study
- Academic achievements and honors
- Research publications and papers
- Academic positions and roles
- Certifications and professional development
Search for information from:
- University websites
- Academic databases (Google Scholar, ResearchGate)
- Conference proceedings
- Academic directories
- Research institution websites
- Professional certification databases
Return your findings in a structured format with clear sections.
"""
SOCIAL_PRESENCE_PROMPT = """
You are a specialized social presence research agent. Your task is to identify online presence and public engagements for the given person.
Focus on:
- Social media profiles (LinkedIn, Twitter, etc.)
- Online activities and engagement
- Public appearances and speaking engagements
- Media mentions and interviews
- Online publications and blogs
- Digital footprint and online reputation
Search for information from:
- Social media platforms
- News articles and media coverage
- Conference and event websites
- Blog platforms and personal websites
- Podcast appearances
- Video content platforms
Return your findings in a structured format with clear sections.
"""
SYNTHESIS_PROMPT = """
You are a specialized profile synthesis agent. Your task is to combine all the gathered information into a comprehensive, well-structured profile.
Your responsibilities:
1. Review all collected data from different agents
2. Identify and resolve any discrepancies
3. Create a coherent narrative
4. Organize information into logical sections
5. Ensure completeness and accuracy
Create a comprehensive profile with these sections:
- Personal Information
- Professional Background
- Academic History
- Social Presence
- Summary and Key Insights
Focus on accuracy, completeness, and professional presentation.
"""
# Main parallel execution function
async def run_parallel_person_profile(person_name: str, search_query: str = "") -> Dict:
"""Run the complete parallel person profile process"""
session_id = f"parallel_session_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
# Save initial session
parallel_profile_vfs.save_session(session_id, person_name, search_query)
st.info(f"🔍 Starting parallel profile generation for: {person_name}")
# Step 1: Parallel search execution
st.subheader("📊 Step 1: Parallel Search Execution")
search_results = await execute_parallel_search(person_name, session_id)
# Display parallel execution results
col1, col2, col3, col4 = st.columns(4)
with col1:
st.metric("Total Agents", search_results["total_agents"])
with col2:
st.metric("Successful", search_results["successful_agents"])
with col3:
st.metric("Failed", search_results["failed_agents"])
with col4:
success_rate = (search_results["successful_agents"] / search_results["total_agents"]) * 100
st.metric("Success Rate", f"{success_rate:.1f}%")
# Step 2: Sequential synthesis (depends on search results)
st.subheader("🔗 Step 2: Profile Synthesis")
synthesis_result = await execute_synthesis_agent(person_name, session_id, search_results)
# Step 3: Report generation
st.subheader("📄 Step 3: Report Generation")
report_result = await execute_report_agent(person_name, session_id)
return {
"session_id": session_id,
"search_results": search_results,
"synthesis_result": synthesis_result,
"report_result": report_result,
"person_name": person_name
}
async def execute_synthesis_agent(person_name: str, session_id: str, search_results: Dict) -> Dict:
"""Execute synthesis agent (sequential - depends on search results)"""
try:
agent = create_deep_agent(
tools=[vfs_read_file, vfs_write_file],
instructions=SYNTHESIS_PROMPT,
model=ChatOpenAI(model="gpt-4o-mini", temperature=0.2)
)
# Create synthesis prompt with search results context
synthesis_prompt = f"""
Synthesize a comprehensive profile for {person_name} based on the following search results:
{json.dumps(search_results, indent=2)}
Create a well-structured, professional profile that combines all available information.
Format the output as clean markdown with the following structure:
# Profile of {person_name}
## Personal Information
[Extract and organize personal details]
## Professional Background
[Extract and organize professional information]
## Academic Background
[Extract and organize academic information]
## Social Presence
[Extract and organize social media and online presence]
## Summary and Key Insights
[Provide a concise summary of key findings]
Ensure the output is clean, readable, and properly formatted without any JSON artifacts or escape characters.
"""
result = await asyncio.to_thread(
agent.invoke,
{"messages": [{"role": "user", "content": synthesis_prompt}]}
)
# Parse and clean the result
clean_result = parse_agent_result(result)
# Save synthesized profile with clean content
parallel_profile_vfs.save_profile(person_name, {
"synthesized_profile": clean_result,
"search_results": search_results,
"timestamp": datetime.now().isoformat()
})
return {"status": "success", "result": clean_result}
except Exception as e:
return {"status": "error", "error": str(e)}
async def execute_report_agent(person_name: str, session_id: str) -> Dict:
"""Execute report generation agent (sequential - depends on synthesis)"""
try:
profile = parallel_profile_vfs.get_profile(person_name)
if not profile:
return {"status": "error", "error": "No profile found for synthesis"}
# Get clean content from the synthesized profile
if isinstance(profile, dict) and 'synthesized_profile' in profile:
clean_content = profile['synthesized_profile']
else:
clean_content = parse_agent_result(profile) if isinstance(profile, dict) else str(profile)
# Create a clean, readable report
report_content = f"""
# Person Profile Report: {person_name}
## Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}
## Profile Summary
{clean_content}
## Search Statistics
- Total search agents: {profile.get('search_results', {}).get('total_agents', 0) if isinstance(profile, dict) else 0}
- Successful searches: {len(profile.get('search_results', {}).get('successful', [])) if isinstance(profile, dict) else 0}
- Failed searches: {len(profile.get('search_results', {}).get('failed', [])) if isinstance(profile, dict) else 0}
## Data Sources
- Personal Information: Available
- Professional Background: Available
- Academic History: Available
- Social Presence: Available
---
*Report generated by Parallel Person Profile Agent*
"""
# Save report
parallel_profile_vfs.write_file(f"reports/{person_name}_parallel_report.md", report_content)
return {"status": "success", "report": report_content}
except Exception as e:
return {"status": "error", "error": str(e)}
# =============================================================================
# STREAMLIT USER INTERFACE
# =============================================================================
#
# The Streamlit interface provides a user-friendly way to interact with the
# parallel person profile agent. It includes:
# - Input fields for person name and search context
# - Real-time progress tracking
# - VFS statistics display
# - Performance comparison information
# - Results display and download options
def main():
st.set_page_config(page_title="Parallel Person Profile Agent", page_icon="⚡", layout="wide")
st.title("⚡ Parallel Person Profile Deep Agent")
st.markdown("**High-performance parallel sub-agent execution for person profiling**")
# Sidebar
st.sidebar.header("⚙️ Configuration")
st.sidebar.info("This implementation runs search agents in parallel for improved performance.")
# VFS Stats
stats = parallel_profile_vfs.get_session_stats()
st.sidebar.subheader("📊 Virtual File System Stats")
st.sidebar.metric("Sessions", stats["total_sessions"])
st.sidebar.metric("Profiles", stats["total_profiles"])
st.sidebar.metric("Raw Data", stats["total_raw_data"])
st.sidebar.metric("Files", stats["total_files"])
# Main interface
st.header("🔍 Person Profile Generation")
person_name = st.text_input("Enter person's name:", placeholder="e.g., Satya Nadella")
search_query = st.text_input("Additional search context (optional):", placeholder="e.g., CEO, San Francisco")
if st.button("⚡ Generate Parallel Profile", type="primary"):
if not person_name:
st.error("Please enter a person's name.")
else:
with st.spinner("Executing parallel profile generation..."):
try:
# Run the parallel process
result = asyncio.run(run_parallel_person_profile(person_name, search_query))
st.success("✅ Parallel profile generation completed!")
# Display results
st.subheader("📋 Results Summary")
# Search Results
st.write("**Search Results:**")
search_results = result["search_results"]
for agent_result in search_results["successful"]:
st.success(f"✅ {agent_result['agent']}: Completed successfully")
for agent_result in search_results["failed"]:
st.error(f"❌ {agent_result.get('agent', 'Unknown')}: {agent_result.get('error', 'Failed')}")
# Synthesis Results
if result["synthesis_result"]["status"] == "success":
st.success("✅ Profile synthesis completed successfully")
# Display synthesized profile
st.subheader("🔗 Synthesized Profile")
clean_synthesis = result["synthesis_result"]["result"]
st.markdown(clean_synthesis)
else:
st.error(f"❌ Synthesis failed: {result['synthesis_result']['error']}")
# Report Results
if result["report_result"]["status"] == "success":
st.success("✅ Report generation completed successfully")
# Display report
st.subheader("📄 Generated Report")
clean_report = result["report_result"]["report"]
st.markdown(clean_report)
# Download button
st.download_button(
label="📥 Download Report",
data=clean_report,
file_name=f"{person_name}_parallel_profile_report.md",
mime="text/markdown"
)
else:
st.error(f"❌ Report generation failed: {result['report_result']['error']}")
except Exception as e:
st.error(f"❌ Error during parallel execution: {str(e)}")
# Performance comparison
st.header("⚡ Performance Comparison")
st.markdown("""
### Parallel vs Sequential Execution
| Aspect | Sequential | Parallel |
|--------|------------|----------|
| **Execution Time** | ~4-6 minutes | ~1-2 minutes |
| **Resource Usage** | Lower | Higher |
| **Error Handling** | Simple | Complex |
| **Cost** | Lower | Higher |
| **Scalability** | Limited | Better |
### When to Use Parallel Execution:
- ✅ Independent sub-agents
- ✅ Performance-critical applications
- ✅ Multiple API endpoints available
- ✅ Sufficient rate limits
### When to Use Sequential Execution:
- ✅ Dependent operations
- ✅ Cost-sensitive applications
- ✅ Limited API rate limits
- ✅ Simple debugging requirements
""")
if __name__ == "__main__":
main()
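To try this yourself: put OPENAI_API_KEY and TAVILY_API_KEY in a .env file, install the dependencies (streamlit, pandas, langchain-openai, deepagents, tavily-python, python-dotenv), and launch the app with streamlit run parallel_profile_agent.py - the filename here is just whatever you saved the script as.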
Agent Output and Results
Input and testing the deep agent
Sub-agents working in parallel
Final result
Report creation
Virtual File System showing 3 profiles
Why a Virtual File System Is Essential for the Deep Agent Framework
1. Agent Memory & Learning
Deep agents need to remember what they've learned. Without persistent storage, every time you restart the system, agents lose all their knowledge. The VFS acts like a brain that remembers previous searches, successful patterns, and accumulated knowledge.
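With the ParallelPersonProfileVFS from the implementation above, that memory survives restarts because _load_existing_data re-reads the profiles/ directory on startup. A tiny sketch (jane_doe is a placeholder name):

vfs_a = ParallelPersonProfileVFS()
vfs_a.save_profile("jane_doe", {"synthesized_profile": "Example profile text"})

# Simulate a restart: a brand-new instance re-reads profiles/ from disk
vfs_b = ParallelPersonProfileVFS()
assert vfs_b.get_profile("jane_doe") is not None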
2. Multi-Agent Communication
When you have multiple specialized agents (like personal info agent, professional background agent, academic agent), they need to share information. The VFS becomes the central communication hub where each agent can read what others have found and contribute their own discoveries.
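With the VFS instance from the implementation, that hub is just a pair of calls; the path and content here are made up for illustration:

# One agent (via its vfs_write_file tool) publishes a finding...
parallel_profile_vfs.write_file("findings/jane_doe_employer.txt", "Current employer: Acme Corp")

# ...and any later agent can read it back as shared context
print(parallel_profile_vfs.read_file("findings/jane_doe_employer.txt"))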
3. State Management
Deep agents work in complex workflows with multiple steps. They need to track progress - which agents have completed their tasks, what data has been collected, and what still needs to be done. The VFS maintains this state across the entire execution.
4. Parallel Execution Safety
When multiple agents run simultaneously (like in parallel execution), they need a safe way to write data without corrupting each other's work. The VFS provides thread-safe storage that prevents conflicts.
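You can see the lock doing its job by hammering the same VFS instance from several threads, mimicking the four parallel search agents (the demo/ paths are illustrative):

from concurrent.futures import ThreadPoolExecutor

# Four workers writing concurrently, one per search agent.
# write_file acquires self.lock, so the concurrent writes serialize cleanly.
with ThreadPoolExecutor(max_workers=4) as pool:
    for agent in ["personal", "professional", "academic", "social"]:
        pool.submit(parallel_profile_vfs.write_file, f"demo/{agent}.txt", f"{agent} finding")

print(parallel_profile_vfs.list_files())  # all four demo/ entries are present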
5. Transparency & Debugging
Deep agent frameworks need to be transparent about what each agent is doing. The VFS creates a clear audit trail: you can see exactly what each agent found, when it found it, and how it processed the information.
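Because save_raw_data and save_finding persist every finding as JSON under raw_data/, inspecting that trail needs nothing more than the filesystem. A small sketch:

import glob
import json

# Each agent finding is persisted as JSON, so the audit trail is just files on disk
for path in glob.glob("raw_data/*.json"):
    with open(path) as f:
        record = json.load(f)
    print(path, "->", record.get("timestamp"), record.get("agent", record.get("data_type")))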
6. Scalability & Performance
As the system grows with more agents and more complex workflows, the VFS provides a scalable way to store and retrieve data without the complexity of traditional databases.
7. Framework Independence
The VFS allows the deep agent framework to be independent of external systems. It doesn't need a database server, doesn't require complex setup, and works reliably across different environments.
Conclusion
As an IT and GenAI Solution Architect, I see Deep Agents not just as advanced chatbots, but as a fundamental evolution in how AI delivers value to businesses. Unlike traditional AI, which reacts to queries, Deep Agents proactively solve problems by planning, remembering, and orchestrating complex workflows across multiple systems, teams, and processes.
For organizations, this represents a strategic shift from reactive tools to intelligent partners capable of driving efficiency, insight, and innovation. Early adopters who integrate Deep Agents into their operations gain a tangible advantage over competitors relying on basic AI solutions.
The technology is mature, real-world use cases are validated, and the path to implementation is clear. The key question for business leaders today isn't if, but when they will leverage Deep Agents to transform their operations and achieve scalable, intelligent automation.
Thanks
Sreeni Ramadorai