Mohamed Hatem Diabi

Posted on Jul 15

Inside Inboundr’s AI Agents: Exploring Sequential Multi-Agent Content Creation with RAG and Human-in-the-Loop

#ai #contentwriting #agentaichallenge

Introduction

Inboundr applies modern AI agent architecture patterns to LinkedIn content creation and curation. The platform combines a Sequential Multi-Agent approach with Retrieval Augmented Generation (RAG), external tool integration, and Human-in-the-Loop feedback to produce personalized content at scale.

This article is a technical deep-dive into Inboundr's architecture. It examines four architectures (post generation, daily external topic research, the overall system, and video clip generation) that demonstrate different agent orchestration patterns and their practical use in production environments.

Overview of Inboundr's Architecture

Inboundr's architecture is built on the foundation of specialized AI agents working in sequence, each with distinct responsibilities and capabilities. The system integrates multiple cutting-edge technologies:

Core Components

Sequential Agent Orchestration: Multiple specialized agents work in predetermined sequences
RAG Integration: Custom knowledge retrieval system for user and company data
External Tool Integration: Exa search engine, YouTube scraping, and SerpAPI
Human-in-the-Loop: Review, feedback, and regeneration capabilities
Flow Template Engine: YAML-based workflow orchestration
Multi-Modal Processing: Support for text, images, audio, and video content

Key Technologies

Language Models: Claude 3.5 Sonnet, GPT-4o, Gemini 2.5 Flash
Search Engines: Exa for internet search, SerpAPI for YouTube
Vector Database: Supabase with custom embeddings
Workflow Orchestration: Prefect for task management
Knowledge Management: Custom RAG microservice for knowledge retrieval
Content Processing: Whisper for audio transcription, custom image analysis

Architecture 1: Post Generation and Regeneration

The post generation workflow represents Inboundr's core sequential multi-agent architecture, where specialized agents collaborate to create LinkedIn posts from user topics.

Technical Implementation

The workflow follows a Sequential Agents pattern with RAG integration and Human-in-the-Loop feedback:

Context Processing Agent: Handles file uploads, URL crawling, and context preparation
Data Enrichment Agent: Performs RAG queries and internet searches
Content Generation Agent: Creates the initial post draft
Refinement Agent: Polishes and optimizes the content
Style Modifier Agent: Applies user-specific styling
Human Review: Manual review and feedback collection
Regeneration Agent: Incorporates feedback for iterative improvement
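
The sequence above can be pictured as a chain of functions that each take and return one structured state object. The following is a minimal sketch, not Inboundr's implementation: `PostState`, the three stand-in agents, and `run_pipeline` are hypothetical names, and the agent bodies replace real LLM and tool calls with trivial string operations.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PostState:
    """Structured data handed from one agent to the next (hypothetical shape)."""
    topic: str
    context: List[str] = field(default_factory=list)
    draft: str = ""
    trace: List[str] = field(default_factory=list)


Agent = Callable[[PostState], PostState]


def context_agent(state: PostState) -> PostState:
    # Stand-in for file and URL processing: records one context note.
    state.context.append(f"notes about {state.topic}")
    state.trace.append("context")
    return state


def generation_agent(state: PostState) -> PostState:
    # Stand-in for the LLM call that writes the first draft.
    state.draft = f"Draft on {state.topic} using {len(state.context)} context item(s)."
    state.trace.append("generate")
    return state


def refinement_agent(state: PostState) -> PostState:
    # Stand-in for polishing: only trims surrounding whitespace.
    state.draft = state.draft.strip()
    state.trace.append("refine")
    return state


def run_pipeline(state: PostState, agents: List[Agent]) -> PostState:
    # Run the agents in order; each one receives the previous agent's output.
    for agent in agents:
        state = agent(state)
    return state


result = run_pipeline(
    PostState(topic="RAG"),
    [context_agent, generation_agent, refinement_agent],
)
print(result.trace)
```

Keeping every agent behind the same signature is what makes the order configurable: adding a style or translation step means appending one more function to the list.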

Architecture Diagram

graph TD
    User[User Input] --> Router{Post Type?}

    Router -->|User Topic| UserTopicFlow[Generate from User Topic]
    Router -->|External Topic| ExternalTopicFlow[Generate from External Topic]
    Router -->|Regenerate| RegenerateFlow[Regenerate Post]

    subgraph "Sequential Multi-Agent Post Generation"
        UserTopicFlow --> ContextAgent[Context Processing Agent]
        ExternalTopicFlow --> ContextAgent
        RegenerateFlow --> ContextAgent

        ContextAgent --> ProcessFiles[Process Context Files]
        ContextAgent --> ProcessUrls[Process Context URLs]
        ProcessFiles --> RAGAgent[RAG Data Agent]
        ProcessUrls --> RAGAgent

        RAGAgent --> QueryBrain[Query User Brain]
        RAGAgent --> QueryCompany[Query Company Brain]
        RAGAgent --> SearchInternet[Internet Search via Exa]

        QueryBrain --> ContentAgent[Content Generation Agent]
        QueryCompany --> ContentAgent
        SearchInternet --> ContentAgent

        ContentAgent --> RefineAgent[Refinement Agent]
        RefineAgent --> StyleAgent[Style Modifier Agent]
        StyleAgent --> HookAgent[Hook Generation Agent]
        HookAgent --> TranslateAgent[Translation Agent]

        TranslateAgent --> DraftPost[Draft Post]
    end

    subgraph "Human-in-the-Loop System"
        DraftPost --> HumanReview[Human Review]
        HumanReview -->|Approve| FinalPost[Final Post]
        HumanReview -->|Feedback| RegenerateFlow
    end

    subgraph "RAG Knowledge Base"
        UserKnowledge[(User Knowledge Store)]
        CompanyKnowledge[(Company Knowledge Store)]
        HistoryKnowledge[(Post History)]
    end

    subgraph "External Tools"
        ExaSearch[Exa Search Engine]
        TavilySearch[Tavily Search]
        SerpapiTool[SerpAPI]
    end

    QueryBrain <--> UserKnowledge
    QueryCompany <--> CompanyKnowledge
    RAGAgent <--> HistoryKnowledge
    SearchInternet <--> ExaSearch
    SearchInternet <--> TavilySearch
    ContentAgent <--> SerpapiTool

    FinalPost --> User

Key Features

Specialized Agent Roles:

Each agent has a specific function and set of tools
Agents pass structured data between each other
Context is maintained throughout the entire workflow

RAG Integration:

User-specific knowledge retrieval from personal content
Company-wide knowledge sharing
Historical post analysis for consistency
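
At its core, knowledge retrieval of this kind ranks stored snippets by similarity to a query embedding. The sketch below illustrates that idea with plain cosine similarity over an in-memory list. It is an assumption-laden illustration: the article does not describe the ranking used by Inboundr's RAG service, and `cosine` and `top_k` are hypothetical helper names.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    # Cosine similarity between two vectors; 0.0 if either has zero length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    store: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[str]:
    # Score every stored snippet against the query and keep the k best.
    scored = [(cosine(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy two-dimensional "embeddings" for illustration only.
knowledge = [
    ("user bio", [1.0, 0.0]),
    ("company pitch", [0.0, 1.0]),
    ("past post", [1.0, 1.0]),
]
print(top_k([1.0, 0.0], knowledge, k=2))
```

A production system would delegate this search to the vector database rather than scanning a Python list.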

Human-in-the-Loop:

Manual review checkpoints
Iterative feedback incorporation
Quality assurance mechanisms
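
The review checkpoint and feedback cycle can be expressed as a bounded loop. This is a sketch under assumptions: `review_loop`, the `get_feedback` callback (returning `None` on approval), and the `max_rounds` cap are hypothetical and only illustrate the approve-or-regenerate decision shown in the diagram.

```python
from typing import Callable, Optional


def review_loop(
    draft: str,
    get_feedback: Callable[[str], Optional[str]],
    regenerate: Callable[[str, str], str],
    max_rounds: int = 3,
) -> str:
    # Ask the reviewer for feedback; None means the draft is approved.
    for _ in range(max_rounds):
        feedback = get_feedback(draft)
        if feedback is None:
            return draft
        # Otherwise fold the feedback into a new draft and review again.
        draft = regenerate(draft, feedback)
    # Stop after max_rounds so the loop cannot run forever.
    return draft


# Simulated reviewer: one round of feedback, then approval.
notes = iter(["make it shorter", None])
final = review_loop(
    "v1",
    get_feedback=lambda d: next(notes),
    regenerate=lambda d, f: f"{d} (revised: {f})",
)
print(final)
```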

Architecture 2: Daily External Topic Research

The external topic research system demonstrates Inboundr's capability to autonomously discover and curate content from the internet and YouTube, representing a Hierarchical Agents + Parallel Agents pattern.

Technical Implementation

This workflow runs on a scheduled basis and employs parallel processing for efficiency:

Query Generation Agent: Creates search queries based on user preferences
Parallel Search Agents: Simultaneously search internet and YouTube
Content Extraction Agent: Processes and extracts relevant information
Evaluation Agent: Scores and ranks discovered topics
Curation Agent: Selects the best topics for each user

Architecture Diagram

graph TD
    Scheduler[Daily Scheduler] --> TopicGenerator[Topic Generation Supervisor]

    subgraph "Hierarchical Agent Management"
        TopicGenerator --> UserAnalyzer[User Context Analyzer]
        UserAnalyzer --> QueryAgent[Query Generation Agent]

        QueryAgent --> ParallelCoordinator[Parallel Search Coordinator]

        ParallelCoordinator --> InternetSearch[Internet Search Agent]
        ParallelCoordinator --> YouTubeSearch[YouTube Search Agent]
    end

    subgraph "Parallel Internet Research"
        InternetSearch --> ExaEngine[Exa Search Engine]
        ExaEngine --> InternetResults[Internet Results]

        InternetResults --> InternetProcessor[Internet Content Processor]
        InternetProcessor --> InternetTopics[Generated Internet Topics]
    end

    subgraph "Parallel YouTube Research"
        YouTubeSearch --> YouTubeAPI[YouTube API via SerpAPI]
        YouTubeAPI --> YouTubeResults[YouTube Video Results]

        YouTubeResults --> VideoAnalyzer[Video Selection Agent]
        VideoAnalyzer --> TranscriptAgent[Transcript Extraction Agent]
        TranscriptAgent --> YouTubeTopics[Generated YouTube Topics]
    end

    subgraph "Topic Evaluation & Curation"
        InternetTopics --> EvaluationAgent[Topic Evaluation Agent]
        YouTubeTopics --> EvaluationAgent

        EvaluationAgent --> ScoringEngine[AI Scoring Engine]
        ScoringEngine --> CurationAgent[Topic Curation Agent]
        CurationAgent --> RankedTopics[Ranked Topic List]
    end

    subgraph "Knowledge Integration"
        RankedTopics --> TopicStorage[(External Topics DB)]
        TopicStorage --> PostGeneration[Post Generation System]
    end

    subgraph "User Context Sources"
        UserProfile[(User LinkedIn Profile)]
        CompanyData[(Company Information)]
        ContentPillars[(Content Pillars)]
        PublishingPolicy[(Publishing Policy)]
        PreviousTopics[(Previous Topics)]
    end

    UserAnalyzer <--> UserProfile
    UserAnalyzer <--> CompanyData
    QueryAgent <--> ContentPillars
    QueryAgent <--> PublishingPolicy
    QueryAgent <--> PreviousTopics

    RankedTopics --> PostGeneration

Key Features

Parallel Processing:

Internet and YouTube searches run simultaneously
Improved performance through concurrent execution
Resource optimization for large-scale operations

Intelligent Curation:

AI-powered topic scoring and ranking
Relevance assessment based on user preferences
Duplicate detection and content freshness checks

Multi-Source Integration:

Combines insights from multiple search engines
Processes different content types (articles, videos, transcripts)
Maintains content quality standards
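
Ranking and duplicate detection from the curation step can be sketched in a few lines. The article does not specify how Inboundr detects duplicates, so the title normalization below is an assumption chosen for simplicity; `normalize` and `curate` are hypothetical names, and each topic is assumed to carry a precomputed `score`.

```python
from typing import Dict, List


def normalize(title: str) -> str:
    # Lowercase and collapse whitespace so near-identical titles compare equal.
    return " ".join(title.lower().split())


def curate(topics: List[Dict], previous_titles: List[str], limit: int = 3) -> List[Dict]:
    # Titles already suggested to the user count as seen from the start.
    seen = {normalize(title) for title in previous_titles}
    selected = []
    # Visit topics from highest to lowest score, skipping duplicates.
    for topic in sorted(topics, key=lambda t: t["score"], reverse=True):
        key = normalize(topic["title"])
        if key in seen:
            continue
        seen.add(key)
        selected.append(topic)
    return selected[:limit]


candidates = [
    {"title": "AI Agents", "score": 0.9},
    {"title": "ai  agents", "score": 0.8},
    {"title": "RAG tips", "score": 0.7},
    {"title": "Old News", "score": 0.95},
]
picked = curate(candidates, previous_titles=["old news"])
print([t["title"] for t in picked])
```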

Architecture 3: Overall Inboundr System Architecture

The complete Inboundr system represents a sophisticated implementation of Agents Hierarchy + Loop + Parallel Agents + Shared RAG architecture.

Technical Implementation

The system integrates all components into a unified platform:

API Gateway: Handles all incoming requests and routing
Flow Orchestrator: Manages workflow execution using Prefect
Agent Hierarchy: Supervisor and worker agents for different tasks
Shared Knowledge Base: Centralized RAG system
External Integrations: Multiple search engines and APIs
Human Interface: Review and feedback mechanisms

Architecture Diagram

graph TB
    subgraph "Client Layer"
        WebApp[Web Application]
        API[API Gateway]
        WebApp --> API
    end

    subgraph "Orchestration Layer"
        FlowEngine[Flow Engine - Prefect]
        FlowTemplates[Flow Templates]
        TaskRunner[Task Runner]

        API --> FlowEngine
        FlowEngine --> FlowTemplates
        FlowEngine --> TaskRunner
    end

    subgraph "Agent Hierarchy"
        SupervisorAgent[Supervisor Agent]

        subgraph "Content Generation Agents"
            PostAgent[Post Generation Agent]
            RegenerateAgent[Regeneration Agent]
            ImageAgent[Image Generation Agent]
        end

        subgraph "Research Agents"
            TopicAgent[Topic Research Agent]
            InternetAgent[Internet Search Agent]
            YouTubeAgent[YouTube Research Agent]
        end

        subgraph "Processing Agents"
            ContextAgent[Context Processing Agent]
            RAGAgent[RAG Query Agent]
            EvaluationAgent[Content Evaluation Agent]
        end

        SupervisorAgent --> PostAgent
        SupervisorAgent --> RegenerateAgent
        SupervisorAgent --> ImageAgent
        SupervisorAgent --> TopicAgent
        SupervisorAgent --> InternetAgent
        SupervisorAgent --> YouTubeAgent

        PostAgent --> ContextAgent
        PostAgent --> RAGAgent
        RegenerateAgent --> ContextAgent
        RegenerateAgent --> RAGAgent
        TopicAgent --> EvaluationAgent
    end

    subgraph "Shared RAG System"
        RAGCore[RAG Service Core]

        subgraph "Knowledge Bases"
            UserBrains[(User Knowledge Bases)]
            CompanyBrains[(Company Knowledge Bases)]
            ContextBrains[(Context Knowledge Bases)]
        end

        subgraph "Vector Storage"
            SupabaseVectors[(Supabase Vectors)]
            Embeddings[Embedding Models]
        end

        RAGCore --> UserBrains
        RAGCore --> CompanyBrains
        RAGCore --> ContextBrains
        RAGCore --> SupabaseVectors
        RAGCore --> Embeddings
    end

    subgraph "External Tools & APIs"
        ExaSearch[Exa Search]
        TavilySearch[Tavily Search]
        SerpAPI[SerpAPI]
        YouTubeAPI[YouTube API]
        IdeogramAPI[Ideogram API]
        WhisperAPI[Whisper API]

        subgraph "LLM Services"
            Claude[Claude 3.5 Sonnet]
            GPT4[GPT-4o]
            Gemini[Gemini 2.5 Flash]
            BedrockAWS[AWS Bedrock]
        end
    end

    subgraph "Human-in-the-Loop"
        ReviewInterface[Review Interface]
        FeedbackSystem[Feedback System]
        ApprovalWorkflow[Approval Workflow]

        ReviewInterface --> FeedbackSystem
        FeedbackSystem --> ApprovalWorkflow
    end

    subgraph "Data Storage"
        PostgreSQL[(PostgreSQL)]
        Redis[(Redis Cache)]
        S3Storage[(S3 Storage)]

        subgraph "Application Data"
            Users[(Users)]
            Companies[(Companies)]
            Posts[(Posts)]
            Topics[(External Topics)]
            Media[(Media Files)]
        end

        PostgreSQL --> Users
        PostgreSQL --> Companies
        PostgreSQL --> Posts
        PostgreSQL --> Topics
        S3Storage --> Media
    end

    %% Connections
    TaskRunner --> SupervisorAgent

    RAGAgent <--> RAGCore
    ContextAgent <--> RAGCore

    InternetAgent <--> ExaSearch
    InternetAgent <--> TavilySearch
    YouTubeAgent <--> SerpAPI
    YouTubeAgent <--> YouTubeAPI
    ImageAgent <--> IdeogramAPI
    ContextAgent <--> WhisperAPI

    PostAgent <--> Claude
    PostAgent <--> GPT4
    RegenerateAgent <--> Claude
    TopicAgent <--> Gemini
    EvaluationAgent <--> BedrockAWS

    SupervisorAgent --> ReviewInterface
    ApprovalWorkflow --> FlowEngine

    FlowEngine --> PostgreSQL
    FlowEngine --> Redis
    FlowEngine --> S3Storage

    %% Feedback Loops
    ApprovalWorkflow -.->|Regenerate| RegenerateAgent
    FeedbackSystem -.->|Context Update| ContextAgent
    ReviewInterface -.->|Quality Control| EvaluationAgent

Key Features

Scalable Architecture:

Microservices-based design for individual components
Horizontal scaling capabilities for high-volume processing
Efficient resource utilization through parallel processing

Unified Knowledge Management:

Centralized RAG system accessible to all agents
Consistent knowledge representation across workflows
Real-time knowledge updates and synchronization

Comprehensive Integration:

Multiple LLM providers for redundancy and optimization
Diverse external APIs for comprehensive data access
Flexible workflow orchestration for different use cases

Architecture 4: AI-Powered Video Editing and Clips Generation

Inboundr's video editing system represents a sophisticated implementation of Multi-Agent Video Processing with AI-Powered Content Analysis and Automated Clip Generation. This architecture demonstrates how AI agents can collaborate to transform long-form video content into engaging, platform-optimized short clips.

Technical Implementation

The video editing workflow employs a Hierarchical Multi-Agent approach with three distinct processing stages:

Video Processing Pipeline: Download, transcription, and audio extraction
AI-Powered Viral Clips Generation: Content analysis and segment identification
Video Editing Tools: Automated editing, formatting, and optimization

Architecture Diagram

graph TB
    subgraph "Video Input Layer"
        VideoURL[Video URL Input]
        VideoDownload[Video Download Service]
        VideoURL --> VideoDownload
    end

    subgraph "Video Processing Pipeline"
        VideoProcessor[Video Processing Agent]
        AudioExtractor[Audio Extraction Tool]
        WhisperTranscriber[Whisper Transcription Service]

        VideoDownload --> VideoProcessor
        VideoProcessor --> AudioExtractor
        AudioExtractor --> WhisperTranscriber
    end

    subgraph "AI Viral Clips Generation"
        TranscriptionAnalyzer[Transcription Analysis Agent]
        ViralEvaluator[Viral Potential Evaluator Agent]
        ClipGenerator[Clip Generation Agent]

        subgraph "Analysis Components"
            TopicBoundaries[Topic Boundary Detection]
            EmotionalHighs[Emotional High Point Detection]
            HookIdentifier[Hook Moment Identification]
            CoherenceAnalyzer[Standalone Coherence Analysis]
        end

        subgraph "Evaluation Components"
            HookScorer[Hook Quality Scorer]
            FlowAnalyzer[Content Flow Analyzer]
            ValueAssessor[Value Delivery Assessor]
            TrendAligner[Trend Alignment Evaluator]
        end

        subgraph "Generation Components"
            ClipSelector[Optimal Clip Selector]
            PlatformOptimizer[Platform-Specific Optimizer]
            TimestampProcessor[Timestamp Processor]
        end

        WhisperTranscriber --> TranscriptionAnalyzer
        TranscriptionAnalyzer --> TopicBoundaries
        TranscriptionAnalyzer --> EmotionalHighs
        TranscriptionAnalyzer --> HookIdentifier
        TranscriptionAnalyzer --> CoherenceAnalyzer

        TranscriptionAnalyzer --> ViralEvaluator
        ViralEvaluator --> HookScorer
        ViralEvaluator --> FlowAnalyzer
        ViralEvaluator --> ValueAssessor
        ViralEvaluator --> TrendAligner

        ViralEvaluator --> ClipGenerator
        ClipGenerator --> ClipSelector
        ClipGenerator --> PlatformOptimizer
        ClipGenerator --> TimestampProcessor
    end

    subgraph "Video Editing Tools"
        VideoCutter[Video Cutting Tool]
        CropAgent[Video Cropping Agent]
        SubtitleAgent[Subtitle Generation Agent]
        FillerWordRemover[Filler Word Removal Agent]
        TextOverlayAgent[Text Overlay Agent]
        SilenceRemover[Silence Removal Agent]

        subgraph "Editing Configurations"
            VideoFormats[Video Format Options<br/>Portrait, Square, Landscape]
            SubtitleStyles[Subtitle Styling Options]
            TextConfigs[Text Overlay Configurations]
            VADConfig[Voice Activity Detection Config]
        end

        ClipGenerator --> VideoCutter
        VideoCutter --> CropAgent
        VideoCutter --> SubtitleAgent
        VideoCutter --> FillerWordRemover
        VideoCutter --> TextOverlayAgent
        VideoCutter --> SilenceRemover

        CropAgent --> VideoFormats
        SubtitleAgent --> SubtitleStyles
        TextOverlayAgent --> TextConfigs
        SilenceRemover --> VADConfig
    end

    subgraph "Video Script Generation"
        ScriptSupervisor[Script Generation Supervisor]
        TopicResearcher[Topic Research Agent]
        StructurePlanner[Video Structure Planner]
        ContentWriter[Content Writing Agent]

        subgraph "Script Components"
            UserProfiler[User Profile Analyzer]
            AudienceAnalyzer[Target Audience Analyzer]
            StyleGenerator[Style Preference Generator]
            PlatformAdapter[Platform Adaptation Agent]
        end

        ScriptSupervisor --> TopicResearcher
        ScriptSupervisor --> StructurePlanner
        ScriptSupervisor --> ContentWriter

        TopicResearcher --> UserProfiler
        StructurePlanner --> AudienceAnalyzer
        ContentWriter --> StyleGenerator
        ContentWriter --> PlatformAdapter
    end

    subgraph "Storage & Management"
        VideoStorage[(Video Storage<br/>Supabase)]
        ProjectDB[(Project Database)]
        TranscriptStorage[(Transcript Storage)]
        ClipMetadata[(Clip Metadata)]

        VideoCutter --> VideoStorage
        WhisperTranscriber --> TranscriptStorage
        ClipGenerator --> ClipMetadata
        VideoProcessor --> ProjectDB
    end

    subgraph "External Services"
        FFmpeg[FFmpeg Processing]
        MoviePy[MoviePy Library]
        OpenAIWhisper[OpenAI Whisper API]
        YoutubeAPI[YouTube Download API]

        VideoCutter <--> FFmpeg
        SubtitleAgent <--> FFmpeg
        CropAgent <--> FFmpeg
        AudioExtractor <--> MoviePy
        WhisperTranscriber <--> OpenAIWhisper
        VideoDownload <--> YoutubeAPI
    end

    subgraph "Workflow Orchestration"
        PrefectOrchestrator[Prefect Flow Orchestrator]
        TaskRunner[ThreadPool Task Runner]
        FlowDeployment[Flow Deployment Manager]

        PrefectOrchestrator --> TaskRunner
        TaskRunner --> FlowDeployment
        FlowDeployment --> VideoProcessor
        FlowDeployment --> TranscriptionAnalyzer
        FlowDeployment --> ScriptSupervisor
    end

    subgraph "Quality Assurance"
        VideoValidator[Video Quality Validator]
        TranscriptValidator[Transcript Quality Checker]
        ClipQualityChecker[Clip Quality Assessor]

        VideoProcessor --> VideoValidator
        WhisperTranscriber --> TranscriptValidator
        ClipGenerator --> ClipQualityChecker
    end

    %% Feedback Loops
    ClipQualityChecker -.->|Quality Feedback| ClipGenerator
    VideoValidator -.->|Processing Feedback| VideoProcessor
    TranscriptValidator -.->|Accuracy Feedback| WhisperTranscriber

    %% Data Flow
    VideoStorage --> ProjectDB
    TranscriptStorage --> ProjectDB
    ClipMetadata --> ProjectDB

Key Features

Multi-Stage AI Processing:

Sequential Agent Workflow: Each processing stage builds upon the previous
Specialized Agent Roles: Dedicated agents for analysis, evaluation, and generation
Intelligent Content Analysis: AI-powered identification of viral potential segments

Advanced Video Processing:

Automated Transcription: Whisper-powered audio-to-text conversion
Intelligent Segmentation: Topic boundary and emotional high-point detection
Platform Optimization: Format-specific optimization for different social media platforms

Comprehensive Editing Tools:

Video Formatting: Automatic cropping for portrait (9:16), square (1:1), and landscape (16:9) formats
Subtitle Integration: Automated subtitle generation with customizable styling
Content Enhancement: Filler word removal, silence detection, and text overlay capabilities

Workflow Details

1. Video Processing Pipeline

The foundation of the video editing system begins with comprehensive video processing:

# Video Processing Workflow
from prefect import flow


@flow(name="Video Processing Workflow")
async def video_transcriptions_flow(
    project_id: str,
    video_url: str,
    video_quality: str = "medium",
):
    # Base name for generated files. Deriving it from project_id is an
    # assumption, since the snippet does not show where filename comes from.
    filename = f"video_{project_id}"

    # 1. Download Video (the task_* functions are Prefect tasks defined elsewhere)
    video_path = await task_download_video(video_url, filename, video_quality)

    # 2. Extract Audio
    audio_path = await task_extract_audio(video_path, filename)

    # 3. Transcribe Audio
    transcription_result = await task_transcribe(audio_path)

    # 4. Save Results
    saved_paths = await task_save_results(transcription_result, filename)

    return {"status": "completed", "results": saved_paths}

2. AI-Powered Viral Clips Generation

The viral clips generation employs a sophisticated multi-agent system:

Transcription Analysis Agent:

Identifies natural topic boundaries and logical breaking points
Detects emotional high points and engagement peaks
Locates strong hook moments for audience capture
Ensures segment coherence and standalone viability

Viral Potential Evaluator Agent:

Evaluates content segments using four key criteria:
- Hook Quality: Attention-grabbing potential (0-10 scale)
- Content Flow: Pacing and structural coherence (0-10 scale)
- Value Delivery: Usefulness and emotional impact (0-10 scale)
- Trend Alignment: Relevance to current trends (0-10 scale)

Clip Generation Agent:

Selects optimal clips based on viral scoring
Ensures precise timestamp boundaries
Maintains contextual integrity
Provides platform-specific optimization recommendations
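
The four evaluation criteria and the selection step can be combined into a small scoring sketch. The equal weighting, the `min_score` threshold, and the names `ClipCandidate` and `select_clips` are assumptions for illustration; the article does not state how the four 0-10 scores are aggregated.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClipCandidate:
    start: float  # seconds
    end: float    # seconds
    hook_quality: float
    content_flow: float
    value_delivery: float
    trend_alignment: float

    def viral_score(self) -> float:
        # Unweighted mean of the four 0-10 criteria (equal weights assumed).
        total = (
            self.hook_quality
            + self.content_flow
            + self.value_delivery
            + self.trend_alignment
        )
        return total / 4


def select_clips(
    candidates: List[ClipCandidate],
    top_n: int = 2,
    min_score: float = 6.0,
) -> List[ClipCandidate]:
    # Drop weak segments, then keep the highest-scoring ones.
    eligible = [c for c in candidates if c.viral_score() >= min_score]
    return sorted(eligible, key=lambda c: c.viral_score(), reverse=True)[:top_n]


segments = [
    ClipCandidate(0.0, 30.0, 8, 7, 9, 6),
    ClipCandidate(45.0, 70.0, 5, 5, 5, 5),
    ClipCandidate(90.0, 120.0, 9, 9, 8, 8),
]
best = select_clips(segments)
print([(c.start, c.viral_score()) for c in best])
```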

3. Video Editing Tools Integration

The system provides comprehensive editing capabilities:

Video Formatting Options:

Portrait (9:16) for TikTok, Instagram Stories
Square (1:1) for Instagram posts
Landscape (16:9) for YouTube, LinkedIn
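
Cropping to one of these formats comes down to finding the largest centered rectangle with the target aspect ratio. The helper below is a hypothetical sketch (`center_crop_box` and `ASPECT_RATIOS` are not from Inboundr's code). It returns width, height, x, and y in the same order that FFmpeg's `crop` filter takes its arguments, and rounds dimensions down to even numbers because many video encoders require them.

```python
from typing import Dict, Tuple

ASPECT_RATIOS: Dict[str, Tuple[int, int]] = {
    "portrait": (9, 16),
    "square": (1, 1),
    "landscape": (16, 9),
}


def center_crop_box(width: int, height: int, fmt: str) -> Tuple[int, int, int, int]:
    ratio_w, ratio_h = ASPECT_RATIOS[fmt]
    if width * ratio_h > height * ratio_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_h = height
        crop_w = height * ratio_w // ratio_h
    else:
        # Source is taller than (or equal to) the target: keep full width.
        crop_w = width
        crop_h = width * ratio_h // ratio_w
    # Round down to even dimensions.
    crop_w -= crop_w % 2
    crop_h -= crop_h % 2
    # Center the crop box inside the source frame.
    x = (width - crop_w) // 2
    y = (height - crop_h) // 2
    return crop_w, crop_h, x, y


for name in ASPECT_RATIOS:
    print(name, center_crop_box(1920, 1080, name))
```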

Advanced Editing Features:

Filler word removal using AI transcript analysis
Voice Activity Detection (VAD) for silence removal
Subtitle burning with customizable styling
Text overlay with positioning and timing controls
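
Filler word removal and silence trimming can both be derived from word-level transcript timestamps. The sketch below assumes the transcript is available as `(word, start, end)` tuples; the filler list, the `max_gap` threshold, and the `keep_segments` name are hypothetical. The resulting intervals are the parts of the clip to keep when cutting.

```python
from typing import List, Tuple

FILLERS = {"um", "uh", "erm"}  # illustrative list, not Inboundr's


def keep_segments(
    words: List[Tuple[str, float, float]],
    max_gap: float = 0.5,
) -> List[Tuple[float, float]]:
    segments: List[Tuple[float, float]] = []
    for text, start, end in words:
        # Skip filler words entirely (punctuation is ignored when matching).
        if text.lower().strip(".,!?") in FILLERS:
            continue
        if segments and start - segments[-1][1] <= max_gap:
            # Short pause: extend the current segment to include this word.
            segments[-1] = (segments[-1][0], end)
        else:
            # Long pause or removed filler: start a new segment.
            segments.append((start, end))
    return segments


transcript = [
    ("So", 0.0, 0.3),
    ("um", 0.3, 0.9),
    ("agents", 1.0, 1.5),
    ("work", 1.5, 1.9),
    ("uh,", 3.0, 3.4),
    ("well", 3.5, 3.9),
]
print(keep_segments(transcript))
```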

Performance Characteristics

Processing Speed:

Video download: 30-60 seconds for typical YouTube videos
Audio extraction: 5-10 seconds per minute of video
Transcription: 15-30 seconds per minute of audio (OpenAI Whisper)
Viral clips generation: 2-5 minutes for comprehensive analysis
Video editing: 10-20 seconds per edit operation

Quality Metrics:

Transcription accuracy: 95%+ for clear audio
Viral clip relevance: 85%+ user satisfaction
Edit operation success rate: 98%+

Scalability Features:

Parallel processing with ThreadPool task runners
Concurrent video processing for multiple projects
Efficient resource utilization with Prefect orchestration
Automatic cleanup of temporary files

Technical Advantages

Intelligent Content Analysis:

AI-powered segment identification reduces manual work by 90%
Emotional high-point detection improves engagement rates
Platform-specific optimization increases viral potential

Automated Workflow:

End-to-end processing from URL to finished clips
Minimal human intervention required
Consistent quality across all generated content

Flexible Architecture:

Modular design allows easy addition of new editing tools
Platform-agnostic processing supports multiple social media formats
Extensible agent system for future AI capabilities

Integration with Core Inboundr System

The video editing architecture seamlessly integrates with Inboundr's broader ecosystem:

Shared Components:

Uses the same Prefect orchestration system
Integrates with Supabase for storage and metadata
Leverages existing user authentication and project management

Cross-System Benefits:

Video content feeds into the general content recommendation system
Transcripts enhance the RAG knowledge base
Generated clips provide training data for improving content strategies

This video editing architecture demonstrates how specialized AI agents can collaborate to transform complex video processing tasks into automated, intelligent workflows that consistently deliver high-quality, platform-optimized content.

Technical Deep Dive: Implementation Details

Architectural Decision: Prefect + YAML vs Traditional AI Agent Frameworks

One of the most critical architectural decisions in building Inboundr was choosing Prefect with custom YAML templates over established AI agent frameworks like CrewAI, LangChain, or AutoGen. This decision was driven by four key requirements that traditional frameworks couldn't adequately address at the time.

Why We Avoided Traditional AI Agent Frameworks

1. Scalability Limitations

At the time, AI agent frameworks like CrewAI and LangChain were oriented mainly toward single-user, single-task scenarios. Inboundr needed to handle:

Multi-tenant operations: Hundreds of users generating content simultaneously
Concurrent workflow execution: Multiple agents processing different tasks in parallel
Resource isolation: Ensuring one user's intensive task doesn't impact others
Horizontal scaling: Adding computational resources during peak usage

# Traditional Framework Challenge
from crewai import Crew

# researcher, writer, editor and the three tasks are defined elsewhere
crew = Crew(
    agents=[researcher, writer, editor],
    tasks=[research_task, write_task, edit_task],
    verbose=True
)
# Limited to single execution, no built-in multi-tenancy
result = crew.kickoff()
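
To make the resource isolation requirement concrete, here is a minimal sketch of a per-tenant concurrency cap built on `asyncio.Semaphore`. It is an illustration of the requirement, not Inboundr's mechanism: `TenantLimiter` and its parameters are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Dict, TypeVar

T = TypeVar("T")


class TenantLimiter:
    """Caps how many jobs a single tenant may run at once."""

    def __init__(self, per_tenant: int = 2) -> None:
        self._per_tenant = per_tenant
        self._semaphores: Dict[str, asyncio.Semaphore] = {}

    def _semaphore(self, tenant_id: str) -> asyncio.Semaphore:
        # Create the tenant's semaphore lazily on first use.
        if tenant_id not in self._semaphores:
            self._semaphores[tenant_id] = asyncio.Semaphore(self._per_tenant)
        return self._semaphores[tenant_id]

    async def run(self, tenant_id: str, job: Callable[[], Awaitable[T]]) -> T:
        # Jobs from the same tenant wait here once the cap is reached;
        # other tenants hold separate semaphores and are unaffected.
        async with self._semaphore(tenant_id):
            return await job()


async def demo() -> list:
    limiter = TenantLimiter(per_tenant=1)

    async def job() -> str:
        await asyncio.sleep(0.01)
        return "ok"

    return await asyncio.gather(*(limiter.run("user-a", job) for _ in range(3)))


print(asyncio.run(demo()))
```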

2. Traceability and Observability

Production AI systems require comprehensive monitoring and debugging capabilities:

Task-level tracing: Understanding exactly where workflows succeed or fail
Performance monitoring: Real-time metrics on agent execution times and resource usage
Error propagation tracking: Identifying how failures cascade through agent chains
Audit trails: Complete logs for compliance and optimization

# Prefect provides built-in observability
from prefect import flow, get_run_logger


@flow(name="Generate Post from User Topic")
async def flow_generate_post_from_user_topic(data: GeneratePostFromUserTopicData):
    # The run logger attaches messages to this flow run
    logger = get_run_logger()
    logger.info(f"Processing for user {data.user_id}")

    # Each task call is tracked with its own state and timing
    context_result = await process_context_files(data)
    post_result = await generate_post(context_result)

    return post_result

3. Full Control Over Execution

AI agent frameworks often abstract away crucial infrastructure decisions:

Custom retry logic: Implementing domain-specific retry strategies for different failure modes
Resource management: Fine-grained control over memory, CPU, and GPU allocation
Conditional execution: Complex business logic determining agent activation
Integration flexibility: Seamless integration with existing infrastructure
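
As an example of domain-specific retry logic, the sketch below retries only on errors that are worth retrying and fails fast on the rest. The exception classes and `call_with_retry` are hypothetical names introduced for illustration; the backoff doubles after each failed attempt.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Transient failure: the provider asked us to slow down."""


class InvalidPromptError(Exception):
    """Permanent failure: retrying the same input cannot succeed."""


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Tuple[Type[Exception], ...] = (RateLimitError,),
    max_attempts: int = 3,
    base_delay: float = 0.01,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            # Give up once the attempt budget is spent.
            if attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("max_attempts must be at least 1")


attempts = {"count": 0}


def flaky_model_call() -> str:
    # Fails twice with a retryable error, then succeeds.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("slow down")
    return "done"


print(call_with_retry(flaky_model_call))
```

Exceptions outside `retry_on`, such as `InvalidPromptError`, propagate on the first attempt without any delay.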

4. Infrastructure Integration

Enterprise deployment requires integration with existing systems:

Database connections: Direct integration with PostgreSQL, Redis, and vector stores
Authentication systems: User management and access control
Monitoring tools: Integration with existing observability stacks
Deployment pipelines: CI/CD integration for agent workflow updates

The Prefect + YAML Solution

Inboundr's architecture leverages Prefect's workflow orchestration capabilities with custom YAML templates to create a scalable, observable, and controllable AI agent system:

Flow Template Engine

Inboundr uses a custom YAML-based flow template system that defines agent workflows:

name: "Generate Post from User Topic"
description: "Generate post workflow from user desired topic."
version: 0.0

default_settings:
  models:
    - service: "bedrock"
      model: "us.anthropic.claude-3-5-sonnet-20241022-v2:0"
      temperature: 0.3
  embedding:
    service: "bedrock"

tasks:
  - id: "process_context_files"
    description: "Add context files into context brain"
    type: "iterate"
    child_task:
      id: "add_knowledge_from_file"
      type: "brain_knowledge"

  - id: "additional_data"
    description: "Collect additional data from internal and internet"
    type: "additional_data"

  - id: "generate_post"
    description: "Generate LinkedIn post"
    type: "query_model"
    template_file: "generate_post/generate_post_for_active_mode.txt"
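
Once parsed, a template like the one above could be walked by a small dispatcher that maps each task `type` to a handler. This is a sketch of the idea only: the handlers, the `HANDLERS` registry, and `run_template` are hypothetical, and the template is written as an already-parsed Python dict so the example needs no YAML library.

```python
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any], Dict[str, Any]], Any]


def handle_additional_data(task: Dict[str, Any], context: Dict[str, Any]) -> str:
    # Stand-in for RAG queries and internet search.
    return f"data for {context['topic']}"


def handle_query_model(task: Dict[str, Any], context: Dict[str, Any]) -> str:
    # Stand-in for rendering the prompt template and calling the model.
    return f"post from {task['template_file']}"


HANDLERS: Dict[str, Handler] = {
    "additional_data": handle_additional_data,
    "query_model": handle_query_model,
}


def run_template(template: Dict[str, Any], context: Dict[str, Any]) -> Dict[str, Any]:
    # Execute tasks in file order and store each result under the task id,
    # so later tasks can read the outputs of earlier ones.
    for task in template["tasks"]:
        handler = HANDLERS.get(task["type"])
        if handler is None:
            raise ValueError(f"No handler for task type {task['type']!r}")
        context[task["id"]] = handler(task, context)
    return context


template = {
    "name": "Generate Post from User Topic",
    "tasks": [
        {"id": "additional_data", "type": "additional_data"},
        {
            "id": "generate_post",
            "type": "query_model",
            "template_file": "generate_post/generate_post_for_active_mode.txt",
        },
    ],
}
output = run_template(template, {"topic": "RAG"})
print(output["generate_post"])
```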

Custom Agent Orchestration with Prefect

The system translates YAML templates into Prefect flows with full observability:

from prefect import flow
from prefect.task_runners import ThreadPoolTaskRunner


@flow(
    name="Generate Post from User Topic",
    task_runner=ThreadPoolTaskRunner(max_workers=4),
    retries=3,
    retry_delay_seconds=30
)
async def flow_generate_post_from_user_topic(data: GeneratePostFromUserTopicData):
    """
    Production-ready flow with:
    - Built-in retry logic
    - Parallel execution capability
    - Comprehensive logging
    - Resource management
    """

    # Initialize context with full tracing
    context = await get_context_data_to_generate_post_from_user_topic(data)

    # Process with automatic state management
    return await process(
        flow=generate_post_from_user_topic_flow_template,
        context=context,
        result_processor=store_generated_post,
        as_task=True,  # Enable task-level monitoring
    )

Key Benefits of This Approach

1. Production-Grade Scalability

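# Multi-tenant execution: one subflow run per user
import asyncio

from prefect import flow
from prefect.task_runners import ThreadPoolTaskRunner


@flow(
    # Thread pool for any tasks submitted inside this flow
    task_runner=ThreadPoolTaskRunner(max_workers=50),
    persist_result=True,
    # Slug of a pre-registered storage block (the block name is an assumption;
    # Prefect has no S3ResultStorage class)
    result_storage="s3-bucket/inboundr-flows",
)
async def multi_tenant_post_generation(user_queue: list):
    # Flows have no .submit(). Calling an async flow returns a coroutine,
    # so asyncio.gather runs the per-user subflows concurrently.
    results = await asyncio.gather(
        *(flow_generate_post_from_user_topic(user_data) for user_data in user_queue)
    )
    return results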

2. Comprehensive Observability

# Built-in monitoring and metrics
import time

from prefect import get_run_logger, task


@task(name="Query Model Task")
async def run_query_model_task(context, task_data):
    logger = get_run_logger()

    # Manual timing around the model call
    start_time = time.time()

    try:
        # model is assumed to be an LLM client created elsewhere
        result = await model.invoke(task_data.template_inputs)

        # Custom metrics collection
        execution_time = time.time() - start_time
        logger.info(f"Task completed in {execution_time:.2f}s")

        return result
    except Exception as e:
        # Log the failure, then re-raise so Prefect marks the task run as failed
        logger.error(f"Task failed: {str(e)}")
        raise

3. Infrastructure Integration

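# Scheduling and infrastructure are configured on the deployment
from prefect import flow


@flow(name="External Topics Research")
async def scheduled_topic_research():
    users = await get_all_active_users()

    for user in users:
        # Each user's research runs as its own subflow run
        await flow_generate_external_topics_for_user(user.id)


if __name__ == "__main__":
    # @flow does not accept schedule or infrastructure arguments; they belong
    # to the deployment. The deployment and work pool names are assumptions,
    # and the CPU and memory limits (4 CPUs, 8Gi) would be set on the Docker
    # work pool or through job variables.
    scheduled_topic_research.deploy(
        name="daily-topic-research",
        work_pool_name="docker-pool",
        image="inboundr/agents:latest",
        build=False,  # use the prebuilt image
        cron="0 9 * * *",  # Daily at 9 AM
    )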

Framework Evolution Considerations

Current State (2025)

Traditional AI agent frameworks are rapidly evolving, and our architectural decision may change as they mature:

Framework Limitations Being Addressed:

CrewAI: Adding enterprise features like multi-tenancy and advanced monitoring
LangChain: Improving production deployment capabilities with LangServe
AutoGen: Enhancing scalability for multi-user scenarios

When We Might Reconsider:

Native multi-tenancy: When frameworks provide built-in user isolation
Production observability: When they ship comprehensive monitoring and debugging tools
Enterprise integration: When they integrate cleanly with existing infrastructure
Performance optimization: When they offer framework-level optimizations for large-scale deployments

Migration Strategy:
Our YAML-based approach provides a clean abstraction layer, making it possible to migrate to mature frameworks while preserving business logic:

# Current: Custom implementation
flow_template = compile_template("generate_post_from_user_topic.yml")
result = await process(flow_template, context)

# Future: Framework integration (hypothetical)
crew_template = convert_yaml_to_crew("generate_post_from_user_topic.yml")
result = await crew_template.execute(context)
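
One way to picture this abstraction layer: the workflow definition stays as plain data, and the execution backend sits behind a small interface. The sketch below is hypothetical. `FlowBackend`, `SequentialBackend`, `run_workflow`, and the step names are illustrative and not taken from Inboundr's code, and a Python dict stands in for the parsed YAML template.

```python
import asyncio
from typing import Any, Awaitable, Callable, Protocol

# Stand-in for a parsed YAML template (hypothetical structure).
WORKFLOW = {
    "name": "generate_post_from_user_topic",
    "steps": ["research", "draft", "review"],
}

Step = Callable[[dict[str, Any]], Awaitable[dict[str, Any]]]


class FlowBackend(Protocol):
    """Anything that can execute a workflow definition against a context."""

    async def execute(
        self, workflow: dict[str, Any], context: dict[str, Any]
    ) -> dict[str, Any]: ...


class SequentialBackend:
    """Custom backend: runs each named step in order, passing the context along."""

    def __init__(self, registry: dict[str, Step]) -> None:
        self.registry = registry

    async def execute(self, workflow, context):
        for step_name in workflow["steps"]:
            context = await self.registry[step_name](context)
        return context


async def run_workflow(backend: FlowBackend, workflow, context):
    # Business logic lives in `workflow`; only `backend` changes on migration.
    return await backend.execute(workflow, context)


def make_step(label: str) -> Step:
    # Demo step that only records its own name in the context.
    async def step(context: dict[str, Any]) -> dict[str, Any]:
        return {**context, "trace": context.get("trace", []) + [label]}

    return step


registry = {name: make_step(name) for name in WORKFLOW["steps"]}
result = asyncio.run(
    run_workflow(SequentialBackend(registry), WORKFLOW, {"topic": "RAG"})
)
print(result["trace"])
```

Under this shape, a framework-backed class would only need to implement `execute` with the same signature. The workflow definition and the calling code would stay as they are.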

The decision reflects a broader point: production AI systems have scalability, observability, and control requirements that general-purpose frameworks may not yet meet, especially in a domain evolving as quickly as AI agents.

Performance Characteristics

Production Benchmarking Results

The figures below are production metrics from Inboundr's operational period, broken down by channel and content source:

Overall Content Generation Volume:

Total Posts Generated: 6,522 posts across all platforms
- Slack users: 6,178 posts
- Web-app users: 344 posts
Topic Suggestions Delivered: 73,187 total suggestions
- Web-app users: 20,287 suggestions
- Slack users: 52,900 suggestions
- 41,100 internet-sourced topics
- 12,200 video-sourced topics
- 5,227 suggestions from Slack discussions

Content Source Distribution:

Internet and YouTube Topics: 2,478 posts (40% of Slack content)
User-Entered Topics: 595 posts (10% of Slack content)
Slack Discussion Content: 3,105 posts (50% of Slack content)

Monthly Growth Patterns:

Slack Users (Peak Performance):

September 2024: 226 posts
October 2024: 626 posts (177% growth)
November 2024: 594 posts
December 2024: 470 posts
January 2025: 1,850 posts (Peak month - 294% growth)
February 2025: 1,108 posts
March 2025: 608 posts
April 2025: 437 posts
May 2025: 202 posts
June 2025: 53 posts

Web-App Users (Launch Phase):

April 2025: 41 posts (launch month)
May 2025: 225 posts (449% growth)
June 2025: 78 posts
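
The growth percentages quoted in the lists above are month-over-month changes. A short sketch of that arithmetic, using the post counts as listed (the `growth_pct` helper name is illustrative):

```python
def growth_pct(previous: int, current: int) -> float:
    """Month-over-month change as a percentage of the previous month."""
    return (current - previous) / previous * 100


slack_posts = {
    "2024-09": 226, "2024-10": 626, "2024-11": 594, "2024-12": 470,
    "2025-01": 1850, "2025-02": 1108, "2025-03": 608, "2025-04": 437,
    "2025-05": 202, "2025-06": 53,
}
web_posts = {"2025-04": 41, "2025-05": 225, "2025-06": 78}

for label, series in (("Slack", slack_posts), ("Web-app", web_posts)):
    months = list(series)  # dicts preserve insertion order
    for prev, cur in zip(months, months[1:]):
        change = growth_pct(series[prev], series[cur])
        print(f"{label} {cur}: {change:+.1f}%")
```

Negative values mark the months where volume fell relative to the previous month.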

Key Performance Insights

Platform Adoption Patterns:

Slack Integration: Grew roughly 8x from its first month to its peak (226 to 1,850 posts/month)
Web Application: Showed rapid initial adoption, growing about 5.5x in its second month (41 to 225 posts)
Content Diversity: Successfully processed content from multiple sources (internet, video, discussions)

Operational Efficiency:

Topic-to-Post Conversion: Approximately 1 post generated per 11 topic suggestions
Multi-Channel Success: Effective operation across both conversational (Slack) and traditional web interfaces
Content Source Flexibility: Successful processing of diverse content types from URLs, videos, and discussions
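
The conversion figure is the ratio of total suggestions to total posts. A minimal check using the totals reported earlier in this section (variable names are illustrative):

```python
total_suggestions = 73_187
total_posts = 6_522

suggestions_per_post = total_suggestions / total_posts
print(f"{suggestions_per_post:.1f} suggestions per generated post")

# Share of Slack posts by content source, from the distribution listed earlier.
slack_sources = {
    "internet_and_youtube": 2_478,
    "user_entered": 595,
    "slack_discussions": 3_105,
}
slack_total = sum(slack_sources.values())
for source, posts in slack_sources.items():
    print(f"{source}: {posts / slack_total:.0%}")
```

The ratio is an aggregate across all channels. It includes posts that came from user-entered topics and Slack discussions rather than from a delivered suggestion, so it is an approximation.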

Advantages and Limitations

Strengths

Specialization: Each agent focuses on specific tasks, improving quality
Scalability: Parallel processing and hierarchical structure support growth
Consistency: Shared RAG system ensures coherent knowledge representation
Adaptability: Human-in-the-loop enables continuous improvement
Reliability: Multiple fallback mechanisms and error handling
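
As one illustration of the fallback idea, a model call can be wrapped so that a failure on the primary model falls through to the next one. This is a sketch under assumptions, not Inboundr's implementation: `call_with_fallback` and the stub model functions are hypothetical, and the model names are borrowed from the technology list only as labels.

```python
import asyncio
from typing import Awaitable, Callable

ModelCall = Callable[[str], Awaitable[str]]


async def call_with_fallback(
    prompt: str, models: list[tuple[str, ModelCall]]
) -> tuple[str, str]:
    """Try each model in order; return (model_name, output) from the first success."""
    errors: list[str] = []
    for name, call in models:
        try:
            return name, await call(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


# Stub model clients for demonstration (hypothetical).
async def failing_model(prompt: str) -> str:
    raise TimeoutError("upstream timeout")


async def working_model(prompt: str) -> str:
    return f"draft for: {prompt}"


used, output = asyncio.run(
    call_with_fallback(
        "AI agents in production",
        [("claude-3.5-sonnet", failing_model), ("gpt-4o", working_model)],
    )
)
print(used, "->", output)
```

Collecting the per-model errors before raising keeps the final failure message useful when every model in the chain fails.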

Limitations

Complexity: Multi-agent coordination introduces architectural complexity
Latency: Sequential processing can create bottlenecks
Cost: Multiple LLM calls increase operational expenses
Maintenance: Complex workflows require specialized maintenance
Debugging: Distributed agent behavior is challenging to troubleshoot
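
The latency trade-off can be made concrete: steps that depend on each other's output run back to back, so their latencies add up, while independent steps can overlap. A small sketch with simulated delays (the step names and durations are made up for illustration):

```python
import asyncio
import time


async def agent_step(name: str, seconds: float) -> str:
    # Simulates an LLM or tool call with a fixed delay.
    await asyncio.sleep(seconds)
    return name


async def run_sequential(steps: list[tuple[str, float]]) -> float:
    """Run steps one after another and return the elapsed wall-clock time."""
    start = time.perf_counter()
    for name, seconds in steps:
        await agent_step(name, seconds)
    return time.perf_counter() - start


async def run_parallel(steps: list[tuple[str, float]]) -> float:
    """Run all steps concurrently and return the elapsed wall-clock time."""
    start = time.perf_counter()
    await asyncio.gather(*(agent_step(n, s) for n, s in steps))
    return time.perf_counter() - start


steps = [("research", 0.2), ("draft", 0.2), ("review", 0.2)]
sequential = asyncio.run(run_sequential(steps))
parallel = asyncio.run(run_parallel(steps))
print(f"sequential: {sequential:.2f}s, parallel: {parallel:.2f}s")
```

Sequential time is roughly the sum of the step durations, and parallel time is roughly the longest single step. Only steps with no data dependency on each other can move to the parallel path.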

Conclusion

Inboundr's architecture represents a sophisticated implementation of modern AI agent patterns, demonstrating how Sequential Multi-Agent systems can be effectively combined with RAG, external tools, and human oversight to create production-ready AI applications.

The system's success lies in its balance of automation and human control, specialized agent roles, and comprehensive knowledge management. It combines several architectural patterns (sequential processing for complex workflows, parallel execution for research tasks, and hierarchical management for system coordination) to deliver a robust, scalable solution for content creation.

The architecture serves as a valuable reference for organizations looking to implement similar multi-agent systems, demonstrating practical approaches to agent coordination, knowledge management, and human-AI collaboration in production environments.

As AI agent architectures continue to evolve, Inboundr's implementation provides insights into the practical challenges and solutions required for building sophisticated, production-ready AI systems that deliver real business value while maintaining quality and reliability standards.
