Harish Kotra (he/him)

Posted on Jun 21

Building Internet Detective AI: A Production-Grade Multi-Agent AI System

#ai #programming #productivity #dailybuild2026

Paste your LinkedIn, GitHub, Twitter, and resume — get a brutally honest AI investigation of your entire internet personality.

Introduction

Internet Detective AI is a Next.js 15 application that accepts a person's digital footprint (LinkedIn URL, GitHub URL, Twitter handle, resume text) and runs it through a 7-agent AI pipeline to produce a detailed, entertaining investigation report. The output includes evidence-based facts, behavioral signals, hidden obsessions, a career prediction, a startup parody pitch, coworker quotes, a brutal roast, internet personality scores, and a "cooked level" meter.

It serves two purposes simultaneously:

A viral consumer app — Shareable, funny, and surprisingly insightful. Users compete for the best roasts and personality scores.
A reference architecture for production AI engineering — Every layer is designed to be educational, extensible, and production-ready.

The entire system was built in 24 hours and spans 14 areas of modern AI engineering, each documented independently in the docs/ folder.

The Vision

Most AI demo projects are either fun but shallow (a single API call wrapped in a nice UI) or educational but boring (a Jupyter notebook with no frontend). Internet Detective AI bridges this gap:

A single codebase that serves as both a genuinely fun consumer app AND a comprehensive reference architecture.

Every engineering decision was made with two audiences in mind: end-users who just want to see their roast, and developers who want to understand how production AI systems work. The result is a project that scales from "clone and run in 5 minutes" to "study every production pattern used by AI engineering teams."

Architecture Overview

The system follows a layered architecture where each layer has a single responsibility:

┌─────────────────────────────────────────────────────┐
│                   Next.js 15 App                     │
│  ┌─────────────┐  ┌──────────────┐  ┌────────────┐ │
│  │  Landing     │  │ Investigation│  │  Developer  │ │
│  │  Page        │  │ Report Page  │  │  Dashboard  │ │
│  └──────┬───────┘  └──────┬───────┘  └──────┬─────┘ │
│         │                 │                  │        │
│  ┌──────┴─────────────────┴──────────────────┴─────┐ │
│  │              API Routes                          │ │
│  └─────────────────────┬───────────────────────────┘ │
├────────────────────────┼─────────────────────────────┤
│  ┌─────────────────────┴───────────────────────────┐ │
│  │          Multi-Agent Orchestrator                │ │
│  │  ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐  │ │
│  │  │Profile│ │Signal│ │Career│ │Start │ │ Roast│  │ │
│  │  │Analyst│ │Detect│ │Predic│ │Gener │ │Agent │  │ │
│  │  └──┬───┘ └──┬───┘ └──┬───┘ └──┬───┘ └──┬───┘  │ │
│  │     └────────┴────────┴────────┴────────┘       │ │
│  │              ┌──────────────┐                    │ │
│  │              │  Governance  │                    │ │
│  │              │    Agent     │                    │ │
│  │              └──────┬───────┘                    │ │
│  │              ┌──────┴───────┐                    │ │
│  │              │   Synthesis  │                    │ │
│  │              │    Agent     │                    │ │
│  │              └──────────────┘                    │ │
│  └─────────────────────┬───────────────────────────┘ │
│  ┌─────────────────────┴───────────────────────────┐ │
│  │              AI Service Layer                     │ │
│  └─────────────────────┬───────────────────────────┘ │
│  ┌─────────────────────┴───────────────────────────┐ │
│  │           Provider Abstraction Layer              │ │
│  │  Zen │ OpenAI │ Anthropic │ Gemini │ OpenRouter  │ │
│  └─────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────┘

The request flow is: HTTP Request → API Route → Context Builder → Orchestrator → 7 Agents → Governance Loop → Synthesis → Response.

Provider Abstraction

The most fundamental architectural decision was decoupling every agent from any specific AI provider. We use the Adapter pattern with a ProviderAdapter interface:

// src/lib/providers/types.ts
export interface ProviderAdapter {
  name: ProviderType;
  chat(request: ChatCompletionRequest): Promise<ChatCompletionResponse>;
  getModels(): Promise<string[]>;
  isAvailable(): Promise<boolean>;
}

Every provider implements this interface. The factory handles instantiation and configuration:

// src/lib/providers/factory.ts
export class ProviderFactory {
  private static registry = loadConfig();

  static createProvider(type: ProviderType): ProviderAdapter {
    const registration = ProviderFactory.registry[type];
    if (!registration) {
      throw new Error(`Unknown provider type: ${type}`);
    }
    if (!registration.config) {
      throw new Error(
        `Provider "${type}" is not configured. Set the required environment variable.`,
      );
    }
    return new registration.constructor(registration.config);
  }

  static getDefaultProvider(): ProviderAdapter {
    const providerType = (process.env.AI_PROVIDER || "zen") as ProviderType;
    return ProviderFactory.createProvider(providerType);
  }
}

The registry supports 7 providers out of the box: Zen, OpenAI, Anthropic, Gemini, OpenRouter, Featherless, and Ollama. Each provider is gated by environment variables — if the key is missing, it's simply excluded from getAllAvailableProviders().

The BaseProvider abstract class adds shared behavior like error handling with retry logic, latency measurement, and cost calculation:

// src/lib/providers/base.ts
export abstract class BaseProvider implements ProviderAdapter {
  abstract name: ProviderType;
  protected config: ProviderConfig;

  protected async executeWithRetry(
    request: ChatCompletionRequest,
  ): Promise<ChatCompletionResponse> {
    return retryWithBackoff(() => this.chat(request), 3);
  }

  protected handleError(error: unknown, context: string): never {
    if (error instanceof AIProviderError) throw error;
    if (error instanceof Error) {
      const statusCode = this.extractStatusCode(error);
      const retryable = statusCode >= 500 || statusCode === 429;
      throw new AIProviderError(`${this.name}: ${context} - ${error.message}`, {
        code: this.errorCodeFromStatus(statusCode),
        statusCode,
        provider: this.name,
        retryable,
        cause: error,
      });
    }
    throw new AIProviderError(`${this.name}: ${context} - Unknown error`, {
      code: "UNKNOWN_ERROR",
      provider: this.name,
      retryable: false,
      cause: error,
    });
  }
}

The result: Any agent can use any provider. Switch the entire app from GPT-4o to Claude to Gemini with one environment variable. Adding a new provider means writing exactly one adapter file.

Multi-Agent Architecture

The orchestrator runs 7 agents in a directed pipeline. Each agent has a single responsibility:

Agent	Input	Output	Responsibility
Profile Analyst	ContextPack	Facts, digital profile summary	Extract directly observable facts
Signal Detector	Context + Facts	Strong signals, hidden obsessions	Find behavioral patterns
Career Predictor	Context + Signals	Career prediction	Predict future trajectory
Startup Generator	Context + Signals	Startup parody	Create VC-pitch satire
Roast Agent	Context + Signals	Roasts, coworker quotes, verdict	Generate playful humor
Governance Agent	All outputs	Governance check	Validate ethical compliance
Final Synthesis	Everything	InvestigationReport	Assemble final report

Each agent extends BaseAgent, which provides shared infrastructure:

// src/lib/agents/base.ts
export abstract class BaseAgent {
  protected config: AgentConfig;
  protected ai: AIService;

  abstract process(input: any): Promise<{ output: any; trace: AgentTrace }>;

  protected async callAIJSON<T>(
    userPrompt: string,
  ): Promise<{ parsed: T; trace: AgentTrace }> {
    return this.ai.chatJSON<T>({
      systemPrompt: this.config.systemPrompt,
      userPrompt,
      model: this.config.model,
      temperature: this.config.temperature,
      responseFormat: "json_object",
      agentName: this.config.name,
    });
  }

  protected async safeProcess<T>(
    processFn: () => Promise<{ output: T; trace: AgentTrace }>,
    fallbackOutput: T,
  ): Promise<{ output: T; trace: AgentTrace }> {
    try {
      return await processFn();
    } catch (error) {
      return {
        output: fallbackOutput,
        trace: { /* error trace with success: false */ },
      };
    }
  }
}

Every agent has safe defaults and fallback outputs. If an agent call fails (timeout, parse error, provider outage), the orchestrator catches it and continues with degraded data rather than crashing the entire pipeline.

The orchestrator runs agents sequentially, passing their outputs forward:

// src/lib/agents/orchestrator.ts
export class InvestigationOrchestrator {
  async investigate(context: ContextPack) {
    // Step 1: Profile Analyst
    const { output: profileOutput, trace: profileTrace } =
      await this.runAgent(this.profileAnalyst, { context }, { facts: [] });

    // Step 2: Signal Detector (receives analyst facts)
    const { output: signalOutput, trace: signalTrace } =
      await this.runAgent(this.signalDetector,
        { context, facts: profileOutput.facts },
        { strongSignals: [], hiddenObsessions: [] });

    // ... Steps 3-5: Career Predictor, Startup Generator, Roast Agent

    // Step 6: Governance Check (with retries)
    for (let attempt = 0; attempt <= MAX_GOVERNANCE_RETRIES; attempt++) {
      const { output: govOutput, trace: govTrace } =
        await this.runAgent(this.governanceAgent, governanceInput, fallback);
      if (govOutput.passed) break;
      // Sanitize inputs and retry
    }

    // Step 7: Final Synthesis
    const { output: report, trace: synthesisTrace } =
      await this.runAgent(this.finalSynthesis, synthesisInput, fallbackReport);

    return { report, traces, governanceCheck };
  }
}

The governance loop is key: if the governance agent finds violations, the orchestrator sanitizes inputs and retries up to 2 times before accepting the best available result. This creates a self-correcting pipeline.

Context Engineering

Raw profile input comes in many shapes: LinkedIn URLs, GitHub usernames, free-text resumes, Twitter bios. The ContextBuilder class normalizes all of this into a structured ContextPack.

The context pipeline has three phases:

Extraction — Parse each input source independently, extracting education, work experience, skills, repos, and stats from LinkedIn, GitHub, Twitter, and resume text
Deduplication — Merge across sources, removing duplicates using composite keys:

   private deduplicateEducation(items: Education[]): Education[] {
     const seen = new Set<string>();
     return items.filter((item) => {
       const key = `${item.institution}|${item.degree}|${item.field}`;
       if (seen.has(key)) return false;
       seen.add(key);
       return true;
     });
   }

Compression — Remove duplicate lines, normalize whitespace, and calculate compression ratio:

   private compressContent(text: string): string {
     if (!text || text.length < 500) return text;
     const lines = text.split("\n");
     const compressed: string[] = [];
     const seen = new Set<string>();
     for (const line of lines) {
       const trimmed = line.trim();
       if (!trimmed) continue;
       const normalized = trimmed.toLowerCase().replace(/\s+/g, " ");
       if (seen.has(normalized)) continue;
       seen.add(normalized);
       compressed.push(trimmed);
     }
     return compressed.join("\n");
   }

The builder also extracts key signals early — things like years of experience, leadership roles, top companies, open-source popularity, trending focus areas — which get passed to every agent as context. This early signal extraction reduces the burden on each agent to rediscover obvious patterns, saving tokens and improving accuracy.

Structured Outputs

Every agent returns typed JSON. The entire report schema is defined in TypeScript:

// src/lib/types.ts
export interface InvestigationReport {
  id: string;
  profileHash: string;
  digitalProfileSummary: string;
  facts: Fact[];
  strongSignals: StrongSignal[];
  hiddenObsessions: HiddenObsession[];
  coworkerQuotes: CoworkerQuote[];
  startupParody: StartupParody;
  careerPrediction: CareerPrediction;
  brutalRoast: Roast[];
  wildGuesses: WildGuess[];
  finalVerdict: string;
  personalityScores: InternetPersonalityScores;
  cookedLevel: CookedLevel;
  metadata: ReportMetadata;
}

Agents use JSON mode (response_format: { type: "json_object" }) and a chatJSON<T> helper that handles parsing and error recovery:

// src/lib/ai.ts
async chatJSON<T>(
  options: AIRequestOptions,
): Promise<{ parsed: T; trace: AgentTrace }> {
  const jsonOptions = { ...options, responseFormat: "json_object" };
  const response = await this.chat(jsonOptions);

  if (!response.trace.success) {
    throw new Error(`AI chat failed: ${response.trace.error}`);
  }

  // Clean markdown code fences from JSON output
  const cleaned = response.content
    .replace(/```
{% endraw %}
json\s*/gi, "")
    .replace(/
{% raw %}
```\s*$/g, "")
    .trim();

  let parsed: T;
  try {
    parsed = JSON.parse(cleaned) as T;
  } catch (parseError) {
    throw new Error(
      `Failed to parse JSON response: ${parseError.message}\nRaw: ${response.content}`
    );
  }
  return { parsed, trace: response.trace };
}

Every JSON response goes through parser recovery — the system strips markdown code fences, trims whitespace, and falls back to partial matching if the output is malformed. Combined with the safeProcess pattern, this means a single agent failure never crashes the entire investigation.

Prompt Engineering

Prompts are stored as separate markdown files in prompts/system/ and loaded by a PromptRegistry:

// src/lib/prompts/index.ts
export class PromptRegistry {
  private prompts: Map<AgentType, string> = new Map();

  async load(): Promise<void> {
    const entries = await fs.promises.readdir(PROMPTS_DIR, { withFileTypes: true });
    for (const entry of entries.filter(f => f.isFile() && f.name.endsWith(".txt"))) {
      const agentType = this.resolveAgentType(entry.name);
      if (!agentType) continue;
      const content = await fs.promises.readFile(
        path.join(PROMPTS_DIR, entry.name), "utf-8"
      );
      this.prompts.set(agentType, content.trim());
    }
  }
}

This means prompts can be edited without touching any code — useful for iterating with non-technical stakeholders or A/B testing prompt variants.

Every prompt file follows a consistent structure: Purpose → Version → Expected Inputs → JSON Schema → Step-by-Step Instructions → Failure Modes → Guardrails → Example Outputs → "Why This Matters".

The "Why This Matters" section is the key educational feature. For example, the Roast Agent prompt explains why its design choices are effective:

🎭 Persona-Driven Prompting: The "roast comedian" persona is a deliberate choice — it constrains the model to a specific tone, vocabulary, and ethical framework. Persona prompts are one of the most effective prompt engineering techniques because they activate the model's understanding of social roles.

📎 Specificity = Funniest: The instruction to always reference real profile data is grounded in comedy theory: specific humor outperforms generic humor. "Their GitHub has 47 repos with 0 stars each" is funnier than "they code a lot."

⚖️ The Kindness Gate: Adding a "kindness check" step is an ethical prompt engineering pattern. Rather than a blunt guardrail ("don't be mean"), it frames the check as a social simulation leveraging the model's theory of mind capabilities.

This documentation-first approach turns the prompts directory into an educational resource.

Governance & Safety

The system has two independent safety layers running at different points in the pipeline:

Input Safety (`SafetyChecker`)

Runs before any agent processes data. Detects four threat categories:

// src/lib/safety/index.ts
export class SafetyChecker {
  checkPrompt(input: string): SafetyCheck {
    const threats: SafetyThreat[] = [
      ...this.detectPromptInjection(input),
      ...this.detectJailbreak(input),
      ...this.detectPII(input),
    ];
    return { passed: threats.length === 0, threats };
  }
}

Prompt injection detection uses 19 regex patterns targeting common escape techniques ("ignore all previous instructions", "you are now...", "DAN", "developer mode"). PII detection catches emails, phone numbers, SSNs, credit cards, addresses, and passport numbers.

Output Governance (`GovernanceValidator`)

Runs after agents generate content. Checks against 7 prohibited attributes: race, ethnicity, religion, sexual orientation, mental health, medical diagnosis, political affiliation, and criminal activity.

// src/lib/governance/index.ts
export class GovernanceValidator {
  validate(report: Partial<InvestigationReport>): GovernanceCheck {
    const violations: GovernanceViolation[] = [];

    for (const [fieldName, text] of this.extractTextFields(report)) {
      violations.push(...this.checkText(text, fieldName));
    }
    // Also check facts, strong signals, roasts, etc.

    return {
      passed: violations.length === 0,
      violations: this.deduplicateViolations(violations),
      checkedAt: new Date().toISOString(),
    };
  }

  sanitize(report, violations): Partial<InvestigationReport> {
    // Redact any text matching violation patterns
    const redacted = this.redactText(text, violations);
    return { ...report, digitalProfileSummary: redacted, ... };
  }
}

The governance agent operates in a validate → retry → sanitize loop within the orchestrator. If violations are found, the orchestrator strips offending facts and retries the governance check. If violations persist after MAX_GOVERNANCE_RETRIES, they're sanitized via redaction and recorded as metadata.

Evaluations

The evaluation framework measures every generated report across 5 metrics:

// src/lib/eval/index.ts
computeMetrics(report: InvestigationReport, context: ContextPack) {
  return {
    json_compliance: this.measureJSONCompliance(reportJson),
    consistency: this.measureConsistency(report),
    hallucination_rate: this.measureHallucinationRate(report, context),
    humor_score: this.measureHumorScore(report),
    accuracy: this.measureAccuracy(report, context),
  };
}

JSON Compliance — Checks all required fields exist and match expected types
Consistency — Measures how well facts align with the summary, career prediction, and timeline
Hallucination Rate — Compares fact text against source context using word-overlap analysis
Humor Score — Evaluates variety, intensity distribution, and quality signals in roasts
Accuracy — Inverse of hallucination rate

The evaluation datasets include 20 profiles across 4 categories (developers, designers, founders, creators), each with expected score ranges and minimum fact counts:

{
  "id": "dev-1",
  "name": "Senior Full-Stack Developer",
  "expectedOutputs": {
    "minFacts": 14,
    "keySignals": ["strong technical background", "open source contributor"],
    "expectedScores": {
      "builderScore": { "min": 75, "max": 100 },
      "operatorScore": { "min": 60, "max": 90 }
    }
  }
}

The compareModelsWithConfigs method runs the same dataset against multiple model/provider pairs, generating side-by-side comparisons of latency, cost, and quality scores — enabling data-driven model selection.

Production Features

Cost Tracking

Every agent call records its model, token usage, and estimated cost per model-specific pricing tables:

export class CostTracker {
  async recordCost(trace: AgentTrace, investigationId: string): Promise<CostRecord> {
    const record = {
      provider: trace.provider,
      model: trace.model,
      promptTokens: trace.tokenUsage.promptTokens,
      completionTokens: trace.tokenUsage.completionTokens,
      estimatedCost: estimateCost(trace.model,
        trace.tokenUsage.promptTokens,
        trace.tokenUsage.completionTokens),
      timestamp: new Date().toISOString(),
    };
  }
  // Query methods: getCostByProvider(), getCostByModel(), getCostByAgent()
}

Observability

The ObservabilityTracker stores all agent traces and investigations in-memory (with configurable size limits) and optionally forwards them to LangSmith for production monitoring:

export class ObservabilityTracker {
  async getTraceStats() {
    return {
      total, successful, failed,
      avgLatency, totalCost
    };
  }
  async getAgentPerformance() {
    return Record<agentName, { totalCalls, avgLatency, totalCost, successRate }>;
  }
}

Developer Dashboard

A full dashboard at /dashboard exposes real-time stats: per-agent latency and cost breakdowns, success rates, investigation history, and trace export (JSON/CSV). This turns the production data into actionable insights.

What's Next

The architecture is designed for extension. Here's what's on the roadmap:

Real API integrations — GitHub GraphQL, LinkedIn API, Twitter API for live data instead of pasted text
Browser extension — One-click investigation from any LinkedIn/GitHub profile page
New agents — Writing style analyzer, network graph agent, tech stack deep dive
New providers — Together AI, Groq, Replicate (each is one file)
OG image generation — Shareable report cards for social media
Mobile app — React Native wrapper for native sharing

Conclusion

Internet Detective AI demonstrates that a production-grade multi-agent AI system doesn't require a massive team or budget. The key patterns are:

Provider abstraction so you're never locked into one model
Safe defaults with graceful degradation so partial failures produce useful results
Structured outputs with typed schemas so agents produce machine-parseable results
Governance and safety as architectural layers, not afterthoughts
Documentation-first prompts that double as educational resources
Evaluation as a first-class feature for objective quality measurement

The entire codebase is open source at github.com/harishkotra/internet-detective-ai. Each of the 14 engineering topics covered by this project has its own documentation file in docs/.

Code & more: https://www.dailybuild.xyz/project/170-internet-detective-ai

DEV Community

Building Internet Detective AI: A Production-Grade Multi-Agent AI System

Introduction

The Vision

Architecture Overview

Provider Abstraction

Multi-Agent Architecture

Context Engineering

Structured Outputs

Prompt Engineering

Governance & Safety

Input Safety (`SafetyChecker`)

Output Governance (`GovernanceValidator`)

Evaluations

Production Features

Cost Tracking

Observability

Developer Dashboard

What's Next

Conclusion

Top comments (0)

Introduction

The Vision

Architecture Overview

Provider Abstraction

Multi-Agent Architecture

Context Engineering

Structured Outputs

Prompt Engineering

Governance & Safety

Input Safety (SafetyChecker)

Output Governance (GovernanceValidator)

Evaluations

Production Features

Cost Tracking

Observability

Developer Dashboard

What's Next

Conclusion

Input Safety (`SafetyChecker`)

Output Governance (`GovernanceValidator`)