Bogdan Pistol

Building Intelligent Research Agents with OpenAI's Agents Framework

Multi-agent systems are transforming how we build AI applications. Instead of relying on a single large language model to handle every task, we can now orchestrate specialized agents that work together, each focused on what it does best. In this tutorial, we'll build a practical research assistant using OpenAI's Agents Framework—and see how clean prompt management makes agent development faster and more reliable.

Why OpenAI's Agents Framework?

Released in March 2025 as a production-ready evolution of OpenAI's experimental Swarm project, the Agents Framework stands out in a crowded field of agent frameworks for several key reasons:

Lightweight by Design

Unlike heavyweight frameworks that require mastering complex graph structures or conversation patterns, OpenAI's Agents SDK introduces just four core primitives:

  • Agents - LLMs equipped with instructions and tools
  • Handoffs - Enable agents to delegate tasks to specialists
  • Guardrails - Validate inputs and outputs for safety
  • Sessions - Automatically maintain conversation history

This minimalist approach means you can build sophisticated multi-agent systems without wrestling with unnecessary abstraction layers.
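
For a sense of how little ceremony is involved, here is roughly the hello-world pattern from the SDK's own documentation: one agent, one synchronous run, and nothing else to configure.

from agents import Agent, Runner

# One agent, one synchronous run: no graphs, callbacks, or state machines to define
agent = Agent(name="Assistant", instructions="You are a helpful assistant")

result = Runner.run_sync(agent, "Write a haiku about recursion in programming.")
print(result.final_output)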

Production-Ready from Day One

While frameworks like LangGraph excel at complex stateful workflows and AutoGen shines in multi-role conversations, OpenAI's Agents SDK focuses on production readiness. It includes:

  • Built-in tracing for debugging and monitoring
  • Automatic schema generation for function tools
  • Seamless integration with GPT-4o and other OpenAI models
  • Human-in-the-loop (HITL) approval for critical decisions

Provider-Agnostic with 100+ LLM Support

Despite being an OpenAI product, the framework is compatible with over 100 different language models, giving you flexibility as the AI landscape evolves.
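
For example, the SDK's LiteLLM extension lets you swap in a non-OpenAI model without changing the rest of the agent. A sketch, assuming the litellm extra is installed with pip install "openai-agents[litellm]"; the model string and API key below are illustrative placeholders:

from agents import Agent
from agents.extensions.models.litellm_model import LitellmModel

# Any LiteLLM-supported model identifier works here (illustrative values below)
claude_analyst = Agent(
    name="Analyst",
    instructions="You are a research analyst.",
    model=LitellmModel(model="anthropic/claude-3-5-sonnet-20240620", api_key="sk-ant-...")
)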

What We're Building

Our research assistant demonstrates real multi-agent coordination:

  1. Research Planner breaks topics into subtopics and questions
  2. Analyst agents dive deep into each subtopic
  3. Synthesizer combines findings into coherent insights

This pattern applies to countless use cases: competitive analysis, market research, technical documentation, and more.

Setup: Getting Started

Prerequisites

You'll need:

  • Python 3.11.5+ or 3.12+ (earlier 3.11 versions have typing compatibility issues)
  • An OpenAI API key (see "Configure Your API Key" below)

Installation

# Clone the repository
git clone https://github.com/bogdan-pistol/dakora.git
cd dakora/examples/openai-agents

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Configure Your API Key

# Copy the example environment file
cp .env.example .env

# Edit .env and add your OpenAI API key
# OPENAI_API_KEY=sk-proj-...

Where to get your API key:

  1. Visit OpenAI Platform
  2. Navigate to API Keys in your account settings
  3. Create a new secret key
  4. Copy and paste it into your .env file

Running Your First Research

Let's see it in action:

python research_assistant.py "AI agent frameworks in 2025"

Example Output

🔬 Starting research on: AI agent frameworks in 2025

📋 Planning research strategy...

📊 Research Plan:
Strategy: This research plan will employ a literature review, case studies, and expert
interviews to gather data on the evolution, integration, and ethical considerations of
AI agent frameworks, aiming to provide a comprehensive understanding of their state in 2025.
Subtopics: 3

🔍 Analyzing subtopic 1/3: Evolution of AI Agent Frameworks
✓ Analysis complete

🔍 Analyzing subtopic 2/3: Integration and Interoperability of AI Agents
✓ Analysis complete

🔍 Analyzing subtopic 3/3: Ethical and Regulatory Considerations
✓ Analysis complete

🔄 Synthesizing findings...
✓ Synthesis complete

================================================================================

📄 RESEARCH REPORT

================================================================================
**Executive Summary**

The evolution of AI agent frameworks in 2025 is marked by significant advancements
in technology, fostering sophisticated autonomous systems capable of handling complex
tasks across industries. These systems integrate machine learning models like GPT and
BERT, allowing for enhanced natural language processing and multi-agent collaboration...

[Full report output...]

================================================================================

💾 Full report saved to: research_output_20251006_103427.json

Understanding the Architecture

The Multi-Agent Workflow

Our system follows a clear pattern:

User Query → Research Planner → Multiple Analysts (parallel) → Synthesizer → Report

Each agent has a specific role, making the system easier to debug, test, and extend.

Key Code Components

Let's walk through the main building blocks:

1. Agent Creation with Managed Prompts

from agents import Agent
from dakora.vault import Vault

vault = Vault(config_path="dakora.yaml")

def create_research_planner():
    prompt = vault.get("planner_system")
    return Agent(
        name="Research Planner",
        instructions=prompt.render(),
        model="gpt-4o-mini"
    )

Why this matters: Instead of hardcoding prompts in Python, we store them as versioned YAML files. This separation of concerns means prompt engineers can iterate on instructions without touching code.

2. Dynamic Prompt Rendering

analyst_prompt = vault.get("analyst_system")
analyst_instructions = analyst_prompt.render(
    subtopic=subtopic["title"],
    questions=subtopic.get("questions", []),
    context=f"Part of broader research on: {topic}",
    analysis_depth="standard",
    source_types=subtopic.get("sources", [])
)

analyst = Agent(
    name=f"Analyst-{idx}",
    instructions=analyst_instructions,
    model="gpt-4o-mini"
)

Why this matters: Each analyst gets customized instructions based on its specific subtopic. The same prompt template adapts to different contexts, reducing duplication.

3. Agent Execution with OpenAI's Runner

import json

from agents import Runner

plan_result = Runner.run_sync(planner, f"Create a research plan for: {topic}")

# Process the output
plan_data = json.loads(plan_result.final_output)

Why this matters: The Runner handles session management, conversation history, and tool execution automatically. You focus on orchestration, not plumbing.
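
Since the planner is instructed to return raw JSON, the json.loads call above works most of the time, but models occasionally wrap their output in a Markdown code fence anyway. A small defensive parser (a sketch, not part of the original script) keeps the pipeline from crashing:

import json

def parse_plan(raw: str) -> dict:
    """Parse the planner's JSON output, tolerating a Markdown code fence around it."""
    text = raw.strip()
    if text.startswith("```"):
        # Drop the opening fence (and optional language tag) plus the closing fence
        text = text.split("\n", 1)[1].rsplit("```", 1)[0]
    return json.loads(text)

plan_data = parse_plan(plan_result.final_output)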

4. Per-Subtopic Analysis

findings = []

for idx, subtopic in enumerate(plan_data.get("subtopics", []), 1):
    # analyst_instructions is re-rendered for each subtopic, as shown in section 2
    analyst = Agent(
        name=f"Analyst-{idx}",
        instructions=analyst_instructions,
        model="gpt-4o-mini"
    )

    analysis_result = Runner.run_sync(analyst, f"Analyze: {subtopic['title']}")

    findings.append({
        "subtopic": subtopic["title"],
        "analysis": analysis_result.final_output
    })

Why this matters: Each subtopic gets dedicated analysis. This example processes subtopics sequentially, but the pattern extends naturally to parallel execution for faster results, as the sketch below shows.
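
Here is one way to fan the analysts out concurrently using the SDK's async runner. A sketch only: make_instructions is a hypothetical helper standing in for the per-subtopic rendering from section 2.

import asyncio

from agents import Agent, Runner

async def analyze_all(subtopics: list[dict]) -> list[dict]:
    async def analyze(idx: int, subtopic: dict) -> dict:
        analyst = Agent(
            name=f"Analyst-{idx}",
            instructions=make_instructions(subtopic),  # hypothetical per-subtopic render
            model="gpt-4o-mini"
        )
        result = await Runner.run(analyst, f"Analyze: {subtopic['title']}")
        return {"subtopic": subtopic["title"], "analysis": result.final_output}

    # Run all analysts concurrently instead of one after another
    return await asyncio.gather(*(
        analyze(idx, s) for idx, s in enumerate(subtopics, 1)
    ))

findings = asyncio.run(analyze_all(plan_data.get("subtopics", [])))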

Introducing Dakora: Clean Prompt Management

As our research assistant grew, a challenge emerged: managing increasingly complex agent instructions. This is where Dakora proved invaluable.

The Prompt Management Problem

Multi-agent systems often mean:

  • Dozens of prompt variations
  • Dynamic content injection
  • Version tracking needs
  • Type safety requirements

Hardcoding prompts becomes unmaintainable. Template strings help, but lack structure. We needed something purpose-built.

Why Dakora?

Dakora is a lightweight Python library designed specifically for prompt template management. Here's what made it the right choice for this project:

Type-Safe Inputs

# prompts/analyst_system.yaml
id: analyst_system
version: 1.0.0
description: System prompt for deep analysis

inputs:
  subtopic:
    type: string
    required: true
  questions:
    type: array<string>
    required: true
  analysis_depth:
    type: string
    required: false
    default: standard
  source_types:
    type: array<string>
    required: false
    default: []

Input validation happens before the LLM call, catching errors early and saving API costs.
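
For example, omitting the required questions input fails at render time, before any tokens are paid for. A sketch: the exact exception type Dakora raises here is an assumption, so it is caught broadly.

prompt = vault.get("analyst_system")

try:
    # 'questions' is required by the template above, so this render should fail fast
    prompt.render(subtopic="Evolution of AI Agent Frameworks")
except Exception as e:
    print(f"Caught before any API spend: {e}")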

Hot Reload During Development

Edit a prompt file, re-run your script—changes apply immediately. No restart required.

# Edit analyst prompt to add new analysis section
vim prompts/analyst_system.yaml

# Changes apply immediately
python research_assistant.py "your topic"

Version Control Integration

id: analyst_system
version: 1.1.0  # Bumped from 1.0.0
description: Added economic impact analysis section

template: |
  You are a research analyst specializing in deep, structured analysis.

  Analysis Framework:
  1. Current State
  2. Key Developments
  3. Economic Impact  # New section
  4. Challenges & Opportunities
  5. Future Outlook

Track prompt evolution alongside code changes. Roll back if needed. It's all in git.

Visual Prompt Editor

Dakora includes an interactive web playground for testing templates:

dakora playground

This browser-based editor lets you:

  • Test prompts with different inputs
  • See rendered output instantly
  • Validate type safety
  • Share templates with your team

Try it online at playground.dakora.io

How Dakora Fits In

In our research assistant, Dakora handles:

  1. Storage - All prompts live in prompts/ as YAML files
  2. Validation - Type-safe inputs catch errors before API calls
  3. Rendering - Jinja2 templates with custom filters
  4. Versioning - Semantic versioning built-in

The OpenAI Agents Framework handles:

  1. Execution - Agent coordination and tool calling
  2. State - Conversation history and session management
  3. Integration - OpenAI API communication

This separation means cleaner code, faster iteration, and easier collaboration between developers and prompt engineers.

Exploring the Prompts

Let's look at how prompts are structured:

Research Planner Prompt

id: planner_system
version: 1.0.0
description: Creates research strategies and identifies subtopics

template: |
  You are a research planning specialist. Your task is to analyze research
  topics and create comprehensive research strategies.

  Research Topic: {{topic}}
  Number of Subtopics: {{num_subtopics}}
  {% if focus_areas|length > 0 %}
  Focus Areas: {{focus_areas|yaml}}
  {% endif %}

  IMPORTANT: You must respond with ONLY valid JSON, no additional text.

  Format your response as this exact JSON structure:
  {
    "subtopics": [
      {
        "title": "Subtopic title",
        "questions": ["Q1", "Q2", "Q3"],
        "sources": ["Source type 1", "Source type 2"],
        "priority": 1
      }
    ],
    "research_strategy": "Brief overview of the research approach"
  }

inputs:
  topic:
    type: string
    required: true
  num_subtopics:
    type: number
    default: 3
  focus_areas:
    type: array<string>
    default: []

Analyst Prompt with Conditional Logic

id: analyst_system
version: 1.0.0
description: Deep analysis framework with conditional depth

template: |
  You are a research analyst specializing in deep, structured analysis.

  Assignment:
  Subtopic: {{subtopic}}
  Research Questions: {{questions|yaml}}

  Analysis Depth: {{analysis_depth}}
  {% if analysis_depth == "comprehensive" %}
  Provide detailed analysis with specific examples, data points, and multiple viewpoints.
  {% elif analysis_depth == "standard" %}
  Provide balanced analysis covering main points with supporting examples.
  {% else %}
  Provide concise analysis focusing on the most critical insights.
  {% endif %}

inputs:
  subtopic:
    type: string
    required: true
  questions:
    type: array<string>
    required: true
  analysis_depth:
    type: string
    default: standard

The {% if %} conditionals let one template handle multiple depth levels—no need for separate prompts.
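
One way to see this is to render the same template at two depths and compare the output (a sketch using the vault from earlier):

analyst_prompt = vault.get("analyst_system")
common = {"subtopic": "AI agent frameworks", "questions": ["What changed in 2025?"]}

concise = analyst_prompt.render(**common)  # defaults to analysis_depth="standard"
thorough = analyst_prompt.render(**common, analysis_depth="comprehensive")

# Only the depth-specific instruction paragraph differs between the two renders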

Project Structure

Here's how everything is organized:

openai-agents/
├── research_assistant.py       # Main orchestration script
├── prompts/                    # Dakora template directory
│   ├── coordinator_system.yaml # Workflow orchestration
│   ├── planner_system.yaml     # Research strategy
│   ├── analyst_system.yaml     # Deep analysis
│   ├── synthesizer_system.yaml # Finding synthesis
│   └── report_template.yaml    # Output formatting
├── dakora.yaml                 # Dakora configuration
├── requirements.txt            # Python dependencies
├── .env                        # API keys (not in git)
└── README.md

The dakora.yaml config file:

registry: local
prompt_dir: ./prompts
logging:
  enabled: false

Extending the System

Add a New Agent Type

  1. Create a prompt template:
# prompts/fact_checker.yaml
id: fact_checker
version: 1.0.0
description: Validates claims against sources

template: |
  You are a fact-checking specialist. Verify the following claims:

  {{claims}}

  For each claim, provide:
  - Verification status (verified/unverified/false)
  - Supporting evidence
  - Confidence level

inputs:
  claims:
    type: string
    required: true
  2. Load it in your code:
fact_checker_prompt = vault.get("fact_checker")
fact_checker = Agent(
    name="Fact Checker",
    instructions=fact_checker_prompt.render(claims=findings_text)
)

Implement Agent Handoffs

Use the framework's handoff mechanism:

from agents import Agent

fact_checker = Agent(
    name="Fact Checker",
    # In a handoff, the claims arrive through the conversation itself, so this
    # assumes a prompt variant that does not require a 'claims' input
    instructions=vault.get("fact_checker").render()
)

analyst = Agent(
    name="Analyst",
    instructions=analyst_instructions,
    handoffs=[fact_checker]  # the SDK accepts target agents directly in this list
)

When the analyst encounters uncertain claims, it can hand off to the fact checker automatically.
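
At run time you can inspect which agent ended up producing the answer; in the SDK, the run result records the last agent that handled the conversation (a brief sketch):

result = Runner.run_sync(analyst, "Analyze recent claims about agent framework benchmarks")

# If the analyst handed off, the fact checker is the final agent
print(f"Final agent: {result.last_agent.name}")
print(result.final_output)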

Command Line Interface

Dakora includes a CLI for prompt management:

# List all templates
dakora list

# View a specific prompt
dakora get analyst_system

# Bump version
dakora bump analyst_system --minor

# Watch for changes (hot reload)
dakora watch

Key Takeaways

OpenAI Agents Framework Strengths

  1. Simplicity - Four primitives cover most agent patterns
  2. Production-Ready - Built-in tracing, guardrails, and HITL
  3. Flexibility - Works with 100+ LLMs, not just OpenAI
  4. Growing Ecosystem - TypeScript support, voice agents, and more

When to Use This Framework

Great fit for:

  • Production applications requiring reliability
  • Projects already using OpenAI models
  • Teams valuing simplicity over complex workflows
  • Applications needing human oversight (HITL)

Consider alternatives if:

  • You need complex graph-based workflows (try LangGraph)
  • Multi-role conversations are central (try AutoGen)
  • You're building on non-OpenAI infrastructure

Dakora for Prompt Management

Use Dakora when:

  • Managing 10+ prompts across multiple agents
  • Collaborating between developers and prompt engineers
  • Version tracking and rollback are important
  • Type safety prevents costly API errors

Benefits we experienced:

  • 3x faster prompt iteration
  • Zero hardcoded strings in agent code
  • Easy A/B testing of prompt variations
  • Clear audit trail of prompt changes

Resources

Code & Documentation

  • Example source code: github.com/bogdan-pistol/dakora (see examples/openai-agents)
  • Dakora Playground: playground.dakora.io

Official Documentation

  • OpenAI Agents SDK: openai.github.io/openai-agents-python

Community

  • Dakora Discord

Next Steps

Try building your own multi-agent system:

  1. Clone the example and run it locally
  2. Modify the analyst prompt to add new analysis sections
  3. Add a fourth agent type (summarizer, fact-checker, etc.)
  4. Experiment with different models (GPT-4o, GPT-4o-mini, Claude, etc.)
  5. Integrate with your own data sources or APIs

The combination of OpenAI's Agents Framework and Dakora's prompt management creates a powerful foundation for building reliable, maintainable AI agents. Start simple, iterate fast, and scale as you learn.


Built something cool with OpenAI Agents and Dakora? Share it in the Dakora Discord—we'd love to see what you create.
