Developer Harsh for Composio

Originally published at composio.dev

OpenAI Agent Builder: Step-by-step guide to building agents with MCP

Intro

OpenAI released Agent Builder, and netizens are going crazy about it on X and Reddit. Being an AI enthusiast myself, I took it for a spin.

I built a YouTube Q&A agent using Composio’s Rube MCP that answers my audience's queries naturally and conversationally.

Apart from that, I also explored the platform itself and examined ways to better prompt agents and implement safety nets in production.

Don’t worry, I will cover my learnings in this blog while showcasing my build step by step.

So, let’s get started!

TL;DR

  • Built a YouTube Q/A Agent using OpenAI Agent Builder and Composio that answers audience queries conversationally.
  • Explained step-by-step setup from Start, Guardrail, and Agent nodes to adding a Vector Store and Rube MCP integration.
  • Added safety checks, including moderation, jailbreak detection, and hallucination control, for secure production use.
  • Showcased testing process, confirming fast and effective Q/A performance.
  • Wrapped up with an overview of all 12 Agent Builder nodes and tips for building production-grade AI workflows efficiently.

Prerequisites

Completing the prerequisites is easy:

Head to the OpenAI Agent Builder platform and log in with your OpenAI credentials. If you don’t have an account, create one, add your billing details, and verify your organization so you can run agents in preview.

Next, in the Agent Builder tab, you will see three tabs and a familiar interface:

  • Workflows → Published workflows. A default My Flow may already be there.
  • Drafts → All unfinished/unpublished flows live here.
  • Templates → Predefined templates that work out of the box. Good for starters.

However, we will opt for a blank flow for maximum flexibility.



Building The YouTube Q/A Agent Step by Step

For simplicity and clarity, I have broken the flow down node by node. Each section covers the setup, with additional details on the node.

As an overview, we are going to follow this flow:

User Query -> Safety Net -> RAG Query by Agent -> Results Returned

However, all this begins with adding a start node.

1. Set Entry Point with Start Node

Click on the + Create button. This will open a blank canvas, much like n8n, but with a Start Node.

Start Node

The Start node acts as the entry point to any workflow. It offers two kinds of variables:

  • Input variables
    • Define inputs to the workflow.
    • Use input_as_text to represent user-provided text / append it to chat history.
    • The default variable representing input, if no state variable is provided.
  • State variables
    • Additional parameters passed alongside the input.
    • Persistent across the entire workflow and accessible via state nodes.
    • Defined just like variables; each stores a single data type.

For now, let’s keep only the input variables. Time to add input validation next.
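In the exported Agents SDK code, these surface as plain TypeScript shapes. Here is a minimal sketch; input_as_text matches what Agent Builder generates, while the language state variable is purely hypothetical, just to show the shape:

// Input variables: what the Start node exposes to the rest of the flow
type WorkflowInput = { input_as_text: string };

// State variables (hypothetical example): persistent, single-typed values
// readable anywhere in the workflow via state nodes
type WorkflowState = { language: string };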

2. Input Validation with Guardrail Node

Guardrails

Next, we need to add input validation/guardrails to ensure that only filtered query inputs are passed to the model, for a better chat experience.

To do this, let’s add a Guardrail node and connect it to the start node. If you click on the Guardrail node, you will find many options:

Guardrail Options

Here is what each does:

  • Name → Name of the node
  • Input → Name of the input variable; can vary if a text state variable is also defined
  • Personally identifiable information (PII) → Detects and redacts any PII
  • Moderation → Classifies and blocks harmful content if the prompt contains it
  • Jailbreak → Detects prompt injection and jailbreak attempts; keeps the model on task
  • Hallucination → Verifies input by validating it against the knowledge source (e.g., a RAG store)
  • Continue on error → Sets an error path if a guardrail fails; not recommended for production flows

Having understood all of this, let’s set them up.

  • Moderation: Click ⚙️ → Most Critical → Save → Toggle on
  • Jailbreak: Toggle on (keep defaults)
  • Hallucination: Click ⚙️ → Add the vector store ID (generated in the next section) → Save → Toggle on (do this in the later step, when mentioned)

How it looks
Moderation

Jailbreak

With this, we have set up our guardrail with both a pass and a fail path. Yup, it’s that simple!
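If you prefer seeing this in code: the exported TypeScript later in this post wires these same toggles through the @openai/guardrails package. A trimmed sketch (the vector store ID is a placeholder; the categories and thresholds shown are simply what my export produced):

import { OpenAI } from "openai";
import { runGuardrails } from "@openai/guardrails";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Mirrors the Moderation, Jailbreak, and Hallucination toggles above
const guardrailsConfig = {
  guardrails: [
    { name: "Moderation", config: { categories: ["hate/threatening", "violence/graphic"] } },
    { name: "Jailbreak", config: { model: "gpt-4.1-mini", confidence_threshold: 0.7 } },
    {
      name: "Hallucination Detection",
      config: {
        model: "gpt-4.1-mini",
        knowledge_source: "vs_YOUR_VECTOR_STORE_ID",
        confidence_threshold: 0.7,
      },
    },
  ],
};

// One result per guardrail; a triggered tripwire routes to the fail path
const results = await runGuardrails("sample user input", guardrailsConfig, { guardrailLlm: client });
const failed = results.some((r) => r.tripwireTriggered === true);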

Now let’s add the agent: the heart of the flow.

3. Adding the Brain with Agent Node, Rube MCP & Vector Store

Agent

To add an agent, click on the Agent Node in the sidebar and connect it with the pass path.

Then, inside the Agent node, let’s configure a few things:

  • Name: Name of the agent; let’s use YouTube Q/A Agent.
  • Instructions: Instructions on how the agent should act. Use the one from the gist, or make your own following the OpenAI Agent Builder prompting guide. Or, use the ✏️ icon to generate a prompt.
  • Include Chat History: Whether to include past conversation. Good for context building, but racks up cost. Let’s turn it on.
  • Model: Model to use. Let’s go with GPT-5. You can opt for cheaper models like gpt-5-mini or gpt-4o if usage is high.
  • Reasoning: Can’t be turned off, only set to minimal. Let’s go for medium here.
  • Output Format: Supports text, JSON, and widgets. Let’s keep the default: text.
  • Verbosity: Set to low for shorter answers and vice versa. Let’s go with medium.
  • Summary: Shows the reasoning summary in chat. I am setting it to null, but you can keep it enabled if you prefer.
  • Write Conversation History: Saves the data to the conversation history state. Let’s keep it on.
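For reference, this configuration maps almost one-to-one onto the Agents SDK export you’ll see later (instructions shortened here for brevity):

import { Agent } from "@openai/agents";

const ytQAAgent = new Agent({
  name: "YouTube Q/A Agent",
  instructions: "Answer audience questions using the transcript store...", // shortened
  model: "gpt-5",                                     // the Model field
  modelSettings: {
    reasoning: { effort: "medium", summary: "auto" }, // Reasoning + Summary fields
    store: true,                                      // Write Conversation History
  },
});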

Now, to add the RAG vector store and MCP support, click + beside Tools.

For the Vector Store:

  • Select File Search from the list
  • Add all files
  • Save
  • Copy the generated vector store ID, paste it into the Hallucination guardrail’s vector_id field, and save.

How it Looks

Vector Store
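Under the hood, File Search is just a tool bound to your vector store’s ID. In the exported SDK code it is a one-liner (the ID below is a placeholder for whatever your store generates):

import { fileSearchTool } from "@openai/agents";

// Queries the vector store holding the uploaded transcript files
const fileSearch = fileSearchTool(["vs_YOUR_VECTOR_STORE_ID"]);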

For Rube MCP:

  • Select MCP Servers
  • Click + Servers
  • In the URL field, put: https://rube.app/mcp
  • In Name, put rube_mcp
  • Authentication: API Key → get it by:
    • Going to the Rube app,
    • Selecting Install Rube,
    • Navigating to the N8N & Others tab,
    • Generating a token,
    • And copy-pasting it
  • Save

How it Looks

MCP

And you are done!

Yes, it really is that easy to connect both.
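One caveat: the code my build exported only included the file search tool, so here is a sketch of how attaching Rube via the SDK’s hosted MCP tool could look. Treat the option names, especially the auth header wiring, as assumptions to verify against the @openai/agents and Rube docs:

import { Agent, hostedMcpTool } from "@openai/agents";

// Hosted MCP tool pointing at Rube (auth header wiring is an assumption)
const rubeMcp = hostedMcpTool({
  serverLabel: "rube_mcp",
  serverUrl: "https://rube.app/mcp",
  requireApproval: "never",
  headers: { Authorization: `Bearer ${process.env.RUBE_API_KEY}` },
});

// Attach it alongside file search when constructing the agent
const agent = new Agent({
  name: "YouTube Q/A Agent",
  instructions: "...",
  model: "gpt-5",
  tools: [rubeMcp],
});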

Apart from this, there are other fields within the OpenAI Agent Builder Agent node as well. You can learn more in the OpenAI Agent Builder node docs.

Finally, the only thing left is to define the fail path, so let’s set that up.

4. Ending Chat Session with End Node

End Node

Remember the failure path on the Guardrail node? Connect it to the End node by selecting it from the sidebar.

The end node takes input_as_text and returns a JSON.

No, you don’t need to write it yourself; use OpenAI Agent Builder's native JSON Schema Generator and prompt it with your needs in natural language.

Let’s prompt it by clicking ✏️ Generate

Model should output failed and reason of failure.

And here is the generated schema. Make sure to click Update.

Generated Schema
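In TypeScript terms, the generated schema boils down to this shape, which is exactly what the endResult object in the exported code further below returns:

// Output of the End node on the fail path
type EndNodeOutput = {
  failed: boolean; // true when a guardrail tripwire fired
  reason: string;  // human-readable explanation of the failure
};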

In practice, this fail path rarely gets hit.
Anyway, now that everything is done, it's time to test the agent!


Testing Time

For testing, head to the Preview at the top, and a chat window opens. Enter your query and see the responses along with intermediate querying and reasoning steps.

Here is what it looks like:

Note: This was the initial run; a little while later, I renamed the agent to YT QA Agent for explainability. Also, since this is a tutorial, I haven’t published the workflow, but you can by clicking the Publish button.

Amazing, we built our first multilingual YouTube Q/A agent that:

  • has a personality,
  • filters hate speech in input and output,
  • responds based only on vector store searches, and in a friendly manner,
  • handles jailbreak attempts,
  • takes multilingual input and responds only in English.

And all this took like 5 minutes and four nodes - much faster prototyping than N8N.

In case you want to get the code, just hit Code → Agent’s SDK → Python / TypeScript and Copy.

Generated Code (TypeScript)
import { fileSearchTool, Agent, AgentInputItem, Runner } from "@openai/agents";
import { OpenAI } from "openai";
import { runGuardrails } from "@openai/guardrails";


// Tool definitions
const fileSearch = fileSearchTool([
  "vs_68e4e5fa172c81919e7f47557727b06c"
])

// Shared client for guardrails and file search
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Guardrails definitions
const guardrailsConfig = {
  guardrails: [
    {
      name: "Moderation",
      config: {
        categories: [
          "sexual/minors",
          "hate/threatening",
          "harassment/threatening",
          "self-harm/instructions",
          "violence/graphic",
          "illicit/violent"
        ]
      }
    },
    {
      name: "Jailbreak",
      config: {
        model: "gpt-4.1-mini",
        confidence_threshold: 0.7
      }
    },
    {
      name: "Hallucination Detection",
      config: {
        model: "gpt-4.1-mini",
        knowledge_source: "vs_68e4e5fa172c81919e7f47557727b06c",
        confidence_threshold: 0.7
      }
    }
  ]
};
const context = { guardrailLlm: client };

// Guardrails utils
function guardrailsHasTripwire(results) {
    return (results ?? []).some((r) => r?.tripwireTriggered === true);
}

function getGuardrailSafeText(results, fallbackText) {
    // Prefer checked_text as the generic safe/processed text
    for (const r of results ?? []) {
        if (r?.info && ("checked_text" in r.info)) {
            return r.info.checked_text ?? fallbackText;
        }
    }
    // Fall back to PII-specific anonymized_text if present
    const pii = (results ?? []).find((r) => r?.info && "anonymized_text" in r.info);
    return pii?.info?.anonymized_text ?? fallbackText;
}

function buildGuardrailFailOutput(results) {
    const get = (name) => (results ?? []).find((r) => {
          const info = r?.info ?? {};
          const n = (info?.guardrail_name ?? info?.guardrailName);
          return n === name;
        }),
          pii = get("Contains PII"),
          mod = get("Moderation"),
          jb = get("Jailbreak"),
          hal = get("Hallucination Detection"),
          piiCounts = Object.entries(pii?.info?.detected_entities ?? {})
              .filter(([, v]) => Array.isArray(v))
              .map(([k, v]) => k + ":" + v.length),
          thr = jb?.info?.threshold,
          conf = jb?.info?.confidence;

    return {
        pii: {
            failed: (piiCounts.length > 0) || pii?.tripwireTriggered === true,
            ...(piiCounts.length ? { detected_counts: piiCounts } : {}),
            ...(pii?.executionFailed && pii?.info?.error ? { error: pii.info.error } : {}),
        },
        moderation: {
            failed: mod?.tripwireTriggered === true || ((mod?.info?.flagged_categories ?? []).length > 0),
            ...(mod?.info?.flagged_categories ? { flagged_categories: mod.info.flagged_categories } : {}),
            ...(mod?.executionFailed && mod?.info?.error ? { error: mod.info.error } : {}),
        },
        jailbreak: {
            // Rely on runtime-provided tripwire; don't recompute thresholds
            failed: jb?.tripwireTriggered === true,
            ...(jb?.executionFailed && jb?.info?.error ? { error: jb.info.error } : {}),
        },
        hallucination: {
            // Rely on runtime-provided tripwire; don't recompute
            failed: hal?.tripwireTriggered === true,
            ...(hal?.info?.reasoning ? { reasoning: hal.info.reasoning } : {}),
            ...(hal?.info?.hallucination_type ? { hallucination_type: hal.info.hallucination_type } : {}),
            ...(hal?.info?.hallucinated_statements ? { hallucinated_statements: hal.info.hallucinated_statements } : {}),
            ...(hal?.info?.verified_statements ? { verified_statements: hal.info.verified_statements } : {}),
            ...(hal?.executionFailed && hal?.info?.error ? { error: hal.info.error } : {}),
        },
    };
}
const ytQAAgent = new Agent({
  name: "YT Q/A Agent",
  instructions: `Answer user questions about computational thinking, programming, logic-building, and related topics using only information from the English-translated content of Developer Harsh’s YouTube transcripts and validated by Rube MCP (Hindi source; use English for replies). Respond as Harsh: clear, concise, warm, and engaging—never mention being an AI, backend processes, or system details.

If you find a relevant answer:
- Reply in a single, conversational English paragraph, engaging and helpful.

If no answer is found in transcripts:
- Say: \"I'm sorry, I couldn't find the answer to your question in the current videos. You can contact me directly at devloper.hs2015@gmail.com for more help.\"

# Output Format

- Respond as Harsh with a very short paragraph in English
- If no answer is found, use the specified fallback sentence. \" I'm sorry, I couldn't find the answer to your question in the current videos. You can contact me directly at devloper.hs2015@gmail.com for more help.\"

# Examples

**Example 1:**
*User:* How do I build logic for solving programming problems?  
*Reply:* Building logic starts with breaking problems into small steps and applying programming basics like loops and conditions. Practicing consistently will strengthen your logical thinking—happy to help with examples!

**Example 2:**
*User:* What is computational thinking in your videos?  
*Reply:* Computational thinking means breaking problems into parts, spotting patterns, and designing step-by-step solutions. It’s a skill that makes problem-solving easier, both in programming and everyday life!

**Example 3:**
*User:* Do your videos cover advanced topics like neural networks?  
*Reply:* I'm sorry, I couldn't find the answer to your question in the current videos. You can contact me directly at devloper.hs2015@gmail.com for more help.`,
  model: "gpt-5",
  tools: [
    fileSearch
  ],
  modelSettings: {
    reasoning: {
      effort: "medium",
      summary: "auto"
    },
    store: true
  }
});

type WorkflowInput = { input_as_text: string };


// Main code entrypoint
export const runWorkflow = async (workflow: WorkflowInput) => {
  const state = {

  };
  const conversationHistory: AgentInputItem[] = [
    {
      role: "user",
      content: [
        {
          type: "input_text",
          text: workflow.input_as_text
        }
      ]
    }
  ];
  const runner = new Runner({
    traceMetadata: {
      __trace_source__: "agent-builder",
      workflow_id: "wf_68e4d8693fd881909702d2253055a11803a8f6c933983bc3"
    }
  });
  const guardrailsInputtext = workflow.input_as_text;
  const guardrailsResult = await runGuardrails(guardrailsInputtext, guardrailsConfig, context);
  const guardrailsHastripwire = guardrailsHasTripwire(guardrailsResult);
  const guardrailsAnonymizedtext = getGuardrailSafeText(guardrailsResult, guardrailsInputtext);
  const guardrailsOutput = (guardrailsHastripwire ? buildGuardrailFailOutput(guardrailsResult ?? []) : { safe_text: (guardrailsAnonymizedtext ?? guardrailsInputtext) });
  if (guardrailsHastripwire) {
    // Guardrail tripped: surface the failure and the reason on the fail path
    const endResult = {
      failed: true,
      reason: JSON.stringify(guardrailsOutput)
    };
    return endResult;
  } else {
    const ytQAAgentResultTemp = await runner.run(
      ytQAAgent,
      [
        ...conversationHistory
      ]
    );
    conversationHistory.push(...ytQAAgentResultTemp.newItems.map((item) => item.rawItem));

    if (!ytQAAgentResultTemp.finalOutput) {
        throw new Error("Agent result is undefined");
    }

    // Return the agent's answer as the workflow output
    const ytQAAgentResult = {
      output_text: ytQAAgentResultTemp.finalOutput ?? ""
    };
    return ytQAAgentResult;
  }
}

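And to run the exported workflow locally, something like this should do it (assuming OPENAI_API_KEY is set and the export above is saved as workflow.ts):

import { runWorkflow } from "./workflow";

const result = await runWorkflow({
  input_as_text: "How do I build logic for solving programming problems?",
});
console.log(result); // either { output_text } or { failed, reason }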

We will look into ChatKit in some other blog.

But we have only scratched the surface; there are more nodes to explore!


Exploring the OpenAI Agent Builder Nodes (Optional)

OpenAI Agent Builder comes with 12 nodes organized into four categories. Here is what each does at a high level:

Core

  • Agent → The brain, as we already saw: calls the model with your instructions, tools, and agent actions. Use when you need LLM processing.
  • End → Ends the flow instantly / returns the workflow output. Use for guardrails or handling unexpected user behavior. Ensure the workflow output is set in the desired format.
  • Note → Adds a sticky note. Helpful for adding instructions to the flow / documenting.

Tools

  • File Search → Another name for the OpenAI Vector Store. Queries the vector store for relevant information. Ideal for RAG-based use cases.
  • Guardrails → All about safety. Adds PII, jailbreak, and hallucination checks, and runs moderation on input/output. Useful for production.
  • MCP → Unlike n8n, OpenAI Agent Builder uses MCPs instead of webhooks/APIs to interact with external services or a company’s internal tools. Use with complex workflows requiring results from multiple services. E.g., we used Rube MCP to verify the citations internally.

Logic

  • If / Else → Conditional branching. Use to create conditions that branch workflows. Yup, multiple workflows on a single canvas are allowed!
  • While → Loops while a condition is true. Use when the total number of iterations isn’t known, for example, polling an API until a call’s status is completed. Conditions must be written in CEL (click Learn more at the bottom of the While window).
  • User Approval → Adds a human in the loop. Pauses execution for us to approve or reject a step. Great for moderation on high-risk tasks, such as finance, e-commerce, and legal, among others.

Data

  • Transform → Reshapes data using CEL. Supports JSON output. Useful for enforcing type restrictions in production or preprocessing input for agents to read.
  • State → Global variables usable throughout the workflow. Useful if you need inputs and outputs that can be referenced anywhere. Yup, it’s the state variable we learnt about at the start.

In case you are interested in learning more, refer to the OpenAI Agent Builder official documentation.

Note: If you feel something is missing, just copy the page, paste it into ChatGPT, and ask your query.

Anyway, let’s look at the conclusion before wrapping up.

Conclusion

Knowing all the nodes is a good head start, but the real essence is understanding:

  • the entire workflow logic,
  • how data passes between nodes,
  • which node works best for your use case, and why,
  • how production-grade workflows are built,
  • and the best practices to follow while building agents.

With Agent Builder, OpenAI enables businesses and developers alike to go from simple prototypes to production-grade builds with fewer nodes and less setup.

Now one can go from idea to production in significantly less time, and it will be interesting to see how businesses adapt to this ecosystem shift.

Only time will tell, but one thing is clear: learning a new tool never hurts, so head to the OpenAI Agent Builder, connect Rube MCP, and get started building.

See ya in next one 👋
