DEV Community

Cover image for Build a Conversational AI with LlamaIndex
Gate of AI
Gate of AI

Posted on • Originally published at gateofai.com

Build a Conversational AI with LlamaIndex

🚀 Technical Briefing: This tutorial is part of our deep-dive series on Agentic Workflows at Gate of AI. For the full technical breakdown, interactive code sandbox, and the native Arabic translation, visit the original article here.

<span>Tutorial</span>
<span>Intermediate</span>
<span>⏱ 20 min read</span>
<span>© Gate of AI 2026-06-04</span>
Enter fullscreen mode Exit fullscreen mode

Learn to build an intelligent conversational AI agent by leveraging LlamaIndex to dynamically orchestrate and route prompts between OpenAI and Anthropic APIs.

Prerequisites


  • Python 3.10 or newer
  • API keys for OpenAI and Anthropic
  • Intermediate knowledge of Python and asynchronous programming

What We're Building


In this tutorial, we will construct an agentic conversational AI companion using LlamaIndex. Instead of using primitive hardcoded logic to route user intent, we will use a ReAct (Reasoning and Acting) agent loop. This workflow allows the core engine to evaluate a user's prompt and dynamically choose whether to delegate the task to OpenAI's GPT-4o or Anthropic's Claude 3.5 Sonnet.

Setup and Installation


Modern versions of LlamaIndex are modular. We must install the core framework alongside the specific multi-model provider integration packages.


pip install llama-index llama-index-llms-openai llama-index-llms-anthropic python-dotenv

Next, configure your environment variables securely. Create a .env file in your root project directory:



.env file

OPENAI_API_KEY=your_openai_api_key
ANTHROPIC_API_KEY=your_anthropic_api_key

Step 1: Initializing LLMs with LlamaIndex Abstractions


First, we configure our environment and initialize our language models using the official LlamaIndex wrappers. This standardizes their inputs and outputs.



import os
import asyncio
from dotenv import load_dotenv
from llama_index.llms.openai import OpenAI
from llama_index.llms.anthropic import Anthropic

load_dotenv()

Instantiate the respective LLM configurations

gpt_model = OpenAI(model="gpt-4o")
claude_model = Anthropic(model="claude-3-5-sonnet-20241022")

Step 2: Defining Functional Agent Tools


To let our agent interact with these models dynamically, we wrap the model invocations inside LlamaIndex FunctionTool structures. The docstrings act as the prompt-hints the agent reads to make its structural decisions.



from llama_index.core.tools import FunctionTool

def call_gpt_engine(prompt: str) -> str:
"""Useful when queries require raw logic, structured JSON formatting, or math calculations."""
response = gpt_model.complete(prompt)
return str(response)

def call_claude_engine(prompt: str) -> str:
"""Useful when queries require deeply creative writing, code generation, or nuanced tonal analysis."""
response = claude_model.complete(prompt)
return str(response)

Convert functions to native tools

gpt_tool = FunctionTool.from_defaults(fn=call_gpt_engine)
claude_tool = FunctionTool.from_defaults(fn=call_claude_engine)

Step 3: Creating the ReAct Agent


Now, we construct the ReActAgent, passing our custom model tools directly into its execution layout. We will use GPT-4o as our central engine coordinator.



from llama_index.core.agent import ReActAgent

Bind tools to the orchestration framework

tools = [gpt_tool, claude_tool]
agent = ReActAgent.from_tools(tools, llm=gpt_model, verbose=True)

⚠️ Architecture Reminder: Ensure you are using separate packages like llama-index-llms-openai. Importing directly from a legacy global namespace will result in structural ModuleNotFoundError crashes.

Testing Your Implementation


Execute queries targeting different tasks to see LlamaIndex evaluate the text, pick a tool, format its variables, and cleanly handle responses.



async def main():
# This should trigger the Claude tool based on your docstring hints
creative_res = await agent.achat("Write a short, moody poem about artificial intelligence.")
print(f"Creative Task:\n{creative_res}\n")
# This should route to the GPT tool
structured_res = await agent.achat("Generate a structured list of 3 fake user profiles with keys: id, name.")
print(f"Structured Task:\n{structured_res}\n")
Enter fullscreen mode Exit fullscreen mode

if name == "main":
asyncio.run(main())

What to Build Next


  • Add a VectorStoreIndex to supply localized RAG context windows straight to your tool execution pathways.
  • Incorporate persistent database storage to preserve chat histories across multiple app sessions.
  • Expose your LlamaIndex orchestration wrapper via a robust FastAPI framework backend.

Top comments (0)