Seenivasa Ramadurai

Building an Intelligent RAG Agent with Azure AI Foundry: A Deep Dive into Sreeni-RAG

Introduction

In today's fast-paced business environment, organizations need AI solutions that can analyze resumes and evaluate candidates with accuracy and transparency. That's exactly what I set out to build with Sreeni-RAG, an intelligent agent that specializes in candidate evaluation and resume analysis and provides source-backed responses.

This blog post takes you through the journey of building this AI agent using Azure AI Foundry, from the initial concept to deployment, including the technical details and code snippets.

What is Sreeni-RAG?

Sreeni-RAG is a Retrieval-Augmented Generation (RAG) agent built on Azure AI Foundry that specializes in candidate evaluation and resume analysis:

Resume Analysis - Extracting and structuring information from candidate resumes
Candidate Evaluation - Matching candidate qualifications to job requirements

The agent leverages a knowledge base to provide accurate, source-backed responses, making it an invaluable tool for HR departments and recruitment teams.
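To make the RAG pattern concrete before diving into the Azure pieces, here is a toy sketch of the retrieve-then-ground flow the agent follows. The keyword-overlap scoring below is a stand-in for the vector similarity search that Azure AI Foundry performs against the real vector store; the resume snippets are purely illustrative.

```python
import re

def _words(text: str) -> set[str]:
    """Lowercase, punctuation-free word set for naive matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by keyword overlap with the query (toy retriever)."""
    q = _words(query)
    scored = sorted(documents, key=lambda d: len(q & _words(d)), reverse=True)
    return scored[:top_k]

def build_grounded_prompt(query: str, documents: list[str]) -> str:
    """Assemble a prompt that instructs the model to answer from sources only."""
    sources = retrieve(query, documents)
    context = "\n".join(f"[Source {i+1}] {s}" for i, s in enumerate(sources))
    return f"Answer strictly from these sources:\n{context}\n\nQuestion: {query}"

resumes = [
    "Candidate A: Python, Azure, 5 years in data engineering",
    "Candidate B: Java, AWS, frontend development",
]
print(build_grounded_prompt("Who has Azure and Python experience?", resumes))
```

In the real system, retrieval happens server-side inside the agent's knowledge base; the application never builds this prompt by hand.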

The Technical Architecture

Core Components
The Sreeni-RAG agent is built using several key technologies:

Azure AI Foundry - The foundation for agent creation and management
GPT-4o - The underlying language model powering the agent
Vector Store - For storing and retrieving relevant documents
FastAPI - For building the REST API interface
Python - The primary programming language

requirements.txt

python-dotenv
azure-identity
azure-ai-projects
fastapi
uvicorn

Step 1: First, log in to portal.azure.com, search for Azure AI Foundry resources, and select the resource as shown below

Step 2: Select Azure AI Foundry and create the project

Step 3: Next, create the Azure AI Foundry resource. This creates the project and a default Azure hub behind the scenes.

After deploying your Azure AI Foundry project, click the provided button or link to open Azure AI Foundry.
This will take you to the ai.azure.com portal.

From there, you can create an agent, deploy a model, add a knowledge base, and test it in the playground, as shown below.

After creating your Azure AI Foundry project, copy the project endpoint.

This endpoint is crucial: it will be used in your Visual Studio Code project to connect to the Azure AI Foundry service, build the agent, and expose it to the outside world through Azure App Service or Azure Functions.
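A minimal `.env` sketch for the settings the application code reads at startup; the endpoint value is a placeholder for your own project endpoint, and AGENT_ID is optional (when it is unset, the app auto-selects the first available agent):

```
PROJECT_ENDPOINT=https://<your-resource>.services.ai.azure.com/api/projects/<your-project>
AGENT_NAME=Sreeni-RAG
# Optional: pin a specific agent; otherwise the first available agent is used
AGENT_ID=
```

Keep this file out of source control, since the endpoint identifies your Azure resource.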

Deploy the model.

Configure the agent: add a name, knowledge, etc.

Add resume files to Azure AI Search or create a knowledge base.

Test the agent in the Test Playground directly from the portal.

Developing the Agent and Exposing It as a REST Endpoint

Note:
In this application, several endpoints have been added, including those to check agent health, list all agents (useful when the app scales and multiple agent instances are created), and an endpoint to upload a new resume to the vector store.

However, to test the Sreeni-RAG Agent, you only need to use the /ask endpoint.

import os
from contextlib import asynccontextmanager
from fastapi import FastAPI, HTTPException, UploadFile, File
from pydantic import BaseModel
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from azure.ai.agents.models import ListSortOrder, FilePurpose, VectorStoreDataSource, VectorStoreDataSourceAssetType

# Load environment variables
load_dotenv()

PROJECT_ENDPOINT = os.getenv("PROJECT_ENDPOINT")
AGENT_NAME = os.getenv("AGENT_NAME", "Dynamic Agent")
AGENT_ID = os.getenv("AGENT_ID")
SYSTEM_PROMPT = os.getenv("SYSTEM_PROMPT", """You are a professional resume Q&A assistant.

You are given the text or content of a candidate's resume. Your goal is to answer any question about the candidate accurately, clearly, and concisely based on the resume information only.

Guidelines:
- Base every answer strictly on the resume content.
- If a question cannot be answered from the resume, say: "That information is not available in the resume."
- Highlight experience, skills, education, or projects only if they are mentioned.
- Keep tone professional and factual.
- Avoid assumptions or adding external information.

Example:
Q: What programming languages does the candidate know?
A: The candidate is experienced in Python, C#, Java, and Scala.

Q: Does the resume mention cloud experience?
A: Yes. The candidate has hands-on experience with AWS and Microsoft Azure.""")

# Authenticate with Azure
credential = DefaultAzureCredential()
project_client = AIProjectClient(
    credential=credential,
    endpoint=PROJECT_ENDPOINT
)

# Cache agent information to avoid repeated API calls
agent_cache = {}


@asynccontextmanager
async def lifespan(app: FastAPI):
    """Dynamically select and use an agent from Azure AI Foundry."""
    global AGENT_ID, agent_cache

    try:
        # If no agent ID is provided, get the first available agent
        if not AGENT_ID:
            print("🔍 No agent ID provided, finding first available agent...")
            agents = list(project_client.agents.list_agents())
            if agents:
                AGENT_ID = agents[0].id
                print(f"✅ Auto-selected first available agent: {agents[0].name} ({AGENT_ID})")
            else:
                raise Exception("No agents found in the project")

        # Get the agent information and cache it
        agent = project_client.agents.get_agent(AGENT_ID)
        agent_cache = {
            "name": agent.name,
            "id": agent.id,
            "model": getattr(agent, 'model', 'Unknown'),
            "created_at": getattr(agent, 'created_at', 'Unknown')
        }
        print(f"✅ Using agent: {agent_cache['name']} ({agent_cache['id']})")
        print("✅ Agent has knowledge base access: AgentVectorStore_27091")

    except Exception as e:
        print(f"❌ Error getting agent info: {e}")
        print(f"✅ Using fallback configuration")
        agent_cache = {
            "name": AGENT_NAME,
            "id": AGENT_ID or "No agent selected",
            "model": "Unknown",
            "created_at": "Unknown",
            "error": f"Could not fetch agent details at startup: {str(e)}"
        }

    yield  # This is where the app runs

    print("🔄 Shutting down...")


app = FastAPI(
    title="Azure AI Foundry Agent API", 
    version="1.0", 
    lifespan=lifespan,
    description="AI Agent API with knowledge base access or Chat with resumes"
)


class UserMessage(BaseModel):
    message: str

class AgentSelection(BaseModel):
    agent_id: str

class UploadResponse(BaseModel):
    message: str
    file_id: str
    batch_id: str
    success: bool

class SystemPromptUpdate(BaseModel):
    system_prompt: str


@app.post("/ask")
def ask_agent(input: UserMessage):
    """Send a message to the Azure AI Foundry Agent."""
    try:
        # Display user input in console
        print("\n" + "="*80)
        print("🤖 AGENT REQUEST")
        print("="*80)
        print(f"👤 User Question: {input.message}")
        print(f"🎯 System Prompt: {SYSTEM_PROMPT[:100]}...")
        print("-"*80)

        # Get the agent
        agent = project_client.agents.get_agent(AGENT_ID)

        # Create a thread
        thread = project_client.agents.threads.create()

        # Create a user message with system prompt context
        full_message = f"System Instructions: {SYSTEM_PROMPT}\n\nUser Question: {input.message}"
        project_client.agents.messages.create(
            thread_id=thread.id,
            role="user",
            content=full_message
        )

        # Run the agent
        run = project_client.agents.runs.create_and_process(
            thread_id=thread.id,
            agent_id=agent.id
        )

        if run.status == "failed":
            # Display error in console
            print("❌ AGENT ERROR")
            print("-"*80)
            print(f"🚨 Error: {run.last_error}")
            print(f"📊 Status: {run.status}")
            print(f"🆔 Thread ID: {thread.id}")
            print(f"🆔 Run ID: {run.id}")
            print("="*80)

            return {
                "response": f"Run failed: {run.last_error}",
                "status": "failed",
                "thread_id": thread.id,
                "run_id": run.id
            }
        else:
            # Get the messages
            messages = project_client.agents.messages.list(
                thread_id=thread.id, 
                order=ListSortOrder.ASCENDING
            )

            # Find the assistant's response
            assistant_response = None
            for message in messages:
                if message.role == "assistant" and message.text_messages:
                    assistant_response = message.text_messages[-1].text.value
                    break

            # Display agent response in console
            print("🤖 AGENT RESPONSE")
            print("-"*80)
            print(f"📝 Response: {assistant_response or 'No response received'}")
            print(f"📊 Status: {run.status}")
            print(f"🆔 Thread ID: {thread.id}")
            print(f"🆔 Run ID: {run.id}")
            print("="*80)

            return {
                "response": assistant_response or "No response received",
                "status": run.status,
                "thread_id": thread.id,
                "run_id": run.id
            }
    except Exception as e:
        # Display exception in console
        print("💥 EXCEPTION ERROR")
        print("-"*80)
        print(f"🚨 Exception: {str(e)}")
        print("="*80)
        raise HTTPException(status_code=500, detail=str(e))


@app.get("/")
def root():
    """Check application status."""
    return {
        "status": "running", 
        "agent_name": agent_cache.get("name", AGENT_NAME),
        "agent_id": agent_cache.get("id", AGENT_ID or "Auto-selected"),
        "model": agent_cache.get("model", "Unknown"),
        "created_at": agent_cache.get("created_at", "Unknown"),
        "knowledge_base": "AgentVectorStore_27091",
        "endpoints": {
            "ask": "/ask - Send messages to the agent",
            "agents": "/agents - List all available agents",
            "switch_agent": "/switch-agent - Switch to a different agent",
            "upload_knowledge": "/upload-knowledge - Upload files to knowledge base",
            "update_system_prompt": "/update-system-prompt - Update agent system prompt",
            "get_system_prompt": "/system-prompt - Get current system prompt",
            "health": "/health - Health check",
            "docs": "/docs - API documentation",
            "status": "/ - Application status"
        }
    }


@app.get("/agents")
def list_agents():
    """List all available agents."""
    try:
        agents = list(project_client.agents.list_agents())
        return {
            "agents": [
                {
                    "id": agent.id,
                    "name": agent.name,
                    "model": getattr(agent, 'model', 'Unknown'),
                    "created_at": getattr(agent, 'created_at', 'Unknown')
                }
                for agent in agents
            ],
            "current_agent_id": AGENT_ID,
            "total_count": len(agents)
        }
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))


@app.post("/switch-agent")
def switch_agent(selection: AgentSelection):
    """Switch to a different agent."""
    global AGENT_ID, agent_cache

    try:
        # Validate the agent exists
        agent = project_client.agents.get_agent(selection.agent_id)

        # Update the global agent ID
        AGENT_ID = selection.agent_id

        # Update the cache
        agent_cache = {
            "name": agent.name,
            "id": agent.id,
            "model": getattr(agent, 'model', 'Unknown'),
            "created_at": getattr(agent, 'created_at', 'Unknown')
        }

        return {
            "message": f"Successfully switched to agent: {agent.name}",
            "agent_id": agent.id,
            "agent_name": agent.name,
            "model": agent_cache["model"]
        }
    except Exception as e:
        raise HTTPException(status_code=400, detail=f"Agent not found or error: {str(e)}")


@app.post("/update-system-prompt")
def update_system_prompt(update: SystemPromptUpdate):
    """Update the system prompt for the agent."""
    global SYSTEM_PROMPT

    try:
        # Update the global system prompt
        SYSTEM_PROMPT = update.system_prompt

        return {
            "message": "System prompt updated successfully",
            "system_prompt": SYSTEM_PROMPT
        }
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Failed to update system prompt: {str(e)}")


@app.get("/system-prompt")
def get_system_prompt():
    """Get the current system prompt."""
    return {
        "system_prompt": SYSTEM_PROMPT
    }


@app.post("/upload-knowledge", response_model=UploadResponse)
def upload_knowledge(file: UploadFile = File(...)):
    """Upload a file to the knowledge base vector store."""
    try:
        # Read file content
        file_content = file.file.read()

        # Upload file to Azure AI
        uploaded_file = project_client.agents.files.upload_and_poll(
            file=file_content,
            purpose=FilePurpose.AGENTS,
            filename=file.filename
        )

        # Add file to vector store
        data_source = VectorStoreDataSource(
            asset_identifier=uploaded_file.id,
            asset_type=VectorStoreDataSourceAssetType.FILE_ID
        )

        # Add to vector store
        vector_store_batch = project_client.agents.vector_store_file_batches.create_and_poll(
            vector_store_id="vs_20cOCQ9HlBzFKtU59jbUkWBt",  # Your vector store ID
            data_sources=[data_source]
        )

        return UploadResponse(
            message=f"Successfully uploaded {file.filename} to knowledge base",
            file_id=uploaded_file.id,
            batch_id=vector_store_batch.id,
            success=True
        )

    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Upload failed: {str(e)}")


@app.get("/health")
def health_check():
    """Health check endpoint."""
    try:
        # Quick agent check
        agent = project_client.agents.get_agent(AGENT_ID)
        return {
            "status": "healthy",
            "agent_accessible": True,
            "agent_name": agent.name,
            "agent_id": agent.id
        }
    except Exception as e:
        return {
            "status": "unhealthy",
            "agent_accessible": False,
            "error": str(e)
        }


API Documentation

Once the server is running (for example with `uvicorn main:app --reload`, assuming the code above is saved as main.py), FastAPI automatically generates interactive API documentation at http://127.0.0.1:8000/docs, making it easy for developers to understand and use the endpoints.

Available Endpoints
GET / - Application status
POST /ask - Send messages to the Sreeni-RAG agent
GET /agents - List all available agents
POST /switch-agent - Switch to a different agent
POST /upload-knowledge - Upload files to the knowledge base
GET /health - Health check
GET /docs - API documentation
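As a quick alternative to the Swagger UI, the /ask endpoint can also be called from a small Python client. This sketch uses only the standard library and assumes the API is running locally on port 8000; the question text is just an example.

```python
import json
import urllib.request

API_BASE = "http://127.0.0.1:8000"  # assumes the FastAPI app is running locally

def build_ask_request(question: str) -> urllib.request.Request:
    """Build a JSON POST request for the /ask endpoint."""
    payload = json.dumps({"message": question}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask_agent(question: str) -> dict:
    """Send the question and return the parsed JSON response."""
    with urllib.request.urlopen(build_ask_request(question)) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example (with the server running):
# result = ask_agent("What programming languages does the candidate know?")
# print(result["response"], result["status"])
```

The response dictionary mirrors what the endpoint returns above: `response`, `status`, `thread_id`, and `run_id`.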

Invoking the Agent from the Swagger UI

Query 1:

Query 2:

Conclusion

The Azure AI Agent project demonstrates how easily intelligent agents can be built, customized, and deployed using Azure AI Foundry.
By leveraging managed AI services, developers can focus on designing agent logic, integrating knowledge bases, and exposing capabilities through REST APIs without worrying about infrastructure or model hosting.

Using FastAPI and Azure AI Foundry, we created a scalable, secure, and production-ready solution where the heavy lifting (LLM reasoning, memory, and tool execution) runs in Azure’s managed environment.

Our application layer simply interacts with the agent through endpoints such as /ask, enabling flexible integrations with web apps, Teams bots, or enterprise systems.

This architecture provides:

Seamless scalability using Azure App Service or Container Apps
Secure authentication through Azure Identity and Managed Identities
Extensibility for RAG, tool calling, or domain-specific knowledge bases

In summary, Azure AI Foundry Agents bring the power of large language models to enterprise applications with control, compliance, and simplicity, helping teams turn intelligent ideas into production-grade AI solutions quickly and confidently.

Thanks
Sreeni Ramadorai
