Ryan Giggs

Posted on Jan 23

OCI Generative AI Agents: Building Enterprise RAG Applications Without Code

#ocigenai #ai #agents #rag

Oracle Cloud Infrastructure (OCI) Generative AI Agents changes how enterprises build AI applications: it lets them deploy production-grade RAG (Retrieval-Augmented Generation) systems without writing a single line of code. This guide explores the architecture, capabilities, and practical implementation of OCI Gen AI Agents for enterprise use cases.

What are OCI Generative AI Agents?

OCI Generative AI Agents is a fully managed service that combines large language models (LLMs) with an intelligent retrieval system. It searches your knowledge base and uses what it finds to generate contextually relevant answers.

AI Agents are packaged, validated LLM applications that are ready to use out of the box for enterprise deployments.

Core Capabilities

An AI agent is an LLM-based autonomous system that understands and generates human-like text with high answerability and groundedness. Specifically, an AI agent can:

Perform complex tasks autonomously: Execute multi-step workflows without constant human intervention
Mimic human chain-of-thought processing: Reason through problems systematically
Be an effective tool for automating processes: Handle repetitive tasks at scale
Utilize your knowledge: Access and reason over enterprise-specific information

How Agents Work

An AI agent understands and interprets user queries, determines necessary actions, retrieves data if necessary, and executes actions to deliver accurate, contextually relevant responses.

The Complete Workflow:

User Input: Natural language query submitted to the agent
Query Understanding: LLM interprets the intent and context
Knowledge Retrieval: Agent searches knowledge base for relevant information
Response Formulation: LLM generates answer based on retrieved context
Output: User receives response with source citations
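
The five steps above can be sketched end to end in a few lines of Python. This is a toy illustration only, not how the service is implemented: the knowledge base is a hypothetical in-memory list, retrieval is plain word overlap rather than vector search, and generate_answer is a stand-in for the LLM call.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # document name, reused as the citation
    text: str


# Hypothetical in-memory "knowledge base" used only for this illustration.
KNOWLEDGE_BASE = [
    Passage("remote_work_policy.pdf", "Employees may work remotely up to three days per week."),
    Passage("travel_policy.pdf", "Travel expenses must be submitted within thirty days."),
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return {word.strip(".,?!").lower() for word in text.split()}


def retrieve(query: str, top_k: int = 1) -> list[Passage]:
    """Step 3: rank passages by how many query words they share."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(p.text)), p) for p in KNOWLEDGE_BASE]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [p for _, p in scored[:top_k]]


def generate_answer(query: str, context: list[Passage]) -> str:
    """Step 4: stand-in for the LLM; a real agent would prompt a model here."""
    if not context:
        return "I could not find this in the knowledge base."
    return " ".join(p.text for p in context)


def ask(query: str) -> dict:
    """Steps 1-5: take a query, retrieve context, answer, attach citations."""
    context = retrieve(query)
    return {
        "answer": generate_answer(query, context),
        "citations": [p.source for p in context],
    }
```

Calling ask("How many days can employees work remotely?") returns the remote-work passage together with its source file as the citation, which mirrors the "response with source citations" step.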

Key Agent Components

Knowledge Base

A knowledge base holds all the data sources that an agent can draw on when answering in chat. An agent connects to a knowledge base, which is vector-based storage that connects to a data source and ingests its data.

Knowledge bases support multiple data store types:

OCI Object Storage: Simple file-based storage (PDF, TXT, JSON, HTML, Markdown)
OCI OpenSearch: Pre-indexed documents with custom chunking
Oracle Database 23ai: AI Vector Search with structured and unstructured data
MySQL HeatWave: In-database vector capabilities

Data Sources

Data sources provide connection information to the data stores that an agent uses to generate responses. Each knowledge base can have multiple data sources, though only one bucket is allowed per Object Storage data source.

Data Ingestion

Data ingestion is a process that extracts data from data source documents, converts it into a structured format suitable for analysis, and then stores it in a knowledge base.

The ingestion pipeline handles:

Document parsing and text extraction
Chunking into smaller segments
Embedding generation
Vector indexing
Metadata extraction
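
The service runs these stages for you, but the chunking step is easy to picture with a small sketch. The function below is an illustration under stated assumptions: it counts words rather than model tokens, and the chunk_size and overlap values are arbitrary examples, not the service's settings.

```python
def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[dict]:
    """Split text into overlapping word windows, each with basic metadata."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        chunks.append({
            "chunk_id": len(chunks),
            "start_word": start,
            "text": " ".join(window),
        })
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means the last few words of one chunk reappear at the start of the next, so a sentence that straddles a boundary is still retrievable as a whole from at least one chunk.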

Agent Concepts and Terminology

Session

A session is a series of exchanges where the user sends queries or prompts and the agent responds with relevant information. Sessions maintain conversation context, enabling multi-turn dialogues where follow-up questions reference previous exchanges.
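
To make the idea of conversation context concrete, here is a minimal sketch of a session as a list of turns. The Session class and its fields are hypothetical names for illustration; the managed service keeps this state for you.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    session_id: str
    turns: list[tuple[str, str]] = field(default_factory=list)  # (role, text)

    def add(self, role: str, text: str) -> None:
        """Record one exchange in the conversation."""
        self.turns.append((role, text))

    def context(self, max_turns: int = 6) -> str:
        """Return the most recent turns as a transcript for the next model call."""
        recent = self.turns[-max_turns:]
        return "\n".join(f"{role}: {text}" for role, text in recent)
```

A follow-up such as "Does it carry over?" only makes sense when the earlier turns about the leave policy are passed along with it, which is what context() returns.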

Agent Endpoint

An agent endpoint is a specific point of access in a network or system that agents use to interact with other systems or devices. Endpoints provide:

REST API access for programmatic integration
Chat interface for interactive testing
Authentication and authorization controls
Rate limiting and monitoring

Trace

A trace is a feature to track and display conversation history during a chat conversation. Traces enable:

Debugging agent responses
Understanding retrieval decisions
Monitoring performance
Auditing user interactions

Citation

A citation is the source of information for an agent's response. OCI Generative AI Agents provides source attribution for all answers, displaying:

Document name
Page number (for PDFs)
Hyperlinks to original sources
Relevance scores

This transparency enables users to verify information and builds trust in agent responses.
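
If you render citations in your own front end, a small helper like the one below can format them. This is a sketch: the Citation field names are assumptions chosen to match the four items listed above, not the service's response schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Citation:
    document: str
    url: str
    score: float
    page: Optional[int] = None  # only meaningful for PDFs


def format_citations(citations: list[Citation]) -> list[str]:
    """Render citations as numbered lines, most relevant first."""
    ordered = sorted(citations, key=lambda c: c.score, reverse=True)
    lines = []
    for index, c in enumerate(ordered, start=1):
        page = f", p. {c.page}" if c.page is not None else ""
        lines.append(f"[{index}] {c.document}{page} ({c.url}) score={c.score:.2f}")
    return lines
```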

Content Moderation

Content moderation is a feature designed to help detect or filter out certain toxic, violent, abusive, hateful, threatening, insulting, and harassing phrases from generated responses or user prompts in LLMs.

Content moderation protects both users and organizations by:

Filtering inappropriate user inputs
Screening generated outputs
Blocking harmful content
Maintaining brand safety

Object Storage Guidelines for GenAI Agents

When using Object Storage as your data source, follow these critical guidelines to ensure successful ingestion.

File Requirements

Data for GenAI Agents must be uploaded as files to an Object Storage bucket.

Supported File Types:

As of 2025, OCI Generative AI Agents supports:

PDF files
TXT files
JSON files
HTML files
Markdown (MD) files

File Size Limits:

Each file must be no larger than 100 MB
Files exceeding this limit are ignored during ingestion
Images, charts, and reference tables embedded in a PDF must not exceed 8 MB

Bucket Restrictions:

Only one bucket is allowed per data source
However, you can use multiple data sources per knowledge base
You can ingest up to 1,000 files from an OCI Object Storage bucket into a knowledge base
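
Because oversized or unsupported files are silently ignored, it can help to check a batch before uploading. The sketch below applies the limits listed above to a mapping of object names to sizes. The function name is hypothetical, and treating 100 MB as 100 * 1024 * 1024 bytes is an assumption.

```python
from pathlib import PurePosixPath

MAX_FILE_BYTES = 100 * 1024 * 1024  # assumption: 100 MB counted in binary megabytes
MAX_FILES = 1000
SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".json", ".html", ".md"}


def check_files(files: dict[str, int]) -> dict[str, list[str]]:
    """Sort object names into 'ok' and 'rejected' given their sizes in bytes."""
    report = {"ok": [], "rejected": []}
    for name, size in files.items():
        extension = PurePosixPath(name).suffix.lower()
        if extension not in SUPPORTED_EXTENSIONS:
            report["rejected"].append(f"{name}: unsupported type {extension or '(none)'}")
        elif size > MAX_FILE_BYTES:
            report["rejected"].append(f"{name}: larger than 100 MB")
        elif len(report["ok"]) >= MAX_FILES:
            report["rejected"].append(f"{name}: over the {MAX_FILES}-file limit")
        else:
            report["ok"].append(name)
    return report
```

Running it over a local manifest before upload gives you the list of files that would otherwise be skipped during ingestion.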

Chart and Table Guidelines

Charts:

No special preparation is needed for charts as long as they're two-dimensional with labeled axes. The model can answer questions about charts without explicit explanations.

Requirements for optimal chart understanding:

Two-dimensional visualization
Clearly labeled X and Y axes
Legend (if multiple data series)
Title or caption
Size within 8 MB limit

Tables:

Use reference tables with several rows and columns. For enhanced table understanding, ensure:

All cells separated with visible lines or object boundaries
All columns including the first column have header names
Each table has more than one column and more than one row (excluding headers)
Tables use consistent formatting

Example ingestion message:

Count of tables that support enhanced table understanding in following PDFs:
- 2025_Report1.pdf has 4 tables processed successfully
- 2025_Report2.pdf has 3 tables processed successfully

Hyperlinks and URLs

All hyperlinks present in PDF documents are extracted and displayed as hyperlinks in chat responses. This enables users to navigate directly to referenced resources without manual searching.

Planning for Future Data

If data isn't yet available, create an empty folder for the data source and populate it later. This allows you to:

Set up knowledge base infrastructure in advance
Define data source connections
Run ingestion jobs when content becomes available
Maintain consistent agent configuration

Advanced Features (March 2025 Release)

Oracle released significant enhancements to the RAG tool in March 2025 based on customer feedback.

Enhanced Response Quality

The release improves accuracy by refining the contextual understanding and relevance of generated answers through:

Better semantic understanding
More precise context extraction
Improved answer synthesis

Hybrid Search

Combines keyword and vector search for highly precise, efficient retrieval. Hybrid search uses:

Keyword (BM25) search: Exact term matching for specific terminology
Vector search: Semantic similarity for conceptual matches
Fusion ranking: Combining scores from both approaches

This dual approach ensures agents find documents matching both semantic meaning and specific keywords.
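
The fusion step can be illustrated with reciprocal rank fusion (RRF), one common way to merge a keyword ranking with a vector ranking. This is an assumption for illustration: the article does not say which fusion method the service uses, and RRF combines ranks rather than raw scores. The document IDs and the constant k = 60 are example values.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of document IDs; each hit adds 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by score descending, then by ID so ties are deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


keyword_hits = ["doc_policy", "doc_faq", "doc_glossary"]  # e.g. from BM25
vector_hits = ["doc_handbook", "doc_policy", "doc_faq"]   # e.g. from embeddings
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Here doc_policy ends up first because it sits near the top of both lists, while documents found by only one method rank lower.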

Improved Multi-Modal Parsing

Extract insights from rich content formats such as images and charts. Enhanced capabilities include:

Image understanding within PDFs
Chart data extraction
Table structure recognition
Diagram interpretation

Custom Instructions

Define preferences to fine-tune behavior and response styles of the RAG tool. Custom instructions enable:

Setting agent personality and tone
Defining response formats
Establishing domain-specific terminology
Configuring citation styles

Multi-Lingual Support

Seamlessly interact across several languages, including:

French
Spanish
Portuguese
Arabic
German
Italian
Japanese
English

This enables global deployment with localized experiences.

Multiple Knowledge Bases

To help improve coverage and depth of knowledge, set up agents to retrieve information from several knowledge bases. Benefits include:

Specialized knowledge bases per domain
Improved organization and maintenance
Better access control
Reduced search latency

Metadata Ingestion & Filtering

Use metadata to refine searches and categorize content more effectively. Metadata filtering enables:

Date-based filtering (e.g., "documents from 2024")
Department/category filtering
Author-based retrieval
Custom taxonomy support

Users can apply metadata filters during chat conversations to narrow results.
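
Conceptually, a metadata filter narrows the candidate documents before or alongside retrieval. The sketch below shows the idea on plain dictionaries. The metadata keys and the filter_documents helper are hypothetical and do not reflect the service's filter syntax.

```python
from datetime import date

# Hypothetical document records; the metadata keys are illustrative.
DOCUMENTS = [
    {"name": "q4_report.pdf", "department": "finance", "author": "lee", "published": date(2024, 11, 2)},
    {"name": "onboarding.md", "department": "hr", "author": "ortiz", "published": date(2023, 5, 17)},
    {"name": "budget_2024.pdf", "department": "finance", "author": "ortiz", "published": date(2024, 1, 9)},
]


def filter_documents(documents, year=None, **equals):
    """Return names of documents whose metadata matches every given filter."""
    matches = []
    for doc in documents:
        if year is not None and doc["published"].year != year:
            continue  # date-based filter
        if any(doc.get(key) != value for key, value in equals.items()):
            continue  # exact-match filters such as department or author
        matches.append(doc["name"])
    return matches
```

For example, filter_documents(DOCUMENTS, year=2024, department="finance") keeps only the two finance documents published in 2024.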

Building Your First Agent: Step-by-Step

Prerequisites

OCI Account: With appropriate IAM permissions
Identity Domain: Required for agent creation
Region Subscription: Chicago region (primary availability)
Object Storage Bucket: With source documents uploaded

Step 1: Create Knowledge Base

Navigate to Analytics & AI → Generative AI Agents → Knowledge Bases

Click Create knowledge base
Enter name (1-255 characters, start with letter or underscore)
Select compartment
Choose data store type (Object Storage, OpenSearch, or Database)
Configure data source:
- Select Object Storage bucket
- Choose files or use prefixes
- Select "Automatically start ingestion job"
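
If you generate knowledge base names in a script, you can check them against the naming rule from this step before submitting. The sketch enforces only the two constraints stated above (length and first character). Treating "letter" as an ASCII letter is an assumption, and any further character rules the console applies are not covered.

```python
def is_valid_name(name: str) -> bool:
    """Check the stated rules: 1-255 characters, first one a letter or underscore."""
    if not 1 <= len(name) <= 255:
        return False
    first = name[0]
    # Assumption: "letter" means an ASCII letter.
    return first == "_" or (first.isascii() and first.isalpha())
```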

Step 2: Monitor Ingestion

The ingestion job automatically starts and processes:

Document parsing
Text extraction
Chunking (512 tokens for OpenSearch)
Embedding generation
Vector indexing

Check ingestion logs to confirm success and view detailed status including number of ingested, ignored, and failed files.

Step 3: Create Agent

Navigate to the Agents page:

Click Create agent
Enter agent name and description
Select LLM (Command R+, Llama 4, etc.)
Add RAG tool configuration:
- Select knowledge base(s)
- Configure custom instructions
- Set hybrid search options
Enable content moderation (optional)
Choose to create endpoint now or later

Step 4: Create Agent Endpoint

If not created during agent setup:

Navigate to agent details
Click Create endpoint
Enter endpoint name
Select region
Configure authentication

Copy the endpoint OCID for API access.

Step 5: Test in Playground

Launch chat from agent or endpoint details page
Select agent and endpoint
Submit test queries
Review responses and citations
Verify knowledge base retrieval

Step 6: Integrate via API

Use the REST API to integrate with applications:

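import requests

# Placeholder: replace with the endpoint OCID copied in Step 4.
endpoint_id = "<your_agent_endpoint_ocid>"
endpoint_url = (
    "https://agent.generativeai.us-chicago-1.oci.oraclecloud.com"
    f"/20240531/agentEndpoints/{endpoint_id}/actions/chat"
)

payload = {
    "message": "What is our company policy on remote work?",
    "sessionId": "session-123",
    "shouldStream": False,
}

# Note: OCI REST APIs generally expect signed requests (for example via the
# OCI SDK's signer). The bearer header below is a placeholder; adapt it to
# the authentication your tenancy uses.
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer <your_token>",
}

response = requests.post(endpoint_url, json=payload, headers=headers, timeout=30)
response.raise_for_status()  # fail on 4xx/5xx instead of parsing an error body
agent_response = response.json()

# Host and field names follow the original example; confirm them against the API reference.
print(agent_response["message"])
print(agent_response.get("citations"))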

Use Cases for OCI Generative AI Agents

1. Customer Support

In the customer service industry, RAG agents retrieve information from a company's knowledge base to provide correct and contextually relevant answers to customer inquiries, reducing response times and improving customer satisfaction.

Benefits:

24/7 availability
Instant access to knowledge base
Consistent responses
Escalation to human agents when needed

2. Legal Research

Legal professionals can use RAG agents to quickly find precedents and relevant case law from vast legal databases, streamlining the research process and ensuring thorough consideration of relevant legal texts.

3. Healthcare and Medical Guidance

In healthcare, RAG agents help doctors and medical staff by providing diagnostic support, retrieving medical literature, treatment protocols, and patient history to suggest potential diagnoses and treatments.

4. Financial Analysis

In finance, RAG agents analyze large volumes of financial data, reports, and news to provide analysis and recommendations for traders and analysts, helping them make informed decisions.

5. Travel Planning

In the travel industry, RAG agents serve as interactive travel guides, pulling information on destinations, weather, local attractions, and regulations to provide personalized travel advice and itineraries.

6. HR and Internal Knowledge

Deploy agents for employee self-service:

Benefits information
Policy questions
Onboarding guidance
Training materials

Best Practices for Production Agents

1. Organize Knowledge Bases Strategically

Create separate knowledge bases for different domains
Use multiple data sources for comprehensive coverage
Maintain clear naming conventions
Document knowledge base purposes

2. Optimize Document Preparation

Keep files under 100 MB
Ensure charts have clear labels and legends
Format tables with proper headers
Include descriptive filenames
Add custom URLs for citation tracking

3. Leverage Metadata Effectively

Tag documents with:

Department/category
Publication date
Author
Document type
Sensitivity level

4. Implement Content Moderation

Enable input and output filtering
Define custom moderation rules
Monitor flagged content
Establish escalation procedures

5. Monitor and Iterate

Track key metrics:

Response accuracy
Citation quality
User satisfaction
Query volume
Latency
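
Several of these metrics can be computed from your own interaction logs. The sketch below assumes a hypothetical log record with latency_ms, citations (a count), and thumbs_up (True, False, or None when the user gave no rating). It uses the nearest-rank method for percentiles.

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(interactions: list[dict]) -> dict:
    """Aggregate query volume, latency, citation rate, and satisfaction."""
    if not interactions:
        raise ValueError("no interactions to summarize")
    latencies = [i["latency_ms"] for i in interactions]
    rated = [i["thumbs_up"] for i in interactions if i["thumbs_up"] is not None]
    return {
        "query_volume": len(interactions),
        "p50_latency_ms": percentile(latencies, 50),
        "p95_latency_ms": percentile(latencies, 95),
        "citation_rate": sum(1 for i in interactions if i["citations"] > 0) / len(interactions),
        "satisfaction": sum(rated) / len(rated) if rated else None,
    }
```

Response accuracy is the one metric here that usually needs human review or a labeled test set, so it is left out of the sketch.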

6. Plan for Scale

Request limit increases proactively (agents, knowledge bases, files)
Implement caching for common queries
Design for multi-region deployment
Establish backup and recovery procedures
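
As one way to approach caching for common queries, the sketch below keeps answers in a small time-to-live cache keyed on a normalized question. The QueryCache class is a hypothetical client-side helper, not a feature of the service, and the five-minute default is an arbitrary example.

```python
import time


class QueryCache:
    """Tiny TTL cache keyed on a normalized form of the user's question."""

    def __init__(self, ttl_seconds: float = 300.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so the cache can be tested without sleeping
        self._entries: dict[str, tuple[float, str]] = {}

    @staticmethod
    def normalize(query: str) -> str:
        """Lowercase and collapse whitespace so trivial variants share a key."""
        return " ".join(query.lower().split())

    def get(self, query: str):
        """Return the cached answer, or None if it is missing or expired."""
        key = self.normalize(query)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, answer = entry
        if self.clock() - stored_at > self.ttl_seconds:
            del self._entries[key]  # expired
            return None
        return answer

    def put(self, query: str, answer: str) -> None:
        """Store an answer together with the time it was cached."""
        self._entries[self.normalize(query)] = (self.clock(), answer)
```

A cache like this suits standalone questions. Follow-up questions inside a session depend on earlier turns, so they are usually better sent to the agent directly.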

Limitations and Considerations

Regional Availability

As of early 2025, OCI Generative AI Agents is primarily available in:

US Midwest (Chicago) - Primary region

Additional regions are planned for future release.

Resource Limits

Default limits per tenancy:

2 agents
2 knowledge bases per agent
1,000 files per Object Storage data source

Limits can be increased through service request.

File Type Restrictions

Supported formats are limited to PDF, TXT, JSON, HTML, and Markdown. Other formats (DOCX, XLSX, PPTX) require conversion before upload.

Security and Compliance

Data Privacy

Data remains within your tenancy
No training on customer data
Encryption at rest and in transit
Access controlled via IAM policies

Enterprise Features

RBAC: Fine-grained access control
Audit Logging: Complete activity trails
Content Moderation: Built-in safety controls
Citation Tracking: Source attribution for all responses

Conclusion

OCI Generative AI Agents democratizes enterprise AI by providing a fully managed, no-code platform for building production-grade RAG applications. Key advantages include:

Ease of Use:

No coding required for basic setup
Visual interface for configuration
Automated ingestion and indexing
Built-in testing playground

Enterprise Features:

Multiple knowledge base support
Hybrid search capabilities
Multi-lingual support
Content moderation
Source attribution

Flexibility:

Support for various data sources
Custom instructions
Metadata filtering
API and chat interfaces

Scalability:

Fully managed service
Enterprise-grade security
Compliance ready
Oracle infrastructure reliability

By following the guidelines for Object Storage (100 MB file limit, supported formats, chart/table preparation), leveraging advanced features (hybrid search, multi-modal parsing, metadata filtering), and implementing best practices, organizations can deploy sophisticated AI agents that deliver accurate, grounded, and trustworthy responses to users.

Whether you're building customer support chatbots, internal knowledge assistants, or specialized research tools, OCI Generative AI Agents provides the foundation for enterprise-grade AI applications without the complexity of traditional RAG implementations.