Ryan Giggs

OCI Generative AI Agents: Building Enterprise RAG Applications Without Code

Oracle Cloud Infrastructure (OCI) Generative AI Agents changes how AI applications are built: enterprises can deploy production-grade RAG (Retrieval-Augmented Generation) systems without writing a single line of code. This guide explores the architecture, capabilities, and practical implementation of OCI Gen AI Agents for enterprise use cases.

What are OCI Generative AI Agents?

OCI Generative AI Agents is a fully managed service that combines large language models (LLMs) with an intelligent retrieval system. It searches your knowledge base and uses what it finds to generate contextually relevant answers.

AI agents are LLM applications that are packaged, validated, and ready to use out of the box for enterprise deployments.

Core Capabilities

An AI agent is an LLM-based autonomous system that understands and generates human-like text with high answerability and groundedness. Specifically, an AI agent can:

  • Perform complex tasks autonomously: Execute multi-step workflows without constant human intervention
  • Mimic human chain-of-thought processing: Reason through problems systematically
  • Be an effective tool for automating processes: Handle repetitive tasks at scale
  • Utilize your knowledge: Access and reason over enterprise-specific information

How Agents Work

An AI agent understands and interprets user queries, determines necessary actions, retrieves data if necessary, and executes actions to deliver accurate, contextually relevant responses.

The Complete Workflow:

  1. User Input: Natural language query submitted to the agent
  2. Query Understanding: LLM interprets the intent and context
  3. Knowledge Retrieval: Agent searches knowledge base for relevant information
  4. Response Formulation: LLM generates answer based on retrieved context
  5. Output: User receives response with source citations
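The five steps above can be sketched in a few lines of Python. This is only an illustration of the retrieve-then-generate pattern, not how the managed service is implemented: the in-memory `KNOWLEDGE_BASE`, the word-overlap `retrieve` function, and the `generate` stub that stands in for the LLM are all assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # document name, reused as the citation
    text: str


# Hypothetical in-memory knowledge base standing in for the managed vector store.
KNOWLEDGE_BASE = [
    Passage("remote_work_policy.pdf", "Employees may work remotely up to three days per week."),
    Passage("travel_policy.pdf", "Flights must be booked through the corporate travel portal."),
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, top_k: int = 1) -> list[Passage]:
    """Step 3: rank passages by the number of words shared with the query (toy retrieval)."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(p.text)), p) for p in KNOWLEDGE_BASE]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [p for score, p in ranked[:top_k] if score > 0]


def generate(query: str, context: list[Passage]) -> str:
    """Step 4: placeholder for the LLM call; it only echoes the retrieved text."""
    if not context:
        return "I could not find this in the knowledge base."
    return " ".join(p.text for p in context)


def answer(query: str) -> dict:
    """Steps 1-5 end to end: accept a query, retrieve, generate, attach citations."""
    context = retrieve(query)
    return {
        "message": generate(query, context),
        "citations": [p.source for p in context],
    }


print(answer("Can employees work remotely?"))
```

The point of the sketch is the shape of the result: the answer text always travels together with the sources it was built from.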

Key Agent Components

Knowledge Base

A knowledge base holds the data that an agent retrieves from when answering in chat. An agent connects to a knowledge base, which is vector-based storage that connects to a data source and ingests its data.

Knowledge bases support multiple data store types:

  • OCI Object Storage: Simple file-based storage (PDF, TXT, JSON, HTML, Markdown)
  • OCI OpenSearch: Pre-indexed documents with custom chunking
  • Oracle Database 23ai: AI Vector Search with structured and unstructured data
  • MySQL HeatWave: In-database vector capabilities

Data Sources

Data sources provide connection information to the data stores that an agent uses to generate responses. Each knowledge base can have multiple data sources, though only one bucket is allowed per Object Storage data source.

Data Ingestion

Data ingestion is a process that extracts data from data source documents, converts it into a structured format suitable for analysis, and then stores it in a knowledge base.

The ingestion pipeline handles:

  • Document parsing and text extraction
  • Chunking into smaller segments
  • Embedding generation
  • Vector indexing
  • Metadata extraction
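To make the pipeline stages concrete, here is a minimal sketch of chunking, embedding, and indexing. Everything in it is an assumption for illustration: the service does not expose its chunker or embedding model, so word windows stand in for token-based chunks and a hashed bag-of-words vector stands in for a real embedding.

```python
import math
import zlib


def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (words stand in for tokens)."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    if not words:
        return []
    step = max_words - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + max_words]) for i in range(0, last_start, step)]


def embed(text: str, dims: int = 64) -> list[float]:
    """Toy embedding: hash each word into a bucket, then L2-normalize the counts."""
    vector = [0.0] * dims
    for word in text.lower().split():
        vector[zlib.crc32(word.encode("utf-8")) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector


def ingest(doc_name: str, text: str) -> list[dict]:
    """Chunk a parsed document, embed each chunk, and keep metadata beside the vector."""
    return [
        {"doc": doc_name, "chunk_id": i, "text": chunk, "vector": embed(chunk)}
        for i, chunk in enumerate(chunk_text(text))
    ]


index = ingest("handbook.txt", " ".join(f"word{i}" for i in range(120)))
print(len(index), "chunks indexed")
```

Keeping the document name and chunk number next to each vector is what later makes citations possible.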

Agent Concepts and Terminology

Session

A session is a series of exchanges where the user sends queries or prompts and the agent responds with relevant information. Sessions maintain conversation context, enabling multi-turn dialogues where follow-up questions reference previous exchanges.
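One way to picture a session is as an ID plus an ordered list of turns. The `Session` class below is a hypothetical client-side model written for this article, not the service's own data structure; in the managed service the conversation state is kept for you.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical model of a chat session: an ID and the turns exchanged so far."""
    session_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    turns: list[tuple[str, str]] = field(default_factory=list)

    def add_turn(self, user_message: str, agent_reply: str) -> None:
        self.turns.append((user_message, agent_reply))

    def context(self, last_n: int = 3) -> str:
        """Render the most recent turns so a follow-up can be read in context."""
        recent = self.turns[-last_n:] if last_n > 0 else []
        return "\n".join(f"User: {u}\nAgent: {a}" for u, a in recent)


session = Session()
session.add_turn("What is the remote work policy?", "Up to three days per week.")
session.add_turn("Does that apply to contractors?", "The policy covers employees only.")
print(session.context())
```

The second question only makes sense because the first turn is still available, which is the reason sessions exist.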

Agent Endpoint

An agent endpoint is a specific point of access in a network or system that agents use to interact with other systems or devices. Endpoints provide:

  • REST API access for programmatic integration
  • Chat interface for interactive testing
  • Authentication and authorization controls
  • Rate limiting and monitoring

Trace

A trace is a feature to track and display conversation history during a chat conversation. Traces enable:

  • Debugging agent responses
  • Understanding retrieval decisions
  • Monitoring performance
  • Auditing user interactions

Citation

A citation is the source of information for an agent's response. OCI Generative AI Agents provides source attribution for all answers, displaying:

  • Document name
  • Page number (for PDFs)
  • Hyperlinks to original sources
  • Relevance scores

This transparency enables users to verify information and builds trust in agent responses.

Content Moderation

Content moderation is a feature designed to help detect or filter out certain toxic, violent, abusive, hateful, threatening, insulting, and harassing phrases from generated responses or user prompts in LLMs.

Content moderation protects both users and organizations by:

  • Filtering inappropriate user inputs
  • Screening generated outputs
  • Blocking harmful content
  • Maintaining brand safety
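The input-and-output screening described above can be sketched as a wrapper around the agent call. The blocklist here is a deliberately crude stand-in: the source does not describe how the service's moderation works internally, and `BLOCKED_TERMS`, `is_flagged`, and `moderated_chat` are names invented for this example.

```python
import re
from typing import Callable

# Hypothetical blocklist; a real moderation feature would use a trained classifier.
BLOCKED_TERMS = {"idiot", "stupid"}


def is_flagged(text: str) -> bool:
    """Return True if any blocked term appears as a whole word."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return bool(words & BLOCKED_TERMS)


def moderated_chat(user_message: str, agent: Callable[[str], str]) -> str:
    """Screen the prompt first, then screen the generated reply."""
    if is_flagged(user_message):
        return "Your message was blocked by content moderation."
    reply = agent(user_message)
    if is_flagged(reply):
        return "The response was withheld by content moderation."
    return reply


print(moderated_chat("What is the leave policy?", lambda message: "Twenty days per year."))
```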

Object Storage Guidelines for GenAI Agents

When using Object Storage as your data source, follow these critical guidelines to ensure successful ingestion.

File Requirements

Data for GenAI Agents must be uploaded as files to an Object Storage bucket.

Supported File Types:

As of 2025, OCI Generative AI Agents supports:

  • PDF files
  • TXT files
  • JSON files
  • HTML files
  • Markdown (MD) files

File Size Limits:

  • Each file must be no larger than 100 MB
  • Files exceeding this limit are ignored during ingestion
  • Images, charts, and reference tables embedded in a PDF must not exceed 8 MB

Bucket Restrictions:

  • Only one bucket is allowed per data source
  • However, you can use multiple data sources per knowledge base
  • You can ingest up to 1,000 files from an OCI Object Storage bucket into a knowledge base
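Because oversized files are silently ignored during ingestion, it can help to check a batch before uploading. The sketch below applies the limits listed above to a mapping of file names to sizes; the function name `check_files` and the report layout are assumptions, and "100 MB" is interpreted as 100 × 1024 × 1024 bytes, which may differ from how the service counts.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".json", ".html", ".md"}
MAX_FILE_BYTES = 100 * 1024 * 1024  # assumption: 100 MB measured in binary megabytes
MAX_FILES = 1000


def check_files(files: dict[str, int]) -> dict[str, list[str]]:
    """Sort files (name -> size in bytes) into uploadable and rejected groups."""
    report: dict[str, list[str]] = {"ok": [], "unsupported_type": [], "too_large": []}
    for name, size in files.items():
        if Path(name).suffix.lower() not in SUPPORTED_EXTENSIONS:
            report["unsupported_type"].append(name)
        elif size > MAX_FILE_BYTES:
            report["too_large"].append(name)
        else:
            report["ok"].append(name)
    # Anything beyond the 1,000-file limit is reported separately.
    report["over_file_limit"] = report["ok"][MAX_FILES:]
    report["ok"] = report["ok"][:MAX_FILES]
    return report


print(check_files({
    "policy.pdf": 2_000_000,
    "slides.pptx": 500_000,
    "archive.pdf": 150 * 1024 * 1024,
}))
```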

Chart and Table Guidelines

Charts:

No special preparation is needed for charts as long as they're two-dimensional with labeled axes. The model can answer questions about charts without explicit explanations.

Requirements for optimal chart understanding:

  • Two-dimensional visualization
  • Clearly labeled X and Y axes
  • Legend (if multiple data series)
  • Title or caption
  • Size within 8 MB limit

Tables:

Use reference tables with several rows and columns. For enhanced table understanding, ensure:

  • All cells separated with visible lines or object boundaries
  • All columns including the first column have header names
  • Each table has more than one column and more than one row (excluding headers)
  • Tables use consistent formatting

Example ingestion message:

Count of tables that support enhanced table understanding in following PDFs:
- 2025_Report1.pdf has 4 tables processed successfully
- 2025_Report2.pdf has 3 tables processed successfully

Hyperlinks and URLs

All hyperlinks present in PDF documents are extracted and displayed as hyperlinks in chat responses. This enables users to navigate directly to referenced resources without manual searching.

Planning for Future Data

If data isn't yet available, create an empty folder for the data source and populate it later. This allows you to:

  • Set up knowledge base infrastructure in advance
  • Define data source connections
  • Run ingestion jobs when content becomes available
  • Maintain consistent agent configuration

Advanced Features (March 2025 Release)

Oracle released significant enhancements to the RAG tool in March 2025 based on customer feedback.

Enhanced Response Quality

Improved accuracy by refining contextual understanding and relevance of generated answers through:

  • Better semantic understanding
  • More precise context extraction
  • Improved answer synthesis

Hybrid Search

Combines keyword and vector search for highly precise, efficient retrieval. Hybrid search uses:

  • Keyword (BM25) search: Exact term matching for specific terminology
  • Vector search: Semantic similarity for conceptual matches
  • Fusion ranking: Combining scores from both approaches

This dual approach ensures agents find documents matching both semantic meaning and specific keywords.
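The source does not say which fusion method the service uses, so the sketch below swaps in reciprocal rank fusion (RRF), a common way to merge a keyword ranking with a vector ranking. The document IDs and the constant `k = 60` are illustrative assumptions.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists: each document earns 1 / (k + rank) from every list it appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. a BM25 ranking
vector_hits = ["doc_b", "doc_d", "doc_a"]   # e.g. an embedding-similarity ranking
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
# → ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents found by both searches rise to the top, while a document found by only one search still stays in the list.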

Improved Multi-Modal Parsing

Extract insights from rich content formats such as images and charts. Enhanced capabilities include:

  • Image understanding within PDFs
  • Chart data extraction
  • Table structure recognition
  • Diagram interpretation

Custom Instructions

Define preferences to fine-tune behavior and response styles of the RAG tool. Custom instructions enable:

  • Setting agent personality and tone
  • Defining response formats
  • Establishing domain-specific terminology
  • Configuring citation styles

Multi-Lingual Support

Seamlessly interact across several languages, including:

  • French
  • Spanish
  • Portuguese
  • Arabic
  • German
  • Italian
  • Japanese
  • English

This enables global deployment with localized experiences.

Multiple Knowledge Bases

To help improve coverage and depth of knowledge, set up agents to retrieve information from several knowledge bases. Benefits include:

  • Specialized knowledge bases per domain
  • Improved organization and maintenance
  • Better access control
  • Reduced search latency

Metadata Ingestion & Filtering

Use metadata to refine searches and categorize content more effectively. Metadata filtering enables:

  • Date-based filtering (e.g., "documents from 2024")
  • Department/category filtering
  • Author-based retrieval
  • Custom taxonomy support

Users can apply metadata filters during chat conversations to narrow results.
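As a rough illustration of metadata filtering, the snippet below narrows a candidate set by department and publication year before any search would run. The `DOCUMENTS` records and field names are hypothetical and do not reflect the service's metadata schema.

```python
from datetime import date
from typing import Optional

# Hypothetical document metadata for the example.
DOCUMENTS = [
    {"name": "q1_report.pdf", "department": "finance", "published": date(2024, 4, 2)},
    {"name": "handbook.pdf", "department": "hr", "published": date(2023, 1, 15)},
    {"name": "q3_report.pdf", "department": "finance", "published": date(2023, 10, 5)},
]


def filter_documents(
    documents: list[dict],
    department: Optional[str] = None,
    year: Optional[int] = None,
) -> list[str]:
    """Return the names of documents that match every filter that was supplied."""
    matches = []
    for doc in documents:
        if department is not None and doc["department"] != department:
            continue
        if year is not None and doc["published"].year != year:
            continue
        matches.append(doc["name"])
    return matches


print(filter_documents(DOCUMENTS, department="finance", year=2024))
# → ['q1_report.pdf']
```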

Building Your First Agent: Step-by-Step

Prerequisites

  1. OCI Account: With appropriate IAM permissions
  2. Identity Domain: Required for agent creation
  3. Region Subscription: Chicago region (primary availability)
  4. Object Storage Bucket: With source documents uploaded

Step 1: Create Knowledge Base

Navigate to Analytics & AI → Generative AI Agents → Knowledge Bases

  1. Click Create knowledge base
  2. Enter name (1-255 characters, start with letter or underscore)
  3. Select compartment
  4. Choose data store type (Object Storage, OpenSearch, or Database)
  5. Configure data source:
    • Select Object Storage bucket
    • Choose files or use prefixes
    • Select "Automatically start ingestion job"
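If knowledge bases are created from scripts or naming templates, the name rule in step 2 can be checked up front. This sketch encodes only the two constraints stated above (length 1-255, first character a letter or underscore); treating "letter" as ASCII is an assumption, and the console may enforce further rules not covered here.

```python
import re

# Assumption: "letter" means an ASCII letter.
_VALID_FIRST_CHAR = re.compile(r"[A-Za-z_]")


def is_valid_kb_name(name: str) -> bool:
    """Check the length and first-character rules for a knowledge base name."""
    return 1 <= len(name) <= 255 and bool(_VALID_FIRST_CHAR.match(name))


for candidate in ["hr_policies", "_drafts", "2025_reports", ""]:
    print(repr(candidate), is_valid_kb_name(candidate))
```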

Step 2: Monitor Ingestion

The ingestion job automatically starts and processes:

  • Document parsing
  • Text extraction
  • Chunking (512 tokens for OpenSearch)
  • Embedding generation
  • Vector indexing

Check ingestion logs to confirm success and view detailed status including number of ingested, ignored, and failed files.

Step 3: Create Agent

Navigate to Agents page:

  1. Click Create agent
  2. Enter agent name and description
  3. Select LLM (Command R+, Llama 4, etc.)
  4. Add RAG tool configuration:
    • Select knowledge base(s)
    • Configure custom instructions
    • Set hybrid search options
  5. Enable content moderation (optional)
  6. Choose to create endpoint now or later

Step 4: Create Agent Endpoint

If not created during agent setup:

  1. Navigate to agent details
  2. Click Create endpoint
  3. Enter endpoint name
  4. Select region
  5. Configure authentication

Copy the endpoint OCID for API access.

Step 5: Test in Playground

  1. Launch chat from agent or endpoint details page
  2. Select agent and endpoint
  3. Submit test queries
  4. Review responses and citations
  5. Verify knowledge base retrieval

Step 6: Integrate via API

Use the REST API to integrate with applications:

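import requests

# Placeholder: replace with the endpoint OCID copied in Step 4.
endpoint_id = "<your_agent_endpoint_ocid>"

# f-string so the endpoint OCID is actually substituted into the URL.
# Verify the host and field names against the current API reference.
endpoint_url = f"https://agent.generativeai.us-chicago-1.oci.oraclecloud.com/20240531/agentEndpoints/{endpoint_id}/actions/chat"

payload = {
    "message": "What is our company policy on remote work?",
    "sessionId": "session-123",
    "shouldStream": False
}

# Placeholder credentials: OCI REST APIs normally expect signed requests,
# so confirm the authentication scheme before relying on a bearer token.
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer <your_token>"
}

response = requests.post(endpoint_url, json=payload, headers=headers, timeout=30)
response.raise_for_status()  # stop on HTTP errors instead of parsing an error body
agent_response = response.json()

# .get() avoids a KeyError if a field is missing from the response
print(agent_response.get('message'))
print(agent_response.get('citations'))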

Use Cases for OCI Generative AI Agents

1. Customer Support

In the customer service industry, RAG agents retrieve information from a company's knowledge base to provide correct and contextually relevant answers to customer inquiries, reducing response times and improving customer satisfaction.

Benefits:

  • 24/7 availability
  • Instant access to knowledge base
  • Consistent responses
  • Escalation to human agents when needed

2. Legal Research

Legal professionals can use RAG agents to quickly find precedents and relevant case law from vast legal databases, streamlining the research process and ensuring thorough consideration of relevant legal texts.

3. Healthcare and Medical Guidance

In healthcare, RAG agents support doctors and medical staff by retrieving medical literature, treatment protocols, and patient history to help suggest potential diagnoses and treatments.

4. Financial Analysis

In finance, RAG agents analyze large volumes of financial data, reports, and news to provide analysis and recommendations for traders and analysts, helping them make informed decisions.

5. Travel Planning

In the travel industry, RAG agents serve as interactive travel guides, pulling information on destinations, weather, local attractions, and regulations to provide personalized travel advice and itineraries.

6. HR and Internal Knowledge

Deploy agents for employee self-service:

  • Benefits information
  • Policy questions
  • Onboarding guidance
  • Training materials

Best Practices for Production Agents

1. Organize Knowledge Bases Strategically

  • Create separate knowledge bases for different domains
  • Use multiple data sources for comprehensive coverage
  • Maintain clear naming conventions
  • Document knowledge base purposes

2. Optimize Document Preparation

  • Keep files under 100 MB
  • Ensure charts have clear labels and legends
  • Format tables with proper headers
  • Include descriptive filenames
  • Add custom URLs for citation tracking

3. Leverage Metadata Effectively

Tag documents with:

  • Department/category
  • Publication date
  • Author
  • Document type
  • Sensitivity level

4. Implement Content Moderation

  • Enable input and output filtering
  • Define custom moderation rules
  • Monitor flagged content
  • Establish escalation procedures

5. Monitor and Iterate

Track key metrics:

  • Response accuracy
  • Citation quality
  • User satisfaction
  • Query volume
  • Latency
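For the latency metric, a mean alone hides slow outliers, so it is common to track a high percentile as well. The helper below is a small sketch using the nearest-rank method for the 95th percentile; the function name and the sample values are made up for illustration.

```python
import statistics


def summarize_latencies(latencies_ms: list[float]) -> dict[str, float]:
    """Return the sample count, mean, and nearest-rank 95th percentile."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Nearest-rank p95: ceil(0.95 * n) as a 1-based rank, in integer arithmetic.
    p95_rank = (95 * len(ordered) + 99) // 100
    return {
        "count": len(ordered),
        "mean_ms": statistics.fmean(ordered),
        "p95_ms": ordered[p95_rank - 1],
    }


samples = [float(ms) for ms in range(10, 210, 10)]  # 10, 20, ..., 200 ms
print(summarize_latencies(samples))
# → {'count': 20, 'mean_ms': 105.0, 'p95_ms': 190.0}
```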

6. Plan for Scale

  • Request limit increases proactively (agents, knowledge bases, files)
  • Implement caching for common queries
  • Design for multi-region deployment
  • Establish backup and recovery procedures
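Caching for common queries can be as simple as a small least-recently-used cache keyed on a normalized question. The `QueryCache` class below is a hypothetical client-side sketch; it suits single-turn FAQ-style questions, since answers inside a multi-turn session depend on earlier context and should not be reused blindly.

```python
from collections import OrderedDict
from typing import Callable


class QueryCache:
    """Tiny LRU cache keyed on a case- and whitespace-normalized query."""

    def __init__(self, max_entries: int = 128) -> None:
        self._entries: "OrderedDict[str, str]" = OrderedDict()
        self._max_entries = max_entries

    @staticmethod
    def _normalize(query: str) -> str:
        return " ".join(query.lower().split())

    def get_or_compute(self, query: str, compute: Callable[[str], str]) -> str:
        key = self._normalize(query)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as recently used
            return self._entries[key]
        result = compute(query)
        self._entries[key] = result
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return result


cache = QueryCache(max_entries=2)
ask_agent = lambda q: f"answer to: {q}"  # stand-in for the real agent call
print(cache.get_or_compute("What is the PTO policy?", ask_agent))
print(cache.get_or_compute("what is the  pto policy?", ask_agent))  # served from cache
```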

Limitations and Considerations

Regional Availability

As of early 2025, OCI Generative AI Agents is primarily available in:

  • US Midwest (Chicago) - Primary region

Additional regions are planned for future release.

Resource Limits

Default limits per tenancy:

  • 2 agents
  • 2 knowledge bases per agent
  • 1,000 files per Object Storage data source

Limits can be increased through service request.

File Type Restrictions

Supported formats are limited to PDF, TXT, JSON, HTML, and Markdown. Other formats (DOCX, XLSX, PPTX) require conversion before upload.

Security and Compliance

Data Privacy

  • Data remains within your tenancy
  • No training on customer data
  • Encryption at rest and in transit
  • Access controlled via IAM policies

Enterprise Features

  • RBAC: Fine-grained access control
  • Audit Logging: Complete activity trails
  • Content Moderation: Built-in safety controls
  • Citation Tracking: Source attribution for all responses

Conclusion

OCI Generative AI Agents democratizes enterprise AI by providing a fully managed, no-code platform for building production-grade RAG applications. Key advantages include:

Ease of Use:

  • No coding required for basic setup
  • Visual interface for configuration
  • Automated ingestion and indexing
  • Built-in testing playground

Enterprise Features:

  • Multiple knowledge base support
  • Hybrid search capabilities
  • Multi-lingual support
  • Content moderation
  • Source attribution

Flexibility:

  • Support for various data sources
  • Custom instructions
  • Metadata filtering
  • API and chat interfaces

Scalability:

  • Fully managed service
  • Enterprise-grade security
  • Compliance ready
  • Oracle infrastructure reliability

By following the guidelines for Object Storage (100 MB file limit, supported formats, chart/table preparation), leveraging advanced features (hybrid search, multi-modal parsing, metadata filtering), and implementing best practices, organizations can deploy sophisticated AI agents that deliver accurate, grounded, and trustworthy responses to users.

Whether you're building customer support chatbots, internal knowledge assistants, or specialized research tools, OCI Generative AI Agents provides the foundation for enterprise-grade AI applications without the complexity of traditional RAG implementations.
