Oracle Cloud Infrastructure (OCI) Generative AI Agents lets enterprises deploy production-grade RAG (Retrieval-Augmented Generation) systems without writing code for the basic setup. This guide covers the architecture, capabilities, and practical implementation of OCI Gen AI Agents for enterprise use cases.
What are OCI Generative AI Agents?
OCI Generative AI Agents is a fully managed service that combines large language models (LLMs) with an intelligent retrieval system. It answers questions by searching your knowledge base and generating contextually relevant responses from what it finds.
The agents are prepackaged, validated LLM applications that are ready to use out of the box for enterprise deployments.
Core Capabilities
An AI agent is an LLM-based autonomous system that understands and generates human-like text with high answerability and groundedness. Specifically, an AI agent can:
- Perform complex tasks autonomously: Execute multi-step workflows without constant human intervention
- Mimic human chain-of-thought processing: Reason through problems systematically
- Be an effective tool for automating processes: Handle repetitive tasks at scale
- Utilize your knowledge: Access and reason over enterprise-specific information
How Agents Work
An AI agent understands and interprets user queries, determines necessary actions, retrieves data if necessary, and executes actions to deliver accurate, contextually relevant responses.
The Complete Workflow:
- User Input: Natural language query submitted to the agent
- Query Understanding: LLM interprets the intent and context
- Knowledge Retrieval: Agent searches knowledge base for relevant information
- Response Formulation: LLM generates answer based on retrieved context
- Output: User receives response with source citations
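The workflow above can be sketched in a few lines of Python. This is a toy illustration only: the `Passage` class, the word-overlap retrieval, and the stubbed response step are all assumptions made for the example, not how the managed service is implemented.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # document name used for the citation
    text: str


def retrieve(query: str, knowledge_base: list[Passage], top_k: int = 2) -> list[Passage]:
    """Toy retrieval: rank passages by how many query words they share."""
    query_words = set(query.lower().split())
    scored = [
        (len(query_words & set(p.text.lower().split())), p) for p in knowledge_base
    ]
    scored = [(score, p) for score, p in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]


def answer(query: str, knowledge_base: list[Passage]) -> dict:
    """Run the retrieve-then-respond loop and attach citations."""
    passages = retrieve(query, knowledge_base)
    if not passages:
        return {"text": "No relevant information found.", "citations": []}
    # A real agent would pass the passages to an LLM here; this stub just
    # stitches the retrieved text together.
    text = " ".join(p.text for p in passages)
    return {"text": text, "citations": [p.source for p in passages]}


kb = [
    Passage("hr_policy.pdf", "Employees may work remote up to three days per week."),
    Passage("travel.pdf", "Book flights through our corporate travel portal."),
]
result = answer("What is the remote work policy?", kb)
print(result["citations"])
```

The point of the sketch is the shape of the loop: the response is built only from retrieved passages, and each passage carries the source name that ends up in the citation list.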
Key Agent Components
Knowledge Base
A knowledge base is the foundation for all the data sources that an agent can use to retrieve information for its chat answers. An agent connects to a knowledge base, which is vector-based storage that can connect to and ingest data from a data source.
Knowledge bases support multiple data store types:
- OCI Object Storage: Simple file-based storage (PDF, TXT, JSON, HTML, Markdown)
- OCI OpenSearch: Pre-indexed documents with custom chunking
- Oracle Database 23ai: AI Vector Search with structured and unstructured data
- MySQL HeatWave: In-database vector capabilities
Data Sources
Data sources provide connection information to the data stores that an agent uses to generate responses. Each knowledge base can have multiple data sources, though only one bucket is allowed per Object Storage data source.
Data Ingestion
Data ingestion is the process that extracts data from data source documents, converts it into a structured format suitable for analysis, and then stores it in a knowledge base.
The ingestion pipeline handles:
- Document parsing and text extraction
- Chunking into smaller segments
- Embedding generation
- Vector indexing
- Metadata extraction
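To make the chunking step concrete, here is a minimal sketch that splits text into overlapping chunks. It assumes whitespace-separated words stand in for tokens, and the `chunk_size` and `overlap` defaults are illustrative values; the service's actual tokenizer and chunking strategy are not described here.

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into chunks of at most chunk_size words, sharing overlap words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    tokens = text.split()  # assumption: one word is roughly one token
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = " ".join(str(i) for i in range(10))
print(chunk_text(sample, chunk_size=4, overlap=1))
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from either side, at the cost of storing some text twice.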
Agent Concepts and Terminology
Session
A session is a series of exchanges where the user sends queries or prompts and the agent responds with relevant information. Sessions maintain conversation context, enabling multi-turn dialogues where follow-up questions reference previous exchanges.
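One way to picture how a session carries context is a small history buffer that is replayed alongside each new question. The `Session` class below is a hypothetical illustration; the managed service keeps this state for you.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    session_id: str
    turns: list[tuple[str, str]] = field(default_factory=list)

    def add_turn(self, user_message: str, agent_reply: str) -> None:
        """Record one completed exchange."""
        self.turns.append((user_message, agent_reply))

    def context(self, max_turns: int = 3) -> str:
        """Return the most recent exchanges as text for the next prompt."""
        if max_turns <= 0:
            return ""
        recent = self.turns[-max_turns:]
        return "\n".join(f"User: {u}\nAgent: {a}" for u, a in recent)


session = Session("demo-session")
session.add_turn("How many vacation days do I get?", "20 days per year.")
# The follow-up only makes sense because the earlier exchange is included.
prompt = session.context() + "\nUser: Do they carry over?"
print(prompt)
```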
Agent Endpoint
An agent endpoint is a specific point of access in a network or system that agents use to interact with other systems or devices. Endpoints provide:
- REST API access for programmatic integration
- Chat interface for interactive testing
- Authentication and authorization controls
- Rate limiting and monitoring
Trace
A trace is a feature to track and display conversation history during a chat conversation. Traces enable:
- Debugging agent responses
- Understanding retrieval decisions
- Monitoring performance
- Auditing user interactions
Citation
A citation is the source of information for an agent's response. OCI Generative AI Agents provides source attribution for all answers, displaying:
- Document name
- Page number (for PDFs)
- Hyperlinks to original sources
- Relevance scores
This transparency enables users to verify information and builds trust in agent responses.
Content Moderation
Content moderation is a feature designed to help detect or filter out certain toxic, violent, abusive, hateful, threatening, insulting, and harassing phrases from generated responses or user prompts in LLMs.
Content moderation protects both users and organizations by:
- Filtering inappropriate user inputs
- Screening generated outputs
- Blocking harmful content
- Maintaining brand safety
Object Storage Guidelines for GenAI Agents
When using Object Storage as your data source, follow these critical guidelines to ensure successful ingestion.
File Requirements
Data for GenAI Agents must be uploaded as files to an Object Storage bucket.
Supported File Types:
As of 2025, OCI Generative AI Agents supports:
- PDF files
- TXT files
- JSON files
- HTML files
- Markdown (MD) files
File Size Limits:
- Each file must be no larger than 100 MB
- Files exceeding this limit are ignored during ingestion
- PDF images, charts, and reference tables must not exceed 8 MB within each PDF
Bucket Restrictions:
- Only one bucket is allowed per data source
- However, you can use multiple data sources per knowledge base
- You can ingest up to 1,000 files from an OCI Object Storage bucket into a knowledge base
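A quick pre-upload check can catch files that ingestion would ignore. The sketch below applies the limits listed above to a list of object names and sizes. The function name, the report format, and the reading of "100 MB" as binary megabytes are assumptions for illustration.

```python
from pathlib import PurePosixPath

# Limits taken from the guidelines above; the byte interpretation of
# "100 MB" (binary megabytes) is an assumption.
SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".json", ".html", ".md"}
MAX_FILE_BYTES = 100 * 1024 * 1024
MAX_FILES_PER_BUCKET = 1000


def check_files(files: list[tuple[str, int]]) -> dict:
    """Split (object_name, size_in_bytes) pairs into accepted and rejected."""
    accepted: list[str] = []
    rejected: dict[str, str] = {}
    for name, size in files:
        extension = PurePosixPath(name).suffix.lower()
        if extension not in SUPPORTED_EXTENSIONS:
            rejected[name] = f"unsupported file type {extension or '(none)'}"
        elif size > MAX_FILE_BYTES:
            rejected[name] = "larger than 100 MB"
        elif len(accepted) >= MAX_FILES_PER_BUCKET:
            rejected[name] = "over the 1,000-file limit"
        else:
            accepted.append(name)
    return {"accepted": accepted, "rejected": rejected}


report = check_files([
    ("policies/remote_work.pdf", 2_500_000),
    ("policies/handbook.docx", 900_000),
    ("reports/annual.PDF", 150 * 1024 * 1024),
])
print(report)
```

Running a check like this before upload makes the reason for each skipped file explicit, rather than leaving it to be discovered in the ingestion logs.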
Chart and Table Guidelines
Charts:
No special preparation is needed for charts as long as they're two-dimensional with labeled axes. The model can answer questions about charts without explicit explanations.
Requirements for optimal chart understanding:
- Two-dimensional visualization
- Clearly labeled X and Y axes
- Legend (if multiple data series)
- Title or caption
- Size within 8 MB limit
Tables:
Use reference tables with several rows and columns. For enhanced table understanding, ensure:
- All cells separated with visible lines or object boundaries
- All columns including the first column have header names
- Each table has more than one column and more than one row (excluding headers)
- Tables use consistent formatting
Example ingestion message:
Count of tables that support enhanced table understanding in following PDFs:
- 2025_Report1.pdf has 4 tables processed successfully
- 2025_Report2.pdf has 3 tables processed successfully
Hyperlinks and URLs
All hyperlinks present in PDF documents are extracted and displayed as hyperlinks in chat responses. This enables users to navigate directly to referenced resources without manual searching.
Planning for Future Data
If data isn't yet available, create an empty folder for the data source and populate it later. This allows you to:
- Set up knowledge base infrastructure in advance
- Define data source connections
- Run ingestion jobs when content becomes available
- Maintain consistent agent configuration
Advanced Features (March 2025 Release)
Oracle released significant enhancements to the RAG tool in March 2025 based on customer feedback.
Enhanced Response Quality
The March 2025 release improves accuracy by refining the contextual understanding and relevance of generated answers through:
- Better semantic understanding
- More precise context extraction
- Improved answer synthesis
Hybrid Search
Combines keyword and vector search for highly precise, efficient retrieval. Hybrid search uses:
- Keyword (BM25) search: Exact term matching for specific terminology
- Vector search: Semantic similarity for conceptual matches
- Fusion ranking: Combining scores from both approaches
This dual approach ensures agents find documents matching both semantic meaning and specific keywords.
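To illustrate the fusion step, the sketch below merges a keyword ranking and a vector ranking with reciprocal rank fusion (RRF). RRF is one common fusion method and is used here as an assumption; the exact fusion formula the service applies is not stated above. The document IDs and the constant `k = 60` are illustrative.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists: each document scores 1 / (k + rank) per list."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["a", "b", "c"]  # e.g. ordered by BM25 score
vector_hits = ["b", "d", "a"]   # e.g. ordered by embedding similarity
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

Documents that appear near the top of both lists rise to the top of the fused list, which is the behavior the dual approach is after.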
Improved Multi-Modal Parsing
Extract insights from rich content formats such as images and charts. Enhanced capabilities include:
- Image understanding within PDFs
- Chart data extraction
- Table structure recognition
- Diagram interpretation
Custom Instructions
Define preferences to fine-tune behavior and response styles of the RAG tool. Custom instructions enable:
- Setting agent personality and tone
- Defining response formats
- Establishing domain-specific terminology
- Configuring citation styles
Multi-Lingual Support
Seamlessly interact across several languages, including:
- French
- Spanish
- Portuguese
- Arabic
- German
- Italian
- Japanese
- English
This enables global deployment with localized experiences.
Multiple Knowledge Bases
To help improve coverage and depth of knowledge, set up agents to retrieve information from several knowledge bases. Benefits include:
- Specialized knowledge bases per domain
- Improved organization and maintenance
- Better access control
- Reduced search latency
Metadata Ingestion & Filtering
Use metadata to refine searches and categorize content more effectively. Metadata filtering enables:
- Date-based filtering (e.g., "documents from 2024")
- Department/category filtering
- Author-based retrieval
- Custom taxonomy support
Users can apply metadata filters during chat conversations to narrow results.
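Conceptually, a metadata filter narrows the candidate documents before or alongside retrieval. The sketch below shows the idea with plain dictionaries. The field names (`department`, `published`) and the `filter_documents` helper are hypothetical and do not reflect the service's metadata schema.

```python
from datetime import date

documents = [
    {"name": "benefits_2024.pdf", "department": "HR", "published": date(2024, 3, 1)},
    {"name": "benefits_2023.pdf", "department": "HR", "published": date(2023, 3, 1)},
    {"name": "q4_forecast.pdf", "department": "Finance", "published": date(2024, 11, 15)},
]


def filter_documents(docs, department=None, year=None):
    """Keep documents that match every filter that was supplied."""
    matches = []
    for doc in docs:
        if department is not None and doc["department"] != department:
            continue
        if year is not None and doc["published"].year != year:
            continue
        matches.append(doc)
    return matches


hr_2024 = filter_documents(documents, department="HR", year=2024)
print([doc["name"] for doc in hr_2024])
```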
Building Your First Agent: Step-by-Step
Prerequisites
- OCI Account: With appropriate IAM permissions
- Identity Domain: Required for agent creation
- Region Subscription: Chicago region (primary availability)
- Object Storage Bucket: With source documents uploaded
Step 1: Create Knowledge Base
Navigate to Analytics & AI → Generative AI Agents → Knowledge Bases
- Click Create knowledge base
- Enter name (1-255 characters, start with letter or underscore)
- Select compartment
- Choose data store type (Object Storage, OpenSearch, or Database)
- Configure data source:
  - Select Object Storage bucket
  - Choose files or use prefixes
  - Select "Automatically start ingestion job"
Step 2: Monitor Ingestion
The ingestion job automatically starts and processes:
- Document parsing
- Text extraction
- Chunking (512 tokens for OpenSearch)
- Embedding generation
- Vector indexing
Check ingestion logs to confirm success and view detailed status including number of ingested, ignored, and failed files.
Step 3: Create Agent
Navigate to Agents page:
- Click Create agent
- Enter agent name and description
- Select LLM (Command R+, Llama 4, etc.)
- Add RAG tool configuration:
  - Select knowledge base(s)
  - Configure custom instructions
  - Set hybrid search options
- Enable content moderation (optional)
- Choose to create endpoint now or later
Step 4: Create Agent Endpoint
If not created during agent setup:
- Navigate to agent details
- Click Create endpoint
- Enter endpoint name
- Select region
- Configure authentication
Copy the endpoint OCID for API access.
Step 5: Test in Playground
- Launch chat from agent or endpoint details page
- Select agent and endpoint
- Submit test queries
- Review responses and citations
- Verify knowledge base retrieval
Step 6: Integrate via API
Use the REST API to integrate with applications:
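import requests

# Placeholder: replace with the agent endpoint OCID copied in Step 4.
endpoint_id = "<your_agent_endpoint_ocid>"
# Chat requests go to the agent-runtime host (the "agent" host serves management calls).
endpoint_url = (
    "https://agent-runtime.generativeai.us-chicago-1.oci.oraclecloud.com"
    f"/20240531/agentEndpoints/{endpoint_id}/actions/chat"
)
payload = {
    "userMessage": "What is our company policy on remote work?",
    "sessionId": "session-123",  # placeholder: use the ID of a session you created
    "shouldStream": False,
}
# OCI REST APIs normally expect signed requests (for example via the OCI SDK);
# the bearer token below is a placeholder for whatever auth your setup uses.
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer <your_token>",
}
response = requests.post(endpoint_url, json=payload, headers=headers, timeout=30)
response.raise_for_status()  # fail loudly on HTTP errors
agent_response = response.json()
# Field names assume the documented chat result shape; verify against the API reference.
content = agent_response["message"]["content"]
print(content["text"])
print(content.get("citations"))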
Use Cases for OCI Generative AI Agents
1. Customer Support
In the customer service industry, RAG agents retrieve information from a company's knowledge base to provide correct and contextually relevant answers to customer inquiries, reducing response times and improving customer satisfaction.
Benefits:
- 24/7 availability
- Instant access to knowledge base
- Consistent responses
- Escalation to human agents when needed
2. Legal Research
Legal professionals can use RAG agents to quickly find precedents and relevant case law from vast legal databases, streamlining the research process and ensuring thorough consideration of relevant legal texts.
3. Healthcare and Medical Guidance
In healthcare, RAG agents help doctors and medical staff by providing diagnostic support, retrieving medical literature, treatment protocols, and patient history to suggest potential diagnoses and treatments.
4. Financial Analysis
In finance, RAG agents analyze large volumes of financial data, reports, and news to provide analysis and recommendations for traders and analysts, helping them make informed decisions.
5. Travel Planning
In the travel industry, RAG agents serve as interactive travel guides, pulling information on destinations, weather, local attractions, and regulations to provide personalized travel advice and itineraries.
6. HR and Internal Knowledge
Deploy agents for employee self-service:
- Benefits information
- Policy questions
- Onboarding guidance
- Training materials
Best Practices for Production Agents
1. Organize Knowledge Bases Strategically
- Create separate knowledge bases for different domains
- Use multiple data sources for comprehensive coverage
- Maintain clear naming conventions
- Document knowledge base purposes
2. Optimize Document Preparation
- Keep files under 100 MB
- Ensure charts have clear labels and legends
- Format tables with proper headers
- Include descriptive filenames
- Add custom URLs for citation tracking
3. Leverage Metadata Effectively
Tag documents with:
- Department/category
- Publication date
- Author
- Document type
- Sensitivity level
4. Implement Content Moderation
- Enable input and output filtering
- Define custom moderation rules
- Monitor flagged content
- Establish escalation procedures
5. Monitor and Iterate
Track key metrics:
- Response accuracy
- Citation quality
- User satisfaction
- Query volume
- Latency
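Several of these metrics can be computed from a simple interaction log. The sketch below assumes a hypothetical log format (one dictionary per query with `latency_ms` and `citation_count`) and uses a nearest-rank 95th percentile. Both choices are illustrative, not something the service provides.

```python
import math
import statistics


def summarize(interactions: list[dict]) -> dict:
    """Compute volume, latency, and citation coverage from logged queries."""
    if not interactions:
        raise ValueError("no interactions to summarize")
    latencies = sorted(item["latency_ms"] for item in interactions)
    # Nearest-rank 95th percentile: the smallest value covering 95% of queries.
    p95_index = math.ceil(0.95 * len(latencies)) - 1
    cited = sum(1 for item in interactions if item["citation_count"] > 0)
    return {
        "query_volume": len(interactions),
        "median_latency_ms": statistics.median(latencies),
        "p95_latency_ms": latencies[p95_index],
        "citation_rate": cited / len(interactions),
    }


log = [
    {"latency_ms": 100, "citation_count": 2},
    {"latency_ms": 400, "citation_count": 0},
    {"latency_ms": 200, "citation_count": 1},
    {"latency_ms": 300, "citation_count": 3},
]
summary = summarize(log)
print(summary)
```

Response accuracy and user satisfaction need human or feedback-based labels, so they are left out of this sketch.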
6. Plan for Scale
- Request limit increases proactively (agents, knowledge bases, files)
- Implement caching for common queries
- Design for multi-region deployment
- Establish backup and recovery procedures
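As an example of caching common queries, the sketch below wraps any "ask the agent" function in a small least-recently-used cache keyed on a normalized query string. `QueryCache` and `fake_agent` are hypothetical names; in practice the wrapped function would call your agent endpoint.

```python
from collections import OrderedDict
from typing import Callable


class QueryCache:
    """Small LRU cache in front of a function that answers queries."""

    def __init__(self, ask: Callable[[str], str], max_entries: int = 128):
        self._ask = ask
        self._max_entries = max_entries
        self._entries = OrderedDict()

    @staticmethod
    def _normalize(query: str) -> str:
        # Treat queries that differ only in case or spacing as the same.
        return " ".join(query.lower().split())

    def get(self, query: str) -> str:
        key = self._normalize(query)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as recently used
            return self._entries[key]
        answer = self._ask(query)
        self._entries[key] = answer
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used
        return answer


calls = []


def fake_agent(query: str) -> str:
    """Stand-in for a real agent call; records each invocation."""
    calls.append(query)
    return f"answer #{len(calls)}"


cache = QueryCache(fake_agent, max_entries=2)
first = cache.get("What is our PTO policy?")
second = cache.get("  what is our   PTO policy? ")
print(first == second, len(calls))
```

A cache like this only suits questions that do not depend on earlier turns in a session, and cached answers should be cleared after a knowledge base is re-ingested so users do not see stale content.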
Limitations and Considerations
Regional Availability
As of early 2025, OCI Generative AI Agents is primarily available in:
- US Midwest (Chicago) - Primary region
Additional regions are planned for future release.
Resource Limits
Default limits per tenancy:
- 2 agents
- 2 knowledge bases per agent
- 1,000 files per Object Storage data source
Limits can be increased through service request.
File Type Restrictions
Supported formats are limited to PDF, TXT, JSON, HTML, and Markdown. Other formats (DOCX, XLSX, PPTX) require conversion before upload.
Security and Compliance
Data Privacy
- Data remains within your tenancy
- No training on customer data
- Encryption at rest and in transit
- Access controlled via IAM policies
Enterprise Features
- RBAC: Fine-grained access control
- Audit Logging: Complete activity trails
- Content Moderation: Built-in safety controls
- Citation Tracking: Source attribution for all responses
Conclusion
OCI Generative AI Agents democratizes enterprise AI by providing a fully managed, no-code platform for building production-grade RAG applications. Key advantages include:
Ease of Use:
- No coding required for basic setup
- Visual interface for configuration
- Automated ingestion and indexing
- Built-in testing playground
Enterprise Features:
- Multiple knowledge base support
- Hybrid search capabilities
- Multi-lingual support
- Content moderation
- Source attribution
Flexibility:
- Support for various data sources
- Custom instructions
- Metadata filtering
- API and chat interfaces
Scalability:
- Fully managed service
- Enterprise-grade security
- Compliance ready
- Oracle infrastructure reliability
By following the guidelines for Object Storage (100 MB file limit, supported formats, chart/table preparation), leveraging advanced features (hybrid search, multi-modal parsing, metadata filtering), and implementing best practices, organizations can deploy sophisticated AI agents that deliver accurate, grounded, and trustworthy responses to users.
Whether you're building customer support chatbots, internal knowledge assistants, or specialized research tools, OCI Generative AI Agents provides the foundation for enterprise-grade AI applications without the complexity of traditional RAG implementations.