Oracle Database 23ai introduces AI Vector Search, a powerful feature that enables semantic search capabilities directly within the database. Unlike traditional keyword-based searches, vector search understands the meaning and context behind your queries, allowing you to find relevant information even when exact keywords don't match.
This guide walks through the complete Oracle AI Vector Search workflow, from generating vector embeddings to performing sophisticated similarity searches.
What Are Vector Embeddings?
Vector embeddings are mathematical representations that capture the semantic meaning of your data. They convert unstructured data like text, images, or audio into numerical vectors that can be mathematically compared.
Key Properties:
- Each embedding is an array of numbers (typically 128 to 1536 dimensions)
- Semantically similar content has vectors that are close together in vector space
- Distance between vectors indicates similarity in meaning
For example, the vectors for "apple" and "orange" are closer together than the vectors for "apple" and "dog" because fruits share more semantic similarity than a fruit and an animal.
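You can see this numerically in SQL. Here is a minimal sketch using hand-written 3-dimensional vectors; real embeddings have hundreds of dimensions and come from a model, so the values below are purely illustrative:

-- Two nearby vectors vs. two unrelated ones (toy values)
SELECT
    VECTOR_DISTANCE(TO_VECTOR('[1, 2, 3]'),
                    TO_VECTOR('[1.1, 2.0, 2.9]'), COSINE) AS close_pair,
    VECTOR_DISTANCE(TO_VECTOR('[1, 2, 3]'),
                    TO_VECTOR('[9, -1, 0.5]'), COSINE) AS distant_pair
FROM dual;
-- close_pair comes back near 0; distant_pair is noticeably larger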
The Complete Oracle AI Vector Search Workflow
The workflow consists of four primary steps that transform raw data into searchable semantic knowledge:
Step 1: Generate Vector Embeddings
You can generate vector embeddings either outside or within Oracle Database, depending on your requirements.
Option A: Generate Embeddings Within Oracle Database (ONNX)
This is the recommended approach for integrated, secure workflows. Oracle Database 23ai supports loading ONNX (Open Neural Network Exchange) format embedding models directly into the database.
Loading an ONNX Model:
-- Load the embedding model into the database
-- A multi-line call is easiest to run as an anonymous PL/SQL block
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    'DM_DUMP',
    'all_MiniLM_L12_v2.onnx',
    'doc_model'
  );
END;
/
Note: The directory 'DM_DUMP' must exist and point to the OS directory containing your ONNX model file.
Generating Embeddings from Text:
-- Generate a vector embedding from a text string
SELECT TO_VECTOR(
VECTOR_EMBEDDING(doc_model USING 'hello world' AS data)
) AS embedding;
Common Embedding Generation Patterns:
- Convert Text String to Embedding:
-- Simple text to vector
SELECT VECTOR_EMBEDDING(doc_model USING 'semantic search' AS data) AS vec;
- Convert File to Text to Chunks to Embeddings:
-- Load PDF, convert to text, chunk, and embed
INSERT INTO doc_chunks (doc_id, chunk_id, chunk_data, embedding)
SELECT
d.id,
JSON_VALUE(c.column_value, '$.chunk_id' RETURNING NUMBER),
JSON_VALUE(c.column_value, '$.chunk_data'),
VECTOR_EMBEDDING(doc_model USING
JSON_VALUE(c.column_value, '$.chunk_data') AS data)
FROM documents d,
DBMS_VECTOR_CHAIN.UTL_TO_CHUNKS(
DBMS_VECTOR_CHAIN.UTL_TO_TEXT(d.pdf_content),
JSON('{"by":"words", "max":"100", "overlap":"20",
"split":"recursively", "language":"american"}')
) c;
- Convert Files Directly to Embeddings:
-- For images or other media files
SELECT VECTOR_EMBEDDING(
image_model USING file_content AS data
) AS image_vec
FROM media_files;
- End-to-End Pipeline (Chainable Functions):
Oracle's DBMS_VECTOR_CHAIN package provides chainable utility functions for complete data transformation pipelines:
-- Complete pipeline: PDF → Text → Chunks → Embeddings
-- UTL_TO_EMBEDDINGS returns rows of JSON; JSON_TABLE unpacks each one
INSERT INTO vectorized_docs
SELECT
    d.doc_id,
    et.embed_id,
    et.embed_data,
    TO_VECTOR(et.embed_vector)
FROM source_documents d,
     DBMS_VECTOR_CHAIN.UTL_TO_EMBEDDINGS(
         DBMS_VECTOR_CHAIN.UTL_TO_CHUNKS(
             DBMS_VECTOR_CHAIN.UTL_TO_TEXT(d.pdf_blob),
             JSON('{"by":"words", "max":"100", "overlap":"20"}')
         ),
         JSON('{"provider":"database", "model":"doc_model"}')
     ) t,
     JSON_TABLE(t.column_value, '$[*]'
         COLUMNS (
             embed_id     NUMBER         PATH '$.embed_id',
             embed_data   VARCHAR2(4000) PATH '$.embed_data',
             embed_vector CLOB           PATH '$.embed_vector'
         )
     ) et;
Understanding Document Chunking:
When working with large documents, you need to split them into smaller chunks because:
- Embedding models have input size limitations (typically 512-4000 characters)
- Smaller chunks provide more precise semantic matching
- Better retrieval accuracy in similarity searches
Chunking Parameters:
- by: Chunking method (words, sentences, paragraphs, vocabulary)
- max: Maximum size per chunk (e.g., 100 words)
- overlap: Number of overlapping units between chunks (prevents context loss at boundaries)
- split: Splitting strategy (recursively, custom)
- language: Language-specific rules for sentence boundaries
- normalize: Text normalization options
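Before wiring these parameters into a pipeline, you can call UTL_TO_CHUNKS on its own to see how a string gets split. A small sketch (the sample sentence is made up):

-- Inspect chunking output; each row is a JSON document containing
-- chunk_id, chunk_offset, chunk_length, and chunk_data fields
SELECT c.column_value AS chunk_json
FROM DBMS_VECTOR_CHAIN.UTL_TO_CHUNKS(
         TO_CLOB('Oracle AI Vector Search stores embeddings next to business data ' ||
                 'so you can query both with plain SQL in a single statement.'),
         JSON('{"by":"words", "max":"10", "overlap":"2", "split":"recursively"}')
     ) c;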
Option B: Generate Embeddings Outside Oracle Database
You can also generate embeddings externally using third-party services or local models, then load them into Oracle Database.
External Generation Options:
- Cloud APIs: OpenAI, Cohere, Google Vertex AI, Hugging Face
- Local Models: Python libraries (sentence-transformers, transformers)
- Self-hosted Services: ONNX Runtime, TensorFlow Serving
Example (Python):
from sentence_transformers import SentenceTransformer
import array
import oracledb

# Generate embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')
text = "Oracle AI Vector Search"
embedding = model.encode(text)

# Store in Oracle Database (the embedding column must be of type VECTOR)
connection = oracledb.connect(user="user", password="pwd", dsn="localhost/FREEPDB1")
cursor = connection.cursor()
cursor.execute(
    "INSERT INTO documents (id, text, embedding) VALUES (:1, :2, :3)",
    [1, text, array.array("f", embedding)]  # bind as a float32 vector
)
connection.commit()
Advantages of In-Database Generation:
- No data movement between systems
- Reduced latency
- Simplified architecture
- Automatic consistency between data and embeddings
- Enterprise security and governance apply to the entire pipeline
Advantages of External Generation:
- Flexibility in model choice
- Access to latest models and APIs
- Can use specialized or proprietary models
- Independent scaling of embedding generation
Step 2: Store Vector Embeddings
Store vector embeddings in columns with the VECTOR data type within standard relational tables. This allows you to keep business data and semantic vectors together in a unified system.
Creating Tables with Vector Columns:
-- Create table with VECTOR data type
CREATE TABLE products (
product_id NUMBER PRIMARY KEY,
product_name VARCHAR2(200),
description CLOB,
price NUMBER(10,2),
category VARCHAR2(100),
-- Vector column with 384 dimensions, FLOAT32 format
description_vector VECTOR(384, FLOAT32)
);
Supported Vector Formats:
- INT8: 8-bit integers (memory efficient, slight accuracy trade-off)
- FLOAT32: 32-bit floating-point (most common, balanced)
- FLOAT64: 64-bit floating-point (highest precision, more storage)
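The format is declared per column, so a single table can mix them. A quick sketch of the declaration options (table and column names are illustrative; VECTOR with no arguments accepts any dimension count and format):

CREATE TABLE format_demo (
    v_flex VECTOR,                 -- flexible: any dimensions, any format
    v_int8 VECTOR(384, INT8),      -- compact, slight accuracy trade-off
    v_f32  VECTOR(384, FLOAT32),   -- the common default
    v_f64  VECTOR(768, FLOAT64)    -- maximum precision, double the storage
);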
Inserting Data with Vectors:
-- Insert with inline embedding generation
INSERT INTO products (product_id, product_name, description, description_vector)
VALUES (
1001,
'Wireless Headphones',
'Noise-cancelling Bluetooth headphones with 30-hour battery',
VECTOR_EMBEDDING(doc_model USING
'Noise-cancelling Bluetooth headphones with 30-hour battery' AS data)
);
-- Or insert data first, then update vectors
INSERT INTO products (product_id, product_name, description, price)
VALUES (1002, 'Smart Watch', 'Fitness tracker with heart rate monitor', 299.99);
UPDATE products
SET description_vector = VECTOR_EMBEDDING(doc_model USING description AS data)
WHERE product_id = 1002;
Step 3: Create Vector Indexes
For large vector datasets, creating vector indexes is essential for performance. Without indexes, similarity searches require comparing the query vector against every vector in the table (exhaustive search), which becomes slow as data grows.
Why Vector Indexes Matter:
An exhaustive similarity search is accurate but can be slow because it calculates distance to every vector in the column. Vector indexes use Approximate Nearest Neighbor (ANN) algorithms to dramatically reduce the number of distance calculations while maintaining high accuracy (typically 90-99%).
Performance Impact:
- Without Index (Exhaustive): O(n) complexity - checks every vector
- With Index (Approximate): O(log n) complexity - checks subset of vectors
- Real-world improvement: Searches that take hours can complete in milliseconds
Types of Vector Indexes in Oracle 23ai:
1. HNSW (Hierarchical Navigable Small World) - In-Memory Index:
CREATE VECTOR INDEX product_hnsw_idx ON products (description_vector)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 95;
Characteristics:
- Stored in Vector Memory Pool (part of SGA)
- Extremely fast query performance
- Best for datasets that fit in available memory
- Requires configuring the VECTOR_MEMORY_SIZE parameter
Configure Vector Memory:
ALTER SYSTEM SET VECTOR_MEMORY_SIZE = 2G SCOPE=BOTH;
2. IVF (Inverted File Flat) - Storage-Based Index:
CREATE VECTOR INDEX product_ivf_idx ON products (description_vector)
ORGANIZATION NEIGHBOR PARTITIONS
DISTANCE COSINE
WITH TARGET ACCURACY 95;
Characteristics:
- Stored on disk (not memory-constrained)
- Excellent for very large datasets
- Uses clustering to partition vectors by similarity
- Supports both global and local partitioning
3. Hybrid Vector Index - Combined Text and Semantic Search:
CREATE HYBRID VECTOR INDEX product_hybrid_idx ON products (description)
PARAMETERS('MODEL doc_model');
Characteristics:
- Combines full-text search and vector similarity
- Single index for keyword and semantic searches
- Ideal for mixed query types
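Queries against a hybrid index go through the DBMS_HYBRID_VECTOR package rather than plain VECTOR_DISTANCE. A hedged sketch, assuming the index above exists; the exact JSON search-spec field names are assumptions to verify against your release's documentation:

-- Illustrative hybrid search combining keyword and semantic scoring;
-- the JSON fields shown here are assumptions to verify for your release
SELECT DBMS_HYBRID_VECTOR.SEARCH(
           JSON('{"hybrid_index_name": "product_hybrid_idx",
                  "search_text": "wireless audio device",
                  "return": {"topN": 5}}')
       ) AS results
FROM dual;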
Choosing the Right Index:
- HNSW: When dataset fits in memory and you need maximum speed
- IVF: When dataset is very large or memory is limited
- Hybrid: When you need both keyword and semantic search capabilities
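Both in-memory and storage-based indexes also accept finer-grained tuning knobs through a PARAMETERS clause. A sketch of the HNSW variant; the NEIGHBORS and EFCONSTRUCTION values are illustrative starting points, not tuned recommendations:

-- HNSW index with explicit graph-construction parameters
CREATE VECTOR INDEX product_hnsw_tuned_idx ON products (description_vector)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 95
PARAMETERS (TYPE HNSW, NEIGHBORS 32, EFCONSTRUCTION 300);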
Step 4: Query Data with Similarity Search
Oracle AI Vector Search uses native SQL with specialized functions for similarity searches. You can combine semantic search with traditional relational queries for powerful hybrid searches.
Basic Similarity Search:
-- Find products similar to a search query
SELECT
product_name,
description,
VECTOR_DISTANCE(
description_vector,
VECTOR_EMBEDDING(doc_model USING 'wireless audio device' AS data),
COSINE
) AS similarity_score
FROM products
ORDER BY similarity_score
FETCH FIRST 5 ROWS ONLY;
Using Shorthand Operators:
Oracle provides convenient distance operators:
-- <=> is the cosine distance operator
SELECT product_name, description
FROM products
ORDER BY description_vector <=>
VECTOR_EMBEDDING(doc_model USING 'wireless audio device' AS data)
FETCH FIRST 5 ROWS ONLY;
-- <-> is the Euclidean distance operator
SELECT product_name
FROM products
ORDER BY description_vector <-> :query_vector
FETCH FIRST 10 ROWS ONLY;
Approximate vs. Exact Search:
-- Approximate search (the APPROX keyword lets the optimizer use a vector index)
SELECT product_name
FROM products
ORDER BY VECTOR_DISTANCE(description_vector, :query_vec, COSINE)
FETCH APPROX FIRST 10 ROWS ONLY;
-- Exact search (no index, 100% accurate, slower)
SELECT product_name
FROM products
ORDER BY VECTOR_DISTANCE(description_vector, :query_vec, COSINE)
FETCH EXACT FIRST 10 ROWS ONLY;
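To verify that the approximate form really picked up the vector index, check the execution plan. A minimal sketch using the standard DBMS_XPLAN utilities (plan output shape varies by version and index type):

EXPLAIN PLAN FOR
SELECT product_name
FROM products
ORDER BY VECTOR_DISTANCE(description_vector, :query_vec, COSINE)
FETCH APPROX FIRST 10 ROWS ONLY;

SELECT plan_table_output FROM TABLE(DBMS_XPLAN.DISPLAY);
-- Look for a vector-index operation (e.g., an HNSW index scan) in the plan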
Combining Semantic and Relational Searches:
This is where Oracle AI Vector Search truly shines—combining semantic similarity with business logic in a single query:
-- Find similar products within a specific price range and category
SELECT
product_name,
price,
category,
VECTOR_DISTANCE(description_vector, :query_vec, COSINE) AS similarity
FROM products
WHERE price BETWEEN 100 AND 500
AND category = 'Electronics'
ORDER BY similarity
FETCH FIRST 10 ROWS ONLY;
Advanced Hybrid Query Example:
-- Semantic search with business filters and aggregation
SELECT
p.category,
COUNT(*) AS matching_products,
AVG(p.price) AS avg_price,
MIN(VECTOR_DISTANCE(p.description_vector, :search_vec, COSINE)) AS best_match
FROM products p
WHERE p.price < 1000
AND p.in_stock = 'Y'
GROUP BY p.category
HAVING COUNT(*) > 5
ORDER BY best_match
FETCH FIRST 3 ROWS ONLY;
Distance Metrics Available:
Oracle supports multiple distance metrics for different use cases:
| Metric | Function | Best For | Notes |
|---|---|---|---|
| COSINE | COSINE_DISTANCE() or <=> | Text embeddings | Measures angular similarity |
| EUCLIDEAN | L2_DISTANCE() or <-> | General purpose | Straight-line distance |
| EUCLIDEAN_SQUARED | VECTOR_DISTANCE(v1, v2, EUCLIDEAN_SQUARED) | When only ranking matters | Faster (no sqrt) |
| DOT | INNER_PRODUCT() | Normalized vectors | Efficient for unit vectors |
| MANHATTAN | L1_DISTANCE() | Specific applications | Sum of absolute differences |
| HAMMING | VECTOR_DISTANCE(v1, v2, HAMMING) | Binary vectors | Counts differing bits |
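Since the metric is just the third argument to VECTOR_DISTANCE, it is easy to compare metrics side by side. A toy sketch with hand-made 2-dimensional vectors:

-- Same pair of vectors, three metrics; expected results in comments
SELECT
    VECTOR_DISTANCE(TO_VECTOR('[3, 4]'), TO_VECTOR('[0, 0]'), EUCLIDEAN)         AS l2,   -- 5
    VECTOR_DISTANCE(TO_VECTOR('[3, 4]'), TO_VECTOR('[0, 0]'), MANHATTAN)         AS l1,   -- 7
    VECTOR_DISTANCE(TO_VECTOR('[3, 4]'), TO_VECTOR('[0, 0]'), EUCLIDEAN_SQUARED) AS l2sq  -- 25
FROM dual;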
Complete Workflow Summary
Here's how all the pieces fit together:
1. Generate Vector Embeddings
- Load ONNX embedding model into database
- OR use external APIs/services
- Convert text, images, or files into vector representations
- Use chunking for large documents
2. Store Vector Embeddings
- Create tables with VECTOR data type columns
- Store vectors alongside business data
- Maintain data integrity and consistency
3. Create Vector Indexes
- Choose appropriate index type (HNSW, IVF, or Hybrid)
- Configure memory and accuracy parameters
- Optimize for your dataset size and query patterns
4. Query with Similarity Search
- Use native SQL with the VECTOR_DISTANCE() function
- Combine semantic and relational searches
- Apply business filters and aggregations
- Leverage approximate search for speed or exact search for accuracy
End-to-End Example: Customer Support Knowledge Base
Let's walk through a complete real-world example of building a semantic search system for customer support articles.
Step 1: Set Up and Load Model
-- Create directory and load embedding model
CREATE DIRECTORY model_dir AS '/opt/oracle/models';
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    'MODEL_DIR',
    'all-MiniLM-L12-v2.onnx',
    'support_model'
  );
END;
/
Step 2: Create Table Structure
CREATE TABLE support_articles (
article_id NUMBER PRIMARY KEY,
title VARCHAR2(500),
content CLOB,
category VARCHAR2(100),
created_date DATE,
views NUMBER,
helpful_votes NUMBER,
-- Vector column: 384 dimensions for all-MiniLM-L12-v2 model
content_vector VECTOR(384, FLOAT32)
);
Step 3: Load Data and Generate Embeddings
-- Insert articles with automatic embedding generation
INSERT INTO support_articles (article_id, title, content, category, content_vector)
VALUES (
1,
'How to Reset Your Password',
'To reset your password, navigate to the login page and click "Forgot Password"...',
'Account Management',
VECTOR_EMBEDDING(support_model USING
'To reset your password, navigate to the login page and click "Forgot Password"...'
AS data)
);
-- Or update existing data in batch
UPDATE support_articles
SET content_vector = VECTOR_EMBEDDING(support_model USING content AS data)
WHERE content_vector IS NULL;
Step 4: Create Vector Index
-- Create HNSW index for fast similarity search
CREATE VECTOR INDEX support_vec_idx ON support_articles (content_vector)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 95;
Step 5: Perform Semantic Searches
-- User searches: "can't log into my account"
SELECT
article_id,
title,
category,
VECTOR_DISTANCE(
content_vector,
VECTOR_EMBEDDING(support_model USING 'cannot log into my account' AS data),
COSINE
) AS relevance_score
FROM support_articles
ORDER BY relevance_score
FETCH FIRST 5 ROWS ONLY;
Expected Results:
Even though the search query doesn't contain exact words like "reset" or "password," the semantic search will return relevant articles about:
- Password reset procedures
- Account recovery options
- Login troubleshooting
- Authentication issues
Advanced Query with Business Logic:
-- Find relevant articles, prioritizing popular and recent ones
SELECT
article_id,
title,
category,
views,
created_date,
VECTOR_DISTANCE(content_vector, :query_vec, COSINE) AS semantic_score
FROM support_articles
WHERE category IN ('Account Management', 'Security', 'Login Issues')
AND created_date > ADD_MONTHS(SYSDATE, -12)
ORDER BY
semantic_score ASC, -- Most semantically relevant
helpful_votes DESC, -- Most helpful
views DESC -- Most viewed
FETCH FIRST 10 ROWS ONLY;
Document Processing: Converting Files to Searchable Vectors
Oracle provides powerful utilities for processing various file formats:
Processing PDFs and Documents
-- Create table for source documents
CREATE TABLE source_docs (
doc_id NUMBER PRIMARY KEY,
filename VARCHAR2(500),
file_content BLOB
);
-- Create table for chunked, vectorized content
CREATE TABLE doc_chunks (
chunk_id NUMBER PRIMARY KEY,
doc_id NUMBER,
chunk_offset NUMBER,
chunk_length NUMBER,
chunk_text VARCHAR2(4000),
chunk_vector VECTOR(384, FLOAT32)
);
-- Process: PDF → Text → Chunks → Vectors (all in one statement)
INSERT INTO doc_chunks (chunk_id, doc_id, chunk_offset, chunk_length, chunk_text, chunk_vector)
SELECT
ROW_NUMBER() OVER (ORDER BY d.doc_id,
JSON_VALUE(c.column_value, '$.chunk_offset' RETURNING NUMBER)),
d.doc_id,
JSON_VALUE(c.column_value, '$.chunk_offset' RETURNING NUMBER),
JSON_VALUE(c.column_value, '$.chunk_length' RETURNING NUMBER),
JSON_VALUE(c.column_value, '$.chunk_data'),
VECTOR_EMBEDDING(doc_model USING
JSON_VALUE(c.column_value, '$.chunk_data') AS data)
FROM source_docs d,
DBMS_VECTOR_CHAIN.UTL_TO_CHUNKS(
DBMS_VECTOR_CHAIN.UTL_TO_TEXT(d.file_content),
JSON('{"by":"words", "max":"100", "overlap":"20",
"split":"recursively", "language":"american", "normalize":"all"}')
) c;
This single SQL statement:
- Converts PDFs to plain text using UTL_TO_TEXT
- Splits text into 100-word chunks with 20-word overlap using UTL_TO_CHUNKS
- Generates vector embeddings for each chunk using VECTOR_EMBEDDING
- Stores everything in the database
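After the load, a quick sanity check confirms every document produced chunks of the expected size (a small sketch against the doc_chunks table defined above):

-- Verify chunk counts and sizes per document
SELECT doc_id,
       COUNT(*)          AS chunk_count,
       MIN(chunk_length) AS min_len,
       MAX(chunk_length) AS max_len
FROM doc_chunks
GROUP BY doc_id
ORDER BY doc_id;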
Searching Across Document Chunks
-- Find the most relevant document chunks for a user query
SELECT
dc.doc_id,
sd.filename,
dc.chunk_text,
VECTOR_DISTANCE(dc.chunk_vector,
VECTOR_EMBEDDING(doc_model USING 'database performance tuning' AS data),
COSINE
) AS relevance
FROM doc_chunks dc
JOIN source_docs sd ON dc.doc_id = sd.doc_id
ORDER BY relevance
FETCH FIRST 3 ROWS ONLY;
Optional: Retrieval Augmented Generation (RAG)
While not part of the core vector search workflow, many applications extend it with RAG to generate AI-powered responses.
RAG Workflow:
- User submits a natural language query
- Convert query to vector embedding
- Use similarity search to find relevant content (top-k chunks)
- Combine retrieved content with user's original question
- Send augmented prompt to LLM
- LLM generates response grounded in retrieved data
Simple RAG Example:
-- Retrieve relevant context for RAG
WITH relevant_chunks AS (
SELECT chunk_text
FROM doc_chunks
ORDER BY VECTOR_DISTANCE(
chunk_vector,
VECTOR_EMBEDDING(doc_model USING :user_question AS data),
COSINE
)
FETCH FIRST 3 ROWS ONLY
)
SELECT LISTAGG(chunk_text, ' ') AS context
FROM relevant_chunks;
-- This context is then combined with the user's question
-- and sent to an LLM (like OCI Generative AI, OpenAI, etc.)
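If you want the generation step inside the database as well, DBMS_VECTOR_CHAIN exposes UTL_TO_GENERATE_TEXT for calling an LLM provider. A heavily hedged sketch; the credential name, endpoint URL, and model below are placeholders you must replace, and the parameter shape should be checked against your release:

-- Placeholder credential/URL/model values; verify parameters for your release
SELECT DBMS_VECTOR_CHAIN.UTL_TO_GENERATE_TEXT(
           'Answer using only this context: ' || :context ||
           ' Question: ' || :user_question,
           JSON('{"provider": "ocigenai",
                  "credential_name": "OCI_CRED",
                  "url": "https://<your-genai-endpoint>/generateText",
                  "model": "cohere.command"}')
       ) AS llm_answer
FROM dual;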
Best Practices
1. Chunking Strategy
- Use 20-30% overlap to prevent context loss at chunk boundaries
- Choose chunk size based on embedding model limits and content type
- Smaller chunks (50-100 words) for precise matching
- Larger chunks (200-300 words) for broader context
2. Embedding Model Selection
- For general text: all-MiniLM-L12-v2 or all-mpnet-base-v2
- For code: CodeBERT or GraphCodeBERT
- For multilingual: multilingual-e5-large
- For images: CLIP or ResNet
3. Index Configuration
- Set TARGET ACCURACY between 90 and 99 (higher = slower but more accurate)
- Allocate sufficient memory for HNSW indexes
- Monitor index size and rebuild periodically for IVF indexes
4. Query Optimization
- Use FETCH FIRST n ROWS ONLY instead of ROWNUM for better performance
- Combine vector search with filters early in the query
- Use bind variables for query vectors to enable plan caching (see the sketch after this list)
- Consider using approximate search for most queries, exact search only when necessary
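One hedged way to bind a query vector from SQL*Plus is to serialize it once into a CLOB bind variable with FROM_VECTOR and rehydrate it with TO_VECTOR in each query (variable names are illustrative):

-- Embed the search text once and keep the result in a bind variable
VARIABLE query_vec CLOB
BEGIN
  SELECT FROM_VECTOR(
           VECTOR_EMBEDDING(doc_model USING 'wireless audio device' AS data)
           RETURNING CLOB)
  INTO :query_vec
  FROM dual;
END;
/

-- Reuse the same bind variable across queries so plans stay shareable
SELECT product_name
FROM products
ORDER BY VECTOR_DISTANCE(description_vector, TO_VECTOR(:query_vec), COSINE)
FETCH APPROX FIRST 10 ROWS ONLY;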
5. Vector Format Selection
- FLOAT32: Best balance for most applications (default choice)
- INT8: Use when memory is constrained and slight accuracy loss is acceptable
- FLOAT64: Rarely needed, only for highest precision requirements
Real-World Use Cases
- E-commerce Product Search: Find products based on natural language descriptions
- Customer Support: Match customer questions to relevant help articles
- Document Management: Search enterprise documents by meaning, not just keywords
- Fraud Detection: Identify transactions similar to known fraud patterns
- Recommendation Engines: Suggest products, content, or services based on similarity
- Image Search: Find visually similar products or detect duplicate images
- Code Search: Search codebases using natural language queries
Performance Considerations
Memory Requirements for HNSW
Estimate memory needed:
Memory ≈ 1.3 × vector_format_size × dimensions × row_count
Example:
1,000,000 vectors × 384 dimensions × 4 bytes (FLOAT32) ≈ 1.5 GB raw; × 1.3 overhead ≈ 2 GB
Query Performance
Typical Performance:
- Exact search: Seconds to hours (depending on dataset size)
- Approximate search with HNSW: Milliseconds to seconds
- Approximate search with IVF: Seconds (for billion-scale datasets)
Monitoring and Tuning
-- Check vector index details
SELECT index_name, index_type, accuracy
FROM user_vector_indexes;
-- Monitor vector memory usage
SELECT pool, alloc_bytes, used_bytes
FROM v$vector_memory_pool;
Oracle AI Vector Search provides a complete, integrated solution for semantic search within the database. The workflow—generate, store, index, and query—enables powerful AI applications without the complexity of managing separate vector databases.
Key Advantages:
- Unified System: Vectors and business data in one database
- Native SQL: Use familiar SQL syntax with vector extensions
- Enterprise Features: Security, availability, and performance features apply to vectors
- Flexible Deployment: Choose in-database or external embedding generation
- Scalable: From thousands to billions of vectors
- Hybrid Queries: Combine semantic search with relational filters
By keeping AI algorithms where the data lives, Oracle Database 23ai eliminates data fragmentation, reduces complexity, and provides enterprise-grade security and performance for your AI-powered applications.