Oracle Database 23ai introduces AI Vector Search, a powerful feature that enables semantic search capabilities directly within the database. Unlike traditional keyword-based searches, vector search understands the meaning and context behind your queries, allowing you to find relevant information even when exact keywords don't match.
This guide walks through the complete Oracle AI Vector Search workflow, from generating vector embeddings to performing sophisticated similarity searches.
What Are Vector Embeddings?
Vector embeddings are mathematical representations that capture the semantic meaning of your data. They convert unstructured data like text, images, or audio into numerical vectors that can be mathematically compared.
Key Properties:
- Each embedding is an array of numbers (typically 128 to 1536 dimensions)
- Semantically similar content has vectors that are close together in vector space
- Distance between vectors indicates similarity in meaning
For example, the vectors for "apple" and "orange" are closer together than the vectors for "apple" and "dog" because fruits share more semantic similarity than a fruit and an animal.
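You can see this numerically in SQL. Here is a minimal sketch using hand-written 3-dimensional vectors; real embeddings have hundreds of dimensions and come from a model, so the values below are purely illustrative:

-- Two nearby vectors vs. two unrelated ones (toy values)
SELECT
    VECTOR_DISTANCE(TO_VECTOR('[1, 2, 3]'),
                    TO_VECTOR('[1.1, 2.0, 2.9]'), COSINE) AS close_pair,
    VECTOR_DISTANCE(TO_VECTOR('[1, 2, 3]'),
                    TO_VECTOR('[9, -1, 0.5]'), COSINE) AS distant_pair
FROM dual;
-- close_pair comes back near 0; distant_pair is noticeably larger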
The Complete Oracle AI Vector Search Workflow
The workflow consists of four primary steps that transform raw data into searchable semantic knowledge:
Step 1: Generate Vector Embeddings
You can generate vector embeddings either outside or within Oracle Database, depending on your requirements.
Option A: Generate Embeddings Within Oracle Database (ONNX)
This is the recommended approach for integrated, secure workflows. Oracle Database 23ai supports loading ONNX (Open Neural Network Exchange) format embedding models directly into the database.
Loading an ONNX Model:
-- Load the embedding model into the database
-- A multi-line call is easiest to run as an anonymous PL/SQL block
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    'DM_DUMP',
    'all_MiniLM_L12_v2.onnx',
    'doc_model'
  );
END;
/
Note: The directory 'DM_DUMP' must exist and point to the OS directory containing your ONNX model file.
Generating Embeddings from Text:
-- Generate a vector embedding from a text string
SELECT TO_VECTOR(
VECTOR_EMBEDDING(doc_model USING 'hello world' AS data)
) AS embedding;
Common Embedding Generation Patterns:
- Convert Text String to Embedding:
-- Simple text to vector
SELECT VECTOR_EMBEDDING(doc_model USING 'semantic search' AS data) AS vec;
- Convert File to Text to Chunks to Embeddings:
-- Load PDF, convert to text, chunk, and embed
INSERT INTO doc_chunks (doc_id, chunk_id, chunk_data, embedding)
SELECT
d.id,
JSON_VALUE(c.column_value, '$.chunk_id' RETURNING NUMBER),
JSON_VALUE(c.column_value, '$.chunk_data'),
VECTOR_EMBEDDING(doc_model USING
JSON_VALUE(c.column_value, '$.chunk_data') AS data)
FROM documents d,
DBMS_VECTOR_CHAIN.UTL_TO_CHUNKS(
DBMS_VECTOR_CHAIN.UTL_TO_TEXT(d.pdf_content),
JSON('{"by":"words", "max":"100", "overlap":"20",
"split":"recursively", "language":"american"}')
) c;
- Convert Files Directly to Embeddings:
-- For images or other media files
SELECT VECTOR_EMBEDDING(
image_model USING file_content AS data
) AS image_vec
FROM media_files;
- End-to-End Pipeline (Chainable Functions):
Oracle's DBMS_VECTOR_CHAIN package provides chainable utility functions for complete data transformation pipelines:
-- Complete pipeline: PDF → Text → Chunks → Embeddings
-- UTL_TO_EMBEDDINGS returns rows of JSON; JSON_TABLE unpacks each one
INSERT INTO vectorized_docs
SELECT
    d.doc_id,
    et.embed_id,
    et.embed_data,
    TO_VECTOR(et.embed_vector)
FROM source_documents d,
     DBMS_VECTOR_CHAIN.UTL_TO_EMBEDDINGS(
         DBMS_VECTOR_CHAIN.UTL_TO_CHUNKS(
             DBMS_VECTOR_CHAIN.UTL_TO_TEXT(d.pdf_blob),
             JSON('{"by":"words", "max":"100", "overlap":"20"}')
         ),
         JSON('{"provider":"database", "model":"doc_model"}')
     ) t,
     JSON_TABLE(t.column_value, '$[*]'
         COLUMNS (
             embed_id     NUMBER         PATH '$.embed_id',
             embed_data   VARCHAR2(4000) PATH '$.embed_data',
             embed_vector CLOB           PATH '$.embed_vector'
         )
     ) et;
Understanding Document Chunking:
When working with large documents, you need to split them into smaller chunks because:
- Embedding models have input size limitations (typically 512-4000 characters)
- Smaller chunks provide more precise semantic matching
- Better retrieval accuracy in similarity searches
Chunking Parameters:
- by: Chunking method (words, sentences, paragraphs, vocabulary)
- max: Maximum size per chunk (e.g., 100 words)
- overlap: Number of overlapping units between chunks (prevents context loss at boundaries)
- split: Splitting strategy (recursively, custom)
- language: Language-specific rules for sentence boundaries
- normalize: Text normalization options
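Before wiring these parameters into a pipeline, you can call UTL_TO_CHUNKS on its own to see how a string gets split. A small sketch (the sample sentence is made up):

-- Inspect chunking output; each row is a JSON document containing
-- chunk_id, chunk_offset, chunk_length, and chunk_data fields
SELECT c.column_value AS chunk_json
FROM DBMS_VECTOR_CHAIN.UTL_TO_CHUNKS(
         TO_CLOB('Oracle AI Vector Search stores embeddings next to business data ' ||
                 'so you can query both with plain SQL in a single statement.'),
         JSON('{"by":"words", "max":"10", "overlap":"2", "split":"recursively"}')
     ) c;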
Option B: Generate Embeddings Outside Oracle Database
You can also generate embeddings externally using third-party services or local models, then load them into Oracle Database.
External Generation Options:
- Cloud APIs: OpenAI, Cohere, Google Vertex AI, Hugging Face
- Local Models: Python libraries (sentence-transformers, transformers)
- Self-hosted Services: ONNX Runtime, TensorFlow Serving
Example (Python):
from sentence_transformers import SentenceTransformer
import array
import oracledb

# Generate embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')
text = "Oracle AI Vector Search"
embedding = model.encode(text)

# Store in Oracle Database (the embedding column must be of type VECTOR)
connection = oracledb.connect(user="user", password="pwd", dsn="localhost/FREEPDB1")
cursor = connection.cursor()
cursor.execute(
    "INSERT INTO documents (id, text, embedding) VALUES (:1, :2, :3)",
    [1, text, array.array("f", embedding)]  # bind as a float32 vector
)
connection.commit()
Advantages of In-Database Generation:
- No data movement between systems
- Reduced latency
- Simplified architecture
- Automatic consistency between data and embeddings
- Enterprise security and governance apply to the entire pipeline
Advantages of External Generation:
- Flexibility in model choice
- Access to latest models and APIs
- Can use specialized or proprietary models
- Independent scaling of embedding generation
Step 2: Store Vector Embeddings
Store vector embeddings in columns with the VECTOR data type within standard relational tables. This allows you to keep business data and semantic vectors together in a unified system.
Creating Tables with Vector Columns:
-- Create table with VECTOR data type
CREATE TABLE products (
product_id NUMBER PRIMARY KEY,
product_name VARCHAR2(200),
description CLOB,
price NUMBER(10,2),
category VARCHAR2(100),
-- Vector column with 384 dimensions, FLOAT32 format
description_vector VECTOR(384, FLOAT32)
);
Supported Vector Formats:
- INT8: 8-bit integers (memory efficient, slight accuracy trade-off)
- FLOAT32: 32-bit floating-point (most common, balanced)
- FLOAT64: 64-bit floating-point (highest precision, more storage)
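The format is declared per column, so a single table can mix them. A quick sketch of the declaration options (table and column names are illustrative; VECTOR with no arguments accepts any dimension count and format):

CREATE TABLE format_demo (
    v_flex VECTOR,                 -- flexible: any dimensions, any format
    v_int8 VECTOR(384, INT8),      -- compact, slight accuracy trade-off
    v_f32  VECTOR(384, FLOAT32),   -- the common default
    v_f64  VECTOR(768, FLOAT64)    -- maximum precision, double the storage
);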
Inserting Data with Vectors:
-- Insert with inline embedding generation
INSERT INTO products (product_id, product_name, description, description_vector)
VALUES (
1001,
'Wireless Headphones',
'Noise-cancelling Bluetooth headphones with 30-hour battery',
VECTOR_EMBEDDING(doc_model USING
'Noise-cancelling Bluetooth headphones with 30-hour battery' AS data)
);
-- Or insert data first, then update vectors
INSERT INTO products (product_id, product_name, description, price)
VALUES (1002, 'Smart Watch', 'Fitness tracker with heart rate monitor', 299.99);
UPDATE products
SET description_vector = VECTOR_EMBEDDING(doc_model USING description AS data)
WHERE product_id = 1002;
Step 3: Create Vector Indexes
For large vector datasets, creating vector indexes is essential for performance. Without indexes, similarity searches require comparing the query vector against every vector in the table (exhaustive search), which becomes slow as data grows.
Why Vector Indexes Matter:
An exhaustive similarity search is accurate but can be slow because it calculates distance to every vector in the column. Vector indexes use Approximate Nearest Neighbor (ANN) algorithms to dramatically reduce the number of distance calculations while maintaining high accuracy (typically 90-99%).
Performance Impact:
- Without Index (Exhaustive): O(n) complexity - checks every vector
- With Index (Approximate): O(log n) complexity - checks subset of vectors
- Real-world improvement: Searches that take hours can complete in milliseconds
Types of Vector Indexes in Oracle 23ai:
1. HNSW (Hierarchical Navigable Small World) - In-Memory Index:
CREATE VECTOR INDEX product_hnsw_idx ON products (description_vector)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 95;
Characteristics:
- Stored in Vector Memory Pool (part of SGA)
- Extremely fast query performance
- Best for datasets that fit in available memory
- Requires configuring the VECTOR_MEMORY_SIZE parameter
Configure Vector Memory:
ALTER SYSTEM SET VECTOR_MEMORY_SIZE = 2G SCOPE=BOTH;
2. IVF (Inverted File Flat) - Storage-Based Index:
CREATE VECTOR INDEX product_ivf_idx ON products (description_vector)
ORGANIZATION NEIGHBOR PARTITIONS
DISTANCE COSINE
WITH TARGET ACCURACY 95;
Characteristics:
- Stored on disk (not memory-constrained)
- Excellent for very large datasets
- Uses clustering to partition vectors by similarity
- Supports both global and local partitioning
3. Hybrid Vector Index - Combined Text and Semantic Search:
CREATE HYBRID VECTOR INDEX product_hybrid_idx ON products (description)
PARAMETERS('MODEL doc_model');
Characteristics:
- Combines full-text search and vector similarity
- Single index for keyword and semantic searches
- Ideal for mixed query types
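Queries against a hybrid index go through the DBMS_HYBRID_VECTOR package rather than plain VECTOR_DISTANCE. A hedged sketch, assuming the index above exists; the exact JSON search-spec field names are assumptions to verify against your release's documentation:

-- Illustrative hybrid search combining keyword and semantic scoring;
-- the JSON fields shown here are assumptions to verify for your release
SELECT DBMS_HYBRID_VECTOR.SEARCH(
           JSON('{"hybrid_index_name": "product_hybrid_idx",
                  "search_text": "wireless audio device",
                  "return": {"topN": 5}}')
       ) AS results
FROM dual;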
Choosing the Right Index:
- HNSW: When dataset fits in memory and you need maximum speed
- IVF: When dataset is very large or memory is limited
- Hybrid: When you need both keyword and semantic search capabilities
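Both in-memory and storage-based indexes also accept finer-grained tuning knobs through a PARAMETERS clause. A sketch of the HNSW variant; the NEIGHBORS and EFCONSTRUCTION values are illustrative starting points, not tuned recommendations:

-- HNSW index with explicit graph-construction parameters
CREATE VECTOR INDEX product_hnsw_tuned_idx ON products (description_vector)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 95
PARAMETERS (TYPE HNSW, NEIGHBORS 32, EFCONSTRUCTION 300);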
Step 4: Query Data with Similarity Search
Oracle AI Vector Search uses native SQL with specialized functions for similarity searches. You can combine semantic search with traditional relational queries for powerful hybrid searches.
Basic Similarity Search:
-- Find products similar to a search query
SELECT
product_name,
description,
VECTOR_DISTANCE(
description_vector,
VECTOR_EMBEDDING(doc_model USING 'wireless audio device' AS data),
COSINE
) AS similarity_score
FROM products
ORDER BY similarity_score
FETCH FIRST 5 ROWS ONLY;
Using Shorthand Operators:
Oracle provides convenient distance operators:
-- <=> is the cosine distance operator
SELECT product_name, description
FROM products
ORDER BY description_vector <=>
VECTOR_EMBEDDING(doc_model USING 'wireless audio device' AS data)
FETCH FIRST 5 ROWS ONLY;
-- <-> is the Euclidean distance operator
SELECT product_name
FROM products
ORDER BY description_vector <-> :query_vector
FETCH FIRST 10 ROWS ONLY;
Approximate vs. Exact Search:
-- Approximate search (the APPROX keyword lets the optimizer use a vector index)
SELECT product_name
FROM products
ORDER BY VECTOR_DISTANCE(description_vector, :query_vec, COSINE)
FETCH APPROX FIRST 10 ROWS ONLY;
-- Exact search (no index, 100% accurate, slower)
SELECT product_name
FROM products
ORDER BY VECTOR_DISTANCE(description_vector, :query_vec, COSINE)
FETCH EXACT FIRST 10 ROWS ONLY;
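To verify that the approximate form really picked up the vector index, check the execution plan. A minimal sketch using the standard DBMS_XPLAN utilities (plan output shape varies by version and index type):

EXPLAIN PLAN FOR
SELECT product_name
FROM products
ORDER BY VECTOR_DISTANCE(description_vector, :query_vec, COSINE)
FETCH APPROX FIRST 10 ROWS ONLY;

SELECT plan_table_output FROM TABLE(DBMS_XPLAN.DISPLAY);
-- Look for a vector-index operation (e.g., an HNSW index scan) in the plan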
Combining Semantic and Relational Searches:
This is where Oracle AI Vector Search truly shines—combining semantic similarity with business logic in a single query:
-- Find similar products within a specific price range and category
SELECT
product_name,
price,
category,
VECTOR_DISTANCE(description_vector, :query_vec, COSINE) AS similarity
FROM products
WHERE price BETWEEN 100 AND 500
AND category = 'Electronics'
ORDER BY similarity
FETCH FIRST 10 ROWS ONLY;
Advanced Hybrid Query Example:
-- Semantic search with business filters and aggregation
SELECT
p.category,
COUNT(*) AS matching_products,
AVG(p.price) AS avg_price,
MIN(VECTOR_DISTANCE(p.description_vector, :search_vec, COSINE)) AS best_match
FROM products p
WHERE p.price < 1000
AND p.in_stock = 'Y'
GROUP BY p.category
HAVING COUNT(*) > 5
ORDER BY best_match
FETCH FIRST 3 ROWS ONLY;
Distance Metrics Available:
Oracle supports multiple distance metrics for different use cases:
| Metric | Function | Best For | Notes |
|---|---|---|---|
| COSINE | COSINE_DISTANCE() or <=> | Text embeddings | Measures angular similarity |
| EUCLIDEAN | L2_DISTANCE() or <-> | General purpose | Straight-line distance |
| EUCLIDEAN_SQUARED | VECTOR_DISTANCE(v1, v2, EUCLIDEAN_SQUARED) | When only ranking matters | Faster (no sqrt) |
| DOT | INNER_PRODUCT() | Normalized vectors | Efficient for unit vectors |
| MANHATTAN | L1_DISTANCE() | Specific applications | Sum of absolute differences |
| HAMMING | VECTOR_DISTANCE(v1, v2, HAMMING) | Binary vectors | Counts differing bits |
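Since the metric is just the third argument to VECTOR_DISTANCE, it is easy to compare metrics side by side. A toy sketch with hand-made 2-dimensional vectors:

-- Same pair of vectors, three metrics; expected results in comments
SELECT
    VECTOR_DISTANCE(TO_VECTOR('[3, 4]'), TO_VECTOR('[0, 0]'), EUCLIDEAN)         AS l2,   -- 5
    VECTOR_DISTANCE(TO_VECTOR('[3, 4]'), TO_VECTOR('[0, 0]'), MANHATTAN)         AS l1,   -- 7
    VECTOR_DISTANCE(TO_VECTOR('[3, 4]'), TO_VECTOR('[0, 0]'), EUCLIDEAN_SQUARED) AS l2sq  -- 25
FROM dual;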
Complete Workflow Summary
Here's how all the pieces fit together:
1. Generate Vector Embeddings
- Load ONNX embedding model into database
- OR use external APIs/services
- Convert text, images, or files into vector representations
- Use chunking for large documents
2. Store Vector Embeddings
- Create tables with VECTOR data type columns
- Store vectors alongside business data
- Maintain data integrity and consistency
3. Create Vector Indexes
- Choose appropriate index type (HNSW, IVF, or Hybrid)
- Configure memory and accuracy parameters
- Optimize for your dataset size and query patterns
4. Query with Similarity Search
- Use native SQL with the VECTOR_DISTANCE() function
- Combine semantic and relational searches
- Apply business filters and aggregations
- Leverage approximate search for speed or exact search for accuracy
End-to-End Example: Customer Support Knowledge Base
Let's walk through a complete real-world example of building a semantic search system for customer support articles.
Step 1: Set Up and Load Model
-- Create directory and load embedding model
CREATE DIRECTORY model_dir AS '/opt/oracle/models';
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    'MODEL_DIR',
    'all-MiniLM-L12-v2.onnx',
    'support_model'
  );
END;
/
Step 2: Create Table Structure
CREATE TABLE support_articles (
article_id NUMBER PRIMARY KEY,
title VARCHAR2(500),
content CLOB,
category VARCHAR2(100),
created_date DATE,
views NUMBER,
helpful_votes NUMBER,
-- Vector column: 384 dimensions for all-MiniLM-L12-v2 model
content_vector VECTOR(384, FLOAT32)
);
Step 3: Load Data and Generate Embeddings
-- Insert articles with automatic embedding generation
INSERT INTO support_articles (article_id, title, content, category, content_vector)
VALUES (
1,
'How to Reset Your Password',
'To reset your password, navigate to the login page and click "Forgot Password"...',
'Account Management',
VECTOR_EMBEDDING(support_model USING
'To reset your password, navigate to the login page and click "Forgot Password"...'
AS data)
);
-- Or update existing data in batch
UPDATE support_articles
SET content_vector = VECTOR_EMBEDDING(support_model USING content AS data)
WHERE content_vector IS NULL;
Step 4: Create Vector Index
-- Create HNSW index for fast similarity search
CREATE VECTOR INDEX support_vec_idx ON support_articles (content_vector)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 95;
Step 5: Perform Semantic Searches
-- User searches: "can't log into my account"
SELECT
article_id,
title,
category,
VECTOR_DISTANCE(
content_vector,
VECTOR_EMBEDDING(support_model USING 'cannot log into my account' AS data),
COSINE
) AS relevance_score
FROM support_articles
ORDER BY relevance_score
FETCH FIRST 5 ROWS ONLY;
Expected Results:
Even though the search query doesn't contain exact words like "reset" or "password," the semantic search will return relevant articles about:
- Password reset procedures
- Account recovery options
- Login troubleshooting
- Authentication issues
Advanced Query with Business Logic:
-- Find relevant articles, prioritizing popular and recent ones
SELECT
article_id,
title,
category,
views,
created_date,
VECTOR_DISTANCE(content_vector, :query_vec, COSINE) AS semantic_score
FROM support_articles
WHERE category IN ('Account Management', 'Security', 'Login Issues')
AND created_date > ADD_MONTHS(SYSDATE, -12)
ORDER BY
semantic_score ASC, -- Most semantically relevant
helpful_votes DESC, -- Most helpful
views DESC -- Most viewed
FETCH FIRST 10 ROWS ONLY;
Document Processing: Converting Files to Searchable Vectors
Oracle provides powerful utilities for processing various file formats:
Processing PDFs and Documents
-- Create table for source documents
CREATE TABLE source_docs (
doc_id NUMBER PRIMARY KEY,
filename VARCHAR2(500),
file_content BLOB
);
-- Create table for chunked, vectorized content
CREATE TABLE doc_chunks (
chunk_id NUMBER PRIMARY KEY,
doc_id NUMBER,
chunk_offset NUMBER,
chunk_length NUMBER,
chunk_text VARCHAR2(4000),
chunk_vector VECTOR(384, FLOAT32)
);
-- Process: PDF → Text → Chunks → Vectors (all in one statement)
INSERT INTO doc_chunks (chunk_id, doc_id, chunk_offset, chunk_length, chunk_text, chunk_vector)
SELECT
ROW_NUMBER() OVER (ORDER BY d.doc_id,
JSON_VALUE(c.column_value, '$.chunk_offset' RETURNING NUMBER)),
d.doc_id,
JSON_VALUE(c.column_value, '$.chunk_offset' RETURNING NUMBER),
JSON_VALUE(c.column_value, '$.chunk_length' RETURNING NUMBER),
JSON_VALUE(c.column_value, '$.chunk_data'),
VECTOR_EMBEDDING(doc_model USING
JSON_VALUE(c.column_value, '$.chunk_data') AS data)
FROM source_docs d,
DBMS_VECTOR_CHAIN.UTL_TO_CHUNKS(
DBMS_VECTOR_CHAIN.UTL_TO_TEXT(d.file_content),
JSON('{"by":"words", "max":"100", "overlap":"20",
"split":"recursively", "language":"american", "normalize":"all"}')
) c;
This single SQL statement:
- Converts PDFs to plain text using UTL_TO_TEXT
- Splits text into 100-word chunks with 20-word overlap using UTL_TO_CHUNKS
- Generates vector embeddings for each chunk using VECTOR_EMBEDDING
- Stores everything in the database
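After the load, a quick sanity check confirms every document produced chunks of the expected size (a small sketch against the doc_chunks table defined above):

-- Verify chunk counts and sizes per document
SELECT doc_id,
       COUNT(*)          AS chunk_count,
       MIN(chunk_length) AS min_len,
       MAX(chunk_length) AS max_len
FROM doc_chunks
GROUP BY doc_id
ORDER BY doc_id;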
Searching Across Document Chunks
-- Find the most relevant document chunks for a user query
SELECT
dc.doc_id,
sd.filename,
dc.chunk_text,
VECTOR_DISTANCE(dc.chunk_vector,
VECTOR_EMBEDDING(doc_model USING 'database performance tuning' AS data),
COSINE
) AS relevance
FROM doc_chunks dc
JOIN source_docs sd ON dc.doc_id = sd.doc_id
ORDER BY relevance
FETCH FIRST 3 ROWS ONLY;
Optional: Retrieval Augmented Generation (RAG)
While not part of the core vector search workflow, many applications extend it with RAG to generate AI-powered responses.
RAG Workflow:
- User submits a natural language query
- Convert query to vector embedding
- Use similarity search to find relevant content (top-k chunks)
- Combine retrieved content with user's original question
- Send augmented prompt to LLM
- LLM generates response grounded in retrieved data
Simple RAG Example:
-- Retrieve relevant context for RAG
WITH relevant_chunks AS (
SELECT chunk_text
FROM doc_chunks
ORDER BY VECTOR_DISTANCE(
chunk_vector,
VECTOR_EMBEDDING(doc_model USING :user_question AS data),
COSINE
)
FETCH FIRST 3 ROWS ONLY
)
SELECT LISTAGG(chunk_text, ' ') AS context
FROM relevant_chunks;
-- This context is then combined with the user's question
-- and sent to an LLM (like OCI Generative AI, OpenAI, etc.)
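If you want the generation step inside the database as well, DBMS_VECTOR_CHAIN exposes UTL_TO_GENERATE_TEXT for calling an LLM provider. A heavily hedged sketch; the credential name, endpoint URL, and model below are placeholders you must replace, and the parameter shape should be checked against your release:

-- Placeholder credential/URL/model values; verify parameters for your release
SELECT DBMS_VECTOR_CHAIN.UTL_TO_GENERATE_TEXT(
           'Answer using only this context: ' || :context ||
           ' Question: ' || :user_question,
           JSON('{"provider": "ocigenai",
                  "credential_name": "OCI_CRED",
                  "url": "https://<your-genai-endpoint>/generateText",
                  "model": "cohere.command"}')
       ) AS llm_answer
FROM dual;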
Best Practices
1. Chunking Strategy
- Use 20-30% overlap to prevent context loss at chunk boundaries
- Choose chunk size based on embedding model limits and content type
- Smaller chunks (50-100 words) for precise matching
- Larger chunks (200-300 words) for broader context
2. Embedding Model Selection
- For general text: all-MiniLM-L12-v2 or all-mpnet-base-v2
- For code: CodeBERT or GraphCodeBERT
- For multilingual: multilingual-e5-large
- For images: CLIP or ResNet
3. Index Configuration
- Set TARGET ACCURACY between 90 and 99 (higher = slower but more accurate)
- Allocate sufficient memory for HNSW indexes
- Monitor index size and rebuild periodically for IVF indexes
4. Query Optimization
- Use FETCH FIRST n ROWS ONLY instead of ROWNUM for better performance
- Combine vector search with filters early in the query
- Use bind variables for query vectors to enable plan caching (see the sketch after this list)
- Consider using approximate search for most queries, exact search only when necessary
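One hedged way to bind a query vector from SQL*Plus is to serialize it once into a CLOB bind variable with FROM_VECTOR and rehydrate it with TO_VECTOR in each query (variable names are illustrative):

-- Embed the search text once and keep the result in a bind variable
VARIABLE query_vec CLOB
BEGIN
  SELECT FROM_VECTOR(
           VECTOR_EMBEDDING(doc_model USING 'wireless audio device' AS data)
           RETURNING CLOB)
  INTO :query_vec
  FROM dual;
END;
/

-- Reuse the same bind variable across queries so plans stay shareable
SELECT product_name
FROM products
ORDER BY VECTOR_DISTANCE(description_vector, TO_VECTOR(:query_vec), COSINE)
FETCH APPROX FIRST 10 ROWS ONLY;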
5. Vector Format Selection
- FLOAT32: Best balance for most applications (default choice)
- INT8: Use when memory is constrained and slight accuracy loss is acceptable
- FLOAT64: Rarely needed, only for highest precision requirements
Real-World Use Cases
- E-commerce Product Search: Find products based on natural language descriptions
- Customer Support: Match customer questions to relevant help articles
- Document Management: Search enterprise documents by meaning, not just keywords
- Fraud Detection: Identify transactions similar to known fraud patterns
- Recommendation Engines: Suggest products, content, or services based on similarity
- Image Search: Find visually similar products or detect duplicate images
- Code Search: Search codebases using natural language queries
Performance Considerations
Memory Requirements for HNSW
Estimate memory needed:
Memory ≈ 1.3 × vector_format_size × dimensions × row_count
Example:
1,000,000 vectors × 384 dimensions × 4 bytes (FLOAT32) ≈ 1.5 GB raw; × 1.3 overhead ≈ 2 GB
Query Performance
Typical Performance:
- Exact search: Seconds to hours (depending on dataset size)
- Approximate search with HNSW: Milliseconds to seconds
- Approximate search with IVF: Seconds (for billion-scale datasets)
Monitoring and Tuning
-- Check vector index details
SELECT index_name, index_type, accuracy
FROM user_vector_indexes;
-- Monitor vector memory usage
SELECT pool, alloc_bytes, used_bytes
FROM v$vector_memory_pool;
Oracle AI Vector Search provides a complete, integrated solution for semantic search within the database. The workflow—generate, store, index, and query—enables powerful AI applications without the complexity of managing separate vector databases.
Key Advantages:
- Unified System: Vectors and business data in one database
- Native SQL: Use familiar SQL syntax with vector extensions
- Enterprise Features: Security, availability, and performance features apply to vectors
- Flexible Deployment: Choose in-database or external embedding generation
- Scalable: From thousands to billions of vectors
- Hybrid Queries: Combine semantic search with relational filters
By keeping AI algorithms where the data lives, Oracle Database 23ai eliminates data fragmentation, reduces complexity, and provides enterprise-grade security and performance for your AI-powered applications.