
Tsubasa Kanno


Snowflake AI_EMBED Function - Your Gateway to Unified Multimodal Vector Search

Introduction

I'm excited to share insights about Snowflake's latest AI_EMBED function, a major addition to Cortex AISQL! As the successor to the EMBED_TEXT_768 and EMBED_TEXT_1024 functions, AI_EMBED introduces a capability they lacked: vectorizing both text and images with a single function.

Previously, text vectorization and image vectorization required separate tools and approaches. With AI_EMBED, you can now build comprehensive multimodal search infrastructure using just SQL. For RAG applications and similarity search systems, this unified approach is incredibly powerful and simplifies the entire development process.

If you're building AI applications that need to handle both text and visual content, this feature will transform how you approach multimodal data processing!

Note (2025/7/26): The AI_EMBED function is currently in public preview, so its features may change significantly in the future.

Note: This article represents my personal views and not those of Snowflake.

Understanding Snowflake Cortex AISQL

Snowflake Cortex AISQL provides a set of functions that let you call AI functionality directly from SQL. A good example is the AI_COMPLETE function, which processes both text and images through the same function interface:

-- Text processing
SELECT AI_COMPLETE('llama4-maverick', 'Explain the key features of Snowflake');

-- Image processing (same function!)
SELECT AI_COMPLETE('llama4-maverick', 'Describe this image', TO_FILE('@image_stage', 'dog.jpeg'));

Image processed by AI_COMPLETE function (Generated by Google Gemini)

-- Text processing result
"Snowflake is a cloud-based data warehouse solution with the following key features..."
-- Image processing result
This image shows a close-up of a dog's face with white fur and large eyes. The dog has its mouth open...

This same unified multimodal processing capability is now available for vectorization through the AI_EMBED function.

Previous Vectorization Approaches

To better understand AI_EMBED's value, let's review traditional vectorization methods in Snowflake. Previously available embedding functions included:

  • EMBED_TEXT_768
  • EMBED_TEXT_1024

For detailed analysis of these functions and their performance characteristics, I covered them extensively in my previous article about Snowflake vectorization options.

The key limitation was that text vectorization used these dedicated functions, while image vectorization required external tools or services, creating a fragmented development experience.

AI_EMBED Function Features

Unified Interface

The primary advantage of AI_EMBED is processing both text and images with the same function. This unified approach delivers several benefits:

  • Simplified Learning Curve: No need to master multiple functions or methods for different data types
  • Consistent Model Interface: Same function works across different embedding models
  • Streamlined Data Governance: All vectorization processing happens within Snowflake's secure environment
  • Easy Migration: Similar syntax to existing embedding functions enables smooth transitions
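
As a rough illustration of the migration point above, the two calls below embed the same sentence, first with the older function and then with AI_EMBED. The sample sentence is made up for this sketch, and it assumes the chosen model is available to both functions in your account and region.

```sql
-- Older text-only function (lives in the SNOWFLAKE.CORTEX schema)
SELECT SNOWFLAKE.CORTEX.EMBED_TEXT_1024(
    'snowflake-arctic-embed-l-v2.0',
    'Quarterly revenue grew in the APAC region'   -- hypothetical sample text
) AS legacy_vector;

-- AI_EMBED takes the same (model, input) argument order
SELECT AI_EMBED(
    'snowflake-arctic-embed-l-v2.0',
    'Quarterly revenue grew in the APAC region'   -- same hypothetical sample text
) AS new_vector;
```

Because both calls take the model name first and the input second, the change is often limited to the function name. It is still worth comparing a few vectors from both functions before switching a production pipeline.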

Available Models

AI_EMBED supports the following models:

Text Models

  • snowflake-arctic-embed-l-v2.0
  • snowflake-arctic-embed-l-v2.0-8k
  • nv-embed-qa-4
  • multilingual-e5-large
  • voyage-multilingual-2

Image Model

  • voyage-multimodal-3

Important: Only voyage-multimodal-3 supports image vectorization. It also accepts text input, so text and image vectors from this model can be compared in the same vector space.

Model Characteristics

Understanding model selection is crucial for optimal results:

  • snowflake-arctic-embed-l-v2.0-8k: Supports up to 8,000 tokens, ideal for technical documents and long articles. The extended context can remove the need to chunk some documents before embedding
  • nv-embed-qa-4: English-only model, not suitable for multilingual environments
  • Other models: Multilingual support with excellent performance across various languages

Choose snowflake-arctic-embed-l-v2.0-8k for long-form content and any model except nv-embed-qa-4 for multilingual applications.
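
A minimal sketch of that guidance, assuming a hypothetical table support_articles with columns article_id, lang, and body (none of these names come from the original examples):

```sql
-- Long-form content: the 8k variant accepts a longer input per call,
-- so some articles may not need to be split into chunks first.
SELECT
    article_id,
    AI_EMBED('snowflake-arctic-embed-l-v2.0-8k', body) AS body_vector
FROM support_articles;        -- hypothetical table

-- Multilingual content: use any listed model other than nv-embed-qa-4.
SELECT
    article_id,
    AI_EMBED('multilingual-e5-large', body) AS body_vector
FROM support_articles         -- hypothetical table
WHERE lang <> 'en';           -- hypothetical column
```

Documents longer than the model's token limit would still need chunking, so it may help to check input lengths before relying on the 8k model alone.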

Multimodal Vectorization Value

Business Value

Multimodal vectorization delivers substantial business benefits:

  • Enhanced Search Accuracy: Unified text and image search reveals related content that traditional keyword searches miss
  • Improved Customer Experience: Enables intuitive experiences like image-based product search or text-to-image discovery
  • Operational Efficiency: Centralized management and search across documents, diagrams, photos, and notes significantly reduces information access time
  • New Business Models: Enables previously impossible multimodal search services and recommendation engines

Technical Value

The technical advantages are equally compelling:

  • Data Silo Elimination: Solves the problem of text and image data being managed in separate systems by placing both in a unified vector space
  • Reduced Development Costs: A single-platform approach with Snowflake reduces system complexity compared with combining multiple specialized tools
  • Scalability: Snowflake's cloud-native architecture efficiently handles large-scale multimodal data processing
  • Security & Governance: Keeping both the data and the vectorization inside Snowflake enables centralized governance

Business Use Cases

AI_EMBED enables powerful business applications:

1. Multimodal Search Systems
Build e-commerce platforms where customers can search for similar products using both product images and text descriptions.

2. Content Management Systems
Create enterprise CMS solutions that enable unified search and classification of documents and visual assets.

3. Customer Support Enhancement
Develop systems that analyze both inquiry text and attached images to provide comprehensive, context-aware responses.

4. RAG Chatbots
Build enterprise chatbots that search across both textual documents and visual content to incorporate domain knowledge into LLM responses.
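
To make the RAG use case a little more concrete, here is a rough sketch of the retrieval step followed by a call to AI_COMPLETE. It assumes a hypothetical table doc_chunks whose embedding column was filled earlier by AI_EMBED with the same model; the table, its columns, the question, and the prompt wording are all illustrative assumptions, not part of the original examples.

```sql
-- Hypothetical table: doc_chunks(chunk_text STRING, embedding VECTOR(FLOAT, 1024)),
-- where embedding was produced earlier by AI_EMBED with the same model used below.
WITH question AS (
    SELECT 'How do I reset my password?' AS q   -- hypothetical user question
),
top_chunks AS (
    -- Retrieve the three chunks most similar to the question
    SELECT c.chunk_text, question.q
    FROM doc_chunks AS c
    CROSS JOIN question
    ORDER BY VECTOR_COSINE_SIMILARITY(
        c.embedding,
        AI_EMBED('snowflake-arctic-embed-l-v2.0', question.q)
    ) DESC
    LIMIT 3
)
-- Pass the retrieved chunks to the LLM as context
SELECT AI_COMPLETE(
    'llama4-maverick',
    'Answer the question using only the context below.\n\nContext:\n'
        || LISTAGG(chunk_text, '\n---\n')
        || '\n\nQuestion: ' || ANY_VALUE(q)
) AS answer
FROM top_chunks;
```

The key design point is that the question and the stored chunks are embedded with the same model, since vectors from different models are not comparable.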

Practical Implementation Examples

Text Vectorization

Basic text vectorization is straightforward:

-- Text vectorization
SELECT AI_EMBED('snowflake-arctic-embed-l-v2.0-8k', 'Snowflake Summit 2025 introduced many exciting new features');
-- Text vectorization result
[0.001018,0.002565,-0.024353,0.004829, ...]

Image Vectorization

Image vectorization uses the same function with proper file handling:

-- Image vectorization
SELECT AI_EMBED('voyage-multimodal-3', TO_FILE('@image_stage', 'dog.jpeg'));
-- Image vectorization result
[-0.015381,0.008240,-0.012634,-0.024048, ...]

Vector Similarity Calculations

Vectors generated by AI_EMBED work seamlessly with Snowflake's vector similarity functions such as VECTOR_COSINE_SIMILARITY. Here are three fundamental patterns:

1. Text-to-Text Similarity

-- Text similarity calculation
SELECT VECTOR_COSINE_SIMILARITY(
    AI_EMBED('snowflake-arctic-embed-l-v2.0', 'Beautiful sunny weather today'),
    AI_EMBED('snowflake-arctic-embed-l-v2.0', 'Today is blessed with great climate')
) as text_similarity;
-- Text similarity result
0.8324767643

2. Image-to-Image Similarity

-- Image similarity calculation
SELECT VECTOR_COSINE_SIMILARITY(
    AI_EMBED('voyage-multimodal-3', TO_FILE('@image_stage', 'cat.jpeg')),
    AI_EMBED('voyage-multimodal-3', TO_FILE('@image_stage', 'dog.jpeg'))
) as image_similarity;
-- Image similarity result
0.5069280956

3. Cross-Modal Text-to-Image Similarity

-- Cross-modal similarity calculation
SELECT VECTOR_COSINE_SIMILARITY(
    AI_EMBED('voyage-multimodal-3', 'Close-up of a dog face with white fur and large eyes'),
    AI_EMBED('voyage-multimodal-3', TO_FILE('@image_stage', 'dog.jpeg'))
) as cross_modal_similarity;
-- Cross-modal similarity result
0.6030817788
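
The cross-modal pattern can be extended from one image to a whole stage. The sketch below assumes that @image_stage has a directory table enabled and uses a hypothetical table name, image_embeddings, together with a made-up search phrase. It stores one vector per image and then ranks the images against a text query.

```sql
-- Embed every JPEG on the stage once and keep the vectors in a table.
-- Assumes the stage has a directory table enabled.
CREATE OR REPLACE TABLE image_embeddings AS   -- hypothetical table name
SELECT
    relative_path,
    AI_EMBED('voyage-multimodal-3', TO_FILE('@image_stage', relative_path)) AS embedding
FROM DIRECTORY(@image_stage)
WHERE relative_path ILIKE '%.jpeg';

-- Rank the stored images against a text query (cross-modal search)
SELECT
    relative_path,
    VECTOR_COSINE_SIMILARITY(
        embedding,
        AI_EMBED('voyage-multimodal-3', 'a dog with white fur')   -- hypothetical query
    ) AS score
FROM image_embeddings
ORDER BY score DESC
LIMIT 5;
```

Storing the image vectors up front means only the query text is embedded at search time, which should keep repeated searches cheaper than re-embedding every image per query.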

Cosine similarity ranges from -1 to 1, with values closer to 1 indicating higher similarity.
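
A quick way to see that range without calling any model is to compare small hand-written vectors. The toy 2-dimensional vectors below are chosen purely for illustration; the expected values in the comments follow from the definition of cosine similarity.

```sql
-- Toy vectors chosen only to show the extremes of the range
SELECT
    -- Same direction, different length: expected 1
    VECTOR_COSINE_SIMILARITY([1.0, 0.0]::VECTOR(FLOAT, 2), [2.0, 0.0]::VECTOR(FLOAT, 2))  AS same_direction,
    -- Perpendicular directions: expected 0
    VECTOR_COSINE_SIMILARITY([1.0, 0.0]::VECTOR(FLOAT, 2), [0.0, 3.0]::VECTOR(FLOAT, 2))  AS orthogonal,
    -- Opposite directions: expected -1
    VECTOR_COSINE_SIMILARITY([1.0, 0.0]::VECTOR(FLOAT, 2), [-1.0, 0.0]::VECTOR(FLOAT, 2)) AS opposite_direction;
```

Note that the first pair scores 1 even though the vectors differ in length: cosine similarity compares direction only, not magnitude.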

Summary

AI_EMBED represents a significant advancement in Snowflake's vectorization capabilities. Its unified interface for text and images makes multimodal AI applications more accessible and efficient to develop.

Migration from the existing EMBED_TEXT_768 and EMBED_TEXT_1024 functions is straightforward, enabling gradual application upgrades. As data workloads increasingly involve mixed text and image content, AI_EMBED provides a foundation for building next-generation data utilization platforms.

The future of search and AI applications is multimodal, and AI_EMBED positions Snowflake users to capitalize on these emerging opportunities. I encourage you to explore AI_EMBED and discover new possibilities for your business applications!


What multimodal use cases are you most excited to build with AI_EMBED? Share your thoughts in the comments below!


Promotion

Snowflake What's New Updates on X

I share Snowflake What's New updates on X. Follow for the latest insights:

English Version

Snowflake What's New Bot (English Version)

Japanese Version

Snowflake's What's New Bot (Japanese Version)

Change Log

(20250726) Initial post

Original Japanese Article

https://zenn.dev/tsubasa_tech/articles/e7683605e7d7aa
