Introduction
Cortex Analyst is one of Snowflake's most exciting features, enabling revolutionary natural language data analysis. Business users who don't usually write queries can simply ask "What are the top 10 products by sales last month?" and get instant insights without SQL - true data democratization in action!
However, for Cortex Analyst to reach its full potential, you need a well-structured semantic layer as the foundation. Snowflake offers Semantic Models or Semantic Views to accomplish this, but creating semantic layers requires time and effort: understanding table structures, defining business logic, setting appropriate descriptions and synonyms, and more.
If you're thinking "I want to use Cortex Analyst, but setting up the semantic layer is taking forever..." - you're in the right place! Today I'm introducing a stored procedure that auto-generates semantic views using AI. This tool lets you quickly create a baseline semantic view from existing tables, giving you a fast start toward delivering analytics environments to business users.
Note: This stored procedure is designed to assist with initial semantic view creation. We recommend customizing the generated views according to your business requirements.
Note: At the June 2025 Snowflake Summit, an official semantic layer auto-generation feature was announced. This stored procedure serves as a stopgap solution until the official feature is released. When the official feature launches, I'll write about it as well!
Note: This article represents my personal views and not those of Snowflake.
What is a Semantic View?
A semantic view is a Snowflake schema-level object that adds business context to physical database tables. It maps technical column names (e.g., amt_ttl_pre_dsc
) to business-friendly concepts (e.g., "Total Sales", "Sales Amount") with detailed descriptions (e.g., "Total sales value for stores in the metropolitan area"), bridging the gap between data and business.
Even more importantly, semantic views help AI correctly understand database structure and business logic. Cortex Analyst can generate high-precision SQL queries from natural language questions precisely because of the semantic layer. Through synonyms and detailed descriptions, AI learns to understand that "last month's sales", "total sales", and "sales amount" all refer to the same concept, enabling it to select the right tables and columns for accurate query generation.
Components of a Semantic View
Semantic views consist of these elements:
Element | Description | Examples |
---|---|---|
Facts | Primarily numeric fact data (row-level attributes) | Sales quantity, unit price, amount, etc. |
Dimensions | Category and attribute data (analysis axes) | Product name, customer ID, date, region, etc. |
Metrics | Aggregated indicators (business KPIs) | Total sales, average unit price, sales growth rate, etc. |
Each element can be defined with:
- Synonyms: Business aliases and natural phrasings (e.g., "sales" → "revenue", "sales amount")
- Comment: Business explanations including units, ranges, and other details
Why Semantic Views Matter
Semantic views provide these key values:
- Improved Cortex Analyst Accuracy: Serves as the foundation for AI to understand natural language and generate accurate SQL
- Consistent Business Definitions: Unified business metrics across the entire organization
- Data Democratization: Enables non-technical users to access data
For more details, see the official Snowflake documentation - Semantic View Overview.
Note: For information about Semantic Models (YAML format), see the Cortex Analyst Semantic Model Documentation. This article focuses on the newer Semantic Views feature.
Challenges in Creating Semantic Layers
Creating semantic layers manually involves these tasks:
- Understanding table structure (table definitions, column names, data types, sample data review)
- Classifying Facts, Dimensions, and Metrics
- Identifying business-relevant synonyms actually used in the field
- Considering appropriate descriptions (Comments) for each element
- Defining the structure in SQL syntax
For large tables or multiple tables, this process takes considerable time. However, spending too much time on preparation delays the delivery of analytics environments to business users - having a quality baseline to start with is extremely valuable!
AI-Powered Auto-Generation Stored Procedure
Overview
The GENERATE_SEMANTIC_VIEW
stored procedure I'm introducing auto-generates semantic views through these steps:
-
Retrieve Table Information: Get column info, data types, and comments using
DESCRIBE TABLE
- Analyze Sample Data: Infer actual usage and data characteristics from the first 5 rows
- AI-Generated SQL: LLM via AI_COMPLETE function (like claude-sonnet-4-5) generates semantic view DDL
- Create Semantic View: Execute the generated SQL
Key Features
Feature | Description |
---|---|
Fully Automated | Automatically classifies Facts, Dimensions, and Metrics by just specifying table name |
Rich Synonyms | Infers actual usage from sample data to generate natural business-relevant synonyms |
Detailed Descriptions | Auto-generates Comments with business meaning for each element |
Customizable | Specify which LLM model to use (default: claude-sonnet-4-5) |
Stored Procedure Code
The complete code for the GENERATE_SEMANTIC_VIEW
stored procedure is provided below.
CREATE OR REPLACE PROCEDURE GENERATE_SEMANTIC_VIEW(
SOURCE_DATABASE VARCHAR, -- Source database name
SOURCE_SCHEMA VARCHAR, -- Source schema name
SOURCE_TABLE VARCHAR, -- Source table name
TARGET_DATABASE VARCHAR, -- Target database name
TARGET_SCHEMA VARCHAR, -- Target schema name
TARGET_VIEW_NAME VARCHAR, -- Semantic view name to create
LLM_MODEL VARCHAR DEFAULT 'claude-sonnet-4-5' -- LLM model to use
)
RETURNS VARCHAR
LANGUAGE SQL
EXECUTE AS CALLER
AS
$$
DECLARE
-- Table information
column_info VARCHAR;
sample_data_json VARCHAR;
table_comment VARCHAR;
-- AI generation results
ai_response VARCHAR;
-- Final SQL
create_view_sql VARCHAR;
BEGIN
-- ================================================================================
-- STEP 1: Retrieve table information
-- ================================================================================
-- Get column information (using DESCRIBE TABLE)
DECLARE
describe_sql VARCHAR;
rs RESULTSET;
BEGIN
describe_sql := 'DESCRIBE TABLE ' || :SOURCE_DATABASE || '.' || :SOURCE_SCHEMA || '.' || :SOURCE_TABLE;
rs := (EXECUTE IMMEDIATE :describe_sql);
-- Get information from results
-- $1: Column name, $2: Data type, $10: Comment
SELECT LISTAGG(
$1 || ':' || $2 || CASE WHEN $10 IS NOT NULL AND $10 != '' THEN '(' || $10 || ')' ELSE '' END,
', '
) INTO column_info
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));
EXCEPTION
WHEN OTHER THEN
RETURN 'Error: Could not retrieve table information - ' || :SOURCE_DATABASE || '.' || :SOURCE_SCHEMA || '.' || :SOURCE_TABLE || '. Error: ' || SQLERRM;
END;
-- Error check
IF (column_info IS NULL OR LENGTH(:column_info) = 0) THEN
RETURN 'Error: Column information is empty - ' || :SOURCE_DATABASE || '.' || :SOURCE_SCHEMA || '.' || :SOURCE_TABLE;
END IF;
-- Get table comment
SELECT COMMENT INTO table_comment
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_CATALOG = :SOURCE_DATABASE
AND TABLE_SCHEMA = :SOURCE_SCHEMA
AND TABLE_NAME = :SOURCE_TABLE;
-- Get sample data (first 5 rows)
DECLARE
sample_sql VARCHAR;
sample_rs RESULTSET;
BEGIN
sample_sql := 'SELECT * FROM ' || :SOURCE_DATABASE || '.' || :SOURCE_SCHEMA || '.' || :SOURCE_TABLE || ' LIMIT 5';
sample_rs := (EXECUTE IMMEDIATE :sample_sql);
-- Get sample data in JSON format
SELECT TO_VARCHAR(ARRAY_AGG(OBJECT_CONSTRUCT(*))) INTO sample_data_json
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));
EXCEPTION
WHEN OTHER THEN
sample_data_json := 'Sample data unavailable';
END;
-- ================================================================================
-- STEP 2: Generate complete Semantic View definition with AI
-- ================================================================================
SELECT SNOWFLAKE.CORTEX.COMPLETE(
:LLM_MODEL,
CONCAT(
'[IMPORTANT] Generate a Snowflake semantic view using the information below.',
'\n\n★ DO NOT use any column names that do not exist ★',
'\n\n[TABLE INFORMATION]',
'\nTable: ', :SOURCE_DATABASE, '.', :SOURCE_SCHEMA, '.', :SOURCE_TABLE,
'\nAlias: ', :SOURCE_TABLE,
'\nTable Comment: ', COALESCE(:table_comment, 'None'),
'\n\n[EXISTING COLUMNS (Use ONLY these)]',
'\n', :column_info,
'\n\n[SAMPLE DATA (For understanding data usage)]',
'\n', SUBSTRING(:sample_data_json, 1, 1000),
'\n\n[SQL TO GENERATE]',
'\nCREATE OR REPLACE SEMANTIC VIEW ', :TARGET_DATABASE, '.', :TARGET_SCHEMA, '.', :TARGET_VIEW_NAME,
'\n TABLES (',
'\n ', :SOURCE_TABLE, ' AS ', :SOURCE_DATABASE, '.', :SOURCE_SCHEMA, '.', :SOURCE_TABLE,
'\n WITH SYNONYMS (''alias1'', ''alias2'', ''alias3'', ''alias4'')',
'\n COMMENT = ''Detailed table description''',
'\n )',
'\n FACTS (',
'\n ', :SOURCE_TABLE, '.fact_name AS existing_column_name',
'\n WITH SYNONYMS (''alias1'', ''alias2'')',
'\n COMMENT = ''Detailed column description''',
'\n )',
'\n DIMENSIONS (',
'\n ', :SOURCE_TABLE, '.dim_name AS existing_column_name',
'\n WITH SYNONYMS (''alias1'', ''alias2'')',
'\n COMMENT = ''Detailed column description''',
'\n )',
'\n METRICS (',
'\n ', :SOURCE_TABLE, '.metric_name AS SUM(existing_column_name)',
'\n WITH SYNONYMS (''alias1'', ''alias2'')',
'\n COMMENT = ''Detailed column description''',
'\n )',
'\n COMMENT = ''Detailed semantic view description'';',
'\n\n[ABSOLUTE RULES]',
'\n★★★ MOST IMPORTANT ★★★ Use ONLY column names from the above list',
'\n\n0. Omit Primary Key:',
'\n - Do not write PRIMARY KEY clause',
'\n - Primary key constraints do not function in Snowflake and are optional',
'\n\n1. Column Name Usage:',
'\n - Column names used on the right side of AS must also exist in the list',
'\n - Example OK: YEAR(TRANSACTION_DATE) ← only if TRANSACTION_DATE exists',
'\n - Example NG: YEAR(ORDER_DATE) ← if ORDER_DATE does not exist',
'\n\n2. FACTS (Numeric Data):',
'\n - Quantity, amount, price, score, rate - numeric type columns',
'\n - Include both simple column references and calculated formulas',
'\n - Example (simple): QUANTITY, PRICE, AMOUNT',
'\n - Example (calculated): revenue AS QUANTITY * UNIT_PRICE, discount_amount AS TOTAL_PRICE * 0.1',
'\n\n3. DIMENSIONS (Attribute Data):',
'\n - IDs, names, dates, categories, statuses',
'\n - Do not overlap with FACTS',
'\n\n4. No Duplicate Names: All table.name combinations on the left must be unique',
'\n\n5. Date Derivative Naming: original_column_name_year, original_column_name_month',
'\n\n6. Rich Synonyms (Critical for Cortex Analyst to understand user requests):',
'\n - Tables: 4-5 business-relevant aliases',
'\n - Each DIMENSION, FACT, METRIC: 2-3 natural aliases',
'\n - Infer appropriate synonyms from sample data actual usage',
'\n - Example: sales amount → "total sales", "sales total", "revenue"',
'\n - Example: product ID → "product code", "item number", "SKU"',
'\n\n7. Rich COMMENTS:',
'\n - Generate specific descriptions from sample data',
'\n - Include business meaning',
'\n - Specify units and ranges',
'\n\n8. Utilize Sample Data:',
'\n - Understand actual usage from sample data content',
'\n - Generate more concrete, practical synonyms',
'\n - Reflect data characteristics (range, units, etc.) in COMMENTS',
'\n\n9. Generate Abundant FACTS and DIMENSIONS:',
'\n - FACTS: Around 5-10 (simple columns + calculated formulas)',
'\n - DIMENSIONS: Around 5-15 (original columns + date derivatives)',
'\n - METRICS: Around 4-8 (various aggregation patterns)',
'\n\n10. Output: SQL syntax only (no ```, no explanatory text)',
'\n\nOutput only the CREATE SEMANTIC VIEW statement:'
)
) INTO ai_response;
-- Debug: Verify AI response length
IF (:ai_response IS NULL OR LENGTH(:ai_response) = 0) THEN
RETURN 'Error: AI_COMPLETE returned an empty response. Column info: ' || SUBSTRING(:column_info, 1, 500);
END IF;
-- Remove unnecessary characters
create_view_sql := :ai_response;
create_view_sql := REPLACE(:create_view_sql, '```sql', '');
create_view_sql := REPLACE(:create_view_sql, '```', '');
create_view_sql := REPLACE(:create_view_sql, '"', '''');
create_view_sql := REPLACE(:create_view_sql, 'Revised:', '');
create_view_sql := REPLACE(:create_view_sql, 'Modifications:', '');
create_view_sql := REPLACE(:create_view_sql, '**', '');
create_view_sql := TRIM(:create_view_sql);
-- Final verification: Check SQL statement is not empty
IF (:create_view_sql IS NULL OR LENGTH(:create_view_sql) < 50) THEN
RETURN 'Error: Generated SQL is empty or too short. AI response: ' || SUBSTRING(:ai_response, 1, 1000);
END IF;
-- ================================================================================
-- STEP 4: Create semantic view
-- ================================================================================
BEGIN
EXECUTE IMMEDIATE :create_view_sql;
RETURN 'Semantic view ' || :TARGET_DATABASE || '.' || :TARGET_SCHEMA || '.' || :TARGET_VIEW_NAME || ' created successfully. SQL: ' || :create_view_sql;
EXCEPTION
WHEN OTHER THEN
RETURN 'Error: ' || SQLERRM || ' -- Generated SQL: ' || SUBSTRING(:create_view_sql, 1, 2000);
END;
END;
$$;
-- ================================================================================
-- Usage Example
-- ================================================================================
/*
-- Basic usage
CALL GENERATE_SEMANTIC_VIEW(
'<source_database>',
'<source_schema>',
'<source_table>',
'<target_database>',
'<target_schema>',
'<semantic_view_name>',
'<LLM_model_name>'
);
-- Verify created semantic view
DESCRIBE SEMANTIC VIEW <target_database>.<target_schema>.<semantic_view_name>;
*/
Note: You can copy and paste this stored procedure directly into your Snowflake environment for immediate use.
How to Use
Prerequisites
To use this stored procedure, you'll need:
- A Snowflake account
- An environment where Cortex LLM is available (cross-region inference makes this available in most regions)
- Permissions to create semantic views
Basic Usage Example
Here's how to generate a semantic view from the SALES_DATABASE.PUBLIC.ORDERS
table:
-- Call the stored procedure
CALL GENERATE_SEMANTIC_VIEW(
'SALES_DATABASE', -- Source database
'PUBLIC', -- Source schema
'ORDERS', -- Source table
'ANALYTICS_DATABASE', -- Target database
'SEMANTIC', -- Target schema
'ORDERS_SEMANTIC_VIEW', -- Semantic view name
'claude-sonnet-4-5' -- LLM model (optional)
);
-- Verify the created semantic view
DESCRIBE SEMANTIC VIEW ANALYTICS_DATABASE.SEMANTIC.ORDERS_SEMANTIC_VIEW;
Example Generated Semantic View
For example, given this table structure:
-- Sample table
CREATE TABLE ORDERS (
ORDER_ID NUMBER,
CUSTOMER_ID NUMBER,
ORDER_DATE DATE,
PRODUCT_NAME VARCHAR,
QUANTITY NUMBER,
UNIT_PRICE NUMBER,
TOTAL_AMOUNT NUMBER
);
The stored procedure auto-generates a semantic view like this:
CREATE OR REPLACE SEMANTIC VIEW ANALYTICS_DATABASE.SEMANTIC.ORDERS_SEMANTIC_VIEW
TABLES (
ORDERS AS SALES_DATABASE.PUBLIC.ORDERS
WITH SYNONYMS ('Order Table', 'Sales Data', 'Purchase History', 'Transactions')
COMMENT = 'Table managing customer order information'
)
FACTS (
ORDERS.quantity AS QUANTITY
WITH SYNONYMS ('Qty', 'Amount Ordered', 'Number of Items')
COMMENT = 'Quantity of products ordered',
ORDERS.unit_price AS UNIT_PRICE
WITH SYNONYMS ('Price', 'Item Price', 'Product Price')
COMMENT = 'Price per product unit',
ORDERS.total_amount AS TOTAL_AMOUNT
WITH SYNONYMS ('Total', 'Order Amount', 'Purchase Total')
COMMENT = 'Total order amount'
)
DIMENSIONS (
ORDERS.order_id AS ORDER_ID
WITH SYNONYMS ('Order Number', 'Transaction ID', 'Order Ref')
COMMENT = 'Unique identifier for the order',
ORDERS.customer_id AS CUSTOMER_ID
WITH SYNONYMS ('Client ID', 'Customer Number', 'Account ID')
COMMENT = 'Customer identifier',
ORDERS.order_date AS ORDER_DATE
WITH SYNONYMS ('Purchase Date', 'Transaction Date', 'Date Ordered')
COMMENT = 'Date the order was placed',
ORDERS.product_name AS PRODUCT_NAME
WITH SYNONYMS ('Product', 'Item Name', 'Item')
COMMENT = 'Name of the ordered product'
)
METRICS (
ORDERS.total_sales AS SUM(TOTAL_AMOUNT)
WITH SYNONYMS ('Total Revenue', 'Sales Total', 'Revenue')
COMMENT = 'Total sales amount for all orders',
ORDERS.avg_order_amount AS AVG(TOTAL_AMOUNT)
WITH SYNONYMS ('Average Order Value', 'AOV', 'Avg Purchase')
COMMENT = 'Average amount per order',
ORDERS.total_orders AS COUNT(*)
WITH SYNONYMS ('Order Count', 'Number of Orders', 'Total Transactions')
COMMENT = 'Total number of orders'
)
COMMENT = 'Semantic view for order data analysis';
Real-World Use Cases
Use Case 1: Fast Start
When introducing Cortex Analyst to a new project, use this stored procedure to quickly generate semantic views for key tables. Then customize only the necessary parts according to business requirements and add relationships between tables - several times faster than building from scratch, helping you set up your semantic layer efficiently.
Use Case 2: Data Catalog
Generated semantic views contain rich synonyms and comments, making them excellent table documentation. They serve as reference material for new team members learning the table structure or can be loaded into AI Agents as an AI-oriented data catalog.
Use Case 3: POC (Proof of Concept)
For POCs validating Cortex Analyst's effectiveness, you often don't want to spend much time creating semantic layers. This stored procedure lets you build a demo environment in minutes and immediately experience Cortex Analyst's value.
Customization Tips
While auto-generated semantic views may work as-is, customizing these points will achieve even more accurate query generation:
Note: You can customize semantic views not just through SQL commands (ALTER SEMANTIC VIEW), but also via GUI in Snowsight's AI/ML Studio. AI/ML Studio lets you visually confirm the semantic view structure while intuitively adding synonyms, editing comments, and adding new Metrics.
1. Adding/Modifying Synonyms
Add industry-specific terms and internal company terminology to help Cortex Analyst better understand users' natural language:
-- Example: Add industry terms
ALTER SEMANTIC VIEW ORDERS_SEMANTIC_VIEW
MODIFY DIMENSION ORDERS.product_name
ADD SYNONYMS ('SKU', 'Item Code', 'Barcode');
2. Adding Complex Calculated Metrics
Add advanced calculation formulas according to business logic:
-- Example: Add profit margin
ALTER SEMANTIC VIEW ORDERS_SEMANTIC_VIEW
ADD METRIC ORDERS.profit_margin AS (SUM(TOTAL_AMOUNT) - SUM(COST)) / SUM(TOTAL_AMOUNT)
WITH SYNONYMS ('Margin', 'Profit Rate', 'Return Rate')
COMMENT = 'Profit ratio relative to sales';
3. Integrating Multiple Tables into a Semantic View
To analyze across multiple tables, integrate individually generated semantic views and define relationships between tables:
-- Example: Create semantic view containing both ORDERS and CUSTOMERS tables
-- First, generate semantic views for each table individually,
-- Then integrate into a new semantic view
CREATE OR REPLACE SEMANTIC VIEW SALES_ANALYSIS_VIEW
TABLES (
ORDERS AS SALES_DATABASE.PUBLIC.ORDERS
WITH SYNONYMS ('Orders', 'Sales Data'),
CUSTOMERS AS SALES_DATABASE.PUBLIC.CUSTOMERS
WITH SYNONYMS ('Customers', 'Client Info')
)
RELATIONSHIPS (
ORDERS.CUSTOMER_ID = CUSTOMERS.CUSTOMER_ID
)
-- Facts, Dimensions, Metrics omitted
;
For detailed customization options, see the official Snowflake documentation - Using SQL commands to create and manage semantic views.
Integration with Cortex Analyst
Generated semantic views are immediately usable with Cortex Analyst.
Verifying in Snowsight
After creating your semantic view, verify it works right away in Snowsight's AI/ML Studio Cortex Analyst playground. For implementation, Snowflake offers various options including calling from Snowflake Intelligence, calling from Cortex Agents, or configuring as a tool in Snowflake MCP Server.
Important Notes
Tool Limitations
This stored procedure has the following constraint:
- Single Table Only: This tool generates semantic views from a single table only. To create semantic views combining multiple tables, first generate views for each table individually, then manually add relationships between tables.
Generated Semantic View Quality
As this uses AI auto-generation, please note:
- Not Perfect: Always review and customize generated semantic views according to business requirements
- Sample Data Dependency: Generated synonyms and descriptions vary based on sample data content
- Complex Logic: Advanced business logic requires manual additions and modifications
Conclusion
This AI auto-generation stored procedure is a tool that dramatically accelerates the initial stages of semantic view creation. While semantic layer preparation is essential for Cortex Analyst to deliver "natural language data analysis" - that ideal world - it would be a shame if this necessary work becomes a barrier that delays your start!
Using this stored procedure, you can generate baseline semantic views in seconds, then customize necessary parts from there. As a result, you can deliver analytics environments to business departments in a much shorter timeframe.
Take the first step toward realizing data democratization and empowering business users to extract insights from data themselves - I hope this tool helps you get there!
Promotion
Snowflake What's New Updates on X
I share Snowflake What's New updates on X. Follow for the latest insights:
English Version
Snowflake What's New Bot (English Version)
Japanese Version
Snowflake's What's New Bot (Japanese Version)
Change Log
(20251009) Initial post
Top comments (0)