Yoshiki Fujiwara(藤原善基)@AWS Community Builder for AWS Community Builders

Posted on May 25 • Edited on May 27

Snowflake and FSx for ONTAP S3 Access Points — From 'Access Denied' to Working External Tables

#aws #snowflake #amazonfsxfornetappontap #lakehouse

TL;DR

In Part 1, Athena worked cleanly. In Part 2, Databricks hit session policy boundaries. This Part 3 validates Snowflake's path — and it works.

Snowflake can query FSx for ONTAP S3 Access Point data, but only with the correct stage configuration. Without the AWS_ACCESS_POINT_ARN parameter, LIST works while SELECT fails with "access denied". With it, the tested read, governance, and AI paths work: SELECT, External Tables, COPY INTO load, Directory Tables, and governance tags, plus 8 out of 10 Cortex AI functions on FSx data (7 directly, 1 via COPY INTO for Cortex Search).

Configuration	LIST	SELECT	External Table	Cortex AI (text)	Vision AI
AP alias only (no ARN)	✅	❌ Access Denied	❌	❌	❌
AP alias + `AWS_ACCESS_POINT_ARN`	✅	✅	✅	✅ Direct	✅ Via staging

This appears to be a recurring integration pattern in this series: platforms that generate restrictive session policies need an explicit S3 Access Point ARN parameter so the generated policy includes the regional access point ARN.

Quick Decision Guide:

Zero-copy governed read on NAS data → External Table with AWS_ACCESS_POINT_ARN
Full AI + maximum query performance → COPY INTO internal table
RAG / semantic search over NAS documents → COPY INTO → Cortex Search Service (198ms)

GitHub Repository: fsxn-lakehouse-integrations

How to Read This Article

This article is:

A reproduction-focused validation report
Evidence from one environment (Snowflake Standard, ap-northeast-1)
A configuration guide for Snowflake + FSx for ONTAP S3 AP

Read by role:

Snowflake admin: Stage configuration → Working setup
Storage engineer: Evidence matrix → Root cause analysis
Data engineer: What works today → External Table setup
Partner / SA: Partner Decision Card → Architecture guidance
Security / governance reviewer: Governance Impact Summary → Regulated Workload Checklist
AI/ML engineer: AI / ML Integration Path → MLOps Boundary

Prerequisite Concepts

Before reading this article, it helps to understand:

Snowflake Storage Integration — an object that stores a reference to an IAM role for accessing external cloud storage
Snowflake External Stage — maps a cloud storage URL to a storage integration for data access
External Table — a Snowflake table that reads data directly from files on an external stage (no data copy)
AWS_ACCESS_POINT_ARN — a stage parameter that tells Snowflake to include the S3 Access Point ARN in its generated session policy
S3 Access Point ARN vs S3 bucket ARN — S3 AP uses arn:aws:s3:<region>:<account>:accesspoint/<name>, not arn:aws:s3:::<bucket>
Directory Table — a Snowflake feature that exposes file metadata (path, size, date) from a stage as a queryable table

Important premise: Snowflake does NOT officially document FSx for ONTAP S3 Access Points as a supported External Stage storage backend. The AWS_ACCESS_POINT_ARN parameter exists in Snowflake's CREATE STAGE documentation for S3 Access Points generally, but FSx for ONTAP S3 AP is not listed as a validated target. Our validation confirms that read and governance operations work when configured correctly, but this should not be interpreted as an officially supported configuration by Snowflake. Consult Snowflake Support before production use.

The Goal

Query structured and unstructured data stored on FSx for ONTAP from Snowflake without copying the data to a native S3 bucket. FSx for ONTAP S3 Access Points should make this possible by exposing NFS/SMB file data through the S3 API.

In Part 1, Athena worked cleanly. In Part 2, Databricks required the access_point field and still has limitations. This article validates Snowflake's path.

Test Environment

Snowflake Account: Standard edition, AWS ap-northeast-1
Warehouse: COMPUTE_WH (X-Small)
Role: ACCOUNTADMIN
FSx for ONTAP: <FILE_SYSTEM_ID> (ONTAP 9.17.1)
SVM: <SVM_NAME>
S3 Access Point: Internet-origin, UNIX file system user

Scope: This article validates Snowflake Standard edition. Enterprise features (e.g., advanced governance, private connectivity) may provide additional capabilities not tested here.

The Setup

Snowflake accesses external data through a three-layer configuration:

Storage Integration (IAM Role ARN + trust)
    │
    └── External Stage (S3 URL + AWS_ACCESS_POINT_ARN + file format)
            │
            └── External Table / SELECT @stage (data access)
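The first layer of this diagram can be sketched as follows. This is a minimal sketch, not the exact statement used in the validation: the IAM role name `<snowflake-access-role>` is a hypothetical placeholder, and using the access point alias as the allowed location is an assumption that mirrors the stage URL used later in this article. The integration name matches the one the stages reference.

```sql
-- Layer 1: storage integration that points Snowflake at an IAM role.
-- <account>, <snowflake-access-role>, and <ap-alias> are placeholders.
CREATE OR REPLACE STORAGE INTEGRATION fsxn_verification_integration
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::<account>:role/<snowflake-access-role>'
  STORAGE_ALLOWED_LOCATIONS = ('s3://<ap-alias>/');

-- Shows STORAGE_AWS_IAM_USER_ARN and STORAGE_AWS_EXTERNAL_ID,
-- which go into the trust policy of the IAM role.
DESC INTEGRATION fsxn_verification_integration;
```

The trust relationship itself is configured on the AWS side, using the two values returned by `DESC INTEGRATION`.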

Visual Story: Before and After

❌ Before: SELECT Fails Without `AWS_ACCESS_POINT_ARN`

CREATE OR REPLACE STAGE fsxn_stage_without_arn
  STORAGE_INTEGRATION = fsxn_verification_integration
  URL = 's3://<ap-alias>/'
  FILE_FORMAT = (TYPE = PARQUET);

LIST @fsxn_stage_without_arn/sensor-data/;   -- ✅ Works
SELECT $1 FROM @fsxn_stage_without_arn/sensor-data/sensor_data.parquet LIMIT 3;  -- ❌ Access Denied

"Failed to access remote file: access denied. Please check your credentials." — The same file that LIST found cannot be read.

✅ After: SELECT Succeeds With `AWS_ACCESS_POINT_ARN`

CREATE OR REPLACE STAGE fsxn_stage_with_arn
  STORAGE_INTEGRATION = fsxn_verification_integration
  URL = 's3://<ap-alias>/'
  AWS_ACCESS_POINT_ARN = 'arn:aws:s3:<region>:<account>:accesspoint/<ap-name>'
  FILE_FORMAT = (TYPE = PARQUET);

SELECT $1 FROM @fsxn_stage_with_arn/sensor-data/sensor_data.parquet LIMIT 3;  -- ✅ SUCCESS

Result: 3 rows of sensor data returned successfully.

{"humidity": 32.2, "id": 1, "pressure": 1002.1, "sensor_id": "S004", "status": "normal", "temperature": 21.13}
{"humidity": 45.63, "id": 2, "pressure": 1004.13, "sensor_id": "S005", "status": "normal", "temperature": 23.07}
{"humidity": 42.79, "id": 3, "pressure": 1000.18, "sensor_id": "S003", "status": "normal", "temperature": 36.96}

✅ External Table Also Works

CREATE OR REPLACE EXTERNAL TABLE fsxn_sensor_ext_table
  LOCATION = @fsxn_stage_with_arn/sensor-data/
  FILE_FORMAT = (TYPE = PARQUET)
  AUTO_REFRESH = FALSE;

SELECT * FROM fsxn_sensor_ext_table LIMIT 3;  -- ✅ SUCCESS (3 rows)
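Because the external table above is created without column definitions, each row arrives as a single VARIANT column named VALUE. A short sketch of a typed projection over it, assuming the field names shown in the sample output (`sensor_id`, `temperature`, `humidity`, `status`); the threshold of 30 is arbitrary:

```sql
-- Project typed columns out of the VALUE variant.
SELECT
  VALUE:sensor_id::VARCHAR AS sensor_id,
  VALUE:temperature::FLOAT AS temperature,
  VALUE:humidity::FLOAT    AS humidity,
  VALUE:status::VARCHAR    AS status
FROM fsxn_sensor_ext_table
WHERE VALUE:temperature::FLOAT > 30
ORDER BY temperature DESC
LIMIT 10;

-- AUTO_REFRESH is FALSE, so register new or changed files manually.
ALTER EXTERNAL TABLE fsxn_sensor_ext_table REFRESH;
```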

Complete Capability Matrix

Capability	Status	Notes
Read operations
SELECT from `@stage` (Parquet)	✅ Verified	GetObject with `AWS_ACCESS_POINT_ARN`
SELECT from `@stage` (CSV)	✅ Verified	CSV with SKIP_HEADER works
SELECT from `@stage` (JSON)	✅ Expected	Same GetObject path (no JSON files in test data)
External Table (read)	✅ Verified	CREATE + SELECT both succeed
LIST `@stage` (all prefixes)	✅ Verified	Subdirectories included
GET_PRESIGNED_URL	✅ Observed	Works but not officially supported
Load operations
COPY INTO (stage → table)	✅ Verified	4.9s for Parquet load
Governance
Governance Tags on External Table	✅ Verified	CREATE TAG + ALTER TABLE SET TAG
SYSTEM$GET_TAG	✅ Verified	Tag retrieval works
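Row Access Policy	✅ Expected	Built-in Snowflake table feature, not exercised here; Snowflake documents it as requiring Enterprise Edition or higher
Column Masking	✅ Expected	Built-in Snowflake table feature, not exercised here; Snowflake documents it as requiring Enterprise Edition or higher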
Write operations
PutObject (via COPY INTO unload)	⚠️ TBD	FSx S3 AP supports PutObject ≤5GB
Event-driven
Snowpipe (auto-ingest)	❌ Not possible	S3 Event Notifications not supported on FSx S3 AP
AUTO_REFRESH on External Table	❌ Not possible	Requires S3 Event Notifications
Transactional table formats
Iceberg Table read (pre-existing metadata)	⚠️ TBD	Requires separate validation
Iceberg Table write-back	❌ Not suitable	Conditional writes not supported on FSx for ONTAP S3 AP. For Iceberg, use Snowflake Managed Iceberg Table on standard S3 (COPY INTO from FSx for ONTAP External Stage → Iceberg table on S3). External engines (Spark, Athena) can then read the same Iceberg table.
Delta / Hudi write	❌ Not suitable	Conditional writes not supported
Supported file formats
Parquet	✅ Verified	Primary format for analytics
CSV	✅ Verified	With header skip, delimiter options
JSON	✅ Expected	Same read path as Parquet/CSV
Avro	✅ Expected	Snowflake-supported format, same read path
ORC	✅ Expected	Snowflake-supported format, same read path

Key insight: With AWS_ACCESS_POINT_ARN, Snowflake achieves broad read and governance integration for the tested paths. The confirmed limitations are event-driven features (Snowpipe, AUTO_REFRESH) and transactional write formats (Iceberg, Delta), both due to FSx S3 AP API limitations rather than Snowflake limitations. Unload via PutObject and Iceberg reads of pre-existing metadata remain untested (TBD).
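Two of the verified rows in the matrix, the CSV read and the governance tag, can be sketched as below. The file format name `fsxn_csv_format`, the tag name `data_origin`, its value, and the `<csv-prefix>` path are illustrative assumptions rather than the names used in the validation.

```sql
-- CSV read with a header row skipped.
CREATE OR REPLACE FILE FORMAT fsxn_csv_format
  TYPE = CSV
  SKIP_HEADER = 1
  FIELD_OPTIONALLY_ENCLOSED_BY = '"';

SELECT $1, $2, $3
FROM @fsxn_stage_with_arn/<csv-prefix>/ (FILE_FORMAT => 'fsxn_csv_format')
LIMIT 3;

-- Governance tag on the external table, then read it back.
CREATE TAG IF NOT EXISTS data_origin COMMENT = 'Where the underlying files live';
ALTER TABLE fsxn_sensor_ext_table SET TAG data_origin = 'fsx-ontap-s3ap';
SELECT SYSTEM$GET_TAG('data_origin', 'fsxn_sensor_ext_table', 'table') AS data_origin;
```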

The Root Cause: Session Policy ARN Mismatch

When Snowflake performs sts:AssumeRole, it applies a session policy. Without AWS_ACCESS_POINT_ARN, this session policy uses standard S3 bucket ARN patterns that don't match the FSx S3 AP regional ARN format:

Without AWS_ACCESS_POINT_ARN:
  Session policy allows GetObject on: arn:aws:s3:::*/*
  FSx S3 AP actual ARN:              arn:aws:s3:<region>:<account>:accesspoint/<name>/object/*
  → NO MATCH → AccessDenied

With AWS_ACCESS_POINT_ARN:
  Session policy includes:            arn:aws:s3:<region>:<account>:accesspoint/<name>/*
  → MATCH → GetObject succeeds

This is the same pattern as Databricks Unity Catalog's access_point field — both platforms need the S3 AP ARN explicitly specified to include it in the generated session policy.
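To check which variant of a stage you are working with, you can inspect the stage properties. A minimal sketch: `DESC STAGE` is a standard command, but how the access point ARN is labeled in its output is an assumption here, so scan the returned rows rather than relying on a specific property name.

```sql
-- Compare the two stage definitions used in this article.
DESC STAGE fsxn_stage_without_arn;  -- no access point ARN expected in the output
DESC STAGE fsxn_stage_with_arn;     -- the accesspoint ARN should appear in the output

-- Symptom check: if LIST returns files but SELECT on one of them is denied,
-- rule out the session policy mismatch described above first.
LIST @fsxn_stage_with_arn/sensor-data/;
```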

Support Confirmation (May 2026): Snowflake Support confirmed this resolution. The original issue (LIST works, SELECT fails with "access denied") is resolved by adding the AWS_ACCESS_POINT_ARN parameter to the stage definition. Unlike Databricks (where the equivalent access_point field was never GA and has been removed), Snowflake's AWS_ACCESS_POINT_ARN is a documented, supported parameter in the CREATE STAGE reference.

Evidence Matrix

Layer	Evidence	Result	Interpretation
Snowflake integration	DESCRIBE INTEGRATION	✅ Pass	Trust established
Stage metadata	LIST `@stage`	✅ Pass	ListBucket path works (bucket-level ARN matches)
Object read (no ARN)	SELECT `@stage`	❌ Fail	GetObject blocked by session policy
Object read (with ARN)	SELECT `@stage`	✅ Pass	`AWS_ACCESS_POINT_ARN` resolves session policy
External Table	CREATE + SELECT	✅ Pass	Governed table access works with ARN
Same role direct	AWS CLI List/Get/Head	✅ Pass	IAM/AP/FSx permissions are correct
FSx authorization	File system user permissions	✅ Pass	FSx-side permission permits access
Operational health	SVM DNS check	✅ Pass	Distinguish ReadTimeout from AccessDenied

FSx for ONTAP S3 AP Authorization Path

FSx for ONTAP S3 Access Points use a dual-layer authorization model:

Layer 1 — S3-side authorization:

IAM identity-based policy (Snowflake's assumed role session)
S3 Access Point resource policy
Session policy generated by Snowflake (requires AWS_ACCESS_POINT_ARN to include AP ARN)

Layer 2 — FSx for ONTAP-side authorization:

File system user associated with the access point
UNIX mode-bits / NFSv4 ACLs (for UNIX security style volumes)

In the Snowflake validation, the initial failure occurred at Layer 1 — Snowflake's generated session policy did not include the S3 AP ARN pattern. Setting AWS_ACCESS_POINT_ARN resolves this by instructing Snowflake to include the AP ARN in the session policy, allowing both layers to be evaluated normally.

S3 API Compatibility and Snowflake Operations

Snowflake operation	Likely S3 operation	FSx S3 AP support	Observed result (with ARN)
LIST `@stage`	ListObjectsV2	✅ Supported	✅ Success
SELECT `@stage`	GetObject / HeadObject	✅ Supported	✅ Success
GET_PRESIGNED_URL	Presign / signed GetObject URL	Presign not supported in FSx S3 AP docs	Observed working; not a supported production path
External Table read	GetObject	✅ Supported	✅ Success
Iceberg metadata read	Head/Get + conditional	Partial (conditional writes not supported)	TBD

Comparison: Snowflake vs Databricks

Aspect	Snowflake	Databricks
Parameter name	`AWS_ACCESS_POINT_ARN` (on stage)	`access_point` (on External Location)
LIST without parameter	✅ Works	❌ Blocked (before `access_point`)
SELECT without parameter	❌ Fails	❌ Fails
SELECT with parameter	✅ Works	✅ Works (explicit path only)
External Table / UC Table	✅ Works	❌ CREATE TABLE still fails
Subdirectory listing	✅ Works	❌ Blocked
Documentation	CREATE STAGE docs	Databricks Support (May 2026)

Key difference: Snowflake's AWS_ACCESS_POINT_ARN resolves the issue more completely than Databricks' access_point field. Snowflake achieves External Table creation and reads (without AUTO_REFRESH), while Databricks still cannot create UC tables.

Partner Decision Card

Customer requirement	Snowflake + FSx S3 AP today	Recommended path
File discovery only	✅ Works (LIST / Directory Table)	Use directly
Query file contents in Snowflake	✅ Works with `AWS_ACCESS_POINT_ARN`	Configure stage with ARN
Governed Snowflake external tables	✅ Works with `AWS_ACCESS_POINT_ARN`	Configure stage with ARN
Zero-copy SQL on NAS data	✅ Snowflake or Athena	Both work; choose by workload
AI on NAS data (summarize, RAG, sentiment)	✅ 8/10 Cortex functions validated	External Table + Cortex AI (zero-copy for text)
Automated enrichment pipeline	✅ Dynamic Table (confirmed May 2026)	External Table → Dynamic Table (TARGET_LAG = '1 hour')
Share curated data with partners	✅ Data Sharing	External Table or Dynamic Table → GRANT TO SHARE
Open format for multi-engine access	✅ Managed Iceberg Table (confirmed May 2026)	COPY INTO → Managed Iceberg → Databricks/Athena can read
Snowflake ML / Snowpark on NAS data	✅ Possible via External Table	Configure stage with ARN, validate Snowpark path
Iceberg Table on FSx S3 AP	TBD (conditional writes not supported)	Validate separately

Choose Snowflake when governed external tables, tags, Directory Tables, or Snowpark integration are required. Choose Athena when lightweight AWS-native serverless SQL over NAS data is sufficient.
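The "open format for multi-engine access" row can be sketched as follows. This is an illustration under assumptions, not the validated statement: the table name `sensor_iceberg`, the column list (taken from the sample sensor rows), and the external volume `<external-volume-on-standard-s3>` are hypothetical, and the external volume must point at a standard S3 bucket, not the FSx access point. The sketch uses INSERT ... SELECT from the external table instead of COPY INTO to keep the column mapping explicit.

```sql
-- Snowflake-managed Iceberg table stored on standard S3.
CREATE OR REPLACE ICEBERG TABLE sensor_iceberg (
  id          INT,
  sensor_id   STRING,
  temperature DOUBLE,
  humidity    DOUBLE,
  pressure    DOUBLE,
  status      STRING
)
  CATALOG = 'SNOWFLAKE'
  EXTERNAL_VOLUME = '<external-volume-on-standard-s3>'
  BASE_LOCATION = 'sensor_iceberg/';

-- Populate it from the FSx-backed external table.
INSERT INTO sensor_iceberg
SELECT
  VALUE:id::INT,
  VALUE:sensor_id::STRING,
  VALUE:temperature::DOUBLE,
  VALUE:humidity::DOUBLE,
  VALUE:pressure::DOUBLE,
  VALUE:status::STRING
FROM fsxn_sensor_ext_table;
```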

Discovery Questions for Partners

When a customer asks about Snowflake + FSx for ONTAP S3 Access Points:

Is the workload read-only analytics, or does it require write-back?
Is Snowflake governance (tags, row access policy, masking) required?
Does the workload need real-time file detection (Snowpipe), or is scheduled refresh acceptable?
Are the target files structured (Parquet/CSV/JSON) or unstructured (images/documents)?
Is the data regulated (PHI, PII, financial)? If so, review presigned URL governance.
Does the customer need Iceberg table format? (Write-back not supported on FSx S3 AP)
What is the expected file count and average file size? (Impacts LIST/REFRESH latency)
Is the Snowflake account in the same AWS region as FSx for ONTAP?

Governance Impact

Capability	Status	Governance impact
LIST `@stage`	✅ Works	File inventory; not data access governance
SELECT `@stage`	✅ Works (with ARN)	Query-level access via Snowflake governance
External Table	✅ Works (with ARN)	Governed schema/table abstraction available
Iceberg Table	❌ Write not suitable	Conditional writes not supported; read of pre-existing tables TBD
GET_PRESIGNED_URL	⚠️ Observed only	Risk of bypassing Snowflake query governance if misused

For regulated workloads, do not use GET_PRESIGNED_URL as a workaround for query access. Even if URL generation is observed to work, it is not a governed Snowflake query path and should be reviewed separately for auditability, expiration, data classification, and access logging.

Governance Impact Summary

Important premise: FSx for ONTAP S3 Access Points are NOT officially documented by Snowflake as a supported External Stage storage backend. The governance paths described below are validated in this environment but should not be treated as officially supported configurations without Snowflake Support confirmation.

Access path	Governance model	Auditability	Production suitability
External Table (with `AWS_ACCESS_POINT_ARN`)	Snowflake RBAC + Tags + Row Access Policy	High (Snowflake Access History, query logs)	Recommended governed read path
COPY INTO (load to Snowflake table)	Full Snowflake governance on loaded data	High (standard Snowflake table governance)	Recommended for ML/AI workloads requiring full governance
Directory Table + GET_PRESIGNED_URL	File catalog governed; URL access is external	Medium (catalog queries logged; URL access not logged by Snowflake)	File discovery governed; downstream access requires separate audit
BUILD_SCOPED_FILE_URL	Snowflake-mediated access	High (access mediated through Snowflake privileges)	Preferred for governed unstructured data access
GET_PRESIGNED_URL (direct)	External access path	Low (Snowflake does not log URL usage after generation)	PoC / non-regulated only; requires separate access logging

Snowflake Access History captures query-level access to External Tables. However, presigned URL usage after generation is not tracked by Snowflake — use CloudTrail S3 data events for downstream audit if required.
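If Access History is not available in your edition, the account usage query history gives a coarser audit trail of who queried the external table. A minimal sketch, assuming your role can read the SNOWFLAKE.ACCOUNT_USAGE schema; the seven-day window is arbitrary, and the text match is a heuristic rather than true object-level lineage.

```sql
-- Recent queries that mention the FSx-backed external table.
-- ACCOUNT_USAGE views lag behind real time, so very recent queries may be missing.
SELECT
  query_id,
  user_name,
  role_name,
  start_time,
  execution_status,
  query_text
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE query_text ILIKE '%fsxn_sensor_ext_table%'
  AND start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
ORDER BY start_time DESC;
```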

MLOps Boundary

Reading data from FSx for ONTAP S3 AP via Snowflake External Table does not automatically make the downstream ML workflow governed.

If the data accessed via External Table or COPY INTO is used for ML or GenAI:

Register derived datasets in governed Snowflake tables
Track experiments with Snowflake ML lineage or external experiment tracking
Document source data access path (stage name, S3 AP alias, prefix, timestamp)
Record whether training data lineage is captured within Snowflake or externalized
Ensure Snowpark ML workloads use appropriate role privileges
If using Cortex functions, validate that input data classification is appropriate for the model

Snowflake's ML Lineage tracks feature-to-model relationships. If the source data path is an External Table on FSx S3 AP, document this as the lineage origin.

AI / RAG Data Readiness Checklist

If the FSx for ONTAP S3 AP data is intended for AI, RAG, or GenAI pipelines via Snowflake:

[ ] Are documents classified by sensitivity (PHI, PII, financial, internal, public)?
[ ] Are file-level permissions preserved or re-modeled for the AI pipeline?
[ ] Is metadata available for filtering and retrieval (file type, date, owner)? → Use Directory Table
[ ] Is freshness requirement defined (real-time, daily, weekly)? → Define REFRESH schedule
[ ] Is read-only access sufficient, or does the pipeline need write-back?
[ ] Is human review required for generated output before downstream use?
[ ] Is permission-aware retrieval required (user A sees only their authorized documents)?

If permission-aware retrieval is required, define one of:

Enforce at source access path — use per-user or per-group S3 Access Points with scoped file system users
Re-model permissions in metadata index — extract file-level ACLs into Directory Table metadata and filter at query time
Filter retrieval results by user/group claims — apply Snowflake Row Access Policy on External Table based on authenticated user identity
Do not proceed until authorization model is validated and approved by security owner
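The third option, filtering by user or group claims, can be sketched with a mapping-table row access policy. Everything named here is hypothetical: the `sensor_access_map` table, the `sensor_row_policy` policy, and the assumption that the loaded table `sensor_documents` has a `sensor_id` column. Row access policies also depend on edition, so treat this as a shape to validate rather than a tested configuration. For simplicity it is attached to a loaded table; attaching a policy to an External Table is not shown.

```sql
-- Hypothetical mapping of Snowflake roles to the sensors they may see.
CREATE TABLE IF NOT EXISTS sensor_access_map (
  role_name VARCHAR,
  sensor_id VARCHAR
);

-- A row is visible only if the current role is mapped to its sensor_id.
CREATE OR REPLACE ROW ACCESS POLICY sensor_row_policy
  AS (sid VARCHAR) RETURNS BOOLEAN ->
    EXISTS (
      SELECT 1
      FROM sensor_access_map m
      WHERE m.role_name = CURRENT_ROLE()
        AND m.sensor_id = sid
    );

ALTER TABLE sensor_documents
  ADD ROW ACCESS POLICY sensor_row_policy ON (sensor_id);
```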

Snowflake + FSx S3 AP approval requirements (for regulated workloads):

Data owner approval for External Table / stage access
Security owner approval for presigned URL generation policy
Platform owner approval for COPY INTO (data leaves FSx, enters Snowflake)
Defined: allowed prefix, allowed operations, refresh schedule, expiration date
Approval record location (where the decision is stored)
Review / expiration date (when the approval must be re-evaluated)

For regulated workloads, exercise caution with:

GET_PRESIGNED_URL for patient-facing or financial data (bypasses Snowflake query governance)
COPY INTO without data classification review (data moves from FSx to Snowflake storage)
Cortex LLM functions on sensitive data without human review gate
Unreviewed access to regulated datasets via scoped URLs

Unstructured Data Support

Format	Support	Access Method	Use Case
Images (JPEG, PNG, TIFF)	✅	GET_PRESIGNED_URL / BUILD_SCOPED_FILE_URL	Thumbnail generation, ML inference, quality inspection
Video (MP4, MOV)	✅	GET_PRESIGNED_URL	Streaming, frame extraction
Documents (PDF, DOCX)	✅	GET_PRESIGNED_URL / Snowpark File Access	Text extraction, RAG, document processing
Audio (WAV, MP3)	✅	GET_PRESIGNED_URL	Transcription, speech analytics
Binary / Archives	✅	GET_PRESIGNED_URL	Download, transfer

How to manage unstructured data as a library:

-- Enable Directory Table for file catalog
ALTER STAGE fsxn_stage SET DIRECTORY = (ENABLE = TRUE);
ALTER STAGE fsxn_stage REFRESH;

-- Query file catalog (search by path, size, date)
SELECT RELATIVE_PATH, SIZE, LAST_MODIFIED
FROM DIRECTORY(@fsxn_stage)
WHERE RELATIVE_PATH LIKE '%images/%'
ORDER BY LAST_MODIFIED DESC;

-- Generate download URL for applications (valid 1 hour)
SELECT GET_PRESIGNED_URL(@fsxn_stage, 'images/photo001.jpg', 3600);

-- Generate Snowflake-proxied secure URL
SELECT BUILD_SCOPED_FILE_URL(@fsxn_stage, 'documents/report.pdf');

Note: AUTO_REFRESH is not available because FSx S3 AP does not support S3 Event Notifications (GetBucketNotificationConfiguration is not supported). Use ALTER STAGE REFRESH manually or via Snowflake Task on a schedule.
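A scheduled refresh can be sketched with a Snowflake Task. The task name `fsxn_stage_refresh_task` and the hourly schedule are assumptions; choose a cadence that matches your freshness requirement and file count.

```sql
-- Refresh the directory table once an hour instead of relying on event notifications.
CREATE OR REPLACE TASK fsxn_stage_refresh_task
  WAREHOUSE = COMPUTE_WH
  SCHEDULE = 'USING CRON 0 * * * * UTC'
AS
  ALTER STAGE fsxn_stage REFRESH;

-- Tasks are created suspended; resume to start the schedule.
ALTER TASK fsxn_stage_refresh_task RESUME;
```

The same pattern applies to External Tables by replacing the task body with `ALTER EXTERNAL TABLE fsxn_sensor_ext_table REFRESH`.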

URL type guidance: Use BUILD_SCOPED_FILE_URL when you want access mediated through Snowflake role privileges (governed path). Treat GET_PRESIGNED_URL as an external object access path that bypasses Snowflake query governance and requires separate review for regulated workloads.

AI / ML Integration Path

Snowflake provides AI/ML capabilities that can leverage FSx for ONTAP data via S3 AP. 7 out of 9 tested Cortex AI functions work directly on FSx S3 AP data without copying.

Snowflake AI/ML Feature	FSx S3 AP Compatibility	Access Path	Duration	Use Case
CORTEX.SUMMARIZE	✅ Direct	External Table → Cortex	3.3s	Text summarization on NAS documents
CORTEX.TRANSLATE	✅ Direct	External Table → Cortex	5.1s	Multi-language support
CORTEX.SENTIMENT	✅ Direct	External Table → Cortex	2.5s	Sentiment analysis
CORTEX.COMPLETE (text)	✅ Direct	External Table → Cortex	16s	AI analysis, anomaly detection
CORTEX.EXTRACT_ANSWER	✅ Direct	External Table → Cortex	2.7s	Information extraction
PARSE_DOCUMENT (OCR)	✅ Direct	Stage path → OCR	~8s	Invoice/report text extraction
COMPLETE (Vision/Multimodal)	✅ Workaround	COPY FILES → internal stage → TO_FILE	41s	Image analysis, defect detection
TO_FILE on FSx S3 AP	❌ Blocked	—	—	"Remote file not found"
Cortex Search (RAG)	✅ Verified	External Table → COPY INTO → Cortex Search Service	198ms query	Semantic search over NAS documents

Key finding: Text-based Cortex functions, PARSE_DOCUMENT, and Cortex Search all work on FSx S3 AP data (Cortex Search requires COPY INTO as a staging step). Vision AI (multimodal COMPLETE) requires a staging step because TO_FILE() cannot resolve files on S3 AP external stages.

Validated AI/ML paths:

✅ Cortex LLM SUMMARIZE on External Table data — AI-generated summary in 3.3s
✅ Cortex TRANSLATE on External Table data — English to Japanese in 5.1s
✅ Cortex SENTIMENT on External Table data — sentiment scores in 2.5s
✅ Cortex COMPLETE (text) on External Table data — AI anomaly analysis in 16s
✅ Cortex EXTRACT_ANSWER on External Table data — information extraction in 2.7s
✅ PARSE_DOCUMENT (OCR) on FSx S3 AP stage file — text extraction from images in ~8s
✅ COMPLETE (Vision AI) via COPY FILES workaround — image analysis in 41s (pixtral-large)
✅ Cortex Search (RAG) — External Table → COPY INTO → Cortex Search Service → semantic query in 198ms
✅ COPY INTO loads NAS data into Snowflake tables → available for all Cortex/ML functions
✅ Directory Table catalogs unstructured files → enables file discovery for processing pipelines
✅ GET_PRESIGNED_URL generates download URLs → enables external ML services to access files
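The text-function paths in this list share one shape: cast the External Table row (or one of its fields) to text and pass it to the Cortex function. A minimal sketch, assuming the sensor External Table from earlier. The question string is illustrative, and because sensor rows are not natural-language text, the outputs show the call pattern rather than meaningful analysis.

```sql
-- Each function reads directly from the FSx-backed external table.
-- LIMIT keeps credit consumption small while testing.
SELECT
  VALUE:sensor_id::VARCHAR                                        AS sensor_id,
  SNOWFLAKE.CORTEX.SENTIMENT(VALUE::VARCHAR)                      AS sentiment_score,
  SNOWFLAKE.CORTEX.SUMMARIZE(VALUE::VARCHAR)                      AS summary,
  SNOWFLAKE.CORTEX.TRANSLATE(VALUE:status::VARCHAR, 'en', 'ja')   AS status_ja,
  SNOWFLAKE.CORTEX.EXTRACT_ANSWER(
    VALUE::VARCHAR,
    'Which sensor produced this reading?'
  )                                                               AS answer
FROM fsxn_sensor_ext_table
LIMIT 3;
```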

Vision AI Workaround (Validated)

Direct TO_FILE() on FSx S3 AP external stage returns "Remote file not found." The workaround:

-- 1. Create an internal stage with server-side encryption (SNOWFLAKE_SSE required; the default client-side encryption blocks TO_FILE)
CREATE OR REPLACE STAGE fsxn_ai_stage ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE');

-- 2. Copy image from FSx S3 AP to internal stage
COPY FILES INTO @fsxn_ai_stage FROM @fsxn_ap_arn_test_stage/media/documents/invoice_sample.png;

-- 3. Enable Cross-Region Inference (required for vision models in ap-northeast-1)
ALTER ACCOUNT SET CORTEX_ENABLED_CROSS_REGION = 'ANY_REGION';

-- 4. Run Vision AI
ALTER STAGE fsxn_ai_stage SET DIRECTORY = (ENABLE = TRUE);
ALTER STAGE fsxn_ai_stage REFRESH;
SELECT SNOWFLAKE.CORTEX.COMPLETE('pixtral-large',
  'Describe this invoice. What is the invoice number, customer, and amount?', FILE
) AS vision_result
FROM (SELECT TO_FILE(BUILD_SCOPED_FILE_URL(@fsxn_ai_stage, RELATIVE_PATH)) AS FILE
      FROM DIRECTORY(@fsxn_ai_stage) WHERE RELATIVE_PATH LIKE '%.png' LIMIT 1);

Result: Vision AI correctly identified Invoice #INV-2026-0524, Customer: Acme Corp, Amount: USD 1,234.56.

Data residency note: The COPY FILES step moves image data from FSx for ONTAP to Snowflake-managed internal storage. Cross-Region Inference may route data to US/EU regions for model processing. Verify compliance with your data residency requirements before enabling for regulated workloads.

Cortex Search (RAG) — Validated

Cortex Search provides semantic search over text data — the Snowflake-native RAG building block. The validated path uses External Table → COPY INTO → Cortex Search Service:

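-- 1. Load FSx S3 AP data into internal table (required for Cortex Search)
-- MATCH_BY_COLUMN_NAME maps Parquet fields to the table's named columns;
-- without it, a Parquet load targets a single VARIANT column
COPY INTO sensor_documents FROM @fsxn_stage_with_arn/sensor-data/
  FILE_FORMAT = (TYPE = PARQUET)
  MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;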

-- 2. Create Cortex Search Service on the loaded data
CREATE OR REPLACE CORTEX SEARCH SERVICE sensor_search_service
  ON text_column
  WAREHOUSE = COMPUTE_WH
  TARGET_LAG = '1 hour'
  AS (SELECT * FROM sensor_documents);

-- 3. Semantic search query
SELECT PARSE_JSON(
  SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
    'sensor_search_service',
    '{"query": "high temperature anomaly", "columns": ["text_column"], "limit": 5}'
  )
);
-- Result: Relevant documents returned in 198ms

Dataset context: This validation used the sensor data loaded via COPY INTO from FSx S3 AP (1000 rows of IoT sensor readings). Cortex Search performance at scale (millions of documents, large text corpora) should be validated separately — 198ms is a sizing reference for this dataset size, not a service-level guarantee.

GA status: Verify that Cortex Search Service and its query functions are Generally Available (GA) in your Snowflake edition and region before production use. Preview features may not be covered by Snowflake SLA and should not be used for regulated workloads without explicit vendor confirmation.

Cortex Search Service created on data loaded from FSx for ONTAP via COPY INTO.

Semantic search query returns relevant results in 198ms — RAG-style retrieval over NAS-originated data.

Key insight: Cortex Search requires COPY INTO (data must be in a Snowflake internal table), but the end-to-end path from FSx for ONTAP → External Stage → COPY INTO → Cortex Search Service → semantic query is validated. This provides a Snowflake-native RAG path for NAS documents.

Data residency change: COPY INTO moves data from FSx for ONTAP to Snowflake-managed storage. Once loaded, the data is subject to Snowflake's storage lifecycle, not ONTAP's. For regulated workloads, obtain data owner approval before COPY INTO and document the residency change in your compliance records. Cortex Search Service indexes are stored in the same region as the Snowflake account — no cross-region data movement occurs for the index itself.

Comparison with Bedrock Knowledge Bases: Cortex Search requires a COPY INTO step (data moves to Snowflake storage). Bedrock Knowledge Bases can read directly from FSx S3 AP without copying. Choose Cortex Search when the RAG pipeline must stay within Snowflake governance. Choose Bedrock KB when data residency on FSx is mandatory and AWS-native RAG is preferred.

PoC Quick Start — Validate Cortex Search on your NAS data in 3 steps (estimated: 30 minutes with pre-configured stage):

Configure External Stage with AWS_ACCESS_POINT_ARN (see Configuration Guide above)
Run COPY INTO <target_table> FROM @fsxn_stage/<your-documents-prefix>/ to load text data
Create Cortex Search Service on the loaded table and run a semantic query to validate retrieval quality
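Step 2 needs a target table to exist. One way to create it, sketched under assumptions, is to infer the columns from the Parquet files on the stage. The file format name `fsxn_parquet_format` is hypothetical, and schema inference against an FSx S3 AP stage was not part of the validation described in this article. The search column you index afterwards must exist in your own documents.

```sql
-- INFER_SCHEMA requires a named file format.
CREATE OR REPLACE FILE FORMAT fsxn_parquet_format TYPE = PARQUET;

-- Create the target table with columns inferred from the staged Parquet files.
CREATE OR REPLACE TABLE sensor_documents
  USING TEMPLATE (
    SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*))
    FROM TABLE(
      INFER_SCHEMA(
        LOCATION => '@fsxn_stage_with_arn/sensor-data/',
        FILE_FORMAT => 'fsxn_parquet_format'
      )
    )
  );

-- Load by column name so Parquet fields land in the matching columns.
COPY INTO sensor_documents
  FROM @fsxn_stage_with_arn/sensor-data/
  FILE_FORMAT = (FORMAT_NAME = 'fsxn_parquet_format')
  MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
```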

Manufacturing Use Case: OCR + AI on NAS Data

-- OCR: Extract text from inspection report image stored on FSx for ONTAP
SELECT SNOWFLAKE.CORTEX.PARSE_DOCUMENT(
  @fsxn_stage,
  'media/documents/invoice_sample.png',
  {'mode': 'OCR'}
) AS ocr_result;
-- Result: "INVOICE #INV-2026-0524", "Customer: Acme Corp", "Amount: USD 1,234.56"

-- AI Analysis: Analyze sensor data for anomalies
SELECT SNOWFLAKE.CORTEX.COMPLETE('mistral-large2',
  'Analyze this IoT sensor reading and identify anomalies: ' || VALUE::VARCHAR
) AS ai_analysis FROM fsxn_sensor_ext_table LIMIT 1;

PARSE_DOCUMENT (OCR mode) extracts text from an image on FSx for ONTAP via S3 AP — works directly without copying.

Cortex COMPLETE (mistral-large2) generates AI anomaly analysis of IoT sensor data on FSx for ONTAP — works directly on External Table data.

Vision AI (pixtral-large) correctly extracts invoice details from an image originally on FSx for ONTAP — requires COPY FILES to internal stage.
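The single-file OCR call can be extended to a batch over the Directory Table. This is a sketch rather than a validated run: it assumes the Directory Table on `fsxn_stage` is enabled and refreshed, that the PNG files sit under `media/documents/`, and that the extracted text is read from the `content` field of the PARSE_DOCUMENT result.

```sql
-- OCR a handful of staged images and keep only the extracted text.
WITH parsed AS (
  SELECT
    RELATIVE_PATH,
    SNOWFLAKE.CORTEX.PARSE_DOCUMENT(
      @fsxn_stage,
      RELATIVE_PATH,
      {'mode': 'OCR'}
    ) AS doc
  FROM DIRECTORY(@fsxn_stage)
  WHERE RELATIVE_PATH LIKE 'media/documents/%.png'
  LIMIT 5
)
SELECT
  RELATIVE_PATH,
  doc:content::VARCHAR AS ocr_text
FROM parsed;
```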

Not validated in this article:

Snowpark File Access (SnowflakeFile.open()) for direct binary file processing in UDFs
AI_TRANSCRIBE for audio files on FSx S3 AP

Comparison with Databricks AI/ML path:

AI/ML Capability	Snowflake + FSx S3 AP	Databricks + FSx S3 AP
Governed table as ML input	✅ External Table	❌ UC Table creation blocked
Text AI (LLM) on NAS data	✅ 6 Cortex functions direct	⚠️ boto3 + external LLM (bypasses UC)
Vision AI on NAS images	✅ Via staging workaround (41s)	⚠️ boto3 driver-only (bypasses UC)
OCR / Document extraction	✅ PARSE_DOCUMENT direct (8s)	⚠️ boto3 + external OCR
Feature engineering	✅ Snowpark DataFrame on External Table	⚠️ spark.read with explicit path only
File catalog for ML pipeline	✅ Directory Table	⚠️ dbutils.fs.ls (top-level only)
RAG over NAS documents	✅ Cortex Search (via COPY INTO, 198ms)	⚠️ boto3 + external RAG (bypasses UC)

Key insight: Snowflake's AI/ML path benefits from governed External Tables and direct Cortex function access — 8 out of 10 tested functions work on FSx data (7 directly without copying, 1 via COPY INTO for Cortex Search). Databricks' AI/ML path is limited by UC table creation failure, forcing boto3 workarounds that bypass governance.

For end-to-end RAG on NAS documents: Use Snowflake Cortex Search (validated: External Table → COPY INTO → Cortex Search Service, 198ms query latency) or Amazon Bedrock Knowledge Bases as the AWS-documented path (no copy needed).

Decision guidance: Use Snowflake when the customer already needs Snowflake governance, Cortex/Snowpark processing, or table-based feature engineering. Use Bedrock Knowledge Bases when the primary requirement is AWS-native permission-aware RAG over NAS documents.

Comparison: Snowflake vs Databricks (Governance)

Governance Capability	Snowflake + FSx S3 AP	Databricks + FSx S3 AP
Table creation	✅ External Table	❌ CREATE TABLE fails
Data classification tags	✅ Governance Tags	❌ UC Table not creatable
Access control	✅ Row Access Policy	❌ UC governance not applicable
File catalog	✅ Directory Table	⚠️ dbutils.fs.ls (top-level only)
Secure URL generation	✅ BUILD_SCOPED_FILE_URL	❌
Column masking	✅ Available	❌
COPY INTO (data load)	✅	❌
Unstructured data catalog	✅ Directory Table + Presigned URL	⚠️ boto3 only (bypasses governance)

Key takeaway: In this validation, Snowflake with AWS_ACCESS_POINT_ARN achieved a more complete governed read path than the Databricks path tested in Part 2. Snowflake can create governed tables, apply tags, and manage unstructured data catalogs — capabilities that remain blocked in Databricks due to UC table creation failure.

For regulated workloads: Snowflake provides a more complete governed path today (External Table + Tags + Row Access Policy + audit trail). Databricks requires staged ingestion to S3 for equivalent governance. If your compliance framework requires governed table-level access control on the data, Snowflake is the validated path for FSx S3 AP integration.

Business Impact

Requirement	Observed result	Business impact	Recommended decision
Zero-copy Snowflake query over NAS	✅ Works (with ARN)	Eliminates copy pipeline	Use `AWS_ACCESS_POINT_ARN` stage
Snowflake governance on FSx data	✅ External Table works	Governed table abstraction available	Create External Tables
File inventory from Snowflake	✅ Works	Metadata cataloging possible	Use LIST / Directory Tables
RAG / AI over NAS documents	✅ Cortex Search validated (198ms)	Snowflake-native RAG path available	COPY INTO → Cortex Search Service
Text AI on NAS data (no copy)	✅ 7 functions direct	AI processing without data movement	Use Cortex functions on External Table

Detailed validation metrics (refresh duration, file count, query latency, COPY INTO duration, URL generation success rate) should be recorded in the verification-pack evidence files rather than treated as universal benchmark numbers.

Use Case Fit Matrix

Use case	Best current path	Why
SQL analytics on structured NAS files	Snowflake External Table or Athena	Both validated; Snowflake adds governance tags
Unstructured data catalog	Snowflake Directory Table	File metadata queryable with governance
Data load from NAS to Snowflake	COPY INTO from FSx S3 AP stage	Validated (4.9s for Parquet)
RAG over NAS documents	Cortex Search (via COPY INTO, validated 198ms) or Bedrock KB (AWS-native)	Cortex Search validated; Bedrock KB is AWS-documented path
ML feature engineering	Snowpark DataFrame on External Table	Governed read path available
Real-time ingestion	Not FSx S3 AP path	Use native S3 + Snowpipe
Iceberg / transactional tables	Not FSx S3 AP path	Use native S3 for write-back

Cost Model Considerations

Component	Cost driver	Notes
Snowflake warehouse	Credit consumption during queries	X-Small sufficient for validation; scale per workload
FSx for ONTAP	Throughput capacity + storage	S3 AP queries share throughput with NFS/SMB workloads
S3 AP requests	No additional S3 request charges	FSx S3 AP does not incur separate S3 API fees
Data transfer	Standard AWS data transfer	Snowflake SaaS in same region minimizes transfer

Cost comparison across engines is not the focus of this article. Snowflake's credit-based model differs fundamentally from Athena's per-TB-scanned model. Evaluate based on workload pattern, governance requirements, and existing Snowflake investment.

Configuration Guide

Step 1: Create Storage Integration

CREATE OR REPLACE STORAGE INTEGRATION fsxn_integration
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::<account>:role/<role-name>'
  STORAGE_ALLOWED_LOCATIONS = ('s3://<ap-alias>/');

Step 2: Create Stage WITH `AWS_ACCESS_POINT_ARN`

CREATE OR REPLACE STAGE fsxn_stage
  STORAGE_INTEGRATION = fsxn_integration
  URL = 's3://<ap-alias>/'
  AWS_ACCESS_POINT_ARN = 'arn:aws:s3:<region>:<account>:accesspoint/<ap-name>'
  FILE_FORMAT = (TYPE = PARQUET);
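Because a stage without a well-formed ARN still lets LIST succeed, a mistyped or bucket-style value can be easy to miss. The following is a minimal sketch, not part of the tested setup, that checks the ARN shape before rendering the Step 2 statement. The helper names (`is_access_point_arn`, `render_stage_ddl`) and all example values are hypothetical, and the pattern assumes the standard `aws` partition and a 12-digit account ID.

```python
import re

# Shape from the Step 2 template: arn:aws:s3:<region>:<account>:accesspoint/<ap-name>
# Assumption: standard "aws" partition and a 12-digit account ID.
_AP_ARN = re.compile(r"arn:aws:s3:[a-z0-9-]+:\d{12}:accesspoint/[a-z0-9-]+")


def is_access_point_arn(value: str) -> bool:
    """Return True when value has the regional S3 Access Point ARN shape."""
    return _AP_ARN.fullmatch(value) is not None


def render_stage_ddl(stage: str, integration: str, ap_alias: str, ap_arn: str) -> str:
    """Render the Step 2 statement; refuse anything that is not an AP ARN."""
    if not is_access_point_arn(ap_arn):
        raise ValueError(f"not a regional S3 Access Point ARN: {ap_arn!r}")
    return (
        f"CREATE OR REPLACE STAGE {stage}\n"
        f"  STORAGE_INTEGRATION = {integration}\n"
        f"  URL = 's3://{ap_alias}/'\n"
        f"  AWS_ACCESS_POINT_ARN = '{ap_arn}'\n"
        f"  FILE_FORMAT = (TYPE = PARQUET);"
    )


# Hypothetical values for illustration only.
print(render_stage_ddl(
    "fsxn_stage", "fsxn_integration", "example-ap-alias",
    "arn:aws:s3:ap-northeast-1:123456789012:accesspoint/example-ap",
))
```

Passing the alias or a bucket-style ARN such as `arn:aws:s3:::example-ap-alias` raises `ValueError` instead of producing a stage definition that would only fail later on SELECT.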

Step 3: Verify

LIST @fsxn_stage/;                                    -- File discovery
SELECT $1 FROM @fsxn_stage/path/to/file.parquet LIMIT 5;  -- Data read

Step 4: Create External Table (optional)

The following DDL is simplified for readability. See the GitHub SQL scripts for the exact tested definition.

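CREATE OR REPLACE EXTERNAL TABLE my_ext_table
  LOCATION = @fsxn_stage/sensor-data/
  FILE_FORMAT = (TYPE = PARQUET)
  AUTO_REFRESH = FALSE;  -- event-driven refresh needs S3 Event Notifications, which FSx S3 AP lacks

-- With AUTO_REFRESH disabled, pick up new or changed files manually:
ALTER EXTERNAL TABLE my_ext_table REFRESH;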

Internal Table vs External Table — Design Guide

Understanding the difference between internal (managed) tables and external tables is critical for architecture decisions when integrating FSx for ONTAP with Snowflake.

Comparison Matrix

Aspect	External Table (on FSx S3 AP)	Internal Table (COPY INTO)
Data location	Remains on FSx for ONTAP (zero-copy)	Copied into Snowflake-managed storage
Multi-protocol access	Same data via NFS/SMB/S3 AP simultaneously	Only accessible via Snowflake
Data freshness	Real-time (reads current file state)	Stale until next COPY INTO
Query performance	Slower (estimated ~2-5s for small queries based on observed S3 AP GetObject latency)	Faster (sub-second with micro-partitions, pruning)
Governance (Tags, Masking)	✅ Full support	✅ Full support
Time Travel	❌ Not available	✅ Available (1 day on Standard Edition; up to 90 days on Enterprise Edition or higher)
Cortex AI (text functions)	✅ Direct (SUMMARIZE, TRANSLATE, etc.)	✅ Direct
Cortex AI (Vision/TO_FILE)	❌ TO_FILE blocked on FSx S3 AP	✅ Works on internal stage
Cortex Search (RAG)	❌ Requires COPY INTO first	✅ Direct
ONTAP features preserved	✅ Snapshot, FlexClone, Dedup, FPolicy	❌ Data is outside ONTAP
Storage cost	FSx for ONTAP only (no Snowflake storage)	FSx + Snowflake storage (duplicate)

Decision Flowchart

Q: Does the data need to stay on FSx for ONTAP?
├── YES → External Table
│         Q: Do you need Vision AI or Cortex Search?
│         ├── YES → Hybrid: External Table + selective COPY INTO
│         └── NO → External Table is sufficient (text AI works directly)
│
└── NO → COPY INTO internal table
          Q: Do you need real-time freshness?
          ├── YES → Scheduled COPY INTO (Task) or FPolicy → Lambda → Snowpipe
          └── NO → Batch COPY INTO on schedule
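The flowchart above can be written as a small function, which may help when the same decision has to be applied to many datasets in an inventory. This is a sketch only: the function name and parameter names are hypothetical, and the returned strings simply repeat the flowchart's leaf nodes.

```python
def choose_pattern(
    must_stay_on_fsx: bool,
    needs_vision_or_search: bool = False,
    needs_realtime: bool = False,
) -> str:
    """Walk the decision flowchart and return the recommended pattern."""
    if must_stay_on_fsx:
        # Data stays on FSx for ONTAP: External Table, optionally hybrid.
        if needs_vision_or_search:
            return "Hybrid: External Table + selective COPY INTO"
        return "External Table"
    # Data may be copied into Snowflake-managed storage.
    if needs_realtime:
        return "Scheduled COPY INTO (Task) or FPolicy -> Lambda -> Snowpipe"
    return "Batch COPY INTO on schedule"


# Example: compliance data that also needs Cortex Search.
print(choose_pattern(must_stay_on_fsx=True, needs_vision_or_search=True))
```

Each branch only consults the question the flowchart asks on that path, so `needs_realtime` is ignored when the data must stay on FSx.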

Cost Comparison

Pattern	FSx Storage	Snowflake Storage	Best For
External Table only	✅ (existing)	None	Read-heavy, compliance, multi-protocol
COPY INTO (full)	✅ (existing)	+ full copy	Max performance, Time Travel, full AI
Hybrid (External + selective COPY)	✅ (existing)	+ images/RAG data only	AI workloads with data residency needs

Industry-Specific Recommendations

Industry	Recommended Pattern	Rationale	PoC Success Criteria
Manufacturing	External Table + PARSE_DOCUMENT (OCR)	Data stays on FSx; inspection images processed in place	OCR extracts text from 10+ inspection images in <10s each
Financial Services	Hybrid (External Table + COPY INTO for Cortex Search)	Compliance requires data on FSx; RAG needs internal table	Cortex Search returns relevant compliance docs in <500ms
Healthcare	External Table + SnapLock	PHI must not leave controlled storage; immutable audit	SELECT on External Table succeeds with governance tags applied
Media / Entertainment	External Table + COPY FILES (Vision AI)	Large media files stay on FSx; selective staging for AI	Vision AI describes image content correctly via staging path
Cross-Industry Analytics	COPY INTO (full)	Maximum query performance; data duplication acceptable	COPY INTO completes in <10s for representative dataset

Snowpipe Alternatives for FSx for ONTAP

Since FSx S3 AP does not support S3 Event Notifications, standard Snowpipe auto-ingest is not available. Use these alternatives:

Option 1: FPolicy → Lambda → SNS → Snowpipe REST API (Recommended)

FSx for ONTAP ──FPolicy──▶ Lambda ──▶ SNS ──▶ Snowpipe REST API ──▶ COPY INTO target table
     │                                              │
     └── NFS/SMB users access same data             └── Snowflake governance on loaded data

Latency: Seconds (<30s from file write to Snowflake availability)
Complexity: Medium (requires FPolicy configuration + Lambda function)
Best for: Near-real-time ingestion requirements

FPolicy throughput note: FPolicy introduces minimal latency on the NFS/SMB I/O path (typically <1ms per operation for passthrough mode). However, under high-frequency file write workloads (thousands of files/second), validate throughput impact on the FSx for ONTAP file system before production deployment.

Option 2: Snowflake Task + COPY INTO (Simple)

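-- Create a task that runs COPY INTO every 5 minutes.
-- COPY INTO tracks load metadata, so files already loaded are skipped on later runs.
-- target_table needs a single VARIANT column for Parquet, or add
-- MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE to load into typed columns.
CREATE OR REPLACE TASK fsxn_ingest_task
  WAREHOUSE = COMPUTE_WH
  SCHEDULE = '5 MINUTE'
AS
  COPY INTO target_table FROM @fsxn_stage_with_arn/incoming/  -- stage created with AWS_ACCESS_POINT_ARN (Step 2)
  FILE_FORMAT = (TYPE = PARQUET)
  PATTERN = '.*[.]parquet';

-- Tasks are created suspended; resume to start the schedule.
ALTER TASK fsxn_ingest_task RESUME;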

Latency: Minutes (configurable schedule interval)
Complexity: Low (pure Snowflake SQL)
Best for: Batch ingestion where minutes-level latency is acceptable

Option 3: Snowpipe REST API (Manual Trigger)

Applications call the Snowpipe REST API with a file list when new files are known.

Latency: Seconds (triggered by application)
Complexity: Low (API call from any application)
Best for: Application-controlled ingestion workflows
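As a rough illustration of Option 3, the sketch below builds the URL and JSON body for the Snowpipe REST `insertFiles` endpoint without sending anything. It was not part of this validation. The account identifier, pipe name, and file paths are hypothetical, the helper name is made up for this example, and the 5000-files-per-request batch size is an assumption to confirm against the Snowpipe REST API reference. Key-pair JWT authentication is deliberately left out.

```python
import json
import uuid
from urllib.parse import quote

# Assumption: documented per-request file limit; verify before relying on it.
MAX_FILES_PER_REQUEST = 5000


def build_insert_files_requests(account, pipe, paths, batch_size=MAX_FILES_PER_REQUEST):
    """Return a list of (url, json_body) pairs, one per batch of file paths.

    `pipe` is the fully qualified pipe name (db.schema.pipe); `paths` are
    relative to the stage location referenced in the pipe definition.
    """
    base = (
        f"https://{account}.snowflakecomputing.com"
        f"/v1/data/pipes/{quote(pipe, safe='')}/insertFiles"
    )
    requests = []
    for start in range(0, len(paths), batch_size):
        batch = paths[start:start + batch_size]
        body = json.dumps({"files": [{"path": p} for p in batch]})
        # A fresh requestId per call lets the caller trace each submission.
        requests.append((f"{base}?requestId={uuid.uuid4()}", body))
    return requests


# Hypothetical values for illustration only.
for url, body in build_insert_files_requests(
    "myorg-myaccount", "mydb.public.fsxn_pipe",
    ["incoming/a.parquet", "incoming/b.parquet"],
):
    print(url)
    print(body)
```

Each pair would then be sent as an HTTP POST with a key-pair JWT in the `Authorization: Bearer` header; that step depends on the caller's HTTP client and key management and is omitted here.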

Snowpipe / COPY INTO Supported Formats

Format	Snowpipe	COPY INTO	External Table	Notes
CSV	✅	✅	✅	Delimiter, header, encoding options
JSON	✅	✅	✅	Nested, semi-structured
Parquet	✅	✅	✅	Column pruning, predicate pushdown
Avro	✅	✅	✅	Schema evolution supported
ORC	✅	✅	✅	Read-only
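XML	✅	✅	⚠️	Supported for loading; CREATE EXTERNAL TABLE documents only CSV, JSON, Avro, ORC, and Parquet as file format types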

Stop Criteria

Stop the Snowflake direct-access PoC when:

SELECT from stage fails with AccessDenied after AWS_ACCESS_POINT_ARN is configured and IAM/AP/FSx permissions are proven correct
The workload requires Iceberg Table write-back (conditional writes not supported on FSx S3 AP)
Data owner does not approve the access path
ReadTimeout occurs (check SVM DNS/AD configuration — see Networking Troubleshooting)

Regulated Workload Checklist

Before using Snowflake + FSx S3 AP for regulated data:

[ ] Confirm the S3 Access Point file-system user identity and least-privilege permissions
[ ] Confirm Snowflake role privileges for stage, external table, and tag access
[ ] Define whether users may generate presigned or scoped URLs (prefer BUILD_SCOPED_FILE_URL for governed access)
[ ] Record derived data locations if COPY INTO loads data into Snowflake tables
[ ] Define manual refresh schedule and evidence retention
[ ] Store approval owner, review date, and expiration date
[ ] Validate that GET_PRESIGNED_URL is not used as a bypass for query-level governance
[ ] If Vision AI is required: Approve COPY FILES to internal stage (data moves to Snowflake-managed storage)
[ ] If Cross-Region Inference is enabled: Verify that image/document data may be processed in US/EU regions
[ ] If Cortex Search is used: Approve COPY INTO (data moves to Snowflake storage) AND Cortex Search Service index creation (data residency changes twice — once for table load, once for search index). Cortex Search Service index is stored in the Snowflake account region.

Store the checklist result with an approval ID, owner, review date, expiration date, and evidence location so the PoC decision can be audited later.

Cross-Region Inference — Data Residency Warning

When CORTEX_ENABLED_CROSS_REGION = 'ANY_REGION' is set, Cortex AI functions may route data to model endpoints in other AWS regions (US, EU) for processing. For regulated workloads:

Verify: Does your compliance framework allow data processing outside the home region?
Alternatives: Use AWS_US or AWS_EU instead of ANY_REGION to limit routing scope
Mitigation: Process only non-regulated images via Vision AI; keep PHI/PII in text-only Cortex functions (which run in-region)
Documentation: Record which Cross-Region setting is used and which data types are processed

Compliance Framework Mapping

Framework	Recommended Pattern	Key Controls
HIPAA (PHI)	External Table + SnapLock + FPolicy audit	Data never leaves FSx; file access audited; admin cannot delete during retention
SOX (Financial)	COPY INTO + Time Travel + audit trail	Full change history; point-in-time queries for audit
GDPR (PII)	External Table + Row Access Policy + Tag-based Masking	Data minimization at query time; PII masked for non-authorized roles
FINRA (Records)	External Table + SnapLock Compliance	Non-erasable, non-writable records for retention period

Approval Evidence Example

approval_id: "FSXN-SF-POC-001"
data_owner: "<name/group>"
security_owner: "<name/group>"
platform_owner: "<name/group>"
allowed_prefixes:
  - "s3://<ap-alias>/sensor-data/"
  - "s3://<ap-alias>/bronze/"
allowed_operations:
  - LIST
  - SELECT (External Table)
  - COPY INTO (load only)
  - Directory Table
  - BUILD_SCOPED_FILE_URL
  - Cortex text functions (SUMMARIZE, TRANSLATE, SENTIMENT)
  - COPY FILES to internal stage (for Vision AI only)
disallowed_operations:
  - GET_PRESIGNED_URL for regulated data
  - COPY INTO unload (write-back)
  - Cortex LLM on PHI/PII without human review
  - Cross-Region Inference on regulated images (unless approved)
cross_region_inference: "ANY_REGION"  # or "DISABLED" for regulated data
review_date: "<YYYY-MM-DD>"
expiration_date: "<YYYY-MM-DD>"
evidence_location: "verification-pack/snowflake/evidence/<date>/evidence-record.yaml"

COPY INTO unload (write-back to FSx S3 AP) was not validated in this article. Although FSx S3 AP supports PutObject, Snowflake unload behavior should be tested separately before positioning write-back as supported.

Data residency note: COPY INTO (load) and COPY FILES change the data residency model — source files remain on FSx, but a derived copy is created in Snowflake-managed storage. Cross-Region Inference may further route data to other regions. Treat loaded tables and staged files as derived regulated data and apply retention, classification, and deletion controls separately.
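An approval record like the one above can also be checked mechanically before a job runs. The sketch below is one possible shape for such a check, not tooling from the verification pack. It assumes the YAML has already been loaded into a dict, and the function name, dates, alias, and trimmed operation lists are hypothetical.

```python
from datetime import date
from typing import Tuple


def check_request(record: dict, operation: str, location: str, today: date) -> Tuple[bool, str]:
    """Return (allowed, reason) for one operation against an approval record."""
    if today > date.fromisoformat(record["expiration_date"]):
        return False, "approval expired"
    if operation in record["disallowed_operations"]:
        return False, "operation explicitly disallowed"
    if operation not in record["allowed_operations"]:
        return False, "operation not listed as allowed"
    if not any(location.startswith(p) for p in record["allowed_prefixes"]):
        return False, "location outside allowed prefixes"
    return True, "ok"


# Hypothetical record mirroring a subset of the YAML fields above.
record = {
    "approval_id": "FSXN-SF-POC-001",
    "allowed_prefixes": ["s3://example-ap-alias/sensor-data/"],
    "allowed_operations": ["LIST", "SELECT (External Table)"],
    "disallowed_operations": ["GET_PRESIGNED_URL for regulated data"],
    "expiration_date": "2026-12-31",
}

print(check_request(record, "LIST", "s3://example-ap-alias/sensor-data/2026/", date(2026, 6, 1)))
```

Expiry is checked first so that an expired approval blocks every operation, including ones that remain on the allowed list.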

Troubleshooting Playbook

When Snowflake access to FSx for ONTAP S3 AP fails, isolate one layer at a time:

Stage configuration — Is AWS_ACCESS_POINT_ARN set? Without it, GetObject will fail.
IAM — Does the Storage Integration role have s3:GetObject, s3:ListBucket on the S3 AP ARN?
S3 AP policy — Does the Access Point resource policy allow the Storage Integration role (STORAGE_AWS_ROLE_ARN)? The Snowflake IAM user ARN belongs in that role's trust policy.
FSx file system — Is the file system user (e.g., root) permitted to read the target files?
Network — Is the AP internet-origin? (Snowflake SaaS cannot use VPC-origin APs)
Operational — Does vserver services dns check show healthy DNS? (ReadTimeout = DNS/AD issue)

Known Failure Signatures

Symptom	Likely layer	Next step
LIST works, SELECT fails with "access denied"	Missing `AWS_ACCESS_POINT_ARN`	Add ARN parameter to stage
LIST and SELECT both fail with "access denied"	IAM role or S3 AP policy	Check DESCRIBE INTEGRATION, verify trust policy
ReadTimeout (no response)	SVM DNS/AD or FSx backend	Check `vserver services dns check`; verify S3 AP lifecycle
Stage creation fails	Storage Integration config	Verify STORAGE_ALLOWED_LOCATIONS includes the AP alias
External Table creation fails	Stage or file format issue	Verify LIST works first, then check FILE_FORMAT
COPY INTO fails	File format mismatch or permissions	Verify SELECT works first
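The first three rows of the table can be reduced to a simple triage function driven by the outcome of the Step 3 verification commands. This is a sketch for runbooks, with a hypothetical function name; it only restates the table and does not detect anything on its own.

```python
def triage(list_ok: bool, select_ok: bool, timed_out: bool = False) -> str:
    """Map LIST/SELECT outcomes to the likely layer and next step."""
    if timed_out:
        # No response at all points to operations, not permissions.
        return "SVM DNS/AD or FSx backend: run `vserver services dns check`"
    if list_ok and select_ok:
        return "Read path healthy: check file format or table definition next"
    if list_ok and not select_ok:
        return "Missing AWS_ACCESS_POINT_ARN: add the ARN parameter to the stage"
    if not list_ok and not select_ok:
        return "IAM role or S3 AP policy: check DESCRIBE INTEGRATION and the trust policy"
    # LIST fails while SELECT works: not a signature covered by the table.
    return "Not covered by the known signatures: collect evidence and escalate"


print(triage(list_ok=True, select_ok=False))
```

The timeout check comes first because a ReadTimeout makes the LIST and SELECT results uninformative.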

What This Article Does Not Conclude

This article does not conclude that Snowflake + FSx for ONTAP S3 AP is production-certified for all workloads. It documents the behavior observed in one validated environment and identifies the configuration required for successful integration.

Specifically, this article does not validate:

Snowpipe auto-ingest (requires S3 Event Notifications)
Iceberg Table write-back (requires conditional writes)
COPY INTO unload / write-back to FSx S3 AP
Snowpark File Access (SnowflakeFile.open) for binary processing
Performance at scale (large file counts, concurrent queries, large directory refreshes, or mixed NFS/SMB/S3 workload contention on the FSx file system)
Private connectivity (PrivateLink) path

Operational Note: ReadTimeout vs AccessDenied

During this validation series, all S3 APs on one SVM became unresponsive for 7+ days due to orphaned DNS/AD configuration.

Important distinction:

ReadTimeout (no response) → Check SVM DNS/AD configuration
AccessDenied (immediate error) → Check AWS_ACCESS_POINT_ARN stage parameter

See FSx S3 AP Networking — DNS/AD Troubleshooting for details.

Lessons Learned

1. Platform documentation holds the answer

The AWS_ACCESS_POINT_ARN parameter exists in Snowflake's CREATE STAGE documentation. The initial "no workaround" conclusion was premature — always check platform docs for S3 AP-specific parameters before concluding incompatibility.

2. The same pattern recurs across platforms

Both Snowflake (AWS_ACCESS_POINT_ARN) and Databricks (access_point field) require explicit S3 AP ARN configuration. This appears to be a recurring integration pattern: platforms that generate restrictive session policies need an explicit parameter so the generated policy includes the regional access point ARN format.

3. LIST ≠ READ (but the fix is simple)

The partial success (LIST works, SELECT doesn't) is confusing but has a clear fix. The root cause is that ListBucket uses bucket-level ARN matching while GetObject requires object-level ARN matching — and the AP ARN parameter resolves both.

4. SVM DNS/AD configuration can silently break S3 AP

ReadTimeout (not AccessDenied) indicates an operational issue, not a session policy issue. Check vserver services dns check on the SVM.

5. Pre-signed URLs work but are not a governed path

GET_PRESIGNED_URL() generates valid URLs for FSx S3 AP objects. However, this bypasses Snowflake query governance and should not be used as a production workaround for regulated workloads.

What to Tell Stakeholders

Current recommendation (8 out of 10 tested AI functions validated on FSx data):

Use Snowflake External Stage with AWS_ACCESS_POINT_ARN for governed read access to FSx for ONTAP data
Use External Tables for governed schema abstraction with tags and access policies
Use Dynamic Tables (TARGET_LAG = '1 hour', FULL refresh) for automated transformation from External Table source — confirmed by Snowflake Support (May 2026)
Use COPY INTO when data needs to be loaded into Snowflake for ML/AI processing, or into Managed Iceberg Tables for open-format interoperability
Use Directory Table for unstructured data cataloging
Use Snowflake Data Sharing to distribute curated datasets to partners/suppliers with governance (Row Access Policy, Tags)
Do not rely on External Table AUTO_REFRESH or Snowpipe AUTO_INGEST, which both depend on S3 Event Notifications; use scheduled ALTER EXTERNAL TABLE REFRESH via Task instead
Do not position Iceberg write-back on FSx S3 AP as supported
For end-to-end RAG, use Cortex Search (validated: External Table → COPY INTO → Cortex Search Service, 198ms query) or Bedrock Knowledge Bases (AWS-documented path, no copy needed)

The AI-Ready Data Product Journey (Snowflake Path)

This article validates the Snowflake-specific path in the series' multi-engine data product journey:

FSx for ONTAP (source of truth)
  ↓ S3 Access Point + AWS_ACCESS_POINT_ARN
External Table (zero-copy governed read)
  ↓ Cortex AI text functions work here (SUMMARIZE, TRANSLATE, SENTIMENT — no copy needed)
  ↓
Dynamic Table (TARGET_LAG = '1 hour', FULL refresh)
  ↓ Automated enrichment with Cortex AI in SELECT clause
  ↓
Cortex Search Service (semantic search / RAG, 198ms)
  ↓
Data Sharing (governed distribution to partners/suppliers)

Key insight: The fastest path from "NAS files" to "AI-ready data product shared with partners" is:

External Table (zero-copy, immediate) → text AI functions work today
Dynamic Table (automated enrichment) → Cortex AI in SELECT clause
Cortex Search (RAG) → semantic search over enriched data
Data Sharing → governed distribution without file transfer

Support Update (May 2026, Case #01359983): Snowflake confirmed that COPY INTO from FSx for ONTAP S3 AP External Stage → Managed Iceberg Table is supported. Dynamic Tables with External Table source work with REFRESH_MODE = FULL (min TARGET_LAG 60s). This enables the open-format bridge: FSx for ONTAP → Snowflake Managed Iceberg → readable by Databricks/Athena/EMR via standard Iceberg readers.

This validation should be used to guide architecture selection and stage configuration, not as a production certification.

What's Next

Part 1: Athena — Query NAS Data In Place (validated read-oriented SQL path)
Part 2: Databricks — A Layer-by-Layer Validation of Observed Boundaries (session policy + access_point field)
Part 4: DuckDB Lambda — Serverless analytics at $0.00001/query (for teams that need lightweight, zero-idle-cost SQL without warehouse management)
Part 5: EMR Spark — Read-Write ETL Pipeline (for teams that need distributed Spark processing with write-back to S3 for downstream lakehouse consumption)

References

Key achievement: This validation established that Snowflake + FSx for ONTAP S3 AP provides a governed, AI-ready read path — 8 out of 10 tested Cortex AI functions work on NAS data, External Tables enable full governance (tags, masking, row policies), and Cortex Search delivers 198ms semantic search over NAS-originated documents. This is the most complete governed integration path validated in this series.

This article documents observed behavior in one validated environment (Snowflake Standard edition, AWS ap-northeast-1, May 2026). Platform behavior may change with future updates.

Disclaimer: This article is an independent validation report and does not represent Snowflake, AWS, or NetApp official guidance. Product behavior, support status, and platform capabilities may change. Always validate in your own environment and consult vendor documentation and support channels.