<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</title>
    <description>The latest articles on DEV Community by Yoshiki Fujiwara(藤原 善基)@AWS Community Builder (@yoshikifujiwara).</description>
    <link>https://dev.to/yoshikifujiwara</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1143688%2F2e0886ff-292c-4e8f-a588-bc7629c2321b.jpeg</url>
      <title>DEV Community: Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</title>
      <link>https://dev.to/yoshikifujiwara</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/yoshikifujiwara"/>
    <language>en</language>
    <item>
      <title>42 Patterns, Category Architecture, and HA LifeKeeper Monitoring — FSx for ONTAP S3 Access Points, Phase 18</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Sun, 21 Jun 2026 16:54:22 +0000</pubDate>
      <link>https://dev.to/aws-builders/42-patterns-category-architecture-and-ha-lifekeeper-monitoring-fsx-for-ontap-s3-access-points-4fh</link>
      <guid>https://dev.to/aws-builders/42-patterns-category-architecture-and-ha-lifekeeper-monitoring-fsx-for-ontap-s3-access-points-4fh</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Phase 18 restructures the entire repository from 41 flat directories into a categorized &lt;code&gt;solutions/&lt;/code&gt; hierarchy, adds &lt;strong&gt;HA LifeKeeper Monitoring&lt;/strong&gt; as a new pattern category, introduces &lt;strong&gt;5 category-specific architecture diagrams&lt;/strong&gt;, and establishes modern Python project infrastructure. The repository now contains &lt;strong&gt;42 deployable patterns&lt;/strong&gt; organized for discoverability at scale.&lt;/p&gt;

&lt;p&gt;Repository: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Restructure?
&lt;/h2&gt;

&lt;p&gt;With 41 pattern directories at the repository root, navigating the project had become unwieldy. New contributors could not quickly find patterns by domain, the README required extensive scrolling, and adding new categories (HA, GenAI) had no clear placement convention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before (flat)&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;legal-compliance/  financial-idp/  semiconductor-eda/  sap-erp-adjacent/
flexcache-anycast-dr/  genai-kb-selfservice-curation/  event-driven-fpolicy/
... (41 directories at root)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;After (categorized)&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;solutions/
├── industry/           # 28 UC patterns (UC1-UC28)
├── flexcache/          # 7 FlexCache/FlexClone patterns
├── genai/              # 2 GenAI patterns (UC29-UC30)
├── sap/                # SAP/ERP pattern
├── ha/                 # HA monitoring (new)
├── event-driven/       # 2 FPolicy event-driven patterns
└── edge/               # CDN/edge delivery
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Move: git mv with History Preservation
&lt;/h2&gt;

&lt;p&gt;All 41 directories were moved using &lt;code&gt;git mv&lt;/code&gt;, preserving full commit history. Key renames include:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Old Path&lt;/th&gt;
&lt;th&gt;New Path&lt;/th&gt;
&lt;th&gt;Reason&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sap-erp-adjacent/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;solutions/sap/erp-adjacent/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Category grouping&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dynamic-flexcache-render-workflow/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;solutions/flexcache/dynamic-render-workflow/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Shorter name&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;genai-kb-selfservice-curation/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;solutions/genai/kb-selfservice-curation/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Strip prefix&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;devops-flexclone-cicd/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;solutions/flexcache/devops-cicd/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Strip prefix&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;content-edge-delivery/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;solutions/edge/content-delivery/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Category grouping&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Use &lt;code&gt;git log --follow &amp;lt;file&amp;gt;&lt;/code&gt; to trace history across the move.&lt;/p&gt;




&lt;h2&gt;
  
  
  HA LifeKeeper Monitoring — New Pattern
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;SIOS LifeKeeper&lt;/strong&gt; is a Linux/Windows HA clustering solution that can be used on Amazon EC2 for application-aware failover scenarios. With FSx for ONTAP Multi-AZ as shared storage (NFS/iSCSI, depending on OS and configuration), this pattern focuses on observing LifeKeeper logs without putting monitoring agents on the HA nodes.&lt;/p&gt;

&lt;p&gt;The new HA pattern (&lt;code&gt;solutions/ha/lifekeeper-monitoring/&lt;/code&gt;) provides &lt;strong&gt;non-intrusive&lt;/strong&gt; log analysis:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TB
    subgraph "HA Cluster"
        LK1[LifeKeeper Node 1&amp;lt;br/&amp;gt;Active]
        LK2[LifeKeeper Node 2&amp;lt;br/&amp;gt;Standby]
    end

    subgraph "Shared Storage"
        FSXN[FSx for ONTAP Multi-AZ]
        S3AP[S3 Access Point&amp;lt;br/&amp;gt;Read-only log access]
    end

    subgraph "Analysis Pipeline"
        SFN[Step Functions]
        DISC[Discovery Lambda&amp;lt;br/&amp;gt;Log classification]
        PROC[Processing Lambda&amp;lt;br/&amp;gt;Bedrock Root Cause Analysis]
        RPT[Report Lambda&amp;lt;br/&amp;gt;Health score + alerts]
    end

    LK1 --&amp;gt;|Log write| FSXN
    FSXN --&amp;gt; S3AP --&amp;gt;|Non-intrusive read| DISC
    SFN --&amp;gt; DISC --&amp;gt; PROC --&amp;gt; RPT
    PROC --&amp;gt;|Nova Pro| BEDROCK[Amazon Bedrock]
    RPT --&amp;gt; SNS[SNS Alert]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Key Design Decisions
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Non-intrusive to HA nodes&lt;/strong&gt;: No monitoring agent is installed on HA nodes. The S3 AP read path avoids host-level changes, while still consuming FSx/S3 API throughput like any other read workload.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human-in-the-loop&lt;/strong&gt;: AI analysis is advisory only. LifeKeeper's own health checks handle failover decisions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Health Scoring&lt;/strong&gt;: 0-100 score with deductions for failover events, comm path latency, and resource state anomalies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Root Cause Analysis&lt;/strong&gt;: Bedrock Nova Pro analyzes state transitions (ISP→OSF, ISS→ISP) to identify likely causes.&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Implementation note&lt;/strong&gt;: This pattern observes LifeKeeper logs and produces advisory analysis. It does not replace LifeKeeper cluster design, quorum/witness configuration, split-brain prevention, protocol-specific recovery kit setup, or application-level failover testing.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Example Operational Metrics
&lt;/h3&gt;

&lt;p&gt;These are suggested evaluation metrics for future validation. Phase 18 verifies the DemoMode pipeline, not real-cluster failover triage time.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Demo Target&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Operations&lt;/td&gt;
&lt;td&gt;Time from workflow start to structured triage report&lt;/td&gt;
&lt;td&gt;&amp;lt; 10 min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Technical&lt;/td&gt;
&lt;td&gt;Log discovery completeness&lt;/td&gt;
&lt;td&gt;All configured LifeKeeper log paths in scope&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Quality&lt;/td&gt;
&lt;td&gt;False positive alert rate&lt;/td&gt;
&lt;td&gt;&amp;lt; 5% (requires real-cluster validation)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost&lt;/td&gt;
&lt;td&gt;Monthly monitoring cost&lt;/td&gt;
&lt;td&gt;&amp;lt; $15 (5-min polling, 1 cluster)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Deploy (DemoMode)
&lt;/h3&gt;

&lt;p&gt;DemoMode=true deploys without FSx for ONTAP — uses a regular S3 bucket with sample LifeKeeper logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;solutions/ha/lifekeeper-monitoring
sam build &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; sam deploy &lt;span class="nt"&gt;--guided&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--parameter-overrides&lt;/span&gt; &lt;span class="nv"&gt;DemoMode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;S3AccessPointAlias&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-demo-bucket &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;OutputBucketName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-output-bucket
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;DemoMode verification confirms: SAM template deploys successfully, Step Functions workflow executes all states, Discovery Lambda classifies sample logs correctly, and Processing Lambda generates a health score report. Bedrock analysis produces structured, advisory observations based on log patterns; failover decisions remain outside the AI workflow.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fn9j5gotqqo5ddnaasfyn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fn9j5gotqqo5ddnaasfyn.png" alt="Step Functions graph view — HA LifeKeeper Monitoring workflow completed successfully" width="800" height="415"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Verified in ap-northeast-1 on 2026-06-21. The sample log set intentionally contains failure-like events, resulting in a demo health score of 40/100. The score is not a benchmark for LifeKeeper or FSx for ONTAP; it demonstrates that the scoring pipeline detects and reports anomalies from sample logs.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Coming in Phase 19&lt;/strong&gt;: Full E2E verification with a real SIOS LifeKeeper HA cluster (AWS Marketplace + FSx for ONTAP Multi-AZ) including live failover testing, actual state transition detection, and Bedrock RCA quality assessment against real-world failure scenarios.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Category Architecture Diagrams
&lt;/h2&gt;

&lt;p&gt;Phase 18 adds &lt;strong&gt;5 mermaid architecture diagrams&lt;/strong&gt; to README.md (both JA and EN), each in a collapsible &lt;code&gt;&amp;lt;details&amp;gt;&lt;/code&gt; block:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Key Components&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;🏭 FlexCache&lt;/td&gt;
&lt;td&gt;ONTAP REST API → HealthCheck → RouteDecision → DynamoDB routing → Create/Cleanup lifecycle&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🤖 GenAI&lt;/td&gt;
&lt;td&gt;FPolicy → SQS → EventBridge → Bedrock KB → RetrieveAndGenerate / Agentic tools&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🛡️ HA&lt;/td&gt;
&lt;td&gt;S3 AP non-intrusive read → Bedrock RCA → Health score → SNS alerts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⚡ Event-Driven&lt;/td&gt;
&lt;td&gt;FPolicy Engine → ECS Fargate TCP → SQS → EventBridge rule routing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🌐 Edge/CDN&lt;/td&gt;
&lt;td&gt;3 delivery modes (ORIGIN_PULL, OAC, PUBLISH_PUSH) → vendor-neutral CDN&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All diagrams include &lt;code&gt;accTitle&lt;/code&gt; and &lt;code&gt;accDescr&lt;/code&gt; for screen reader accessibility.&lt;/p&gt;




&lt;h2&gt;
  
  
  Project Infrastructure Improvements
&lt;/h2&gt;

&lt;h3&gt;
  
  
  pyproject.toml (PEP 621)
&lt;/h3&gt;

&lt;p&gt;Modern Python project metadata with unified tool configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[tool.ruff]&lt;/span&gt;
&lt;span class="py"&gt;target-version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"py312"&lt;/span&gt;
&lt;span class="py"&gt;line-length&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;120&lt;/span&gt;

&lt;span class="nn"&gt;[tool.pytest.ini_options]&lt;/span&gt;
&lt;span class="py"&gt;addopts&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"-v --tb=short --import-mode=importlib"&lt;/span&gt;

&lt;span class="nn"&gt;[tool.coverage.report]&lt;/span&gt;
&lt;span class="py"&gt;fail_under&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Dependency Pinning
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="c"&gt;# requirements.txt — exact versions for reproducibility
&lt;/span&gt;&lt;span class="py"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;=1.43.29&lt;/span&gt;
&lt;span class="py"&gt;urllib3&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;=2.7.0&lt;/span&gt;
&lt;span class="py"&gt;jsonschema&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;=4.17.3&lt;/span&gt;

&lt;span class="c"&gt;# requirements-dev.txt
&lt;/span&gt;&lt;span class="py"&gt;pytest&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;=9.1.0&lt;/span&gt;
&lt;span class="py"&gt;hypothesis&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;=6.155.2&lt;/span&gt;
&lt;span class="py"&gt;moto&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;=5.2.2&lt;/span&gt;
&lt;span class="py"&gt;ruff&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;=0.15.17&lt;/span&gt;
&lt;span class="py"&gt;cfn-lint&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;=1.51.4&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  .cfnlintrc
&lt;/h3&gt;

&lt;p&gt;Project-wide cfn-lint configuration that discovers all templates under &lt;code&gt;solutions/&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;templates&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solutions/**/template.yaml"&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solutions/**/template-deploy.yaml"&lt;/span&gt;
&lt;span class="na"&gt;ignore_checks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;W3002&lt;/span&gt;  &lt;span class="c1"&gt;# Local CodeUri (sam build handles upload)&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;W1031&lt;/span&gt;  &lt;span class="c1"&gt;# Fn::Sub false positive with Secrets Manager ARNs&lt;/span&gt;
&lt;span class="na"&gt;regions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;ap-northeast-1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Additional Tooling
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;File&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;.gitattributes&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Consistent line endings, language-specific diff drivers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;.github/PULL_REQUEST_TEMPLATE.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Project-specific PR checklist&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;solutions/README.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Category navigation index&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CHANGELOG.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Keep a Changelog format, all releases&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CONTRIBUTING.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;"Adding a New Pattern" section with category guide&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  CI/CD Changes
&lt;/h2&gt;

&lt;p&gt;The CI pipeline was split to avoid pytest &lt;code&gt;importlib&lt;/code&gt; mode namespace collisions when multiple patterns with identically-named &lt;code&gt;handler.py&lt;/code&gt; files are collected together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Before: single pytest invocation (collision risk)&lt;/span&gt;
&lt;span class="s"&gt;pytest shared/tests/ solutions/**/tests/ --cov=shared&lt;/span&gt;

&lt;span class="c1"&gt;# After: isolated invocations&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run shared tests with coverage&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pytest shared/tests/ --cov=shared&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run pattern tests&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pytest solutions/industry/*/tests/ solutions/flexcache/*/tests/ ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Also added &lt;code&gt;persist-credentials: false&lt;/code&gt; to all &lt;code&gt;actions/checkout&lt;/code&gt; steps (zizmor security hardening).&lt;/p&gt;




&lt;h2&gt;
  
  
  Multi-Perspective Review
&lt;/h2&gt;

&lt;p&gt;The restructuring and HA monitoring pattern were reviewed from partner delivery, storage architecture, HA operations, security, CI/CD, accessibility, and contributor onboarding perspectives. The review resulted in wording changes around neutrality, operational caveats, connector validation, HA safety boundaries, and repository discoverability.&lt;/p&gt;




&lt;h2&gt;
  
  
  What This Means for You
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Partner / SI / delivery team&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pattern selection is now intuitive by category — pick &lt;code&gt;solutions/industry/&lt;/code&gt; for industry PoCs, &lt;code&gt;solutions/flexcache/&lt;/code&gt; for distributed workloads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;New contributor&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;CONTRIBUTING.md&lt;/code&gt; now includes "Adding a New Pattern" with required checklist and category selection guide&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Existing repository users&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;git log --follow &amp;lt;file&amp;gt;&lt;/code&gt; still works. &lt;code&gt;sam build&lt;/code&gt; and &lt;code&gt;sam deploy&lt;/code&gt; are unchanged (run from the pattern directory)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CI/CD&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Zero changes needed in your &lt;code&gt;samconfig.toml&lt;/code&gt; or deployment scripts — CodeUri is relative to template.yaml&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Infrastructure cost&lt;/strong&gt;: Zero. This is a repository organization change; it does not deploy AWS resources by itself.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Phase 16-17 blog articles (GenAI patterns) — ready to publish&lt;/li&gt;
&lt;li&gt;dev.to series updated for directory restructuring (Phase 13 link fixed)&lt;/li&gt;
&lt;li&gt;Next pattern candidates under evaluation&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns.git
&lt;span class="nb"&gt;cd &lt;/span&gt;FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns

&lt;span class="c"&gt;# Browse patterns by category&lt;/span&gt;
&lt;span class="nb"&gt;ls &lt;/span&gt;solutions/

&lt;span class="c"&gt;# Deploy HA LifeKeeper monitoring (DemoMode)&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;solutions/ha/lifekeeper-monitoring
sam build &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; sam deploy &lt;span class="nt"&gt;--guided&lt;/span&gt; &lt;span class="nt"&gt;--parameter-overrides&lt;/span&gt; &lt;span class="nv"&gt;DemoMode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;

&lt;span class="c"&gt;# Run tests&lt;/span&gt;
make test-quick &lt;span class="nv"&gt;PYTHON&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;.venv/bin/python
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;Yoshiki Fujiwara&lt;/p&gt;

</description>
      <category>aws</category>
      <category>lifekeeper</category>
      <category>amazonfsxfornetappontap</category>
      <category>s3accesspoints</category>
    </item>
    <item>
      <title>Amazon Quick Agentic Workspace Powered by FSx for ONTAP S3 Access Points — Phase 17</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Sun, 21 Jun 2026 15:54:23 +0000</pubDate>
      <link>https://dev.to/aws-builders/amazon-quick-agentic-workspace-powered-by-fsx-for-ontap-s3-access-points-phase-17-1cf2</link>
      <guid>https://dev.to/aws-builders/amazon-quick-agentic-workspace-powered-by-fsx-for-ontap-s3-access-points-phase-17-1cf2</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;UC30 bridges the gap between file-based business data and AI-powered actions. Business users maintain structured and unstructured data on an FSx for ONTAP SMB share, while &lt;strong&gt;Amazon Quick&lt;/strong&gt; (Index / Sight / Flows) consumes it through S3 Access Points and a serverless Action API — providing search, BI, and governed action workflows from a single workspace.&lt;/p&gt;

&lt;p&gt;Where UC29 focuses on "self-service knowledge ingestion into Bedrock KB," UC30 focuses on &lt;strong&gt;unifying search, analytics, and action execution&lt;/strong&gt; behind Quick Suite's agentic interface.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;New in this release&lt;/strong&gt;: &lt;code&gt;generate_brief_with_web&lt;/code&gt; action augments internal context with real-time web search results via AgentCore Web Search Tool (GA June 2026), enabling briefs that combine primary internal data with current public context.&lt;/p&gt;

&lt;p&gt;Repository: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns&lt;/a&gt; (see &lt;code&gt;solutions/genai/quick-agentic-workspace/&lt;/code&gt; and &lt;code&gt;samconfig.toml.example&lt;/code&gt;)&lt;/p&gt;




&lt;h2&gt;
  
  
  Amazon Quick × S3 AP Data Mapping
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Quick Feature&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;S3 AP Data&lt;/th&gt;
&lt;th&gt;Implementation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Quick Index / Research&lt;/td&gt;
&lt;td&gt;Unstructured file search&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;index/&amp;lt;role&amp;gt;/&lt;/code&gt; (md/pdf)&lt;/td&gt;
&lt;td&gt;S3 AP as data source&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Quick Sight&lt;/td&gt;
&lt;td&gt;Structured BI &amp;amp; visualization&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;analytics/&amp;lt;role&amp;gt;/&lt;/code&gt; (csv)&lt;/td&gt;
&lt;td&gt;Glue/Athena (Athena Query Lambda)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Quick Flows&lt;/td&gt;
&lt;td&gt;Action automation&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;flows/&amp;lt;role&amp;gt;/&lt;/code&gt; (json)&lt;/td&gt;
&lt;td&gt;Action API (API Gateway + Lambda + Bedrock)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Quick Flows + Web&lt;/td&gt;
&lt;td&gt;Web-augmented briefs&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;flows/&amp;lt;role&amp;gt;/&lt;/code&gt; + web&lt;/td&gt;
&lt;td&gt;Action API + AgentCore Web Search (opt-in)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Seven roles (sales / marketing / finance / IT / operations / legal / developers) share the same AI-dedicated volume — reusable from UC29.&lt;/p&gt;

&lt;p&gt;Design note: FSx for ONTAP S3 Access Points are useful as an integration boundary, but they do not remove the need to validate each consuming service connector. The access path combines S3/IAM policy evaluation with file-system-level identity authorization.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fc9f2071r5vmlyf5xjobl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fc9f2071r5vmlyf5xjobl.png" alt="Amazon Quick home screen showing Index, Sight, and Flows" width="800" height="426"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Amazon Quick provides a unified workspace — search (Index), BI (Sight), and action automation (Flows) — powered by FSx for ONTAP data via S3 Access Points.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Windows Explorer (drag &amp;amp; drop into quick-workspace/ SMB share)
  ├── index/&amp;lt;role&amp;gt;/ → Quick Index (unstructured search)
  ├── analytics/&amp;lt;role&amp;gt;/ → Glue/Athena → Quick Sight (BI)
  └── flows/&amp;lt;role&amp;gt;/ → Action API → Quick Flows (actions)

Action API (6 actions):
  API Gateway (IAM auth / SigV4)
  → Lambda (per-action authorization + HITL gate)
  → generate_brief           → Bedrock Converse (internal context only)
  → generate_brief_with_web  → Bedrock Converse + AgentCore Web Search (hybrid)
  → create_action_item       → SNS notification
  → request_approval         → DynamoDB (HITL entry)
  → approve                  → DynamoDB (admin only)
  → execute_approved         → DynamoDB check + execution (enforced HITL)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Hybrid RAG Flow (generate_brief_with_web)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;Quick&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Flows&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;request:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"generate_brief_with_web"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"params"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;├─→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Internal&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;context&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;(from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;params.context&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;—&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;FSx&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;ONTAP&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;file&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;content)&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;├─→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;AgentCore&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Web&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Search&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;(us-east&lt;/span&gt;&lt;span class="mi"&gt;-1&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;MCP&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;protocol)&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="err"&gt;query&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;params.web_query&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;or&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;params.title&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Amazon&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;web&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;index&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;snippets&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;URLs&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;titles&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;dates&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;└─→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Bedrock&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Converse&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;(ap-northeast&lt;/span&gt;&lt;span class="mi"&gt;-1&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
           &lt;/span&gt;&lt;span class="err"&gt;system&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;prompt:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;internal&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;primary,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;web&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;supplementary,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;untrusted&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Unified&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;brief&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="err"&gt;Internal:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="err"&gt;Web:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;title&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="err"&gt;(URL)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;citations&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Security Design
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Authentication + Per-Action Authorization
&lt;/h3&gt;

&lt;p&gt;The Action API uses IAM authentication (SigV4). The handler extracts the &lt;strong&gt;authenticated caller identity&lt;/strong&gt; (&lt;code&gt;requestContext.identity&lt;/code&gt;) — not self-declared body fields — and performs per-action authorization:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ACTION_AUTH_MODE=open&lt;/code&gt; (default/demo): No enforcement; audit fields still bound to authenticated caller&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;For production, use &lt;code&gt;ACTION_AUTH_MODE=enforce&lt;/code&gt; and explicitly define &lt;code&gt;AUTHORIZED_PRINCIPALS&lt;/code&gt; and &lt;code&gt;ADMIN_PRINCIPALS&lt;/code&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ACTION_AUTH_MODE=enforce&lt;/code&gt; (production):

&lt;ul&gt;
&lt;li&gt;Read-only actions (&lt;code&gt;generate_brief&lt;/code&gt;, &lt;code&gt;generate_brief_with_web&lt;/code&gt;): always allowed&lt;/li&gt;
&lt;li&gt;Mutating actions: caller must match &lt;code&gt;AUTHORIZED_PRINCIPALS&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Admin actions (&lt;code&gt;approve&lt;/code&gt;): caller must match &lt;code&gt;ADMIN_PRINCIPALS&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Mismatch → &lt;strong&gt;403 Forbidden&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Enforced Human-in-the-Loop (HITL)
&lt;/h3&gt;

&lt;p&gt;High-risk operations are gated by a DynamoDB approval store:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;request_approval&lt;/code&gt; → persists record as &lt;code&gt;pending_approval&lt;/code&gt; (enforced=true)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;approve&lt;/code&gt; → admin transitions to &lt;code&gt;approved&lt;/code&gt; (ConditionExpression prevents race)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;execute_approved&lt;/code&gt; → &lt;strong&gt;only executes if record is &lt;code&gt;approved&lt;/code&gt;&lt;/strong&gt;; otherwise 409&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Verified live: execute pre-approval → 409, post-approval → 200, re-execute → 409 (no replay).&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Security note&lt;/strong&gt;: Approval records have a &lt;strong&gt;7-day TTL&lt;/strong&gt; (DynamoDB Time-to-Live). Stale pending approvals auto-expire, preventing indefinite accumulation of unreviewed requests. Expired records cannot be approved or executed.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Additional Controls
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Prompt injection defense&lt;/strong&gt;: Both &lt;code&gt;generate_brief&lt;/code&gt; and &lt;code&gt;generate_brief_with_web&lt;/code&gt; treat context as untrusted data with explicit delimiter boundaries (&lt;code&gt;&amp;lt;internal_context&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;web_search_results&amp;gt;&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web query safety&lt;/strong&gt;: Only &lt;code&gt;params.web_query&lt;/code&gt; or &lt;code&gt;params.title&lt;/code&gt; is sent to Web Search — never internal document content&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Raw SQL disabled by default&lt;/strong&gt;: &lt;code&gt;ALLOW_RAW_SQL=false&lt;/code&gt;; role-level data boundaries enforced via Lake Formation (LF-TBAC) in production&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Results bucket hardening&lt;/strong&gt;: PublicAccessBlock + TLS-only + 30-day lifecycle&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API throttling&lt;/strong&gt;: Rate/burst limits against denial-of-wallet&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web Search citation obligation&lt;/strong&gt;: Source URLs + titles are always included in responses (Acceptable Use Policy compliance)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Web-Augmented Brief Generation (opt-in)
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;GA at AWS Summit NYC 2026 (June 17, 2026). Powered by AgentCore Web Search Tool.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;Business briefs based solely on internal documents lack current market context. A sales brief about a product launch needs both the internal product spec &lt;em&gt;and&lt;/em&gt; awareness of relevant public announcements published recently. A legal compliance brief needs both the internal policy document &lt;em&gt;and&lt;/em&gt; the latest regulatory guidance.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Solution
&lt;/h3&gt;

&lt;p&gt;A new action &lt;code&gt;generate_brief_with_web&lt;/code&gt; combines internal context with real-time web search results. The internal context remains the primary source; web results are supplemental, cited, and treated as untrusted input.&lt;/p&gt;

&lt;h3&gt;
  
  
  Usage
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"generate_brief_with_web"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"params"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Q3 Data Protection Regulatory Update"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Internal operations follow FISC safety standards..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"web_query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"data protection regulation 2026 Japan financial services"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Response
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"completed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"generate_brief_with_web"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Q3 Data Protection Regulatory Update"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"brief"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Based on internal FISC compliance documentation... Additionally, [Web: FISC 2026 Revision Summary](https://example.com/fisc) published on 2026-06-10 introduces..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"web_citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://example.com/fisc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"FISC 2026 Revision Summary"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"publishedDate"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-06-10"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"web_search_enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"guardrail_applied"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Design Properties
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Detail&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Internal context priority&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Internal documents are the primary source; web supplements&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Graceful degradation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Web Search failure → behaves like &lt;code&gt;generate_brief&lt;/code&gt; (internal only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Citation separation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Internal sources and web sources are visually distinct in the brief&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Query safety&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Only the &lt;code&gt;web_query&lt;/code&gt; (or &lt;code&gt;title&lt;/code&gt;) is sent externally — never internal content&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cross-region&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Gateway in us-east-1 (Web Search Tool constraint); adds ~100-200ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Authorization&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Read-only action (same tier as &lt;code&gt;generate_brief&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Prompt injection defense&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Web results wrapped in &lt;code&gt;&amp;lt;web_search_results&amp;gt;&lt;/code&gt; as untrusted data&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Activation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sam deploy &lt;span class="nt"&gt;--parameter-overrides&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;EnableWebSearch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;AgentCoreGatewayId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;gateway-id&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;AgentCoreGatewayRegion&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;us-east-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Without these parameters, &lt;code&gt;generate_brief_with_web&lt;/code&gt; still works but produces internal-only briefs (graceful degradation).&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Verification Findings
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Lake Formation + Athena
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F9oksuqa5wmygv9o6ny04.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F9oksuqa5wmygv9o6ny04.png" alt="Athena recent queries showing UC30 quick-workspace queries" width="799" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Athena queries running against Glue tables backed by S3 AP data — the foundation for Quick Sight analytics.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fgha9svhpx9xrhjg3ikgv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fgha9svhpx9xrhjg3ikgv.png" alt="CloudFormation stack deployed for UC30" width="799" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The UC30 CloudFormation stack with all resources (API Gateway, Lambda, DynamoDB ApprovalsTable, Athena WorkGroup) deployed.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The test account had Lake Formation governing the Data Catalog. The Athena Query Lambda's execution role required &lt;strong&gt;Lake Formation permission grants&lt;/strong&gt; (DESCRIBE on DB, SELECT/DESCRIBE on tables) in addition to IAM. Production deployments should design LF-TBAC for role-based data visibility.&lt;/p&gt;

&lt;h3&gt;
  
  
  Amazon Quick × FSx for ONTAP S3 AP Integration Boundary
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fpnv0byog65u1pkl7wr0v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fpnv0byog65u1pkl7wr0v.png" alt="Quick S3 Knowledge Base connection attempt with FSx for ONTAP S3 AP alias" width="799" height="411"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Amazon Quick's S3 KB connector accepts the S3 AP alias but authorization fails due to FSx for ONTAP's dual-layer auth — leading to the recommendation below.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fahj08qvya1cxlg6bs68v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fahj08qvya1cxlg6bs68v.png" alt="Quick Knowledge integrations panel" width="800" height="426"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Quick provides multiple data integration paths — for FSx for ONTAP data, Bedrock KB (UC29) or Athena-mediated access is the validated route in this repository.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Amazon Quick's S3 knowledge base connector accepts an FSx for ONTAP S3 AP alias as a "valid URL," but connection verification fails with &lt;em&gt;"You do not have permissions to access the S3 bucket."&lt;/em&gt; This is &lt;strong&gt;not&lt;/strong&gt; a missing IAM permission that can be simply added — it is a structural authorization path mismatch. Three factors block the standard connector path:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;ARN format mismatch&lt;/strong&gt; — FSx for ONTAP S3 APs use &lt;code&gt;arn:aws:s3:{region}:{account}:accesspoint/{name}&lt;/code&gt; (not the &lt;code&gt;arn:aws:s3:::{bucket}&lt;/code&gt; format). The connector's internal IAM evaluation likely targets the alias as a bucket name, which does not match IAM policy evaluation for the AP ARN.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AP resource policy rejects the principal&lt;/strong&gt; — Adding Quick's data access role to the S3 AP resource policy returns &lt;code&gt;MalformedPolicy: Invalid principal in policy&lt;/code&gt;, indicating a principal constraint on FSx for ONTAP S3 AP policies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Layer 2: filesystem identity&lt;/strong&gt; — Even if IAM layers were resolved, the ONTAP file-system identity (UNIX UID or AD user) associated with the S3 AP must independently permit read access to the target files.&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;For readers attempting this path&lt;/strong&gt;: Use CloudTrail to capture the exact &lt;code&gt;AccessDenied&lt;/code&gt; event and identify Quick's calling principal ARN. If the &lt;code&gt;MalformedPolicy&lt;/code&gt; constraint can be resolved (potentially by configuring the S3 AP with an AD-based identity rather than UNIX root), the direct path may become viable. As of June 2026, &lt;strong&gt;Bedrock Knowledge Base → FSx for ONTAP S3 AP&lt;/strong&gt; is the confirmed working route for RAG ingestion.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Evidence-based implementation guidance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;FSx for ONTAP → RAG&lt;/strong&gt;: Bedrock KB (UC29) is the validated path in this repository&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quick Index&lt;/strong&gt;: Stage to a standard S3 bucket for predictable connector behavior&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quick Sight (BI)&lt;/strong&gt;: Athena-mediated access works (QuickSight role needs Athena/Glue/LF/results-bucket permissions)&lt;/li&gt;
&lt;li&gt;Direct Quick → FSx for ONTAP S3 AP (standard connector path): &lt;strong&gt;validated — does not work&lt;/strong&gt; (tested with UNIX root identity; &lt;code&gt;MalformedPolicy&lt;/code&gt; on AP resource policy). AD-based S3 AP identity configuration is tracked as a future hypothesis&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Glue Tables
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;analytics/&amp;lt;role&amp;gt;/&lt;/code&gt; CSVs are pointed to by Glue tables (&lt;code&gt;sales_pipeline&lt;/code&gt; / &lt;code&gt;it_incidents&lt;/code&gt;) created via Athena DDL. LOCATION uses S3 AP alias format: &lt;code&gt;s3://&amp;lt;alias&amp;gt;/quick-workspace/analytics/&amp;lt;role&amp;gt;/&lt;/code&gt;. For scale, migrate to Parquet + partitioning to reduce Athena scanned costs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Data Classification
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Output&lt;/th&gt;
&lt;th&gt;Classification&lt;/th&gt;
&lt;th&gt;Rationale&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Action API response (generate_brief)&lt;/td&gt;
&lt;td&gt;INTERNAL&lt;/td&gt;
&lt;td&gt;Source-derived summary; no external disclosure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Action API response (generate_brief_with_web)&lt;/td&gt;
&lt;td&gt;INTERNAL&lt;/td&gt;
&lt;td&gt;Contains internal citations; web portion is PUBLIC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Action API response (create/approve/execute)&lt;/td&gt;
&lt;td&gt;INTERNAL&lt;/td&gt;
&lt;td&gt;Business operation records&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Athena query results (results bucket)&lt;/td&gt;
&lt;td&gt;INTERNAL&lt;/td&gt;
&lt;td&gt;Encrypted + 30-day lifecycle + TLS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DynamoDB ApprovalsTable&lt;/td&gt;
&lt;td&gt;INTERNAL&lt;/td&gt;
&lt;td&gt;Approval state metadata&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SNS notifications&lt;/td&gt;
&lt;td&gt;INTERNAL&lt;/td&gt;
&lt;td&gt;Action summaries only; no file content&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web Search results (raw)&lt;/td&gt;
&lt;td&gt;PUBLIC&lt;/td&gt;
&lt;td&gt;External public information&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Extend &lt;code&gt;shared/data_classification.py&lt;/code&gt; for regulated workloads (CUI / FISC / HIPAA).&lt;/p&gt;




&lt;h2&gt;
  
  
  Cost
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Monthly estimate&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Amazon Quick&lt;/td&gt;
&lt;td&gt;Per-user/plan billing&lt;/td&gt;
&lt;td&gt;Separate; unsubscribe when done&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Athena&lt;/td&gt;
&lt;td&gt;Scanned-data pricing&lt;/td&gt;
&lt;td&gt;Reduce with Parquet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lambda / API Gateway&lt;/td&gt;
&lt;td&gt;Serverless pay-per-use&lt;/td&gt;
&lt;td&gt;&amp;lt; $10 for moderate usage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bedrock LLM (briefs)&lt;/td&gt;
&lt;td&gt;Usage-based&lt;/td&gt;
&lt;td&gt;Usage-based; verify the current model price&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DynamoDB (approvals)&lt;/td&gt;
&lt;td&gt;Pay-per-request&lt;/td&gt;
&lt;td&gt;Minimal for approval records&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS Budgets alarm&lt;/td&gt;
&lt;td&gt;Free (SNS delivery cost only)&lt;/td&gt;
&lt;td&gt;Created when &lt;code&gt;NotificationEmail&lt;/code&gt; set&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AgentCore Web Search (opt-in)&lt;/td&gt;
&lt;td&gt;Per-query pricing (see &lt;a href="https://aws.amazon.com/bedrock/agentcore/pricing/" rel="noopener noreferrer"&gt;AgentCore pricing&lt;/a&gt;)&lt;/td&gt;
&lt;td&gt;Gateway invocation pricing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-region transfer (opt-in)&lt;/td&gt;
&lt;td&gt;&amp;lt; $0.02&lt;/td&gt;
&lt;td&gt;us-east-1 ↔ ap-northeast-1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Teardown / rebuild: one-command idempotent scripts (&lt;code&gt;scripts/teardown-uc29-uc30.sh&lt;/code&gt; / &lt;code&gt;scripts/rebuild-uc29-kb.py&lt;/code&gt;)&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns.git
&lt;span class="nb"&gt;cd &lt;/span&gt;FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns/solutions/genai/quick-agentic-workspace

&lt;span class="c"&gt;# Install dependencies&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt        &lt;span class="c"&gt;# or: uv pip install -r requirements.txt&lt;/span&gt;

&lt;span class="nb"&gt;cat &lt;/span&gt;samconfig.toml.example  &lt;span class="c"&gt;# Review parameters&lt;/span&gt;

sam build &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; sam deploy &lt;span class="nt"&gt;--guided&lt;/span&gt;

&lt;span class="c"&gt;# DemoMode=true runs without FSx for ONTAP (regular S3 bucket)&lt;/span&gt;

&lt;span class="c"&gt;# Optional: Enable Web Search hybrid RAG&lt;/span&gt;
sam deploy &lt;span class="nt"&gt;--parameter-overrides&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;EnableWebSearch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;AgentCoreGatewayId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;gateway-id&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;AgentCoreGatewayRegion&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;us-east-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Governance Note
&lt;/h2&gt;

&lt;p&gt;This article is technical architecture guidance, not legal, compliance, or regulatory advice. Amazon Quick features, pricing, regional availability, and connector behavior are subject to change — verify with official documentation and your own account settings. S3 AP data source boundaries are at volume/prefix granularity. For per-user visibility control, use Quick's document-level ACL or Custom Permission-Aware RAG. Web Search Tool usage requires compliance with the Acceptable Use Policy (source citations must be retained and displayed in end-user output).&lt;/p&gt;




&lt;p&gt;Yoshiki Fujiwara&lt;/p&gt;

</description>
      <category>aws</category>
      <category>amazonquick</category>
      <category>amazonfsxfornetappontap</category>
      <category>s3accesspoints</category>
    </item>
    <item>
      <title>Amazon Bedrock Knowledge Bases + FSx for ONTAP S3 Access Points: Self-Service AI Curation via Windows Drag &amp; Drop — Phase 16</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Sun, 21 Jun 2026 15:19:58 +0000</pubDate>
      <link>https://dev.to/aws-builders/amazon-bedrock-knowledge-bases-fsx-for-ontap-s3-access-points-self-service-ai-curation-via-3poe</link>
      <guid>https://dev.to/aws-builders/amazon-bedrock-knowledge-bases-fsx-for-ontap-s3-access-points-self-service-ai-curation-via-3poe</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;"Put files in this Windows folder and your AI assistant can use them after the next governed sync" — UC29 reduces the handoff friction around enterprise AI knowledge updates. Business users maintain Amazon Bedrock Knowledge Base content through &lt;strong&gt;Windows Explorer drag &amp;amp; drop&lt;/strong&gt; on an FSx for ONTAP SMB share. No S3 console, no ETL, no copy. Data remains in the file system as the operational source; the S3 Access Point provides a governed access path for ingestion without creating a separate object copy.&lt;/p&gt;

&lt;p&gt;Three maturity stages + one opt-in enhancement:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scenario A (manual)&lt;/strong&gt;: User places files → triggers sync from console/CLI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scenario B (scheduled)&lt;/strong&gt;: EventBridge Scheduler + Step Functions poll every 15 minutes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scenario C (event-driven)&lt;/strong&gt;: FPolicy detects file placement → real-time KB sync&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid RAG (opt-in)&lt;/strong&gt;: Internal KB answers augmented with real-time web search via AgentCore Web Search Tool (GA June 2026)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Vector store: &lt;strong&gt;Amazon S3 Vectors&lt;/strong&gt; — a managed, cost-aware vector store option for Bedrock Knowledge Bases, with metadata filtering available for permission-aware retrieval designs.&lt;/p&gt;

&lt;p&gt;Repository: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns&lt;/a&gt; (see &lt;code&gt;solutions/genai/kb-selfservice-curation/&lt;/code&gt; and &lt;code&gt;samconfig.toml.example&lt;/code&gt; for deployment parameters)&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Pattern?
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Challenge&lt;/th&gt;
&lt;th&gt;Traditional approach&lt;/th&gt;
&lt;th&gt;This pattern&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Knowledge updates blocked on IT&lt;/td&gt;
&lt;td&gt;Ticket → manual ETL&lt;/td&gt;
&lt;td&gt;Business user drags &amp;amp; drops&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dual management (NAS + S3 copy)&lt;/td&gt;
&lt;td&gt;Source drifts from S3 replica&lt;/td&gt;
&lt;td&gt;S3 AP reads the single source directly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Forgotten re-ingestion&lt;/td&gt;
&lt;td&gt;Manual and easy to forget&lt;/td&gt;
&lt;td&gt;Automatic (Scenario B/C)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Specialist skills required&lt;/td&gt;
&lt;td&gt;ETL / S3 / Bedrock expertise&lt;/td&gt;
&lt;td&gt;Familiar Windows folder operations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Answers limited to internal docs&lt;/td&gt;
&lt;td&gt;No current external context&lt;/td&gt;
&lt;td&gt;Hybrid RAG: internal + web search (opt-in)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vector store cost&lt;/td&gt;
&lt;td&gt;OpenSearch Serverless ~$175/month minimum&lt;/td&gt;
&lt;td&gt;S3 Vectors: cost-aware pay-per-use profile&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;Users drag files into an &lt;strong&gt;AI-dedicated NTFS volume&lt;/strong&gt; (SMB share, ACL-separated by department) on FSx for ONTAP. The same volume is exposed via S3 Access Point as a Bedrock Knowledge Base data source:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F33ekl3ijq81dutipoo8a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F33ekl3ijq81dutipoo8a.png" alt="Windows Explorer showing the SMB share with 7 role-based folders (sales, marketing, finance, IT, operations, legal, developers)" width="799" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Users see a familiar Windows folder structure — each department has its own folder with NTFS ACL separation.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fqemain6saj2aiwikq8tt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fqemain6saj2aiwikq8tt.png" alt="Product catalog files placed via drag &amp;amp; drop in the sales/product-catalog folder" width="799" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Drag &amp;amp; drop a product spec into the sales folder — that is the only content-maintenance step for the user.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Windows Explorer (drag &amp;amp; drop)
  → FSx for ONTAP volume (ai_knowledge/)
  → S3 Access Point (read-only, internet-origin)
  → Bedrock Knowledge Base Data Source (inclusionPrefixes: ai-knowledge/)
  → StartIngestionJob → S3 Vectors (vector store) updated
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Seven business roles (sales / marketing / finance / IT / operations / legal / developers) share the volume with NTFS ACL-based permission separation.&lt;/p&gt;

&lt;p&gt;Design note: S3 Access Points for FSx for ONTAP let S3-compatible AWS services access file data without copying it to an S3 bucket. Access is authorized twice: first through AWS/IAM policies on the access point path, and then through the file-system identity and permissions associated with the access point.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hybrid RAG Flow (opt-in)
&lt;/h3&gt;

&lt;p&gt;When &lt;code&gt;EnableWebSearch=true&lt;/code&gt;, the Query Lambda augments internal KB answers with real-time web information:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User question
  ├─→ [1] Bedrock KB RetrieveAndGenerate (internal docs via S3 AP → S3 Vectors)
  ├─→ [2] AgentCore Web Search (us-east-1, cross-region MCP call)
  │       Amazon-operated web index (tens of billions of docs, continuously updated)
  └─→ [3] Bedrock Converse → unified answer with dual citations
            [Internal: product-spec.pdf] + [Web: Market Report 2026](https://...)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Scenario B: Scheduled Automation (Step Functions)
&lt;/h2&gt;

&lt;p&gt;The most common production deployment. EventBridge Scheduler triggers a Step Functions workflow every 15 minutes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;EventBridge Scheduler (rate(15 minutes))
  └─→ Step Functions State Machine
       ├─→ DetectAndStartIngestion Lambda
       │     • ListObjectsV2 via S3 AP → compare with last-known state
       │     • If changes detected → StartIngestionJob
       │     • If no changes → skip (cost-zero run)
       ├─→ Wait (30s) → CheckIngestionStatus Lambda (poll loop)
       └─→ NotifySuccess / NotifyFailure → SNS
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key design choices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Differential detection&lt;/strong&gt;: Compares current S3 AP listing against prior state; only triggers ingestion when files actually changed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Idempotent&lt;/strong&gt;: Re-running the same schedule with no file changes is a no-op (no wasted Bedrock ingestion cost)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observable&lt;/strong&gt;: Each Step Functions execution is visible in the console with per-state timing and error details&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configurable interval&lt;/strong&gt;: The &lt;code&gt;ScheduleExpression&lt;/code&gt; parameter accepts any EventBridge rate/cron expression&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the &lt;strong&gt;required safety net&lt;/strong&gt; for Scenario C (see below) — it catches any files missed during the lost-update window.&lt;/p&gt;




&lt;h2&gt;
  
  
  Scenario C: FPolicy Event-Driven Real-Time Sync
&lt;/h2&gt;

&lt;p&gt;Scenario B's 15-minute polling is replaced by FPolicy real-time detection:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Windows/NFS file operation
  → FPolicy instant detection (CREATE/WRITE/DELETE/RENAME)
  → FPolicy Server → SQS → Bridge Lambda → EventBridge custom bus
  → EventBridge Rule (file_path prefix = ai_knowledge)
  → KB Trigger Lambda (debounce) → StartIngestionJob
  → Bedrock KB → reflected in tens of seconds to minutes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The FPolicy → SQS → EventBridge front-end reuses the existing &lt;code&gt;solutions/event-driven/fpolicy&lt;/code&gt; pattern infrastructure. UC29 adds only an EventBridge rule and the KB Trigger Lambda.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lost-Update Window (Critical)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fcto6v8g44h5ttfkw8kcn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fcto6v8g44h5ttfkw8kcn.png" alt="EventBridge rule filtering FPolicy events for ai_knowledge prefix" width="799" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The EventBridge rule routes only FPolicy events matching the &lt;code&gt;ai_knowledge&lt;/code&gt; volume path to the KB Trigger Lambda.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Bedrock Ingestion performs a full-source scan at job start time. Files added &lt;em&gt;during&lt;/em&gt; a running job are &lt;strong&gt;not&lt;/strong&gt; included in that execution. Scenario C alone does &lt;strong&gt;not&lt;/strong&gt; guarantee zero missed files.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mandatory&lt;/strong&gt;: Always pair Scenario C with Scenario B (periodic reconcile sync) as a safety net. The KB Trigger Lambda skips when a job is already in progress (debounce + ConflictException handling + reserved concurrency = 2).&lt;/p&gt;

&lt;h3&gt;
  
  
  Namespace Pitfall
&lt;/h3&gt;

&lt;p&gt;FPolicy reports ONTAP volume-path namespace (&lt;code&gt;ai_knowledge/...&lt;/code&gt;, underscore). The KB S3 ingestion prefix (&lt;code&gt;ai-knowledge/&lt;/code&gt;, hyphen) is a &lt;strong&gt;different&lt;/strong&gt; namespace. Initial implementation confused the two, causing false-skip. The EventBridge rule and Lambda secondary filter now use a dedicated &lt;code&gt;FPOLICY_PATH_FILTER&lt;/code&gt; parameter for the volume-path namespace.&lt;/p&gt;




&lt;h2&gt;
  
  
  Hybrid RAG: Internal KB + Web Search (opt-in)
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;GA at AWS Summit NYC 2026 (June 17, 2026). Powered by AgentCore Web Search Tool.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Enterprise knowledge from FSx for ONTAP is treated as the primary internal source, while public web context is supplemental and untrusted. For questions that benefit from current external context — regulatory updates, market trends, public market information — the Query Lambda can optionally augment answers with real-time web search results.&lt;/p&gt;

&lt;h3&gt;
  
  
  How It Works
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Internal KB retrieval&lt;/strong&gt; (always): Bedrock KB searches S3 Vectors for relevant chunks from FSx for ONTAP documents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web search&lt;/strong&gt; (opt-in): AgentCore Gateway invokes Amazon's purpose-built web index via MCP protocol&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unified answer&lt;/strong&gt;: Bedrock Converse merges both contexts, with internal documents as primary source&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Key Design Decisions
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Decision&lt;/th&gt;
&lt;th&gt;Rationale&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Opt-in&lt;/strong&gt; (&lt;code&gt;EnableWebSearch=false&lt;/code&gt; default)&lt;/td&gt;
&lt;td&gt;Most enterprise QA needs internal data only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Graceful degradation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Web Search failure → internal-only answer (no error surfaced to user)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Cross-region&lt;/strong&gt; (us-east-1 Gateway)&lt;/td&gt;
&lt;td&gt;Web Search Tool is us-east-1 only; adds ~100-200ms latency&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Query safety&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Only user's question text is sent to Web Search — never internal document content&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Citation separation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;[Internal: filename]&lt;/code&gt; vs &lt;code&gt;[Web: title](URL)&lt;/code&gt; — users see exactly which source informed each claim&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Prompt injection defense&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Web results wrapped in &lt;code&gt;&amp;lt;web_search_results&amp;gt;&lt;/code&gt; with explicit "untrusted data" instruction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Acceptable Use compliance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Source URLs and titles are always displayed (Web Search Tool TOS requirement)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Deployment
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sam deploy &lt;span class="nt"&gt;--parameter-overrides&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;EnableWebSearch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;AgentCoreGatewayId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;gateway-id&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;AgentCoreGatewayRegion&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;us-east-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Example Response
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"completed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"What are the latest FISC guidelines for cloud data protection?"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"answer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Based on internal documentation, our current FISC compliance posture covers... Additionally, [Web: FISC 2026 Guidelines Update](https://example.com/fisc-2026) published last month introduces..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"s3://.../legal/compliance/fisc-overview.pdf"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"web_citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://example.com/fisc-2026"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"FISC 2026 Guidelines Update"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"web"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"web_search_enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Verification Highlights
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Windows-Identity S3 Access Point with Dedicated AD
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F5yiskgokk293fm3z0nru.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F5yiskgokk293fm3z0nru.png" alt="Windows Explorer Quick Access showing mapped FSx for ONTAP SMB share" width="799" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The Windows EC2 domain-joined to AWS Managed Microsoft AD, with the FSx SMB share mapped — proving the literal drag &amp;amp; drop experience works end to end.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;To demonstrate the literal Windows drag &amp;amp; drop experience, we built a dedicated AWS Managed Microsoft AD + domain-joined Windows EC2 + AD-joined SVM:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AD-joined SVM OU&lt;/strong&gt;: AWS Managed AD's &lt;code&gt;OU=Computers&lt;/code&gt; lacks delegation rights → use the domain-name OU (&lt;code&gt;OU=&amp;lt;domain&amp;gt;,DC=...&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CIFS share creation&lt;/strong&gt;: Executes against the &lt;strong&gt;filesystem management LIF&lt;/strong&gt;, not the SVM LIF&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Windows-identity S3 AP&lt;/strong&gt;: Works correctly with a running dedicated AD; files dropped in Explorer are readable via S3 AP&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Deletion Lifecycle
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fsen1lqx0ufp2p8a5l7hd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fsen1lqx0ufp2p8a5l7hd.png" alt="Bedrock KB data source showing Sync button and ingestion status" width="799" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The Bedrock KB data source connected to the FSx for ONTAP S3 AP alias. Click "Sync" for manual ingestion, or let Scenario B/C automate it.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F022n7bjwsbny601zthha.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F022n7bjwsbny601zthha.png" alt="Step Functions execution graph — all states succeeded" width="799" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Scenario B's Step Functions workflow: detect changes → start ingestion → poll status → notify on completion.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;"User deletes a file → AI forgets it" verified end-to-end: file deletion → next sync → &lt;code&gt;numberOfDocumentsDeleted=1&lt;/code&gt; → re-query returns "no information found". Powered by &lt;code&gt;dataDeletionPolicy=DELETE&lt;/code&gt;. For urgent revocation between syncs, call the Ingestion API directly.&lt;/p&gt;




&lt;h2&gt;
  
  
  Performance Considerations
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Shared bandwidth&lt;/strong&gt;: S3 AP reads share the FSx throughput capacity (128/256/512 MBps) with NFS/SMB workloads. Scenario B's 15-minute interval and Scenario C's reserved concurrency (2) throttle ingestion flow&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bulk re-index&lt;/strong&gt;: For full re-ingestion (e.g., embedding model change), use a &lt;strong&gt;FlexClone volume&lt;/strong&gt; as the Ingestion target — zero impact on production I/O, consistent point-in-time read&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tiering&lt;/strong&gt;: Frequently accessed AI knowledge should remain on the SSD tier. Capacity Pool retrieval latency affects GetObject time during ingestion&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web Search latency&lt;/strong&gt;: Cross-region call to us-east-1 adds ~100-200ms. Total hybrid query latency depends on KB size, model, and network conditions (KB retrieve + Web Search + Converse generation)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Access Control — Three Layers
&lt;/h2&gt;

&lt;p&gt;S3 AP boundaries are volume/prefix-level. For per-user visibility:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Search narrowing&lt;/strong&gt; = Bedrock KB metadata filters (this UC; not AWS authorization)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document-level ACL&lt;/strong&gt; = Amazon Quick S3 Knowledge Base (UC30; user/group-level)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chunk-level permission filter&lt;/strong&gt; = Custom Permission-Aware RAG (FC3; AD SID/NTFS ACL for regulated industries)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Web Search results are public information — no ACL filtering needed. However, the &lt;strong&gt;unified answer&lt;/strong&gt; that combines internal + web sources is subject to the same access control as internal-only answers (the internal citations remain permission-scoped).&lt;/p&gt;




&lt;h2&gt;
  
  
  Vector Store: Why S3 Vectors
&lt;/h2&gt;

&lt;p&gt;This pattern uses &lt;strong&gt;Amazon S3 Vectors&lt;/strong&gt; as the Bedrock KB vector store. OpenSearch Serverless remains a valid option when its operational and latency profile fits the workload better.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Criterion&lt;/th&gt;
&lt;th&gt;OpenSearch Serverless&lt;/th&gt;
&lt;th&gt;S3 Vectors&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Minimum monthly cost&lt;/td&gt;
&lt;td&gt;~$175 (2 OCU)&lt;/td&gt;
&lt;td&gt;Pay-per-use only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost at scale&lt;/td&gt;
&lt;td&gt;OCU-based&lt;/td&gt;
&lt;td&gt;Cost savings for large vector datasets (see AWS documentation)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Metadata filtering&lt;/td&gt;
&lt;td&gt;Supported&lt;/td&gt;
&lt;td&gt;Supported (department, owner, role)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Permission-Aware RAG compatibility&lt;/td&gt;
&lt;td&gt;Supported&lt;/td&gt;
&lt;td&gt;Compatible with metadata-filtered retrieval designs; authorization enforced by application layer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Infrastructure management&lt;/td&gt;
&lt;td&gt;Managed but OCU scaling required&lt;/td&gt;
&lt;td&gt;Managed vector operations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scale&lt;/td&gt;
&lt;td&gt;Millions of vectors&lt;/td&gt;
&lt;td&gt;2 billion vectors per index&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query latency&lt;/td&gt;
&lt;td&gt;Sub-100ms&lt;/td&gt;
&lt;td&gt;Sub-100ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For this project — 28 industry patterns + PoC-to-production lifecycle — S3 Vectors' pay-per-use model is the right fit. We evaluated Bedrock Managed Knowledge Base (GA June 2026, AWS Summit NYC) but chose Custom KB + S3 Vectors for cost control, ACL metadata flexibility, and FSx for ONTAP lifecycle integration (see ADR: &lt;code&gt;docs/investigations/managed-kb-vs-custom-kb-s3vectors.md&lt;/code&gt;).&lt;/p&gt;




&lt;h2&gt;
  
  
  Data Classification
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Output&lt;/th&gt;
&lt;th&gt;Classification&lt;/th&gt;
&lt;th&gt;Rationale&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;KB vectors + metadata&lt;/td&gt;
&lt;td&gt;INTERNAL&lt;/td&gt;
&lt;td&gt;Inherits source file classification&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ingestion job status / SNS&lt;/td&gt;
&lt;td&gt;INTERNAL&lt;/td&gt;
&lt;td&gt;Operational metadata only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CloudWatch Metrics / Logs&lt;/td&gt;
&lt;td&gt;INTERNAL&lt;/td&gt;
&lt;td&gt;Aggregate metrics, no file content&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web Search results&lt;/td&gt;
&lt;td&gt;PUBLIC&lt;/td&gt;
&lt;td&gt;External public information&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hybrid answer (internal + web)&lt;/td&gt;
&lt;td&gt;INTERNAL&lt;/td&gt;
&lt;td&gt;Contains internal document citations&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For regulated workloads (CUI / FISC / HIPAA), extend &lt;code&gt;shared/data_classification.py&lt;/code&gt; labels. If retention-period requirements apply, use &lt;code&gt;dataDeletionPolicy=RETAIN&lt;/code&gt; and design a separate purge procedure.&lt;/p&gt;




&lt;h2&gt;
  
  
  Cost
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Monthly estimate&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Lambda (sync + query)&lt;/td&gt;
&lt;td&gt;&amp;lt; $5&lt;/td&gt;
&lt;td&gt;Serverless pay-per-use&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3 API (ListObjects, GetObject)&lt;/td&gt;
&lt;td&gt;&amp;lt; $1&lt;/td&gt;
&lt;td&gt;S3 AP reads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EventBridge Scheduler&lt;/td&gt;
&lt;td&gt;&amp;lt; $1&lt;/td&gt;
&lt;td&gt;15-min interval&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bedrock KB Ingestion&lt;/td&gt;
&lt;td&gt;Usage-based&lt;/td&gt;
&lt;td&gt;Per-document embedding&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3 Vectors&lt;/td&gt;
&lt;td&gt;Usage-based&lt;/td&gt;
&lt;td&gt;Compare with OpenSearch Serverless for your query volume, latency, and operations requirements&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bedrock LLM (query)&lt;/td&gt;
&lt;td&gt;Usage-based&lt;/td&gt;
&lt;td&gt;Nova Pro: $0.0008/1K input tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FPolicy Server (Scenario C)&lt;/td&gt;
&lt;td&gt;~$35&lt;/td&gt;
&lt;td&gt;ECS Fargate (set desiredCount=0 when idle)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AgentCore Web Search (opt-in)&lt;/td&gt;
&lt;td&gt;Per-query pricing (see &lt;a href="https://aws.amazon.com/bedrock/agentcore/pricing/" rel="noopener noreferrer"&gt;AgentCore pricing&lt;/a&gt;)&lt;/td&gt;
&lt;td&gt;Gateway invocation pricing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-region transfer (opt-in)&lt;/td&gt;
&lt;td&gt;&amp;lt; $0.02&lt;/td&gt;
&lt;td&gt;us-east-1 ↔ ap-northeast-1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns.git
&lt;span class="nb"&gt;cd &lt;/span&gt;FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns/solutions/genai/kb-selfservice-curation

&lt;span class="c"&gt;# Install dependencies (shared modules used by Lambda handlers)&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt        &lt;span class="c"&gt;# or: uv pip install -r requirements.txt&lt;/span&gt;

&lt;span class="c"&gt;# Review parameters&lt;/span&gt;
&lt;span class="nb"&gt;cat &lt;/span&gt;samconfig.toml.example

&lt;span class="c"&gt;# Build and deploy (requires configured AWS credentials + FSx for ONTAP S3 AP)&lt;/span&gt;
sam build &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; sam deploy &lt;span class="nt"&gt;--guided&lt;/span&gt;

&lt;span class="c"&gt;# DemoMode=true runs without FSx for ONTAP (regular S3 bucket)&lt;/span&gt;

&lt;span class="c"&gt;# Optional: Enable Web Search hybrid RAG&lt;/span&gt;
sam deploy &lt;span class="nt"&gt;--parameter-overrides&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;EnableWebSearch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;AgentCoreGatewayId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;gateway-id&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nv"&gt;AgentCoreGatewayRegion&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;us-east-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Governance Note
&lt;/h2&gt;

&lt;p&gt;This article is technical architecture guidance, not legal, compliance, or regulatory advice. Pricing, regional availability, and benchmark numbers are time-sensitive; verify them against current AWS documentation before production use. S3 AP data source boundaries are at volume/prefix granularity — for per-user visibility control, consider Custom Permission-Aware RAG. If retention-period requirements (NARA / FISC) apply, use &lt;code&gt;dataDeletionPolicy=RETAIN&lt;/code&gt; and design purge procedures separately. Web Search Tool usage requires compliance with the Acceptable Use Policy (source citations must be displayed).&lt;/p&gt;




&lt;p&gt;Yoshiki Fujiwara&lt;/p&gt;

</description>
      <category>aws</category>
      <category>amazonbedrock</category>
      <category>amazonfsxfornetappontap</category>
      <category>s3accesspoints</category>
    </item>
    <item>
      <title>From Logs to Detection: Building a File Security Pipeline in Datadog for FSx for ONTAP</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Sun, 14 Jun 2026 16:44:19 +0000</pubDate>
      <link>https://dev.to/aws-builders/from-logs-to-detection-building-a-file-security-pipeline-in-datadog-for-fsx-for-ontap-2262</link>
      <guid>https://dev.to/aws-builders/from-logs-to-detection-building-a-file-security-pipeline-in-datadog-for-fsx-for-ontap-2262</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;For security teams&lt;/strong&gt;: Four threshold monitors + one ML anomaly monitor + four Cloud SIEM detection rules catch mass file deletion, data exfiltration, permission tampering, and unusual geography in under 5 minutes. EventID codes become human-readable operation names. All created via Datadog API — no manual clicking required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For engineers&lt;/strong&gt;: The Log Pipeline (6 processors including GeoIP enrichment) transforms raw ONTAP XML events into searchable, alertable fields. Five Saved Views cover the most common investigation patterns. Total AWS cost remains ~$1.50/month.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For architects&lt;/strong&gt;: This pattern demonstrates the full detection-to-response lifecycle — from raw file system audit events through enrichment, categorization, alerting, cross-service correlation, and automated remediation — entirely serverless, entirely as code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FSx for ONTAP → S3 Access Point → Lambda → Datadog Logs API v2
                                              │
                                              ▼
                                   ┌──────────────────────┐
                                   │ Log Pipeline (6)     │
                                   │  • Category Processor│
                                   │  • Status Remapper   │
                                   │  • Date Remapper     │
                                   │  • Attribute Remapper│
                                   │  • GeoIP Enrichment  │
                                   └──────────┬───────────┘
                                              │
                          ┌───────────────────┼───────────────────┐
                          ▼                   ▼                   ▼
                ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
                │ Log Monitors (4)│ │ Cloud SIEM (4)  │ │ Log Archive     │
                │ + Anomaly (1)   │ │ Security Signals│ │ → S3 → Glacier  │
                └────────┬────────┘ └────────┬────────┘ └─────────────────┘
                         │                   │
                         ▼                   ▼
                ┌──────────────────────────────────────┐
                │ Workflow → Slack + Case + Lambda     │
                │          (ONTAP Snapshot remediation)│
                └──────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Reading guide&lt;/strong&gt;: Sections 1-9 cover the core pipeline (30 min to deploy). Sections 10+ cover advanced SOC integration (additional 15 min). Jump to Deployment if you want to start immediately.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Business value&lt;/strong&gt;: This pipeline reduces mean-time-to-detect (MTTD) for file-based threats from "hours/days" (batch log review) to under 5 minutes (real-time alerting). For regulated environments, it provides continuous compliance evidence with automated archival — replacing manual quarterly reviews with always-on monitoring.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is Part 16 of &lt;a href="https://dev.to/aws-builders/why-your-fsx-for-ontap-audit-logs-deserve-better-than-ec2-kod"&gt;Serverless Observability for FSx for ONTAP&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Post-Ingestion Processing Matters
&lt;/h2&gt;

&lt;p&gt;In Part 1, we shipped raw audit events to Datadog. That's necessary but insufficient. Raw events look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"event_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"4660"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CORP&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;contractor-ext-03"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/share/engineering/designs/prototype-v3.dwg"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"result"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Audit Failure"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Questions that raw logs can't answer quickly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What is EventID 4660? (Answer: Object Delete)&lt;/li&gt;
&lt;li&gt;Should this alarm? (Answer: depends on volume and context)&lt;/li&gt;
&lt;li&gt;Who should investigate? (Answer: Storage team + SOC)&lt;/li&gt;
&lt;li&gt;What's the user's baseline? (Answer: need faceted historical view)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This article builds the processing layers that transform raw data into actionable security intelligence.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Why EventBridge Scheduler?&lt;/strong&gt; FSx for ONTAP S3 Access Points do not support S3 Event Notifications or EventBridge object-level events. Lambda polls on a schedule and uses checkpointing to process only new files. XML format is chosen for Lambda-native parsing without binary dependencies (vs EVTX which requires Windows-specific libraries).&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────────────────────────────────────────────────┐
│ Lambda Handler (Python 3.12)                                     │
│  • Parse XML → Normalize → HEC-compatible JSON                   │
│  • POST to Datadog Logs API v2                                   │
│  • Fields: event_type, user, path, client_ip, svm, result        │
└──────────────────────────────────────────────────────────────────┘
        │  HTTP 202
        ▼
┌──────────────────────────────────────────────────────────────────┐
│ Datadog Log Pipeline: "FSx for ONTAP Audit Logs"                 │
│ Filter: source:fsxn                                              │
│                                                                  │
│ 1. Category Processor → @operation_name                          │
│    4663→Object Access, 4660→Object Delete, 4656→Handle Request   │
│                                                                  │
│ 2. Status Remapper → log severity from @result                   │
│    "Audit Success" → info, "Audit Failure" → error               │
│                                                                  │
│ 3. Date Remapper → @timestamp as official log time               │
│                                                                  │
│ 4. Attribute Remapper → @user→usr.id, @client_ip→network.client  │
└──────────────────────────────────────────────────────────────────┘
        │
        ▼
┌──────────────────────────────────────────────────────────────────┐
│ Datadog Security Monitors (3)                                    │
│                                                                  │
│ • [FSxN] Mass File Deletion    → &amp;gt;50 deletes/5min per user       │
│ • [FSxN] Abnormal Access Volume → &amp;gt;1000 accesses/1h per user     │
│ • [FSxN] Access Failure Spike  → &amp;gt;10 failures/15min per user     │
└──────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Log Pipeline
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpkn02l39ljys4zs2ifxl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpkn02l39ljys4zs2ifxl.png" alt="Datadog Log Pipeline for FSxN" width="799" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Why a Pipeline?
&lt;/h3&gt;

&lt;p&gt;Without a pipeline, every query requires remembering that &lt;code&gt;4660&lt;/code&gt; means "delete" and &lt;code&gt;4663&lt;/code&gt; means "access." With a pipeline, you search &lt;code&gt;@operation_name:Object Delete&lt;/code&gt; — and Datadog handles the translation at ingest time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the Pipeline via API
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;urllib3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;

&lt;span class="n"&gt;sm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;secretsmanager&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ap-northeast-1&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_secret_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SecretId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;fsxn-datadog-api-key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;SecretString&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;app_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_secret_value&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SecretId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;datadog/fsxn-app-key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;SecretString&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;http&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;PoolManager&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FSx for ONTAP Audit Logs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;is_enabled&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source:fsxn&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;processors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;category-processor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;EventID to Operation Name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;is_enabled&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;target&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;operation_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;categories&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@event_type:4663&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Object Access&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@event_type:4656&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Handle Request&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@event_type:4660&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Object Delete&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@event_type:4670&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Permission Change&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@event_type:4658&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Handle Close&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@event_type:5140&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Share Access&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@event_type:5145&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Share Check&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@event_type:4624&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Logon&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@event_type:4634&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Logoff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status-remapper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Map result to log status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;is_enabled&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sources&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;date-remapper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Use event timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;is_enabled&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sources&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;attribute-remapper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Map user to usr.id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;is_enabled&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sources&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;attribute&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;target&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;usr.id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;target_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;attribute&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;preserve_source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;override_on_conflict&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;attribute-remapper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Map client_ip to network.client.ip&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;is_enabled&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sources&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;client_ip&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;attribute&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;target&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;network.client.ip&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;target_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;attribute&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;preserve_source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;override_on_conflict&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;POST&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.ap1.datadoghq.com/api/v1/logs/config/pipelines&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pipeline&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DD-API-KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DD-APPLICATION-KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;app_key&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HTTP &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: Pipeline ID = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Site note&lt;/strong&gt;: Replace &lt;code&gt;api.ap1.datadoghq.com&lt;/code&gt; with your Datadog site's API endpoint (e.g., &lt;code&gt;api.datadoghq.com&lt;/code&gt; for US1, &lt;code&gt;api.datadoghq.eu&lt;/code&gt; for EU1).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  EventID Mapping Table
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;EventID&lt;/th&gt;
&lt;th&gt;Operation Name&lt;/th&gt;
&lt;th&gt;MITRE ATT&amp;amp;CK&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;4663&lt;/td&gt;
&lt;td&gt;Object Access&lt;/td&gt;
&lt;td&gt;T1005 (Data from Local System)&lt;/td&gt;
&lt;td&gt;File read/write operation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4656&lt;/td&gt;
&lt;td&gt;Handle Request&lt;/td&gt;
&lt;td&gt;T1005&lt;/td&gt;
&lt;td&gt;File handle opened&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4660&lt;/td&gt;
&lt;td&gt;Object Delete&lt;/td&gt;
&lt;td&gt;T1485 (Data Destruction)&lt;/td&gt;
&lt;td&gt;File deleted&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4670&lt;/td&gt;
&lt;td&gt;Permission Change&lt;/td&gt;
&lt;td&gt;T1222 (File Permissions Modification)&lt;/td&gt;
&lt;td&gt;ACL/permission modified&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4658&lt;/td&gt;
&lt;td&gt;Handle Close&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;File handle closed (low signal)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5140&lt;/td&gt;
&lt;td&gt;Share Access&lt;/td&gt;
&lt;td&gt;T1021.002 (SMB/Windows Admin)&lt;/td&gt;
&lt;td&gt;SMB share connected&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5145&lt;/td&gt;
&lt;td&gt;Share Check&lt;/td&gt;
&lt;td&gt;T1135 (Network Share Discovery)&lt;/td&gt;
&lt;td&gt;Share permission checked&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4624&lt;/td&gt;
&lt;td&gt;Logon&lt;/td&gt;
&lt;td&gt;T1078 (Valid Accounts)&lt;/td&gt;
&lt;td&gt;Authentication event&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4634&lt;/td&gt;
&lt;td&gt;Logoff&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;Session ended&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Security Monitors
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd99x3gsbs8suebkpzdp8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd99x3gsbs8suebkpzdp8.png" alt="Datadog Security Monitors for FSxN" width="799" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Monitor 1: Mass File Deletion
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;mass_delete&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[FSxN] Mass File Deletion Detected&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;log alert&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;logs(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source:fsxn @event_type:4660&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;).index(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;).rollup(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;).by(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;).last(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5m&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;) &amp;gt; 50&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;## Mass File Deletion Alert

A user has triggered more than 50 file deletion events within 5 minutes.

**User**: {{@user}}  |  **Count**: {{value}}

### Investigation Steps
1. Check affected paths: `source:fsxn @event_type:4660 @user:{{@user}}`
2. Verify if this is a scheduled cleanup or authorized bulk operation
3. Check the client IP for unexpected sources
4. Correlate with user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s normal deletion patterns

@slack-storage-alerts&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tags&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source:fsxn&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;team:storage&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;severity:high&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;options&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thresholds&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;critical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;warning&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;notify_no_data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;renotify_interval&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;evaluation_delay&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why 50?&lt;/strong&gt; In our test environment, normal daily deletions per user stay under 10. The threshold should be tuned based on your environment's baseline — run &lt;code&gt;source:fsxn @event_type:4660 | stats count by @user&lt;/code&gt; over a week to establish what's normal.&lt;/p&gt;

&lt;h3&gt;
  
  
  Monitor 2: Abnormal Access Volume
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;abnormal_access&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[FSxN] Abnormal Access Volume&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;log alert&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;logs(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source:fsxn @result:&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Audit Success&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="s"&gt;).index(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;).rollup(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;).by(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;).last(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1h&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;) &amp;gt; 1000&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;## Abnormal Access Volume Alert

More than 1000 successful file access events in 1 hour.
May indicate data exfiltration or unauthorized bulk access.

**User**: {{@user}}  |  **Count**: {{value}}

### Investigation Steps
1. Review patterns: `source:fsxn @user:{{@user}}`
2. Check if backup jobs or batch operations are running
3. Verify client IP and compare with known locations
4. Check accessed paths for sensitive content

@slack-security-alerts&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;options&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thresholds&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;critical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;warning&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;evaluation_delay&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Tuning note&lt;/strong&gt;: Backup service accounts (&lt;code&gt;svc-backup&lt;/code&gt;, &lt;code&gt;svc-indexer&lt;/code&gt;) generate legitimate high-volume access. Exclude them with &lt;code&gt;@user:-svc-*&lt;/code&gt; in the query or create a suppression rule.&lt;/p&gt;

&lt;h3&gt;
  
  
  Monitor 3: Access Failure Spike
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;failure_spike&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[FSxN] Access Failure Spike&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;log alert&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;logs(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source:fsxn @result:&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Audit Failure&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="s"&gt;).index(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;).rollup(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;).by(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;).last(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;15m&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;) &amp;gt; 10&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;## Access Failure Spike

More than 10 access failures in 15 minutes.
May indicate unauthorized access attempts or permission misconfiguration.

**User**: {{@user}}  |  **Count**: {{value}}

### Investigation Steps
1. Check failed paths: `source:fsxn @result:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Audit Failure&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; @user:{{@user}}`
2. Verify if permissions were recently changed
3. Check if user is accessing resources outside their scope
4. Correlate with AD group membership changes

@slack-security-alerts&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;options&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thresholds&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;critical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;warning&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;evaluation_delay&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Saved Views
&lt;/h2&gt;

&lt;p&gt;Five pre-configured views for common investigation patterns:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;View&lt;/th&gt;
&lt;th&gt;Query&lt;/th&gt;
&lt;th&gt;When to use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;FSxN File Deletions&lt;/td&gt;
&lt;td&gt;&lt;code&gt;source:fsxn @event_type:4660&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Monitor triggered, investigating which files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FSxN Access Failures&lt;/td&gt;
&lt;td&gt;&lt;code&gt;source:fsxn @result:"Audit Failure"&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Permission denied investigation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FSxN All Events&lt;/td&gt;
&lt;td&gt;&lt;code&gt;source:fsxn&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;General audit stream&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FSxN Sensitive Share Access&lt;/td&gt;
&lt;td&gt;&lt;code&gt;source:fsxn (@path:*finance* OR @path:*hr* OR @path:*legal*)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Sensitive data access review&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FSxN After-Hours Access&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;source:fsxn&lt;/code&gt; (filtered by time)&lt;/td&gt;
&lt;td&gt;Off-hours activity detection&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each view includes pre-configured columns (user, path, client_ip, event_type) so analysts don't need to customize the table every time.&lt;/p&gt;




&lt;h2&gt;
  
  
  Facets for One-Click Filtering
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F86psvqpt6woh4keun2o6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F86psvqpt6woh4keun2o6.png" alt="Datadog Log Explorer with Facets" width="799" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Facets add clickable filters to the left sidebar. Create them from any log entry's detail panel:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Facet&lt;/th&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Event Type&lt;/td&gt;
&lt;td&gt;&lt;code&gt;@event_type&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Filter to specific operations (4660=delete, 4663=access)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Operation&lt;/td&gt;
&lt;td&gt;&lt;code&gt;@operation_name&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Human-readable version (after pipeline applies)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User&lt;/td&gt;
&lt;td&gt;&lt;code&gt;@user&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Filter by specific user under investigation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SVM&lt;/td&gt;
&lt;td&gt;&lt;code&gt;@svm&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Isolate events to specific storage virtual machines&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File Path&lt;/td&gt;
&lt;td&gt;&lt;code&gt;@path&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Filter by directory or share&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Client IP&lt;/td&gt;
&lt;td&gt;&lt;code&gt;@client_ip&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Filter by source workstation/server&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Result&lt;/td&gt;
&lt;td&gt;&lt;code&gt;@result&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Quick split between success and failure events&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Fields are searchable with &lt;code&gt;@field:value&lt;/code&gt; syntax even without Facets. Facets add UI convenience — they're not required for alerting or saved views.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Cost Reality
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Monthly Cost&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Lambda (5-min schedule, ~1s execution)&lt;/td&gt;
&lt;td&gt;~$0.50&lt;/td&gt;
&lt;td&gt;8,640 invocations/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Secrets Manager (2 secrets)&lt;/td&gt;
&lt;td&gt;~$0.80&lt;/td&gt;
&lt;td&gt;API key + APP key&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EventBridge Scheduler&lt;/td&gt;
&lt;td&gt;~$0.00&lt;/td&gt;
&lt;td&gt;Free tier covers this&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3 AP reads&lt;/td&gt;
&lt;td&gt;~$0.05&lt;/td&gt;
&lt;td&gt;Depends on file count&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Datadog Forwarder Lambda&lt;/td&gt;
&lt;td&gt;~$0.10&lt;/td&gt;
&lt;td&gt;CloudTrail event processing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3 Archive storage&lt;/td&gt;
&lt;td&gt;~$0.02&lt;/td&gt;
&lt;td&gt;Glacier tier after 30 days&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AWS total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$1.50/month&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Datadog log ingestion&lt;/td&gt;
&lt;td&gt;~$0.10/GB&lt;/td&gt;
&lt;td&gt;Conventional pricing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total (1GB/month)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$1.60/month&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Datadog pricing caveat&lt;/strong&gt;: Log ingestion pricing varies by plan (Conventional vs On-Demand) and contract. The ~$0.10/GB is the list price for conventional plans; actual cost depends on your contract, committed volume, and retention choices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cloud SIEM pricing&lt;/strong&gt;: Cloud SIEM is a separate paid feature (~$0.20/GB analyzed for log detection). A 30-day free trial is available. Security Signals, Detection Rules, and the triage workflow require Cloud SIEM. Log Monitors (threshold alerts to Slack) work without it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Cost optimization patterns
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Filter before shipping&lt;/strong&gt; — Drop low-value EventIDs (4658 Handle Close) at the Lambda level to reduce ingestion volume&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compress payloads&lt;/strong&gt; — Enable gzip on the HTTP POST to reduce egress&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tune audit scope&lt;/strong&gt; — Configure ONTAP &lt;code&gt;vserver audit&lt;/code&gt; to monitor only the shares that matter&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Index retention&lt;/strong&gt; — Use Datadog's flexible retention (15/30/60/90 days) to match compliance needs vs cost&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  E2E Verification Results
&lt;/h2&gt;

&lt;p&gt;Verified on Datadog AP1 (ap1.datadoghq.com) with paid plan, June 2026:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Lambda → Datadog Logs API v2&lt;/td&gt;
&lt;td&gt;✅ HTTP 202&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Log Pipeline (5 processors)&lt;/td&gt;
&lt;td&gt;✅ Applied automatically&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Category Processor (9 EventIDs)&lt;/td&gt;
&lt;td&gt;✅ &lt;code&gt;@operation_name&lt;/code&gt; populated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Status Remapper&lt;/td&gt;
&lt;td&gt;✅ Failure events marked as error&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Date Remapper&lt;/td&gt;
&lt;td&gt;✅ Event timestamp used (not ingest time)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Attribute Remapper (usr.id)&lt;/td&gt;
&lt;td&gt;✅ Enables Datadog user features&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mass Delete Monitor&lt;/td&gt;
&lt;td&gt;✅ &lt;strong&gt;ALERT triggered&lt;/strong&gt; (55 events &amp;gt; threshold 50)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Access Failure Monitor&lt;/td&gt;
&lt;td&gt;✅ &lt;strong&gt;ALERT triggered&lt;/strong&gt; (12 events &amp;gt; threshold 10)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Abnormal Access Monitor&lt;/td&gt;
&lt;td&gt;✅ Active (OK — high threshold by design)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anomaly Monitor (ML)&lt;/td&gt;
&lt;td&gt;✅ Active (learning baseline)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Saved Views (5)&lt;/td&gt;
&lt;td&gt;✅ Accessible from Views dropdown&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Facets (8 custom)&lt;/td&gt;
&lt;td&gt;✅ Created via UI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dashboard (10 widgets)&lt;/td&gt;
&lt;td&gt;✅ FSx ONTAP Audit Log Overview&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Log-based Metrics (4)&lt;/td&gt;
&lt;td&gt;✅ &lt;code&gt;fsxn.audit.*&lt;/code&gt; in Metrics Explorer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sensitive Data Scanner (5 rules)&lt;/td&gt;
&lt;td&gt;✅ PII detection active&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloud SIEM Detection Rules (4)&lt;/td&gt;
&lt;td&gt;✅ Security Signals generated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CloudTrail Forwarder&lt;/td&gt;
&lt;td&gt;✅ Logs arriving in Datadog&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GeoIP Enrichment&lt;/td&gt;
&lt;td&gt;✅ Country/city populated on client_ip&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GuardDuty Correlation Rule&lt;/td&gt;
&lt;td&gt;✅ Cross-service detection active&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Workflow Automation&lt;/td&gt;
&lt;td&gt;✅ Monitor → Slack + Case&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Case Management (FSXN project)&lt;/td&gt;
&lt;td&gt;✅ Project + template case&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SOC Triage Runbook&lt;/td&gt;
&lt;td&gt;✅ 7-step Notebook&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Log Archive (S3 → Glacier)&lt;/td&gt;
&lt;td&gt;✅ source:fsxn archived&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Snapshot Remediation Lambda&lt;/td&gt;
&lt;td&gt;✅ Deployed (fsxn-snapshot-remediation)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Log-based Metrics
&lt;/h2&gt;

&lt;p&gt;Log-based metrics extract numerical values from logs at ingest time — you get metric-resolution dashboards and anomaly detection without paying for log retention beyond the evaluation window.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Created via Datadog API (no UI needed)
&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fsxn.audit.delete_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;         &lt;span class="c1"&gt;# Delete events grouped by user/SVM
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fsxn.audit.access_failure_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Failures by user/SVM/client_ip
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fsxn.audit.event_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;           &lt;span class="c1"&gt;# All events by event_type/SVM
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fsxn.audit.unique_users&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;          &lt;span class="c1"&gt;# Active users
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5mpppmz2l4v35r574c2i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5mpppmz2l4v35r574c2i.png" alt="Log-based Metrics Configuration" width="799" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Anomaly detection&lt;/strong&gt; on &lt;code&gt;fsxn.audit.delete_count&lt;/code&gt; — Datadog's built-in anomaly monitor catches unusual deletion patterns without manual thresholds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SLO tracking&lt;/strong&gt; — Define an SLO on access failure rate (&lt;code&gt;fsxn.audit.access_failure_count / fsxn.audit.event_count&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost efficiency&lt;/strong&gt; — Metrics are retained for 15 months at metric resolution (vs log retention at full-text cost)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Sensitive Data Scanner
&lt;/h2&gt;

&lt;p&gt;Audit logs frequently contain PII in file paths (&lt;code&gt;/hr/EMP-123456-performance-review.xlsx&lt;/code&gt;) and user contexts. Datadog's Sensitive Data Scanner catches these before they reach analysts.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rule&lt;/th&gt;
&lt;th&gt;What It Catches&lt;/th&gt;
&lt;th&gt;Why It Matters&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Employee ID (&lt;code&gt;EMP-\d{6}&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Employee records in paths&lt;/td&gt;
&lt;td&gt;PII exposure prevention&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JP Phone (&lt;code&gt;0[789]0-\d{4}-\d{4}&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Mobile numbers in filenames&lt;/td&gt;
&lt;td&gt;Privacy protection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Email in Path&lt;/td&gt;
&lt;td&gt;Customer emails in file names&lt;/td&gt;
&lt;td&gt;Data minimization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credit Card&lt;/td&gt;
&lt;td&gt;CC numbers in documents&lt;/td&gt;
&lt;td&gt;PCI-DSS technical control&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;My Number&lt;/td&gt;
&lt;td&gt;Japanese national ID&lt;/td&gt;
&lt;td&gt;Sensitive PII detection&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Compliance note&lt;/strong&gt;: Sensitive Data Scanner is a &lt;strong&gt;technical detection and redaction control&lt;/strong&gt;. It does not constitute full APPI/GDPR/PCI-DSS compliance, which requires organizational measures (consent management, purpose limitation, data subject rights, etc.). Consult your legal/compliance team for regulatory assessment.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo6qlfxkkm50epd5b1u0u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo6qlfxkkm50epd5b1u0u.png" alt="Sensitive Data Scanner" width="799" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Matched patterns are &lt;strong&gt;partially redacted&lt;/strong&gt; at ingest — the first 3 characters remain for identification, the rest is replaced with &lt;code&gt;[REDACTED]&lt;/code&gt;. This preserves investigation capability while protecting the data subject.&lt;/p&gt;




&lt;h2&gt;
  
  
  Enhanced Dashboard
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F751ugwm1gnhwzzyo8pfl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F751ugwm1gnhwzzyo8pfl.png" alt="Enhanced Dashboard — 10 Widgets" width="800" height="548"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The SOC-focused dashboard provides at-a-glance visibility into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Who&lt;/strong&gt; is generating the most activity (top users, top IPs)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What&lt;/strong&gt; operations dominate (sunburst by operation/SVM)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Where&lt;/strong&gt; access is concentrated (hot paths)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When&lt;/strong&gt; failures spike (timeline with per-user breakdown)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trend&lt;/strong&gt; from log-based metrics (delete rate over time)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Cloud SIEM Security Signals
&lt;/h2&gt;

&lt;p&gt;Beyond log monitors (which notify via Slack/PagerDuty), Security Signals integrate with Datadog's SOC workflow — triage, investigation, and response all in one place.&lt;/p&gt;

&lt;h3&gt;
  
  
  Log Monitors vs Cloud SIEM Detection Rules
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Log Monitors&lt;/th&gt;
&lt;th&gt;Cloud SIEM Detection Rules&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Output&lt;/td&gt;
&lt;td&gt;Alert notification (Slack/PagerDuty/email)&lt;/td&gt;
&lt;td&gt;Security Signal (appears in Security Signals panel)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Triage&lt;/td&gt;
&lt;td&gt;Manual (click alert → investigate)&lt;/td&gt;
&lt;td&gt;Built-in triage workflow (Open → In Progress → Archived)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MITRE mapping&lt;/td&gt;
&lt;td&gt;Manual tags&lt;/td&gt;
&lt;td&gt;Native MITRE ATT&amp;amp;CK framework integration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Correlation&lt;/td&gt;
&lt;td&gt;Single query&lt;/td&gt;
&lt;td&gt;Cross-log-source correlation possible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Case integration&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Direct "Create Case" from signal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Operational alerts (DevOps/SRE)&lt;/td&gt;
&lt;td&gt;Security investigation (SOC/IR teams)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Recommendation&lt;/strong&gt;: Use both. Log Monitors for immediate Slack notification to storage team. Cloud SIEM Detection Rules for SOC triage workflow and MITRE-mapped investigation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flkanywlx3y6n7t5u4iqb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flkanywlx3y6n7t5u4iqb.png" alt="Cloud SIEM Detection Rules" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Three detection rules generate Security Signals when FSxN audit logs match threat patterns:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rule&lt;/th&gt;
&lt;th&gt;MITRE ATT&amp;amp;CK&lt;/th&gt;
&lt;th&gt;Trigger&lt;/th&gt;
&lt;th&gt;Severity&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Mass File Deletion&lt;/td&gt;
&lt;td&gt;T1485 Data Destruction&lt;/td&gt;
&lt;td&gt;&amp;gt;50 deletes/5min per user&lt;/td&gt;
&lt;td&gt;Critical&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Brute Force File Access&lt;/td&gt;
&lt;td&gt;T1110 Brute Force&lt;/td&gt;
&lt;td&gt;&amp;gt;20 failures/15min per user+IP&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Permission Tampering&lt;/td&gt;
&lt;td&gt;T1222 File Permissions Modification&lt;/td&gt;
&lt;td&gt;&amp;gt;5 changes/10min per user&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Creating a Security Signal rule via API
&lt;/span&gt;&lt;span class="n"&gt;rule&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FSxN: Mass File Deletion&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;log_detection&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;queries&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source:fsxn @event_type:4660&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;groupByFields&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;@user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;aggregation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;delete_events&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dataSource&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;logs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cases&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;condition&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;delete_events &amp;gt; 50&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;critical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Critical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;condition&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;delete_events &amp;gt; 20&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;high&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;High&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;options&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;evaluationWindow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;keepAlive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxSignalDuration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;detectionMethod&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;threshold&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;User {{@user}} deleted {{value}} files in 5 min. T1485.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tags&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source:fsxn&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;technique:T1485-data-destruction&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each signal includes investigation steps and response guidance directly in the signal panel — analysts don't need to leave the Security Signals view to understand what happened.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cloud SIEM Setup Steps
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6cqngqkwacopqohlugcg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6cqngqkwacopqohlugcg.png" alt="Cloud SIEM Onboarding — Index Configuration" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Navigate to Security → Cloud SIEM → Get Started&lt;/li&gt;
&lt;li&gt;Skip Content Packs (FSxN is a custom source)&lt;/li&gt;
&lt;li&gt;Select &lt;code&gt;fsxn&lt;/code&gt; as a log source → Enable as Trial&lt;/li&gt;
&lt;li&gt;Configure Cloud SIEM Index (default: 450 days retention)&lt;/li&gt;
&lt;li&gt;Detection rules are automatically applied to incoming &lt;code&gt;source:fsxn&lt;/code&gt; logs&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Workflow Automation
&lt;/h2&gt;

&lt;p&gt;When a critical monitor fires, the Workflow automatically triggers response actions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Monitor Alert → @workflow-fsxn-security-alert-response → Slack notification + Investigation links
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The workflow (&lt;code&gt;fsxn-security-alert-response&lt;/code&gt;) is linked to monitors via the &lt;code&gt;@workflow-&amp;lt;handle&amp;gt;&lt;/code&gt; mention syntax in the monitor message. No additional configuration needed — just add the mention to any monitor's notification body.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Extensions&lt;/strong&gt; (via Datadog Workflow builder):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Auto-create Jira ticket with investigation context&lt;/li&gt;
&lt;li&gt;Query Active Directory for user's manager&lt;/li&gt;
&lt;li&gt;Invoke &lt;code&gt;fsxn-snapshot-remediation&lt;/code&gt; Lambda for evidence preservation (deployed — see Automated Snapshot Remediation)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Investigation Notebook
&lt;/h2&gt;

&lt;p&gt;A pre-built 5-step investigation template guides analysts through alert triage:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;Content&lt;/th&gt;
&lt;th&gt;Widget&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1. Identify Scope&lt;/td&gt;
&lt;td&gt;Event type distribution timeline&lt;/td&gt;
&lt;td&gt;Timeseries (bars)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2. User Timeline&lt;/td&gt;
&lt;td&gt;Delete events by user over time&lt;/td&gt;
&lt;td&gt;Timeseries (line)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3. Affected Files&lt;/td&gt;
&lt;td&gt;Top 20 deleted paths&lt;/td&gt;
&lt;td&gt;Top List&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4. Client IP Analysis&lt;/td&gt;
&lt;td&gt;Top 10 source IPs&lt;/td&gt;
&lt;td&gt;Top List&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5. Conclusion&lt;/td&gt;
&lt;td&gt;Root cause, impact, action template&lt;/td&gt;
&lt;td&gt;Markdown&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Access at: Notebooks → "FSxN Audit Log Investigation Template"&lt;/p&gt;

&lt;p&gt;Each step includes pre-configured queries — analysts just adjust the time window and user filter to match the alert context.&lt;/p&gt;




&lt;h2&gt;
  
  
  RBAC and Access Control
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;FSxN Security Analyst&lt;/code&gt; role provides scoped access to audit logs:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Permissions&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;FSxN Security Analyst&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;logs_read_data&lt;/code&gt;, &lt;code&gt;logs_read_index_data&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;SOC analysts investigating file access&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Datadog Standard Role&lt;/td&gt;
&lt;td&gt;Full access&lt;/td&gt;
&lt;td&gt;Storage admins managing pipeline&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Assign users to this role to grant log access without exposing infrastructure metrics or APM data. Combined with Saved Views, analysts see only the investigation tools relevant to their role.&lt;/p&gt;




&lt;h2&gt;
  
  
  OTel Collector Bridge
&lt;/h2&gt;

&lt;p&gt;For teams already running OpenTelemetry Collector, an alternative delivery path routes FSxN logs through OTel with trace context injection:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# otel-bridge/collector-config.yaml (excerpt)&lt;/span&gt;
&lt;span class="na"&gt;receivers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;otlp&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;protocols&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;http&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;endpoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;0.0.0.0&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;4318&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;

&lt;span class="na"&gt;processors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;attributes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;service.name&lt;/span&gt;
        &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ontap-audit&lt;/span&gt;
        &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;upsert&lt;/span&gt;
  &lt;span class="na"&gt;transform&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;log_statements&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;log&lt;/span&gt;
        &lt;span class="na"&gt;statements&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;set(attributes["ddsource"], "fsxn")&lt;/span&gt;

&lt;span class="na"&gt;exporters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;datadog&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;api&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${env:DD_API_KEY}&lt;/span&gt;
      &lt;span class="na"&gt;site&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${env:DD_SITE:-ap1.datadoghq.com}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Benefits over direct Lambda→Datadog:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;trace_id injection&lt;/strong&gt; for distributed tracing correlation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-backend fanout&lt;/strong&gt; (Datadog + Grafana + S3 in parallel)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Attribute enrichment&lt;/strong&gt; at collector level (team, environment tags)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sampling&lt;/strong&gt; for high-volume environments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Edge-side PII redaction&lt;/strong&gt; — Mask sensitive fields before they leave your AWS account (stronger than relying solely on Datadog's Sensitive Data Scanner)&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Edge redaction pattern&lt;/strong&gt;: Use OTel's &lt;code&gt;transform&lt;/code&gt; processor to hash &lt;code&gt;user&lt;/code&gt; and &lt;code&gt;client_ip&lt;/code&gt; and truncate &lt;code&gt;path&lt;/code&gt; to directory level BEFORE export to Datadog. This ensures PII never crosses your network boundary — even if Datadog's SDS configuration is misconfigured or disabled, your data is already protected at the source.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Status&lt;/strong&gt;: Config syntax verified. Integration testing requires deploying OTel Collector on ECS (see &lt;a href="https://dev.tolink-to-part-7"&gt;Part 7&lt;/a&gt; for the ECS-based Collector pattern used with Grafana and Honeycomb).&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Log Archives and Compliance
&lt;/h2&gt;

&lt;p&gt;For regulatory retention (adjust based on your organization's audit policy), a dedicated CloudFormation template deploys an S3 archive with Glacier lifecycle:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws cloudformation deploy &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--template-file&lt;/span&gt; integrations/datadog/template-log-archive.yaml &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--stack-name&lt;/span&gt; fsxn-datadog-archive &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--parameter-overrides&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;DatadogExternalId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;from-datadog-integration-page&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;RetentionDays&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;30 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;GlacierRetentionDays&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2555 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--capabilities&lt;/span&gt; CAPABILITY_NAMED_IAM
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This separates concerns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Detection&lt;/strong&gt; — Hot logs in Datadog (15-30 day index retention)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliance&lt;/strong&gt; — Cold archive in S3 → Glacier (7+ years)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Investigation&lt;/strong&gt; — Rehydrate specific time ranges on demand&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Anomaly Detection (ML-based)
&lt;/h2&gt;

&lt;p&gt;Beyond static thresholds, the anomaly monitor on &lt;code&gt;fsxn.audit.delete_count&lt;/code&gt; uses Datadog's agile algorithm to learn each user's baseline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Anomaly monitor — no manual threshold needed
&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;avg(last_4h):anomalies(
    sum:fsxn.audit.delete_count{*} by {user}.as_count(),
    &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;agile&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, 3, direction=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;above&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;
) &amp;gt;= 1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When a user who normally deletes 5 files/day suddenly deletes 200, the alert fires — even though 200 is well below the static "50 in 5 minutes" threshold. This catches slow, sustained exfiltration that threshold-based monitors miss.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Baseline period&lt;/strong&gt;: The anomaly algorithm needs ~2 weeks of data to build confidence. Deploy early — it stays silent during the learning period.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Cardinality Management
&lt;/h2&gt;

&lt;p&gt;⚠️ Log-based metrics with &lt;code&gt;group_by: user&lt;/code&gt; create one time series per unique user. For organizations with 10,000+ users:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Recommended group_by&lt;/th&gt;
&lt;th&gt;Reason&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;fsxn.audit.event_count&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;svm only&lt;/td&gt;
&lt;td&gt;Broad metric, low cardinality&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;fsxn.audit.delete_count&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;user, svm&lt;/td&gt;
&lt;td&gt;Targeted, high signal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;fsxn.audit.access_failure_count&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;user, svm, client_ip&lt;/td&gt;
&lt;td&gt;Investigation-focused&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;fsxn.audit.unique_users&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;user&lt;/td&gt;
&lt;td&gt;Intentionally per-user&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Monitor cardinality in Metrics Summary (&lt;code&gt;fsxn.audit.*&lt;/code&gt;). Datadog bills per unique tag-value combination — keep per-user grouping only on metrics where individual user behavior matters.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Different from Splunk/CrowdStrike?
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Datadog&lt;/th&gt;
&lt;th&gt;Splunk (Part 8)&lt;/th&gt;
&lt;th&gt;CrowdStrike (Part 15)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Delivery protocol&lt;/td&gt;
&lt;td&gt;Logs API v2 (JSON)&lt;/td&gt;
&lt;td&gt;HEC (JSON)&lt;/td&gt;
&lt;td&gt;HEC (JSON)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pipeline config&lt;/td&gt;
&lt;td&gt;API-driven (Python)&lt;/td&gt;
&lt;td&gt;props.conf / UI&lt;/td&gt;
&lt;td&gt;LogScale parser YAML&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monitor creation&lt;/td&gt;
&lt;td&gt;API (&lt;code&gt;POST /api/v1/monitor&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Saved searches + alerts&lt;/td&gt;
&lt;td&gt;CQL + alert actions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EventID mapping&lt;/td&gt;
&lt;td&gt;Category Processor&lt;/td&gt;
&lt;td&gt;eval/lookup&lt;/td&gt;
&lt;td&gt;LogScale parser&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Identity correlation&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;usr.id&lt;/code&gt; attribute&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;user&lt;/code&gt; field&lt;/td&gt;
&lt;td&gt;Falcon Identity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Free tier&lt;/td&gt;
&lt;td&gt;❌ (paid only for logs)&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API management&lt;/td&gt;
&lt;td&gt;✅ Full (Pipeline + Monitor + Dashboard)&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The key Datadog advantage: &lt;strong&gt;everything is API-manageable&lt;/strong&gt;. Pipeline, monitors, dashboards, and saved views can all be created, updated, and version-controlled through the Datadog API — making infrastructure-as-code for observability fully achievable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Production Recommendations
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use the API, not the UI&lt;/strong&gt; — Version-control your Pipeline, Monitor, and Dashboard definitions in a setup script. When you need to adjust thresholds or add EventIDs, it's a code change with review.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Separate APP key from API key&lt;/strong&gt; — The API key ships logs; the APP key manages configuration. Store both in Secrets Manager with different IAM policies.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Exclude service accounts&lt;/strong&gt; — Add &lt;code&gt;@user:-svc-*&lt;/code&gt; to monitor queries or create Datadog suppression rules for known batch accounts.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start with Warning, promote to Critical&lt;/strong&gt; — Deploy monitors with Warning thresholds first, observe for 1-2 weeks, then tighten to Critical after confirming the baseline.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Connect to incident workflow&lt;/strong&gt; — Route Critical monitors to PagerDuty/Slack and include investigation links (Saved View URLs) in the alert message.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enable Log Archives for compliance&lt;/strong&gt; — Configure S3 archive with Glacier lifecycle for FISC/SOC2 retention requirements. Rehydrate on demand for historical investigation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use Workflow Automation for response&lt;/strong&gt; — Configure Datadog Workflows to automatically create Jira tickets, trigger Slack quick-actions, or invoke Lambda remediation when critical monitors fire.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Manage with Terraform for production&lt;/strong&gt; — For enterprise/multi-org deployments, use the &lt;a href="https://registry.terraform.io/providers/DataDog/datadog/latest" rel="noopener noreferrer"&gt;Datadog Terraform Provider&lt;/a&gt; to version-control Pipelines, Monitors, and Detection Rules. Separate API key (&lt;code&gt;logs_write&lt;/code&gt; only) from APP key (admin-level, CI/CD pipeline only).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Apply IAM Permissions Boundary&lt;/strong&gt; — In large organizations, apply a Permissions Boundary to the log-shipping Lambda role to prevent privilege escalation. The boundary should allow only &lt;code&gt;s3:GetObject&lt;/code&gt;, &lt;code&gt;s3:ListBucket&lt;/code&gt;, &lt;code&gt;secretsmanager:GetSecretValue&lt;/code&gt;, and &lt;code&gt;logs:*&lt;/code&gt; (CloudWatch).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Encrypt with customer-managed keys (CMK)&lt;/strong&gt; — For regulated environments, enable &lt;a href="https://docs.datadoghq.com/account_management/org_settings/#log-management" rel="noopener noreferrer"&gt;Datadog Log Management Encryption&lt;/a&gt; with your KMS CMK, and configure the S3 archive bucket with SSE-KMS. This ensures audit logs are encrypted with keys you control at every stage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Tune detection rules (anti-alert-fatigue)&lt;/strong&gt; — Run all new Cloud SIEM rules in "Warning / dev" status for 2 weeks. Whitelist known service accounts with &lt;code&gt;@usr.id:-svc-*&lt;/code&gt; in queries. Review false positives weekly and adjust thresholds or add suppression rules before promoting to Critical.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data Integrity: Handle HTTP 429/5xx&lt;/strong&gt; — Datadog Logs API v2 returns HTTP 202 (accepted, not indexed). On HTTP 429 (rate limit) or 5xx (transient), implement exponential backoff (base 1s, max 5 retries). After final failure, route the complete HEC payload to SQS DLQ for replay. Never drop logs silently.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Deployment
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Deploy the CloudFormation stack (Lambda + Scheduler + DLQ + Alarms)&lt;/span&gt;
aws cloudformation deploy &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--template-file&lt;/span&gt; integrations/datadog/template.yaml &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--stack-name&lt;/span&gt; fsxn-datadog-integration &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--parameter-overrides&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;FsxS3AccessPointArn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;arn:aws:s3:&amp;lt;region&amp;gt;:&amp;lt;account&amp;gt;:accesspoint/&amp;lt;name&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;DatadogApiKeySecretArn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;secret-arn&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;DatadogSite&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ap1.datadoghq.com &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--capabilities&lt;/span&gt; CAPABILITY_NAMED_IAM

&lt;span class="c"&gt;# 2. Deploy full observability (Pipeline + Monitors + Metrics + Scanner)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;DD_API_KEY_SECRET_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"fsxn-datadog-api-key"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;DD_APP_KEY_SECRET_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"datadog/fsxn-app-key"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;DD_SITE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"ap1.datadoghq.com"&lt;/span&gt;
bash integrations/datadog/scripts/setup-full-observability.sh

&lt;span class="c"&gt;# 3. (Optional) Deploy Log Archive for compliance&lt;/span&gt;
aws cloudformation deploy &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--template-file&lt;/span&gt; integrations/datadog/template-log-archive.yaml &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--stack-name&lt;/span&gt; fsxn-datadog-archive &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--parameter-overrides&lt;/span&gt; &lt;span class="nv"&gt;DatadogExternalId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;external-id&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--capabilities&lt;/span&gt; CAPABILITY_NAMED_IAM

&lt;span class="c"&gt;# 4. Create Facets (guided manual step)&lt;/span&gt;
bash integrations/datadog/scripts/setup-facets.sh

&lt;span class="c"&gt;# 5. Enable Cloud SIEM (UI: Security → Cloud SIEM → Get Started → select fsxn)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Time from zero to full detection capability: &lt;strong&gt;~45 minutes&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pre-Deployment Checklist
&lt;/h3&gt;

&lt;p&gt;Before deploying to production, confirm:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] FSx audit configuration confirmed (XML format, rotation schedule)&lt;/li&gt;
&lt;li&gt;[ ] S3 Access Point created and Lambda role authorized&lt;/li&gt;
&lt;li&gt;[ ] Datadog API Key + APP Key stored in Secrets Manager&lt;/li&gt;
&lt;li&gt;[ ] Datadog site region confirmed (AP1/US1/EU1)&lt;/li&gt;
&lt;li&gt;[ ] Data classification sign-off: audit logs contain user PII (usernames, IPs, file paths) — confirm external transmission is approved per organization policy&lt;/li&gt;
&lt;li&gt;[ ] Retention requirements defined (Datadog index retention + S3 archive lifecycle)&lt;/li&gt;
&lt;li&gt;[ ] On-call/escalation path defined for Critical signals&lt;/li&gt;
&lt;li&gt;[ ] Service accounts identified for monitor exclusion (&lt;code&gt;svc-*&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;[ ] Cloud SIEM trial or paid plan confirmed (for Security Signals)&lt;/li&gt;
&lt;li&gt;[ ] Network path validated (VPC-external Lambda or NAT Gateway for S3 AP access)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. API-first beats UI-first
&lt;/h3&gt;

&lt;p&gt;Creating the Pipeline via API took 30 seconds. Doing it through the UI takes 5 minutes of clicking through dropdowns. More importantly, the API approach is repeatable, reviewable, and version-controlled.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Facets are UX, not functionality
&lt;/h3&gt;

&lt;p&gt;Every field is searchable with &lt;code&gt;@field:value&lt;/code&gt; whether or not you create a Facet. Facets just add sidebar convenience. Don't block on Facet setup — your monitors work without them.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Category Processor is the key transformation
&lt;/h3&gt;

&lt;p&gt;Without it, analysts need to memorize Windows EventID codes. With it, &lt;code&gt;@operation_name:Object Delete&lt;/code&gt; is immediately understandable. This single processor provides more investigative value than any other pipeline step.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Evaluation delay prevents false positives
&lt;/h3&gt;

&lt;p&gt;The 60-second &lt;code&gt;evaluation_delay&lt;/code&gt; on monitors accounts for ingestion latency. Without it, you get spurious alerts when log batches arrive slightly out of order.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Saved Views are investigation shortcuts
&lt;/h3&gt;

&lt;p&gt;When a monitor fires at 3 AM, the on-call engineer needs to investigate immediately. Pre-built Saved Views with the right columns and filters eliminate setup time during incidents.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Anomaly detection needs baseline data
&lt;/h3&gt;

&lt;p&gt;The anomaly monitor requires at least 2 weeks of historical data to build a reliable baseline. Deploy it early — it won't generate false positives during the learning period (it simply stays silent until confident).&lt;/p&gt;

&lt;h3&gt;
  
  
  7. Cloud SIEM and Log Monitors are complementary
&lt;/h3&gt;

&lt;p&gt;Log Monitors give you fast Slack/PagerDuty notifications for the on-call team. Cloud SIEM Detection Rules give SOC analysts a structured triage workflow with MITRE mapping. Deploy both — they serve different audiences.&lt;/p&gt;

&lt;h3&gt;
  
  
  8. GeoIP is free threat intelligence
&lt;/h3&gt;

&lt;p&gt;The GeoIP Processor is zero-cost and instantly useful. Even in a pure on-premises environment, unusual geography flags VPN misconfiguration, compromised credentials, or travel you weren't expecting.&lt;/p&gt;

&lt;h3&gt;
  
  
  9. Forwarder deployment enables an ecosystem
&lt;/h3&gt;

&lt;p&gt;Once the Datadog Forwarder Lambda is deployed, enabling CloudTrail, GuardDuty, Lambda logs, or any other AWS log source is a single S3 notification configuration — no new infrastructure needed.&lt;/p&gt;




&lt;h2&gt;
  
  
  CloudTrail Correlation (Cross-Service Detection)
&lt;/h2&gt;

&lt;p&gt;A dedicated detection rule correlates FSxN audit events with AWS CloudTrail context. When the signal fires, the investigation message includes pre-built queries for cross-service correlation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;source:cloudtrail @userIdentity.arn:*&amp;lt;username&amp;gt;*     → IAM actions
source:guardduty                                      → GuardDuty findings
source:vpc @network.client.ip:&amp;lt;suspicious-ip&amp;gt;         → Network flows
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The rule (&lt;code&gt;FSxN: Suspicious Deletion After IAM Role Assumption&lt;/code&gt;) fires on &amp;gt;30 deletions in 10 minutes and embeds a correlation checklist:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Was there a recent &lt;code&gt;AssumeRole&lt;/code&gt; or &lt;code&gt;ConsoleLogin&lt;/code&gt; from an unusual IP?&lt;/li&gt;
&lt;li&gt;Did the user's permissions change in the last 24 hours?&lt;/li&gt;
&lt;li&gt;Is the client IP in the corporate VPN range?&lt;/li&gt;
&lt;li&gt;Are there GuardDuty findings for this account?&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: This rule activates fully once CloudTrail logs are flowing to Datadog via the AWS Integration. The FSxN detection works immediately; CloudTrail queries return results after integration setup.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  SOC Triage Runbook
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5s1i83yaifkdnm8qiq1k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5s1i83yaifkdnm8qiq1k.png" alt="SOC Triage Runbook" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A 7-step Notebook guides analysts from signal to resolution:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;th&gt;Output&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1. Signal Context&lt;/td&gt;
&lt;td&gt;Verify user, IP, time, volume&lt;/td&gt;
&lt;td&gt;Timeseries widget&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2. User Verification&lt;/td&gt;
&lt;td&gt;Check AD status, role changes, maintenance windows&lt;/td&gt;
&lt;td&gt;Manual checklist&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3. Impact Assessment&lt;/td&gt;
&lt;td&gt;Identify affected files and directories&lt;/td&gt;
&lt;td&gt;Top List (paths)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4. Cross-Service Correlation&lt;/td&gt;
&lt;td&gt;CloudTrail, GuardDuty, VPC Flow Logs&lt;/td&gt;
&lt;td&gt;Query templates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5. Decision Matrix&lt;/td&gt;
&lt;td&gt;Map conditions to response actions&lt;/td&gt;
&lt;td&gt;Decision table&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6. Response Actions&lt;/td&gt;
&lt;td&gt;Disable account, snapshot, notify, ticket&lt;/td&gt;
&lt;td&gt;Checklist&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7. Post-Incident&lt;/td&gt;
&lt;td&gt;Update thresholds, document, lessons learned&lt;/td&gt;
&lt;td&gt;Checklist&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Accessible at: &lt;strong&gt;Notebooks → "FSxN Security Signal Triage Runbook"&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Case Management
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzgfuf56nzt02apc3rrx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzgfuf56nzt02apc3rrx.png" alt="Case Management — FSXN Project" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Cases provide structured investigation tracking across the SOC team:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Project&lt;/strong&gt;: &lt;code&gt;FSXN&lt;/code&gt; — All FSx for ONTAP security investigations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Case creation&lt;/strong&gt;: Manual from Security Signals, or auto-create via Workflow&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Priority levels&lt;/strong&gt;: P1 (critical mass deletion) through P4 (informational)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lifecycle&lt;/strong&gt;: Open → In Progress → Resolved → Closed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When a critical signal fires, the workflow creates a case automatically with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Signal link and context&lt;/li&gt;
&lt;li&gt;Affected user and IP&lt;/li&gt;
&lt;li&gt;Investigation notebook link&lt;/li&gt;
&lt;li&gt;Response checklist&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Threat Intelligence (GeoIP Enrichment)
&lt;/h2&gt;

&lt;p&gt;The Log Pipeline's GeoIP Processor automatically enriches every FSxN log with geographic data from the &lt;code&gt;client_ip&lt;/code&gt; field:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@client_ip: 10.0.5.99 → @network.client.geoip: { country: "JP", city: "Tokyo", ... }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A detection rule flags access from unexpected countries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Rule: "FSxN: File Access from Unusual Geography"
# Query: source:fsxn -@network.client.geoip.country:JP -@network.client.geoip.country:US
# Threshold: &amp;gt;5 events/15min from non-JP/US IPs
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This catches compromised credentials being used from abroad without needing an external threat intelligence feed — Datadog's built-in GeoIP database handles the enrichment at ingest time.&lt;/p&gt;




&lt;h2&gt;
  
  
  GuardDuty Correlation
&lt;/h2&gt;

&lt;p&gt;A detection rule correlates FSxN deletion events with GuardDuty findings from the same source IP:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FSxN: &amp;gt;10 deletions/30min from IP X
  + GuardDuty finding for IP X (UnauthorizedAccess, Recon, Trojan)
  = Critical Security Signal with full context
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The rule's investigation message includes pre-built GuardDuty queries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;source:guardduty @detail.resource.instanceDetails.networkInterfaces.privateIpAddress:{{@client_ip}}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GuardDuty findings flow to Datadog via the same Forwarder Lambda that handles CloudTrail — no additional setup needed.&lt;/p&gt;




&lt;h2&gt;
  
  
  Automated Snapshot Remediation
&lt;/h2&gt;

&lt;p&gt;When mass deletion is confirmed (Critical signal reviewed by analyst), the Workflow invokes &lt;code&gt;fsxn-snapshot-remediation&lt;/code&gt; Lambda:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Security Signal (Critical) → Analyst confirms → Workflow → Lambda → ONTAP REST API → Snapshot
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Lambda:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Receives volume name, SVM, and reason from the Workflow&lt;/li&gt;
&lt;li&gt;Authenticates to ONTAP via Secrets Manager credentials&lt;/li&gt;
&lt;li&gt;Creates a timestamped snapshot: &lt;code&gt;remediation_20260614_215530_mass_deletion&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Returns snapshot name and status for the Case record
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Lambda invocation payload (from Datadog Workflow)
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;volume_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;finance_share&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;svm_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ProductionSVM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Mass deletion detected&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CORP&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;suspicious-user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Configuration&lt;/strong&gt;: Set &lt;code&gt;ONTAP_MGMT_IP&lt;/code&gt; and &lt;code&gt;ONTAP_CREDENTIALS_SECRET_ARN&lt;/code&gt; environment variables on the Lambda to point to your FSx for ONTAP management endpoint.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TLS note&lt;/strong&gt;: The Lambda uses &lt;code&gt;cert_reqs="CERT_NONE"&lt;/code&gt; for ONTAP REST API calls because FSx for ONTAP uses self-signed certificates by default. In production, upload the ONTAP CA certificate to the Lambda layer (&lt;code&gt;/opt/certs/ontap-ca.pem&lt;/code&gt;) and configure &lt;code&gt;urllib3.PoolManager(ca_certs="/opt/certs/ontap-ca.pem")&lt;/code&gt; to validate the connection.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Snapshot Storm Prevention (Cooldown)
&lt;/h3&gt;

&lt;p&gt;To prevent runaway Snapshot creation during sustained mass-deletion events:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Before creating snapshot, check for recent remediation snapshots
&lt;/span&gt;&lt;span class="n"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GET&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;mgmt_ip&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/api/storage/volumes/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;vol_uuid&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/snapshots&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;?name=remediation_*&amp;amp;order_by=create_time desc&amp;amp;max_records=1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;base_headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;snaps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;records&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;snaps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;last_snap_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;snaps&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;create_time&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# Skip if a remediation snapshot was created in the last 15 minutes
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;is_within_cooldown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;last_snap_time&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;minutes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Skipped — cooldown active&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;ONTAP limits&lt;/strong&gt;: Each volume supports up to 1023 Snapshots. The cooldown prevents hitting this limit during sustained attack scenarios.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Remediation Audit Trail
&lt;/h3&gt;

&lt;p&gt;The Lambda invocation itself must be auditable — proving that the Snapshot was created by an authorized pipeline, not a rogue actor:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CloudTrail&lt;/strong&gt;: Lambda invocation recorded with &lt;code&gt;InvokedBy: workflow.datadoghq.com&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ONTAP audit log&lt;/strong&gt;: Snapshot creation appears as an administrative event with the API user&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Datadog Case&lt;/strong&gt;: Snapshot name and status recorded in the Case timeline&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lambda CloudWatch Logs&lt;/strong&gt;: Full request/response logged with correlation ID&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;This article covered the complete detection-to-response lifecycle. Future enhancements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Datadog Threat Intel feeds&lt;/strong&gt; — When Datadog makes the Threat Intel Indicators API available on AP1, feed internal IP reputation lists for richer enrichment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-SVM snapshot orchestration&lt;/strong&gt; — Extend the remediation Lambda to snapshot across multiple SVMs in parallel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automated account lockout&lt;/strong&gt; — Chain the Workflow to invoke AD lockout via Systems Manager Run Command&lt;/li&gt;
&lt;/ul&gt;




&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations" rel="noopener noreferrer"&gt;GitHub Repository&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>datadog</category>
      <category>security</category>
      <category>amazonfsxfornetappontap</category>
    </item>
    <item>
      <title>Shipping FSx for ONTAP Audit Logs to CrowdStrike Falcon LogScale via HEC — Parser v1.1.0</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Sun, 14 Jun 2026 16:28:49 +0000</pubDate>
      <link>https://dev.to/aws-builders/shipping-fsx-for-ontap-audit-logs-to-crowdstrike-falcon-logscale-via-hec-parser-v110-239g</link>
      <guid>https://dev.to/aws-builders/shipping-fsx-for-ontap-audit-logs-to-crowdstrike-falcon-logscale-via-hec-parser-v110-239g</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Scope note&lt;/strong&gt;: This article targets CrowdStrike Falcon LogScale HEC ingestion via Amazon FSx for ONTAP S3 Access Points. For on-premises ONTAP or Cloud Volumes ONTAP, the audit log retrieval path differs (e.g., NFS/SMB export or FPolicy), while the parser and HEC delivery model can still be reused. Falcon Next-Gen SIEM connector-based ingestion (via the Falcon console Data Connectors UI) may require a different setup and should be validated separately.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Which path applies to you?&lt;/strong&gt; If your tenant exposes LogScale repositories and ingest tokens, use the HEC path in this article. If your tenant uses Falcon Next-Gen SIEM Data Connectors, implement this as a connector/parser workflow and validate separately.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;TL;DR&lt;/li&gt;
&lt;li&gt;Why CrowdStrike LogScale?&lt;/li&gt;
&lt;li&gt;Architecture&lt;/li&gt;
&lt;li&gt;The HEC Protocol: One Format, Multiple Destinations&lt;/li&gt;
&lt;li&gt;Parser v1.1.0: Designed for Speed and Maintainability&lt;/li&gt;
&lt;li&gt;FIELD_MAPPING: Zero-Code Adaptation to ONTAP Changes&lt;/li&gt;
&lt;li&gt;S3 Access Point Permission Model&lt;/li&gt;
&lt;li&gt;Compatibility Verification (Splunk HEC)&lt;/li&gt;
&lt;li&gt;LogScale CQL Query Examples&lt;/li&gt;
&lt;li&gt;Performance Benchmarks&lt;/li&gt;
&lt;li&gt;Deployment&lt;/li&gt;
&lt;li&gt;Cost&lt;/li&gt;
&lt;li&gt;Data Classification and Privacy&lt;/li&gt;
&lt;li&gt;Token and Egress Governance&lt;/li&gt;
&lt;li&gt;Production Readiness Checklist&lt;/li&gt;
&lt;li&gt;HEC Path vs OTel / Alloy Path&lt;/li&gt;
&lt;li&gt;Pipeline Observability&lt;/li&gt;
&lt;li&gt;Production Checkpoint Design&lt;/li&gt;
&lt;li&gt;Splunk HEC Compatibility Notes&lt;/li&gt;
&lt;li&gt;Lessons Learned&lt;/li&gt;
&lt;li&gt;What's Next&lt;/li&gt;
&lt;li&gt;Partner Discovery Checklist&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Validated / Not Yet Validated
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Area&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Parser unit tests&lt;/td&gt;
&lt;td&gt;✅ Verified (108 tests)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Splunk HEC compatibility&lt;/td&gt;
&lt;td&gt;✅ Verified&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LogScale HEC live ingest&lt;/td&gt;
&lt;td&gt;⏳ Pending required LogScale entitlement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LogScale CQL queries&lt;/td&gt;
&lt;td&gt;📝 Syntax draft, pending live validation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CrowdStrike UI screenshots&lt;/td&gt;
&lt;td&gt;⏳ Pending required LogScale entitlement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment parameterization&lt;/td&gt;
&lt;td&gt;✅ Verified — endpoint, HEC path, source, sourcetype, index, memory, timeout are configurable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Live LogScale validation plan&lt;/strong&gt; (when tenant is available):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ingest token permission and authentication&lt;/li&gt;
&lt;li&gt;HEC ingest success and HTTP response&lt;/li&gt;
&lt;li&gt;Repository assignment and data visibility&lt;/li&gt;
&lt;li&gt;Parser assignment and field extraction&lt;/li&gt;
&lt;li&gt;Timestamp correctness (@timestamp = event time)&lt;/li&gt;
&lt;li&gt;CQL query validation against live data&lt;/li&gt;
&lt;li&gt;Alert trigger and notification&lt;/li&gt;
&lt;li&gt;Dashboard screenshot capture&lt;/li&gt;
&lt;li&gt;Fusion SOAR workflow trigger test&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;For decision makers&lt;/strong&gt;: CrowdStrike Falcon LogScale receives FSx for ONTAP audit logs via the same HEC (HTTP Event Collector) protocol as Splunk. This means reduced switching cost — you can move between LogScale, Splunk, or any HEC-compatible SIEM without changing your Lambda code. Note: while the HEC wire format is shared, repository/index semantics, parser configuration, query language (CQL vs SPL), alerting rules, and retention policies differ between platforms and require per-platform configuration. Parser v1.1.0 processes 178,000 events/second in parser-only benchmarks. AWS cost: validation estimate ~$1/month under low-volume assumptions.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;CrowdStrike pricing note&lt;/strong&gt;: CrowdStrike licensing and third-party ingestion entitlement vary by contract and product edition. Confirm availability, daily ingest quota, retention, and pricing with your CrowdStrike account team before production planning.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;For NetApp teams&lt;/strong&gt;: This pattern keeps audit data on ONTAP and adds a serverless security analytics path through S3 Access Points, without changing existing SMB/NFS client access or requiring data copies to S3.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For engineers&lt;/strong&gt;: &lt;code&gt;template.yaml&lt;/code&gt; deploys the full stack (Lambda + Scheduler + DLQ + Alarm) in one &lt;code&gt;aws cloudformation deploy&lt;/code&gt; command. The parser uses a &lt;code&gt;FIELD_MAPPING&lt;/code&gt; table — new ONTAP field names require zero code changes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FSx for ONTAP → S3 Access Point → EventBridge Scheduler → Lambda
    → CrowdStrike Falcon LogScale (/api/v1/ingest/hec)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is Part 15 of &lt;a href="https://dev.to/aws-builders/why-your-fsx-for-ontap-audit-logs-deserve-better-than-ec2-kod"&gt;Serverless Observability for FSx for ONTAP&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why CrowdStrike LogScale?
&lt;/h2&gt;

&lt;p&gt;Three reasons this combination makes sense:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Unified XDR + File Audit&lt;/strong&gt;: Falcon EDR endpoint telemetry and FSx file access logs in the same platform. A contractor accessing sensitive files on the NAS? LogScale correlates that with their endpoint behavior.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Index-free architecture&lt;/strong&gt;: LogScale's index-free architecture reduces index management overhead and is designed for high-scale search workloads. Validate search performance against your own retention, query patterns, and ingest volume.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;HEC compatibility&lt;/strong&gt;: LogScale accepts a Splunk HEC-compatible JSON envelope at &lt;code&gt;/api/v1/ingest/hec&lt;/code&gt;. If you're already using Splunk HEC somewhere, the HEC-style JSON envelope is largely reusable, while parser, repository, and query behavior should still be validated per destination.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────────┐
│ FSx for ONTAP                                                   │
│   vserver audit create -format xml                              │
│   → Audit log files written to audit volume                     │
└──────────────┬──────────────────────────────────────────────────┘
               │ S3 Access Point (read-only)
               ▼
┌─────────────────────────────────────────────────────────────────┐
│ EventBridge Scheduler (every 1–5 minutes)                       │
│   → Invokes Lambda                                              │
└──────────────┬──────────────────────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────────────────────┐
│ Lambda (Python 3.12, ARM64, 256MB)                              │
│   1. Read new XML files from S3 AP                              │
│      - PoC: checkpoint in SSM Parameter Store                   │
│      - Production: DynamoDB conditional checkpoint              │
│   2. Parse XML → normalize → HEC format                         │
│   3. POST to LogScale /api/v1/ingest/hec                        │
│      Authorization: Bearer &amp;lt;ingest-token&amp;gt;                       │
└──────────────┬──────────────────────────────────────────────────┘
               │ HTTPS (gzip optional)
               ▼
┌─────────────────────────────────────────────────────────────────┐
│ CrowdStrike Falcon LogScale                                     │
│   Repository: fsxn_audit                                        │
│   → Search, dashboards, alerts, correlation with EDR data       │
└─────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Real-time path (FPolicy)
&lt;/h3&gt;

&lt;p&gt;For sub-second latency (e.g., ransomware detection), use the FPolicy path:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ONTAP → FPolicy TCP:9898 → ECS Fargate → SQS → Lambda → LogScale
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;FPolicy sends file operation notifications in real-time — no polling required.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Ransomware response&lt;/strong&gt;: Treat audit-log polling as an investigation and evidence pipeline. For prevention or near-real-time response, evaluate ONTAP FPolicy external mode and Autonomous Ransomware Protection (ARP) alongside this LogScale integration.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  ONTAP Audit Log Lifecycle
&lt;/h3&gt;

&lt;p&gt;Validate ONTAP audit log rotation behavior before production deployment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How frequently audit logs are rotated (&lt;code&gt;rotate-size&lt;/code&gt; / &lt;code&gt;rotate-schedule-minute&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Whether Lambda reads only closed (rotated) files or also active files&lt;/li&gt;
&lt;li&gt;Whether existing audit tools require EVTX and whether parallel XML output is feasible&lt;/li&gt;
&lt;li&gt;How S3 Access Point &lt;code&gt;LastModified&lt;/code&gt; maps to ONTAP file close time&lt;/li&gt;
&lt;li&gt;Expected audit volume size per rotation and FSx throughput impact&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Format choice&lt;/strong&gt;: This pipeline uses XML (&lt;code&gt;-format xml&lt;/code&gt;) because it can be parsed with Python standard library in Lambda without additional dependencies. EVTX is the FSx for ONTAP default and works well with Windows Event Viewer, but requires EVTX-specific parsing libraries in the Lambda package or Layer.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Existing audit tools&lt;/strong&gt;: If you already use a batch-based audit log tool that expects EVTX, verify format compatibility before changing ONTAP audit settings. This serverless pipeline is designed for detection and investigation; it is not a drop-in replacement for tools that provide audit summarization, compression, retention workflows, or compliance report templates. Consider a complementary deployment where the existing tool handles audit reporting and this pipeline handles SOC detection and Falcon correlation.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The HEC Protocol: One Format, Multiple Destinations
&lt;/h2&gt;

&lt;p&gt;CrowdStrike LogScale's HEC endpoint (&lt;code&gt;/api/v1/ingest/hec&lt;/code&gt;) accepts a Splunk HEC-compatible JSON envelope. The only differences:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;CrowdStrike LogScale&lt;/th&gt;
&lt;th&gt;Splunk&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Endpoint&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/api/v1/ingest/hec&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/services/collector/event&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auth header&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Authorization: Bearer &amp;lt;token&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Authorization: Splunk &amp;lt;token&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Payload body&lt;/td&gt;
&lt;td&gt;Same HEC-style JSON envelope; validate parser/repository behavior per platform&lt;/td&gt;
&lt;td&gt;Same HEC-style JSON envelope&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The same Lambda handler can be reused for both by changing the URL, auth header, and destination-specific configuration&lt;/li&gt;
&lt;li&gt;You can switch between LogScale and Splunk without modifying event formatting&lt;/li&gt;
&lt;li&gt;Testing against a local Splunk Docker validates the HEC payload shape and provides a strong compatibility signal for LogScale; live LogScale ingest still requires tenant validation&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  HEC Event Format
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"time"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1780710900.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"event"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-06-06T01:55:00.000000Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"event_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"4663"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"fsxn-ontap"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"svm"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ProductionSVM"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CORP&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s2"&gt;user-finance-01"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"client_ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"10.0.1.50"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"operation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"File"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/share/finance/quarterly-reports/Q2-2026-revenue.xlsx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"result"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Audit Success"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"s3_key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"audit/2026-06-06/events.xml"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"log_format"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"xml"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"fsxn-ontap"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sourcetype"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"fsxn:audit:xml"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"index"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"fsxn_audit"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;time&lt;/code&gt; field&lt;/strong&gt;: Per LogScale HEC docs: "time: Time in seconds since January 1, 1970 in UTC. Is translated to @timestamp on ingestion." Always include this top-level field to ensure correct event timestamp assignment in LogScale. If omitted, LogScale uses ingest time instead of event time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-tenant note&lt;/strong&gt;: For MSSP or multi-account deployments, add bounded tenant metadata such as &lt;code&gt;aws_account_id&lt;/code&gt;, &lt;code&gt;environment&lt;/code&gt;, and &lt;code&gt;customer_alias&lt;/code&gt; inside the &lt;code&gt;event&lt;/code&gt; object. Avoid putting high-cardinality values into metric dimensions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Identity enrichment&lt;/strong&gt;: For stronger Falcon Identity / ITDR correlation, consider splitting &lt;code&gt;user&lt;/code&gt; into &lt;code&gt;user_domain&lt;/code&gt; and &lt;code&gt;user_name&lt;/code&gt; fields, and enriching with SID / UPN / email where possible. IP-only correlation (&lt;code&gt;client_ip&lt;/code&gt;) can be unreliable in DHCP, VPN, NAT, and VDI environments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Portability note&lt;/strong&gt;: For maximum portability across HEC-compatible destinations, this example keeps searchable attributes inside the &lt;code&gt;event&lt;/code&gt; JSON object rather than relying on Splunk-specific &lt;code&gt;fields&lt;/code&gt; behavior. LogScale auto-parses JSON &lt;code&gt;event&lt;/code&gt; objects into searchable fields. Validate parser and field extraction behavior in your LogScale tenant before depending on any destination-specific metadata fields.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Parser v1.1.0: Designed for Speed and Maintainability
&lt;/h2&gt;

&lt;p&gt;The shared parser (&lt;code&gt;fsxn_log_parser&lt;/code&gt;) handles EVTX, XML, and JSON formats. Key design decisions in v1.1.0:&lt;/p&gt;

&lt;h3&gt;
  
  
  Universal Entry Point
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fsxn_log_parser&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;parse&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;audit-2026-06-06.xml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;my_callback&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# result.events  → list of AuditEvent
# result.format  → "xml"
# result.parse_duration_ms → 2.8
# result.event_count → 500
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One function handles format detection, parsing, normalization, and metrics — all automatically.&lt;/p&gt;

&lt;h3&gt;
  
  
  Format Detection (Strategy Pattern)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fsxn_log_parser&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;detect_format&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;register_format&lt;/span&gt;

&lt;span class="c1"&gt;# Built-in: evtx (magic bytes), xml (&amp;lt; prefix), json ([/{ prefix)
&lt;/span&gt;&lt;span class="n"&gt;fmt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;detect_format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file.xml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# → "xml"
&lt;/span&gt;
&lt;span class="c1"&gt;# Extensible: register custom formats without modifying core code
&lt;/span&gt;&lt;span class="nf"&gt;register_format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;custom&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;endswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.custom&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Detection inspects only the first 8–64 bytes — O(1) regardless of file size.&lt;/p&gt;

&lt;h3&gt;
  
  
  Performance Optimizations
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Technique&lt;/th&gt;
&lt;th&gt;Impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;rpartition("}")&lt;/code&gt; for namespace stripping&lt;/td&gt;
&lt;td&gt;~15% faster than &lt;code&gt;split("}")[-1]&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;child.get("Name")&lt;/code&gt; direct access&lt;/td&gt;
&lt;td&gt;Avoids &lt;code&gt;.attrib&lt;/code&gt; dict creation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;BytesIO&lt;/code&gt; for streaming iterparse&lt;/td&gt;
&lt;td&gt;Better buffer management&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Local &lt;code&gt;get = event.get&lt;/code&gt; binding&lt;/td&gt;
&lt;td&gt;Reduces attribute lookups&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Char-count heuristic for streaming threshold&lt;/td&gt;
&lt;td&gt;Avoids O(n) &lt;code&gt;.encode()&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;struct.unpack_from&lt;/code&gt; on buffer&lt;/td&gt;
&lt;td&gt;Zero-copy EVTX parsing&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  XML Parsing and XXE Hardening
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;defusedxml.ElementTree&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;SafeET&lt;/span&gt;
    &lt;span class="n"&gt;_parse_xml_string&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SafeET&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fromstring&lt;/span&gt;
&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;ImportError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Fallback: stdlib ET.fromstring (no XXE protection —
&lt;/span&gt;    &lt;span class="c1"&gt;# for production with untrusted XML, add defusedxml to your Lambda Layer)
&lt;/span&gt;    &lt;span class="n"&gt;_parse_xml_string&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ET&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fromstring&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  FIELD_MAPPING: Zero-Code Adaptation to ONTAP Changes
&lt;/h2&gt;

&lt;p&gt;The biggest maintainability win in v1.1.0:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;FIELD_MAPPING&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TimeCreated_SystemTime&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;event_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;EventID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;event_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;svm&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;       &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Computer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SVMName&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;svm&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SubjectUserName&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;UserName&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;client_ip&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;IpAddress&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ClientIP&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;client_ip&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;operation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ObjectType&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Operation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;operation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ObjectName&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Keywords&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When ONTAP introduces a new field name (e.g., in a version upgrade), you update this table — no code changes required:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Support new ONTAP field name
&lt;/span&gt;&lt;span class="n"&gt;FIELD_MAPPING&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NewOntapUserField&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;normalize_event&lt;/code&gt; function resolves fields by iterating candidates left-to-right, returning the first non-empty value.&lt;/p&gt;




&lt;h2&gt;
  
  
  S3 Access Point Permission Model
&lt;/h2&gt;

&lt;p&gt;FSx for ONTAP S3 Access Points use a dual-layer authorization model. Per AWS documentation, both the access point policy AND the underlying file system identity permissions must permit the request.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 1: IAM Policy (Lambda execution role)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Lambda execution role must have &lt;code&gt;s3:GetObject&lt;/code&gt; and &lt;code&gt;s3:ListBucket&lt;/code&gt; permissions on the access point ARN:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;arn:aws:s3:&amp;lt;region&amp;gt;:&amp;lt;account&amp;gt;:accesspoint/&amp;lt;name&amp;gt;/object/*&lt;/span&gt;   &lt;span class="c1"&gt;# GetObject&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;arn:aws:s3:&amp;lt;region&amp;gt;:&amp;lt;account&amp;gt;:accesspoint/&amp;lt;name&amp;gt;&lt;/span&gt;            &lt;span class="c1"&gt;# ListBucket&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Layer 2: S3 Access Point Resource Policy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The access point itself must have a resource policy granting access to the Lambda execution role. Use &lt;code&gt;s3control put-access-point-policy&lt;/code&gt; to configure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 3: File System Identity&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;FSx for ONTAP maps S3 API calls to NFS/SMB identity. The NFS export policy or NTFS ACLs on the audit volume must allow read access for the mapped identity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key implication&lt;/strong&gt;: If your Lambda gets &lt;code&gt;AccessDenied&lt;/code&gt; despite correct IAM, check the access point resource policy and the file system export policy. All three layers must allow the request.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Troubleshooting &lt;code&gt;ListObjectsV2&lt;/code&gt; AccessDenied&lt;/strong&gt;: Verify both the access point ARN resource (for &lt;code&gt;s3:ListBucket&lt;/code&gt;) and the object ARN resource (for &lt;code&gt;s3:GetObject&lt;/code&gt;) are present in the Lambda role policy, and confirm the access point resource policy explicitly allows the Lambda execution role.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For production, enforce read-only access consistently across the Lambda IAM policy, the S3 Access Point resource policy, and the mapped FSx file-system identity.&lt;/p&gt;

&lt;p&gt;In multiprotocol environments, validate the volume or qtree security style and effective permissions for the S3 Access Point file-system identity. Mixed security-style volumes can produce unexpected access results if effective permissions differ by path.&lt;/p&gt;




&lt;h2&gt;
  
  
  Compatibility Verification (Splunk HEC)
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Live LogScale ingest validation is pending the required CrowdStrike LogScale entitlement.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Since CrowdStrike LogScale's free trial does not include HEC ingest capability (Data Connectors require a paid Next-Gen SIEM license), we validated using Splunk Enterprise Docker — which accepts the same HEC-style JSON envelope used by this implementation.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;XML audit log parsing (5 events)&lt;/td&gt;
&lt;td&gt;✅ EventID 4663/4656/4660&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HEC delivery&lt;/td&gt;
&lt;td&gt;✅ HTTP 200 &lt;code&gt;{"text":"Success","code":0}&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Indexing&lt;/td&gt;
&lt;td&gt;✅ &lt;code&gt;fsxn_audit&lt;/code&gt; index, all events searchable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Field extraction&lt;/td&gt;
&lt;td&gt;✅ user, path, client_ip, event_type, result, svm&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Splunk Search UI&lt;/td&gt;
&lt;td&gt;✅ All fields parsed and filterable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The HEC-style JSON envelope used in this implementation is compatible with Splunk HEC and provides a strong compatibility signal for LogScale HEC. A successful Splunk HEC test validates payload shape, timestamp metadata, and basic delivery behavior. It does not replace live LogScale tenant validation.&lt;/p&gt;




&lt;h2&gt;
  
  
  LogScale CQL Query Examples
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The CQL examples below are intended as starting points. Validate in your LogScale repository after parser assignment and field extraction are confirmed.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Correlation Use Cases
&lt;/h3&gt;

&lt;p&gt;Once audit events are in LogScale, investigate alongside Falcon endpoint telemetry:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Did the same user trigger Falcon endpoint detections near the file access spike?&lt;/li&gt;
&lt;li&gt;Did the client host run compression, staging, or cloud upload tools before high-volume reads?&lt;/li&gt;
&lt;li&gt;Did identity risk increase before first-seen access to sensitive shares?&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Audit Event Count vs Client I/O Count
&lt;/h3&gt;

&lt;p&gt;For SMB, audit event counts are not identical to raw client I/O counts. ONTAP may suppress repeated read/write events on the same object to avoid excessive logging (e.g., EventID 4663 records only the first SMB read and first SMB write per handle). Treat detection thresholds as audit-event baselines, not storage I/O baselines.&lt;/p&gt;

&lt;h3&gt;
  
  
  SOC Detection Examples
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;The detection queries below are starter patterns. Validate CQL syntax, field extraction, threshold values, and business context in your LogScale tenant before enabling alerts. Do not enable these as production alerts without baseline tuning — thresholds should be adjusted per share, user population, and normal business activity. The time-bucketing syntax below is illustrative; adjust to your LogScale CQL version and repository parser behavior.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  Mass file deletion detection
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;fsxn_audit&lt;/span&gt; &lt;span class="n"&gt;event_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;"4660"&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;groupBy&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;_bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;client_ip&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;_count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;order&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;desc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  First-seen access to sensitive share
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;fsxn_audit&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=/&lt;/span&gt;&lt;span class="k"&gt;share&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;finance&lt;/span&gt;&lt;span class="cm"&gt;/* OR path=/share/hr/* OR path=/share/legal/*
| groupBy([user], function=[min(@timestamp, as=first_seen), count()])
| first_seen &amp;gt; now() - 24h
| sort(first_seen, order=desc)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;This detects first-seen access within the repository retention window, not necessarily the user's first access ever.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  High-volume file access (possible exfiltration)
&lt;/h4&gt;

&lt;blockquote&gt;
&lt;p&gt;High-volume reads may also be caused by backup, indexing, migration, or legitimate batch processing. Correlate with known job schedules before escalating.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;fsxn_audit&lt;/span&gt; &lt;span class="k"&gt;result&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;"Audit Success"&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;span&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;groupBy&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;_bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;client_ip&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;_count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;order&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;desc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Response Automation with Fusion SOAR
&lt;/h3&gt;

&lt;p&gt;After a detection is validated and tuned, connect LogScale alerts to Falcon Fusion SOAR workflows for enrichment and response orchestration. A typical workflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Enrich the user and client host context&lt;/li&gt;
&lt;li&gt;Check Falcon detections or incidents in the same time window&lt;/li&gt;
&lt;li&gt;Check known backup, indexing, or migration job schedules&lt;/li&gt;
&lt;li&gt;Notify the SOC channel and storage owner&lt;/li&gt;
&lt;li&gt;Escalate to containment or identity action only after human approval&lt;/li&gt;
&lt;/ol&gt;




&lt;h3&gt;
  
  
  NFS-Friendly Detection Note
&lt;/h3&gt;

&lt;p&gt;For NFS-heavy environments, prefer normalized fields such as &lt;code&gt;operation&lt;/code&gt;, &lt;code&gt;path&lt;/code&gt;, &lt;code&gt;user&lt;/code&gt;, and &lt;code&gt;client_ip&lt;/code&gt; rather than Windows-specific &lt;code&gt;event_type&lt;/code&gt; values. NFS operations use operation names (e.g., REMOVE, RENAME, READ) that map differently from SMB EventIDs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Find all file access by a specific user
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;fsxn_audit&lt;/span&gt; &lt;span class="k"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;"CORP&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="nv"&gt;user-finance-01"&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;table&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;client_ip&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Detect unusual access patterns (after-hours access)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;fsxn_audit&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;parseTimestamp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;field&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;"yyyy-MM-dd'T'HH:mm:ss"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;hour&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;formatTime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;field&lt;/span&gt;&lt;span class="o"&gt;=@&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;"HH"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;hour&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;"19"&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;hour&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nv"&gt;"07"&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;groupBy&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="k"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;order&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;desc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Files not accessed in 30 days
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;fsxn_audit&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;groupBy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;last_access&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;last_access&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;last_access&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;order&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;asc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Failed access attempts (permission issues)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="o"&gt;#&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;fsxn_audit&lt;/span&gt; &lt;span class="k"&gt;result&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;"Audit Failure"&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;groupBy&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="k"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;_count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;order&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;desc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Performance Benchmarks
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Caveat&lt;/strong&gt;: The 178K events/sec result is a parser-only microbenchmark. End-to-end throughput depends on S3 Access Point read performance (tied to FSx provisioned throughput), Lambda memory, HEC batching, network latency, LogScale ingest quota, and retry behavior.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Environment: Lambda ARM64 (Graviton) 256MB, Python 3.12
Parser: v1.1.0 (FIELD_MAPPING, iterparse, XXE hardening when defusedxml is packaged)

┌─────────────────────┬──────────┬─────────────────────┐
│ Input               │ Time     │ Throughput          │
├─────────────────────┼──────────┼─────────────────────┤
│ 5 events (2KB)      │ 0.045ms  │ 111,000 events/sec  │
│ 500 events (135KB)  │ 2.8ms    │ 178,000 events/sec  │
│ 5,000 events (1.3MB)│ ~28ms    │ ~178,000 events/sec │
├─────────────────────┼──────────┼─────────────────────┤
│ 1M events/day       │ ~6 sec   │ total daily compute │
└─────────────────────┴──────────┴─────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At 178K events/sec, a 5-minute Lambda invocation can process ~53 million events — far exceeding any realistic FSx for ONTAP audit log volume.&lt;/p&gt;




&lt;h2&gt;
  
  
  Deployment
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Store ingest token in Secrets Manager&lt;/span&gt;
aws secretsmanager create-secret &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"crowdstrike/fsxn-ingest-token"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--secret-string&lt;/span&gt; &lt;span class="s1"&gt;'{"ingest_token":"&amp;lt;your-token&amp;gt;"}'&lt;/span&gt;

&lt;span class="c"&gt;# 2. Deploy stack&lt;/span&gt;
aws cloudformation deploy &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--template-file&lt;/span&gt; integrations/crowdstrike/template.yaml &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--stack-name&lt;/span&gt; fsxn-crowdstrike-integration &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--parameter-overrides&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;FsxS3AccessPointArn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;arn:aws:s3:&amp;lt;region&amp;gt;:&amp;lt;account&amp;gt;:accesspoint/&amp;lt;name&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;LogScaleIngestTokenSecretArn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;secret-arn&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;LogScaleUrl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://cloud.us.humio.com &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;HecPath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/api/v1/ingest/hec &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--capabilities&lt;/span&gt; CAPABILITY_NAMED_IAM

&lt;span class="c"&gt;# 3. Upload handler code (template creates a placeholder; upload the actual handler)&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;integrations/crowdstrike/lambda &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; zip &lt;span class="k"&gt;function&lt;/span&gt;.zip handler.py
aws lambda update-function-code &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--function-name&lt;/span&gt; fsxn-crowdstrike-integration-shipper &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--zip-file&lt;/span&gt; fileb://function.zip
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;S3 Access Point design&lt;/strong&gt;: When creating the access point, document the chosen file access type (read-only recommended — Lambda only needs to list and read converted audit logs), file-system user identity (UNIX for UNIX-style volumes, Windows for NTFS), and network configuration (VPC-restricted for production).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ONTAP pre-flight checks&lt;/strong&gt; (run before Lambda deployment):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;vserver audit show -instance&lt;/code&gt; and &lt;code&gt;vserver audit show -fields destination,format,events&lt;/code&gt; — confirm audit is enabled with XML format&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;volume show -fields security-style,junction-path&lt;/code&gt; — verify audit volume&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;vserver security file-directory show-effective-permissions&lt;/code&gt; — verify read access for the S3 AP identity&lt;/li&gt;
&lt;li&gt;Verify the S3 Access Point file-system identity can read the audit log path&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Splunk HEC compatibility testing&lt;/strong&gt;: For local validation without the required LogScale entitlement, set &lt;code&gt;LogScaleUrl&lt;/code&gt; to your Splunk Docker endpoint (e.g., &lt;code&gt;https://localhost:8088&lt;/code&gt;) and &lt;code&gt;HecPath=/services/collector/event&lt;/code&gt;. The payload format is validated against the same HEC envelope.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;defusedxml for production&lt;/strong&gt;: For production XML parsing hardening, include &lt;code&gt;defusedxml&lt;/code&gt; in the Lambda deployment package or Lambda Layer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recommended CI checks&lt;/strong&gt;: cfn-lint for template, pytest for parser/handler, dependency vulnerability scan, secret scanning, malicious XML / XXE regression test, and local Splunk HEC integration test.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Time from zero to first HEC-compatible test event in Splunk Docker: &lt;strong&gt;~30 minutes&lt;/strong&gt;. Live LogScale first-event timing is pending tenant validation.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Production deployment&lt;/strong&gt;: For production, version the Lambda artifact and promote it across environments through CI/CD. Keep the CloudFormation template, Lambda package, parser version, and dashboard definitions tied to the same release tag. Use SAM build/deploy or a CI pipeline instead of manual &lt;code&gt;update-function-code&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Cost
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Monthly&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Lambda (5-min schedule, ~1s avg execution)&lt;/td&gt;
&lt;td&gt;~$0.50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Secrets Manager&lt;/td&gt;
&lt;td&gt;~$0.40&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EventBridge Scheduler&lt;/td&gt;
&lt;td&gt;~$0.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3 AP reads&lt;/td&gt;
&lt;td&gt;~$0.05&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AWS total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$1/month&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Cost caveats&lt;/strong&gt;: The AWS estimate assumes Lambda is not in a VPC requiring NAT Gateway, low CloudWatch Logs retention, no custom KMS key, and low retry volume. CrowdStrike licensing cost is excluded. S3 Access Point read cost and performance depend on request volume, object size, and the underlying FSx for ONTAP file system throughput configuration. Additional CloudWatch custom metrics, dashboards, alarms, and log retention may increase AWS cost depending on metric cardinality and retention settings.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;LogScale cost, third-party ingestion entitlement, daily ingest quota, and retention depend on your CrowdStrike product edition and contract. Confirm these details with your CrowdStrike account team before production planning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Daily ingest estimate&lt;/strong&gt;: &lt;code&gt;daily_gb = events_per_day * avg_event_size_bytes / (1024^3)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost optimization options&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Filter low-value event types only after compliance and incident-response sign-off (e.g., handle-close events may be high volume but useful in forensic timelines)&lt;/li&gt;
&lt;li&gt;Truncate or hash high-cardinality fields to reduce event size&lt;/li&gt;
&lt;li&gt;Compress HEC request payloads (gzip)&lt;/li&gt;
&lt;li&gt;Tune ONTAP audit policy scope to relevant shares only&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Data Classification and Privacy
&lt;/h2&gt;

&lt;p&gt;FSx for ONTAP audit logs contain information that may be business-sensitive or subject to privacy regulations:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;Sensitivity&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;user&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;PII — identifies individuals&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CORP\user-finance-01&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;client_ip&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Network topology exposure&lt;/td&gt;
&lt;td&gt;&lt;code&gt;10.0.1.50&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;path&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Business-sensitive file paths&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/share/finance/quarterly-reports/Q2-2026-revenue.xlsx&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;svm&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Infrastructure naming&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ProductionSVM&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;timestamp&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Access pattern analysis&lt;/td&gt;
&lt;td&gt;Working hours / after-hours&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Considerations&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;File paths may reveal business operations, project codenames, or M&amp;amp;A activity&lt;/li&gt;
&lt;li&gt;Username + path + timestamp combinations enable detailed behavior profiling&lt;/li&gt;
&lt;li&gt;IP addresses expose internal network topology&lt;/li&gt;
&lt;li&gt;Sending this data to an external platform (CrowdStrike) requires appropriate data processing agreements&lt;/li&gt;
&lt;li&gt;Evaluate whether PII redaction (e.g., via OTel Collector transform processor) is required before external transmission&lt;/li&gt;
&lt;li&gt;Confirm your organization's data classification policy covers audit log content shipped to third-party SIEMs&lt;/li&gt;
&lt;li&gt;Treat the SIEM repository itself as a sensitive data store — file paths, usernames, and timestamps can reveal business activity and user behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Future consideration for regulated environments&lt;/strong&gt;: For healthcare, public sector, or financial services, consider a minimization mode that hashes or drops selected fields (&lt;code&gt;user&lt;/code&gt;, &lt;code&gt;client_ip&lt;/code&gt;, or full &lt;code&gt;path&lt;/code&gt;) before external transmission, while preserving enough metadata for investigation. The OTel Collector transform processor or a pre-processing Lambda can implement field-level redaction.&lt;/p&gt;

&lt;p&gt;Example minimization policy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keep: &lt;code&gt;event_type&lt;/code&gt;, &lt;code&gt;operation&lt;/code&gt;, &lt;code&gt;result&lt;/code&gt;, &lt;code&gt;svm&lt;/code&gt;, &lt;code&gt;timestamp&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Hash: &lt;code&gt;user&lt;/code&gt;, &lt;code&gt;client_ip&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Truncate: &lt;code&gt;path&lt;/code&gt; to directory level (replace file names with hashes)&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  FIELD_MAPPING Masking Strategy (v1.2.0 proposal)
&lt;/h3&gt;

&lt;p&gt;Extend the existing &lt;code&gt;FIELD_MAPPING&lt;/code&gt; table with a per-field &lt;code&gt;action&lt;/code&gt; parameter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;FIELD_MAPPING&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;keys&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;TimeCreated_SystemTime&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;keep&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;keys&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SubjectUserName&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;UserName&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;         &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;client_ip&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;keys&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;IpAddress&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ClientIP&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;               &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mask_subnet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;keys&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ObjectName&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;                  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;truncate_dir&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;event_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;keys&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;EventID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;event_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;              &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;keep&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;keys&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Keywords&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Result&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;                   &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;keep&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Actions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;keep&lt;/code&gt;: Pass through unchanged&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;hash&lt;/code&gt;: &lt;code&gt;hashlib.sha256(salt + value).hexdigest()[:16]&lt;/code&gt; — preserves correlation without exposing identity&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mask_subnet&lt;/code&gt;: &lt;code&gt;10.0.1.50&lt;/code&gt; → &lt;code&gt;10.0.1.0/24&lt;/code&gt; — hides host, preserves network segment&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;truncate_dir&lt;/code&gt;: &lt;code&gt;/share/finance/Q2-revenue.xlsx&lt;/code&gt; → &lt;code&gt;/share/finance/[REDACTED]&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Operational requirement&lt;/strong&gt;: Maintain a secure lookup table (separate from SIEM) mapping hashed values to originals. This enables authorized investigators to de-anonymize during incident response with dual-approval.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Salt management&lt;/strong&gt;: The hash salt MUST NOT be hardcoded in Lambda code or environment variables. Store it in AWS Secrets Manager (or derive from KMS &lt;code&gt;GenerateDataKey&lt;/code&gt;). Lambda retrieves and caches the salt at cold start via the same &lt;code&gt;auth_cache&lt;/code&gt; pattern used for HEC tokens. Rotate the salt quarterly; maintain a salt history table to support lookups against older hashes.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Audit Log Immutability (S3 Object Lock)
&lt;/h3&gt;

&lt;p&gt;For compliance environments requiring tamper-evident audit trails:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;S3 Object Lock (COMPLIANCE mode)&lt;/strong&gt;: Once set, even root cannot delete objects before the retention period expires. Suitable for SEC 17a-4, FINRA, FISC.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Governance mode&lt;/strong&gt;: Allows privileged users to override (suitable for internal policy enforcement without regulatory mandate).
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Enable Object Lock on the audit log bucket (must be set at bucket creation)&lt;/span&gt;
aws s3api create-bucket &lt;span class="nt"&gt;--bucket&lt;/span&gt; fsxn-audit-immutable &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--object-lock-enabled-for-object-lock&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;

&lt;span class="c"&gt;# Set default retention (COMPLIANCE mode, 7 years)&lt;/span&gt;
aws s3api put-object-lock-configuration &lt;span class="nt"&gt;--bucket&lt;/span&gt; fsxn-audit-immutable &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--object-lock-configuration&lt;/span&gt; &lt;span class="s1"&gt;'{
    "ObjectLockEnabled": "Enabled",
    "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Years": 7}}
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Token and Egress Governance
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Token Management
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Store in Secrets Manager&lt;/strong&gt;: Never embed HEC ingest tokens in Lambda environment variables or code. Use &lt;code&gt;secretsmanager:GetSecretValue&lt;/code&gt; at runtime with caching.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rotate periodically&lt;/strong&gt;: Generate new ingest tokens in LogScale and update the Secrets Manager secret. The Lambda's &lt;code&gt;auth_cache&lt;/code&gt; module handles reload-on-401/403 automatically.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Least-privilege token scope&lt;/strong&gt;: Create a dedicated ingest token scoped to the &lt;code&gt;fsxn_audit&lt;/code&gt; repository only — not a global token.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Egress Control
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Restrict Lambda egress where feasible&lt;/strong&gt;: If the Lambda runs in a VPC, route outbound traffic through an approved egress path (proxy, firewall, or controlled NAT) and restrict destinations to approved CrowdStrike LogScale endpoints according to your network policy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Alarm on authentication failures&lt;/strong&gt;: Set a CloudWatch Metric Filter + Alarm for HTTP 401/403 responses from the HEC endpoint. This detects token expiration, revocation, or misconfiguration early.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Monitoring
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Example: CloudWatch Metric Filter pattern for HEC auth failures
# { $.status_code = 401 || $.status_code = 403 }
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Configure a CloudWatch Alarm that triggers on &amp;gt;0 auth failures in a 5-minute window to alert on token issues before data loss occurs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Production Readiness Checklist
&lt;/h2&gt;

&lt;p&gt;Before promoting this integration from PoC to production:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Validation&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] &lt;strong&gt;Validate live LogScale ingest&lt;/strong&gt; — Confirm events appear in the &lt;code&gt;fsxn_audit&lt;/code&gt; repository with correct field extraction (pending paid tenant)&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Confirm LogScale repository and ingest token design&lt;/strong&gt; — Create a dedicated repository, assign the expected parser to the ingest token, document token ownership, rotation, and emergency revocation&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Confirm repository/parser/retention&lt;/strong&gt; — Verify LogScale repository settings: assigned parser, retention period, ingest quota allocation&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Set top-level HEC &lt;code&gt;time&lt;/code&gt;&lt;/strong&gt; — Ensure Lambda sets epoch-seconds &lt;code&gt;time&lt;/code&gt; field on every HEC event (see HEC Event Format section)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Reliability&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] &lt;strong&gt;Define retry/DLQ/replay strategy&lt;/strong&gt; — Configure SQS DLQ for failed HEC deliveries; document the replay procedure (see &lt;code&gt;docs/en/runbooks/dlq-replay.md&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Validate IAM least privilege&lt;/strong&gt; — Audit Lambda execution role: only &lt;code&gt;s3:GetObject&lt;/code&gt;, &lt;code&gt;s3:ListBucket&lt;/code&gt; on the AP, &lt;code&gt;secretsmanager:GetSecretValue&lt;/code&gt; on the token secret. For checkpoint: PoC uses &lt;code&gt;ssm:GetParameter&lt;/code&gt;/&lt;code&gt;PutParameter&lt;/code&gt;; production uses DynamoDB (&lt;code&gt;dynamodb:GetItem&lt;/code&gt;, &lt;code&gt;PutItem&lt;/code&gt;, &lt;code&gt;UpdateItem&lt;/code&gt; with condition expressions)&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Confirm token rotation procedure&lt;/strong&gt; — Test Secrets Manager secret rotation and verify Lambda handles token refresh without data loss&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Estimate daily volume&lt;/strong&gt; — Calculate expected daily ingest volume (events × avg event size) against the ingest quota and retention terms confirmed for your CrowdStrike tenant&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Enable CloudWatch Alarms&lt;/strong&gt; — DLQ depth &amp;gt; 0, Lambda error rate &amp;gt; 1%, checkpoint staleness &amp;gt; 15 minutes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Security and Governance&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] &lt;strong&gt;Confirm tenant isolation model&lt;/strong&gt; — For MSSP or multi-account: define repository/index separation, token scope, IAM boundary, and customer-specific retention&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Define audit evidence handling&lt;/strong&gt; — Confirm retention, export procedure, chain-of-custody requirements, and how delivery gaps are detected and explained&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Protect the audit log volume&lt;/strong&gt; — Confirm audit log volume placement, backup/replication policy, retention, and recovery procedure. Do not place audit log destination on the SVM root volume (root volume content is not replicated in SVM DR). If audit logs are part of regulated evidence, align ONTAP audit log volume protection with the SIEM retention and evidence export policy&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Data classification sign-off&lt;/strong&gt; — Confirm audit log content classification and external transmission approval (see Data Classification section)&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Egress governance&lt;/strong&gt; — Implement token storage, rotation, egress restriction, and 401/403 alarming (see Token and Egress Governance section)&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Harden XML parsing&lt;/strong&gt; — Package &lt;code&gt;defusedxml&lt;/code&gt; in the Lambda Layer or fail closed if unavailable (see XML Parsing and XXE Hardening section)&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Define SIEM access governance&lt;/strong&gt; — Confirm who can search FSx audit logs, who can export results, and how incident evidence is retained&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Define audit reporting workflow&lt;/strong&gt; — Identify recurring audit reports (monthly access summary, deletion tracking, sensitive folder access), export format, report owner, approval process, and retention period. See also: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/existing-audit-tool-coexistence.md" rel="noopener noreferrer"&gt;Existing Audit Tool Coexistence Guide&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Validate audit format coexistence&lt;/strong&gt; — If an existing audit tool requires EVTX, confirm whether a separate SVM or format migration strategy is needed (ONTAP does not support simultaneous EVTX and XML output per SVM). See &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/existing-audit-tool-coexistence.md" rel="noopener noreferrer"&gt;coexistence guide&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Observability&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] &lt;strong&gt;Validate delivery SLO measurement&lt;/strong&gt; — Confirm &lt;code&gt;LogFileAgeSeconds&lt;/code&gt; is a suitable proxy for audit log close-to-delivery lag in your environment&lt;/li&gt;
&lt;li&gt;[ ] &lt;strong&gt;Emit pipeline health metrics&lt;/strong&gt; — Use CloudWatch EMF or Lambda Powertools Metrics to emit files processed, events parsed, events sent, HEC success/failure, delivery latency, and log file age. Use SQS native metrics for DLQ depth&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  HEC Path vs OTel / Alloy Path
&lt;/h2&gt;

&lt;p&gt;This article uses the HEC path because Falcon LogScale provides a Splunk HEC-compatible endpoint. For teams that standardize on OpenTelemetry, the same normalized audit events can also be routed through an OTel Collector or Grafana Alloy pipeline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use HEC when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The target SIEM natively supports HEC (LogScale, Splunk)&lt;/li&gt;
&lt;li&gt;You want a lightweight Lambda-to-SIEM delivery path&lt;/li&gt;
&lt;li&gt;You are validating Splunk / LogScale compatibility&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use OTel / Alloy when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You need multi-backend routing (e.g., Grafana + Splunk + custom)&lt;/li&gt;
&lt;li&gt;You want field transformation, redaction, or sampling before export&lt;/li&gt;
&lt;li&gt;You want a common telemetry contract across multiple backends&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For OTel / Alloy-based routing, apply minimization before export: hash &lt;code&gt;user&lt;/code&gt; and &lt;code&gt;client_ip&lt;/code&gt;, truncate &lt;code&gt;path&lt;/code&gt; to directory level, and use only bounded low-cardinality attributes such as &lt;code&gt;event_type&lt;/code&gt;, &lt;code&gt;operation&lt;/code&gt;, and &lt;code&gt;result&lt;/code&gt; as metric dimensions when deriving metrics.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;High-cardinality warning&lt;/strong&gt;: Do not promote &lt;code&gt;user&lt;/code&gt;, full &lt;code&gt;path&lt;/code&gt;, or &lt;code&gt;client_ip&lt;/code&gt; to metric labels. Keep them as log fields / event attributes. Use only bounded, low-cardinality dimensions such as &lt;code&gt;environment&lt;/code&gt;, &lt;code&gt;region&lt;/code&gt;, &lt;code&gt;fsx_file_system_id&lt;/code&gt;, &lt;code&gt;event_type&lt;/code&gt;, &lt;code&gt;operation&lt;/code&gt;, and &lt;code&gt;result&lt;/code&gt;. Use &lt;code&gt;svm&lt;/code&gt; only if the number of SVMs is bounded and operationally meaningful.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Decision Matrix: Which Path to Choose?
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Factor&lt;/th&gt;
&lt;th&gt;→ HEC Direct&lt;/th&gt;
&lt;th&gt;→ OTel/Alloy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Single SIEM destination (LogScale or Splunk)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Minimize infrastructure components&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-backend fanout needed&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Field redaction/hashing before external transmission&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Need trace_id correlation with application traces&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Already running OTel Collector in environment&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PoC / validation phase (simplicity first)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Regulatory minimization required (hash PII at edge)&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Start with HEC, evolve to OTel&lt;/strong&gt;: Most teams start with the HEC direct path (this article) for initial validation, then add an OTel Collector when multi-backend routing or pre-processing needs emerge. The normalized field schema is identical in both paths — switching is a routing change, not a data model change.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Pipeline Observability
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Suggested Delivery SLO
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;99% of audit log files are delivered to the target SIEM within 10 minutes
of being closed by ONTAP. No checkpoint staleness over 15 minutes during
expected audit activity windows.

To measure this SLO, track LogFileAgeSeconds (current_time minus the audit
log object's last modified time from S3 AP). DeliveryLatencyMs only measures
Lambda-side processing and HEC response latency, not end-to-end lag.

Note: LogFileAgeSeconds is a practical proxy for end-to-end lag. Validate
how S3 Access Point LastModified maps to ONTAP audit log rotation / close
timing in your environment.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pipeline Health Metrics
&lt;/h3&gt;

&lt;p&gt;For production observability, emit pipeline health metrics using CloudWatch Embedded Metric Format (EMF) or AWS Lambda Powertools Metrics.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;EMF dimensions&lt;/strong&gt;: Avoid high-cardinality dimensions such as &lt;code&gt;user&lt;/code&gt;, &lt;code&gt;path&lt;/code&gt;, &lt;code&gt;client_ip&lt;/code&gt;, and &lt;code&gt;s3_key&lt;/code&gt;. Use bounded dimensions such as &lt;code&gt;FunctionName&lt;/code&gt;, &lt;code&gt;environment&lt;/code&gt;, and &lt;code&gt;target&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Unit&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;FilesScanned&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Count&lt;/td&gt;
&lt;td&gt;Files listed per invocation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;FilesProcessed&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Count&lt;/td&gt;
&lt;td&gt;Files successfully parsed and shipped&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;EventsParsed&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Count&lt;/td&gt;
&lt;td&gt;Total audit events extracted&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;EventsSent&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Count&lt;/td&gt;
&lt;td&gt;Events successfully delivered to HEC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;HecSuccess&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Count&lt;/td&gt;
&lt;td&gt;HTTP 2xx responses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;HecFailure&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Count&lt;/td&gt;
&lt;td&gt;HTTP 4xx/5xx responses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DeliveryLatencyMs&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Milliseconds&lt;/td&gt;
&lt;td&gt;Time from S3 read to HEC response&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CheckpointAgeSeconds&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Seconds&lt;/td&gt;
&lt;td&gt;Time since last checkpoint advance (derived from checkpoint state; recommended for production)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DlqMessages&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Count&lt;/td&gt;
&lt;td&gt;Messages in dead letter queue (from SQS native metric)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;LogFileAgeSeconds&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Seconds&lt;/td&gt;
&lt;td&gt;Time since audit log file was last modified (measures SLO)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Gray Failure Detection
&lt;/h3&gt;

&lt;p&gt;Watch for gray failures, not only hard failures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HEC returns intermittent 429 / 5xx but Lambda doesn't error&lt;/li&gt;
&lt;li&gt;Checkpoint advances slowly but does not stop&lt;/li&gt;
&lt;li&gt;DLQ remains empty but delivery latency increases&lt;/li&gt;
&lt;li&gt;Event count drops unexpectedly compared to historical baseline&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Handle HEC 429 / 5xx with exponential backoff and bounded retries. Do not advance the checkpoint until delivery succeeds or the failed object is safely moved to DLQ for replay.&lt;/p&gt;

&lt;h3&gt;
  
  
  Observability as Code
&lt;/h3&gt;

&lt;p&gt;For production, manage not only the Lambda pipeline but also dashboards, alarms, metric filters, and runbook links as code. The CloudFormation template should include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CloudWatch Dashboard for pipeline health&lt;/li&gt;
&lt;li&gt;Alarms for DLQ depth, Lambda errors, HEC 401/403, HEC 5xx, checkpoint staleness&lt;/li&gt;
&lt;li&gt;Metric filters for authentication failures&lt;/li&gt;
&lt;li&gt;Runbook links in alarm descriptions&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Splunk HEC Compatibility Notes
&lt;/h2&gt;

&lt;p&gt;For teams using this integration with Splunk (via &lt;code&gt;HecPath=/services/collector/event&lt;/code&gt;):&lt;/p&gt;

&lt;h3&gt;
  
  
  Verification Coverage
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;HEC timestamp (&lt;code&gt;time&lt;/code&gt; metadata)&lt;/td&gt;
&lt;td&gt;✅ Reflected as event time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;sourcetype assignment&lt;/td&gt;
&lt;td&gt;✅ &lt;code&gt;fsxn:audit:xml&lt;/code&gt; assigned&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;source assignment&lt;/td&gt;
&lt;td&gt;✅ &lt;code&gt;fsxn-ontap&lt;/code&gt; assigned&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JSON field extraction&lt;/td&gt;
&lt;td&gt;✅ All &lt;code&gt;event&lt;/code&gt; fields searchable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Duplicate replay tolerance&lt;/td&gt;
&lt;td&gt;📝 Pending production replay test&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  HEC Acknowledgement Gap
&lt;/h3&gt;

&lt;p&gt;Splunk HEC provides &lt;strong&gt;Indexer Acknowledgement&lt;/strong&gt; (&lt;code&gt;/services/collector/ack&lt;/code&gt;) — a mechanism to confirm events are committed to disk, not just accepted into the ingestion pipeline.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Splunk HEC&lt;/th&gt;
&lt;th&gt;LogScale HEC&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Delivery guarantee&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;ackId&lt;/code&gt; confirms disk write&lt;/td&gt;
&lt;td&gt;HTTP 200 = accepted (best-effort)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Retry safety&lt;/td&gt;
&lt;td&gt;Replay until ack received&lt;/td&gt;
&lt;td&gt;No built-in replay mechanism&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data loss window&lt;/td&gt;
&lt;td&gt;Zero (if ack used correctly)&lt;/td&gt;
&lt;td&gt;Between acceptance and disk flush&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Recommendation for production&lt;/strong&gt;: Use a write-ahead pattern — persist events to S3 (or DynamoDB) before HEC delivery. Treat HEC as best-effort delivery. On failure, replay from the durable store. This pattern works identically for both Splunk and LogScale.&lt;/p&gt;

&lt;h3&gt;
  
  
  DLQ Replay Strategy
&lt;/h3&gt;

&lt;p&gt;When HEC is unavailable and events accumulate in the DLQ (SQS):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Pros&lt;/th&gt;
&lt;th&gt;Cons&lt;/th&gt;
&lt;th&gt;Recommended&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Replay DLQ JSON directly to HEC&lt;/td&gt;
&lt;td&gt;Fast, no re-parsing&lt;/td&gt;
&lt;td&gt;Requires DLQ messages to be complete HEC payloads&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reset checkpoint and re-parse from S3&lt;/td&gt;
&lt;td&gt;Guaranteed consistency&lt;/td&gt;
&lt;td&gt;Slower, re-reads S3, may re-process already-delivered events&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Recommended&lt;/strong&gt;: Store the complete HEC JSON payload in DLQ messages. On recovery, drain the DLQ and POST each message directly to the HEC endpoint. This avoids re-parsing and ensures idempotent delivery (same &lt;code&gt;time&lt;/code&gt; + &lt;code&gt;event&lt;/code&gt; content).&lt;/p&gt;

&lt;p&gt;See &lt;code&gt;docs/en/runbooks/dlq-replay.md&lt;/code&gt; for the step-by-step procedure.&lt;/p&gt;

&lt;h3&gt;
  
  
  SPL vs CQL Query Comparison
&lt;/h3&gt;

&lt;p&gt;For SOC analysts working across both platforms:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Splunk SPL&lt;/th&gt;
&lt;th&gt;LogScale CQL&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Time bucket (5 min)&lt;/td&gt;
&lt;td&gt;`\&lt;/td&gt;
&lt;td&gt;bin _time span=5m`&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Top 10 users&lt;/td&gt;
&lt;td&gt;`\&lt;/td&gt;
&lt;td&gt;top limit=10 user`&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Count by user&lt;/td&gt;
&lt;td&gt;`\&lt;/td&gt;
&lt;td&gt;stats count by user`&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Filter + aggregate&lt;/td&gt;
&lt;td&gt;`source="fsxn" event_type=4660 \&lt;/td&gt;
&lt;td&gt;stats count by user`&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time range&lt;/td&gt;
&lt;td&gt;&lt;code&gt;earliest=-1h latest=now&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Query time picker or &lt;code&gt;@timestamp &amp;gt; now() - 1h&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;String match&lt;/td&gt;
&lt;td&gt;&lt;code&gt;path="*finance*"&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;path = /share/finance/*&lt;/code&gt; (glob)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key difference&lt;/strong&gt;: SPL uses a pipe-forward streaming model where each command transforms the event set sequentially. CQL uses a similar pipe model but with different function names and grouping semantics. The normalized field names (&lt;code&gt;user&lt;/code&gt;, &lt;code&gt;path&lt;/code&gt;, &lt;code&gt;client_ip&lt;/code&gt;, &lt;code&gt;event_type&lt;/code&gt;, &lt;code&gt;result&lt;/code&gt;) are intentionally aligned so queries translate 1:1 in structure.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Production Checkpoint Design
&lt;/h2&gt;

&lt;p&gt;For production, checkpoint should be advanced only after successful HEC delivery. Use DynamoDB conditional writes to avoid concurrent Lambda invocations advancing the same checkpoint.&lt;/p&gt;

&lt;h3&gt;
  
  
  DynamoDB Schema
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Attribute&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;file_path&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;String (PK)&lt;/td&gt;
&lt;td&gt;S3 key of the audit log file&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;etag&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;String&lt;/td&gt;
&lt;td&gt;Object ETag for idempotency&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;status&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;String&lt;/td&gt;
&lt;td&gt;PENDING → PROCESSING → COMPLETED / FAILED&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;lease_expiry&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Number (epoch)&lt;/td&gt;
&lt;td&gt;Auto-release time for ghost lock prevention&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;event_count&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Number&lt;/td&gt;
&lt;td&gt;Events successfully delivered&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;updated_at&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;String (ISO)&lt;/td&gt;
&lt;td&gt;For staleness detection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ttl&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Number (epoch)&lt;/td&gt;
&lt;td&gt;Auto-delete COMPLETED records after 7 days&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Lease-Based Concurrency Control
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Lambda invocation
    │
    ├── DynamoDB ConditionExpression:
    │   "attribute_not_exists(file_path)
    │    OR #s = :failed
    │    OR lease_expiry &amp;lt; :now"
    │
    ├── [Success] → Set status=PROCESSING, lease_expiry=now+15m
    │                → Read S3 → Parse → Ship to HEC
    │                → Set status=COMPLETED
    │
    └── [ConditionalCheckFailed] → Skip (another instance owns it)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Ghost Lock Prevention
&lt;/h3&gt;

&lt;p&gt;If Lambda crashes or times out, the &lt;code&gt;lease_expiry&lt;/code&gt; attribute ensures another invocation can reclaim the file after 15 minutes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Acquire lease with ghost-lock prevention
&lt;/span&gt;&lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_item&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file_path&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;s3_key&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;UpdateExpression&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SET #s = :processing, lease_expiry = :expiry, updated_at = :now&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ConditionExpression&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;attribute_not_exists(file_path) OR #s = :failed OR lease_expiry &amp;lt; :now&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;ExpressionAttributeNames&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;ExpressionAttributeValues&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:processing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PROCESSING&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:failed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FAILED&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:expiry&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;900&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# 15 min lease
&lt;/span&gt;        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:now&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Alarm&lt;/strong&gt;: Set a CloudWatch alarm when any item has &lt;code&gt;status=PROCESSING&lt;/code&gt; for longer than 2× the lease period. This detects systematic Lambda failures that bypass the normal retry path.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Use &lt;code&gt;updated_at&lt;/code&gt; to derive &lt;code&gt;CheckpointAgeSeconds&lt;/code&gt; for dashboarding and checkpoint staleness alarms.&lt;/p&gt;

&lt;h3&gt;
  
  
  Multi-SVM / Multi-FileSystem Key Design
&lt;/h3&gt;

&lt;p&gt;For large-scale environments with multiple FSx file systems or SVMs, use a composite partition key to avoid DynamoDB hot partitions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PK: {fsx_file_system_id}#{svm_name}#{s3_key}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example: &lt;code&gt;fs-0123456789abcdef0#svm-prod#audit/2026-06-14/events-001.xml&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This ensures DynamoDB distributes writes across partitions even at scale (100+ SVMs), avoiding RCU/WCU burst throttling on a single partition.&lt;/p&gt;

&lt;h3&gt;
  
  
  Catch-up Storm Prevention
&lt;/h3&gt;

&lt;p&gt;After a Lambda outage (hours of accumulated unprocessed files), recovery can cause a burst of S3 AP read requests that impacts FSx ONTAP production workloads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mitigations&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reserved Concurrency = 1&lt;/strong&gt;: During catch-up, limit Lambda to a single concurrent execution to serialize file processing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Max files per invocation&lt;/strong&gt;: Cap at 50 files per Lambda run; let the next scheduled invocation handle the rest&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backoff on queue depth&lt;/strong&gt;: If DynamoDB shows &amp;gt;100 PENDING items, add a 5-second sleep between file reads&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runbook&lt;/strong&gt;: Document the recovery procedure — temporarily increase &lt;code&gt;ScheduleRate&lt;/code&gt; to &lt;code&gt;rate(1 minute)&lt;/code&gt; while keeping concurrency at 1&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Production Readiness Checklist item&lt;/strong&gt;: Validate recovery behavior by simulating a 4-hour Lambda outage and measuring S3 AP read throughput during catch-up against the FSx provisioned throughput capacity.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. HEC compatibility is a hidden superpower
&lt;/h3&gt;

&lt;p&gt;By targeting the HEC protocol, we reduced the amount of destination-specific Lambda code needed for LogScale support. The payload envelope remains largely reusable, while repository, parser, query language, alerting, and retention settings still require LogScale-specific validation.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. The trial doesn't include ingest
&lt;/h3&gt;

&lt;p&gt;CrowdStrike's Falcon EDR trial includes the Next-Gen SIEM UI (read-only search, dashboards) but does NOT include Data Connectors / HEC ingest. The "Add data connector" page returns 404. CrowdStrike trial and licensing behavior may vary by product edition and contract. Confirm Falcon LogScale or Next-Gen SIEM ingest entitlement, daily quota, retention, and pricing with your CrowdStrike account team before production planning.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. FIELD_MAPPING &amp;gt; hardcoded &lt;code&gt;.get()&lt;/code&gt; chains
&lt;/h3&gt;

&lt;p&gt;The v1.0.0 parser had 10 lines of chained &lt;code&gt;.get()&lt;/code&gt; calls in &lt;code&gt;normalize_event&lt;/code&gt;. The v1.1.0 &lt;code&gt;FIELD_MAPPING&lt;/code&gt; table:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Makes field resolution self-documenting&lt;/li&gt;
&lt;li&gt;Supports new ONTAP versions without code changes&lt;/li&gt;
&lt;li&gt;Centralizes all field name knowledge in one place&lt;/li&gt;
&lt;li&gt;Is still fast (inner loop uses local &lt;code&gt;get&lt;/code&gt; binding)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. Test with Docker, deploy to Cloud
&lt;/h3&gt;

&lt;p&gt;Splunk Enterprise Docker (&lt;code&gt;splunk/splunk:latest --platform linux/amd64&lt;/code&gt;) provides a fully functional HEC in 2 minutes. We used it to validate the HEC-style payload shape before live LogScale tenant validation — at $0 cost.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CrowdStrike UI verification&lt;/strong&gt;: When the required CrowdStrike LogScale or Next-Gen SIEM entitlement is available, capture search screenshots showing FSx audit events with full field extraction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Charlotte AI integration&lt;/strong&gt;: Using natural language to query audit logs ("show me all file deletions by contractors this week")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Falcon Identity correlation&lt;/strong&gt;: Cross-referencing file access with AD authentication events for insider threat detection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OTel / Alloy routing option&lt;/strong&gt;: Add an optional pipeline that routes normalized audit events through OpenTelemetry Collector or Grafana Alloy for multi-backend delivery and field minimization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Existing audit tool coexistence&lt;/strong&gt;: Explore complementary deployment where existing batch-based tools handle audit reporting and this pipeline handles SOC detection and Falcon correlation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ARP / FPolicy correlation&lt;/strong&gt;: Correlate LogScale audit events with ONTAP Autonomous Ransomware Protection alerts and FPolicy notifications for ransomware investigation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Falcon content package&lt;/strong&gt;: Package starter LogScale detections for mass delete, abnormal access volume, after-hours access, and repeated access failures — with metadata (name, required fields, CQL, thresholds, false positives, MITRE mapping, response, tuning notes), dashboards, lookup tables, and Fusion SOAR workflow templates for repeatable deployment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Identity-aware file access graph&lt;/strong&gt;: Correlate FSx audit logs with AD / IdP authentication events, EDR telemetry, and CloudTrail to build an investigation graph for insider threat and ransomware analysis&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Partner Discovery Checklist
&lt;/h2&gt;

&lt;p&gt;Before deployment, validate the following with the customer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;FSx audit configuration (format, rotation, volume location)&lt;/li&gt;
&lt;li&gt;S3 Access Point identity and security style&lt;/li&gt;
&lt;li&gt;Expected daily event volume and file sizes&lt;/li&gt;
&lt;li&gt;SIEM entitlement and daily ingest quota&lt;/li&gt;
&lt;li&gt;Retention and compliance requirements&lt;/li&gt;
&lt;li&gt;Privacy / PII redaction requirements&lt;/li&gt;
&lt;li&gt;Network egress policy and approved endpoints&lt;/li&gt;
&lt;li&gt;Audit log format decision (XML for serverless parsing, EVTX for Windows Event Viewer)&lt;/li&gt;
&lt;li&gt;Audit destination volume and protection policy (backup, replication, retention)&lt;/li&gt;
&lt;li&gt;Volume/qtree security style and effective permissions for S3 AP identity&lt;/li&gt;
&lt;li&gt;SMB/NFS audit policy coverage and scope&lt;/li&gt;
&lt;li&gt;Existing audit tool deployment: scope, format expectations, reporting use cases, and complementary vs replacement positioning&lt;/li&gt;
&lt;li&gt;ARP / FPolicy coexistence with this pipeline&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;CrowdStrike discovery&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is the customer using Falcon LogScale repositories or NG-SIEM connectors?&lt;/li&gt;
&lt;li&gt;Is HEC ingestion enabled and contractually allowed?&lt;/li&gt;
&lt;li&gt;What is the daily ingest entitlement and retention?&lt;/li&gt;
&lt;li&gt;Which repository/parser/token will own FSx audit logs?&lt;/li&gt;
&lt;li&gt;Who owns detection tuning and alert routing?&lt;/li&gt;
&lt;li&gt;Will Fusion SOAR / Charlotte AI be used for investigation workflows?&lt;/li&gt;
&lt;li&gt;LogScale repository/view strategy per customer (MSSP)&lt;/li&gt;
&lt;li&gt;Ingest token per customer and rotation owner (MSSP)&lt;/li&gt;
&lt;li&gt;Parser package ownership and change control&lt;/li&gt;
&lt;/ul&gt;




&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations" rel="noopener noreferrer"&gt;GitHub Repository&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>crowdstrike</category>
      <category>serverless</category>
      <category>amazonfsxfornetappontap</category>
    </item>
    <item>
      <title>Governance &amp; Cross-Platform Access: Lake Formation, PII Anonymization, and Multi-Engine Reality for S3 Tables</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Mon, 08 Jun 2026 16:49:37 +0000</pubDate>
      <link>https://dev.to/aws-builders/governance-cross-platform-access-lake-formation-pii-anonymization-and-multi-engine-reality-for-30bf</link>
      <guid>https://dev.to/aws-builders/governance-cross-platform-access-lake-formation-pii-anonymization-and-multi-engine-reality-for-30bf</guid>
      <description>&lt;h2&gt;
  
  
  Previously...
&lt;/h2&gt;

&lt;p&gt;In &lt;a href="https://dev.to/aws-builders/from-hours-to-seconds-an-ai-powered-metadata-catalog-for-unstructured-data-on-fsx-for-ontap-5f54"&gt;Part 1&lt;/a&gt;, we built the metadata catalog. In &lt;a href="https://dev.to/aws-builders/ai-enrichment-pipeline-from-sample-classification-to-100k-file-metadata-search-with-bedrock-and-1imb"&gt;Part 2&lt;/a&gt;, we added AI classification and vector search. Now we need to answer the hard questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Who can see what?&lt;/strong&gt; (governance)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What about PII?&lt;/strong&gt; (anonymization)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Can Databricks/Snowflake access this?&lt;/strong&gt; (cross-platform)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Lake Formation: Governance on Unstructured Data
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;Unstructured data on NAS storage may be well protected at the file-system layer, but it is often not consistently classified, searchable, or governed from analytics and AI workflows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No unified classification → you may not know what's sensitive across the entire corpus&lt;/li&gt;
&lt;li&gt;File-system permissions exist, but analytics/AI tools can't leverage them for discovery&lt;/li&gt;
&lt;li&gt;Audit trails may exist at the file-system layer, but they are often not unified with analytics and AI query activity&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Solution
&lt;/h3&gt;

&lt;p&gt;With metadata in S3 Tables (Iceberg), Lake Formation provides:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌───────────────────────────────────────────────────┐
│  Lake Formation                                   │
│                                                   │
│  Table-level:  SELECT, DESCRIBE                   │
│  Column exposure: controlled via Athena Views     │
│                   (hide embedding_vector, paths)  │
│  Row filtering: WHERE sensitivity_level = 'public'│
│  Audit:        CloudTrail logs metadata queries   │
└───────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Verified: Access Control in Action
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Step 1: Authorized user queries metadata
  → ✅ SUCCEEDED (3 rows returned)

Step 2: Revoke SELECT permission
  → 🔒 BLOCKED: "Column 'file_name' cannot be resolved
     or requester is not authorized"

Step 3: Restore permission
  → ✅ SUCCEEDED (access restored)

Step 4: CloudTrail audit
  → All queries logged with user identity and timestamp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every query against the metadata table is governed and audited. This gives you &lt;strong&gt;100% metadata query governance coverage&lt;/strong&gt; in this PoC. Raw file access remains governed separately by FSx for ONTAP file-system permissions, S3 Access Point policies, and application-specific access paths.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lake Formation Governance Status
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Capability&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Table-level SELECT / DESCRIBE&lt;/td&gt;
&lt;td&gt;✅ Verified&lt;/td&gt;
&lt;td&gt;Grant/revoke works correctly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Athena query governance&lt;/td&gt;
&lt;td&gt;✅ Verified&lt;/td&gt;
&lt;td&gt;Unauthorized access blocked&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CloudTrail audit logging&lt;/td&gt;
&lt;td&gt;✅ Verified&lt;/td&gt;
&lt;td&gt;All queries logged with user identity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Column-level exclusion (ColumnWildcard)&lt;/td&gt;
&lt;td&gt;⚠️ Failed&lt;/td&gt;
&lt;td&gt;On tested S3 Tables federated catalog path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Row-level filtering / LF-Tags&lt;/td&gt;
&lt;td&gt;📋 Design pattern&lt;/td&gt;
&lt;td&gt;Taxonomy defined, needs validation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Column exposure via Athena Views&lt;/td&gt;
&lt;td&gt;✅ Workaround&lt;/td&gt;
&lt;td&gt;Recommended alternative to column-level grants&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Observed Limitation: Column-Level Grants on This S3 Tables Federated Catalog Path
&lt;/h3&gt;

&lt;p&gt;In this PoC, table-level Lake Formation SELECT grants worked as expected. However, column exclusion grants using &lt;code&gt;ColumnWildcard&lt;/code&gt; with &lt;code&gt;ExcludedColumnNames&lt;/code&gt; returned &lt;code&gt;InvalidInputException: Permissions modification is invalid&lt;/code&gt; against the &lt;code&gt;s3tablescatalog/...&lt;/code&gt; federated catalog path we tested.&lt;/p&gt;

&lt;p&gt;AWS documentation &lt;a href="https://docs.aws.amazon.com/lake-formation/latest/dg/s3-tables-grant-permissions.html" rel="noopener noreferrer"&gt;describes table, column, and row-level permissions&lt;/a&gt; for S3 Tables integrated with Lake Formation. Therefore, treat this as an observed limitation in our specific validation path (CLI command, region, catalog ID, engine version), not a confirmed general product limitation. The exact error and test conditions are recorded in the &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/tree/main/integrations/iceberg-metadata-catalog/verification-evidence" rel="noopener noreferrer"&gt;verification evidence&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workaround&lt;/strong&gt;: Create Athena Views that expose only permitted columns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- View for general users (no embeddings, no PII paths)&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;public_files&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;file_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;file_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;file_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classification&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;confidence_score&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="nv"&gt;"s3tablescatalog/fsxn-metadata-catalog"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"metadata"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"unstructured_files"&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;is_deleted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;sensitivity_level&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'public'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Apply Lake Formation on the view&lt;/span&gt;
&lt;span class="c1"&gt;-- Users query the view, not the base table&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Governance model choice&lt;/strong&gt;: For simple use cases, table/column-level permissions suffice. For dynamic, attribute-based access (e.g., "only files classified as 'public'"), use LF-Tags. For enterprise SSO integration, combine with IAM Identity Center. For enterprise governance, map &lt;code&gt;sensitivity_level&lt;/code&gt;, &lt;code&gt;path_classification&lt;/code&gt;, &lt;code&gt;tenant_id&lt;/code&gt;, and &lt;code&gt;pii_status&lt;/code&gt; to LF-Tags. See &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/governance/lf-tag-taxonomy.yaml" rel="noopener noreferrer"&gt;&lt;code&gt;governance/lf-tag-taxonomy.yaml&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Untested alternative&lt;/strong&gt;: Registering the S3 Tables table in a standard (non-federated) Glue Catalog may enable column-level permissions. This requires manual Iceberg metadata location configuration and has not been verified.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  PII Detection: English + Japanese
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Challenge
&lt;/h3&gt;

&lt;p&gt;Amazon Comprehend's &lt;code&gt;detect_pii_entities&lt;/code&gt; API supports only English and Spanish. For Japanese PII (names, addresses, My Number), we need a different approach.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dual-Engine Architecture
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Language&lt;/th&gt;
&lt;th&gt;Engine&lt;/th&gt;
&lt;th&gt;Detectable PII&lt;/th&gt;
&lt;th&gt;Latency&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;English&lt;/td&gt;
&lt;td&gt;Amazon Comprehend&lt;/td&gt;
&lt;td&gt;NAME, EMAIL, PHONE, ADDRESS, SSN, CREDIT_CARD, DATE_TIME&lt;/td&gt;
&lt;td&gt;~200ms&lt;/td&gt;
&lt;td&gt;$0.0001/100 chars&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Japanese&lt;/td&gt;
&lt;td&gt;Bedrock Claude&lt;/td&gt;
&lt;td&gt;氏名, メール, 電話, 住所, マイナンバー, クレジットカード, 生年月日&lt;/td&gt;
&lt;td&gt;~2-5s&lt;/td&gt;
&lt;td&gt;~$0.003/request&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Data privacy note&lt;/strong&gt;: When using Bedrock Claude for PII detection, document text is sent to the Bedrock API. Per &lt;a href="https://aws.amazon.com/bedrock/faqs/" rel="noopener noreferrer"&gt;AWS's data privacy policy&lt;/a&gt;, Bedrock does not store or use your inputs/outputs to train models. For highly sensitive workloads, consider VPC endpoints and AWS PrivateLink for Bedrock access.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Japanese PII Detection (Verified)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bedrock Claude detects Japanese PII via prompt
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bedrock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic.claude-3-haiku-20240307-v1:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Detect all PII in this text. Return JSON array: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[{{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;begin&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:N,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;end&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:N}}]&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Text:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;japanese_text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Results on a controlled synthetic sample&lt;/strong&gt; (not real personal data):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;PII Type&lt;/th&gt;
&lt;th&gt;Detected Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;NAME&lt;/td&gt;
&lt;td&gt;山田太郎&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EMAIL&lt;/td&gt;
&lt;td&gt;&lt;a href="mailto:taro.yamada@example.co.jp"&gt;taro.yamada@example.co.jp&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PHONE&lt;/td&gt;
&lt;td&gt;090-1234-5678&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ADDRESS&lt;/td&gt;
&lt;td&gt;〒150-0002 東京都渋谷区渋谷1-2-3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MY_NUMBER&lt;/td&gt;
&lt;td&gt;1234 5678 9012&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CREDIT_CARD&lt;/td&gt;
&lt;td&gt;4111-1111-1111-1111&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DATE_OF_BIRTH&lt;/td&gt;
&lt;td&gt;1985年3月15日&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Anonymization Pipeline
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Original document
       │
       ▼
PII Detection (Comprehend or Bedrock)
       │
       ├─ No PII → has_pii = false (no action needed)
       │
       └─ PII found → has_pii = true
                          │
                          ▼
              Redaction: all PII → [REDACTED]
                          │
                          ▼
              Store anonymized version
              anonymization_status = "completed"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Before&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Name: Taro Yamada
Email: taro.yamada@example.com
Phone: 090-1234-5678
SSN: 123-45-6789
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;After&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Name: [REDACTED]
Email: [REDACTED]
Phone: [REDACTED]
SSN: [REDACTED]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Data Clean Room Pattern
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────┐
│  Restricted Table (full metadata)       │
│  • has_pii, anonymized_path, raw paths  │
│  • Access: Security team only           │
│  • Lake Formation: strict SELECT grant  │
└─────────────────────────────────────────┘

┌─────────────────────────────────────────┐
│  Public Table (anonymized metadata)     │
│  • classification, summary (redacted)   │
│  • No PII, no raw file paths            │
│  • Access: All analysts                 │
│  • Lake Formation: broad SELECT grant   │
└─────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Encryption and Data Residency
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;At rest&lt;/strong&gt;: S3 Tables uses SSE-S3 encryption by default. All metadata is encrypted.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;In transit&lt;/strong&gt;: All API calls use TLS 1.2+.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data residency&lt;/strong&gt;: Both metadata (S3 Tables) and raw files (FSx for ONTAP) remain in the same AWS region. No cross-border data transfer occurs in the default architecture.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For detailed data sovereignty analysis, see the &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/docs/en/iceberg-metadata-catalog.md#data-sovereignty-encryption-and-audit-retention" rel="noopener noreferrer"&gt;Architecture Document — Data Sovereignty section&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Audit Log Retention
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CloudTrail&lt;/strong&gt;: Default 90-day event history. For long-term retention, create a Trail delivering to S3 (recommended: 1+ year for regulated industries)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lake Formation&lt;/strong&gt;: Data access audit logs are recorded via CloudTrail&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenSearch&lt;/strong&gt;: Access logs can be delivered to CloudWatch Logs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Analysis&lt;/strong&gt;: Use CloudTrail Lake (SQL queries) or Athena + S3 (cost-efficient) for audit analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For detailed operational monitoring setup, see the &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/docs/en/iceberg-metadata-catalog.md#operational-monitoring" rel="noopener noreferrer"&gt;Operational Monitoring section&lt;/a&gt; in the architecture document.&lt;/p&gt;

&lt;h2&gt;
  
  
  Path Sensitivity Model
&lt;/h2&gt;

&lt;p&gt;File paths can reveal sensitive context even when file contents are not exposed (e.g., &lt;code&gt;/hr/layoffs/2026/&lt;/code&gt; or &lt;code&gt;/legal/mna/target-company/&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Recommended controls:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Store &lt;code&gt;raw_path&lt;/code&gt; only in the restricted metadata table&lt;/li&gt;
&lt;li&gt;Expose &lt;code&gt;hashed_path&lt;/code&gt; or &lt;code&gt;anonymized_path&lt;/code&gt; to general users&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;path_classification&lt;/code&gt;: public / internal / restricted / confidential&lt;/li&gt;
&lt;li&gt;Apply Lake Formation grants to curated views, not the base table&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Raw Data Access Boundary
&lt;/h2&gt;

&lt;p&gt;This architecture governs &lt;strong&gt;metadata access&lt;/strong&gt; through S3 Tables and Lake Formation. It does not automatically replace:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ONTAP/NFS/SMB file-system permissions&lt;/li&gt;
&lt;li&gt;S3 Access Point resource policies&lt;/li&gt;
&lt;li&gt;IAM permissions for raw file reads&lt;/li&gt;
&lt;li&gt;Application-level authorization&lt;/li&gt;
&lt;li&gt;Downstream use of presigned URLs or copied files&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Treat metadata governance and raw data governance as two linked but separate control planes. Both must be configured for end-to-end security.&lt;/p&gt;

&lt;h3&gt;
  
  
  S3 Access Point Identity Boundary
&lt;/h3&gt;

&lt;p&gt;Each FSx for ONTAP S3 Access Point has an associated file-system identity (&lt;code&gt;OntapFileSystemIdentity&lt;/code&gt; — UNIX UID/GID or Windows domain user). All file access through that AP is authorized as that identity.&lt;/p&gt;

&lt;p&gt;For each access point, document:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;IAM principals allowed to use the access point&lt;/li&gt;
&lt;li&gt;Access point policy (allowed S3 actions)&lt;/li&gt;
&lt;li&gt;Associated UNIX or Windows file-system identity&lt;/li&gt;
&lt;li&gt;Allowed volume / prefix scope&lt;/li&gt;
&lt;li&gt;Whether the identity can access files beyond what metadata governance intends&lt;/li&gt;
&lt;li&gt;Audit evidence location&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;If the AI enrichment access point uses a broad UNIX identity (e.g., root or a service account with wide read access), metadata-level Lake Formation controls do not prevent raw file reads through that AP. Scope the AP identity to minimum required access.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;See &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/security/s3-access-point-identity-matrix.yaml" rel="noopener noreferrer"&gt;&lt;code&gt;security/s3-access-point-identity-matrix.yaml&lt;/code&gt;&lt;/a&gt; for the template.&lt;/p&gt;

&lt;h3&gt;
  
  
  Permission Identity Strategy
&lt;/h3&gt;

&lt;p&gt;For multiprotocol environments (NFS + SMB + S3 AP):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Record &lt;code&gt;discovery_protocol&lt;/code&gt;: nfs / smb / s3ap&lt;/li&gt;
&lt;li&gt;Record &lt;code&gt;access_point_identity_type&lt;/code&gt;: unix / windows&lt;/li&gt;
&lt;li&gt;Record &lt;code&gt;effective_reader_identity&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Record &lt;code&gt;permission_source&lt;/code&gt;: nfs_mode / ntfs_acl / mixed&lt;/li&gt;
&lt;li&gt;Do not assume metadata visibility implies raw file readability&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Retention and Deletion Semantics
&lt;/h2&gt;

&lt;p&gt;This PoC uses metadata records to represent file discovery and enrichment state. For regulated workloads, define:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Metadata retention period (how long to keep catalog records)&lt;/li&gt;
&lt;li&gt;Raw file retention period (governed by storage policy, not this catalog)&lt;/li&gt;
&lt;li&gt;Anonymized metadata retention period&lt;/li&gt;
&lt;li&gt;Deletion request workflow (who can request, who approves, how it's executed)&lt;/li&gt;
&lt;li&gt;Snapshot expiration impact on deletion (Iceberg time travel may expose deleted metadata until snapshots expire)&lt;/li&gt;
&lt;li&gt;Audit evidence retention (keep deletion evidence longer than the data itself)&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: Iceberg time travel is useful for recovery, but it means deleted metadata may still be queryable during the snapshot retention window. Align snapshot expiration with your data deletion SLA.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Snowflake-side retention&lt;/strong&gt;: If redacted metadata is synced into Snowflake-managed tables, define Snowflake-side retention, Time Travel (default 1 day, up to 90 days), and Fail-safe (7 days, non-configurable) separately from Iceberg snapshot retention. Deletion from the Snowflake copy does not delete from the Iceberg source, and vice versa.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Approval Evidence Template (for Regulated Industries)
&lt;/h3&gt;

&lt;p&gt;For organizations requiring formal access approval documentation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Approval ID: &amp;lt;unique-id&amp;gt;
Data owner: &amp;lt;name/group&amp;gt;
Security owner: &amp;lt;name/group&amp;gt;
Platform owner: &amp;lt;name/group&amp;gt;
Allowed metadata columns: &amp;lt;columns&amp;gt;
Allowed raw file prefixes: &amp;lt;prefixes&amp;gt;
Allowed operations: metadata query only / raw file read / anonymized export
Review date: &amp;lt;date&amp;gt;
Expiration date: &amp;lt;date&amp;gt;
Evidence location: verification-evidence/&amp;lt;path&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Regulated Workload Readiness
&lt;/h2&gt;

&lt;p&gt;For public sector, healthcare, financial services, and other regulated industries, validate the following before production deployment:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Area&lt;/th&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;th&gt;Status in this PoC&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Data residency&lt;/td&gt;
&lt;td&gt;Metadata and raw files in same AWS Region&lt;/td&gt;
&lt;td&gt;✅ Single region (ap-northeast-1)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Encryption at rest&lt;/td&gt;
&lt;td&gt;S3 Tables: SSE-S3; FSx: at-rest encryption&lt;/td&gt;
&lt;td&gt;✅ Default encryption&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Encryption in transit&lt;/td&gt;
&lt;td&gt;TLS 1.2+ for all API calls&lt;/td&gt;
&lt;td&gt;✅ AWS default&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Raw data access boundary&lt;/td&gt;
&lt;td&gt;File reads governed by S3 AP policy + ONTAP permissions&lt;/td&gt;
&lt;td&gt;✅ Documented&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Metadata access boundary&lt;/td&gt;
&lt;td&gt;Lake Formation table-level + CloudTrail audit&lt;/td&gt;
&lt;td&gt;✅ Verified&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI processing data flow&lt;/td&gt;
&lt;td&gt;Content sent to Bedrock API, not stored by provider&lt;/td&gt;
&lt;td&gt;✅ Per AWS data protection policy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PII detection limitations&lt;/td&gt;
&lt;td&gt;English (Comprehend) + Japanese (Claude) only&lt;/td&gt;
&lt;td&gt;⚠️ Other languages not covered&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Human review workflow&lt;/td&gt;
&lt;td&gt;Low-confidence queue defined&lt;/td&gt;
&lt;td&gt;✅ Design documented&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audit log retention&lt;/td&gt;
&lt;td&gt;CloudTrail 90-day default; configure Trail for longer&lt;/td&gt;
&lt;td&gt;⚠️ Requires Trail setup&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deletion SLA&lt;/td&gt;
&lt;td&gt;Define separately for metadata, raw files, and snapshots&lt;/td&gt;
&lt;td&gt;⚠️ Requires policy definition&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Legal/compliance sign-off&lt;/td&gt;
&lt;td&gt;Not in scope for this PoC&lt;/td&gt;
&lt;td&gt;❌ Required before production&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;AI governance note&lt;/strong&gt;: AI enrichment in this pattern is assistive metadata generation. It does not constitute authoritative regulatory classification. Final classification decisions, data handling approvals, and compliance certifications must be confirmed by data owners, security teams, legal counsel, and compliance officers.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Cross-Platform Access: The Current Reality
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Fully Verified ✅
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Access Method&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Athena&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Direct query via Glue federated catalog&lt;/td&gt;
&lt;td&gt;✅ Fully verified&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Lambda/Python&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;PyIceberg SDK&lt;/td&gt;
&lt;td&gt;✅ Fully verified&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;EMR Spark&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Glue Iceberg REST (EMR 7.13.0+)&lt;/td&gt;
&lt;td&gt;✅ Fully verified (SELECT, COUNT, time travel)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Snowflake&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Glue Iceberg REST + VENDED_CREDENTIALS&lt;/td&gt;
&lt;td&gt;✅ Fully verified (CREATE TABLE, SELECT, COUNT, DESCRIBE, AUTO_REFRESH)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Snowflake&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;External Stage (FSx S3 AP) + TO_FILE + Cortex AI&lt;/td&gt;
&lt;td&gt;✅ Fully verified&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Expected / Requires Validation ⚠️
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Access Method&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;EMR Trino&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Glue Iceberg REST (EMR 7.13.0+)&lt;/td&gt;
&lt;td&gt;⚠️ Expected (same EMR SigV4 handling as Spark)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Redshift Spectrum&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Same as Athena (Glue catalog)&lt;/td&gt;
&lt;td&gt;⚠️ Expected, not fully validated&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  What Doesn't Work (Yet) ⚠️
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Tested method&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;th&gt;Tested&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Databricks SQL Warehouse&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;CREATE CONNECTION TYPE iceberg_rest&lt;/code&gt; to S3 Tables REST&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CONNECTION_TYPE_NOT_SUPPORTED&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;2026-05-31&lt;/td&gt;
&lt;td&gt;Observed limitation in this path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Databricks Spark cluster&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Iceberg REST + SigV4 via spark.conf.set / cluster config&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;NO_SUCH_CATALOG_EXCEPTION&lt;/code&gt; (UC blocks external catalog registration)&lt;/td&gt;
&lt;td&gt;2026-06-01&lt;/td&gt;
&lt;td&gt;Confirmed: UC Foreign Catalog required&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Databricks Delta Sharing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Delta Sharing server accessing S3 AP-backed storage&lt;/td&gt;
&lt;td&gt;Sharing server uses same UC storage credentials; cannot bypass session policy&lt;/td&gt;
&lt;td&gt;2026-06-01&lt;/td&gt;
&lt;td&gt;Confirmed limitation (not a workaround for S3 AP)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Databricks NFS → UC Volume&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;NFS mount path as UC External Volume&lt;/td&gt;
&lt;td&gt;Cloud storage URIs only (s3://, abfss://, gs://); NFS/FUSE paths not supported&lt;/td&gt;
&lt;td&gt;2026-06-01&lt;/td&gt;
&lt;td&gt;Confirmed limitation; internal feature request exists&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Snowflake&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;External Iceberg Table with S3 Tables direct REST endpoint&lt;/td&gt;
&lt;td&gt;Not a supported catalog type (use Glue REST instead)&lt;/td&gt;
&lt;td&gt;2026-05-31&lt;/td&gt;
&lt;td&gt;Use Glue REST + VENDED_CREDENTIALS (✅ verified)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Snowflake&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;CATALOG INTEGRATION with default ACCESS_DELEGATION_MODE&lt;/td&gt;
&lt;td&gt;Defaults to EXTERNAL_VOLUME_CREDENTIALS which triggers ListObjectsV2 (rejected by S3 Tables)&lt;/td&gt;
&lt;td&gt;2026-06-02&lt;/td&gt;
&lt;td&gt;✅ Resolved: set explicit &lt;code&gt;ACCESS_DELEGATION_MODE = VENDED_CREDENTIALS&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Snowflake&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Lake Formation column-level via VENDED_CREDENTIALS&lt;/td&gt;
&lt;td&gt;AllowFullTableExternalDataAccess=false blocks all VENDED_CREDENTIALS access&lt;/td&gt;
&lt;td&gt;2026-06-08&lt;/td&gt;
&lt;td&gt;Use Snowflake Horizon (Row Access Policy / Dynamic Masking) for column governance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Snowflake Open Catalog&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Polaris as Iceberg catalog&lt;/td&gt;
&lt;td&gt;Not tested&lt;/td&gt;
&lt;td&gt;TBD&lt;/td&gt;
&lt;td&gt;Strategic alternative&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Databricks: Three Integration Paths to Validate
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Update note (2026-06-09)&lt;/strong&gt;: We revalidated the S3 Tables path after Databricks announced GA for Foreign Iceberg and credential vending (May 28, 2026). Glue Connection creation and credential configuration succeeded, but Unity Catalog External Location validation still failed because S3 Tables internal buckets reject standard S3 API validation (HeadBucket/ListBucket). The S3 Tables path remains blocked in this tested Databricks UC configuration. A new Databricks support case has been submitted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For Databricks business users&lt;/strong&gt;: The value is not only table access. The value is turning previously invisible NAS files into governed metadata assets that can be searched, explained, lineage-tracked, and consumed from Databricks SQL, AI/BI, and dashboards.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In this PoC, &lt;code&gt;CREATE CONNECTION TYPE iceberg_rest&lt;/code&gt; to the S3 Tables REST endpoint returned &lt;code&gt;CONNECTION_TYPE_NOT_SUPPORTED&lt;/code&gt; on Databricks SQL Warehouse (tested 2026-05-31). This does not mean Databricks lacks Iceberg REST support — Databricks provides Unity Catalog Iceberg REST endpoints and Foreign Iceberg capabilities that evolve rapidly.&lt;/p&gt;

&lt;h4&gt;
  
  
  Confirmed Limitations (2026-06-01)
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Path&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;th&gt;Confirmed by&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Spark cluster + Iceberg REST (spark.conf.set / cluster config)&lt;/td&gt;
&lt;td&gt;❌ UC blocks external catalog registration&lt;/td&gt;
&lt;td&gt;Databricks support + our testing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Delta Sharing via S3 Access Point&lt;/td&gt;
&lt;td&gt;❌ Sharing server uses same UC storage credentials&lt;/td&gt;
&lt;td&gt;Databricks support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NFS mount path as UC External Volume&lt;/td&gt;
&lt;td&gt;❌ Cloud storage URIs only (s3://, abfss://, gs://)&lt;/td&gt;
&lt;td&gt;Databricks support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DataSync → S3 → UC External Delta Table → Delta Sharing&lt;/td&gt;
&lt;td&gt;✅ Works (Delta format required)&lt;/td&gt;
&lt;td&gt;Databricks support&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Delta Sharing note&lt;/strong&gt;: Delta Sharing is not a workaround for the FSx S3 Access Point session policy limitation in our tested path. The sharing server uses the same UC storage credentials and cannot bypass the session policy that blocks S3 AP ARNs. Note that Databricks has &lt;a href="https://www.databricks.com/blog/announcing-first-class-support-iceberg-format-databricks-delta-sharing" rel="noopener noreferrer"&gt;announced first-class Iceberg format support in Delta Sharing&lt;/a&gt; (Jan 2026), enabling providers to share Iceberg tables via the Iceberg REST Catalog API. This broader capability is not contradicted by our finding — our limitation is specific to S3 AP-backed storage access through UC credentials, not Delta Sharing's format support in general.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NFS Volume note&lt;/strong&gt;: UC External Volumes require cloud storage URIs. An internal feature request (AHA) exists for EFS/NFS access via UC. Until this is implemented, DataSync → S3 → UC External Location remains the only supported path.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📢 Databricks users&lt;/strong&gt;: If S3 Tables access from Databricks is important for your workflow, the UC Foreign Catalog for S3 Tables feature is being tracked internally by Databricks (request DB-I-15824). Contact your Databricks account team to express interest and increase prioritization. Snowflake achieved full S3 Tables access via VENDED_CREDENTIALS in June 2026 — the same architectural pattern should be feasible for UC.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Immediate workaround for Databricks&lt;/strong&gt;: Use &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/docs/en/datasync-to-s3-guide.md" rel="noopener noreferrer"&gt;DataSync → S3 → UC External Table&lt;/a&gt; to sync metadata into a standard S3 location accessible by Unity Catalog. This is not zero-copy for the synced metadata, but raw files remain on FSx for ONTAP.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Path 1: Spark cluster + Iceberg REST (SigV4)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Best for technical validation and batch processing. Two endpoint options:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Tested 2026-06-01&lt;/strong&gt;: On Databricks with Unity Catalog enabled, external Iceberg catalogs cannot be registered via &lt;code&gt;spark.conf.set&lt;/code&gt; or cluster Spark config. Unity Catalog controls catalog registration exclusively. Both Serverless (&lt;code&gt;CONFIG_NOT_AVAILABLE&lt;/code&gt;) and All-Purpose clusters (&lt;code&gt;NO_SUCH_CATALOG_EXCEPTION&lt;/code&gt;) fail. &lt;strong&gt;Unity Catalog Foreign Catalog (Path 2) is the required approach.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="c"&gt;# Path 1a: Direct S3 Tables REST endpoint (used in this PoC)
&lt;/span&gt;&lt;span class="py"&gt;spark.sql.catalog.s3tables.uri&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;https://s3tables.ap-northeast-1.amazonaws.com/iceberg&lt;/span&gt;
&lt;span class="py"&gt;spark.sql.catalog.s3tables.rest.signing-name&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;s3tables&lt;/span&gt;

&lt;span class="c"&gt;# Path 1b: AWS Glue Iceberg REST endpoint (recommended for production + Lake Formation)
&lt;/span&gt;&lt;span class="py"&gt;spark.sql.catalog.s3tables.uri&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;https://glue.ap-northeast-1.amazonaws.com/iceberg&lt;/span&gt;
&lt;span class="py"&gt;spark.sql.catalog.s3tables.rest.signing-name&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;glue&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Common config for both:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="py"&gt;spark.sql.catalog.s3tables&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;org.apache.iceberg.spark.SparkCatalog&lt;/span&gt;
&lt;span class="py"&gt;spark.sql.catalog.s3tables.catalog-impl&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;org.apache.iceberg.rest.RESTCatalog&lt;/span&gt;
&lt;span class="py"&gt;spark.sql.catalog.s3tables.warehouse&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;arn:aws:s3tables:ap-northeast-1:&amp;lt;ACCOUNT&amp;gt;:bucket/fsxn-metadata-catalog&lt;/span&gt;
&lt;span class="py"&gt;spark.sql.catalog.s3tables.rest.sigv4-enabled&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;true&lt;/span&gt;
&lt;span class="py"&gt;spark.sql.catalog.s3tables.rest.signing-region&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;ap-northeast-1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Path 2: Unity Catalog Foreign Iceberg&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Register external Iceberg tables into Unity Catalog if supported for the target catalog/storage path. Best for Databricks governance, lineage, and discovery. Verify refresh semantics and read/write limitations. Retested for S3 Tables on 2026-06-09: Glue Connection and credentials succeeded, but UC External Location validation failed because S3 Tables internal buckets reject standard S3 API validation. This path remains blocked for the tested S3 Tables configuration.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Documentation/version note&lt;/strong&gt;: Databricks Iceberg capabilities are evolving rapidly. Earlier documentation and our initial validation showed limitations around Foreign Iceberg credential vending and automatic refresh behavior. After the &lt;a href="https://www.databricks.com/blog/unity-catalog-and-next-era-apache-icebergtm" rel="noopener noreferrer"&gt;May 2026 GA announcement&lt;/a&gt;, we revalidated the S3 Tables path on 2026-06-09. Credential configuration progressed further, but the tested path still failed at UC External Location validation against S3 Tables internal storage. Additionally, Databricks supports &lt;a href="https://docs.databricks.com/aws/en/query-federation/hms-federation-glue" rel="noopener noreferrer"&gt;catalog federation with AWS Glue&lt;/a&gt; (Hive Metastore type), which can expose Glue-registered tables in UC. Whether a future Iceberg REST catalog federation path could bypass the S3 Tables internal bucket constraint is an open question.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Refresh semantics&lt;/strong&gt;: If UC Foreign Iceberg works for S3 Tables via Glue REST, define refresh semantics explicitly. Our metadata catalog is append-only (new records added on file events). Analysts should know whether Databricks reads the latest Iceberg snapshot automatically or only after &lt;code&gt;REFRESH FOREIGN TABLE&lt;/code&gt;. Without auto-refresh, Athena and Databricks may show temporarily different results until the next refresh cycle. Plan for a scheduled refresh job or event-driven trigger.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AWS reference for this path&lt;/strong&gt;: AWS has published &lt;a href="https://aws.amazon.com/blogs/big-data/access-amazon-s3-iceberg-tables-from-databricks-using-aws-glue-iceberg-rest-catalog-in-amazon-sagemaker-lakehouse" rel="noopener noreferrer"&gt;guidance on accessing S3 Iceberg tables from Databricks using the Glue Iceberg REST Catalog&lt;/a&gt;. This validates the architectural direction of B-4/B-5, though S3 Tables-specific compatibility requires separate validation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Path 3: AWS Glue Catalog Federation with Databricks&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AWS Glue can federate metadata from Databricks Unity Catalog for Iceberg tables. This is the reverse direction but useful for cross-platform governance patterns.&lt;/p&gt;

&lt;h4&gt;
  
  
  Federation Directionality
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;Direction&lt;/th&gt;
&lt;th&gt;Primary governance&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;UC Foreign Catalog / Catalog Federation to Glue&lt;/td&gt;
&lt;td&gt;Databricks reads AWS-managed metadata&lt;/td&gt;
&lt;td&gt;Unity Catalog&lt;/td&gt;
&lt;td&gt;Databricks users querying AWS Iceberg (S3 Tables)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS Glue federation to UC&lt;/td&gt;
&lt;td&gt;AWS reads Databricks-managed metadata&lt;/td&gt;
&lt;td&gt;Lake Formation / Glue&lt;/td&gt;
&lt;td&gt;Athena/EMR/Redshift reading UC Iceberg/UniForm&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;AWS reference&lt;/strong&gt;: AWS has published guidance on &lt;a href="https://aws.amazon.com/blogs/big-data/access-amazon-s3-iceberg-tables-from-databricks-using-aws-glue-iceberg-rest-catalog-in-amazon-sagemaker-lakehouse" rel="noopener noreferrer"&gt;accessing S3 Iceberg tables from Databricks using AWS Glue Iceberg REST Catalog&lt;/a&gt;, and on &lt;a href="https://docs.aws.amazon.com/lake-formation/latest/dg/catalog-federation-databricks.html" rel="noopener noreferrer"&gt;federating Databricks Unity Catalog data into AWS Glue Data Catalog&lt;/a&gt;. Both directions are documented.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Iceberg here (not Delta Lake)?&lt;/strong&gt; This architecture uses Iceberg because S3 Tables is Iceberg-native, and the Iceberg REST endpoint enables multi-engine access (Athena, EMR, Snowflake). For Databricks-only environments, Delta Lake on S3 remains the natural choice. This pattern targets &lt;strong&gt;multi-platform&lt;/strong&gt; scenarios.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Databricks UC Audit Logging for External Engines (Confirmed 2026-06-01)
&lt;/h3&gt;

&lt;p&gt;External engine access via the UC Iceberg REST Catalog endpoint is fully auditable:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Audit aspect&lt;/th&gt;
&lt;th&gt;Confirmed behavior&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Metadata requests (listNamespaces, listTables, loadTable)&lt;/td&gt;
&lt;td&gt;✅ Logged in &lt;code&gt;system.access.audit&lt;/code&gt; under &lt;code&gt;uniformIcebergRestCatalog&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vended credential issuance&lt;/td&gt;
&lt;td&gt;✅ Logged as &lt;code&gt;loadTableCredentials&lt;/code&gt; / &lt;code&gt;generateTemporaryTableCredential&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audit fields&lt;/td&gt;
&lt;td&gt;user_identity, source_ip_address, user_agent, event_time, action_name, request_params&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Distinguish external vs internal&lt;/td&gt;
&lt;td&gt;✅ &lt;code&gt;service_name = 'uniformIcebergRestCatalog'&lt;/code&gt; (external) vs &lt;code&gt;'unityCatalog'&lt;/code&gt; (internal)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Databricks audit logs record credential issuance, not individual S3 file reads after credentials are vended. Complement with AWS CloudTrail + S3 access logging for file-level audit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Databricks integration documentation&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For UC Foreign Catalog validation steps, see &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/databricks/uc-foreign-iceberg-validation.md" rel="noopener noreferrer"&gt;&lt;code&gt;databricks/uc-foreign-iceberg-validation.md&lt;/code&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;For coexistence planning, see &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/databricks/coexistence-roadmap.md" rel="noopener noreferrer"&gt;&lt;code&gt;databricks/coexistence-roadmap.md&lt;/code&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;For audit investigation, see &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/databricks/audit-correlation-guide.md" rel="noopener noreferrer"&gt;&lt;code&gt;databricks/audit-correlation-guide.md&lt;/code&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Snowflake: S3 Tables via Glue REST + VENDED_CREDENTIALS ✅
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Working Configuration (Verified 2026-06-05)
&lt;/h4&gt;

&lt;p&gt;Snowflake can directly query S3 Tables Iceberg tables via the Glue Iceberg REST endpoint with &lt;code&gt;VENDED_CREDENTIALS&lt;/code&gt;. Here's the complete working setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- 1. Catalog Integration (CRITICAL: explicit ACCESS_DELEGATION_MODE)&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt; &lt;span class="k"&gt;CATALOG&lt;/span&gt; &lt;span class="n"&gt;INTEGRATION&lt;/span&gt; &lt;span class="n"&gt;s3tables_glue_rest_int&lt;/span&gt;
  &lt;span class="n"&gt;CATALOG_SOURCE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ICEBERG_REST&lt;/span&gt;
  &lt;span class="n"&gt;TABLE_FORMAT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ICEBERG&lt;/span&gt;
  &lt;span class="n"&gt;CATALOG_NAMESPACE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'metadata'&lt;/span&gt;
  &lt;span class="n"&gt;REST_CONFIG&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;CATALOG_URI&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'https://glue.ap-northeast-1.amazonaws.com/iceberg'&lt;/span&gt;
    &lt;span class="n"&gt;CATALOG_API_TYPE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AWS_GLUE&lt;/span&gt;
    &lt;span class="k"&gt;CATALOG_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'&amp;lt;ACCOUNT_ID&amp;gt;:s3tablescatalog/fsxn-metadata-catalog'&lt;/span&gt;
    &lt;span class="n"&gt;ACCESS_DELEGATION_MODE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;VENDED_CREDENTIALS&lt;/span&gt;  &lt;span class="c1"&gt;-- MUST be explicit&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;REST_AUTHENTICATION&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;TYPE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SIGV4&lt;/span&gt;
    &lt;span class="n"&gt;SIGV4_IAM_ROLE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'arn:aws:iam::&amp;lt;ACCOUNT_ID&amp;gt;:role/fsxn-snowflake-verification-role'&lt;/span&gt;
    &lt;span class="n"&gt;SIGV4_SIGNING_REGION&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'ap-northeast-1'&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;ENABLED&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- 2. Schema WITHOUT default EXTERNAL_VOLUME (critical)&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;SCHEMA&lt;/span&gt; &lt;span class="n"&gt;FSXN_LAKEHOUSE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;S3TABLES_VENDED&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="n"&gt;USE&lt;/span&gt; &lt;span class="k"&gt;SCHEMA&lt;/span&gt; &lt;span class="n"&gt;FSXN_LAKEHOUSE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;S3TABLES_VENDED&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- 3. Table WITHOUT EXTERNAL_VOLUME parameter (critical)&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;ICEBERG&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;s3tables_vended_creds_test&lt;/span&gt;
  &lt;span class="k"&gt;CATALOG&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'s3tables_glue_rest_int'&lt;/span&gt;
  &lt;span class="n"&gt;CATALOG_TABLE_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'unstructured_files'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;AWS prerequisites&lt;/strong&gt; (must be completed before Snowflake configuration):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Register S3 Tables resource with Lake Formation (--with-federation is REQUIRED)&lt;/span&gt;
aws lakeformation register-resource &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-arn&lt;/span&gt; &lt;span class="s2"&gt;"arn:aws:s3tables:ap-northeast-1:&amp;lt;ACCOUNT_ID&amp;gt;:bucket/fsxn-metadata-catalog"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--role-arn&lt;/span&gt; &lt;span class="s2"&gt;"arn:aws:iam::&amp;lt;ACCOUNT_ID&amp;gt;:role/S3TablesRoleForLakeFormation"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--with-federation&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--region&lt;/span&gt; ap-northeast-1

&lt;span class="c"&gt;# Grant SELECT + DESCRIBE to Snowflake's IAM role (table-level)&lt;/span&gt;
aws lakeformation grant-permissions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--principal&lt;/span&gt; &lt;span class="s1"&gt;'{"DataLakePrincipalIdentifier":"arn:aws:iam::&amp;lt;ACCOUNT_ID&amp;gt;:role/fsxn-snowflake-verification-role"}'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource&lt;/span&gt; &lt;span class="s1"&gt;'{"Table":{"CatalogId":"&amp;lt;ACCOUNT_ID&amp;gt;:s3tablescatalog/fsxn-metadata-catalog","DatabaseName":"metadata","Name":"unstructured_files"}}'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--permissions&lt;/span&gt; SELECT DESCRIBE &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--region&lt;/span&gt; ap-northeast-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;IAM role policy must include: &lt;code&gt;glue:GetTable&lt;/code&gt;, &lt;code&gt;glue:GetDatabase&lt;/code&gt;, &lt;code&gt;glue:GetCatalog&lt;/code&gt;, &lt;code&gt;lakeformation:GetDataAccess&lt;/code&gt;, &lt;code&gt;s3tables:GetTableBucket&lt;/code&gt;, &lt;code&gt;s3tables:GetTable&lt;/code&gt;, &lt;code&gt;s3tables:GetNamespace&lt;/code&gt;, &lt;code&gt;s3tables:GetTableData&lt;/code&gt;, &lt;code&gt;s3tables:GetTableMetadataLocation&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verify integration&lt;/strong&gt; (expected DESCRIBE output after creation):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DESCRIBE&lt;/span&gt; &lt;span class="k"&gt;CATALOG&lt;/span&gt; &lt;span class="n"&gt;INTEGRATION&lt;/span&gt; &lt;span class="n"&gt;s3tables_glue_rest_int&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ENABLED&lt;/td&gt;
&lt;td&gt;true&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CATALOG_SOURCE&lt;/td&gt;
&lt;td&gt;ICEBERG_REST&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TABLE_FORMAT&lt;/td&gt;
&lt;td&gt;ICEBERG&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CATALOG_NAMESPACE&lt;/td&gt;
&lt;td&gt;metadata&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;REST_CONFIG&lt;/td&gt;
&lt;td&gt;{CATALOG_URI=&lt;a href="https://glue.ap-northeast-1.amazonaws.com/iceberg" rel="noopener noreferrer"&gt;https://glue.ap-northeast-1.amazonaws.com/iceberg&lt;/a&gt;, ...}&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;REST_AUTHENTICATION&lt;/td&gt;
&lt;td&gt;{TYPE=SIGV4, SIGV4_IAM_ROLE=arn:aws:iam::&amp;lt;ACCOUNT_ID&amp;gt;:role/..., ...}&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API_AWS_IAM_USER_ARN&lt;/td&gt;
&lt;td&gt;arn:aws:iam::465774455528:user/&amp;lt;snowflake-user-id&amp;gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API_AWS_EXTERNAL_ID&lt;/td&gt;
&lt;td&gt;&amp;lt;external-id-for-trust-policy&amp;gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;REFRESH_INTERVAL_SECONDS&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Setup step&lt;/strong&gt;: Copy &lt;code&gt;API_AWS_IAM_USER_ARN&lt;/code&gt; and &lt;code&gt;API_AWS_EXTERNAL_ID&lt;/code&gt; from this output into your IAM role's trust policy to allow Snowflake to assume the role.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  Verified Capabilities (2026-06-08)
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;th&gt;Performance&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;CREATE ICEBERG TABLE&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;6.5s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SELECT * LIMIT 5&lt;/td&gt;
&lt;td&gt;✅ (5 rows)&lt;/td&gt;
&lt;td&gt;1.9s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;COUNT(*)&lt;/td&gt;
&lt;td&gt;✅ (170 rows)&lt;/td&gt;
&lt;td&gt;66ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DESCRIBE TABLE&lt;/td&gt;
&lt;td&gt;✅ (23 columns)&lt;/td&gt;
&lt;td&gt;69ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ALTER ... SET AUTO_REFRESH = TRUE&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;131ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SHOW ICEBERG TABLES&lt;/td&gt;
&lt;td&gt;✅ (UNMANAGED type)&lt;/td&gt;
&lt;td&gt;567ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time Travel&lt;/td&gt;
&lt;td&gt;✅ (available, snapshot-dependent)&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Screenshots&lt;/strong&gt; (URL bar excluded, S3 Tables internal bucket masked):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm6wh3589a4rmafhc3dk6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm6wh3589a4rmafhc3dk6.png" alt="COUNT(*) = 170 rows" width="799" height="367"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;COUNT(&lt;/em&gt;) returns 170 rows in 66ms*&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvdpx1lm1mivohqyl107a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvdpx1lm1mivohqyl107a.png" alt="DESCRIBE TABLE — 23 columns" width="800" height="444"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;DESCRIBE TABLE shows all 23 Iceberg columns&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq5g2ntq9m6d2dnlsijjh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq5g2ntq9m6d2dnlsijjh.png" alt="SHOW ICEBERG TABLES — UNMANAGED type" width="800" height="369"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;SHOW ICEBERG TABLES confirms UNMANAGED type with S3TABLES_GLUE_REST catalog&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhm1ipy2r40bh2219zbmy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhm1ipy2r40bh2219zbmy.png" alt="SELECT * LIMIT 5 — data from S3 Tables" width="800" height="413"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;SELECT * LIMIT 5 returns actual file metadata from S3 Tables&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AUTO_REFRESH + Time Travel verification:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2s9dh5apfcekpcneg2fl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2s9dh5apfcekpcneg2fl.png" alt="AUTO_REFRESH: COUNT(*) = 171" width="800" height="481"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;AUTO_REFRESH verified: PyIceberg appended 1 record → Snowflake COUNT(&lt;/em&gt;) automatically updated from 170 to 171 within 30 seconds*&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnto5v44pd1s7thpol7uf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnto5v44pd1s7thpol7uf.png" alt="Time Travel: AT(OFFSET =&gt; -1200) = 170" width="800" height="457"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Time Travel: querying 20 minutes ago returns 170 (before the append), confirming snapshot history is accessible&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;About FILE_PATH&lt;/strong&gt;: The &lt;code&gt;FILE_PATH&lt;/code&gt; column shows the S3 path used during metadata ingestion (via FSx for ONTAP S3 Access Point). This is the path recorded in the Iceberg metadata catalog — it does not mean the files were copied to S3. The actual files remain on FSx for ONTAP and are accessible via NFS, SMB, or S3 Access Point depending on your application's protocol.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h4&gt;
  
  
  Key Insight: Why Previous Attempts Failed
&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;ACCESS_DELEGATION_MODE&lt;/code&gt; defaults to &lt;code&gt;EXTERNAL_VOLUME_CREDENTIALS&lt;/code&gt; when not explicitly specified. In this default mode, Snowflake validates storage access through the External Volume path, which triggers &lt;code&gt;ListObjectsV2&lt;/code&gt; against S3 Tables internal buckets — an operation that returns &lt;code&gt;MethodNotAllowed&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;With &lt;code&gt;VENDED_CREDENTIALS&lt;/code&gt; explicit:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Snowflake calls Glue REST &lt;code&gt;loadTable&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Lake Formation (via &lt;code&gt;GetTemporaryGlueTableCredentials&lt;/code&gt;) returns temporary storage credentials in the &lt;code&gt;loadTable&lt;/code&gt; response config map&lt;/li&gt;
&lt;li&gt;Snowflake uses these credentials to access data files directly by exact path&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No &lt;code&gt;ListObjectsV2&lt;/code&gt; is required&lt;/strong&gt; — Snowflake reads files by exact path from Iceberg metadata&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The Glue REST endpoint does not implement the standard Iceberg REST &lt;code&gt;/credentials&lt;/code&gt; endpoint. Credential vending works through Lake Formation's proprietary mechanism embedded in the &lt;code&gt;loadTable&lt;/code&gt; response. This is transparent to Snowflake when configured correctly.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h4&gt;
  
  
  Governance Limitation: Lake Formation Column-Level (2026-06-08)
&lt;/h4&gt;

&lt;p&gt;Lake Formation column-level filtering is &lt;strong&gt;NOT enforced&lt;/strong&gt; via the VENDED_CREDENTIALS path:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When &lt;code&gt;AllowFullTableExternalDataAccess = false&lt;/code&gt;, the entire VENDED_CREDENTIALS path is blocked&lt;/li&gt;
&lt;li&gt;Explicit column/table-level grants + &lt;code&gt;ExternalDataFilteringAllowList&lt;/code&gt; do not resolve this&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AllowFullTableExternalDataAccess = true&lt;/code&gt; is required for VENDED_CREDENTIALS to function&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Technical context&lt;/strong&gt;: &lt;code&gt;AllowFullTableExternalDataAccess&lt;/code&gt; controls whether external engines (those using Lake Formation credential vending) can access table data without per-table SELECT grants. When set to &lt;code&gt;false&lt;/code&gt;, fine-grained column/row filtering is the intended enforcement mechanism — but for S3 Tables accessed via VENDED_CREDENTIALS, this currently results in complete access denial rather than filtered access. This may be a service-specific constraint of the S3 Tables federated catalog path, or it may require additional &lt;code&gt;AllowExternalDataFiltering&lt;/code&gt; + &lt;code&gt;ExternalDataFilteringAllowList&lt;/code&gt; configuration that was not effective in our testing. A feature request has been submitted to AWS.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Workaround&lt;/strong&gt;: Use Snowflake Horizon for column-level governance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Row Access Policy: restrict by sensitivity_level&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;ACCESS&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;metadata_sensitivity_filter&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sensitivity_level&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;RETURNS&lt;/span&gt; &lt;span class="nb"&gt;BOOLEAN&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;
    &lt;span class="k"&gt;CASE&lt;/span&gt;
      &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;IS_ROLE_IN_SESSION&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'SECURITY_ADMIN'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="k"&gt;TRUE&lt;/span&gt;
      &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;sensitivity_level&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'public'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'internal'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="k"&gt;TRUE&lt;/span&gt;
      &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="k"&gt;FALSE&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;s3tables_vended_creds_test&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;ACCESS&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt;
  &lt;span class="n"&gt;metadata_sensitivity_filter&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sensitivity_level&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Dynamic Data Masking: hide embedding vectors from non-ML roles&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt; &lt;span class="n"&gt;MASKING&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;mask_embedding&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="nb"&gt;BINARY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;RETURNS&lt;/span&gt; &lt;span class="nb"&gt;BINARY&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;
    &lt;span class="k"&gt;CASE&lt;/span&gt;
      &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;IS_ROLE_IN_SESSION&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ML_ENGINEER'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;
      &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;s3tables_vended_creds_test&lt;/span&gt; &lt;span class="k"&gt;MODIFY&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt;
  &lt;span class="n"&gt;embedding_vector&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;MASKING&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;mask_embedding&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Snowflake Iceberg Access Modes (Summary)
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Access mode&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Glue REST + VENDED_CREDENTIALS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;S3 Tables direct query&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;✅ VERIFIED&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;External Stage (FSx S3 AP) + TO_FILE&lt;/td&gt;
&lt;td&gt;File AI analysis (Cortex COMPLETE)&lt;/td&gt;
&lt;td&gt;✅ VERIFIED&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Metadata sync to Snowflake table&lt;/td&gt;
&lt;td&gt;BI / Cortex Search / governance&lt;/td&gt;
&lt;td&gt;Available&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Object Store Catalog&lt;/td&gt;
&lt;td&gt;Direct metadata file read&lt;/td&gt;
&lt;td&gt;❌ Blocked (S3 Tables internal bucket)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Snowflake Open Catalog (Polaris)&lt;/td&gt;
&lt;td&gt;Alternative Iceberg catalog&lt;/td&gt;
&lt;td&gt;Not tested&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;📖 Investigation History (2026-06-01 to 2026-06-05) — click to expand&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2026-05-31&lt;/strong&gt;: Tested S3 Tables direct REST endpoint as External Iceberg catalog → not a supported catalog type.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2026-06-01&lt;/strong&gt;: Created CATALOG INTEGRATION using ICEBERG_REST + AWS_GLUE + VENDED_CREDENTIALS. DESCRIBE succeeded but CREATE ICEBERG TABLE failed with "Failed to retrieve credentials from the Catalog". Root cause identified: Glue REST does not implement &lt;code&gt;/credentials&lt;/code&gt; endpoint (UnknownOperationException).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2026-06-02&lt;/strong&gt;: AWS Support confirmed Lake Formation uses proprietary mechanism (GetTemporaryGlueTableCredentials) for credential vending, not standard Iceberg REST &lt;code&gt;/credentials&lt;/code&gt;. Snowflake Support confirmed Error 004174 occurs when &lt;code&gt;s3.access-key-id/secret/token&lt;/code&gt; absent from loadTable response.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2026-06-02&lt;/strong&gt;: Tested Object Store catalog and EXTERNAL_VOLUME_CREDENTIALS mode — both blocked by S3 Tables internal bucket rejecting ListObjectsV2.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2026-06-03&lt;/strong&gt;: Discovered &lt;code&gt;register-resource --with-federation&lt;/code&gt; was missing. After setup, loadTable response included credentials. However, CREATE TABLE still failed at storage validation (ListObjectsV2).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2026-06-05&lt;/strong&gt;: Snowflake Support identified the critical distinction: &lt;code&gt;ACCESS_DELEGATION_MODE&lt;/code&gt; defaults to &lt;code&gt;EXTERNAL_VOLUME_CREDENTIALS&lt;/code&gt;. Explicitly setting &lt;code&gt;VENDED_CREDENTIALS&lt;/code&gt; + schema without External Volume + CREATE TABLE without External Volume parameter → &lt;strong&gt;SUCCESS&lt;/strong&gt;. CREATE TABLE + SELECT both working.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2026-06-08&lt;/strong&gt;: Additional testing confirmed COUNT(*), DESCRIBE, AUTO_REFRESH, SHOW ICEBERG TABLES all working. Lake Formation column-level filtering NOT enforced via this path (AllowFullTableExternalDataAccess=false blocks all access).&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;External Stage note&lt;/strong&gt;: Snowflake External Stage against the FSx S3 Access Point alias was verified in this PoC (2026-05-31, ap-northeast-1). &lt;strong&gt;Update (2026-06-02): TO_FILE (Cortex COMPLETE multimodal) also verified working&lt;/strong&gt; — Claude Sonnet 4.5 can directly read files from FSx for ONTAP via S3 AP-backed External Stage. See &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/snowflake/external-stage-fsx-s3ap-validation.md" rel="noopener noreferrer"&gt;&lt;code&gt;snowflake/external-stage-fsx-s3ap-validation.md&lt;/code&gt;&lt;/a&gt; for exact DDL and verified operations.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Snowflake Metadata Activation Pattern
&lt;/h3&gt;

&lt;p&gt;If you sync only the metadata into Snowflake (not raw files), you preserve the zero-copy principle for actual data while enabling Snowflake-native use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Governed metadata analytics and executive dashboards&lt;/li&gt;
&lt;li&gt;File inventory and PII coverage reporting&lt;/li&gt;
&lt;li&gt;Cortex Search over redacted summaries (RAG on metadata)&lt;/li&gt;
&lt;li&gt;Snowflake Intelligence / Cortex Analyst style business Q&amp;amp;A&lt;/li&gt;
&lt;li&gt;Row Access Policies and Dynamic Masking on synced metadata&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Horizon Catalog note&lt;/strong&gt;: When metadata reaches Snowflake, Snowflake governance features such as Row Access Policies and Dynamic Masking can be applied to Snowflake-managed access paths. For external engine access via Iceberg REST, validate the exact Open Catalog / Horizon behavior for your target engine and security model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Metadata sync best practice&lt;/strong&gt;: Sync curated latest-record metadata, not the append-only base table, unless analysts explicitly need history. Preserve &lt;code&gt;scan_run_id&lt;/code&gt;, &lt;code&gt;change_type&lt;/code&gt;, and &lt;code&gt;is_deleted&lt;/code&gt; for audit and reconciliation. Use &lt;code&gt;MERGE INTO&lt;/code&gt; keyed by &lt;code&gt;file_id&lt;/code&gt; or &lt;code&gt;path_hash&lt;/code&gt; to make metadata activation idempotent. See &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/snowflake/metadata-sync-example.sql" rel="noopener noreferrer"&gt;&lt;code&gt;snowflake/metadata-sync-example.sql&lt;/code&gt;&lt;/a&gt; for the full pattern.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Governance policy mapping&lt;/strong&gt;: When syncing metadata into Snowflake, map AWS-side fields such as &lt;code&gt;sensitivity_level&lt;/code&gt;, &lt;code&gt;tenant_id&lt;/code&gt;, &lt;code&gt;pii_status&lt;/code&gt;, and &lt;code&gt;path_classification&lt;/code&gt; to Snowflake tags, masking policies, and row access policies. Track policy drift between Lake Formation and Snowflake governance. See &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/snowflake/path-decision-guide.md" rel="noopener noreferrer"&gt;&lt;code&gt;snowflake/path-decision-guide.md&lt;/code&gt;&lt;/a&gt; for the full policy mapping.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Snowflake Cortex Search Activation Pattern
&lt;/h3&gt;

&lt;p&gt;If redacted metadata and summaries are synced into Snowflake, &lt;a href="https://docs.snowflake.com/user-guide/snowflake-cortex/cortex-search/cortex-search-overview" rel="noopener noreferrer"&gt;Cortex Search&lt;/a&gt; can provide Snowflake-native enterprise search and RAG over metadata — without managing embeddings, infrastructure, or search quality tuning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Cortex Search here:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Business users can search approved metadata without operating a separate vector database&lt;/li&gt;
&lt;li&gt;RAG and enterprise search can run over redacted summaries already governed in Snowflake&lt;/li&gt;
&lt;li&gt;Search quality, embedding management, and index refresh are delegated to Snowflake-managed services&lt;/li&gt;
&lt;li&gt;This is best suited for Snowflake-first organizations that want business-facing discovery inside the AI Data Cloud&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use Cortex Search for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Executive metadata search (natural language queries over file inventory)&lt;/li&gt;
&lt;li&gt;File inventory Q&amp;amp;A (powered by LLM + retrieval)&lt;/li&gt;
&lt;li&gt;PII coverage reporting and compliance dashboards&lt;/li&gt;
&lt;li&gt;Governed search over redacted summaries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OpenSearch Serverless NextGen remains the AWS-native serving index for this PoC. Cortex Search is an optional Snowflake-native alternative for organizations that standardize on Snowflake for business discovery.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Role separation&lt;/strong&gt;: S3 Tables / Iceberg remains the metadata source of truth. OpenSearch (AWS path) or Cortex Search (Snowflake path) are serving indexes for search UX. Choose based on your primary platform. Cortex Search operates over redacted summaries and metadata synced into Snowflake, not raw files, unless the customer explicitly chooses to copy/extract document content into Snowflake (which would break the zero-copy raw data principle).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cortex Search scope&lt;/strong&gt;: Cortex Search should operate on redacted metadata and summaries by default. If raw document content is extracted or copied into Snowflake for Cortex use cases, treat that as a separate data movement decision with its own governance, retention, and cost model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Snowflake activation cost drivers&lt;/strong&gt;: Snowflake activation introduces separate cost drivers from the AWS-native catalog: warehouse compute for metadata sync tasks and dashboards, Cortex Search service usage (based on corpus size and query volume), task/stream orchestration for refresh, and small metadata storage. These costs should be modeled separately from the AWS-native catalog cost ($114/month estimate in Part 1 does not include Snowflake-side compute).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Retention alignment&lt;/strong&gt;: Confirm Snowflake account edition, table type, and retention settings before promising deletion SLAs. Snowflake Time Travel (1–90 days) and Fail-safe (7 days) operate independently from Iceberg snapshot expiration. Snowflake-side deletion evidence should be retained separately from Iceberg snapshot expiration evidence.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Snowflake Metadata Product Contract
&lt;/h3&gt;

&lt;p&gt;When activating metadata in Snowflake, expose a curated subset as the governed metadata product:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recommended curated columns:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Column&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Governance&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;file_id&lt;/td&gt;
&lt;td&gt;Unique identifier&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;business_domain&lt;/td&gt;
&lt;td&gt;Organizational grouping&lt;/td&gt;
&lt;td&gt;Row access policy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;file_type&lt;/td&gt;
&lt;td&gt;File format&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;classification&lt;/td&gt;
&lt;td&gt;AI-generated classification&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;sensitivity_level&lt;/td&gt;
&lt;td&gt;Data sensitivity tier&lt;/td&gt;
&lt;td&gt;Snowflake tag + masking policy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;pii_status&lt;/td&gt;
&lt;td&gt;PII detection result&lt;/td&gt;
&lt;td&gt;Access policy / dashboard filter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;redacted_summary&lt;/td&gt;
&lt;td&gt;AI-generated (PII-free) summary&lt;/td&gt;
&lt;td&gt;Cortex Search source column&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;owner_team&lt;/td&gt;
&lt;td&gt;Business ownership&lt;/td&gt;
&lt;td&gt;Business glossary / stewardship&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;last_seen_at&lt;/td&gt;
&lt;td&gt;Last scan timestamp&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;data_quality_status&lt;/td&gt;
&lt;td&gt;Enrichment quality flag&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Snowflake governance mapping:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;sensitivity_level&lt;/code&gt; → Snowflake tag + masking policy&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;tenant_id&lt;/code&gt; / &lt;code&gt;business_domain&lt;/code&gt; → row access policy&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;pii_status&lt;/code&gt; → access policy / dashboard filter&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;redacted_summary&lt;/code&gt; → Cortex Search source column&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;owner_team&lt;/code&gt; → business glossary / stewardship workflow&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Databricks Metadata Activation Pattern
&lt;/h3&gt;

&lt;p&gt;If UC Foreign Catalog is not yet validated for your S3 Tables path, sync only the redacted metadata into a UC-managed Delta table. This preserves the zero-copy principle for raw files while enabling Databricks-native use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Databricks SQL dashboards and executive reporting&lt;/li&gt;
&lt;li&gt;AI/BI Genie over curated metadata (natural language queries)&lt;/li&gt;
&lt;li&gt;UC lineage and audit on metadata usage&lt;/li&gt;
&lt;li&gt;ML feature generation from file metadata&lt;/li&gt;
&lt;li&gt;Operational reporting on PII coverage and enrichment backlog&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Raw files remain on FSx for ONTAP. Only the small metadata table (~MB scale for 100K files) is synced.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This is analogous to the Snowflake metadata activation pattern: it copies only curated metadata, not the original unstructured files. Both patterns preserve the zero-copy principle for raw data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Databricks Raw File Access Decision&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;th&gt;Recommended path&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Governed metadata analytics only&lt;/td&gt;
&lt;td&gt;UC Foreign Catalog (if validated) or sync metadata to UC Delta&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Raw file processing in Databricks&lt;/td&gt;
&lt;td&gt;DataSync → S3 → UC External Volume&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Zero-copy raw file access from Databricks&lt;/td&gt;
&lt;td&gt;Not supported in validated paths (NFS mount works but without UC governance)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Business discovery / BI&lt;/td&gt;
&lt;td&gt;Sync redacted metadata to UC Delta table&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If metadata is synced into Databricks for BI, include Databricks SQL / Jobs compute cost in the activation model. This does not affect raw-file zero-copy storage, but it is part of the business-facing analytics cost.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Other Lakehouse Engines to Validate
&lt;/h3&gt;

&lt;p&gt;Beyond Databricks and Snowflake, the most natural validation targets for this metadata catalog are:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Engine&lt;/th&gt;
&lt;th&gt;Access path&lt;/th&gt;
&lt;th&gt;Likely fit&lt;/th&gt;
&lt;th&gt;Validation priority&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Trino / Starburst&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Glue Iceberg REST or S3 Tables REST&lt;/td&gt;
&lt;td&gt;Federated SQL, ad hoc query&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;EMR Spark&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Glue Iceberg REST (native since EMR 7.5.0+)&lt;/td&gt;
&lt;td&gt;Bulk backfill, batch enrichment&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Redshift Spectrum&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Glue catalog (external schema)&lt;/td&gt;
&lt;td&gt;DWH integration, BI&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dremio&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Glue catalog or Iceberg REST&lt;/td&gt;
&lt;td&gt;Query acceleration, BI&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;StarRocks / Doris&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Glue Iceberg REST&lt;/td&gt;
&lt;td&gt;Low-latency serving queries&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Apache Flink&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Glue Iceberg REST&lt;/td&gt;
&lt;td&gt;Streaming metadata updates&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;dbt (via Athena)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;dbt-athena + Iceberg materialization&lt;/td&gt;
&lt;td&gt;Analytics engineering, governed marts&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Apache NiFi&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Iceberg REST or Polaris&lt;/td&gt;
&lt;td&gt;Event-driven ingestion&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These engines should be validated against:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;S3 Tables direct REST vs AWS Glue Iceberg REST&lt;/li&gt;
&lt;li&gt;Read vs write capability&lt;/li&gt;
&lt;li&gt;Lake Formation behavior (credential vending, column/row filtering)&lt;/li&gt;
&lt;li&gt;Snapshot freshness after external writes&lt;/li&gt;
&lt;li&gt;Latest-record view compatibility&lt;/li&gt;
&lt;li&gt;Case-sensitivity and lowercase naming requirements&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key finding from validation (2026-06-08)&lt;/strong&gt;: AWS Glue Iceberg REST supports SigV4-authenticated catalog access. Lake Formation credential vending works through a proprietary mechanism (&lt;code&gt;GetTemporaryGlueTableCredentials&lt;/code&gt;). &lt;strong&gt;Snowflake&lt;/strong&gt; requires explicit &lt;code&gt;ACCESS_DELEGATION_MODE = VENDED_CREDENTIALS&lt;/code&gt; — the default mode fails. Engines that can sign requests with their own IAM credentials (EMR Spark ✅ verified, Trino on EMR expected, PyIceberg ✅ verified) work out of the box. Snowflake also works when configured correctly (✅ verified 2026-06-05). &lt;strong&gt;EMR requirement: 7.13.0+&lt;/strong&gt; (7.5.0 has a credential resolution bug). &lt;strong&gt;Governance note&lt;/strong&gt;: Lake Formation column-level filtering is NOT enforced via the VENDED_CREDENTIALS path for Snowflake.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trino note&lt;/strong&gt;: AWS has published &lt;a href="https://aws.amazon.com/blogs/big-data/" rel="noopener noreferrer"&gt;guidance on querying S3 Tables from Trino using the Iceberg REST endpoint&lt;/a&gt;. Trino's Iceberg connector supports REST catalogs natively, making it one of the most straightforward third-party validation targets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;EMR Spark note&lt;/strong&gt;: For large-scale backfill or re-enrichment (100K+ files), Spark on EMR Serverless or EMR on EC2 can be used as an alternative to Lambda/Fargate. Use &lt;a href="https://docs.aws.amazon.com/glue/latest/dg/connect-glu-iceberg-rest.html" rel="noopener noreferrer"&gt;Glue Iceberg REST&lt;/a&gt; for centralized metadata access with Lake Formation governance. &lt;strong&gt;Verified (2026-06-02)&lt;/strong&gt;: EMR Serverless Spark 7.13.0 successfully reads S3 Tables metadata via Glue Iceberg REST — SHOW NAMESPACES, SHOW TABLES, SELECT, COUNT, and snapshot history all work. Requires EMR 7.13.0+ (7.5.0 has a credential resolution bug for S3 Tables warehouse format).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Redshift note&lt;/strong&gt;: Validate separately from Athena — external schema setup, Glue statistics, Lake Formation permissions, and query latency against latest-record views may differ.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For the full compatibility matrix, see &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/lakehouse-tools/tool-compatibility-matrix.yaml" rel="noopener noreferrer"&gt;&lt;code&gt;lakehouse-tools/tool-compatibility-matrix.yaml&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Catalog Authority Rule
&lt;/h3&gt;

&lt;p&gt;For each Iceberg table, define exactly one authoritative catalog for metadata pointer and commit coordination. Do not operate S3 Tables, Polaris, Gravitino, Nessie, and Glue as independent writable catalogs for the same table unless the integration explicitly supports federation without dual writes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                    ┌─────────────────────┐
                    │ Authoritative Catalog│
                    │ (ONE per table)      │
                    │ • S3 Tables + Glue   │
                    │   (this PoC)         │
                    └──────────┬──────────┘
                               │
              ┌────────────────┼────────────────┐
              │                │                │
              ▼                ▼                ▼
         Read-only        Read-only        Read-only
         consumers        consumers        consumers
         (Trino,          (Databricks,     (Snowflake,
          Dremio,          UC Foreign       Cortex,
          StarRocks)       Catalog)         Open Catalog)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Split-brain warning&lt;/strong&gt;: If two catalogs independently write to the same Iceberg table, snapshot pointers can diverge, causing data loss or corruption. Federation (one writer, many readers) is safe. Dual-write is not.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Bigger Picture
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                    S3 Tables (Iceberg)
                           │
              ┌────────────┼────────────┐
              │            │            │
              ▼            ▼            ▼
         Athena ✅    Databricks ❌  Snowflake ✅
         EMR Spark ✅  (UC Foreign    (Glue REST +
         PyIceberg ✅   path still    VENDED_CREDENTIALS
                      blocked in     verified 2026-06-05)
                      tested config)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Databricks integration summary&lt;/strong&gt; (confirmed 2026-06-01):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Direct S3 AP access: ❌ (UC session policy)&lt;/li&gt;
&lt;li&gt;NFS mount → UC Volume: ❌ (cloud URI only)&lt;/li&gt;
&lt;li&gt;Delta Sharing via S3 AP: ❌ (same credentials)&lt;/li&gt;
&lt;li&gt;DataSync → S3 → UC: ✅ (supported workaround, not zero-copy for synced data)&lt;/li&gt;
&lt;li&gt;UC Foreign Catalog / Foreign Iceberg via Glue: ❌ Retested 2026-06-09; Glue Connection and credentials succeeded, but UC External Location validation failed against S3 Tables internal storage. Support case submitted.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For capability-level details such as read, write, time travel, metadata tables, and governance behavior, see &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/verification-evidence/cross-platform-compatibility.yaml" rel="noopener noreferrer"&gt;&lt;code&gt;verification-evidence/cross-platform-compatibility.yaml&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This is a &lt;strong&gt;temporary gap&lt;/strong&gt;. S3 Tables is relatively new (GA Dec 2024), and cross-platform federation is actively being developed. Feature requests have been filed with both platforms. Timeline for native S3 Tables support is unknown, but the Iceberg ecosystem is converging rapidly — Unity Catalog 2.0's native Iceberg support and Snowflake's Open Catalog (Polaris) both point toward broader interoperability.&lt;/p&gt;

&lt;h3&gt;
  
  
  Catalog Decision Guide
&lt;/h3&gt;

&lt;p&gt;In the Iceberg world, the catalog is the system of record for table metadata pointers and atomic operations. Choose based on your primary platform:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Primary platform&lt;/th&gt;
&lt;th&gt;Recommended catalog&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;AWS-first / Athena-first&lt;/td&gt;
&lt;td&gt;S3 Tables + Glue/Lake Formation&lt;/td&gt;
&lt;td&gt;Used in this PoC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Databricks-first&lt;/td&gt;
&lt;td&gt;Unity Catalog Managed/Foreign Iceberg&lt;/td&gt;
&lt;td&gt;Best for UC governance, lineage, discovery&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Snowflake-first&lt;/td&gt;
&lt;td&gt;Snowflake Open Catalog (Polaris)&lt;/td&gt;
&lt;td&gt;Best for Snowflake-governed Iceberg interoperability; validate external engine governance behavior&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Neutral / OSS-first&lt;/td&gt;
&lt;td&gt;Apache Polaris or other REST catalog&lt;/td&gt;
&lt;td&gt;Maximum portability&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Dual catalog warning&lt;/strong&gt;: Avoid running two authoritative catalogs for the same Iceberg table. Use Snowflake Open Catalog / Polaris when Snowflake or a neutral REST catalog should be authoritative. Use S3 Tables when AWS-native Athena / Lake Formation / Glue governance is authoritative. If both platforms need access, use federation (one authoritative catalog, others read via REST).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  When to Consider Snowflake Open Catalog / Polaris
&lt;/h4&gt;

&lt;p&gt;Use S3 Tables + Glue/Lake Formation when AWS-native governance is authoritative (this PoC).&lt;/p&gt;

&lt;p&gt;Consider &lt;a href="https://docs.snowflake.com/en/user-guide/polaris-catalog" rel="noopener noreferrer"&gt;Snowflake Open Catalog&lt;/a&gt; / Polaris when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Snowflake should be the primary governance and interoperability plane&lt;/li&gt;
&lt;li&gt;Multiple engines need Iceberg REST access through a neutral catalog&lt;/li&gt;
&lt;li&gt;Snowflake-managed Iceberg or Snowflake-first AI/Data Cloud workflows are the center of gravity&lt;/li&gt;
&lt;li&gt;You want managed Polaris instead of operating your own REST catalog&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This would be a different authoritative-catalog design from the current PoC and should not be mixed as a second writer for the same table.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Databricks-first note&lt;/strong&gt;: For organizations standardizing on Databricks, consider whether the metadata catalog itself should be managed in Unity Catalog as Managed Iceberg or Delta + UniForm, then exposed to AWS engines through &lt;a href="https://docs.aws.amazon.com/lake-formation/latest/dg/catalog-federation-databricks.html" rel="noopener noreferrer"&gt;Glue federation to UC&lt;/a&gt; or the UC Iceberg REST endpoint. Use S3 Tables when AWS-native Athena/Lake Formation is the primary governance path. The choice depends on which governance plane (UC or Lake Formation) is authoritative for your organization.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  Format Decision for Databricks Environments
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Option&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;th&gt;Tradeoff&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;S3 Tables Iceberg&lt;/td&gt;
&lt;td&gt;AWS-first Athena/LF governance&lt;/td&gt;
&lt;td&gt;UC integration pending (Foreign Catalog validation)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC Managed / Foreign Iceberg&lt;/td&gt;
&lt;td&gt;Databricks-first open format governance&lt;/td&gt;
&lt;td&gt;Validate current feature availability, region support, and limitations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Delta + UniForm&lt;/td&gt;
&lt;td&gt;Databricks-native pipelines + Iceberg read compatibility&lt;/td&gt;
&lt;td&gt;Iceberg metadata generated asynchronously; non-Databricks writes constrained&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Metadata sync to Delta&lt;/td&gt;
&lt;td&gt;BI activation in Databricks SQL&lt;/td&gt;
&lt;td&gt;Metadata copy, but raw files remain zero-copy&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Summary: What We Built
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Storage&lt;/td&gt;
&lt;td&gt;FSx for ONTAP (files) + S3 Tables (metadata)&lt;/td&gt;
&lt;td&gt;✅ Verified&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI&lt;/td&gt;
&lt;td&gt;Bedrock Claude Vision + Titan Embeddings V2&lt;/td&gt;
&lt;td&gt;✅ Verified&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Search&lt;/td&gt;
&lt;td&gt;OpenSearch Serverless NextGen (scale-to-zero)&lt;/td&gt;
&lt;td&gt;✅ Verified&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Governance&lt;/td&gt;
&lt;td&gt;Lake Formation (table-level) + CloudTrail&lt;/td&gt;
&lt;td&gt;✅ Verified&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PII&lt;/td&gt;
&lt;td&gt;Comprehend (EN) + Bedrock Claude (JA)&lt;/td&gt;
&lt;td&gt;✅ Verified&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-platform&lt;/td&gt;
&lt;td&gt;Athena ✅, EMR Spark ✅, PyIceberg ✅, Snowflake ✅, Databricks ⚠️&lt;/td&gt;
&lt;td&gt;Mostly verified&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  The Numbers
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;42 seconds&lt;/strong&gt;: Full demo execution time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;$0.07&lt;/strong&gt;: Total demo cost&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Near $0 idle compute/search cost&lt;/strong&gt;: Persistent metadata, logs, and audit trails may still incur small charges&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;$114/month&lt;/strong&gt;: Projected cost at 100K files, 1000 changes/day&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;95%&lt;/strong&gt;: Storage cost reduction vs S3 full copy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;0.95&lt;/strong&gt;: AI classification confidence (invoice detection)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;7/7&lt;/strong&gt;: PII entities detected and redacted&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;For regulated workloads, align Iceberg snapshot retention with deletion SLAs and audit evidence retention.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What's Next for This Project
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Monitor support cases&lt;/strong&gt;: Databricks UC Foreign Catalog for S3 Tables — timeline unknown&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Production hardening&lt;/strong&gt;: SQS batching, DLQ alerting, reconciliation jobs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-language PII&lt;/strong&gt;: Extend beyond EN/JA to other languages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost optimization&lt;/strong&gt;: Provisioned Throughput for high-volume Bedrock usage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Production semantics&lt;/strong&gt;: File identity, latest-record views, index reconciliation, and snapshot retention alignment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ONTAP production hardening&lt;/strong&gt;: S3 Access Point identity matrix, FPolicy event filtering, SnapMirror catalog rebinding, and FSx performance dashboard&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Snowflake governance&lt;/strong&gt;: Implement Horizon Row Access Policies and Dynamic Data Masking for column-level protection (since Lake Formation column-level is not enforced via VENDED_CREDENTIALS)&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Get Involved
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;⭐ &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations" rel="noopener noreferrer"&gt;Star the repo&lt;/a&gt; if this was useful&lt;/li&gt;
&lt;li&gt;🐛 &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/issues" rel="noopener noreferrer"&gt;Open an Issue&lt;/a&gt; for questions or suggestions&lt;/li&gt;
&lt;li&gt;🍴 Fork and adapt for your own unstructured data catalog&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;This concludes the 3-part series. All code is at &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/fsxn-lakehouse-integrations&lt;/a&gt;. Questions? Open a GitHub Issue.&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Governance disclaimer&lt;/strong&gt;: This article provides governance guidance and architectural patterns. It does not substitute for legal or compliance judgment. Final regulatory determinations should be confirmed with legal and compliance teams.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>aws</category>
      <category>iceberg</category>
      <category>s3tables</category>
      <category>amazonfsxfornetappontap</category>
    </item>
    <item>
      <title>AI Enrichment Pipeline: From Sample Classification to 100K-File Metadata Search with Bedrock and OpenSearch NextGen</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Mon, 08 Jun 2026 16:37:44 +0000</pubDate>
      <link>https://dev.to/aws-builders/ai-enrichment-pipeline-from-sample-classification-to-100k-file-metadata-search-with-bedrock-and-1imb</link>
      <guid>https://dev.to/aws-builders/ai-enrichment-pipeline-from-sample-classification-to-100k-file-metadata-search-with-bedrock-and-1imb</guid>
      <description>&lt;h2&gt;
  
  
  Quick Recap: What We Built in Part 1
&lt;/h2&gt;

&lt;p&gt;In &lt;a href="https://dev.to/aws-builders/from-hours-to-seconds-an-ai-powered-metadata-catalog-for-unstructured-data-on-fsx-for-ontap-5f54"&gt;Part 1&lt;/a&gt;, we built a metadata catalog on Apache Iceberg (S3 Tables) that makes unstructured files on FSx for ONTAP instantly searchable via Athena SQL — in under 2 seconds, at $5-15/month for 100K files, without bulk-copying raw files.&lt;/p&gt;

&lt;p&gt;But basic metadata (file name, size, type) only gets you so far. &lt;strong&gt;What if you could ask&lt;/strong&gt;: "Find all invoices from Q4" or "Show me files similar to this contract"?&lt;/p&gt;

&lt;p&gt;That requires &lt;strong&gt;AI enrichment&lt;/strong&gt;: automatically classifying files and generating vector embeddings for similarity search.&lt;/p&gt;

&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;File on FSx for ONTAP
       │
       │ S3 Access Point (read)
       ▼
┌─────────────────────────────────────────┐
│  Bedrock Claude Vision                  │
│  "What is this file?"                   │
│  → classification: "invoice"            │
│  → confidence: 0.95                     │
│  → summary: "Invoice #INV-2024-..."     │
└──────────────────┬──────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────┐
│  Titan Embeddings V2                    │
│  "Represent this file as a vector"      │
│  → 1024-dimensional embedding           │
│  → normalized for cosine similarity     │
└──────────────────┬──────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────┐
│  S3 Tables (Iceberg)                    │
│  classification, confidence_score,      │
│  summary, embedding_vector              │
└──────────────────┬──────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────┐
│  OpenSearch Serverless NextGen          │
│  kNN vector search                      │
│  "Find files similar to X"              │
│  Scale-to-zero: $0 when idle            │
└─────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  AI Classification: Bedrock Claude Vision
&lt;/h2&gt;

&lt;h3&gt;
  
  
  How It Works
&lt;/h3&gt;

&lt;p&gt;For image files (PNG, JPEG, TIFF), we send the file to Claude Vision with a simple prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bedrock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic.claude-3-haiku-20240307-v1:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic_version&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bedrock-2023-05-31&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;base64&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;media_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image/png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;image_b64&lt;/span&gt;
            &lt;span class="p"&gt;}},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; 
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Classify this image. Respond JSON only: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;classification&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence_score&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:0.X,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;]}]&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Results (Measured)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;File&lt;/th&gt;
&lt;th&gt;Classification&lt;/th&gt;
&lt;th&gt;Confidence&lt;/th&gt;
&lt;th&gt;Latency&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;invoice_sample.png&lt;/td&gt;
&lt;td&gt;Invoice&lt;/td&gt;
&lt;td&gt;0.95&lt;/td&gt;
&lt;td&gt;~3s&lt;/td&gt;
&lt;td&gt;$0.01&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;product_inspection.png&lt;/td&gt;
&lt;td&gt;Pie Chart&lt;/td&gt;
&lt;td&gt;1.0&lt;/td&gt;
&lt;td&gt;~2s&lt;/td&gt;
&lt;td&gt;$0.01&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;sensor_dashboard.png&lt;/td&gt;
&lt;td&gt;IoT Sensor Dashboard&lt;/td&gt;
&lt;td&gt;0.9&lt;/td&gt;
&lt;td&gt;~3s&lt;/td&gt;
&lt;td&gt;$0.01&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Key insight&lt;/strong&gt;: In this demo, Claude 3 Haiku classified sample images in ~2-3 seconds at roughly $0.01/image. Production accuracy and cost depend on image size, prompt length, model version, and document type.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Model version note&lt;/strong&gt;: Model ID &lt;code&gt;anthropic.claude-3-haiku-20240307-v1:0&lt;/code&gt; was used at time of testing. Check &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html" rel="noopener noreferrer"&gt;Bedrock model IDs&lt;/a&gt; for the latest available version.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  For Non-Image Files
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;File Type&lt;/th&gt;
&lt;th&gt;Enrichment Strategy&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;PDF&lt;/td&gt;
&lt;td&gt;Extract text → summarize with Claude&lt;/td&gt;
&lt;td&gt;$0.01-0.05&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CSV/Parquet&lt;/td&gt;
&lt;td&gt;Schema extraction + row count&lt;/td&gt;
&lt;td&gt;~$0 (metadata only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audio&lt;/td&gt;
&lt;td&gt;Transcribe → summarize&lt;/td&gt;
&lt;td&gt;$0.02-0.10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Video&lt;/td&gt;
&lt;td&gt;Frame sampling → Vision&lt;/td&gt;
&lt;td&gt;$0.05-0.50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CAD/3D&lt;/td&gt;
&lt;td&gt;Metadata extraction only&lt;/td&gt;
&lt;td&gt;~$0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Vector Embeddings: Titan Embeddings V2
&lt;/h2&gt;

&lt;p&gt;Every file gets a 1024-dimensional vector embedding based on its content or AI-generated description:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bedrock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;amazon.titan-embed-text-v2:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inputText&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;summary_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# AI-generated description
&lt;/span&gt;        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dimensions&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;normalize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;())[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;embedding&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="c1"&gt;# → [0.023, -0.041, 0.089, ...] (1024 floats)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why 1024 Dimensions?
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimensions&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Accuracy&lt;/th&gt;
&lt;th&gt;Storage&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;256&lt;/td&gt;
&lt;td&gt;Lowest&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;1KB/file&lt;/td&gt;
&lt;td&gt;High-volume, cost-sensitive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;512&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Better&lt;/td&gt;
&lt;td&gt;2KB/file&lt;/td&gt;
&lt;td&gt;General purpose&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1024&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;High&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4KB/file&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Recommended balance&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1536&lt;/td&gt;
&lt;td&gt;Higher&lt;/td&gt;
&lt;td&gt;Highest&lt;/td&gt;
&lt;td&gt;6KB/file&lt;/td&gt;
&lt;td&gt;Maximum precision&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;1024 dimensions was a practical default for this PoC. Validate 256/512/1024/1536 dimensions against your own top-k relevance and storage/cost targets (~4KB per file × 100K files = 400MB total at 1024-dim).&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Pricing note&lt;/strong&gt;: Titan Embeddings V2 charges per 1K &lt;strong&gt;input&lt;/strong&gt; tokens ($0.00002). The cost is the same whether you request 256, 512, or 1024 dimensions — so there's no cost penalty for choosing higher dimensions.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Embedding Storage in Iceberg
&lt;/h3&gt;

&lt;p&gt;Embeddings are stored as &lt;code&gt;binary&lt;/code&gt; type in the Iceberg table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;struct&lt;/span&gt;

&lt;span class="c1"&gt;# Convert float list to binary for Iceberg storage
&lt;/span&gt;&lt;span class="n"&gt;embedding_bytes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;struct&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pack&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Write to Iceberg table
&lt;/span&gt;&lt;span class="n"&gt;arrow_table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pa&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;table&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;file_id&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;embedding_vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;embedding_bytes&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;enrichment_status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;arrow_table&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Important: Append-Only Writes and Deduplication
&lt;/h3&gt;

&lt;p&gt;Iceberg on S3 Tables uses append-only writes. If you enrich the same file twice (e.g., after a retry), you'll get duplicate records. Use this dedup pattern in queries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;ranked&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ROW_NUMBER&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;file_id&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;modified_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;rn&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="nv"&gt;"s3tablescatalog/fsxn-metadata-catalog"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"metadata"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;"unstructured_files"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;ranked&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;rn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;is_deleted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;S3 Tables auto-compaction handles the storage overhead of duplicates over time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Vector Search: OpenSearch Serverless NextGen
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Scale-to-Zero Revolution
&lt;/h3&gt;

&lt;p&gt;Before May 2026, OpenSearch Serverless had a &lt;strong&gt;minimum cost of ~$350/month&lt;/strong&gt; (2 OCUs always running). This made it impractical for PoC and dev environments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenSearch Serverless NextGen&lt;/strong&gt; (GA May 2026) introduces &lt;strong&gt;scale-to-zero&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;State&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Latency&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Idle (no queries)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0/month&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cold start (first query)&lt;/td&gt;
&lt;td&gt;$0.24/OCU-hour&lt;/td&gt;
&lt;td&gt;10-30 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Warm (subsequent queries)&lt;/td&gt;
&lt;td&gt;$0.24/OCU-hour&lt;/td&gt;
&lt;td&gt;~54ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This changes the economics completely: you can keep vector search compute cost near zero until you actually need it.&lt;/p&gt;

&lt;h3&gt;
  
  
  kNN Search Implementation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;opensearchpy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenSearch&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;requests_aws4auth&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AWS4Auth&lt;/span&gt;

&lt;span class="c1"&gt;# Generate query embedding
&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;find invoice or payment documents&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# kNN search
&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fsxn-metadata-embeddings&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;size&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;knn&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;embedding_vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;k&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Vector search requires OpenSearch — you cannot perform kNN queries directly on the &lt;code&gt;embedding_vector&lt;/code&gt; binary column in Athena. The Iceberg table stores embeddings for durability; OpenSearch provides the search interface.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Search Results (Measured)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Query: "find invoice or payment documents"

Results:
  1. invoice_sample.png (score: 0.6749)
     Classification: Invoice
     Summary: "Invoice #INV-2024-..."

  2. (other similar files ranked by cosine similarity)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Score interpretation&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;0.9+: Near-identical content&lt;/li&gt;
&lt;li&gt;0.7-0.9: Highly similar&lt;/li&gt;
&lt;li&gt;0.5-0.7: Related topic&lt;/li&gt;
&lt;li&gt;&amp;lt; 0.5: Weak or no relation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our score of 0.67 for "invoice or payment documents" → &lt;code&gt;invoice_sample.png&lt;/code&gt; is reasonable — the query is broad, and the match is correct.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Improving search scores&lt;/strong&gt;: Use more specific queries ("Q4 2024 invoice from vendor ABC" vs "find invoices"), enrich files with more detailed summaries, or increase embedding dimensions to 1536 for higher precision at ~50% more storage cost.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;These score bands are demo heuristics, not universal thresholds. Calibrate thresholds with labeled examples for each document type and business workflow.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Complete Pipeline
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Processing Flow
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;New file detected (FPolicy event or batch scan)
       │
       ▼
┌─ Is it an image? ──────────────────────────┐
│  YES → Claude Vision classification        │
│  NO  → Metadata-only (file type, size)     │
└────────────────────────────────────────────┘
       │
       ▼
┌─ Generate embedding ──────────────────────┐
│  Input: classification + summary text     │
│  Output: 1024-dim normalized vector       │
└───────────────────────────────────────────┘
       │
       ▼
┌─ Write to S3 Tables (Iceberg) ────────────┐
│  classification, confidence_score,        │
│  summary, embedding_vector,               │
│  enrichment_status = "completed"          │
│  index_status = "pending"                 │
└───────────────────────────────────────────┘
       │
       ▼
┌─ Index in OpenSearch ─────────────────────┐
│  file_id, embedding_vector, metadata      │
│  (for kNN similarity search)              │
│  index_status = "indexed" / "stale" / "failed" │
└───────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Error Handling
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Error&lt;/th&gt;
&lt;th&gt;Strategy&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Bedrock ThrottlingException&lt;/td&gt;
&lt;td&gt;Exponential backoff (1s, 2s, 4s)&lt;/td&gt;
&lt;td&gt;Retry up to 3 times&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bedrock ModelNotReadyException&lt;/td&gt;
&lt;td&gt;Wait 5s, retry&lt;/td&gt;
&lt;td&gt;Model warming up (first invocation)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File read failure (S3 AP)&lt;/td&gt;
&lt;td&gt;Mark as &lt;code&gt;failed&lt;/code&gt;, retry next cycle&lt;/td&gt;
&lt;td&gt;No data loss&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Low confidence (&amp;lt; 0.3)&lt;/td&gt;
&lt;td&gt;Mark as &lt;code&gt;low_confidence&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Human review queue&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lambda timeout (large files)&lt;/td&gt;
&lt;td&gt;Fallback to ECS Fargate&lt;/td&gt;
&lt;td&gt;No timeout limit&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Monitoring the Pipeline
&lt;/h3&gt;

&lt;p&gt;How do you know when something goes wrong? Set up these CloudWatch alarms:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Alert Condition&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DLQ message count&lt;/td&gt;
&lt;td&gt;CloudWatch (SQS)&lt;/td&gt;
&lt;td&gt;&amp;gt; 0&lt;/td&gt;
&lt;td&gt;Inspect DLQ messages, redrive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lambda error rate&lt;/td&gt;
&lt;td&gt;CloudWatch (Lambda)&lt;/td&gt;
&lt;td&gt;&amp;gt; 5% for 5 min&lt;/td&gt;
&lt;td&gt;Check logs, Iceberg commit conflict?&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bedrock throttling&lt;/td&gt;
&lt;td&gt;CloudWatch (Bedrock)&lt;/td&gt;
&lt;td&gt;&amp;gt; 10/min&lt;/td&gt;
&lt;td&gt;Reduce request rate, adjust backoff&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Enrichment backlog&lt;/td&gt;
&lt;td&gt;Athena query (pending count)&lt;/td&gt;
&lt;td&gt;&amp;gt; 1000&lt;/td&gt;
&lt;td&gt;Increase Lambda concurrency or batch size&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenSearch index size&lt;/td&gt;
&lt;td&gt;OpenSearch metrics&lt;/td&gt;
&lt;td&gt;&amp;gt; 80% capacity&lt;/td&gt;
&lt;td&gt;Add shards or rotate index&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Quick health check: DLQ + Lambda errors in one command&lt;/span&gt;
aws cloudwatch get-metric-data &lt;span class="nt"&gt;--metric-data-queries&lt;/span&gt; &lt;span class="s1"&gt;'[
  {"Id":"dlq","MetricStat":{"Metric":{"Namespace":"AWS/SQS","MetricName":"ApproximateNumberOfMessagesVisible","Dimensions":[{"Name":"QueueName","Value":"fsxn-metadata-sync-dlq"}]},"Period":300,"Stat":"Sum"}},
  {"Id":"errors","MetricStat":{"Metric":{"Namespace":"AWS/Lambda","MetricName":"Errors","Dimensions":[{"Name":"FunctionName","Value":"fsxn-metadata-sync"}]},"Period":300,"Stat":"Sum"}}
]'&lt;/span&gt; &lt;span class="nt"&gt;--start-time&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; &lt;span class="nt"&gt;-v-1H&lt;/span&gt; +%Y-%m-%dT%H:%M:%S&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="nt"&gt;--end-time&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; +%Y-%m-%dT%H:%M:%S&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For detailed operational monitoring guidance, see the &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/docs/en/iceberg-metadata-catalog.md#operational-monitoring" rel="noopener noreferrer"&gt;Operational Monitoring section&lt;/a&gt; in the architecture document.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cost at Scale
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Volume&lt;/th&gt;
&lt;th&gt;AI Cost&lt;/th&gt;
&lt;th&gt;Embedding Cost&lt;/th&gt;
&lt;th&gt;OpenSearch&lt;/th&gt;
&lt;th&gt;Total&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;100 files/day&lt;/td&gt;
&lt;td&gt;$1/day&lt;/td&gt;
&lt;td&gt;$0.002/day&lt;/td&gt;
&lt;td&gt;$0 (idle)&lt;/td&gt;
&lt;td&gt;~$30/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1,000 files/day&lt;/td&gt;
&lt;td&gt;$10/day&lt;/td&gt;
&lt;td&gt;$0.02/day&lt;/td&gt;
&lt;td&gt;~$42/month&lt;/td&gt;
&lt;td&gt;~$342/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10,000 files/day&lt;/td&gt;
&lt;td&gt;$100/day&lt;/td&gt;
&lt;td&gt;$0.20/day&lt;/td&gt;
&lt;td&gt;~$84/month&lt;/td&gt;
&lt;td&gt;~$3,084/month&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;At 10K files/day, consider batch processing during off-hours and Provisioned Throughput for Bedrock to reduce per-request cost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost optimization tip&lt;/strong&gt;: Not all files need AI enrichment. A common pattern: images → Vision classification, documents → text extraction + embedding, data files (CSV/Parquet) → metadata only (no AI cost). This can reduce AI costs by 60-80% depending on your file mix.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Batch Inference&lt;/strong&gt;: For initial bulk enrichment (10K+ files), Bedrock Batch Inference can reduce costs by ~50% compared to real-time invocations. Use real-time for incremental new files, batch for backfill.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Batch Inference example — submit a batch job for bulk classification
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="n"&gt;bedrock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bedrock&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ap-northeast-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 1. Prepare input JSONL file in S3 (one request per line)
# Each line: {"recordId":"file-001","modelInput":{"anthropic_version":"bedrock-2023-05-31",...}}
&lt;/span&gt;
&lt;span class="c1"&gt;# 2. Create batch inference job
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bedrock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_model_invocation_job&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;jobName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;metadata-enrichment-backfill&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic.claude-3-haiku-20240307-v1:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;roleArn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arn:aws:iam::&amp;lt;ACCOUNT&amp;gt;:role/BedrockBatchRole&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;inputDataConfig&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3InputDataConfig&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3Uri&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3://my-bucket/batch-input/enrichment-requests.jsonl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;outputDataConfig&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3OutputDataConfig&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3Uri&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3://my-bucket/batch-output/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Job runs asynchronously — results written to S3 when complete
# Typical processing: 10K files in ~30 minutes at ~50% cost reduction
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;The batch input JSONL contains prompts, file references, or extracted/redacted text depending on your design. It does not require copying the original raw files from FSx for ONTAP to S3. If images are included as base64, treat the JSONL as temporary processing data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Batch job monitoring&lt;/strong&gt;: Use EventBridge rules to detect Bedrock batch job state changes (&lt;code&gt;COMPLETED&lt;/code&gt;, &lt;code&gt;FAILED&lt;/code&gt;). Route to SNS → Lambda to automatically write results back to S3 Tables.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt Caching&lt;/strong&gt;: If using the same system prompt across all classifications (recommended), Bedrock's Prompt Caching feature can reduce input token costs by up to 90% for repeated prompts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;EMR Spark for large-scale backfill&lt;/strong&gt;: For initial backfill or re-enrichment of 100K+ files, Spark on EMR Serverless or EMR on EC2 can be used as an alternative to Lambda/Fargate. EMR 7.13.0+ supports &lt;a href="https://docs.aws.amazon.com/glue/latest/dg/connect-glu-iceberg-rest.html" rel="noopener noreferrer"&gt;Glue as an Iceberg REST catalog&lt;/a&gt;, enabling distributed metadata writes with Lake Formation governance. Verified 2026-06-02: SELECT, COUNT, and time travel all work on EMR Serverless 7.13.0. Use Lambda for incremental (event-driven) processing and Spark for bulk operations.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Search Index Consistency
&lt;/h2&gt;

&lt;p&gt;OpenSearch is a derived index, not the system of record. S3 Tables / Iceberg remains the metadata source of truth.&lt;/p&gt;

&lt;p&gt;Recommended controls:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Store &lt;code&gt;iceberg_snapshot_id&lt;/code&gt; in OpenSearch documents for traceability&lt;/li&gt;
&lt;li&gt;Store &lt;code&gt;embedding_model_id&lt;/code&gt; and &lt;code&gt;prompt_version&lt;/code&gt; in both Iceberg and OpenSearch&lt;/li&gt;
&lt;li&gt;Reconcile OpenSearch index against latest Iceberg view periodically&lt;/li&gt;
&lt;li&gt;Mark &lt;code&gt;index_status&lt;/code&gt;: pending / indexed / stale / failed&lt;/li&gt;
&lt;li&gt;If search returns a stale result, fall back to Athena query on the base table&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  FPolicy Event Design
&lt;/h2&gt;

&lt;p&gt;For incremental metadata sync via ONTAP FPolicy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;batch scan&lt;/strong&gt; for initial backfill (not FPolicy)&lt;/li&gt;
&lt;li&gt;Use FPolicy only for &lt;strong&gt;incremental changes&lt;/strong&gt; after initial catalog population&lt;/li&gt;
&lt;li&gt;Prefer &lt;code&gt;create&lt;/code&gt; / &lt;code&gt;close-with-modification&lt;/code&gt; / &lt;code&gt;rename&lt;/code&gt; / &lt;code&gt;delete&lt;/code&gt; events&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Avoid&lt;/strong&gt; &lt;code&gt;read&lt;/code&gt; / &lt;code&gt;open&lt;/code&gt; events (excessive volume, no catalog value)&lt;/li&gt;
&lt;li&gt;Apply path and extension filters to reduce event noise&lt;/li&gt;
&lt;li&gt;Add backpressure via SQS batching (not fan-out Lambda per event)&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;FPolicy can significantly impact file system throughput if configured too broadly. Filter to only the operations and paths that matter for catalog updates.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Hybrid Search Pattern
&lt;/h2&gt;

&lt;p&gt;For production discovery, vector search should be combined with lexical filters and keyword search:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lexical search&lt;/strong&gt;: file_name, path, classification, summary, tags&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vector search&lt;/strong&gt;: embedding similarity (kNN)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filters&lt;/strong&gt;: tenant_id, sensitivity_level, file_type, path_classification, last_modified&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OpenSearch supports both search and vector collection types. Use a single index with both text fields and vector fields for hybrid queries. S3 Tables / Iceberg remains the metadata source of truth; OpenSearch is the serving index.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;For sensitive workloads, use VPC interface endpoints for Bedrock Runtime and S3 VPC endpoints for batch input/output. See &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/genai/bedrock-private-connectivity.md" rel="noopener noreferrer"&gt;&lt;code&gt;genai/bedrock-private-connectivity.md&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Storage Tier Impact During Backfill
&lt;/h2&gt;

&lt;p&gt;Initial AI enrichment may read cold files from capacity pool storage, causing higher latency and throughput consumption.&lt;/p&gt;

&lt;p&gt;Recommended controls:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run backfill during off-hours (minimize impact on production NFS/SMB)&lt;/li&gt;
&lt;li&gt;Limit Lambda concurrency during backfill&lt;/li&gt;
&lt;li&gt;Enrich only selected file types first (images → documents → data files)&lt;/li&gt;
&lt;li&gt;Monitor FSx capacity pool read activity via CloudWatch&lt;/li&gt;
&lt;li&gt;Separate backfill cost from steady-state cost in planning&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Backfill vs Incremental Cost Model
&lt;/h2&gt;

&lt;p&gt;Separate cost planning for:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;Scope&lt;/th&gt;
&lt;th&gt;Cost driver&lt;/th&gt;
&lt;th&gt;Optimization&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Initial backfill&lt;/td&gt;
&lt;td&gt;All existing files (e.g., 100K)&lt;/td&gt;
&lt;td&gt;Bedrock AI at scale&lt;/td&gt;
&lt;td&gt;Batch Inference (~50% savings)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Daily incremental&lt;/td&gt;
&lt;td&gt;New/modified files (e.g., 1000/day)&lt;/td&gt;
&lt;td&gt;Real-time Lambda + Bedrock&lt;/td&gt;
&lt;td&gt;Selective enrichment by file type&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Re-enrichment&lt;/td&gt;
&lt;td&gt;After prompt/model change&lt;/td&gt;
&lt;td&gt;Full re-scan of enriched files&lt;/td&gt;
&lt;td&gt;Batch + compare confidence delta&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenSearch reindex&lt;/td&gt;
&lt;td&gt;After schema/embedding change&lt;/td&gt;
&lt;td&gt;Index rebuild&lt;/td&gt;
&lt;td&gt;Off-hours, parallel shards&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;The largest cost spike is typically the initial backfill, not steady-state. Plan Bedrock Batch Inference and off-peak scheduling for the first catalog population.&lt;/p&gt;

&lt;p&gt;For adjustable assumptions, see &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/verification-evidence/cost-assumptions.yaml" rel="noopener noreferrer"&gt;&lt;code&gt;verification-evidence/cost-assumptions.yaml&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Enrich pending files with AI&lt;/span&gt;
python3 demo/scripts/demo-enrich.py &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--table-bucket-arn&lt;/span&gt; &amp;lt;TABLE_BUCKET_ARN&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--ap-alias&lt;/span&gt; &amp;lt;AP_ALIAS&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--max-files&lt;/span&gt; 10

&lt;span class="c"&gt;# Search by natural language&lt;/span&gt;
python3 demo/scripts/demo-search.py &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"find documents about contracts or agreements"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  AI Safety and Human Review Boundary
&lt;/h2&gt;

&lt;p&gt;AI enrichment should not be treated as authoritative classification for regulated data without human review.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;For regulated industries&lt;/strong&gt;: AI enrichment is assistive metadata generation, not authoritative classification. Final regulatory classification must be confirmed by data owners, security, legal, and compliance teams. This system provides AI-generated signals to accelerate human review — it does not replace it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deterministic vs AI boundary&lt;/strong&gt;: AI generates classifications and summaries, but pipeline state transitions, retry logic, deduplication, access controls, and audit evidence are deterministic and version-controlled. The deterministic pipeline guarantees reproducibility; AI provides enrichment quality.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Recommended controls:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Human review queue for low-confidence classifications (&amp;lt; 0.7)&lt;/li&gt;
&lt;li&gt;Sampling review for high-confidence results (periodic spot-check)&lt;/li&gt;
&lt;li&gt;False negative testing for PII detection&lt;/li&gt;
&lt;li&gt;Model/prompt version recorded in metadata (&lt;code&gt;enriched_at&lt;/code&gt; + model ID)&lt;/li&gt;
&lt;li&gt;Re-enrichment policy when model or prompt changes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Recommended metadata columns for AI lineage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;classification_model_id&lt;/code&gt; — which model produced the classification&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;embedding_model_id&lt;/code&gt; — which model produced the embedding&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;prompt_version&lt;/code&gt; — version of the classification prompt&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;enrichment_code_version&lt;/code&gt; — version of the enrichment Lambda/script&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;enriched_at&lt;/code&gt; — timestamp of enrichment&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;human_review_status&lt;/code&gt; — pending / approved / rejected&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;human_reviewed_by&lt;/code&gt; — reviewer identity (if applicable)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;human_reviewed_at&lt;/code&gt; — review timestamp&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Evaluation Plan
&lt;/h2&gt;

&lt;p&gt;For production use, do not rely only on model-reported confidence. Create a labeled validation set and measure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Classification accuracy (overall and per document type)&lt;/li&gt;
&lt;li&gt;Precision / recall per category&lt;/li&gt;
&lt;li&gt;False positive rate for PII detection&lt;/li&gt;
&lt;li&gt;False negative rate for PII detection&lt;/li&gt;
&lt;li&gt;Embedding search top-k relevance (nDCG@5, MRR)&lt;/li&gt;
&lt;li&gt;Human review acceptance rate&lt;/li&gt;
&lt;li&gt;Cost per accepted classification&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Business acceptance metrics&lt;/strong&gt; (beyond model accuracy):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Time saved per analyst for file discovery&lt;/li&gt;
&lt;li&gt;Dataset discovery lead-time reduction (days → hours target)&lt;/li&gt;
&lt;li&gt;Business owner approval rate for AI classifications&lt;/li&gt;
&lt;li&gt;Cost per useful search result&lt;/li&gt;
&lt;li&gt;False negative risk by document category (which misses matter most?)&lt;/li&gt;
&lt;li&gt;Governance coverage (% of assets searchable in BI/AI tools)&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;The 7/7 PII detection result was measured on a controlled synthetic sample. Production use requires evaluation with domain-specific documents, false-positive/false-negative review, human approval workflow, and legal/compliance sign-off.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Snowflake users&lt;/strong&gt;: Snowflake can now directly query S3 Tables Iceberg metadata via Glue REST + VENDED_CREDENTIALS (verified 2026-06-05). Additionally, you can sync redacted metadata into Snowflake-managed tables for Cortex Search / Snowflake Intelligence business-facing discovery. In this PoC, OpenSearch remains the AWS-native vector search component.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;In &lt;strong&gt;Part 3&lt;/strong&gt;, we'll cover:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lake Formation governance&lt;/strong&gt;: Column-level access control on metadata&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PII detection and anonymization&lt;/strong&gt;: Comprehend (English) + Bedrock Claude (Japanese)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-platform access&lt;/strong&gt;: What works and what doesn't with Databricks and Snowflake&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Clean Room pattern&lt;/strong&gt;: Separate tables for sensitive vs. anonymized metadata&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Full code: &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/tree/main/integrations/iceberg-metadata-catalog" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/fsxn-lakehouse-integrations&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>iceberg</category>
      <category>datalake</category>
      <category>amazonfsxfornetappontap</category>
    </item>
    <item>
      <title>From Hours to Seconds: An AI-Powered Metadata Catalog for Unstructured Data on FSx for ONTAP</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Mon, 08 Jun 2026 16:26:50 +0000</pubDate>
      <link>https://dev.to/aws-builders/from-hours-to-seconds-an-ai-powered-metadata-catalog-for-unstructured-data-on-fsx-for-ontap-5f54</link>
      <guid>https://dev.to/aws-builders/from-hours-to-seconds-an-ai-powered-metadata-catalog-for-unstructured-data-on-fsx-for-ontap-5f54</guid>
      <description>&lt;h2&gt;
  
  
  What Works Now vs What Requires Validation
&lt;/h2&gt;

&lt;p&gt;This article separates verified AWS-native capabilities from cross-platform paths that still require validation. The core pattern — keeping raw files on FSx for ONTAP and cataloging only metadata in S3 Tables — is verified. Databricks paths are still evolving. Snowflake Glue REST + VENDED_CREDENTIALS and External Stage paths are verified in this PoC, with governance limitations noted below. Validate all cross-platform paths in your own environment before production use.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AWS Native PoC (Athena + S3 Tables + Bedrock + OpenSearch + Lake Formation)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Verified&lt;/td&gt;
&lt;td&gt;Full end-to-end in 42 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Glue Iceberg REST endpoint access&lt;/td&gt;
&lt;td&gt;✅ Verified&lt;/td&gt;
&lt;td&gt;Both S3 Tables REST and Glue REST confirmed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lake Formation table-level governance&lt;/td&gt;
&lt;td&gt;✅ Verified&lt;/td&gt;
&lt;td&gt;Grant/revoke/audit working&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lake Formation column-level exclusion&lt;/td&gt;
&lt;td&gt;⚠️ Observed limitation&lt;/td&gt;
&lt;td&gt;Failed on tested federated catalog path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Databricks SQL Warehouse direct&lt;/td&gt;
&lt;td&gt;⚠️ Observed limitation&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;iceberg_rest&lt;/code&gt; connection type not supported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Databricks Spark + Iceberg REST&lt;/td&gt;
&lt;td&gt;❌ Blocked by UC&lt;/td&gt;
&lt;td&gt;spark.conf.set and cluster config both fail; UC Foreign Catalog required&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Databricks UC Foreign Catalog&lt;/td&gt;
&lt;td&gt;❌ Still blocked&lt;/td&gt;
&lt;td&gt;Retested post-&lt;a href="https://www.databricks.com/blog/unity-catalog-and-next-era-apache-icebergtm" rel="noopener noreferrer"&gt;Foreign Iceberg GA&lt;/a&gt; (2026-06-09): Glue Connection ✅, Credentials ✅, but External Location fails — S3 Tables internal bucket rejects standard S3 API validation. No bypass available.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Databricks Delta Sharing via S3 AP&lt;/td&gt;
&lt;td&gt;❌ Confirmed&lt;/td&gt;
&lt;td&gt;Sharing server uses same UC credentials; not a workaround for S3 AP session policy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Databricks NFS → UC Volume&lt;/td&gt;
&lt;td&gt;❌ Confirmed&lt;/td&gt;
&lt;td&gt;Cloud storage URIs only; internal feature request exists&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Databricks UC audit logging&lt;/td&gt;
&lt;td&gt;✅ Confirmed&lt;/td&gt;
&lt;td&gt;External engine access fully logged&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Snowflake via Glue REST (VENDED_CREDENTIALS)&lt;/td&gt;
&lt;td&gt;✅ Verified&lt;/td&gt;
&lt;td&gt;Explicit &lt;code&gt;ACCESS_DELEGATION_MODE = VENDED_CREDENTIALS&lt;/code&gt;; CREATE TABLE + SELECT + COUNT + AUTO_REFRESH all working (2026-06-05)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Snowflake External Stage (FSx S3 AP)&lt;/td&gt;
&lt;td&gt;✅ Verified&lt;/td&gt;
&lt;td&gt;LIST, SELECT/COPY, and TO_FILE + Cortex AI all verified&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Important distinction&lt;/strong&gt;: This pattern does not use FSx for ONTAP S3 Access Points as an Iceberg warehouse. Raw files stay on FSx for ONTAP, while only the metadata catalog is written to S3 Tables. Direct Iceberg table writes to FSx for ONTAP S3 Access Points are tracked separately as a &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/tree/main/integrations/iceberg" rel="noopener noreferrer"&gt;known limitation&lt;/a&gt; because Iceberg commit behavior and S3FileIO compatibility require additional validation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  This is an Iceberg Adoption Pattern, Not a Raw-Data Migration
&lt;/h3&gt;

&lt;p&gt;This pattern does not convert the original unstructured files into Iceberg table data. Instead, it adopts Iceberg for the metadata layer only.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scope&lt;/th&gt;
&lt;th&gt;What happens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data files&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Not migrated. Raw files remain on FSx for ONTAP.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Metadata table&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Newly created as an Iceberg table on S3 Tables.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Processing jobs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Metadata scan and AI enrichment jobs write append-only metadata.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Consumers&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Athena, EMR, Snowflake, Databricks, and BI/search tools consume curated metadata views.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Storage Boundary: What Moves and What Doesn't
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FSx for ONTAP S3 Access Point:
  ✅ Raw file READ path only (AI enrichment input)
  ❌ NOT an Iceberg warehouse
  ❌ NOT a table commit target
  ❌ NOT bulk-copied to S3

S3 Tables:
  ✅ Iceberg METADATA table (file catalog)
  ✅ Metadata source of truth
  ✅ Query and governance target
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Data movement disclosure (for regulated environments)&lt;/strong&gt;: Raw files are NOT bulk-copied to S3. However, during AI enrichment, selected file content is temporarily read via the S3 Access Point and sent to Amazon Bedrock APIs for classification/embedding. Per &lt;a href="https://docs.aws.amazon.com/bedrock/latest/userguide/data-protection.html" rel="noopener noreferrer"&gt;AWS Bedrock data protection policy&lt;/a&gt;, model providers have no access to customer prompts or completions. Extracted/redacted metadata and embeddings are written to S3 Tables, OpenSearch, and optionally to Snowflake or Databricks depending on the activation path. Define your data flow boundary documentation before regulated-workload deployment.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Problem: Most Enterprise Unstructured Data is Difficult to Discover and Govern
&lt;/h2&gt;

&lt;p&gt;Most organizations store terabytes of unstructured data — PDFs, images, CAD files, sensor logs — on network-attached storage. This data is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Undiscoverable&lt;/strong&gt;: "Where is that invoice from last quarter?" requires manual searching or asking colleagues&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Governed at the file-system layer, but not classified or searchable&lt;/strong&gt; from analytics and AI workflows&lt;/li&gt;
&lt;li&gt;Audit trails may exist at the file-system layer, but they are often not unified with analytics and AI query activity&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of this as &lt;strong&gt;unstructured-data modernization&lt;/strong&gt;: inventory first, classify selectively, govern metadata, and activate only what is needed — without bulk-copying the raw files.&lt;/p&gt;

&lt;h3&gt;
  
  
  Business Outcomes (Beyond Technical Metrics)
&lt;/h3&gt;

&lt;p&gt;This pattern is not only about faster file search. It is about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reducing dataset discovery lead time&lt;/strong&gt; for AI projects (days → hours)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Improving PII visibility&lt;/strong&gt; across the organization (unknown → 95%+ coverage target)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lowering duplicate storage cost&lt;/strong&gt; ($230-256/month eliminated for 10TB)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Creating governed metadata products&lt;/strong&gt; for analytics and AI teams&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enabling AI-readiness&lt;/strong&gt; without raw-data copy or migration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Activating governed metadata in Snowflake AI Data Cloud&lt;/strong&gt; for Cortex Search, semantic Q&amp;amp;A, executive dashboards, and business-facing file discovery&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The traditional solution? Copy everything to S3 and build a catalog. But at 10TB, that's &lt;strong&gt;~$230-256/month&lt;/strong&gt; just for the copy — plus sync pipelines, duplicate governance, and data drift.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: Hot Metadata × Cold Data
&lt;/h2&gt;

&lt;p&gt;What if we could catalog every file &lt;em&gt;without&lt;/em&gt; moving it?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────┐
│  HOT: Metadata (Apache Iceberg on S3 Tables)            │
│  • File path, type, size, timestamps                    │
│  • AI classification + confidence score                 │
│  • Vector embedding (1024-dim, similarity search)       │
│  • PII detection flag                                   │
│  • Cost: ~$5-15/month for 100K files                    │
└────────────────────────┬────────────────────────────────┘
                         │ file_path reference
┌────────────────────────▼────────────────────────────────┐
│  COLD: Actual Files (FSx for ONTAP)                     │
│  • PDF, images, CAD, video, audio, logs                 │
│  • Deduplication (50-70% storage savings typical*)      │
│  • NFS/SMB (existing workflows) + S3 AP (AI/analytics)  │
│  • No bulk raw-data copy required                       │
└─────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key insight&lt;/strong&gt;: Keep the data where it is. Move only the metadata into a queryable format.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FSx for ONTAP ──S3 Access Point──→ AI Enrichment (Bedrock)
       │                                    │
       │                                    ▼
       │                          S3 Tables (Iceberg)
       │                                    │
       │                                    ▼
       │                          ┌──────────────────┐
       │                          │ Query Engines    │
       │                          │ • Athena (SQL)   │
       │                          │ • OpenSearch     │
       │                          │   (vector kNN)   │
       │                          │ • Lake Formation │
       │                          │   (governance)   │
       │                          └──────────────────┘
       │
       └──NFS/SMB──→ Existing applications (unchanged)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Observability&lt;/strong&gt; (production add-on):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;       ┌──────────────────────────────────────┐
       │  • CloudWatch Metrics + Alarms       │
       │  • CloudWatch Logs (Lambda/SQS)      │
       │  • CloudTrail (governance audit)     │
       │  • OpenSearch Dashboards (search UX) │
       │  • FSx metrics (throughput, IOPS,    │
       │    latency, capacity pool reads)     │
       └──────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Components&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;FSx for ONTAP S3 Access Point&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Read files for AI processing (no copy)&lt;/td&gt;
&lt;td&gt;Included with FSx&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;S3 Tables&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;AWS managed Apache Iceberg table service (auto-compaction, REST endpoint)&lt;/td&gt;
&lt;td&gt;~$5/month metadata&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Bedrock Claude Vision&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Image classification&lt;/td&gt;
&lt;td&gt;~$0.01/file in this demo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Titan Embeddings V2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1024-dim vectors for similarity search&lt;/td&gt;
&lt;td&gt;$0.00002/1K input tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OpenSearch Serverless NextGen&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;kNN vector search (scale-to-zero)&lt;/td&gt;
&lt;td&gt;$0 idle compute when inactive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Lake Formation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Metadata access governance&lt;/td&gt;
&lt;td&gt;No additional Lake Formation charge&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;S3 Tables Iceberg REST endpoint: &lt;code&gt;https://s3tables.&amp;lt;region&amp;gt;.amazonaws.com/iceberg&lt;/code&gt;&lt;br&gt;
Check &lt;a href="https://aws.amazon.com/s3/tables/" rel="noopener noreferrer"&gt;S3 Tables availability&lt;/a&gt; for regional support before deployment.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Deduplication ratio is a general ONTAP range. Actual savings depend on data characteristics and were not measured in this PoC.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  PoC Results (Verified 2026-05-31)
&lt;/h2&gt;

&lt;p&gt;We built and verified this end-to-end in a single day. Here's what we measured:&lt;/p&gt;

&lt;h3&gt;
  
  
  S3 Tables Access Paths: Which Endpoint Should You Use?
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Access path&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;th&gt;Governance path&lt;/th&gt;
&lt;th&gt;Verified&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;S3 Tables Iceberg REST (&lt;code&gt;s3tables.&amp;lt;region&amp;gt;.amazonaws.com/iceberg&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Direct Iceberg client / simple PoC&lt;/td&gt;
&lt;td&gt;IAM + S3 Tables permissions&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS Glue Iceberg REST (&lt;code&gt;glue.&amp;lt;region&amp;gt;.amazonaws.com/iceberg&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Production analytics integration&lt;/td&gt;
&lt;td&gt;IAM + Lake Formation&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Athena via Glue federated catalog&lt;/td&gt;
&lt;td&gt;SQL analytics&lt;/td&gt;
&lt;td&gt;Lake Formation + Athena&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PyIceberg local client&lt;/td&gt;
&lt;td&gt;Lightweight validation&lt;/td&gt;
&lt;td&gt;IAM/LF depending on endpoint&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;For production workloads with centralized governance, the &lt;strong&gt;AWS Glue Iceberg REST endpoint&lt;/strong&gt; is recommended over the S3 Tables direct endpoint. See &lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables-integrating-glue-endpoint.html" rel="noopener noreferrer"&gt;AWS docs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Catalog authority rule&lt;/strong&gt;: S3 Tables + Glue is the authoritative catalog for this metadata table in this PoC. Other engines should consume the table through the authoritative catalog or a controlled metadata activation path. Do not configure multiple writable catalogs for the same Iceberg table — dual-write causes split-brain and potential data corruption.&lt;/p&gt;

&lt;p&gt;Athena Iceberg behavior depends on Athena engine version, Iceberg version, Glue/Lake Formation integration, and table maintenance state. Validate DDL/DML requirements separately before using this as a write-heavy production catalog.&lt;/p&gt;

&lt;p&gt;Verification details are recorded in &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/verification-evidence/evidence-record.yaml" rel="noopener noreferrer"&gt;evidence-record.yaml&lt;/a&gt; and &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/verification-evidence/cross-platform-compatibility.yaml" rel="noopener noreferrer"&gt;cross-platform-compatibility.yaml&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Before vs After
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;th&gt;Improvement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;File discovery time&lt;/td&gt;
&lt;td&gt;Minutes-hours&lt;/td&gt;
&lt;td&gt;&amp;lt; 2 seconds&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;100x+ at scale&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI classification&lt;/td&gt;
&lt;td&gt;Manual&lt;/td&gt;
&lt;td&gt;Automatic (6 sec/file)&lt;/td&gt;
&lt;td&gt;Fully automated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Storage cost (10TB)&lt;/td&gt;
&lt;td&gt;~$250/month (S3 copy)&lt;/td&gt;
&lt;td&gt;$5-15/month (metadata only)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;95% reduction&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Metadata query governance&lt;/td&gt;
&lt;td&gt;Not applicable&lt;/td&gt;
&lt;td&gt;100% in this PoC&lt;/td&gt;
&lt;td&gt;Complete for metadata queries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Idle compute/search cost&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;Near $0 when inactive&lt;/td&gt;
&lt;td&gt;Persistent metadata/logs may still incur small charges&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Search Time Scaling (Measured + Projected)
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Files&lt;/th&gt;
&lt;th&gt;ListObjectsV2&lt;/th&gt;
&lt;th&gt;Athena SQL&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;40&lt;/td&gt;
&lt;td&gt;892 ms&lt;/td&gt;
&lt;td&gt;3.0 sec&lt;/td&gt;
&lt;td&gt;0.3x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;22.3 sec&lt;/td&gt;
&lt;td&gt;1.8 sec&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;12x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;3.7 min&lt;/td&gt;
&lt;td&gt;1.8 sec&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;124x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100,000&lt;/td&gt;
&lt;td&gt;37.2 min&lt;/td&gt;
&lt;td&gt;1.8 sec&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1,239x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1,000,000&lt;/td&gt;
&lt;td&gt;371.7 min&lt;/td&gt;
&lt;td&gt;1.8 sec&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;12,389x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;At 40 files, ListObjectsV2 is faster — Athena has cold start overhead. Athena query time does not scale linearly with the number of files on FSx because it queries the Iceberg metadata table instead of listing the raw file namespace. In this controlled demo, the query stayed around ~1.8 seconds for projected file counts, but production latency depends on Iceberg metadata size, manifest count, predicate selectivity, Athena cold start, and table maintenance state.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Projection method&lt;/strong&gt;: ListObjectsV2 latency was extrapolated linearly from the measured 40-file scan. This is intentionally conservative for demonstrating namespace-scan behavior, but it is not a service benchmark.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The 42-Second Demo
&lt;/h3&gt;

&lt;p&gt;Our complete demo runs all 8 steps in 42 seconds:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://asciinema.org/a/LA6F0QCkZP8fk3ZT" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fasciinema.org%2Fa%2FLA6F0QCkZP8fk3ZT.svg" alt="asciicast" width="1140" height="92"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Step 1: Before/After search comparison     ✅ (ListObjectsV2 vs Athena)
Step 2: Infrastructure deploy              ✅ (CloudFormation, skippable)
Step 3: Metadata scan (40 files)           ✅ (3 seconds)
Step 4: AI enrichment (Bedrock Vision)     ✅ (invoice → 0.95 confidence)
Step 5: Athena query + Time Travel         ✅ (&amp;lt; 2 seconds)
Step 6: Vector similarity search           ✅ (kNN score 0.67)
Step 7: PII detection + anonymization      ✅ (7/7 entities, all redacted)
Step 8: Cost &amp;amp; ROI analysis                ✅ ($0.07 total demo cost)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Total demo cost: $0.07&lt;/strong&gt;. After the demo, the compute/search components can scale to zero. If you retain S3 Tables metadata, logs, or audit trails, small storage/logging charges may still apply.&lt;/p&gt;

&lt;h3&gt;
  
  
  AI Classification Results
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;File&lt;/th&gt;
&lt;th&gt;Classification&lt;/th&gt;
&lt;th&gt;Confidence&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;invoice_sample.png&lt;/td&gt;
&lt;td&gt;Invoice&lt;/td&gt;
&lt;td&gt;0.95&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;product_inspection.png&lt;/td&gt;
&lt;td&gt;Pie Chart&lt;/td&gt;
&lt;td&gt;1.0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;sensor_dashboard.png&lt;/td&gt;
&lt;td&gt;IoT Sensor Dashboard&lt;/td&gt;
&lt;td&gt;0.9&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In this demo, Bedrock Claude Vision classified sample images at roughly $0.01/file with sub-10-second latency. Production cost and latency depend on image size, prompt length, model version, and retry behavior.&lt;/p&gt;

&lt;h3&gt;
  
  
  Vector Similarity Search
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Query: "find invoice or payment documents"
→ invoice_sample.png (score: 0.6749)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;OpenSearch Serverless with scale-to-zero capability (GA May 2026) provides kNN search — no minimum cost when idle. Cold start is ~10-30 seconds, warm queries are ~54ms.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Verified in this PoC environment on 2026-05-31. Check the latest &lt;a href="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless.html" rel="noopener noreferrer"&gt;OpenSearch Serverless documentation&lt;/a&gt; and regional availability before deployment.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Governance: Lake Formation Access Control
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Step 1: Authorized query    → ✅ SUCCEEDED (3 rows)
Step 2: Revoke SELECT       → 🔒 BLOCKED (access denied)
Step 3: Restore SELECT      → ✅ SUCCEEDED
Step 4: CloudTrail audit    → All queries logged with user identity
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Metadata queries are governed and audited. Raw file access remains governed separately by FSx file-system permissions, S3 Access Point policies, and application access paths.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cost Analysis
&lt;/h2&gt;

&lt;h3&gt;
  
  
  This Demo
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Bedrock AI (5 files)&lt;/td&gt;
&lt;td&gt;$0.05&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenSearch (~6 min)&lt;/td&gt;
&lt;td&gt;$0.024&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lambda + Athena&lt;/td&gt;
&lt;td&gt;$0.001&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.07&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Projected Monthly (10TB, 100K files, 1000 changes/day)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Monthly&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;S3 Tables (metadata)&lt;/td&gt;
&lt;td&gt;$5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lambda (sync + AI)&lt;/td&gt;
&lt;td&gt;$36&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bedrock (AI enrichment)&lt;/td&gt;
&lt;td&gt;$30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenSearch (business hours)&lt;/td&gt;
&lt;td&gt;$42&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SQS + misc&lt;/td&gt;
&lt;td&gt;$1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$114/month&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3 copy eliminated&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;-$230-256/month&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Net effect&lt;/strong&gt;: The AI-powered catalog costs less than the S3 copy it eliminates.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Without AI enrichment&lt;/strong&gt; (metadata scan + Athena only): ~$42/month. AI processing is optional and can be enabled per-file-type.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;S3 Standard pricing: us-east-1 $0.023/GB, ap-northeast-1 $0.025/GB. Verified 2026-06-01 via AWS Pricing API.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;For reproducibility, see: &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/verification-evidence/evidence-record.yaml" rel="noopener noreferrer"&gt;evidence-record.yaml&lt;/a&gt;, &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/verification-evidence/cost-assumptions.yaml" rel="noopener noreferrer"&gt;cost-assumptions.yaml&lt;/a&gt;, &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/verification-evidence/2026-05-31/comprehensive-test-results.yaml" rel="noopener noreferrer"&gt;comprehensive-test-results.yaml&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Known Limitations (Honest Assessment)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Limitation&lt;/th&gt;
&lt;th&gt;Impact&lt;/th&gt;
&lt;th&gt;Workaround&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Databricks SQL Warehouse &lt;code&gt;CREATE CONNECTION TYPE iceberg_rest&lt;/code&gt; to S3 Tables REST failed in this validation (2026-05-31)&lt;/td&gt;
&lt;td&gt;SQL Warehouse direct path unavailable in tested method&lt;/td&gt;
&lt;td&gt;Retested 2026-06-09; still blocked in tested UC path. Use curated metadata sync to UC Delta as practical workaround; support case submitted.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Databricks Spark cluster: UC blocks external catalog registration (2026-06-01)&lt;/td&gt;
&lt;td&gt;Cannot use spark.conf.set or cluster config for external Iceberg catalogs&lt;/td&gt;
&lt;td&gt;UC Foreign Catalog tested 2026-06-09 — External Location validation fails against S3 Tables internal bucket. Sync metadata to UC Delta table instead.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Databricks Delta Sharing: cannot bypass S3 AP session policy (2026-06-01)&lt;/td&gt;
&lt;td&gt;Sharing server uses same UC credentials&lt;/td&gt;
&lt;td&gt;DataSync → S3 → UC → Delta Sharing works for copied data; validate target table format and catalog support separately&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Databricks NFS mount: cannot register as UC External Volume (2026-06-01)&lt;/td&gt;
&lt;td&gt;NFS/FUSE paths not supported for UC Volumes&lt;/td&gt;
&lt;td&gt;DataSync → S3 → UC External Location; internal feature request exists&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Snowflake External Iceberg Table with S3 Tables REST endpoint was not a supported catalog type in this validation (2026-05-31)&lt;/td&gt;
&lt;td&gt;Direct S3 Tables REST path unavailable in tested method&lt;/td&gt;
&lt;td&gt;✅ &lt;strong&gt;Resolved (2026-06-05)&lt;/strong&gt;: Use Glue REST + explicit &lt;code&gt;ACCESS_DELEGATION_MODE = VENDED_CREDENTIALS&lt;/code&gt;. Schema must have no default External Volume. AWS prerequisite: &lt;code&gt;register-resource --with-federation&lt;/code&gt;. Lake Formation column-level filtering NOT enforced via this path.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LF column exclusion grant failed in tested S3 Tables federated catalog path&lt;/td&gt;
&lt;td&gt;Can't hide specific columns via tested grant pattern&lt;/td&gt;
&lt;td&gt;Athena Views; track AWS support status&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;At 40 files, ListObjectsV2 is faster than Athena&lt;/td&gt;
&lt;td&gt;Architecture value is at scale (100K+)&lt;/td&gt;
&lt;td&gt;Expected — Athena has cold start overhead&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Naming note&lt;/strong&gt;: Use lowercase table, namespace, and column names for S3 Tables integrated with AWS analytics services. Mixed-case names may not be visible to Athena / Glue / Lake Formation. See &lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables-buckets-naming.html" rel="noopener noreferrer"&gt;S3 Tables naming rules&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Performance Boundaries Not Yet Validated
&lt;/h2&gt;

&lt;p&gt;This PoC validates the architecture shape, not production scale limits. The following require separate testing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;FSx throughput impact under concurrent NFS/SMB/S3 access&lt;/li&gt;
&lt;li&gt;S3 Access Point metadata operation impact under large namespace scans&lt;/li&gt;
&lt;li&gt;S3 API request concurrency vs FSx provisioned throughput capacity&lt;/li&gt;
&lt;li&gt;Impact of scan jobs on production SMB/NFS latency&lt;/li&gt;
&lt;li&gt;ListObjectsV2 pagination behavior at 1M+ files&lt;/li&gt;
&lt;li&gt;Lambda concurrency and S3 AP request throttling&lt;/li&gt;
&lt;li&gt;Iceberg manifest growth and compaction behavior&lt;/li&gt;
&lt;li&gt;Athena query latency with high snapshot counts&lt;/li&gt;
&lt;li&gt;OpenSearch indexing throughput during bulk backfill&lt;/li&gt;
&lt;li&gt;File size distribution and small-file amplification effects&lt;/li&gt;
&lt;li&gt;Cold vs warm namespace access behavior (capacity pool reads during backfill)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  ONTAP Object Model Mapping
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;ONTAP / FSx object&lt;/th&gt;
&lt;th&gt;Role in this pattern&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;FSx file system&lt;/td&gt;
&lt;td&gt;Performance / HA boundary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SVM&lt;/td&gt;
&lt;td&gt;Protocol and administrative boundary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Volume&lt;/td&gt;
&lt;td&gt;Catalog scope and S3 Access Point attachment target&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Junction path / SMB share&lt;/td&gt;
&lt;td&gt;Existing application namespace&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3 Access Point&lt;/td&gt;
&lt;td&gt;S3 API boundary for AI/analytics (with associated file-system identity)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Iceberg table&lt;/td&gt;
&lt;td&gt;Metadata catalog, not raw data store&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;Each S3 Access Point has an associated &lt;code&gt;OntapFileSystemIdentity&lt;/code&gt; (UNIX UID/GID or Windows domain user) that authorizes all file access through that AP. IAM policy is evaluated first, then ONTAP file-system permissions. See &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/security/s3-access-point-identity-matrix.yaml" rel="noopener noreferrer"&gt;&lt;code&gt;security/s3-access-point-identity-matrix.yaml&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Iceberg Table Maintenance Plan
&lt;/h2&gt;

&lt;p&gt;For production, define:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Snapshot retention period and table maintenance behavior — verify S3 Tables service-managed policies and any configurable retention settings&lt;/li&gt;
&lt;li&gt;Manifest rewrite cadence (if metadata table grows large)&lt;/li&gt;
&lt;li&gt;Orphan file cleanup policy&lt;/li&gt;
&lt;li&gt;Deduplication view or materialized latest-record table&lt;/li&gt;
&lt;li&gt;Time travel retention policy&lt;/li&gt;
&lt;li&gt;Athena engine version and Iceberg version compatibility&lt;/li&gt;
&lt;li&gt;Append-only dedup query as default named query for analysts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For operational steps, see &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/ops/iceberg-maintenance-runbook.md" rel="noopener noreferrer"&gt;&lt;code&gt;ops/iceberg-maintenance-runbook.md&lt;/code&gt;&lt;/a&gt;. For details on Iceberg spec vs S3 Tables service behavior, see &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/docs/standards-vs-service-behavior.md" rel="noopener noreferrer"&gt;&lt;code&gt;docs/standards-vs-service-behavior.md&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Iceberg does not enforce primary-key uniqueness in this PoC. Consumers should query curated latest-record views instead of the append-only base table. See &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/ops/athena-named-queries/latest_records.sql" rel="noopener noreferrer"&gt;&lt;code&gt;ops/athena-named-queries/latest_records.sql&lt;/code&gt;&lt;/a&gt; in the repo.&lt;/p&gt;

&lt;p&gt;Apache Iceberg is the open table format. Amazon S3 Tables is an AWS managed table bucket service that uses Apache Iceberg. Some operational behavior, endpoint support, and governance integration are AWS service-specific and should be validated separately from the Iceberg specification itself.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  File Identity Strategy
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;file_id method&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;th&gt;Tradeoff&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;hash(volume_id + normalized_path)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;General purpose&lt;/td&gt;
&lt;td&gt;Rename = new file_id&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;hash(volume_id + file_handle/inode)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Rename tracking&lt;/td&gt;
&lt;td&gt;Requires inode access&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Content hash (SHA-256)&lt;/td&gt;
&lt;td&gt;Immutable documents&lt;/td&gt;
&lt;td&gt;Expensive for large files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;path + last_modified + size&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Lightweight PoC only&lt;/td&gt;
&lt;td&gt;Fragile under overwrites&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Production should define how rename, overwrite, delete, and permission changes are represented in the metadata table.&lt;/p&gt;

&lt;p&gt;Recommended production columns: &lt;code&gt;source_system_id&lt;/code&gt;, &lt;code&gt;volume_id&lt;/code&gt;, &lt;code&gt;normalized_path&lt;/code&gt;, &lt;code&gt;path_hash&lt;/code&gt;, &lt;code&gt;content_hash&lt;/code&gt;, &lt;code&gt;scan_run_id&lt;/code&gt;, &lt;code&gt;change_type&lt;/code&gt; (created / modified / deleted / renamed / permission_changed).&lt;/p&gt;

&lt;p&gt;For FlexClone-based dev/test datasets, decide whether cloned files should retain lineage to source files. If lineage matters, store &lt;code&gt;clone_parent_volume_id&lt;/code&gt;, &lt;code&gt;clone_parent_snapshot_id&lt;/code&gt;, and &lt;code&gt;catalog_environment&lt;/code&gt; (prod / dev / test / dr). See &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/dr/snapmirror-catalog-rebinding.md" rel="noopener noreferrer"&gt;&lt;code&gt;dr/snapmirror-catalog-rebinding.md&lt;/code&gt;&lt;/a&gt; for DR failover considerations.&lt;/p&gt;

&lt;p&gt;For manufacturing and engineering workloads, see &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/schema/extensions/manufacturing_metadata.yaml" rel="noopener noreferrer"&gt;&lt;code&gt;schema/extensions/manufacturing_metadata.yaml&lt;/code&gt;&lt;/a&gt; for domain-specific metadata fields such as part number, revision, plant, machine, and inspection lot.&lt;/p&gt;

&lt;h2&gt;
  
  
  Multi-Tenant Deployment Considerations
&lt;/h2&gt;

&lt;p&gt;If this pattern is provided by a partner or platform team to multiple business units or customers, define the isolation boundary explicitly.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Isolation model&lt;/th&gt;
&lt;th&gt;Recommended when&lt;/th&gt;
&lt;th&gt;Tradeoff&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Table bucket per tenant&lt;/td&gt;
&lt;td&gt;Strong isolation required&lt;/td&gt;
&lt;td&gt;Higher operational overhead&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Namespace per tenant&lt;/td&gt;
&lt;td&gt;Balanced isolation and operations&lt;/td&gt;
&lt;td&gt;Shared table bucket governance required&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;tenant_id column in one table&lt;/td&gt;
&lt;td&gt;Internal multi-BU catalog&lt;/td&gt;
&lt;td&gt;Requires strict LF-Tags / row filters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenSearch index per tenant&lt;/td&gt;
&lt;td&gt;Search isolation required&lt;/td&gt;
&lt;td&gt;More index management&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Shared OpenSearch index + tenant filter&lt;/td&gt;
&lt;td&gt;Lower cost&lt;/td&gt;
&lt;td&gt;Must enforce filter in every query path&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For partner-led deployments, document tenant onboarding automation, offboarding deletion/retention policy, per-tenant cost allocation tags, and audit evidence location.&lt;/p&gt;

&lt;h2&gt;
  
  
  Business KPI Mapping
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Business problem&lt;/th&gt;
&lt;th&gt;Baseline metric&lt;/th&gt;
&lt;th&gt;Target metric&lt;/th&gt;
&lt;th&gt;How this PoC measures it&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Employees cannot find documents&lt;/td&gt;
&lt;td&gt;Average search time&lt;/td&gt;
&lt;td&gt;&amp;lt; 10 sec&lt;/td&gt;
&lt;td&gt;Search latency + result relevance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manual classification is slow&lt;/td&gt;
&lt;td&gt;Files classified/day/person&lt;/td&gt;
&lt;td&gt;10x improvement&lt;/td&gt;
&lt;td&gt;AI enrichment throughput&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sensitive files are unknown&lt;/td&gt;
&lt;td&gt;% files classified for PII&lt;/td&gt;
&lt;td&gt;95%+ coverage target&lt;/td&gt;
&lt;td&gt;PII scan completion rate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Duplicate S3 copy is costly&lt;/td&gt;
&lt;td&gt;Monthly duplicate storage cost&lt;/td&gt;
&lt;td&gt;Reduce by 50%+&lt;/td&gt;
&lt;td&gt;Metadata-only architecture cost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI projects lack data inventory&lt;/td&gt;
&lt;td&gt;Dataset discovery lead time&lt;/td&gt;
&lt;td&gt;Days → hours&lt;/td&gt;
&lt;td&gt;Catalog completeness&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Business users need governed discovery&lt;/td&gt;
&lt;td&gt;% searchable assets in BI/AI tools&lt;/td&gt;
&lt;td&gt;80%+ of approved metadata visible&lt;/td&gt;
&lt;td&gt;Expose curated metadata views to Athena, Databricks, Snowflake, or BI tools&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;FSx for ONTAP prerequisites&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SVM and volume selected as catalog scope&lt;/li&gt;
&lt;li&gt;S3 Access Point attached to the target volume&lt;/li&gt;
&lt;li&gt;Associated UNIX or Windows identity documented&lt;/li&gt;
&lt;li&gt;NFS/SMB production workload impact reviewed&lt;/li&gt;
&lt;li&gt;CloudWatch metrics dashboard enabled
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clone the repo&lt;/span&gt;
git clone https://github.com/Yoshiki0705/fsxn-lakehouse-integrations.git
&lt;span class="nb"&gt;cd &lt;/span&gt;fsxn-lakehouse-integrations/integrations/iceberg-metadata-catalog

&lt;span class="c"&gt;# Install dependencies&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt

&lt;span class="c"&gt;# Run the demo (requires FSx for ONTAP with S3 Access Point)&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;demo/scripts
./run-demo.sh &lt;span class="nt"&gt;--ap-alias&lt;/span&gt; &amp;lt;your-ap-alias-ext-s3alias&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Don't have FSx for ONTAP?&lt;/strong&gt; You can still explore the architecture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/docs/en/iceberg-metadata-catalog.md" rel="noopener noreferrer"&gt;Architecture Document&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/docs/poc-results-summary.md" rel="noopener noreferrer"&gt;PoC Results Summary&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/demo/docs/demo-guide.md" rel="noopener noreferrer"&gt;Demo Guide&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;This is Part 1 of a 3-part series:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Part 1&lt;/strong&gt; (this article): Architecture &amp;amp; PoC Results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 2&lt;/strong&gt;: AI Enrichment Pipeline — Bedrock Vision + Titan Embeddings + OpenSearch NextGen&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 3&lt;/strong&gt;: Governance &amp;amp; Cross-Platform Access — Lake Formation, PII Anonymization, Databricks/Snowflake Integration&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Don't copy data to make it searchable&lt;/strong&gt; — catalog the metadata instead. Apache Iceberg + S3 Tables gives you a managed metadata layer with time travel.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Selective AI enrichment plus scale-to-zero search&lt;/strong&gt; can keep PoC and low-traffic environments cost-efficient — compute/search components idle near $0; persistent metadata and logs may incur small charges.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;42 seconds, $0.07&lt;/strong&gt; — that's the barrier to entry for an AI-powered data catalog on your existing NAS storage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Start small, grow incrementally&lt;/strong&gt; — from metadata-only scan (Level 1) to full business workflow integration (Level 5). See the &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations/blob/main/integrations/iceberg-metadata-catalog/genai/production-maturity-model.md" rel="noopener noreferrer"&gt;Production Maturity Model&lt;/a&gt; for the progression path.&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;&lt;em&gt;All code and documentation is available at &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/fsxn-lakehouse-integrations&lt;/a&gt;. Feedback welcome via GitHub Issues.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>iceberg</category>
      <category>datalake</category>
      <category>amazonfsxfornetappontap</category>
    </item>
    <item>
      <title>28 Industry Reference Patterns with FSx for ONTAP S3 Access Points — Phase 15</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Sun, 07 Jun 2026 03:00:23 +0000</pubDate>
      <link>https://dev.to/aws-builders/28-industry-reference-patterns-with-fsx-for-ontap-s3-access-points-phase-15-4g2l</link>
      <guid>https://dev.to/aws-builders/28-industry-reference-patterns-with-fsx-for-ontap-s3-access-points-phase-15-4g2l</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Phase 15 expands the pattern library from 17 to &lt;strong&gt;28 industry-specific use cases&lt;/strong&gt;, providing reference implementations across major AWS Industry verticals where FSx for ONTAP file processing is relevant. Each new pattern includes a CloudFormation template, Step Functions workflow, Python Lambda functions, 8-language documentation, and property-based tests. Combined with 6 FlexCache/FlexClone patterns and 1 SAP/ERP pattern, the repository now offers &lt;strong&gt;35 deployable reference patterns&lt;/strong&gt; for enterprise file processing on FSx for ONTAP.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The SAP/ERP pattern focuses on controlled document/report processing around ERP-adjacent file exports (IDoc, spool), not direct transactional SAP data manipulation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: These are reference implementations with production-readiness guidance, not fully certified production systems. Customers must validate against their own regulatory, security, and operational requirements before production use.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For S3 standard bucket users&lt;/strong&gt;: This library is not a replacement for S3 data lake patterns. It is a file-data integration pattern for customers who want to process FSx ONTAP-resident data through S3-compatible APIs while preserving NAS access paths. See &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns/blob/main/docs/s3-bucket-user-guide.md" rel="noopener noreferrer"&gt;&lt;code&gt;docs/s3-bucket-user-guide.md&lt;/code&gt;&lt;/a&gt; for a detailed comparison.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Serverless boundary&lt;/strong&gt;: Compute (Lambda), orchestration (Step Functions), eventing (EventBridge), and AI services (Bedrock, Textract, Rekognition) are serverless/managed. FSx for ONTAP is a fully managed file system with provisioned capacity and operational considerations — it is not scale-to-zero storage. This is a &lt;strong&gt;serverless processing pattern over existing enterprise file data&lt;/strong&gt;, not a pure serverless storage pattern.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When NOT to use this&lt;/strong&gt;: If your workload is already object-native, does not require NFS/SMB coexistence, and can use standard S3 data lake patterns — prefer S3-native serverless architecture.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Repository&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why 28 Use Cases?
&lt;/h2&gt;

&lt;p&gt;AWS organizes customers into 22 industry verticals. When we mapped our existing 17 patterns against these verticals, several gaps stood out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Telecommunications&lt;/strong&gt; — No CDR/network log processing pattern&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Advertising &amp;amp; Marketing&lt;/strong&gt; — No creative asset management&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Travel &amp;amp; Hospitality&lt;/strong&gt; — No document processing for reservations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agriculture &amp;amp; Food&lt;/strong&gt; — No traceability or crop monitoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sustainability/ESG&lt;/strong&gt; — No ESG metrics extraction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Nonprofit&lt;/strong&gt; — No grant management automation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Utilities&lt;/strong&gt; — No drone/SCADA-based asset inspection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real Estate&lt;/strong&gt; — No portfolio analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HR&lt;/strong&gt; — No resume screening (with PII protection)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chemicals&lt;/strong&gt; — No SDS/lab notebook processing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transportation&lt;/strong&gt; (railway) — No deterioration detection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Phase 15 fills all of these, covering &lt;strong&gt;19 of 22 AWS Industry verticals&lt;/strong&gt; (remaining 3 — Consumer Packaged Goods, Mining, Software/Internet — have limited file-processing relevance for this pattern type). Combined with 11 Japan-market focus areas (all covered), the repository addresses the vast majority of enterprise file processing scenarios.&lt;/p&gt;




&lt;h2&gt;
  
  
  The 11 New Patterns
&lt;/h2&gt;

&lt;h3&gt;
  
  
  P0: Foundation Patterns
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;UC&lt;/th&gt;
&lt;th&gt;Industry&lt;/th&gt;
&lt;th&gt;Key AWS Services&lt;/th&gt;
&lt;th&gt;Differentiator&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;UC18&lt;/td&gt;
&lt;td&gt;Telecom&lt;/td&gt;
&lt;td&gt;Athena, Bedrock&lt;/td&gt;
&lt;td&gt;CDR/syslog anomaly detection with 7-day baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC19&lt;/td&gt;
&lt;td&gt;AdTech&lt;/td&gt;
&lt;td&gt;Rekognition, Textract, Bedrock&lt;/td&gt;
&lt;td&gt;Brand compliance scoring + moderation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  P1: Document Intelligence
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;UC&lt;/th&gt;
&lt;th&gt;Industry&lt;/th&gt;
&lt;th&gt;Key AWS Services&lt;/th&gt;
&lt;th&gt;Differentiator&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;UC20&lt;/td&gt;
&lt;td&gt;Travel&lt;/td&gt;
&lt;td&gt;Textract, Comprehend, Rekognition&lt;/td&gt;
&lt;td&gt;Multilingual reservation extraction + facility inspection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC21&lt;/td&gt;
&lt;td&gt;Agriculture&lt;/td&gt;
&lt;td&gt;Rekognition, Textract, Bedrock&lt;/td&gt;
&lt;td&gt;GeoTIFF crop analysis + lot traceability&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC22&lt;/td&gt;
&lt;td&gt;Transportation&lt;/td&gt;
&lt;td&gt;Rekognition, Textract, Bedrock&lt;/td&gt;
&lt;td&gt;Safety-critical escalation trigger + deterioration trends&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  P2: Specialized Processing
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;UC&lt;/th&gt;
&lt;th&gt;Industry&lt;/th&gt;
&lt;th&gt;Key AWS Services&lt;/th&gt;
&lt;th&gt;Differentiator&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;UC23&lt;/td&gt;
&lt;td&gt;Sustainability&lt;/td&gt;
&lt;td&gt;Textract, Bedrock&lt;/td&gt;
&lt;td&gt;ESG metric extraction + GRI/TCFD/ISSB mapping&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC24&lt;/td&gt;
&lt;td&gt;Nonprofit&lt;/td&gt;
&lt;td&gt;Textract, Comprehend, Bedrock&lt;/td&gt;
&lt;td&gt;Grant application + outcome matching&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC25&lt;/td&gt;
&lt;td&gt;Utilities&lt;/td&gt;
&lt;td&gt;Rekognition, Bedrock, Athena&lt;/td&gt;
&lt;td&gt;Drone + SCADA + thermal tri-modal inspection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC26&lt;/td&gt;
&lt;td&gt;Real Estate&lt;/td&gt;
&lt;td&gt;Rekognition, Textract, Bedrock&lt;/td&gt;
&lt;td&gt;Property analysis + lease extraction + PII flagging&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC27&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;td&gt;Textract, Comprehend, Bedrock&lt;/td&gt;
&lt;td&gt;Recruiting document triage with PII protection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC28&lt;/td&gt;
&lt;td&gt;Chemicals&lt;/td&gt;
&lt;td&gt;Textract, Rekognition, Bedrock&lt;/td&gt;
&lt;td&gt;SDS hazard extraction + GHS compliance + lab notebook&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Architecture: One Pattern, Many Industries
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Architecture Classification
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Classification&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Workflow orchestration&lt;/td&gt;
&lt;td&gt;Serverless (Step Functions)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compute&lt;/td&gt;
&lt;td&gt;Serverless (Lambda)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Eventing / scheduling&lt;/td&gt;
&lt;td&gt;Serverless (EventBridge)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI/ML services&lt;/td&gt;
&lt;td&gt;Managed service consumption (Bedrock, Textract, Rekognition, Comprehend)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File storage&lt;/td&gt;
&lt;td&gt;Managed/provisioned (FSx for ONTAP)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Operations model&lt;/td&gt;
&lt;td&gt;Hybrid: serverless processing + managed file storage&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;Lambda concurrency must be bounded by FSx ONTAP S3 AP throughput behavior. Do not treat Lambda concurrency as the only scaling control.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Common Workflow Pattern
&lt;/h3&gt;

&lt;p&gt;Every pattern follows the same proven architecture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;EventBridge Scheduler
       │
       ▼
Step Functions State Machine
       │
       ├── Discovery Lambda (VPC-internal, ONTAP API)
       │        │
       │        ▼
       │   S3 Access Point (list + classify files)
       │
       ├── Processing Map (parallel, Retry + Catch)
       │        │
       │        ▼
       │   [Rekognition | Textract | Comprehend | Bedrock | Athena]
       │
       └── Report Lambda
                │
                ├── Output → S3 AP (FSx ONTAP volume)
                └── SNS Notification
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What changes per industry:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;File prefixes and extensions&lt;/strong&gt; (Discovery Lambda configuration)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI/ML service selection&lt;/strong&gt; (Rekognition for images, Textract for documents, Bedrock for reasoning)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Domain-specific schemas&lt;/strong&gt; (ESG metrics, GHS sections, CDR fields)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Review thresholds&lt;/strong&gt; (60% escalation trigger for safety-critical defects, 80% standard detection, 90% auto-approve threshold)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliance requirements&lt;/strong&gt; (PII filtering for HR, data classification labels, audit trails)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For production deployments, validate how S3 AP-generated output files appear from existing NFS/SMB clients, including ownership, permissions, naming convention, and Snapshot/SnapMirror policy impact. See &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns/blob/main/docs/ontap-integration-notes.md" rel="noopener noreferrer"&gt;ONTAP Integration Notes&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Shared Modules: The Productivity Multiplier
&lt;/h2&gt;

&lt;p&gt;The 11 new patterns reuse the same &lt;code&gt;shared/&lt;/code&gt; modules that power the original 17:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Module&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Used By&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;s3ap_helper.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;S3 Access Point abstraction (alias + ARN)&lt;/td&gt;
&lt;td&gt;All 28 UCs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;exceptions.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Domain exceptions + error handler decorator&lt;/td&gt;
&lt;td&gt;All 28 UCs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;observability.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;EMF metrics + structured logging&lt;/td&gt;
&lt;td&gt;All 28 UCs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;human_review.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Confidence-based review decisions&lt;/td&gt;
&lt;td&gt;UC22, UC25, UC27&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;data_classification.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Output data labeling (INTERNAL/CUI/etc.)&lt;/td&gt;
&lt;td&gt;UC23, UC24, UC27, UC28&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;schemas/events.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;TypedDict event/response schemas&lt;/td&gt;
&lt;td&gt;All 28 UCs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Adding a new industry pattern takes &lt;strong&gt;2-3 hours&lt;/strong&gt; (not days) because the infrastructure is already solved. A new pattern is considered field-shareable only after DemoMode execution, cfn-lint validation, unit/property tests, success metrics, data classification, and human review thresholds are documented.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Design Decisions for New Patterns
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Safety-Critical Thresholds (UC22)
&lt;/h3&gt;

&lt;p&gt;Railway infrastructure inspection cannot accept false negatives. We use a &lt;strong&gt;dual-threshold&lt;/strong&gt; approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;STANDARD_THRESHOLD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;       &lt;span class="c1"&gt;# General defect detection trigger
&lt;/span&gt;&lt;span class="n"&gt;SAFETY_CRITICAL_THRESHOLD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;  &lt;span class="c1"&gt;# Bridges, signaling, rail joints — lower to reduce false negatives
&lt;/span&gt;&lt;span class="n"&gt;HUMAN_REVIEW_THRESHOLD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;90&lt;/span&gt;    &lt;span class="c1"&gt;# Auto-approve only above this
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Critical design intent&lt;/strong&gt;: 60% is NOT an auto-approval threshold. It is an &lt;strong&gt;escalation trigger&lt;/strong&gt; — any signal above 60% for safety-critical categories triggers mandatory human review. The system is designed to surface potential defects for expert evaluation, not to automate safety decisions. All detections below 90% confidence require human review regardless of category.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. PII-First Design (UC27)
&lt;/h3&gt;

&lt;p&gt;Recruiting document triage handles personal data. The pattern enforces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No PII in logs&lt;/strong&gt; — structured logging strips personal identifiers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Protected characteristic exclusion&lt;/strong&gt; — Bedrock prompt explicitly excludes age, gender, ethnicity&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Encrypted output&lt;/strong&gt; — all results written with data classification labels&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit trail&lt;/strong&gt; — every scoring decision is logged with justification (not content)&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Regulatory notice&lt;/strong&gt;: UC27 is a &lt;strong&gt;document triage and summarization workflow&lt;/strong&gt;, not an automated hiring decision system. Final hiring decisions must remain with qualified human reviewers. Customers must validate against local labor law, privacy regulations (GDPR, APPI, CCPA), and anti-discrimination requirements before any use in recruitment processes. Output must not include ranking by protected attributes, and explanation fields must cite only job-relevant qualifications.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  3. Tri-Modal Inspection (UC25)
&lt;/h3&gt;

&lt;p&gt;Utilities asset inspection combines three data modalities in a single workflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Visual&lt;/strong&gt; (drone images) → Rekognition defect detection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Temporal&lt;/strong&gt; (SCADA logs) → Athena time-series anomaly detection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Thermal&lt;/strong&gt; (FLIR images) → Hot-spot classification (≥10°C differential)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The Step Functions workflow processes all three in parallel Map states, then merges results for a unified maintenance priority report.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. ESG Framework Mapping (UC23)
&lt;/h3&gt;

&lt;p&gt;Sustainability reporting requires mapping extracted metrics to multiple frameworks simultaneously:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GRI&lt;/strong&gt; (Global Reporting Initiative)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TCFD&lt;/strong&gt; (Task Force on Climate-related Financial Disclosures)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ISSB&lt;/strong&gt; (International Sustainability Standards Board)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Bedrock performs the mapping using structured prompts with framework-specific indicator definitions.&lt;/p&gt;




&lt;h2&gt;
  
  
  Testing: 1,499+ Tests Across 28 Patterns
&lt;/h2&gt;

&lt;p&gt;Each new pattern includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unit tests&lt;/strong&gt; with moto for AWS service mocking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Property-based tests&lt;/strong&gt; (Hypothesis) for invariant verification&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;cfn-lint validation&lt;/strong&gt; for all CloudFormation templates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ruff linting&lt;/strong&gt; for Python code quality&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Notable property tests:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;UC22: &lt;code&gt;severity_level ∈ {critical, major, minor, observation}&lt;/code&gt; for all inputs&lt;/li&gt;
&lt;li&gt;UC25: SCADA thresholds within physical bounds (voltage ±5%, frequency ±0.5 Hz)&lt;/li&gt;
&lt;li&gt;UC27: No protected characteristics appear in any output field&lt;/li&gt;
&lt;li&gt;UC28: All GHS mandatory sections validated for completeness&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Responsible AI and Human Review
&lt;/h2&gt;

&lt;p&gt;These patterns are &lt;strong&gt;reference workflows&lt;/strong&gt;, not fully automated decision systems. For regulated or safety-critical domains (healthcare, finance, transportation, HR, public sector), customers must define:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Human review thresholds&lt;/strong&gt; — what confidence level requires expert validation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Appeal/escalation process&lt;/strong&gt; — how incorrect classifications are corrected&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit trail requirements&lt;/strong&gt; — what decisions need immutable logging&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data retention policy&lt;/strong&gt; — how long intermediate results are kept&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model evaluation criteria&lt;/strong&gt; — accuracy, hallucination rate, bias testing on domain data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local regulatory review&lt;/strong&gt; — jurisdiction-specific compliance (FISC, HIPAA, GDPR, NARA, labor law)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;shared/human_review.py&lt;/code&gt; module provides a framework for confidence-based routing, but &lt;strong&gt;threshold values and escalation procedures must be defined by domain experts&lt;/strong&gt;, not by template defaults.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Customers are responsible for validating these workflows against their own policies, risk classification, and regulatory obligations before production use.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Pattern Selection Guide
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Customer Situation&lt;/th&gt;
&lt;th&gt;Recommended Starting Pattern&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;FSx ONTAP already used for shared files&lt;/td&gt;
&lt;td&gt;UC by industry + DemoMode=false&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No FSx ONTAP yet, wants to evaluate workflow&lt;/td&gt;
&lt;td&gt;Any UC + DemoMode=true&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Document-heavy workload (PDF, contracts, reports)&lt;/td&gt;
&lt;td&gt;UC20 / UC23 / UC24 / UC26 / UC27 / UC28&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Image-heavy inspection workload&lt;/td&gt;
&lt;td&gt;UC19 / UC21 / UC22 / UC25&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Logs / time-series / analytics workload&lt;/td&gt;
&lt;td&gt;UC18 / UC25 (SCADA)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Safety-critical review required&lt;/td&gt;
&lt;td&gt;UC22 / UC25 with human_review module&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PII-sensitive workflow&lt;/td&gt;
&lt;td&gt;UC27 / UC26 with data_classification module&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESG / sustainability reporting&lt;/td&gt;
&lt;td&gt;UC23 with framework mapping&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Greenfield object-native workload (no NAS)&lt;/td&gt;
&lt;td&gt;Prefer standard S3 + serverless-native architecture&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  DemoMode to Production Path
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Area&lt;/th&gt;
&lt;th&gt;DemoMode (evaluation)&lt;/th&gt;
&lt;th&gt;Production (FSx ONTAP)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Input source&lt;/td&gt;
&lt;td&gt;Regular S3 bucket&lt;/td&gt;
&lt;td&gt;FSx ONTAP S3 Access Point&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Permissions&lt;/td&gt;
&lt;td&gt;S3 IAM only&lt;/td&gt;
&lt;td&gt;IAM + S3 AP policy + ONTAP file identity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network&lt;/td&gt;
&lt;td&gt;Public AWS service path&lt;/td&gt;
&lt;td&gt;Internet-origin or VPC-origin design decision (&lt;strong&gt;NetworkOrigin is immutable after creation&lt;/strong&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data&lt;/td&gt;
&lt;td&gt;Sample/synthetic data&lt;/td&gt;
&lt;td&gt;Customer-controlled NAS data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Governance&lt;/td&gt;
&lt;td&gt;Demo labels only&lt;/td&gt;
&lt;td&gt;Data classification + lineage + retention&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost&lt;/td&gt;
&lt;td&gt;~$0.10/execution&lt;/td&gt;
&lt;td&gt;+ FSx ONTAP infrastructure (~$194/month base)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code compatibility&lt;/td&gt;
&lt;td&gt;Standard S3 bucket semantics&lt;/td&gt;
&lt;td&gt;Validate the FSx ONTAP S3 AP API subset and unsupported S3 bucket features before production&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Access point lifecycle&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;NetworkOrigin changes require creating a new S3 AP&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;Cost varies by region, deployment type, SSD capacity, throughput capacity, backups, and data transfer; the figure above is a baseline estimate for Single-AZ / 128 MBps / 1 TB SSD. This cost model is not scale-to-zero storage. Use this pattern when the value of processing existing NAS data in place outweighs the baseline FSx ONTAP infrastructure cost.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Deployment: 30 Minutes to First Result
&lt;/h2&gt;

&lt;p&gt;Every pattern includes a &lt;code&gt;samconfig.toml.example&lt;/code&gt; and step-by-step deployment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Copy and configure&lt;/span&gt;
&lt;span class="nb"&gt;cp &lt;/span&gt;samconfig.toml.example samconfig.toml
&lt;span class="c"&gt;# Edit: S3AccessPointAlias, VpcId, SubnetIds, etc.&lt;/span&gt;

&lt;span class="c"&gt;# 2. Deploy&lt;/span&gt;
sam build &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; sam deploy &lt;span class="nt"&gt;--guided&lt;/span&gt;

&lt;span class="c"&gt;# 3. Execute&lt;/span&gt;
aws stepfunctions start-execution &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--state-machine-arn&lt;/span&gt; &amp;lt;ARN from outputs&amp;gt;

&lt;span class="c"&gt;# 4. Verify&lt;/span&gt;
aws stepfunctions describe-execution &lt;span class="nt"&gt;--execution-arn&lt;/span&gt; &amp;lt;ARN&amp;gt;
&lt;span class="c"&gt;# Status: SUCCEEDED&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For patterns without FSx for ONTAP, &lt;strong&gt;DemoMode=true&lt;/strong&gt; uses a regular S3 bucket — ideal for evaluation without infrastructure commitment.&lt;/p&gt;




&lt;h2&gt;
  
  
  Benchmark Insight: Small Files Don't Need More Throughput
&lt;/h2&gt;

&lt;p&gt;During Phase 15 deployment verification, we ran benchmarks at 128/256/512 MBps throughput capacity with a 202-byte JSON manifest:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Throughput&lt;/th&gt;
&lt;th&gt;P50 @ conc=1&lt;/th&gt;
&lt;th&gt;P50 @ conc=25&lt;/th&gt;
&lt;th&gt;P50 @ conc=50&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;256 MBps&lt;/td&gt;
&lt;td&gt;56.9 ms&lt;/td&gt;
&lt;td&gt;60.3 ms&lt;/td&gt;
&lt;td&gt;257.9 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;512 MBps&lt;/td&gt;
&lt;td&gt;59.8 ms&lt;/td&gt;
&lt;td&gt;59.9 ms&lt;/td&gt;
&lt;td&gt;246.1 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;: For metadata-heavy workloads (JSON manifests, small config files, document headers), throughput capacity increase has zero effect on latency. The bottleneck is connection overhead (TLS + S3 AP routing), not bandwidth. Save costs by staying at 128 MBps for these workloads.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Sizing reference from a specific test environment, not a service limit.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Documentation: 8 Languages × 28 Patterns
&lt;/h2&gt;

&lt;p&gt;Every pattern includes documentation in:&lt;br&gt;
🇯🇵 Japanese (primary) · 🇺🇸 English · 🇰🇷 Korean · 🇨🇳 Chinese (Simplified) · 🇹🇼 Chinese (Traditional) · 🇫🇷 French · 🇩🇪 German · 🇪🇸 Spanish&lt;/p&gt;

&lt;p&gt;Each language includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;README.md&lt;/code&gt; — Overview, deployment, success metrics&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;docs/architecture.md&lt;/code&gt; — Mermaid data flow diagram&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;docs/demo-guide.md&lt;/code&gt; — Step-by-step demo with verification checklist&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each UC README includes &lt;strong&gt;Success Metrics&lt;/strong&gt; with Business Outcome, Technical KPI, Quality KPI, Cost KPI, and Go/No-Go criteria. This article summarizes the portfolio; detailed success criteria live with each pattern.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Changed Since Phase 14
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Phase 14&lt;/th&gt;
&lt;th&gt;Phase 15&lt;/th&gt;
&lt;th&gt;Delta&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Use cases&lt;/td&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt;28&lt;/td&gt;
&lt;td&gt;+11&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Total patterns&lt;/td&gt;
&lt;td&gt;24&lt;/td&gt;
&lt;td&gt;35&lt;/td&gt;
&lt;td&gt;+11&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test count&lt;/td&gt;
&lt;td&gt;~800&lt;/td&gt;
&lt;td&gt;1,499+&lt;/td&gt;
&lt;td&gt;+699&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Industries covered&lt;/td&gt;
&lt;td&gt;14/22&lt;/td&gt;
&lt;td&gt;19/22&lt;/td&gt;
&lt;td&gt;+5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Languages&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Shared modules&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;+3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Documentation files&lt;/td&gt;
&lt;td&gt;~400&lt;/td&gt;
&lt;td&gt;~700&lt;/td&gt;
&lt;td&gt;+300&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Who Should Use Each New Pattern?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Recommended Starting Patterns
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Start here if...&lt;/th&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;You want document intelligence&lt;/td&gt;
&lt;td&gt;UC20 or UC26&lt;/td&gt;
&lt;td&gt;Multilingual extraction + property/lease analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;You want log analytics&lt;/td&gt;
&lt;td&gt;UC18&lt;/td&gt;
&lt;td&gt;CDR/syslog anomaly detection with baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;You need PII-safe document triage&lt;/td&gt;
&lt;td&gt;UC27&lt;/td&gt;
&lt;td&gt;Protected characteristic exclusion built-in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;You need inspection workflows&lt;/td&gt;
&lt;td&gt;UC22 or UC25&lt;/td&gt;
&lt;td&gt;Safety-critical escalation + tri-modal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;You want ESG extraction&lt;/td&gt;
&lt;td&gt;UC23&lt;/td&gt;
&lt;td&gt;Multi-framework mapping (GRI/TCFD/ISSB)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Full Pattern List
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;If you are...&lt;/th&gt;
&lt;th&gt;Start with...&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Telecom operator with CDR data&lt;/td&gt;
&lt;td&gt;UC18&lt;/td&gt;
&lt;td&gt;Anomaly detection across network logs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ad agency managing creative assets&lt;/td&gt;
&lt;td&gt;UC19&lt;/td&gt;
&lt;td&gt;Automated brand compliance scoring&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hotel chain with inspection photos&lt;/td&gt;
&lt;td&gt;UC20&lt;/td&gt;
&lt;td&gt;Facility condition monitoring at scale&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agricultural cooperative&lt;/td&gt;
&lt;td&gt;UC21&lt;/td&gt;
&lt;td&gt;Crop health + traceability in one workflow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Railway/transit operator&lt;/td&gt;
&lt;td&gt;UC22&lt;/td&gt;
&lt;td&gt;Safety-critical deterioration detection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESG reporting team&lt;/td&gt;
&lt;td&gt;UC23&lt;/td&gt;
&lt;td&gt;Multi-framework metric extraction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Grant-making foundation&lt;/td&gt;
&lt;td&gt;UC24&lt;/td&gt;
&lt;td&gt;Application processing + outcome matching&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Power utility with drone programs&lt;/td&gt;
&lt;td&gt;UC25&lt;/td&gt;
&lt;td&gt;Tri-modal inspection (visual + SCADA + thermal)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Real estate portfolio manager&lt;/td&gt;
&lt;td&gt;UC26&lt;/td&gt;
&lt;td&gt;Property analysis + lease extraction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Recruiting team (APAC/EMEA)&lt;/td&gt;
&lt;td&gt;UC27&lt;/td&gt;
&lt;td&gt;PII-compliant recruiting document triage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Chemical manufacturer&lt;/td&gt;
&lt;td&gt;UC28&lt;/td&gt;
&lt;td&gt;SDS compliance + lab notebook digitization&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;VPC-internal Lambda benchmark&lt;/strong&gt; — True VPC path performance (eliminates Internet latency)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FPolicy TCP-level Replay Storm&lt;/strong&gt; — Real ONTAP event replay (requires ECS rebuild)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-repository integration&lt;/strong&gt; — Link patterns to &lt;a href="https://github.com/Yoshiki0705/fsxn-lakehouse-integrations" rel="noopener noreferrer"&gt;fsxn-lakehouse-integrations&lt;/a&gt; for analytics pipelines&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Glue Data Catalog integration&lt;/strong&gt; — Schema versioning and data quality checks for output datasets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Community contributions&lt;/strong&gt; — Pattern template for community-submitted industry use cases&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Resolved from Phase 14&lt;/strong&gt;: FlexCache × S3 AP integration confirmed as not currently supported by AWS — tracked in Field Feedback Log. FC1 Recovery Metrics depend on this feature. Both remain pending AWS feature availability.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Ownership Model
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Recommended Owner&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Shared modules (&lt;code&gt;shared/&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Platform / DevOps team&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UC business logic (&lt;code&gt;functions/&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Application / data team&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FSx ONTAP and S3 AP infrastructure&lt;/td&gt;
&lt;td&gt;Storage / platform team&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IAM, data classification, encryption&lt;/td&gt;
&lt;td&gt;Security team&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Success metrics and Go/No-Go&lt;/td&gt;
&lt;td&gt;Business owner&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Regulatory compliance mapping&lt;/td&gt;
&lt;td&gt;GRC / legal team&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Compliance Positioning
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;These templates &lt;strong&gt;do not certify compliance&lt;/strong&gt; with any specific regulation. They provide implementation hooks for audit logging, retention, classification, and human review that customers can map to their regulatory controls. Each organization must independently validate compliance with applicable regulations (FISC, HIPAA, GDPR, NARA, local labor law, etc.).&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  NetApp / ONTAP Operational Notes
&lt;/h2&gt;

&lt;p&gt;For production deployments on FSx for ONTAP, review the ONTAP-specific guidance in &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns/blob/main/docs/ontap-integration-notes.md" rel="noopener noreferrer"&gt;&lt;code&gt;docs/ontap-integration-notes.md&lt;/code&gt;&lt;/a&gt;, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SVM / volume / protocol scope assumptions&lt;/li&gt;
&lt;li&gt;NFS/SMB visibility of S3 AP-generated outputs (file ownership = AP file system identity)&lt;/li&gt;
&lt;li&gt;IAM + S3 AP policy + ONTAP file identity behavior, separate from NFS export policy evaluation&lt;/li&gt;
&lt;li&gt;Snapshot / SnapMirror / retention impact on output artifacts&lt;/li&gt;
&lt;li&gt;Scheduler vs FPolicy trigger mode selection&lt;/li&gt;
&lt;li&gt;FlexCache / FlexClone combination patterns per UC&lt;/li&gt;
&lt;li&gt;NetApp support diagnostic bundle&lt;/li&gt;
&lt;li&gt;OT/manufacturing safety caveat&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;FlexCache/FlexClone note&lt;/strong&gt;: UC × FC combination patterns describe adjacent architecture patterns. Validate current AWS/FSx feature support before assuming direct S3 AP access to cached or cloned paths.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Benchmark scope&lt;/strong&gt;: Results are from Single-AZ, First-generation FSx ONTAP. Validate separately for Multi-AZ or newer generation file systems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Regulated research workflows&lt;/strong&gt; (UC7, UC28, FC5): Capture input dataset version, model/prompt version, reviewer action, and output checksum as lineage metadata. See &lt;code&gt;shared/lineage.py&lt;/code&gt; v2 fields.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Stats
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;New patterns&lt;/strong&gt;: 11 (UC18-UC28)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;New Lambda functions&lt;/strong&gt;: 44 (4 per pattern average)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;New tests&lt;/strong&gt;: 699&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;New documentation files&lt;/strong&gt;: ~300 (across 8 languages)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;New shared modules&lt;/strong&gt;: &lt;code&gt;data_classification.py&lt;/code&gt;, &lt;code&gt;human_review.py&lt;/code&gt;, &lt;code&gt;schemas/events.py&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deployment verified&lt;/strong&gt;: All 28 UCs achieved SUCCEEDED status in ap-northeast-1&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Benchmark runs&lt;/strong&gt;: 2 additional (256/512 MBps small-file comparison)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost&lt;/strong&gt;: ~$10 total for deployment verification (Lambda + Step Functions + Bedrock Nova Lite)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Try It Today
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns.git
&lt;span class="nb"&gt;cd &lt;/span&gt;FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns

&lt;span class="c"&gt;# Quick test (no AWS account needed)&lt;/span&gt;
make test-quick

&lt;span class="c"&gt;# Deploy any pattern with DemoMode (no FSx ONTAP needed)&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;telecom-network-analytics
&lt;span class="nb"&gt;cp &lt;/span&gt;samconfig.toml.example samconfig.toml
sam build &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; sam deploy &lt;span class="nt"&gt;--guided&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;strong&gt;Repository&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full series&lt;/strong&gt;: &lt;a href="https://dev.to/yoshikifujiwara/series/39652"&gt;FSx for ONTAP S3 Access Points on DEV.to&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>amazonfsxfornetappontap</category>
      <category>s3accesspoints</category>
    </item>
    <item>
      <title>Evidence Expansion, Presigned URL Discovery, and Operational Surprises — FSx for ONTAP S3 Access Points, Phase 14</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Sun, 07 Jun 2026 02:53:08 +0000</pubDate>
      <link>https://dev.to/aws-builders/evidence-expansion-presigned-url-discovery-and-operational-surprises-fsx-for-ontap-s3-access-514o</link>
      <guid>https://dev.to/aws-builders/evidence-expansion-presigned-url-discovery-and-operational-surprises-fsx-for-ontap-s3-access-514o</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Phase 14 shifts from building patterns to &lt;strong&gt;hardening the evidence base&lt;/strong&gt;. After publishing Phase 13's field-ready reference architecture, I focused on post-publication refinement: Partner/SI delivery assets, benchmark methodology standardization, S3 AP compatibility clarification (Presigned URLs work despite documentation), and an unexpected operational discovery — S3 Access Points become unavailable during FSx throughput capacity changes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Repository&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Phase 14?
&lt;/h2&gt;

&lt;p&gt;Phase 13 delivered the field-ready baseline. Phase 14 answers the question: &lt;strong&gt;"Now that the patterns exist, how do we make them easier to evaluate, adopt, and operate?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The work falls into four categories:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Partner/SI delivery acceleration&lt;/strong&gt; — one-pager, improved PoC templates, FC1-FC6 conversation starters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Benchmark methodology&lt;/strong&gt; — standardized run IDs, hypothesis-driven testing, Range GET plans&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compatibility clarification&lt;/strong&gt; — Presigned URL behavior confirmed with AWS Support&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operational discovery&lt;/strong&gt; — S3 AP unavailability during throughput capacity changes&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  1. Partner/SI One-Pager: What / When / How / Where
&lt;/h2&gt;

&lt;p&gt;Partners and SIs told us the existing 7-step delivery checklist was comprehensive but too long for a first conversation. Phase 14 adds a &lt;strong&gt;single-page overview&lt;/strong&gt; that answers four questions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Section&lt;/th&gt;
&lt;th&gt;Content&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;What&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;28 UCs + 6 FC patterns, CloudFormation templates, 4-level maturity model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;When&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Customer has FSx for ONTAP + needs serverless file processing + permission-aware access&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;How&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Identify UC → Deploy template → Measure baseline → Evaluate Go/No-Go&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Links to Success Metrics, Governance, Production Readiness, Benchmarks&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Available in both &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns/blob/main/docs/partner-si-one-pager.md" rel="noopener noreferrer"&gt;Japanese&lt;/a&gt; and &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns/blob/main/docs/partner-si-one-pager.en.md" rel="noopener noreferrer"&gt;English&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  FC1-FC6 Recommended First Questions
&lt;/h3&gt;

&lt;p&gt;Each FlexCache/FlexClone pattern now has a &lt;strong&gt;recommended first conversation question&lt;/strong&gt; — the question a Partner/SI should ask to determine if the pattern is relevant:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;First Question&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;FC1 (Anycast/DR)&lt;/td&gt;
&lt;td&gt;"What is your current read latency from remote sites, and what target would justify a caching layer?"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FC2 (Render)&lt;/td&gt;
&lt;td&gt;"How many concurrent render jobs share the same source data, and what is the job lifecycle?"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FC3 (RAG)&lt;/td&gt;
&lt;td&gt;"Which file shares contain the knowledge base, and do access permissions need to be preserved in RAG results?"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FC4 (CAE)&lt;/td&gt;
&lt;td&gt;"What is the typical solver output size and how quickly must results be available for post-processing?"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FC5 (Life Sciences)&lt;/td&gt;
&lt;td&gt;"How do you currently share research datasets between teams while maintaining data governance?"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FC6 (Gaming)&lt;/td&gt;
&lt;td&gt;"What is your current build pipeline duration and which asset validation steps are bottlenecks?"&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  2. Presigned URLs: "Not Supported" but Working
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Production Warning&lt;/strong&gt;: AWS Support explicitly states that operations marked "Not supported" should NOT be relied upon for production workloads, even when they return success today. The behavior may change without deprecation notice, return inconsistent results across regions, or stop working after service updates. &lt;strong&gt;Design alternatives for any workflow that requires presigned URL access to FSx for ONTAP S3 Access Points.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Discovery
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://docs.aws.amazon.com/fsx/latest/ONTAPGuide/access-points-for-fsxn-object-api-support.html" rel="noopener noreferrer"&gt;FSx for ONTAP S3 AP compatibility table&lt;/a&gt; lists &lt;code&gt;Presign — Not supported&lt;/code&gt;. However, testing showed presigned URLs for GetObject work successfully.&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS Support Clarification
&lt;/h3&gt;

&lt;p&gt;After raising this with AWS Support, the explanation was clear:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Presigning is client-side only&lt;/strong&gt; — &lt;code&gt;aws s3 presign&lt;/code&gt; computes a SigV4 signature locally. No network request is made.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The presigned URL executes a standard GetObject&lt;/strong&gt; — signature is in query parameters instead of the Authorization header.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Since GetObject is Supported, presigned URLs cannot be blocked&lt;/strong&gt; without breaking GetObject itself.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The documentation likely intended&lt;/strong&gt; to indicate that presigned URL workflows are not officially tested.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Production Guidance
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;Guidance&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GetObject, PutObject, ListObjectsV2&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Supported&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Build on freely&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Conditional writes (If-None-Match)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Blocked&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Returns NotImplemented&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Presigned URLs&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Not supported (doc)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Works but do not rely on for production&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;AWS Support has escalated documentation clarification to the FSx for ONTAP service team. The distinction between "Not supported + hard-blocked" (returns error) and "Not supported + may incidentally work" (no guarantees) is being reviewed.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Readers should verify the &lt;a href="https://docs.aws.amazon.com/fsx/latest/ONTAPGuide/access-points-for-fsxn-object-api-support.html" rel="noopener noreferrer"&gt;latest AWS documentation&lt;/a&gt; before relying on this behavior, as the status may change.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Alternatives to Presigned URLs
&lt;/h3&gt;

&lt;p&gt;If you need time-limited or delegated file access without relying on unsupported behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;API Gateway + Lambda proxy&lt;/strong&gt; with IAM or JWT authorization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CloudFront signed URLs&lt;/strong&gt; backed by a controlled Lambda@Edge origin&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Temporary STS credentials&lt;/strong&gt; with scoped IAM permissions (time-limited, per-object or prefix)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Application-level download broker&lt;/strong&gt; with audit logging and access revocation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a broader comparison with standard S3 bucket semantics, see the &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns/blob/main/docs/s3-bucket-user-guide.md" rel="noopener noreferrer"&gt;S3 Bucket User Guide&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Benchmark Methodology: Hypothesis-Driven Testing
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why 1769 MB Lambda Memory?
&lt;/h3&gt;

&lt;p&gt;Lambda memory directly controls CPU and network bandwidth allocation. At 1769 MB, Lambda receives exactly 1 vCPU equivalent, providing consistent and reproducible network throughput for benchmark measurements. Lower memory settings would introduce variable network bandwidth as a confounding factor.&lt;/p&gt;

&lt;h3&gt;
  
  
  Benchmark Run ID Convention
&lt;/h3&gt;

&lt;p&gt;Every benchmark run now follows a standardized format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;s3ap-bench-{YYYY-MM-DD}-{seq}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With mandatory fixed conditions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ap-northeast-1&lt;/span&gt;
&lt;span class="na"&gt;Lambda memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;1769 MB (1 vCPU)&lt;/span&gt;
&lt;span class="na"&gt;Lambda architecture&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;arm64&lt;/span&gt;
&lt;span class="na"&gt;FSx Throughput Capacity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;128 / 256 / 512&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt; &lt;span class="s"&gt;MBps&lt;/span&gt;
&lt;span class="na"&gt;Iterations per data point&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;50&lt;/span&gt;
&lt;span class="na"&gt;Statistics&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;p50, p90, p95, p99, min, max&lt;/span&gt;
&lt;span class="na"&gt;Concurrent NFS/SMB workload&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;None / Light / Production-level&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Hypothesis: Throughput Capacity vs Practical Concurrency Point
&lt;/h3&gt;

&lt;p&gt;Based on 128 MBps observations where I observed concurrency=10 as the practical upper limit &lt;strong&gt;in this specific test environment&lt;/strong&gt; (1 MB objects, single Lambda invocation pattern, no concurrent NFS/SMB workload), I hypothesize:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Practical concurrency point may shift with FSx throughput capacity increase.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;FSx Capacity&lt;/th&gt;
&lt;th&gt;Predicted Practical Concurrency&lt;/th&gt;
&lt;th&gt;Rationale&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;128 MBps&lt;/td&gt;
&lt;td&gt;10 (observed)&lt;/td&gt;
&lt;td&gt;Baseline — P99 exceeded 420 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;256 MBps&lt;/td&gt;
&lt;td&gt;~15-25&lt;/td&gt;
&lt;td&gt;Sub-linear scaling is plausible due to ONTAP WAFL overhead and TCP connection management&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;512 MBps&lt;/td&gt;
&lt;td&gt;~25-45&lt;/td&gt;
&lt;td&gt;Step-function behavior possible if a different bottleneck emerges&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Linear scaling (2x capacity = 2x concurrency) is one possible outcome, but sub-linear or step-function behavior is equally plausible. The actual relationship depends on ONTAP data plane queuing, TCP connection overhead, and whether the bottleneck shifts from throughput to IOPS or latency at higher capacities.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Initial verification was blocked by the S3 AP issue described below. Results were published after recovery on 2026-05-25.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Update (2026-05-25)&lt;/strong&gt;: S3 AP recovered. Benchmarks completed. See Section 7 for results and hypothesis verification.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  4. Operational Discovery: S3 AP Unavailability During Throughput Changes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What Happened
&lt;/h3&gt;

&lt;p&gt;While preparing to run 256 MBps benchmarks, I changed the FSx throughput capacity from 128 to 256 MBps. After the change completed successfully:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;All S3 Access Points&lt;/strong&gt; on the file system returned &lt;code&gt;ServiceUnavailable&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;All SVMs&lt;/strong&gt; were affected (not just one)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reverting to 128 MBps&lt;/strong&gt; did not immediately restore S3 AP access&lt;/li&gt;
&lt;li&gt;The file system itself remained &lt;code&gt;AVAILABLE&lt;/code&gt; throughout&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Timeline
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Time&lt;/th&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;T+0&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;update-file-system&lt;/code&gt; ThroughputCapacity 128 → 256&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T+25 min&lt;/td&gt;
&lt;td&gt;Change completed (256 MBps confirmed)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T+25+ min&lt;/td&gt;
&lt;td&gt;All S3 APs return ServiceUnavailable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T+40 min&lt;/td&gt;
&lt;td&gt;Revert initiated (256 → 128)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T+65 min&lt;/td&gt;
&lt;td&gt;Revert completed, S3 APs still unavailable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Impact and Recommendation
&lt;/h3&gt;

&lt;p&gt;This is now tracked as an AWS Support case. Key takeaways:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Plan throughput capacity changes during maintenance windows. S3 AP workloads may be disrupted for an extended period.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Unlike standard S3 buckets, FSx for ONTAP S3 AP availability can be affected by FSx file system operational changes such as throughput capacity updates.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Important context&lt;/strong&gt;: AWS documentation states that NFS/SMB access typically remains available during throughput capacity changes. The S3 AP disruption I observed appears to be specific to the S3 Access Point data plane — not the file system's NFS/SMB data LIFs. This distinction matters for environments that use both protocols.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For regulated environments&lt;/strong&gt; (FISC, healthcare, government): Throughput capacity changes must be included in change management procedures. If S3 AP-based workloads have SLA requirements, the change should be approved through the organization's change advisory board with documented rollback procedures.&lt;/p&gt;

&lt;p&gt;This finding has been added to the &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns/blob/main/docs/production-readiness.md" rel="noopener noreferrer"&gt;Production Readiness&lt;/a&gt; document as a Level 3+ operational consideration.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. DEV.to Series Cleanup
&lt;/h2&gt;

&lt;p&gt;Phase 14 also cleaned up the article series:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Update Notes&lt;/strong&gt; added to Phase 1, 9, 10, 12 articles linking to Phase 13&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Permission-Aware RAG&lt;/strong&gt; articles moved to a separate series (was incorrectly mixed into FSx S3AP series)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Series now has 14 articles&lt;/strong&gt; in "FSx for ONTAP S3 Access Points" (down from 16 after RAG separation)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Blockers and Current Status
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;Blocker&lt;/th&gt;
&lt;th&gt;Resolution&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;del&gt;256/512 MBps benchmark&lt;/del&gt;&lt;/td&gt;
&lt;td&gt;&lt;del&gt;S3 AP ServiceUnavailable&lt;/del&gt;&lt;/td&gt;
&lt;td&gt;✅ &lt;strong&gt;Resolved 2026-05-25&lt;/strong&gt; — Results below&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FC1 Recovery Metrics&lt;/td&gt;
&lt;td&gt;FlexCache × S3 AP integration&lt;/td&gt;
&lt;td&gt;Pending AWS feature availability&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;del&gt;Hypothesis verification&lt;/del&gt;&lt;/td&gt;
&lt;td&gt;&lt;del&gt;Depends on benchmark&lt;/del&gt;&lt;/td&gt;
&lt;td&gt;✅ &lt;strong&gt;Partially confirmed&lt;/strong&gt; — see results&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  6. Benchmark Results: 128 / 256 / 512 MBps Concurrency Comparison
&lt;/h2&gt;

&lt;p&gt;S3 AP ServiceUnavailable was resolved on 2026-05-25. We immediately executed the planned benchmark across all three throughput tiers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Test Environment
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note on methodology divergence&lt;/strong&gt;: The benchmark methodology (Section 3) defines 1769 MB Lambda + 50 iterations as the standard. The Internet tests below used a macOS client + 10 iterations per concurrency level due to the initial exploratory nature of these measurements. The Lambda egress test (Section 8) follows the 1769 MB / 50 iteration standard. Treat Internet results as directional sizing guidance, not as statistically rigorous benchmarks.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parameter&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Region&lt;/td&gt;
&lt;td&gt;ap-northeast-1 (Tokyo)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FSx for ONTAP&lt;/td&gt;
&lt;td&gt;Single-AZ, First-generation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S3 AP&lt;/td&gt;
&lt;td&gt;NetworkOrigin=Internet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Client&lt;/td&gt;
&lt;td&gt;macOS, boto3, Python 3.9 (public Internet)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Object sizes&lt;/td&gt;
&lt;td&gt;1 KB, 100 KB, 1 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Concurrency&lt;/td&gt;
&lt;td&gt;1, 5, 10, 20, 50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Iterations&lt;/td&gt;
&lt;td&gt;10 per concurrency level (exploratory)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Key Results: 1 MB GetObject P99
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Concurrency&lt;/th&gt;
&lt;th&gt;128 MBps&lt;/th&gt;
&lt;th&gt;256 MBps&lt;/th&gt;
&lt;th&gt;512 MBps&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;76 ms&lt;/td&gt;
&lt;td&gt;93 ms&lt;/td&gt;
&lt;td&gt;96 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;160 ms&lt;/td&gt;
&lt;td&gt;175 ms&lt;/td&gt;
&lt;td&gt;308 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;239 ms&lt;/td&gt;
&lt;td&gt;236 ms&lt;/td&gt;
&lt;td&gt;229 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;20&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;981 ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;481 ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;738 ms&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;850 ms&lt;/td&gt;
&lt;td&gt;4,495 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Analysis
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;P50 (median) is largely independent of throughput capacity&lt;/strong&gt; — Internet baseline latency (connection + TLS) dominates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;P99 (tail latency) shows the difference&lt;/strong&gt; — 128→256 MBps improved P99 by 51% at concurrency=20&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;512 MBps shows no improvement over 256 MBps via Internet&lt;/strong&gt; — client-side bandwidth (~100 Mbps) becomes the bottleneck&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hypothesis partially confirmed&lt;/strong&gt;: Practical concurrency point does shift with throughput capacity, but the relationship is non-linear and bounded by client bandwidth in Internet-origin tests&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Sizing Guidance
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Workload&lt;/th&gt;
&lt;th&gt;128 MBps&lt;/th&gt;
&lt;th&gt;256 MBps&lt;/th&gt;
&lt;th&gt;512 MBps&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Small files (&amp;lt; 10 KB)&lt;/td&gt;
&lt;td&gt;MaxConcurrency=20&lt;/td&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Medium files (100 KB)&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Large files (1 MB+)&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;These are sizing references from a specific test environment, not service limits. A VPC-internal Lambda + VPC-origin S3 AP path is expected to reduce public Internet overhead, but remains untested and must be validated separately. Always validate with your own workload profile.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  What This Means for Production
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;For PoC (128 MBps)&lt;/strong&gt;: Keep Step Functions Map state MaxConcurrency ≤ 5 for 1 MB+ files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For Production (256+ MBps)&lt;/strong&gt;: MaxConcurrency=10-20 is safe for most workloads&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For VPC-internal Lambda (untested)&lt;/strong&gt;: Expected to further reduce latency by eliminating public Internet path, but requires VPC-origin S3 AP (not yet measured)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Throughput capacity changes&lt;/strong&gt;: Plan during maintenance windows (S3 AP disruption risk confirmed)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Small files (&amp;lt; 1 KB)&lt;/strong&gt;: Throughput capacity increase has no effect — bottleneck is connection overhead, not bandwidth. Save costs by staying at 128 MBps for metadata-heavy workloads&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  7. Lambda Egress Path Benchmark: Reducing Connection Overhead
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Terminology clarification&lt;/strong&gt;: This test uses a &lt;strong&gt;VPC-external Lambda&lt;/strong&gt; (no VpcConfig) accessing an &lt;strong&gt;Internet-origin S3 AP&lt;/strong&gt;. The Lambda egress path goes through AWS-managed networking, which is faster than public Internet but is NOT a VPC-internal path. A true VPC-internal test would require a VPC-origin S3 AP + VPC-internal Lambda — that remains untested.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Path&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;What it measures&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Public Internet client → Internet-origin S3 AP&lt;/td&gt;
&lt;td&gt;✅ Measured (Section 7)&lt;/td&gt;
&lt;td&gt;End-user/CI baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VPC-external Lambda → Internet-origin S3 AP&lt;/td&gt;
&lt;td&gt;✅ Measured (this section)&lt;/td&gt;
&lt;td&gt;AWS-managed Lambda egress, NOT VPC-private&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VPC-internal Lambda → VPC-origin S3 AP&lt;/td&gt;
&lt;td&gt;❌ Not yet measured&lt;/td&gt;
&lt;td&gt;True private path (requires new AP)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;We deployed a benchmark Lambda (1769 MB, ARM64, &lt;strong&gt;no VpcConfig&lt;/strong&gt;) to measure GetObject latency via AWS-managed Lambda egress.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lambda Egress vs Internet: 1 MB GetObject P50
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Concurrency&lt;/th&gt;
&lt;th&gt;Internet P50&lt;/th&gt;
&lt;th&gt;Lambda P50&lt;/th&gt;
&lt;th&gt;Improvement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;68 ms&lt;/td&gt;
&lt;td&gt;62 ms&lt;/td&gt;
&lt;td&gt;9%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;117 ms&lt;/td&gt;
&lt;td&gt;61 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;48%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;175 ms&lt;/td&gt;
&lt;td&gt;73 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;58%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;256 ms&lt;/td&gt;
&lt;td&gt;122 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;52%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;128 ms&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Key Findings
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;P50 dramatically improved at concurrency &amp;gt; 1&lt;/strong&gt;: Lambda egress eliminates public Internet TCP connection overhead&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;P99 remains high (~1s)&lt;/strong&gt;: Even from Lambda, concurrency=20 shows P99 of 1,318 ms — this is the S3 AP data plane's internal queuing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;concurrency=50 P50 is only 128 ms&lt;/strong&gt;: Lambda threads are efficient against S3 AP&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The bottleneck is the FSx for ONTAP S3 AP data plane&lt;/strong&gt;, not Lambda network bandwidth&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Production Sizing (Lambda)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Workload&lt;/th&gt;
&lt;th&gt;Recommended MaxConcurrency&lt;/th&gt;
&lt;th&gt;Expected P50&lt;/th&gt;
&lt;th&gt;Expected P99&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Small files (1 KB)&lt;/td&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;td&gt;~63 ms&lt;/td&gt;
&lt;td&gt;~994 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Medium files (100 KB)&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;~79 ms&lt;/td&gt;
&lt;td&gt;~1,044 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Large files (1 MB)&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;~73 ms&lt;/td&gt;
&lt;td&gt;~928 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;Set Lambda timeout to 30s+ and use Step Functions Retry to handle P99 spikes. These results are from VPC-external Lambda (AWS-managed egress), not true VPC-internal path.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  8. SQS Replay Storm Simulation: Zero Message Loss Under Load
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Scope clarification&lt;/strong&gt;: This test validates the &lt;strong&gt;downstream SQS ingestion and Lambda consumer drain path&lt;/strong&gt; under replay-like burst conditions. It does NOT validate ONTAP Persistent Store buffering or FPolicy TCP-level server reconnection replay. Those require a live FPolicy server environment (future work).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We simulated FPolicy server reconnection by injecting 1,000 and 10,000 events directly into SQS, mimicking the burst that occurs when a Persistent Store replays buffered events after server reconnection.&lt;/p&gt;

&lt;h3&gt;
  
  
  Results
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Events&lt;/th&gt;
&lt;th&gt;Loss Rate&lt;/th&gt;
&lt;th&gt;Throughput&lt;/th&gt;
&lt;th&gt;Batch P99&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;5 min downtime&lt;/td&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;188 eps&lt;/td&gt;
&lt;td&gt;177 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;30 min downtime&lt;/td&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;464 eps&lt;/td&gt;
&lt;td&gt;79 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Consumer drain&lt;/td&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;341 msgs/sec&lt;/td&gt;
&lt;td&gt;85 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  SLO Validation
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;SLO Metric&lt;/th&gt;
&lt;th&gt;Threshold&lt;/th&gt;
&lt;th&gt;Observed&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Event loss rate&lt;/td&gt;
&lt;td&gt;&amp;lt; 0.1%&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Injection throughput&lt;/td&gt;
&lt;td&gt;&amp;gt; 100 eps&lt;/td&gt;
&lt;td&gt;464 eps&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Consumer drain rate&lt;/td&gt;
&lt;td&gt;&amp;gt; injection rate&lt;/td&gt;
&lt;td&gt;341 &amp;gt; 188&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Batch latency P99&lt;/td&gt;
&lt;td&gt;&amp;lt; 200 ms&lt;/td&gt;
&lt;td&gt;79 ms&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DLQ messages&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Implications
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;30-min downtime&lt;/strong&gt; accumulates ~835K events at 464 eps. With Lambda auto-scaling (10 consumers), drain completes in &amp;lt; 5 minutes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Persistent Store sizing estimate&lt;/strong&gt;: Based on simulated event payload size, 10K events ≈ 5 MB. Real ONTAP Persistent Store sizing must be validated with live FPolicy replay.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No backpressure issues&lt;/strong&gt;: SQS Standard queue handles burst without message loss.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  9. Operational Runbook: S3 AP Disruption Response
&lt;/h2&gt;

&lt;p&gt;When S3 AP becomes unavailable (e.g., during throughput capacity changes):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Check S3 AP health&lt;/strong&gt;: &lt;code&gt;ListObjectsV2&lt;/code&gt; / &lt;code&gt;GetObject&lt;/code&gt; against S3 AP alias&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check NFS/SMB separately&lt;/strong&gt;: mount + read test (may still be functional)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check FSx file system status&lt;/strong&gt;: &lt;code&gt;describe-file-systems&lt;/code&gt; → Lifecycle&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check CloudWatch alarms&lt;/strong&gt;: Lambda errors, Step Functions failures&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pause ingestion&lt;/strong&gt;: Disable EventBridge Schedules for affected UC pipelines&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wait and retry&lt;/strong&gt;: S3 AP recovery may take 15-60 min after throughput changes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Escalate&lt;/strong&gt;: If unavailable &amp;gt; 60 min, contact AWS Support with file system ID&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;See &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns/blob/main/docs/incident-response-playbook.md" rel="noopener noreferrer"&gt;Incident Response Playbook&lt;/a&gt; for full procedures.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What's Next (Phase 15 candidates)
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: Phase 15 expanded the pattern library from 17 to 28 industry-specific use cases. Items 1-2 below remain pending AWS feature availability. Items 3-4 are carried forward in Phase 15's What's Next.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;FlexCache × S3 AP integration&lt;/strong&gt; — pending AWS feature availability (not yet supported)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FC1 Recovery Metrics&lt;/strong&gt; — route decision latency, cache health detection, failover timing (depends on #1)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Replay Storm with real FPolicy server&lt;/strong&gt; — TCP-level replay characteristics (requires ECS re-deploy)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VPC-internal Lambda with VPC Origin S3 AP&lt;/strong&gt; — true VPC-internal path (requires new AP with NetworkOrigin=VPC)&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;Multi-Account OAM validation completed 2026-05-25 — cross-account CloudWatch Metrics, Logs, and X-Ray Traces confirmed working.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Field feedback tracked&lt;/strong&gt;: S3 AP disruption during throughput change, presigned URL documentation gap, VPC-origin benchmark gap, and FlexCache × S3 AP feature dependency. Details in &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns/blob/main/docs/ontap-integration-notes.md#field-feedback-log" rel="noopener noreferrer"&gt;&lt;code&gt;docs/ontap-integration-notes.md&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Stats
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Files changed&lt;/strong&gt;: 200+ (documentation, translations, shared modules, templates)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;New documents&lt;/strong&gt;: Partner/SI one-pager (JP/EN/KO/ZH-CN), cost calculator, customization guide, incident response playbook, demo mode guide, comparison alternatives, PoC Go/No-Go template&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;New shared modules&lt;/strong&gt;: &lt;code&gt;data_classification.py&lt;/code&gt;, &lt;code&gt;human_review.py&lt;/code&gt;, &lt;code&gt;schemas/events.py&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Benchmark runs&lt;/strong&gt;: 7 (128/256/512 MBps Internet × 2 file sizes + Lambda egress + SQS replay storm simulation)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Templates fixed&lt;/strong&gt;: 5 (cfn-lint errors: RecursiveDeleteOption, SNSPublishMessagePolicy, Handler path)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Translations added&lt;/strong&gt;: 20 files (FC1-FC6 ko/zh-CN + FC1/FC3 full 8-lang)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;samconfig.toml.example&lt;/strong&gt;: 24 patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output JSON samples&lt;/strong&gt;: 24 patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DEV.to articles updated&lt;/strong&gt;: 6 (4 Update Notes + 2 Series changes)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS Support cases&lt;/strong&gt;: 1 resolved (S3 AP ServiceUnavailable — throughput change related)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operational discoveries&lt;/strong&gt;: 1 (throughput change → S3 AP disruption, now resolved)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost savings&lt;/strong&gt;: ~$346/month (v4-test-demo + FPolicy server + VPC Endpoints + EC2 停止)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SQS Replay Storm Simulation&lt;/strong&gt;: 10,000 events, 0% loss in downstream SQS/consumer path&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Who Should Care About Phase 14?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Partners and SIs&lt;/strong&gt; get a one-pager for first conversations and recommended questions for each FC pattern&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operations teams&lt;/strong&gt; learn that throughput capacity changes can disrupt S3 AP access&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architects&lt;/strong&gt; get standardized benchmark methodology with hypothesis-driven testing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Developers&lt;/strong&gt; get Presigned URL clarification — works but don't depend on it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standard S3 bucket users&lt;/strong&gt; learn where FSx for ONTAP S3 AP differs from S3 bucket semantics, especially presigned URLs, availability, and operational dependencies&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Serverless-first teams&lt;/strong&gt; learn where the serverless processing plane ends and FSx for ONTAP operational considerations begin&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What You Can Do Today
&lt;/h2&gt;

&lt;p&gt;Phase 14 delivers immediately usable assets:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Use the &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns/blob/main/docs/partner-si-one-pager.en.md" rel="noopener noreferrer"&gt;Partner/SI one-pager&lt;/a&gt;&lt;/strong&gt; for your next customer conversation about FSx for ONTAP + serverless&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check the &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns/blob/main/docs/s3ap-compatibility-notes.md" rel="noopener noreferrer"&gt;S3AP Compatibility Notes&lt;/a&gt;&lt;/strong&gt; for the latest Presigned URL and troubleshooting guidance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plan throughput changes carefully&lt;/strong&gt; — add S3 AP health checks to your maintenance runbook&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use the Sizing Guidance tables&lt;/strong&gt; (Sections 7-8) to set MaxConcurrency for your workload&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Review the &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns/blob/main/docs/s3-bucket-user-guide.md" rel="noopener noreferrer"&gt;S3 Bucket User Guide&lt;/a&gt;&lt;/strong&gt; before porting existing S3 applications to FSx for ONTAP S3 AP&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Review the &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns/blob/main/docs/ontap-integration-notes.md" rel="noopener noreferrer"&gt;ONTAP Integration Notes&lt;/a&gt;&lt;/strong&gt; before attaching S3 AP workflows to production SVMs and volumes&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;&lt;strong&gt;Repository&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full series&lt;/strong&gt;: &lt;a href="https://dev.to/yoshikifujiwara/series/39652"&gt;FSx for ONTAP S3 Access Points on DEV.to&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>amazonfsxfornetappontap</category>
      <category>s3accesspoints</category>
    </item>
    <item>
      <title>9 Services, One Architecture: What We Learned Shipping FSx for ONTAP Logs to Every Major Observability Platform</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Sun, 31 May 2026 01:42:23 +0000</pubDate>
      <link>https://dev.to/aws-builders/9-services-one-architecture-what-we-learned-shipping-fsx-for-ontap-logs-to-every-major-19ig</link>
      <guid>https://dev.to/aws-builders/9-services-one-architecture-what-we-learned-shipping-fsx-for-ontap-logs-to-every-major-19ig</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;We built and E2E-verified serverless integrations shipping FSx for ONTAP audit logs to &lt;strong&gt;9 observability platforms&lt;/strong&gt; — all from the same architecture:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For decision makers&lt;/strong&gt;: 90% cost reduction vs EC2-based collectors ($66/month → $5-8/month), 9 vendor choices instead of 1, 30-minute deploy instead of hours, zero operational burden. Four vendors offer permanent free tiers covering most FSx for ONTAP deployments (New Relic 100 GB, Grafana Cloud 50 GB, Honeycomb 20M events, Sumo Logic 500 MB/day).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                    ┌─────────────────────────────────────────────┐
                    │         One Architecture, 9 Backends        │
                    ├─────────────────────────────────────────────┤
                    │                                             │
                    │  FSx for ONTAP ──→ S3 Access Point          │
                    │       │                                     │
                    │       ▼                                     │
                    │  EventBridge Scheduler (5 min)              │
                    │       │                                     │
                    │       ▼                                     │
                    │  Lambda (vendor-specific handler)           │
                    │       │                                     │
                    │       ├──→ Datadog (Logs API v2)            │
                    │       ├──→ New Relic (Log API v1)           │
                    │       ├──→ Splunk (HEC)                     │
                    │       ├──→ Grafana Cloud (OTLP Gateway)     │
                    │       ├──→ Elastic (Bulk API)               │
                    │       ├──→ Dynatrace (Log Ingest v2)        │
                    │       ├──→ Sumo Logic (HTTP Source)         │
                    │       ├──→ Honeycomb (Events Batch API)     │
                    │       └──→ OTel Collector (OTLP/HTTP)       │
                    │                                             │
                    └─────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;12 articles, 9 vendors, 3 event sources (audit logs, EMS webhooks, FPolicy), all CloudFormation-templated, all tested with real FSx for ONTAP data. This post distills what we learned.&lt;/p&gt;

&lt;p&gt;This is Part 13 — the series finale — of &lt;a href="https://dev.to/aws-builders/why-your-fsx-for-ontap-audit-logs-deserve-better-than-ec2-kod"&gt;Serverless Observability for FSx for ONTAP&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Architecture That Survived 9 Integrations
&lt;/h2&gt;

&lt;p&gt;After implementing 9 vendor integrations, the core pattern remained unchanged:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# 1. Get cached credentials (Secrets Manager + TTL, default 5 min)
&lt;/span&gt;    &lt;span class="n"&gt;creds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# 2. List new files since checkpoint (S3 AP + SSM)
&lt;/span&gt;    &lt;span class="n"&gt;new_keys&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list_new_keys&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s3_ap_arn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;checkpoint&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 3. Read, parse, format, ship per file (vendor-specific)
&lt;/span&gt;    &lt;span class="c1"&gt;#    (Simplified — actual implementation batches events across files
&lt;/span&gt;    &lt;span class="c1"&gt;#     and respects vendor-specific batch size limits)
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;new_keys&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;read_and_parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;format_for_vendor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;logs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Only this changes per vendor
&lt;/span&gt;
    &lt;span class="c1"&gt;# 4. Ship with retry (vendor API)
&lt;/span&gt;        &lt;span class="nf"&gt;ship_to_vendor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;creds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 5. Advance checkpoint (only after confirmed delivery)
&lt;/span&gt;        &lt;span class="nf"&gt;update_checkpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What changes per vendor: &lt;strong&gt;only the formatting and HTTP call&lt;/strong&gt; (~50-100 lines). Everything else — S3 AP access, checkpoint management, DLQ handling, credential caching, retry logic — is shared.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cross-Vendor Comparison: The Numbers
&lt;/h2&gt;

&lt;h3&gt;
  
  
  API Characteristics
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Vendor&lt;/th&gt;
&lt;th&gt;Endpoint&lt;/th&gt;
&lt;th&gt;Auth Model&lt;/th&gt;
&lt;th&gt;Max Batch&lt;/th&gt;
&lt;th&gt;Success Code&lt;/th&gt;
&lt;th&gt;Firehose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Datadog&lt;/td&gt;
&lt;td&gt;Logs API v2&lt;/td&gt;
&lt;td&gt;Header (&lt;code&gt;DD-API-KEY&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;5 MB / 1000 items&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;New Relic&lt;/td&gt;
&lt;td&gt;Log API v1&lt;/td&gt;
&lt;td&gt;Header (&lt;code&gt;Api-Key&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;1 MB&lt;/td&gt;
&lt;td&gt;202&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Splunk&lt;/td&gt;
&lt;td&gt;HEC&lt;/td&gt;
&lt;td&gt;Header (&lt;code&gt;Splunk &amp;lt;token&amp;gt;&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;No hard limit&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;Yes (built-in)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Grafana&lt;/td&gt;
&lt;td&gt;OTLP Gateway&lt;/td&gt;
&lt;td&gt;Basic Auth (base64)&lt;/td&gt;
&lt;td&gt;~4 MB&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Elastic&lt;/td&gt;
&lt;td&gt;Bulk API&lt;/td&gt;
&lt;td&gt;Header (&lt;code&gt;ApiKey &amp;lt;b64&amp;gt;&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;~10 MB&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dynatrace&lt;/td&gt;
&lt;td&gt;Log Ingest v2&lt;/td&gt;
&lt;td&gt;Header (&lt;code&gt;Api-Token&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;1 MB&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;204&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Via ActiveGate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sumo Logic&lt;/td&gt;
&lt;td&gt;HTTP Source&lt;/td&gt;
&lt;td&gt;URL-embedded token&lt;/td&gt;
&lt;td&gt;1 MB&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Honeycomb&lt;/td&gt;
&lt;td&gt;Events Batch&lt;/td&gt;
&lt;td&gt;Header (&lt;code&gt;x-honeycomb-team&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;5 MB (impl: 100/batch)&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OTel Collector&lt;/td&gt;
&lt;td&gt;OTLP/HTTP&lt;/td&gt;
&lt;td&gt;Configurable&lt;/td&gt;
&lt;td&gt;Configurable&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Cost at 10 GB/month
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Vendor&lt;/th&gt;
&lt;th&gt;Vendor Cost&lt;/th&gt;
&lt;th&gt;AWS Infra&lt;/th&gt;
&lt;th&gt;Total&lt;/th&gt;
&lt;th&gt;Free Tier&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Sumo Logic&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~$5&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$5&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;500 MB/day&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Honeycomb&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~$5&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$5&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;20M events/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;New Relic&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~$5&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$5&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;100 GB/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Grafana Cloud&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~$5&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$5&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;50 GB logs/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Datadog&lt;/td&gt;
&lt;td&gt;~$15&lt;/td&gt;
&lt;td&gt;~$5&lt;/td&gt;
&lt;td&gt;~$20&lt;/td&gt;
&lt;td&gt;Logs: 14-day trial only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dynatrace&lt;/td&gt;
&lt;td&gt;~$25&lt;/td&gt;
&lt;td&gt;~$5&lt;/td&gt;
&lt;td&gt;~$30&lt;/td&gt;
&lt;td&gt;14-day trial&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Elastic Cloud&lt;/td&gt;
&lt;td&gt;~$95&lt;/td&gt;
&lt;td&gt;~$5&lt;/td&gt;
&lt;td&gt;~$100&lt;/td&gt;
&lt;td&gt;14-day trial&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Splunk Cloud&lt;/td&gt;
&lt;td&gt;~$150+&lt;/td&gt;
&lt;td&gt;~$5&lt;/td&gt;
&lt;td&gt;~$155+&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;AWS infrastructure cost is consistent across all vendors (~$5/month for Lambda + EventBridge + Secrets Manager). The vendor platform cost is the differentiator.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Data Residency
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Vendor&lt;/th&gt;
&lt;th&gt;Tokyo (JP)&lt;/th&gt;
&lt;th&gt;US&lt;/th&gt;
&lt;th&gt;EU&lt;/th&gt;
&lt;th&gt;Self-Hosted&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Sumo Logic&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Elastic&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dynatrace&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Yes&lt;/strong&gt; (region-specific)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Yes&lt;/strong&gt; (Managed)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Datadog&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;New Relic&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;No&lt;/strong&gt; (July 2026 planned)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Grafana Cloud&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Dedicated only&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No (Alloy self-hosted)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Splunk&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Honeycomb&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Governance note&lt;/strong&gt;: This table provides technical awareness for vendor selection. Grafana Cloud offers Tokyo region on Dedicated tier (not Free/Pro). Data residency alone does not constitute regulatory compliance. Evaluate your specific requirements (APPI, GDPR, FISC, ISMAP) with your compliance team. See the &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/retention-policy-matrix.md" rel="noopener noreferrer"&gt;Retention Policy Matrix&lt;/a&gt; for regulation-to-vendor mapping.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Unique Strengths
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Vendor&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Datadog&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Full-stack APM correlation, broadest feature set&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;New Relic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Generous free tier (100 GB), NRQL power&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Splunk&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Existing Splunk shops, SPL expertise, Firehose native&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Grafana Cloud&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;OTLP-native, LogQL, open-source ecosystem&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Elastic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Data sovereignty (self-hosted), ECS/SIEM, Kibana&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dynatrace&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Davis AI root cause analysis, APM correlation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Sumo Logic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;JP region data residency, generous free tier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Honeycomb&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;High-cardinality analysis (BubbleUp, Heatmaps)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OTel Collector&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Multi-backend, vendor portability, redaction&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note on Grafana ecosystem&lt;/strong&gt;: Grafana Alloy (formerly Grafana Agent) provides a Grafana-native alternative to the OpenTelemetry Collector with the same OTLP compatibility. Grafana Cloud's OTLP Gateway is available on all tiers including Free (US/EU regions only). For Tokyo data residency, Grafana Cloud Dedicated is required.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  7 Patterns That Survived All 9 Integrations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Polling &amp;gt; Event-Driven (for FSx for ONTAP S3 AP)
&lt;/h3&gt;

&lt;p&gt;FSx for ONTAP S3 Access Points don't support S3 Event Notifications. We evaluated CloudTrail data events as an alternative — however, CloudTrail data events for FSx for ONTAP S3 AP access are not consistently available across all configurations. The 5-minute EventBridge Scheduler poll is simpler, cheaper, and sufficient for audit log use cases where near-real-time (not real-time) delivery is acceptable.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Checkpoint-After-Delivery
&lt;/h3&gt;

&lt;p&gt;Never advance the checkpoint before confirming vendor delivery. This single rule prevents data loss across all failure modes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# CORRECT: checkpoint after confirmed delivery
&lt;/span&gt;&lt;span class="nf"&gt;ship_to_vendor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Raises on failure
&lt;/span&gt;&lt;span class="nf"&gt;update_checkpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# Only reached on success
&lt;/span&gt;
&lt;span class="c1"&gt;# WRONG: checkpoint before delivery
&lt;/span&gt;&lt;span class="nf"&gt;update_checkpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# What if ship_to_vendor fails next?
&lt;/span&gt;&lt;span class="nf"&gt;ship_to_vendor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Data loss if this fails
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Credential Caching with Reload-on-401
&lt;/h3&gt;

&lt;p&gt;Every vendor integration uses the same &lt;code&gt;SecretBackedAuth&lt;/code&gt; pattern: cache credentials at cold start, reload on TTL expiry or 401/403. This handles credential rotation without Lambda redeployment.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Reserved Concurrency = 1
&lt;/h3&gt;

&lt;p&gt;The audit poller must not run concurrently (checkpoint race condition). &lt;code&gt;ReservedConcurrentExecutions: 1&lt;/code&gt; is the simplest guard. For higher throughput, move to DynamoDB-based per-object locking.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. DLQ for Every Async Path
&lt;/h3&gt;

&lt;p&gt;Every template includes a KMS-encrypted DLQ. In 9 integrations, the DLQ caught: vendor outages, credential expiry, malformed files, and Lambda timeouts. Without it, these failures would be silent data loss.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Vendor-Specific Batch Limits Matter
&lt;/h3&gt;

&lt;p&gt;The biggest implementation difference across vendors is batch size handling:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Vendor&lt;/th&gt;
&lt;th&gt;Limit&lt;/th&gt;
&lt;th&gt;Lambda Behavior&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Honeycomb&lt;/td&gt;
&lt;td&gt;100 events&lt;/td&gt;
&lt;td&gt;Split into chunks of 100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dynatrace / Sumo Logic&lt;/td&gt;
&lt;td&gt;1 MB&lt;/td&gt;
&lt;td&gt;Measure payload size, split at boundary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Datadog&lt;/td&gt;
&lt;td&gt;5 MB / 1000 items&lt;/td&gt;
&lt;td&gt;Dual limit check&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Elastic&lt;/td&gt;
&lt;td&gt;~10 MB&lt;/td&gt;
&lt;td&gt;Rarely hit with audit logs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  7. OTLP as the Universal Format
&lt;/h3&gt;

&lt;p&gt;If you're unsure which vendor you'll use long-term, start with OTLP. The OTel Collector integration (Part 5) proved that a single Lambda producing OTLP can feed Datadog, Grafana, and Honeycomb simultaneously — with zero code changes when adding or removing backends.&lt;/p&gt;

&lt;p&gt;Beyond multi-backend delivery, the OTel Collector provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Enrichment&lt;/strong&gt;: Resource detection, Kubernetes attributes, custom metadata injection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sampling&lt;/strong&gt;: Tail-based sampling for high-volume environments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redaction&lt;/strong&gt;: PII field removal/masking before data leaves your account (see &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/integrations/otel-collector/docs/en/pii-redaction-cookbook.md" rel="noopener noreferrer"&gt;PII Redaction Cookbook&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Format conversion&lt;/strong&gt;: OTLP ↔ vendor-native format translation&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Verified version&lt;/strong&gt;: All OTel Collector configurations in this series were tested with &lt;strong&gt;OpenTelemetry Collector Contrib v0.152.0&lt;/strong&gt;. OTel Collector has frequent releases with potential breaking changes — pin your version in production and test before upgrading.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What We'd Do Differently
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Start with OTel Collector for Multi-Vendor Evaluation
&lt;/h3&gt;

&lt;p&gt;If evaluating multiple vendors, deploy the OTel Collector path first. It lets you send the same data to 2-3 vendors simultaneously for comparison, without deploying separate Lambda stacks per vendor.&lt;/p&gt;

&lt;h3&gt;
  
  
  Define SLOs Before Building
&lt;/h3&gt;

&lt;p&gt;We defined Pipeline SLOs after building all 9 integrations. In hindsight, defining "&amp;lt; 10 min delivery latency" and "&amp;lt; 0.01% data loss" upfront would have guided design decisions earlier (e.g., checkpoint granularity, retry policy).&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Classification First
&lt;/h3&gt;

&lt;p&gt;Audit logs contain PII (usernames, file paths). We documented this in the Data Classification Guide after implementation. For regulated environments, classify fields before choosing a vendor — it may eliminate options that don't support your data residency requirements.&lt;/p&gt;

&lt;h2&gt;
  
  
  Production Readiness Framework
&lt;/h2&gt;

&lt;p&gt;After 9 integrations, we formalized a 4-level production readiness model:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Level&lt;/th&gt;
&lt;th&gt;What&lt;/th&gt;
&lt;th&gt;Go/No-Go to Next&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Level 1&lt;/strong&gt;: Quickstart&lt;/td&gt;
&lt;td&gt;Audit poller + DLQ&lt;/td&gt;
&lt;td&gt;Logs arrive, checkpoint advances, DLQ empty 24h&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Level 2&lt;/strong&gt;: Operational PoC&lt;/td&gt;
&lt;td&gt;+ Dashboard + alerts&lt;/td&gt;
&lt;td&gt;SLOs met 7 days, security review done&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Level 3&lt;/strong&gt;: Production&lt;/td&gt;
&lt;td&gt;+ DynamoDB ledger + poison-pill&lt;/td&gt;
&lt;td&gt;SLOs met 30 days, compliance pack&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Level 4&lt;/strong&gt;: Enterprise&lt;/td&gt;
&lt;td&gt;+ OTel Collector + redaction&lt;/td&gt;
&lt;td&gt;Multi-backend, PII redaction, DR tested&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Most PoCs should target Level 2. Production deployments need Level 3. Enterprise pipelines with compliance requirements need Level 4.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recommended transition timeline&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Level 1 → Level 2: ~1 week (add dashboards, define SLOs, validate 7-day stability)&lt;/li&gt;
&lt;li&gt;Level 2 → Level 3: ~2-4 weeks (deploy DynamoDB ledger, implement poison-pill handling, complete security review)&lt;/li&gt;
&lt;li&gt;Level 3 → Level 4: ~1-2 months (deploy OTel Collector, implement PII redaction, test DR failover, complete compliance evidence pack)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Full criteria: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/pipeline-slo.md" rel="noopener noreferrer"&gt;Pipeline SLO Definitions&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Vendor Selection Decision Tree
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Start
  |
  +-- Need JP data residency?
  |   +-- Yes -&amp;gt; Sumo Logic (JP) or Elastic (self-hosted in Tokyo VPC)
  |   +-- No  |
  |           v
  +-- Need self-hosted (air-gapped)?
  |   +-- Yes -&amp;gt; Elastic or Splunk
  |   +-- No  |
  |           v
  +-- Already have an observability platform?
  |   +-- Yes -&amp;gt; Use that vendor (all 9 are supported)
  |   +-- No  |
  |           v
  +-- Budget constraint (free tier needed)?
  |   +-- Yes -&amp;gt; Sumo Logic (500 MB/day) or Honeycomb (20M events) or New Relic (100 GB)
  |   +-- No  |
  |           v
  +-- Need AI-powered root cause analysis?
  |   +-- Yes -&amp;gt; Dynatrace (Davis AI)
  |   +-- No  |
  |           v
  +-- Need high-cardinality analysis?
  |   +-- Yes -&amp;gt; Honeycomb (BubbleUp)
  |   +-- No  |
  |           v
  +-- Need multi-backend / vendor portability?
  |   +-- Yes -&amp;gt; OTel Collector
  |   +-- No  |
  |           v
  +-- Default -&amp;gt; Datadog (broadest) or Grafana (OTLP-native, open ecosystem)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The FSx for ONTAP S3 AP Constraint That Shaped Everything
&lt;/h2&gt;

&lt;p&gt;The single most impactful technical constraint: &lt;strong&gt;FSx for ONTAP S3 Access Points do not support S3 Event Notifications&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This one fact drove:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;EventBridge Scheduler polling pattern (not event-driven)&lt;/li&gt;
&lt;li&gt;SSM Parameter Store checkpointing (track what's been processed)&lt;/li&gt;
&lt;li&gt;Reserved concurrency = 1 (prevent checkpoint races)&lt;/li&gt;
&lt;li&gt;Safety threshold (stop before Lambda timeout)&lt;/li&gt;
&lt;li&gt;MAX_KEYS_PER_RUN (bound processing per invocation)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If FSx for ONTAP S3 APs add event notification support in the future, the architecture could simplify significantly. As of May 2026, this feature is not supported, and the polling pattern is battle-tested across 9 vendors.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cost Reality: EC2 vs Serverless
&lt;/h2&gt;

&lt;p&gt;The original motivation: replace the &lt;a href="https://aws.amazon.com/jp/blogs/news/auditing-user-and-administrative-actions-on-amazon-fsx-for-netapp-ontap-using-splunk/" rel="noopener noreferrer"&gt;EC2-based Splunk pattern&lt;/a&gt; (2x EC2 instances) with serverless.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;EC2 Pattern&lt;/th&gt;
&lt;th&gt;Serverless Pattern&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Monthly AWS cost&lt;/td&gt;
&lt;td&gt;~$66&lt;/td&gt;
&lt;td&gt;~$5-8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OS patching&lt;/td&gt;
&lt;td&gt;Required&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scaling&lt;/td&gt;
&lt;td&gt;Manual&lt;/td&gt;
&lt;td&gt;Automatic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vendor support&lt;/td&gt;
&lt;td&gt;Splunk only&lt;/td&gt;
&lt;td&gt;9 vendors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deploy time&lt;/td&gt;
&lt;td&gt;Hours&lt;/td&gt;
&lt;td&gt;30 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Recovery from failure&lt;/td&gt;
&lt;td&gt;Manual restart&lt;/td&gt;
&lt;td&gt;Automatic (DLQ + retry)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;90% cost reduction&lt;/strong&gt; with zero operational burden. The serverless pattern wins on every dimension except one: real-time latency (EC2 syslog can be sub-second; our poller is 5-minute intervals). For audit logs, 5 minutes is acceptable. For real-time needs, use the FPolicy path (&amp;lt; 30 seconds).&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;This series covered the foundation. The project continues with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Phase 3 (delivered)&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/multi-account-deployment.md" rel="noopener noreferrer"&gt;Multi-account deployment&lt;/a&gt; (AWS Organizations + StackSets)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phase 3 (delivered)&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/shared/python/object_ledger.py" rel="noopener noreferrer"&gt;DynamoDB object ledger&lt;/a&gt; for per-object processing state&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phase 3 (delivered)&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/shared/templates/sqs-buffering.yaml" rel="noopener noreferrer"&gt;SQS buffering pattern&lt;/a&gt; for backpressure handling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phase 3 (delivered)&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/cross-region-replication.md" rel="noopener noreferrer"&gt;Cross-region DR&lt;/a&gt; with Active-Passive failover&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phase 3 (delivered)&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/integrations/otel-collector/docs/en/pii-redaction-cookbook.md" rel="noopener noreferrer"&gt;OTel Collector PII redaction cookbook&lt;/a&gt; (7 recipes for APPI/GDPR)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phase 4&lt;/strong&gt;: Terraform module equivalents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phase 4&lt;/strong&gt;: CDK construct library&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;See the full &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/ROADMAP.md" rel="noopener noreferrer"&gt;ROADMAP&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/fsxn-observability-integrations&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pipeline SLO&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/pipeline-slo.md" rel="noopener noreferrer"&gt;docs/en/pipeline-slo.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Classification&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/data-classification.md" rel="noopener noreferrer"&gt;docs/en/data-classification.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;S3 AP Throughput Benchmark&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/s3ap-throughput-benchmark.md" rel="noopener noreferrer"&gt;docs/en/s3ap-throughput-benchmark.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vendor Comparison&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/vendor-comparison.md" rel="noopener noreferrer"&gt;docs/en/vendor-comparison.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Partner FAQ&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/partner-faq.md" rel="noopener noreferrer"&gt;docs/en/partner-faq.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Workshop Guide&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/workshop-hands-on-half-day.md" rel="noopener noreferrer"&gt;docs/en/workshop-hands-on-half-day.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliance Evidence Pack&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/compliance-evidence-pack.md" rel="noopener noreferrer"&gt;docs/en/compliance-evidence-pack.md&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Series Navigation
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Part 1&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/why-your-fsx-for-ontap-audit-logs-deserve-better-than-ec2-kod"&gt;Why Your FSx for ONTAP Audit Logs Deserve Better Than EC2&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 2&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/shipping-fsx-for-ontap-logs-to-datadog-the-serverless-way-n9c"&gt;Shipping FSx for ONTAP Logs to Datadog — The Serverless Way&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 3&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/event-driven-ransomware-detection-with-ontap-arp-datadog-4cda"&gt;Event-Driven Ransomware Detection with ONTAP ARP + Datadog&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 4&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/fpolicy-file-activity-pipeline-ontap-to-datadog-via-ecs-fargate-2ing"&gt;FPolicy File Activity Pipeline — ONTAP to Datadog via ECS Fargate&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 5&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/escape-vendor-lock-in-multi-backend-log-delivery-with-otel-collector-for-fsx-for-ontap-2inb"&gt;Escape Vendor Lock-in: Multi-Backend Log Delivery with OTel Collector for FSx for ONTAP.&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 6&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/direct-to-grafana-shipping-fsx-for-ontap-logs-to-grafana-cloud-loki-via-otlp-gateway-33hk"&gt;Direct-to-Grafana: Shipping FSx for ONTAP Logs to Grafana Cloud Loki via OTLP Gateway&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 7&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/ship-fsx-for-ontap-audit-logs-to-new-relic-via-serverless-lambda-pipeline-3g87"&gt;Ship FSx for ONTAP Audit Logs to New Relic via Serverless Lambda Pipeline&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 8&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/ec2-to-serverless-modernizing-fsx-for-ontap-splunk-integration-e8l"&gt;EC2 to Serverless: Modernizing FSx for ONTAP Splunk Integration&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 9&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/data-sovereignty-fsx-for-ontap-logs-in-your-vpc-with-elastic-1mog"&gt;Data Sovereignty: FSx for ONTAP Logs in Your VPC with Elastic&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 10&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/high-cardinality-file-access-analysis-with-honeycomb-otel-1962"&gt;High-Cardinality File Access Analysis with Honeycomb + OTel&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 11&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/ai-powered-root-cause-correlating-file-access-with-apm-via-dynatrace-4ffl"&gt;AI-Powered Root Cause: Correlating File Access with APM via Dynatrace&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 12&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/fsx-for-ontap-audit-logs-with-data-residency-in-your-region-46e6"&gt;FSx for ONTAP Audit Logs with Data Residency in your region with Sumo Logic&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 13&lt;/strong&gt;: 9 Vendors, One Architecture (this post)&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Thank you for following this series. If you've deployed any of these integrations, I'd love to hear about your experience — drop a comment or open a GitHub issue.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/fsxn-observability-integrations&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>observability</category>
      <category>amazonfsxfornetappontap</category>
      <category>serverless</category>
    </item>
    <item>
      <title>FSx for ONTAP Audit Logs with Data Residency in your region with Sumo Logic</title>
      <dc:creator>Yoshiki Fujiwara(藤原 善基)@AWS Community Builder</dc:creator>
      <pubDate>Sun, 31 May 2026 00:37:19 +0000</pubDate>
      <link>https://dev.to/aws-builders/fsx-for-ontap-audit-logs-with-data-residency-in-your-region-46e6</link>
      <guid>https://dev.to/aws-builders/fsx-for-ontap-audit-logs-with-data-residency-in-your-region-46e6</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;We built a serverless Lambda pipeline that ships FSx for ONTAP audit logs to Sumo Logic's &lt;strong&gt;JP (Tokyo) region&lt;/strong&gt; deployment. For Japanese enterprises with data residency requirements under APPI (Act on the Protection of Personal Information), this means audit logs never leave Japan.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FSx for ONTAP → S3 Access Point → EventBridge Scheduler → Lambda → Sumo Logic HTTP Source (JP)
                                                                         │
                                                                         ▼
                                                              ┌───────────────────┐
                                                              │ Sumo Logic JP     │
                                                              │ (Tokyo)           │
                                                              │                   │
                                                              │ • 500 MB/day FREE │
                                                              │ • Data stays in   │
                                                              │   Japan           │
                                                              │ • 7-day retention │
                                                              │   (free tier)     │
                                                              └───────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key advantages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;500 MB/day free tier&lt;/strong&gt; (~15 GB/month) — covers most FSx for ONTAP deployments at zero vendor cost&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JP region deployment&lt;/strong&gt; — data residency in Tokyo&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simplest auth model&lt;/strong&gt; — URL-embedded token, no header management&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;30-minute end-to-end&lt;/strong&gt; — HTTP Source URL is the only credential needed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Verified on Sumo Logic JP region. Logs searchable via &lt;code&gt;_sourceCategory=aws/fsxn/audit&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This is Part 12 of the &lt;a href="https://dev.to/aws-builders/why-your-fsx-for-ontap-audit-logs-deserve-better-than-ec2-kod"&gt;Serverless Observability for FSx for ONTAP&lt;/a&gt; series.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Sumo Logic for Japanese Enterprises?
&lt;/h2&gt;

&lt;p&gt;For organizations operating under Japanese data protection regulations, the choice of observability platform often comes down to one question: &lt;strong&gt;where does the data physically reside?&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;th&gt;Sumo Logic JP&lt;/th&gt;
&lt;th&gt;Other Options&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Data residency in Japan&lt;/td&gt;
&lt;td&gt;✅ Tokyo deployment&lt;/td&gt;
&lt;td&gt;Varies by vendor&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;APPI compliance consideration&lt;/td&gt;
&lt;td&gt;✅ Data stays in JP&lt;/td&gt;
&lt;td&gt;May require cross-border assessment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Free tier for validation&lt;/td&gt;
&lt;td&gt;✅ 500 MB/day&lt;/td&gt;
&lt;td&gt;Most offer 14-day trials only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No agent installation&lt;/td&gt;
&lt;td&gt;✅ HTTP Source (agentless)&lt;/td&gt;
&lt;td&gt;Some require collectors&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Sumo Logic's JP deployment (&lt;code&gt;service.jp.sumologic.com&lt;/code&gt;) processes and stores all data within Japan, making it a straightforward choice for organizations that need to demonstrate data residency compliance.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Compliance note&lt;/strong&gt;: This integration provides a technical path for data residency. Evaluate your specific regulatory requirements with your compliance team — data residency alone does not constitute full regulatory compliance.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────┐
│ Event Sources                                           │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  EventBridge Scheduler                                  │
│  rate(5 minutes) ──→ Lambda                             │
│                       │ lists new files via             │
│                       │ S3 Access Point                 │
│                       │ (checkpoint in SSM)             │
│                       ▼                                 │
│           Sumo Logic HTTP Source                        │
│           (URL-embedded auth)                           │
│                       │                                 │
│  EMS Webhook          │                                 │
│  ──→ API GW ──→ Lambda ─────────────┤                   │
│     (ems_handler)                   │                   │
│                                     ▼                   │
│  FPolicy                       Sumo Logic               │
│  ──→ ECS Fargate ──→ SQS      (Log Search,              │
│  ──→ Bridge Lambda              Dashboards,             │
│  ──→ EventBridge                Alerts)                 │
│  ──→ Lambda (fpolicy_handler) ──────────────────────────┤
└─────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Simplest Auth Model
&lt;/h2&gt;

&lt;p&gt;Sumo Logic's HTTP Source embeds the authentication token directly in the URL. No separate auth headers, no API key management, no token rotation complexity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;https://collectors.jp.sumologic.com/receiver/v1/http/&amp;lt;TOKEN&amp;gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Lambda just POSTs JSON to this URL. That's it. The metadata (source category, name, host) is sent via &lt;code&gt;X-Sumo-*&lt;/code&gt; headers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;X-Sumo-Category&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;aws/fsxn/audit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;X-Sumo-Name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fsxn-ontap-audit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;X-Sumo-Host&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fsxn-ontap&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Security&lt;/strong&gt;: The HTTP Source URL contains the auth token. Store it in Secrets Manager, never log it, and never expose it in environment variables or source control.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Quick Start (30 Minutes)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Create Sumo Logic Account (JP Region)
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Sign up at &lt;a href="https://service.jp.sumologic.com/" rel="noopener noreferrer"&gt;service.jp.sumologic.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;APAC: Tokyo (JP)&lt;/strong&gt; as your deployment region&lt;/li&gt;
&lt;li&gt;Free tier: 500 MB/day, 7-day retention, full feature access for 30 days&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  2. Create Hosted Collector + HTTP Source
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;strong&gt;Manage Data&lt;/strong&gt; → &lt;strong&gt;Collection&lt;/strong&gt; → &lt;strong&gt;Add Collector&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;Hosted Collector&lt;/strong&gt; (name: &lt;code&gt;fsxn-audit-collector&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Add &lt;strong&gt;HTTP Logs &amp;amp; Metrics Source&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;Name: &lt;code&gt;fsxn-ontap-audit&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Source Category: &lt;code&gt;aws/fsxn/audit&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Timestamp Parsing: Auto-detect&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Copy the generated HTTP Source URL&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  3. Store HTTP Source URL
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws secretsmanager create-secret &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"sumo-logic/fsxn-http-source"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--secret-string&lt;/span&gt; &lt;span class="s1"&gt;'{"url":"https://collectors.jp.sumologic.com/receiver/v1/http/&amp;lt;YOUR_TOKEN&amp;gt;"}'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--region&lt;/span&gt; ap-northeast-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Deploy CloudFormation Stack
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws cloudformation deploy &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--template-file&lt;/span&gt; integrations/sumo-logic/template.yaml &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--stack-name&lt;/span&gt; fsxn-sumo-logic-integration &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--parameter-overrides&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;S3AccessPointArn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;arn:aws:s3:ap-northeast-1:123456789012:accesspoint/fsxn-audit-ap &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;SumoLogicHttpSourceSecretArn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;arn:aws:secretsmanager:ap-northeast-1:123456789012:secret:sumo-logic/fsxn-http-source-XXXXXX &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nv"&gt;S3BucketName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;my-fsxn-audit-bucket &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--capabilities&lt;/span&gt; CAPABILITY_NAMED_IAM &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--region&lt;/span&gt; ap-northeast-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Verify in Sumo Logic
&lt;/h3&gt;

&lt;p&gt;Run this search query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;_sourceCategory=aws/fsxn/audit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: First-time indexing on a new JP region account takes ~10 minutes. Subsequent ingestion is near-instant.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Query Examples
&lt;/h2&gt;

&lt;p&gt;Sumo Logic uses a pipe-based query language:&lt;/p&gt;

&lt;h3&gt;
  
  
  Basic Investigation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- All failed access attempts with user and path&lt;/span&gt;
&lt;span class="n"&gt;_sourceCategory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;fsxn&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;audit&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt; &lt;span class="nv"&gt;"Result"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"UserName"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"ObjectName"&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="k"&gt;Result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;"Failure"&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;UserName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ObjectName&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;sort&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;_count&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;

&lt;span class="c1"&gt;-- Top operations by volume&lt;/span&gt;
&lt;span class="n"&gt;_sourceCategory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;fsxn&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;audit&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt; &lt;span class="nv"&gt;"Operation"&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="k"&gt;Operation&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;sort&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;_count&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;

&lt;span class="c1"&gt;-- Access pattern timeline (5-minute buckets)&lt;/span&gt;
&lt;span class="n"&gt;_sourceCategory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;fsxn&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;audit&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt; &lt;span class="nv"&gt;"Operation"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"UserName"&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;timeslice&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;_timeslice&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;Operation&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Security Investigation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- After-hours access (outside 9am-6pm JST)&lt;/span&gt;
&lt;span class="n"&gt;_sourceCategory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;fsxn&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;audit&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt; &lt;span class="nv"&gt;"UserName"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"Operation"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"ObjectName"&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;_messagetime&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;today&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;_messagetime&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;today&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;UserName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;Operation&lt;/span&gt;

&lt;span class="c1"&gt;-- Bulk delete detection (potential data exfiltration)&lt;/span&gt;
&lt;span class="n"&gt;_sourceCategory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;fsxn&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;audit&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt; &lt;span class="nv"&gt;"Operation"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"UserName"&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="k"&gt;Operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;"Delete"&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;timeslice&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;_timeslice&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;UserName&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;_count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;

&lt;span class="c1"&gt;-- Sensitive path access&lt;/span&gt;
&lt;span class="n"&gt;_sourceCategory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;fsxn&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;audit&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt; &lt;span class="nv"&gt;"ObjectName"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"UserName"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"Result"&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;ObjectName&lt;/span&gt; &lt;span class="n"&gt;matches&lt;/span&gt; &lt;span class="nv"&gt;"*confidential*"&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;ObjectName&lt;/span&gt; &lt;span class="n"&gt;matches&lt;/span&gt; &lt;span class="nv"&gt;"*restricted*"&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;UserName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;Result&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Operational Monitoring
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Log volume trend (capacity planning)&lt;/span&gt;
&lt;span class="n"&gt;_sourceCategory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;fsxn&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;audit&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;timeslice&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;_timeslice&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;outlier&lt;/span&gt; &lt;span class="n"&gt;_count&lt;/span&gt;

&lt;span class="c1"&gt;-- SVM activity comparison&lt;/span&gt;
&lt;span class="n"&gt;_sourceCategory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;fsxn&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;audit&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt; &lt;span class="nv"&gt;"SVMName"&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;timeslice&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;_timeslice&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SVMName&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Sumo Logic Metadata Headers
&lt;/h2&gt;

&lt;p&gt;Lambda sends structured metadata with each request:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Header&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;X-Sumo-Category&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;aws/fsxn/audit&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Primary search dimension&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;X-Sumo-Name&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;fsxn-ontap-audit&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Source identification&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;X-Sumo-Host&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;fsxn-ontap&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Host-level grouping&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These headers enable efficient searching without parsing the log body:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Search by metadata (fast, no JSON parsing)&lt;/span&gt;
&lt;span class="n"&gt;_sourceCategory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;fsxn&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;audit&lt;/span&gt; &lt;span class="n"&gt;_sourceHost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;fsxn&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;ontap&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Cost Analysis: The Free Tier Advantage
&lt;/h2&gt;

&lt;p&gt;Sumo Logic's free tier is the most generous for small-to-medium FSx for ONTAP deployments:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Monthly Log Volume&lt;/th&gt;
&lt;th&gt;Daily Average&lt;/th&gt;
&lt;th&gt;Sumo Logic Tier&lt;/th&gt;
&lt;th&gt;Monthly Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1 GB&lt;/td&gt;
&lt;td&gt;~33 MB/day&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Free&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5 GB&lt;/td&gt;
&lt;td&gt;~167 MB/day&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Free&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10 GB&lt;/td&gt;
&lt;td&gt;~333 MB/day&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Free&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15 GB&lt;/td&gt;
&lt;td&gt;~500 MB/day&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Free&lt;/strong&gt; (at limit)&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;30 GB&lt;/td&gt;
&lt;td&gt;~1 GB/day&lt;/td&gt;
&lt;td&gt;Professional&lt;/td&gt;
&lt;td&gt;~$108/month&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Monthly Cost (10 GB/month)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Lambda (5-min polling)&lt;/td&gt;
&lt;td&gt;~$3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EventBridge Scheduler&lt;/td&gt;
&lt;td&gt;~$1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Secrets Manager&lt;/td&gt;
&lt;td&gt;~$1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Sumo Logic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;$0&lt;/strong&gt; (within free tier)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$5&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;For most FSx for ONTAP deployments generating &amp;lt; 15 GB/month of audit logs, the total cost is just the AWS infrastructure (~$5/month). The observability platform itself is free.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Sumo Logic Deployment Regions
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Region&lt;/th&gt;
&lt;th&gt;URL&lt;/th&gt;
&lt;th&gt;Data Residency&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;JP (Tokyo)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;service.jp.sumologic.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Japan&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US1&lt;/td&gt;
&lt;td&gt;&lt;code&gt;service.sumologic.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US2&lt;/td&gt;
&lt;td&gt;&lt;code&gt;service.us2.sumologic.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;US&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EU (Ireland)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;service.eu.sumologic.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;EU&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AU (Sydney)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;service.au.sumologic.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Australia&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IN (Mumbai)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;service.in.sumologic.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;India&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CA (Montreal)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;service.ca.sumologic.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Canada&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Select the deployment matching your data residency requirements at account creation time. &lt;strong&gt;Region cannot be changed after account creation.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Gotchas &amp;amp; Lessons Learned
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Discovery&lt;/th&gt;
&lt;th&gt;Impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;JP region new accounts have ~10 minute initial indexing lag&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;First search returns empty; wait and retry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Search queries use &lt;code&gt;_sourceCategory&lt;/code&gt; (underscore prefix)&lt;/td&gt;
&lt;td&gt;Common mistake: &lt;code&gt;sourceCategory&lt;/code&gt; without underscore returns nothing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;HTTP Source URL contains embedded auth token&lt;/td&gt;
&lt;td&gt;Rotate by creating a new HTTP Source, updating the Secrets Manager secret, then deleting the old Source&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Free tier has 7-day retention only&lt;/td&gt;
&lt;td&gt;Sufficient for real-time monitoring; archive to S3 for long-term&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;No built-in Firehose support&lt;/td&gt;
&lt;td&gt;Lambda direct delivery only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Max 1MB per request (newline-delimited JSON)&lt;/td&gt;
&lt;td&gt;Lambda batches accordingly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Region is permanent — cannot migrate data between deployments&lt;/td&gt;
&lt;td&gt;Choose JP at signup for data residency&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Free Tier vs Professional
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Free (500 MB/day)&lt;/th&gt;
&lt;th&gt;Professional&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Daily ingestion&lt;/td&gt;
&lt;td&gt;500 MB&lt;/td&gt;
&lt;td&gt;1+ GB (configurable)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Retention&lt;/td&gt;
&lt;td&gt;7 days&lt;/td&gt;
&lt;td&gt;30-365 days&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Users&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Unlimited&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Alerts&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dashboards&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API access&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data residency&lt;/td&gt;
&lt;td&gt;✅ (region-specific)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Tip: Enable Field Extraction Rules (FER)&lt;/strong&gt;: After first ingestion, create a FER for &lt;code&gt;_sourceCategory=aws/fsxn/audit&lt;/code&gt; with "Auto-parse JSON" enabled. This automatically extracts all JSON fields (UserName, Operation, ObjectName, etc.) as searchable metadata — no manual field definition needed. Go to &lt;strong&gt;Manage Data&lt;/strong&gt; → &lt;strong&gt;Logs&lt;/strong&gt; → &lt;strong&gt;Field Extraction Rules&lt;/strong&gt; → &lt;strong&gt;Add Rule&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For most PoC and small production deployments, the free tier is sufficient. Upgrade when you need longer retention or higher volume.&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Residency &amp;amp; Classification
&lt;/h2&gt;

&lt;p&gt;Sumo Logic JP deployment keeps all data in Japan, but audit logs still contain PII fields:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;Classification&lt;/th&gt;
&lt;th&gt;Sumo Logic Handling&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;UserName&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;PII&lt;/td&gt;
&lt;td&gt;Use RBAC to restrict search access; consider field extraction rules for masking&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ObjectName&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Sensitive&lt;/td&gt;
&lt;td&gt;Path may reveal business context; restrict dashboard sharing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ClientIP&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Internal&lt;/td&gt;
&lt;td&gt;Generally acceptable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For APPI compliance considerations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data stays in JP region (no cross-border transfer for log data)&lt;/li&gt;
&lt;li&gt;Configure retention policies matching your regulatory requirements (7 days free tier vs 30-365 days paid)&lt;/li&gt;
&lt;li&gt;Implement Sumo Logic RBAC to restrict PII field access by role&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;See the &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/data-classification.md" rel="noopener noreferrer"&gt;Data Classification Guide&lt;/a&gt; for full field classification and regulatory mapping.&lt;/p&gt;

&lt;h2&gt;
  
  
  Production Readiness
&lt;/h2&gt;

&lt;p&gt;This integration follows the project's &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations#production-readiness-levels--%E6%9C%AC%E7%95%AA%E6%BA%96%E5%82%99%E3%83%AC%E3%83%99%E3%83%AB" rel="noopener noreferrer"&gt;Production Readiness Levels&lt;/a&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Level&lt;/th&gt;
&lt;th&gt;What You Get&lt;/th&gt;
&lt;th&gt;Go/No-Go to Next&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Level 1 (this Quick Start)&lt;/td&gt;
&lt;td&gt;Audit poller + DLQ&lt;/td&gt;
&lt;td&gt;Logs arrive, checkpoint advances, DLQ empty 24h&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Level 2&lt;/td&gt;
&lt;td&gt;+ Sumo dashboards + alerts&lt;/td&gt;
&lt;td&gt;SLOs met 7 days, security review done&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Level 3&lt;/td&gt;
&lt;td&gt;+ DynamoDB ledger + poison-pill&lt;/td&gt;
&lt;td&gt;SLOs met 30 days, compliance pack&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Level 4&lt;/td&gt;
&lt;td&gt;+ OTel Collector + redaction&lt;/td&gt;
&lt;td&gt;Multi-backend, PII redaction, DR tested&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Full criteria: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/pipeline-slo.md" rel="noopener noreferrer"&gt;Pipeline SLO Definitions&lt;/a&gt; | &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/runbooks/dlq-replay.md" rel="noopener noreferrer"&gt;DLQ Replay Runbook&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Enterprise scale&lt;/strong&gt;: For multi-account deployments across your Organization, see the &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/multi-account-deployment.md" rel="noopener noreferrer"&gt;StackSets deployment guide&lt;/a&gt;. For compliance evidence collection (ISMAP, FISC, SOC 2), see the &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/compliance-evidence-pack.md" rel="noopener noreferrer"&gt;Compliance Evidence Pack&lt;/a&gt;. For regulation-to-vendor retention mapping, see the &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/blob/main/docs/en/retention-policy-matrix.md" rel="noopener noreferrer"&gt;Retention Policy Matrix&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  CloudFormation Templates
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Template&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Key Parameters&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;template.yaml&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;FSx audit log poller&lt;/td&gt;
&lt;td&gt;S3AccessPointArn, SumoLogicHttpSourceSecretArn&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;template-ems.yaml&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;EMS webhook handler&lt;/td&gt;
&lt;td&gt;SumoLogicHttpSourceSecretArn&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;template-fpolicy.yaml&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;FPolicy EventBridge handler&lt;/td&gt;
&lt;td&gt;SumoLogicHttpSourceSecretArn, EventBusName&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations/tree/main/integrations/sumo-logic" rel="noopener noreferrer"&gt;integrations/sumo-logic/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sumo Logic JP&lt;/strong&gt;: &lt;a href="https://service.jp.sumologic.com/" rel="noopener noreferrer"&gt;service.jp.sumologic.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP Source Docs&lt;/strong&gt;: &lt;a href="https://help.sumologic.com/docs/send-data/hosted-collectors/http-source/logs-metrics/" rel="noopener noreferrer"&gt;Configure HTTP Source&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search Query Language&lt;/strong&gt;: &lt;a href="https://help.sumologic.com/docs/search/search-query-language/" rel="noopener noreferrer"&gt;Search Operators&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Series GitHub&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/fsxn-observability-integrations&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Series Navigation
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Part 1&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/why-your-fsx-for-ontap-audit-logs-deserve-better-than-ec2-kod"&gt;Why Your FSx for ONTAP Logs Deserve Better&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 2&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/shipping-fsx-for-ontap-logs-to-datadog-the-serverless-way-n9c"&gt;Shipping FSx for ONTAP Logs to Datadog — The Serverless Way&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 3&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/event-driven-ransomware-detection-with-ontap-arp-datadog-4cda"&gt;Event-Driven Ransomware Detection with ONTAP ARP + Datadog&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 4&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/fpolicy-file-activity-pipeline-ontap-to-datadog-via-ecs-fargate-2ing"&gt;FPolicy File Activity Pipeline — ONTAP to Datadog via ECS Fargate&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 5&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/escape-vendor-lock-in-multi-backend-log-delivery-with-otel-collector-for-fsx-for-ontap-2inb"&gt;Escape Vendor Lock-in: Multi-Backend Log Delivery with OTel Collector for FSx for ONTAP.&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 6&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/direct-to-grafana-shipping-fsx-for-ontap-logs-to-grafana-cloud-loki-via-otlp-gateway-33hk"&gt;Direct-to-Grafana: Shipping FSx for ONTAP Logs to Grafana Cloud Loki via OTLP Gateway&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 7&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/ship-fsx-for-ontap-audit-logs-to-new-relic-via-serverless-lambda-pipeline-3g87"&gt;Ship FSx for ONTAP Audit Logs to New Relic via Serverless Lambda Pipeline&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 8&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/ec2-to-serverless-modernizing-fsx-for-ontap-splunk-integration-e8l"&gt;EC2 to Serverless: Modernizing FSx for ONTAP Splunk Integration&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 9&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/data-sovereignty-fsx-for-ontap-logs-in-your-vpc-with-elastic-1mog"&gt;Data Sovereignty: FSx for ONTAP Logs in Your VPC with Elastic&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 10&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/high-cardinality-file-access-analysis-with-honeycomb-otel-1962"&gt;High-Cardinality File Access Analysis with Honeycomb + OTel&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 11&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/ai-powered-root-cause-correlating-file-access-with-apm-via-dynatrace-4ffl"&gt;AI-Powered Root Cause: Correlating File Access with APM via Dynatrace&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 12&lt;/strong&gt;: FSx for ONTAP Audit Logs with Data Residency in your region with Sumo Logic (this post)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part 13&lt;/strong&gt;: &lt;a href="https://dev.to/aws-builders/9-services-one-architecture-what-we-learned-shipping-fsx-for-ontap-logs-to-every-major-19ig"&gt;9 Vendors, One Architecture&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Questions about the Sumo Logic JP integration or data residency? Drop a comment below.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/Yoshiki0705/fsxn-observability-integrations" rel="noopener noreferrer"&gt;github.com/Yoshiki0705/fsxn-observability-integrations&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>sumologic</category>
      <category>observability</category>
      <category>amazonfsxfornetappontap</category>
    </item>
  </channel>
</rss>
