TL;DR
Phase 15 expands the pattern library from 17 to 28 industry-specific use cases, providing reference implementations across major AWS Industry verticals where FSx for ONTAP file processing is relevant. Each new pattern includes a CloudFormation template, Step Functions workflow, Python Lambda functions, 8-language documentation, and property-based tests. Combined with 6 FlexCache/FlexClone patterns and 1 SAP/ERP pattern, the repository now offers 35 deployable reference patterns for enterprise file processing on FSx for ONTAP.
The SAP/ERP pattern focuses on controlled document/report processing around ERP-adjacent file exports (IDoc, spool), not direct transactional SAP data manipulation.
Important: These are reference implementations with production-readiness guidance, not fully certified production systems. Customers must validate against their own regulatory, security, and operational requirements before production use.
For S3 standard bucket users: This library is not a replacement for S3 data lake patterns. It is a file-data integration pattern for customers who want to process FSx ONTAP-resident data through S3-compatible APIs while preserving NAS access paths. See docs/s3-bucket-user-guide.md for a detailed comparison.
Serverless boundary: Compute (Lambda), orchestration (Step Functions), eventing (EventBridge), and AI services (Bedrock, Textract, Rekognition) are serverless/managed. FSx for ONTAP is a fully managed file system with provisioned capacity and operational considerations; it is not scale-to-zero storage. This is a serverless processing pattern over existing enterprise file data, not a pure serverless storage pattern.
When NOT to use this: If your workload is already object-native, does not require NFS/SMB coexistence, and can use standard S3 data lake patterns — prefer S3-native serverless architecture.
Repository: github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns
Why 28 Use Cases?
AWS organizes customers into 22 industry verticals. When we mapped our existing 17 patterns against these verticals, several gaps stood out:
- Telecommunications — No CDR/network log processing pattern
- Advertising & Marketing — No creative asset management
- Travel & Hospitality — No document processing for reservations
- Agriculture & Food — No traceability or crop monitoring
- Sustainability/ESG — No ESG metrics extraction
- Nonprofit — No grant management automation
- Utilities — No drone/SCADA-based asset inspection
- Real Estate — No portfolio analysis
- HR — No resume screening (with PII protection)
- Chemicals — No SDS/lab notebook processing
- Transportation (railway) — No deterioration detection
Phase 15 fills all of these, covering 19 of 22 AWS Industry verticals (remaining 3 — Consumer Packaged Goods, Mining, Software/Internet — have limited file-processing relevance for this pattern type). Combined with 11 Japan-market focus areas (all covered), the repository addresses the vast majority of enterprise file processing scenarios.
The 11 New Patterns
P0: Foundation Patterns
| UC | Industry | Key AWS Services | Differentiator |
|---|---|---|---|
| UC18 | Telecom | Athena, Bedrock | CDR/syslog anomaly detection with 7-day baseline |
| UC19 | AdTech | Rekognition, Textract, Bedrock | Brand compliance scoring + moderation |
P1: Document Intelligence
| UC | Industry | Key AWS Services | Differentiator |
|---|---|---|---|
| UC20 | Travel | Textract, Comprehend, Rekognition | Multilingual reservation extraction + facility inspection |
| UC21 | Agriculture | Rekognition, Textract, Bedrock | GeoTIFF crop analysis + lot traceability |
| UC22 | Transportation | Rekognition, Textract, Bedrock | Safety-critical escalation trigger + deterioration trends |
P2: Specialized Processing
| UC | Industry | Key AWS Services | Differentiator |
|---|---|---|---|
| UC23 | Sustainability | Textract, Bedrock | ESG metric extraction + GRI/TCFD/ISSB mapping |
| UC24 | Nonprofit | Textract, Comprehend, Bedrock | Grant application + outcome matching |
| UC25 | Utilities | Rekognition, Bedrock, Athena | Drone + SCADA + thermal tri-modal inspection |
| UC26 | Real Estate | Rekognition, Textract, Bedrock | Property analysis + lease extraction + PII flagging |
| UC27 | HR | Textract, Comprehend, Bedrock | Recruiting document triage with PII protection |
| UC28 | Chemicals | Textract, Rekognition, Bedrock | SDS hazard extraction + GHS compliance + lab notebook |
Architecture: One Pattern, Many Industries
Architecture Classification
| Layer | Classification |
|---|---|
| Workflow orchestration | Serverless (Step Functions) |
| Compute | Serverless (Lambda) |
| Eventing / scheduling | Serverless (EventBridge) |
| AI/ML services | Managed service consumption (Bedrock, Textract, Rekognition, Comprehend) |
| File storage | Managed/provisioned (FSx for ONTAP) |
| Operations model | Hybrid: serverless processing + managed file storage |
Lambda concurrency must be bounded by FSx ONTAP S3 AP throughput behavior. Do not treat Lambda concurrency as the only scaling control.
Common Workflow Pattern
Every pattern follows the same proven architecture:
EventBridge Scheduler
│
▼
Step Functions State Machine
│
├── Discovery Lambda (VPC-internal, ONTAP API)
│ │
│ ▼
│ S3 Access Point (list + classify files)
│
├── Processing Map (parallel, Retry + Catch)
│ │
│ ▼
│ [Rekognition | Textract | Comprehend | Bedrock | Athena]
│
└── Report Lambda
│
├── Output → S3 AP (FSx ONTAP volume)
└── SNS Notification
What changes per industry:
- File prefixes and extensions (Discovery Lambda configuration)
- AI/ML service selection (Rekognition for images, Textract for documents, Bedrock for reasoning)
- Domain-specific schemas (ESG metrics, GHS sections, CDR fields)
- Review thresholds (60% escalation trigger for safety-critical defects, 80% standard detection, 90% auto-approve threshold)
- Compliance requirements (PII filtering for HR, data classification labels, audit trails)
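The variation points listed above can be pictured as a small per-industry configuration object. The following is a minimal sketch under stated assumptions: `PatternConfig`, `select_keys`, the field names, and the example prefix and extensions are hypothetical illustrations and are not taken from the repository.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class PatternConfig:
    """Hypothetical per-industry settings for a Discovery Lambda."""
    prefixes: tuple[str, ...]
    extensions: tuple[str, ...]
    detection_threshold: int = 80     # standard detection trigger (percent)
    auto_approve_threshold: int = 90  # auto-approve only above this (percent)


def select_keys(keys: list[str], config: PatternConfig) -> list[str]:
    """Keep only object keys that match the configured prefixes and extensions."""
    selected = []
    for key in keys:
        suffix = PurePosixPath(key).suffix.lower()
        if key.startswith(config.prefixes) and suffix in config.extensions:
            selected.append(key)
    return selected


# Example values are assumptions chosen for illustration only.
railway = PatternConfig(
    prefixes=("inspections/",),
    extensions=(".jpg", ".png", ".pdf"),
    detection_threshold=60,  # safety-critical escalation trigger
)
```

Keeping these values in one immutable object means a new industry pattern mostly changes data, not control flow.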
For production deployments, validate how S3 AP-generated output files appear from existing NFS/SMB clients, including ownership, permissions, naming convention, and Snapshot/SnapMirror policy impact. See ONTAP Integration Notes.
Shared Modules: The Productivity Multiplier
The 11 new patterns reuse the same shared/ modules that power the original 17:
| Module | Purpose | Used By |
|---|---|---|
| s3ap_helper.py | S3 Access Point abstraction (alias + ARN) | All 28 UCs |
| exceptions.py | Domain exceptions + error handler decorator | All 28 UCs |
| observability.py | EMF metrics + structured logging | All 28 UCs |
| human_review.py | Confidence-based review decisions | UC22, UC25, UC27 |
| data_classification.py | Output data labeling (INTERNAL/CUI/etc.) | UC23, UC24, UC27, UC28 |
| schemas/events.py | TypedDict event/response schemas | All 28 UCs |
Adding a new industry pattern takes 2-3 hours (not days) because the infrastructure is already solved. A new pattern is considered field-shareable only after DemoMode execution, cfn-lint validation, unit/property tests, success metrics, data classification, and human review thresholds are documented.
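For readers unfamiliar with the EMF metrics mentioned in the table, the sketch below shows the general shape of a CloudWatch Embedded Metric Format log line. It is not the repository's observability.py: `build_emf_record`, `emit_metric`, the `UseCase` dimension, and the namespace used in the usage note are hypothetical names chosen for illustration.

```python
import json
import time


def build_emf_record(namespace: str, use_case: str, name: str,
                     value: float, unit: str = "Count") -> dict:
    """Build one CloudWatch Embedded Metric Format (EMF) record."""
    return {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # milliseconds since epoch
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [["UseCase"]],
                    "Metrics": [{"Name": name, "Unit": unit}],
                }
            ],
        },
        "UseCase": use_case,  # dimension value, referenced by name above
        name: value,          # metric value, referenced by name above
    }


def emit_metric(namespace: str, use_case: str, name: str, value: float) -> None:
    """Print the record as one JSON line; Lambda sends stdout to CloudWatch Logs."""
    print(json.dumps(build_emf_record(namespace, use_case, name, value)))
```

A call such as `emit_metric("Example/Patterns", "UC22", "FilesProcessed", 3)` writes a single log line from which CloudWatch can extract the metric, without a separate metrics API call.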
Key Design Decisions for New Patterns
1. Safety-Critical Thresholds (UC22)
Railway infrastructure inspection cannot accept false negatives. We use a dual-threshold approach:
STANDARD_THRESHOLD = 80 # General defect detection trigger
SAFETY_CRITICAL_THRESHOLD = 60 # Bridges, signaling, rail joints — lower to reduce false negatives
HUMAN_REVIEW_THRESHOLD = 90 # Auto-approve only above this
Critical design intent: 60% is NOT an auto-approval threshold. It is an escalation trigger — any signal above 60% for safety-critical categories triggers mandatory human review. The system is designed to surface potential defects for expert evaluation, not to automate safety decisions. All detections below 90% confidence require human review regardless of category.
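One way to read the dual-threshold rule is as a three-way routing decision. The sketch below is an interpretation, not the repository's human_review.py: the function name `route_detection` and the outcome labels are hypothetical, and the boundary handling (triggers inclusive, auto-approval strictly above 90) is an assumption.

```python
STANDARD_THRESHOLD = 80         # general defect detection trigger
SAFETY_CRITICAL_THRESHOLD = 60  # escalation trigger for safety-critical categories
HUMAN_REVIEW_THRESHOLD = 90     # auto-approve only above this


def route_detection(confidence: float, safety_critical: bool) -> str:
    """Return 'no_action', 'human_review', or 'auto_approve' for one detection."""
    trigger = SAFETY_CRITICAL_THRESHOLD if safety_critical else STANDARD_THRESHOLD
    if confidence < trigger:
        return "no_action"        # signal too weak to count as a detection
    if safety_critical:
        return "human_review"     # mandatory review, whatever the confidence
    if confidence > HUMAN_REVIEW_THRESHOLD:
        return "auto_approve"     # non-critical category, very high confidence
    return "human_review"
```

Under this reading, a 65% signal on a bridge component goes to a human reviewer, while the same 65% signal in a non-critical category is not raised at all.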
2. PII-First Design (UC27)
Recruiting document triage handles personal data. The pattern enforces:
- No PII in logs — structured logging strips personal identifiers
- Protected characteristic exclusion — Bedrock prompt explicitly excludes age, gender, ethnicity
- Encrypted output — all results written with data classification labels
- Audit trail — every scoring decision is logged with justification (not content)
Regulatory notice: UC27 is a document triage and summarization workflow, not an automated hiring decision system. Final hiring decisions must remain with qualified human reviewers. Customers must validate against local labor law, privacy regulations (GDPR, APPI, CCPA), and anti-discrimination requirements before any use in recruitment processes. Output must not include ranking by protected attributes, and explanation fields must cite only job-relevant qualifications.
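The "No PII in logs" rule can be approximated by redacting known personal fields before a record is serialized. This is a minimal sketch, assuming key-based redaction is acceptable as a first line of defense; `PII_KEYS`, `redact`, and `log_event` are hypothetical names, and the field list is illustrative only.

```python
import json

# Hypothetical field names; a real deployment would define its own list.
PII_KEYS = frozenset({"name", "email", "phone", "address", "date_of_birth"})


def redact(value, pii_keys=PII_KEYS):
    """Return a copy of value with every PII-keyed entry replaced by a marker."""
    if isinstance(value, dict):
        return {
            k: "[REDACTED]" if str(k).lower() in pii_keys else redact(v, pii_keys)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(item, pii_keys) for item in value]
    return value


def log_event(message: str, **fields) -> str:
    """Emit one structured log line with personal fields removed."""
    line = json.dumps({"message": message, **redact(fields)}, sort_keys=True)
    print(line)
    return line
```

Key-based redaction does not catch personal data embedded in free text, so it complements the practice of logging the justification rather than the document content.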
3. Tri-Modal Inspection (UC25)
Utilities asset inspection combines three data modalities in a single workflow:
- Visual (drone images) → Rekognition defect detection
- Temporal (SCADA logs) → Athena time-series anomaly detection
- Thermal (FLIR images) → Hot-spot classification (≥10°C differential)
The Step Functions workflow processes all three in parallel Map states, then merges results for a unified maintenance priority report.
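The merge step can be sketched as follows. Only the 10°C differential comes from the text above; the `AssetFindings` structure and the priority rule (more modalities in agreement means higher priority) are hypothetical placeholders for whatever scoring the real workflow applies.

```python
from dataclasses import dataclass

HOT_SPOT_DELTA_C = 10.0  # from the article: >= 10 degrees C differential


def is_hot_spot(component_temp_c: float, reference_temp_c: float) -> bool:
    """Flag a thermal reading that exceeds its reference by the differential."""
    return (component_temp_c - reference_temp_c) >= HOT_SPOT_DELTA_C


@dataclass
class AssetFindings:
    """Per-asset results from the three parallel branches (hypothetical shape)."""
    asset_id: str
    visual_defect: bool   # drone image branch
    scada_anomaly: bool   # SCADA time-series branch
    hot_spot: bool        # thermal image branch


def maintenance_priority(findings: AssetFindings) -> str:
    """Hypothetical rule: count how many modalities flagged the asset."""
    hits = sum([findings.visual_defect, findings.scada_anomaly, findings.hot_spot])
    return {0: "none", 1: "low", 2: "medium", 3: "high"}[hits]
```

For example, an asset with a visible defect and a hot spot but normal SCADA readings would be ranked "medium" under this placeholder rule.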
4. ESG Framework Mapping (UC23)
Sustainability reporting requires mapping extracted metrics to multiple frameworks simultaneously:
- GRI (Global Reporting Initiative)
- TCFD (Task Force on Climate-related Financial Disclosures)
- ISSB (International Sustainability Standards Board)
Bedrock performs the mapping using structured prompts with framework-specific indicator definitions.
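Because the mapping comes from a model response, it helps to validate the structure before writing results. The sketch below assumes the model is asked to return a JSON array in which each metric carries a `frameworks` object with one key per framework (a null value meaning "no applicable indicator"). That response schema and the function name `validate_mapping` are assumptions for illustration, not the repository's format.

```python
import json

FRAMEWORKS = ("GRI", "TCFD", "ISSB")


def validate_mapping(model_output: str) -> list[dict]:
    """Parse a model response and check each metric addresses every framework."""
    metrics = json.loads(model_output)
    if not isinstance(metrics, list):
        raise ValueError("expected a JSON array of metrics")
    for metric in metrics:
        if not isinstance(metric, dict):
            raise ValueError("each metric must be a JSON object")
        mapping = metric.get("frameworks")
        if not isinstance(mapping, dict):
            raise ValueError(f"{metric.get('name')!r} has no frameworks object")
        missing = [fw for fw in FRAMEWORKS if fw not in mapping]
        if missing:
            raise ValueError(f"{metric.get('name')!r} lacks mapping for {missing}")
    return metrics
```

Rejecting an incomplete response at this point lets the Step Functions Retry/Catch logic handle it, instead of passing a partial mapping into the report.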
Testing: 1,499+ Tests Across 28 Patterns
Each new pattern includes:
- Unit tests with moto for AWS service mocking
- Property-based tests (Hypothesis) for invariant verification
- cfn-lint validation for all CloudFormation templates
- ruff linting for Python code quality
Notable property tests:
- UC22: severity_level ∈ {critical, major, minor, observation} for all inputs
- UC25: SCADA thresholds within physical bounds (voltage ±5%, frequency ±0.5 Hz)
- UC27: No protected characteristics appear in any output field
- UC28: All GHS mandatory sections validated for completeness
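The UC22 invariant illustrates what a property test checks: for any input, including out-of-range values, the output stays inside a fixed set. The sketch below uses the standard library's `random` module in place of Hypothesis so that it has no third-party dependency. The classifier `severity_level` and its cut-offs are hypothetical placeholders, not repository code.

```python
import random

SEVERITY_LEVELS = {"critical", "major", "minor", "observation"}


def severity_level(confidence: float, safety_critical: bool) -> str:
    """Hypothetical classifier; the cut-offs are placeholders."""
    if safety_critical and confidence >= 60:
        return "critical"
    if confidence >= 80:
        return "major"
    if confidence >= 50:
        return "minor"
    return "observation"


def check_severity_invariant(trials: int = 1000, seed: int = 0) -> bool:
    """Return True if every randomly generated input maps to a known level."""
    rng = random.Random(seed)
    for _ in range(trials):
        confidence = rng.uniform(-50.0, 150.0)  # deliberately includes invalid values
        safety_critical = rng.random() < 0.5
        if severity_level(confidence, safety_critical) not in SEVERITY_LEVELS:
            return False
    return True
```

With Hypothesis, the loop would be replaced by a test function decorated with `@given(st.floats(...), st.booleans())`, which also shrinks any failing input to a minimal counterexample.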
Responsible AI and Human Review
These patterns are reference workflows, not fully automated decision systems. For regulated or safety-critical domains (healthcare, finance, transportation, HR, public sector), customers must define:
- Human review thresholds — what confidence level requires expert validation
- Appeal/escalation process — how incorrect classifications are corrected
- Audit trail requirements — what decisions need immutable logging
- Data retention policy — how long intermediate results are kept
- Model evaluation criteria — accuracy, hallucination rate, bias testing on domain data
- Local regulatory review — jurisdiction-specific compliance (FISC, HIPAA, GDPR, NARA, labor law)
The shared/human_review.py module provides a framework for confidence-based routing, but threshold values and escalation procedures must be defined by domain experts, not by template defaults.
Customers are responsible for validating these workflows against their own policies, risk classification, and regulatory obligations before production use.
Pattern Selection Guide
| Customer Situation | Recommended Starting Pattern |
|---|---|
| FSx ONTAP already used for shared files | UC by industry + DemoMode=false |
| No FSx ONTAP yet, wants to evaluate workflow | Any UC + DemoMode=true |
| Document-heavy workload (PDF, contracts, reports) | UC20 / UC23 / UC24 / UC26 / UC27 / UC28 |
| Image-heavy inspection workload | UC19 / UC21 / UC22 / UC25 |
| Logs / time-series / analytics workload | UC18 / UC25 (SCADA) |
| Safety-critical review required | UC22 / UC25 with human_review module |
| PII-sensitive workflow | UC27 / UC26 with data_classification module |
| ESG / sustainability reporting | UC23 with framework mapping |
| Greenfield object-native workload (no NAS) | Prefer standard S3 + serverless-native architecture |
DemoMode to Production Path
| Area | DemoMode (evaluation) | Production (FSx ONTAP) |
|---|---|---|
| Input source | Regular S3 bucket | FSx ONTAP S3 Access Point |
| Permissions | S3 IAM only | IAM + S3 AP policy + ONTAP file identity |
| Network | Public AWS service path | Internet-origin or VPC-origin design decision (NetworkOrigin is immutable after creation) |
| Data | Sample/synthetic data | Customer-controlled NAS data |
| Governance | Demo labels only | Data classification + lineage + retention |
| Cost | ~$0.10/execution | + FSx ONTAP infrastructure (~$194/month base) |
| Code compatibility | Standard S3 bucket semantics | Validate the FSx ONTAP S3 AP API subset and unsupported S3 bucket features before production |
| Access point lifecycle | N/A | NetworkOrigin changes require creating a new S3 AP |
Cost varies by region, deployment type, SSD capacity, throughput capacity, backups, and data transfer; the figure above is a baseline estimate for Single-AZ / 128 MBps / 1 TB SSD. This cost model is not scale-to-zero storage. Use this pattern when the value of processing existing NAS data in place outweighs the baseline FSx ONTAP infrastructure cost.
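To see how the fixed storage cost dominates at low volume, the two figures from the table can be combined in a simple estimate. This is back-of-the-envelope arithmetic using the article's approximate numbers (~$0.10 per execution, ~$194 per month base); the function names are hypothetical and real costs vary as described above.

```python
def monthly_cost_usd(executions_per_month: int,
                     per_execution_usd: float = 0.10,
                     fsx_base_usd: float = 194.0) -> float:
    """Fixed FSx ONTAP baseline plus variable per-execution cost."""
    return fsx_base_usd + executions_per_month * per_execution_usd


def cost_per_execution_usd(executions_per_month: int) -> float:
    """Effective cost of one execution once the baseline is spread across runs."""
    if executions_per_month <= 0:
        raise ValueError("executions_per_month must be positive")
    return monthly_cost_usd(executions_per_month) / executions_per_month


for runs in (30, 300, 3000):
    print(runs, round(monthly_cost_usd(runs), 2), round(cost_per_execution_usd(runs), 3))
```

The effective per-execution cost falls toward $0.10 only as volume grows, which is why the pattern makes most sense when the file system already exists for NAS workloads.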
Deployment: 30 Minutes to First Result
Every pattern includes a samconfig.toml.example and step-by-step deployment:
# 1. Copy and configure
cp samconfig.toml.example samconfig.toml
# Edit: S3AccessPointAlias, VpcId, SubnetIds, etc.
# 2. Deploy
sam build && sam deploy --guided
# 3. Execute
aws stepfunctions start-execution \
--state-machine-arn <ARN from outputs>
# 4. Verify
aws stepfunctions describe-execution --execution-arn <ARN>
# Status: SUCCEEDED
If you do not have FSx for ONTAP yet, DemoMode=true uses a regular S3 bucket, which is ideal for evaluation without infrastructure commitment.
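The verify step can also be scripted. The sketch below polls a Step Functions execution until it reaches a terminal state. The client is passed in as a parameter, so the function works with `boto3.client("stepfunctions")` or with a stub in tests; the function name `wait_for_execution` and the default intervals are assumptions.

```python
import time

TERMINAL_STATES = {"SUCCEEDED", "FAILED", "TIMED_OUT", "ABORTED"}


def wait_for_execution(sfn_client, execution_arn: str,
                       poll_seconds: float = 5.0,
                       timeout_seconds: float = 900.0) -> str:
    """Poll describe_execution until the execution finishes or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = sfn_client.describe_execution(executionArn=execution_arn)
        status = response["status"]
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"execution still {status} after {timeout_seconds}s")
        time.sleep(poll_seconds)
```

A caller would typically treat any return value other than "SUCCEEDED" as a failed deployment check.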
Benchmark Insight: Small Files Don't Need More Throughput
During Phase 15 deployment verification, we ran benchmarks at 128/256/512 MBps throughput capacity with a 202-byte JSON manifest. The table shows the 256 and 512 MBps runs:
| Throughput | P50 @ conc=1 | P50 @ conc=25 | P50 @ conc=50 |
|---|---|---|---|
| 256 MBps | 56.9 ms | 60.3 ms | 257.9 ms |
| 512 MBps | 59.8 ms | 59.9 ms | 246.1 ms |
Conclusion: For metadata-heavy workloads (JSON manifests, small config files, document headers), increasing throughput capacity had no measurable effect on P50 latency in this test. The bottleneck is connection overhead (TLS + S3 AP routing), not bandwidth. Save costs by staying at 128 MBps for these workloads.
These figures are a sizing reference from one specific test environment, not a service limit.
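The conclusion can be checked directly against the table. The sketch below computes the relative change in P50 latency between the 256 and 512 MBps runs at each concurrency level; the data is copied from the table, and the helper name `relative_change_pct` is hypothetical.

```python
# P50 latency in ms, copied from the benchmark table:
# {throughput_MBps: {concurrency: p50_ms}}
P50_MS = {
    256: {1: 56.9, 25: 60.3, 50: 257.9},
    512: {1: 59.8, 25: 59.9, 50: 246.1},
}


def relative_change_pct(baseline: float, candidate: float) -> float:
    """Percentage change of candidate relative to baseline."""
    return (candidate - baseline) / baseline * 100.0


for conc in (1, 25, 50):
    delta = relative_change_pct(P50_MS[256][conc], P50_MS[512][conc])
    print(f"conc={conc}: 512 vs 256 MBps = {delta:+.1f}%")

# Concurrency has a much larger effect than throughput capacity.
slowdown = P50_MS[256][50] / P50_MS[256][25]
print(f"256 MBps, conc 25 -> 50: {slowdown:.1f}x slower")
```

Doubling throughput capacity moves P50 by only a few percent in either direction, while doubling concurrency from 25 to 50 multiplies it roughly fourfold, which is consistent with a per-connection bottleneck.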
Documentation: 8 Languages × 28 Patterns
Every pattern includes documentation in:
🇯🇵 Japanese (primary) · 🇺🇸 English · 🇰🇷 Korean · 🇨🇳 Chinese (Simplified) · 🇹🇼 Chinese (Traditional) · 🇫🇷 French · 🇩🇪 German · 🇪🇸 Spanish
Each language includes:
- README.md: Overview, deployment, success metrics
- docs/architecture.md: Mermaid data flow diagram
- docs/demo-guide.md: Step-by-step demo with verification checklist
Each UC README includes Success Metrics with Business Outcome, Technical KPI, Quality KPI, Cost KPI, and Go/No-Go criteria. This article summarizes the portfolio; detailed success criteria live with each pattern.
What Changed Since Phase 14
| Metric | Phase 14 | Phase 15 | Delta |
|---|---|---|---|
| Use cases | 17 | 28 | +11 |
| Total patterns | 24 | 35 | +11 |
| Test count | ~800 | 1,499+ | +699 |
| Industries covered | 14/22 | 19/22 | +5 |
| Languages | 8 | 8 | — |
| Shared modules | 8 | 11 | +3 |
| Documentation files | ~400 | ~700 | +300 |
Who Should Use Each New Pattern?
Recommended Starting Patterns
| Start here if... | Pattern | Why |
|---|---|---|
| You want document intelligence | UC20 or UC26 | Multilingual extraction + property/lease analysis |
| You want log analytics | UC18 | CDR/syslog anomaly detection with baseline |
| You need PII-safe document triage | UC27 | Protected characteristic exclusion built-in |
| You need inspection workflows | UC22 or UC25 | Safety-critical escalation + tri-modal |
| You want ESG extraction | UC23 | Multi-framework mapping (GRI/TCFD/ISSB) |
Full Pattern List
| If you are... | Start with... | Why |
|---|---|---|
| Telecom operator with CDR data | UC18 | Anomaly detection across network logs |
| Ad agency managing creative assets | UC19 | Automated brand compliance scoring |
| Hotel chain with inspection photos | UC20 | Facility condition monitoring at scale |
| Agricultural cooperative | UC21 | Crop health + traceability in one workflow |
| Railway/transit operator | UC22 | Safety-critical deterioration detection |
| ESG reporting team | UC23 | Multi-framework metric extraction |
| Grant-making foundation | UC24 | Application processing + outcome matching |
| Power utility with drone programs | UC25 | Tri-modal inspection (visual + SCADA + thermal) |
| Real estate portfolio manager | UC26 | Property analysis + lease extraction |
| Recruiting team (APAC/EMEA) | UC27 | PII-compliant recruiting document triage |
| Chemical manufacturer | UC28 | SDS compliance + lab notebook digitization |
What's Next
- VPC-internal Lambda benchmark — True VPC path performance (eliminates Internet latency)
- FPolicy TCP-level Replay Storm — Real ONTAP event replay (requires ECS rebuild)
- Cross-repository integration — Link patterns to fsxn-lakehouse-integrations for analytics pipelines
- Glue Data Catalog integration — Schema versioning and data quality checks for output datasets
- Community contributions — Pattern template for community-submitted industry use cases
Resolved from Phase 14: FlexCache × S3 AP integration confirmed as not currently supported by AWS — tracked in Field Feedback Log. FC1 Recovery Metrics depend on this feature. Both remain pending AWS feature availability.
Ownership Model
| Layer | Recommended Owner |
|---|---|
| Shared modules (shared/) | Platform / DevOps team |
| UC business logic (functions/) | Application / data team |
| FSx ONTAP and S3 AP infrastructure | Storage / platform team |
| IAM, data classification, encryption | Security team |
| Success metrics and Go/No-Go | Business owner |
| Regulatory compliance mapping | GRC / legal team |
Compliance Positioning
These templates do not certify compliance with any specific regulation. They provide implementation hooks for audit logging, retention, classification, and human review that customers can map to their regulatory controls. Each organization must independently validate compliance with applicable regulations (FISC, HIPAA, GDPR, NARA, local labor law, etc.).
NetApp / ONTAP Operational Notes
For production deployments on FSx for ONTAP, review the ONTAP-specific guidance in docs/ontap-integration-notes.md, including:
- SVM / volume / protocol scope assumptions
- NFS/SMB visibility of S3 AP-generated outputs (file ownership = AP file system identity)
- IAM + S3 AP policy + ONTAP file identity behavior, separate from NFS export policy evaluation
- Snapshot / SnapMirror / retention impact on output artifacts
- Scheduler vs FPolicy trigger mode selection
- FlexCache / FlexClone combination patterns per UC
- NetApp support diagnostic bundle
- OT/manufacturing safety caveat
FlexCache/FlexClone note: UC × FC combination patterns describe adjacent architecture patterns. Validate current AWS/FSx feature support before assuming direct S3 AP access to cached or cloned paths.
Benchmark scope: Results are from Single-AZ, First-generation FSx ONTAP. Validate separately for Multi-AZ or newer generation file systems.
Regulated research workflows (UC7, UC28, FC5): Capture input dataset version, model/prompt version, reviewer action, and output checksum as lineage metadata. See shared/lineage.py v2 fields.
Stats
- New patterns: 11 (UC18-UC28)
- New Lambda functions: 44 (4 per pattern average)
- New tests: 699
- New documentation files: ~300 (across 8 languages)
- New shared modules: data_classification.py, human_review.py, schemas/events.py
- Deployment verified: All 28 UCs achieved SUCCEEDED status in ap-northeast-1
- Benchmark runs: 2 additional (256/512 MBps small-file comparison)
- Cost: ~$10 total for deployment verification (Lambda + Step Functions + Bedrock Nova Lite)
Try It Today
git clone https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns.git
cd FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns
# Quick test (no AWS account needed)
make test-quick
# Deploy any pattern with DemoMode (no FSx ONTAP needed)
cd telecom-network-analytics
cp samconfig.toml.example samconfig.toml
sam build && sam deploy --guided
Repository: github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns
Full series: FSx for ONTAP S3 Access Points on DEV.to