Building an Agentic Access-Aware RAG System with Amazon FSx for NetApp ONTAP, S3 Vectors, and S3 Access Points: Where AI Respects File Permissions

🆕 Updated April 2026: v4.0.0 released with 6 new features: Agent Registry, Multimodal RAG, Guardrails, Episodic Memory, Voice Chat, and AgentCore Policy. See what's new.

Introduction

Enterprise data lives on file servers. And on those file servers, not everyone can see everything: NTFS ACLs, UNIX permissions, and group policies control who accesses what. But when you plug that data into a Retrieval-Augmented Generation (RAG) system, those permission boundaries tend to disappear. Suddenly, anyone can ask the AI about another team's, division's, or board member's confidential information.

But there's a flip side to this problem that's equally important: without permission awareness, the AI can't fully help the people it should be helping.

Think about it. An engineer has years of design docs, project specs, and team-internal notes in their department's shared folder. A sales lead has pipeline data, customer contracts, and regional forecasts in theirs. When you strip away permissions and dump everything into one vector store, the AI doesn't just leak confidential data; it also drowns each user's results in irrelevant noise from every other team. The engineer gets sales forecasts mixed into their search results. The sales lead gets CI/CD pipeline docs they'll never need.

Permission-aware RAG flips this around. Because the system knows exactly which files each user can access, it delivers personalized, noise-free AI assistance grounded in the data each person actually works with day to day. Your personal folder, your team's shared drive, the cross-functional project space you're part of: the AI sees what you see, nothing more, nothing less.

I built Agentic Access-Aware RAG to make this real. It's an open-source system that lets AI agents autonomously search, analyze, and respond to enterprise data stored on Amazon FSx for NetApp ONTAP while respecting per-user file-level access permissions. The same question yields different answers depending on who's asking: an admin gets the full financial report, a project member gets their project's restricted docs, and a general user gets public information only. Each user gets an AI assistant that's effectively customized to their role and responsibilities, without any manual configuration.

The entire stack deploys with a single npx cdk deploy --all command.

👉 GitHub: Yoshiki0705/FSx-for-ONTAP-Agentic-Access-Aware-RAG
📦 Latest Release: v4.0.0 (6 new features added)


Architecture at a Glance

Browser → AWS WAF → CloudFront (OAC+Geo) → Lambda Web Adapter (Next.js 15)
                                                    │
              ┌─────────────┬───────────────────────┼──────────────────┐
              ▼             ▼                       ▼                  ▼
        Cognito       Bedrock KB              DynamoDB            DynamoDB
       User Pool    + S3 Vectors /          user-access          perm-cache
                    OpenSearch SL           (SID Data)         (Perm Cache)
                         │
                         ▼
                  FSx for ONTAP
                  (SVM + Volume)
                + S3 Access Point

The system is organized into 7 CDK stacks: WAF, Networking, Security (Cognito), Storage (FSx ONTAP + DynamoDB), AI (Bedrock KB + vector store), WebApp (Lambda + CloudFront), and an optional Embedding stack.

Architecture: KB Mode Card Grid


The Core Idea: Permission-Aware RAG

Traditional RAG retrieves documents based on semantic similarity alone. This system adds a second dimension: SID-based permission filtering.

Here's the flow:

  1. User sends a question via the chat UI
  2. The app retrieves the user's SID list (personal SID + group SIDs) from DynamoDB
  3. Bedrock KB Retrieve API performs vector search; each result carries allowed_group_sids metadata
  4. The app matches each document's SIDs against the user's SIDs
  5. Only permitted documents are passed to the Converse API for answer generation
  6. The user sees a filtered response with citation badges showing access levels

■ Admin user: SIDs = [...-512 (Domain Admins), S-1-1-0 (Everyone)]
  public/          → S-1-1-0 match  → ✅ Permitted
  confidential/    → ...-512 match  → ✅ Permitted
  engineering/     → No match       → ❌ Filtered out (no noise from other teams)

■ Engineer (Engineering group member): SIDs = [...-1100 (Engineering), S-1-1-0 (Everyone)]
  public/          → S-1-1-0 match  → ✅ Permitted
  confidential/    → No match       → ❌ Denied
  engineering/     → ...-1100 match → ✅ Their team's docs, front and center

■ Sales user: SIDs = [...-1200 (Sales), S-1-1-0 (Everyone)]
  public/          → S-1-1-0 match  → ✅ Permitted
  confidential/    → No match       → ❌ Denied
  engineering/     → No match       → ❌ No engineering noise in their results

The engineer asking "What's the status of Project X?" gets answers from their team's internal docs, not from sales forecasts or HR policies. The sales lead asking "What are our Q3 targets?" gets their regional data without wading through engineering specs. Each user's AI experience is naturally scoped to the data they work with every day.
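
The matching step in the flow above can be sketched in a few lines. This is a minimal illustration in Python (the app itself is a Next.js project), not the repository's actual code: the function names, the result shape, and the file name engineering/design.md are assumptions made for the example, and the SIDs are shortened with "..." as in the listing above.

```python
import json

def parse_allowed_sids(metadata: dict) -> set[str]:
    """allowed_group_sids is stored as a JSON array string, e.g. '["S-1-1-0"]'."""
    return set(json.loads(metadata.get("allowed_group_sids", "[]")))

def filter_by_sid(results: list[dict], user_sids: set[str]) -> list[dict]:
    """Keep only results whose allowed SIDs overlap the user's SIDs.

    A result without SID metadata matches nobody, so it is dropped.
    """
    return [r for r in results if parse_allowed_sids(r["metadata"]) & user_sids]

# Hypothetical retrieval results (shape assumed for this sketch).
results = [
    {"path": "public/company-overview.md",
     "metadata": {"allowed_group_sids": '["S-1-1-0"]'}},
    {"path": "confidential/financial-report.md",
     "metadata": {"allowed_group_sids": '["...-512"]'}},
    {"path": "engineering/design.md",
     "metadata": {"allowed_group_sids": '["...-1100"]'}},
]

engineer_sids = {"...-1100", "S-1-1-0"}
sales_sids = {"...-1200", "S-1-1-0"}

engineer_docs = [r["path"] for r in filter_by_sid(results, engineer_sids)]
sales_docs = [r["path"] for r in filter_by_sid(results, sales_sids)]
```

Using a set intersection keeps the check independent of how many groups a user belongs to, and treating missing metadata as "no access" keeps the filter closed by default.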

Chat Response with Citation + Access Level Badge


S3 Access Points: The Bridge Between FSx ONTAP and Bedrock KB

One of the most impactful recent additions is S3 Access Point integration with FSx for ONTAP. This creates a clean, single-path data ingestion architecture:

FSx ONTAP Volume (/data)
  ├── public/company-overview.md
  ├── public/company-overview.md.metadata.json
  ├── confidential/financial-report.md
  ├── confidential/financial-report.md.metadata.json
      │
      │  S3 Access Point
      ▼
  Bedrock KB Data Source (S3 AP alias)
      │  Ingestion Job (chunking + Titan Embed v2)
      ▼
  Vector Store (S3 Vectors or OpenSearch Serverless)

Before S3 Access Points, getting data from FSx ONTAP into Bedrock KB required either a custom Embedding server with CIFS mounts or manual S3 uploads. Now, Bedrock KB reads documents directly from the FSx ONTAP volume through the S3 Access Point, with no intermediate copies and no sync scripts.

The S3 AP user type is automatically selected based on your AD configuration:

| AD Configuration | Volume Style | S3 AP User Type | Behavior |
|---|---|---|---|
| AD configured | NTFS | WINDOWS (Admin) | NTFS ACLs automatically applied |
| No AD | NTFS/UNIX | UNIX (root) | All files accessible; permission control via .metadata.json |

One gotcha I discovered: the S3 AP WindowsUser must not include the domain prefix. DEMO\Admin works for CLI operations but causes AccessDenied on data plane APIs (ListObjects, GetObject). Always specify just Admin.
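
A small guard can catch this before deployment. The helper below is a hypothetical sketch (the name s3ap_windows_user is not from the repo); it simply drops anything up to the last backslash.

```python
def s3ap_windows_user(name: str) -> str:
    """Return the bare user name, dropping a leading DOMAIN\\ prefix if present."""
    return name.rsplit("\\", 1)[-1]

with_prefix = s3ap_windows_user("DEMO\\Admin")
without_prefix = s3ap_windows_user("Admin")
```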


S3 Vectors: Low-Cost Vector Storage

The default vector store is Amazon S3 Vectors, a relatively new service that brings vector search costs down to a few dollars per month, compared to ~$700/month for OpenSearch Serverless.

| Configuration | Cost | Latency | Best For |
|---|---|---|---|
| S3 Vectors (default) | ~$2-5/month | Sub-second to 100ms | Demo, dev, cost optimization |
| OpenSearch Serverless | ~$700/month | ~10ms | High-performance production |

S3 Vectors does have a 2KB filterable metadata limit per vector. Since Bedrock KB's internal metadata already consumes ~1KB, custom metadata is effectively limited to ~1KB. The system handles this by setting all metadata keys (including allowed_group_sids) as non-filterable and performing SID matching on the application side after retrieval.

If you start with S3 Vectors and later need higher performance, you can export on-demand to OpenSearch Serverless using the included export-to-opensearch.sh script.


Embedding Design: .metadata.json and the Ingestion Pipeline

Permission metadata follows the standard Bedrock KB metadata file specification. Each document has a companion .metadata.json file:

product-catalog.md                    ← Document body
product-catalog.md.metadata.json      ← Permission metadata

The metadata format:

{
  "metadataAttributes": {
    "allowed_group_sids": "[\"S-1-1-0\"]",
    "access_level": "public",
    "doc_type": "catalog"
  }
}

The allowed_group_sids field is a JSON array string of Windows SIDs that are allowed to access the document. S-1-1-0 is the well-known "Everyone" SID.
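
Because the SID list is a JSON array stored inside a JSON string, it has to be encoded and decoded twice. Here is a minimal Python sketch of that round trip; the helper names are illustrative and not taken from the repository.

```python
import json

def build_metadata(allowed_sids: list[str], access_level: str, doc_type: str) -> str:
    """Serialize a companion .metadata.json body.

    The SID list is itself JSON-encoded so the attribute value stays a plain string.
    """
    return json.dumps(
        {
            "metadataAttributes": {
                "allowed_group_sids": json.dumps(allowed_sids),
                "access_level": access_level,
                "doc_type": doc_type,
            }
        },
        indent=2,
    )

def read_allowed_sids(metadata_json: str) -> list[str]:
    """Decode the outer document, then the inner JSON array string."""
    attrs = json.loads(metadata_json)["metadataAttributes"]
    return json.loads(attrs["allowed_group_sids"])

body = build_metadata(["S-1-1-0"], "public", "catalog")
sids = read_allowed_sids(body)
```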

Bedrock KB Ingestion Jobs automatically read these .metadata.json files alongside documents, chunk the content, vectorize with Amazon Titan Text Embeddings v2 (1024 dimensions), and store everything in the vector store. No custom ETL pipeline needed.

Design Decisions and Trade-offs

At scale (thousands of documents), managing individual .metadata.json files becomes a maintenance burden. The system supports three approaches:

| Approach | Status | Pros | Cons |
|---|---|---|---|
| .metadata.json (current default) | ✅ Production | Bedrock KB native, no extra infra | Doubles file count, manual management |
| ONTAP REST API auto-generation | ✅ Partially implemented | File server ACLs as source of truth | Requires Embedding server |
| DynamoDB permission master | 🔜 Recommended for scale | DB-driven, easy auditing | Requires pre-Ingestion generation pipeline |

The recommended direction for large-scale environments:

ONTAP REST API (ACL retrieval)
  → DynamoDB document-permissions table
  → Auto-generate .metadata.json before Ingestion Job
  → Ingest via S3 AP into Bedrock KB
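
The "auto-generate .metadata.json" step could look roughly like the sketch below. It assumes permission records have already been read from the table into plain dicts; the record fields (path, allowed_group_sids, access_level) and the function name are hypothetical, and the demo writes into a temporary directory rather than a real volume.

```python
import json
import tempfile
from pathlib import Path

def write_metadata_files(records: list[dict], root: Path) -> list[Path]:
    """Write one <document>.metadata.json next to each document path under root."""
    written = []
    for rec in records:
        target = root / (rec["path"] + ".metadata.json")
        target.parent.mkdir(parents=True, exist_ok=True)
        body = {
            "metadataAttributes": {
                # SID list is stored as a JSON array string
                "allowed_group_sids": json.dumps(rec["allowed_group_sids"]),
                "access_level": rec["access_level"],
            }
        }
        target.write_text(json.dumps(body, indent=2), encoding="utf-8")
        written.append(target)
    return written

# Hypothetical rows, as they might come out of a document-permissions table.
records = [
    {"path": "public/company-overview.md",
     "allowed_group_sids": ["S-1-1-0"], "access_level": "public"},
    {"path": "confidential/financial-report.md",
     "allowed_group_sids": ["...-512"], "access_level": "confidential"},
]

with tempfile.TemporaryDirectory() as tmp:
    paths = write_metadata_files(records, Path(tmp))
    names = [p.name for p in paths]
    first_body = json.loads(paths[0].read_text(encoding="utf-8"))
```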

Multiple Authentication Modes

The system supports 5 authentication configurations, all driven by cdk.context.json parameters:

| Mode | Authentication | Permission Source | Configuration |
|---|---|---|---|
| A: Email/Password | Cognito native | Manual DynamoDB SID registration | Default (no extra config) |
| B: SAML AD Federation | Cognito + SAML IdP | AD Sync Lambda → auto SID retrieval | enableAdFederation=true |
| C: OIDC + LDAP | Cognito + OIDC IdP | LDAP query → auto UID/GID retrieval | oidcProviderConfig + ldapConfig |
| D: OIDC Claims Only | Cognito + OIDC IdP | OIDC token claims → group mapping | oidcProviderConfig + groupClaimName |
| E: SAML + OIDC Hybrid | Both IdPs simultaneously | Combined SID + UID/GID | Both configs + permissionMappingStrategy=hybrid |
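
To make the table concrete, here is a rough sketch of how a mode could be derived from context parameters. It assumes all keys sit at the top level of cdk.context.json, which may not match the real layout (groupClaimName in particular could be nested), and the function name is hypothetical.

```python
def resolve_auth_mode(ctx: dict) -> str:
    """Map context parameters to mode A-E, checking the most specific combination first."""
    saml = bool(ctx.get("enableAdFederation"))
    oidc = "oidcProviderConfig" in ctx
    if saml and oidc and ctx.get("permissionMappingStrategy") == "hybrid":
        return "E"
    if oidc and "ldapConfig" in ctx:
        return "C"
    if oidc and "groupClaimName" in ctx:
        return "D"
    if saml:
        return "B"
    return "A"

mode_default = resolve_auth_mode({})
mode_hybrid = resolve_auth_mode({
    "enableAdFederation": True,
    "oidcProviderConfig": {},
    "ldapConfig": {},
    "permissionMappingStrategy": "hybrid",
})
```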

Sign-in Page: SAML + OIDC Hybrid

The OIDC/LDAP federation enables zero-touch user provisioning: when a user signs in via the OIDC IdP for the first time, the Identity Sync Lambda automatically queries LDAP for their UID/GID/groups and stores them in DynamoDB. No admin intervention required.

For environments with FSx ONTAP UNIX volumes, the system also supports ONTAP name-mapping, which automatically resolves UNIX usernames to Windows users via the ONTAP REST API.


Agentic AI: Beyond Document Search

The system isn't just a search engine. Toggle between three modes with one click:

  • KB Mode: Permission-aware document search and Q&A
  • Single Agent Mode: Permission-aware autonomous multi-step reasoning via a single Bedrock Agent
  • Multi Agent Mode: Supervisor + Collaborator pattern for complex multi-agent workflows

Agent Mode Card Grid

Agent mode includes an Agent Directory, a catalog-style management screen where you can create, edit, share, and schedule Bedrock Agents from templates. The directory now includes a Registry tab for importing agents from AWS Agent Registry, and a Teams tab for creating multi-agent teams.

Agent Directory with Registry Tab

Permission filtering works in all modes. Even when agents autonomously search and reason across multiple documents, only documents the user is authorized to see are included.

AgentCore Memory (v3.3.0)

With enableAgentCoreMemory=true, the system integrates Amazon Bedrock AgentCore Memory for conversation context maintenance:

  • Short-term memory: In-session conversation history (TTL: 3 days)
  • Long-term memory: Cross-session user preferences and summaries (semantic + summary strategies)

AgentCore Memory Sidebar

Episodic Memory (v4.0.0)

Building on AgentCore Memory, enableEpisodicMemory=true adds a new dimension: the agent remembers how it solved problems, not just what it knows.

While semantic memory stores facts and summaries, episodic memory records complete task episodes: the goal, reasoning steps, actions taken, outcomes, and reflections. When a similar task comes up later, the agent automatically retrieves the top 3 most relevant past episodes and injects them into its reasoning context.

Think of it as giving the agent a "lessons learned" database that grows with every interaction:

  • Episode recording: After each conversation, a Background Reflection process automatically extracts episodes
  • Similar episode injection: Before executing a task, the agent searches for similar past episodes and uses them to inform its approach
  • Episode management UI: Browse, search (semantic, 300ms debounce), and delete episodes from the sidebar
  • Graceful degradation: If episodic memory fails, core agent functionality continues uninterrupted

The UI shows an "📚 Referenced past experience (N)" badge on responses that leveraged episodic memory.
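
Conceptually, "retrieve the top 3 most relevant episodes, and keep going if that fails" can be sketched as below. AgentCore Memory handles the real retrieval; the cosine ranking, the episode shape, and the function names here are assumptions for illustration only.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def similar_episodes(query_vec: list[float], episodes: list[dict], k: int = 3) -> list[dict]:
    """Return the k most similar episodes.

    Any failure yields an empty list so the agent proceeds without past episodes.
    """
    try:
        ranked = sorted(
            episodes,
            key=lambda e: cosine(query_vec, e["embedding"]),
            reverse=True,
        )
        return ranked[:k]
    except Exception:
        return []

# Toy 2-dimensional embeddings; real embeddings have far more dimensions.
episodes = [
    {"goal": "summarize Q3 report", "embedding": [1.0, 0.0]},
    {"goal": "summarize Q2 report", "embedding": [0.9, 0.1]},
    {"goal": "draft onboarding guide", "embedding": [0.0, 1.0]},
    {"goal": "compare quarterly reports", "embedding": [0.5, 0.5]},
]
top_goals = [e["goal"] for e in similar_episodes([1.0, 0.0], episodes)]
```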


Additional Features

Smart Routing (v3.1.0)

Automatic model selection based on query complexity. Short factual queries route to Claude Haiku (fast, cheap); complex analytical queries route to Claude Sonnet (powerful). Toggle ON/OFF in the sidebar.
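
As a rough idea of what a complexity-based router can look like, here is a toy heuristic. The keyword list and word-count threshold are invented for this sketch and are not the rules the project uses; the function returns a model family label rather than a real model ID.

```python
ANALYTICAL_HINTS = ("compare", "analyze", "explain", "trade-off", "summarize")  # hypothetical
MAX_SIMPLE_WORDS = 15  # hypothetical threshold

def route_model(query: str) -> str:
    """Return 'haiku' for short factual queries and 'sonnet' for long or analytical ones."""
    text = query.lower()
    is_long = len(text.split()) > MAX_SIMPLE_WORDS
    is_analytical = any(hint in text for hint in ANALYTICAL_HINTS)
    return "sonnet" if (is_long or is_analytical) else "haiku"

simple = route_model("What is our office address?")
complex_ = route_model("Compare Q3 revenue across regions and explain the gap")
```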

Smart Routing

Image Analysis RAG (v3.1.0)

Drag-and-drop image upload in the chat input. Images are analyzed with Bedrock Vision API (Claude Haiku 4.5) and the analysis is integrated into KB search context.

Image Upload

6-Layer Security

| Layer | Technology | Purpose |
|---|---|---|
| L1 | CloudFront Geo Restriction | Geographic access control |
| L2 | AWS WAF (6 rules) | Attack pattern detection |
| L3 | CloudFront OAC (SigV4) | Origin authentication |
| L4 | Lambda Function URL IAM Auth | API-level access control |
| L5 | Cognito JWT / SAML / OIDC | User authentication |
| L6 | SID / UID+GID / OIDC Group Filtering | Document-level authorization |

8-Language i18n โ€” Why It Matters

The UI and all documentation (README, guides, setup instructions) are available in 8 languages: Japanese, English, Korean, Simplified Chinese, Traditional Chinese, French, German, and Spanish.

This isn't just a nice-to-have. Enterprise file servers are inherently multi-regional: a global company's FSx ONTAP volumes serve teams across Tokyo, Seoul, Shanghai, Frankfurt, and New York. If the RAG interface only speaks English, you've created a barrier for the very users who need it most.

The implementation uses Next.js next-intl with per-locale message files. Every UI string goes through useTranslations(). The AI's chat responses also match the user's language: a Korean user asking in Korean gets a Korean answer with Korean citation labels.

Here's what the card grid looks like across all 8 languages:

| Language | Locale |
|---|---|
| 🇯🇵 日本語 | ja |
| 🇺🇸 English | en |
| 🇰🇷 한국어 | ko |
| 🇨🇳 简体中文 | zh-CN |
| 🇹🇼 繁體中文 | zh-TW |
| 🇫🇷 Français | fr |
| 🇩🇪 Deutsch | de |
| 🇪🇸 Español | es |

v4.0.0: Six New Features (April 2026)

v4.0.0 adds six capabilities that extend the system from document search into a more complete enterprise AI platform. All are opt-in via CDK parameters, with zero additional cost when disabled.

Agent Registry Integration

enableAgentRegistry=true adds a "Registry" tab to the Agent Directory, connecting to AWS Agent Registry (Amazon Bedrock AgentCore). Your organization's shared Agents, Tools, and MCP Servers become searchable and importable directly from the UI.

  • Semantic search across registry records
  • One-click import from registry to local Bedrock Agent (name collision handling with _imported_YYYYMMDD suffix)
  • Publish local agents to the registry (with approval workflow)
  • Resource type filters (Agent / Tool / MCP Server)
  • Cross-region access via agentRegistryRegion parameter
  • Fault isolation: registry errors don't affect other Agent Directory tabs
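
The name collision rule on import is easy to picture in code. A minimal sketch, assuming the suffix is only appended when the name is already taken (the function name and that condition are my assumptions):

```python
from datetime import date

def imported_agent_name(name: str, existing: set[str], today: date) -> str:
    """Append _imported_YYYYMMDD when the registry name collides with a local agent."""
    if name not in existing:
        return name
    return f"{name}_imported_{today:%Y%m%d}"

unique = imported_agent_name("report-bot", {"search-bot"}, date(2026, 4, 1))
collided = imported_agent_name("report-bot", {"report-bot"}, date(2026, 4, 1))
```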

Note: Agent Registry is a Preview API as of April 2026. The implementation uses SigV4-signed HTTP with REST path mapping. When the Node.js SDK adds native commands, the client can be swapped with minimal changes.

Multimodal RAG Search

embeddingModel: "nova-multimodal" switches the Knowledge Base from text-only (Titan Text Embeddings v2) to cross-modal search across text, images, video, and audio using Amazon Nova Multimodal Embeddings.

The architecture uses two patterns that make model changes painless:

  • Embedding Model Registry: Model definitions are configuration objects in a catalog. Adding a new model = adding one entry
  • KB Config Strategy: Dynamically generates KB configuration, IAM policies, and Lambda environment variables from the registry entry
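
Here is one way the two patterns could fit together, as a hedged sketch. The registry key titan-text-v2, the environment variable names, and the Nova model ID placeholder are all hypothetical; only the Titan v2 dimension count (1024) comes from this article.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class EmbeddingModel:
    model_id: str
    modalities: tuple
    dimensions: Optional[int]  # None where this sketch does not assume a value

# Adding a model means adding one entry here.
REGISTRY = {
    "titan-text-v2": EmbeddingModel("amazon.titan-embed-text-v2:0", ("text",), 1024),
    "nova-multimodal": EmbeddingModel(
        "<nova-multimodal-model-id>",  # placeholder, not a real ID
        ("text", "image", "video", "audio"),
        None,
    ),
}

def kb_env(model_key: str) -> dict[str, str]:
    """Derive Lambda environment variables from a registry entry (names are hypothetical)."""
    model = REGISTRY[model_key]
    return {
        "EMBEDDING_MODEL_ID": model.model_id,
        "EMBEDDING_MODALITIES": ",".join(model.modalities),
    }

nova_env = kb_env("nova-multimodal")
```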

For gradual migration, multimodalKbMode: "dual" runs two KBs in parallel, text-only (Titan) and multimodal (Nova), with a query router that directs text queries to the text KB and image-attached queries to the multimodal KB. Users can toggle between them.
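
The routing rule in dual mode reduces to a small decision function. This sketch assumes the user toggle takes precedence over automatic routing; the function and parameter names are illustrative.

```python
from typing import Optional

def select_kb(has_image_attachment: bool, user_override: Optional[str] = None) -> str:
    """Return 'text' or 'multimodal' for a query in dual mode."""
    if user_override in ("text", "multimodal"):
        return user_override  # assumption: an explicit user toggle wins
    return "multimodal" if has_image_attachment else "text"

text_query = select_kb(has_image_attachment=False)
image_query = select_kb(has_image_attachment=True)
forced = select_kb(has_image_attachment=False, user_override="multimodal")
```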

Caveat: Nova Multimodal Embeddings is currently available in us-east-1 and us-west-2 only. Changing the embedding model requires KB recreation and full data re-ingestion.

Guardrails Organizational Safeguards

enableGuardrails=true with optional guardrailsConfig gives fine-grained control over Bedrock Guardrails:

  • Content filter strength: Per-category (sexual, violence, hate, insults, misconduct, prompt attack) input/output filter levels (NONE/LOW/MEDIUM/HIGH)
  • Topic policies: Block specific topics (e.g., competitor information)
  • PII detection: Per-entity-type actions (BLOCK or ANONYMIZE for email, phone, credit card, etc.)
  • Contextual grounding: Hallucination prevention with configurable thresholds

The UI adds:

  • GuardrailsStatusBadge on every chat response: ✅ safe / ⚠️ filtered / ⚠️ check unavailable
  • GuardrailsAdminPanel in the sidebar (admin-only, read-only): shows account guardrails config and detects AWS Organizations Organizational Safeguards
  • EMF metrics: GuardrailsInputBlocked, GuardrailsOutputFiltered, GuardrailsPassthrough → CloudWatch dashboard + SNS alerts

Error handling follows a Fail-Open strategy: if the Guardrails API times out (5s) or returns 5xx, chat continues normally with an error log. The AI never stops working because of a guardrails hiccup.
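
A fail-open wrapper of this kind can be sketched as follows. The check callable stands in for the real Guardrails call, and for simplicity the sketch treats every exception like a 5xx; the function name and return shape are assumptions.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

logger = logging.getLogger("guardrails")

def check_with_fail_open(check, text: str, timeout_s: float = 5.0) -> dict:
    """Run check(text) -> bool (True means blocked).

    On timeout or error, log it and let the chat continue.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        blocked = pool.submit(check, text).result(timeout=timeout_s)
        return {"allowed": not blocked, "status": "filtered" if blocked else "safe"}
    except FutureTimeout:
        logger.error("guardrails check timed out after %.1fs", timeout_s)
    except Exception:
        logger.exception("guardrails check failed")
    finally:
        pool.shutdown(wait=False)  # do not block the response on a slow check
    return {"allowed": True, "status": "check unavailable"}

ok = check_with_fail_open(lambda text: False, "hello")
blocked = check_with_fail_open(lambda text: True, "hello")
```

The trade-off is deliberate: availability wins over strictness, so the "check unavailable" status matters for letting users and admins see when a response was not screened.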

Voice Chat (Amazon Nova Sonic)

enableVoiceChat=true adds voice interaction. Click the 🎤 microphone button (or Ctrl+Shift+V), speak your question, and get a text + audio response, all through the same permission-aware RAG pipeline.

Phase 1 (current) uses REST + Bedrock Converse API:

Browser (mic) → POST /api/voice/stream → Converse API (speech→text)
                                        → KB/Agent RAG pipeline
                                        → text + audio response → Browser

  • Waveform animation (Canvas-based, input=blue, output=green, respects prefers-reduced-motion)
  • 30-second silence timeout with auto-stop
  • Auto-reconnect (max 3 attempts), then text fallback
  • Works in KB mode, Single Agent mode, and Multi Agent mode
  • Permission filtering is input-method-agnostic: voice queries get the same SID/UID/GID filtering as text
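
The reconnect-then-fallback behavior boils down to a bounded retry loop. The real logic runs in the browser; this Python sketch only shows the control flow, and it assumes the 3 attempts include the first try (the function name is hypothetical).

```python
def connect_with_fallback(connect, max_attempts: int = 3) -> str:
    """Return 'voice' if connect() succeeds within max_attempts, else fall back to 'text'."""
    for _ in range(max_attempts):
        try:
            connect()
            return "voice"
        except ConnectionError:
            continue  # try again until attempts run out
    return "text"

attempts = []

def flaky_connect():
    """Fails twice, then succeeds (stand-in for a real voice connection)."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("simulated drop")

mode = connect_with_fallback(flaky_connect)
```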

Phase 2 (planned) will use API Gateway WebSocket + Nova Sonic InvokeModelWithBidirectionalStream for real-time bidirectional streaming.

Estimated monthly cost: $70-$100 (input ~$0.0019/min, output ~$0.0076/min).

AgentCore Policy

enableAgentPolicy=true adds agent behavior control. Define boundaries in natural language (what tools the agent can use, what APIs it can call, what data it can access) and the system enforces them in real-time.

  • 3 policy templates: Security-focused, Cost-focused, Flexibility-focused
  • PolicyEvaluationMiddleware: Evaluates every agent action against the policy (3s timeout)
  • Fail-open / Fail-closed: policyFailureMode controls behavior when policy evaluation fails
  • Violation logging: EMF-format metrics (PolicyViolationCount, PolicyEvaluationLatency) → CloudWatch dashboard
  • PolicySection in Agent create/edit forms: optional natural language policy input (max 2000 chars)
  • PolicyBadge (🛡️) on agents with active policies
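
The difference between the two failure modes is easiest to see in code. In this sketch the evaluate callable stands in for the policy engine call and is expected to raise (for example TimeoutError) when evaluation cannot complete; the mode strings "fail-open" and "fail-closed" are assumed values for policyFailureMode.

```python
def action_allowed(evaluate, action: dict, failure_mode: str = "fail-closed") -> bool:
    """Return True if the agent action may proceed.

    A normal evaluation result is always respected; failure_mode only decides
    what happens when the evaluation itself fails.
    """
    try:
        return bool(evaluate(action))
    except Exception:
        return failure_mode == "fail-open"

def only_kb_search(action: dict) -> bool:
    """Toy policy: permit the kb_search tool and nothing else."""
    return action.get("tool") == "kb_search"

def engine_down(action: dict) -> bool:
    raise TimeoutError("policy evaluation exceeded 3s")

permitted = action_allowed(only_kb_search, {"tool": "kb_search"})
denied = action_allowed(only_kb_search, {"tool": "delete_file"})
open_on_failure = action_allowed(engine_down, {"tool": "kb_search"}, "fail-open")
closed_on_failure = action_allowed(engine_down, {"tool": "kb_search"}, "fail-closed")
```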

Note: AgentCore Policy reached GA in March 2026 with a Policy Engine + Gateway architecture. Policies are written in Cedar language (with natural language auto-conversion). The implementation uses SigV4-signed HTTP.

Feature Flags Runtime API

A cross-cutting change that affects all v4 features: the UI no longer relies on NEXT_PUBLIC_* build-time environment variables. Instead, a /api/config/features endpoint reads Lambda environment variables at runtime and returns feature flags. The useFeatureFlags hook caches flags in localStorage for instant page loads.

This means you can enable/disable features by changing CDK parameters and redeploying, without rebuilding the Docker image.
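
The core of a runtime flags endpoint is just reading environment variables on each request. A minimal sketch of that mapping follows; the flag keys and variable names are hypothetical, and the real endpoint is a Next.js route rather than Python.

```python
import os

# Hypothetical mapping of flag name -> Lambda environment variable.
FLAG_ENV_VARS = {
    "voiceChat": "ENABLE_VOICE_CHAT",
    "guardrails": "ENABLE_GUARDRAILS",
    "episodicMemory": "ENABLE_EPISODIC_MEMORY",
}

def read_feature_flags(env=os.environ) -> dict[str, bool]:
    """Evaluate flags at call time, so a config change takes effect without a rebuild."""
    return {
        flag: env.get(var, "false").lower() == "true"
        for flag, var in FLAG_ENV_VARS.items()
    }

flags = read_feature_flags({"ENABLE_VOICE_CHAT": "true"})
```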

Multi-Agent Collaboration: Now Default-On

When enableAgent=true, multi-agent collaboration (enableMultiAgent) is now enabled by default. Bedrock Agents have zero standby cost, so this adds no running cost. Token consumption only increases (3-6x) when users actually chat in Multi Agent mode. Set enableMultiAgent: false explicitly to disable.

Multi-Agent Mode


Multi-Agent Collaboration: Permission-Aware Agent Teams

The system uses Amazon Bedrock Agents' Supervisor + Collaborator pattern. Instead of a single agent handling everything, specialized agents work together:

  • Supervisor Agent: Detects user intent, routes tasks to the right collaborator
  • Permission Resolver: Resolves SID/UID/GID from the User Access Table
  • Retrieval Agent: Executes KB search with permission metadata filters
  • Analysis Agent: Summarizes and reasons over filtered context (no direct KB access)
  • Output Agent: Generates reports and documents (no direct KB access)

The key design principle: KB access is restricted to Permission Resolver and Retrieval Agent only. Analysis and Output agents receive "filtered context" and never touch the knowledge base directly. This preserves the same SID/UID/GID permission boundaries that exist in single-agent mode.

Teams Gallery

Cost Structure

| Scenario | Agent Calls | Est. Cost/Request |
|---|---|---|
| Single Agent (existing) | 1 | ~$0.02 |
| Multi-Agent (simple query) | 2-3 | ~$0.06 |
| Multi-Agent (complex query) | 4-6 | ~$0.17 |

Deployment Lessons Learned

CloudFormation AgentCollaboration values: Only DISABLED, SUPERVISOR, and SUPERVISOR_ROUTER are valid. COLLABORATOR is NOT a valid value. Collaborator Agents should not set this property at all.

2-stage deploy is mandatory: You cannot create a Supervisor Agent with SUPERVISOR_ROUTER and collaborators in a single CloudFormation operation. The solution: create with DISABLED first, then a Custom Resource Lambda changes to SUPERVISOR_ROUTER, associates collaborators, and runs PrepareAgent.
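
The second stage can be sketched as a plain function that takes a bedrock-agent client. This is an illustration of the call order, not the repo's Custom Resource handler: the parameter names follow my reading of the boto3 bedrock-agent API and should be verified against the SDK documentation, and the dict shapes for agent and collaborators are invented for the example.

```python
def promote_supervisor(client, agent: dict, collaborators: list[dict]) -> None:
    """Stage 2: switch a DISABLED agent to SUPERVISOR_ROUTER, attach collaborators, prepare.

    `client` is expected to behave like boto3.client("bedrock-agent").
    update_agent replaces the agent definition, so a real handler should also
    pass through existing fields such as the instruction.
    """
    client.update_agent(
        agentId=agent["agentId"],
        agentName=agent["agentName"],
        agentResourceRoleArn=agent["roleArn"],
        foundationModel=agent["foundationModel"],
        agentCollaboration="SUPERVISOR_ROUTER",
    )
    for collaborator in collaborators:
        client.associate_agent_collaborator(
            agentId=agent["agentId"],
            agentVersion="DRAFT",
            agentDescriptor={"aliasArn": collaborator["aliasArn"]},
            collaboratorName=collaborator["name"],
            collaborationInstruction=collaborator["instruction"],
        )
    client.prepare_agent(agentId=agent["agentId"])


class RecordingClient:
    """Test double that records call names instead of calling AWS."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        return lambda **kwargs: self.calls.append((name, kwargs))


fake = RecordingClient()
promote_supervisor(
    fake,
    {"agentId": "A1", "agentName": "supervisor", "roleArn": "arn:role", "foundationModel": "model"},
    [{"aliasArn": "arn:alias", "name": "retrieval", "instruction": "Search the KB"}],
)
call_order = [name for name, _ in fake.calls]
```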

IAM permissions: The Supervisor Agent's IAM role needs bedrock:GetAgentAlias + bedrock:InvokeAgent on agent-alias/*/*. The Custom Resource Lambda needs iam:PassRole for the Supervisor role.
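
For reference, the Supervisor role statement can be expressed as a policy document like the one below. It is a sketch built from the two actions and the resource pattern named above; the helper name is hypothetical, and you may want to scope the resource more tightly than the wildcard.

```python
def supervisor_policy(region: str, account_id: str) -> dict:
    """Policy document letting the Supervisor resolve and invoke collaborator aliases."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["bedrock:GetAgentAlias", "bedrock:InvokeAgent"],
                "Resource": f"arn:aws:bedrock:{region}:{account_id}:agent-alias/*/*",
            }
        ],
    }

policy = supervisor_policy("ap-northeast-1", "123456789012")
```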


Tips for Builders

OpenLDAP memberOf Overlay

If you're testing with OpenLDAP, the LDAP Connector reads the memberOf attribute from user entries. Basic OpenLDAP doesn't populate this automatically: you need to add moduleload memberof and overlay memberof to slapd.conf, and create groupOfNames entries (not just posixGroup).

The repo includes setup-openldap.sh that handles all of this automatically.
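
Once memberOf is populated, its values are full DNs, so a connector typically reduces them to group names. The sketch below shows one simple way to do that; it is not the repo's LDAP Connector, the DNs are made up, and it ignores escaped commas inside RDN values.

```python
def group_names(member_of: list[str]) -> list[str]:
    """Extract the leading CN value from each memberOf DN."""
    names = []
    for dn in member_of:
        first_rdn = dn.split(",", 1)[0]
        key, _, value = first_rdn.partition("=")
        if key.strip().lower() == "cn" and value:
            names.append(value.strip())
    return names

groups = group_names([
    "cn=engineering,ou=groups,dc=demo,dc=local",
    "CN=sales,ou=groups,dc=demo,dc=local",
    "ou=not-a-group,dc=demo,dc=local",
])
```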

Geo Restriction Default

The WAF configuration defaults to Japan-only access (allowedCountries: ["JP"]). If you're deploying outside Japan, update this before deploying:

{ "allowedCountries": ["JP", "US", "DE", "SG"] }

Set to [] for worldwide access.

Existing FSx ONTAP Reuse

If you already have an FSx for ONTAP file system, specify existingFileSystemId, existingSvmId, and existingVolumeId in cdk.context.json to skip FSx creation entirely. This cuts deployment time from 30-40 minutes to under 10 minutes.


Built with Kiro

I used Kiro throughout the entire development lifecycle: specs for requirements-to-code traceability, hooks for automated validation on file saves, and steering files for project-specific rules that persist across sessions. The v4.0.0 release involved 195 files changed, 8-language documentation updates, property-based tests with fast-check, and live AWS environment verification across multiple accounts, all developed with Kiro's assistance. As a solo developer, this level of tooling makes enterprise-quality projects feasible.


Getting Started

git clone https://github.com/Yoshiki0705/FSx-for-ONTAP-Agentic-Access-Aware-RAG.git
cd FSx-for-ONTAP-Agentic-Access-Aware-RAG && npm install

npx cdk bootstrap aws://$(aws sts get-caller-identity --query Account --output text)/ap-northeast-1
npx cdk bootstrap aws://$(aws sts get-caller-identity --query Account --output text)/us-east-1

bash demo-data/scripts/pre-deploy-setup.sh
npx cdk deploy --all --require-approval never
bash demo-data/scripts/post-deploy-setup.sh

Prerequisites: Node.js 22+, Docker, AWS CLI configured with AdministratorAccess. Total deployment time is about 30-40 minutes (FSx ONTAP creation takes 20-30 minutes). Use existingFileSystemId to skip FSx creation if you already have one.


What's Next

The project is at v4.0.0 with 19 implementation aspects and actively evolving. Some directions I'm exploring:

  • Voice Chat Phase 2: WebSocket via API Gateway + Nova Sonic InvokeModelWithBidirectionalStream for real-time bidirectional streaming (replacing the current REST-based Phase 1)
  • DynamoDB-driven permission master: Eliminating per-file .metadata.json management for large-scale environments
  • Multi-volume embedding: Independent S3 Access Points per FSx for ONTAP volume with cross-volume search
  • Agent Registry GA SDK migration: When the Node.js SDK adds native Agent Registry commands, swap from SigV4 HTTP to SDK calls

I'm looking for feedback on:

  • Permission models: Are SID/UID-GID/OIDC-group/hybrid strategies sufficient for your use cases?
  • Voice interaction patterns: What voice-specific workflows would be valuable in enterprise RAG?
  • Policy templates: What agent behavior boundaries matter most in your organization?
  • Guardrails configurations: What content filtering rules does your compliance team require?

If you try it out, I'd love to hear about your experience, especially edge cases I haven't considered. PRs and issues are welcome.

👉 GitHub Repository (README available in 8 languages, same as the application UI)


Yoshiki Fujiwara
