Yoshiki Fujiwara(藤原善基)@AWS Community Builder for AWS Community Builders

Posted on May 14 • Edited on May 27

Smart Routing, Transfer Family Ingestion, and Voice Chat — Permission-Aware RAG v4.2

#aws #amazonfsxfornetappontap #serverless #amazonbedrock

Quick Start (5 minutes)

# Clone the repository
git clone https://github.com/Yoshiki0705/FSx-for-ONTAP-Agentic-Access-Aware-RAG.git
cd FSx-for-ONTAP-Agentic-Access-Aware-RAG

# Install dependencies
npm ci

# Deploy (Smart Routing is enabled by default)
npx cdk deploy --all --require-approval never

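# Optional: enable Transfer Family ingestion
npx cdk deploy --all -c enableTransferFamily=true --require-approval never

# Optional: enable KB Auto-Sync
# (-c context values apply only to the invocation that passes them;
#  to keep both features enabled, pass both flags in the same command)
npx cdk deploy --all -c enableKbAutoSync=true --require-approval never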

After deployment, run the post-deploy setup to create test users and upload demo data:

bash demo-data/scripts/post-deploy-setup.sh

What This Post Covers

This is a companion article to the FSx for ONTAP S3 Access Points Serverless Patterns series. That series focuses on serverless patterns for FSx for ONTAP S3 Access Points across industries; this post covers the v4.2 release of the Agentic Access-Aware RAG system, a permission-aware RAG application built on FSx for ONTAP and Amazon Bedrock. The system is production-grade in the sense of CI coverage, permission filtering, guardrails, and deployment parameterization, although some v4.2 features still have follow-up E2E items listed in What's Next.

The v4.2 release adds five features that address real-world enterprise needs: intelligent model routing for cost optimization, SFTP-based document ingestion for partners who can't use web UIs, automatic KB synchronization, operational guardrails for FSx ONTAP automation, and voice-based interaction via WebRTC.

1. Smart Routing Model Expansion

The Problem

Enterprise RAG workloads have wildly different complexity levels. A simple "What's the office address?" query doesn't need the same model as "Analyze the Q4 financial report across all subsidiaries and identify cost reduction opportunities." Routing everything through a single model either wastes money or delivers poor quality.

The Solution: 3-Tier Automatic Routing

The default routing tiers are configured for the model set currently enabled in this deployment:

Simple (greetings, factual lookups) → Claude Haiku 4.5 (anthropic.claude-haiku-4-5-20251001-v1:0)
Complex (analysis, comparison, summarization) → Claude 3.5 Sonnet v2 (anthropic.claude-3-5-sonnet-20241022-v2:0)
Full-context (multi-document reasoning, financial analysis) → Claude Opus 4 (anthropic.claude-opus-4-20250514-v1:0)

The exact model IDs are deployment parameters (lightweightModelId, powerfulModelId, heavyModelId), so teams can update to newer Sonnet/Opus releases without changing the routing logic.

┌─────────────────────────────────────────────────────┐
│                  User Query                         │
└──────────────────────┬──────────────────────────────┘
                       │
              ┌────────▼────────┐
              │  Complexity     │
              │  Classifier     │
              └┬───────┬───────┬┘
               │       │       │
        Simple │       │       │ Full-context
               ▼       ▼       ▼
           ┌──────┐ ┌──────┐ ┌──────┐
           │Haiku │ │Sonnet│ │ Opus │
           │ 4.5  │ │3.5 v2│ │  4   │
           └──────┘ └──────┘ └──────┘

The cost labels below are illustrative per-query estimates for typical RAG prompts (~1K input tokens, ~500 output tokens) in this deployment, not fixed model prices. Actual cost depends on input/output tokens, prompt caching, region, and inference configuration.

Tier	Illustrative per-query cost
Haiku 4.5	~$0.001
Sonnet 3.5 v2	~$0.01
Opus 4	~$0.10

Additionally, GPT-5.5 can be exposed as a manual selection option when OpenAI models on Amazon Bedrock are enabled for the account. In this deployment, the manual route is parameterized as openai.gpt-5-5, but teams should verify the exact model ID, Region availability, inference profile, and preview access status in their own AWS account.

If the selected model is unavailable or throttled, the router falls back to the next configured tier and emits a RoutingFallback metric.
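
The fallback rule can be sketched as follows. This is a minimal Python illustration rather than the repository's router: the tier order, the `is_available` callback, and the `emit_metric` callback are assumptions introduced here, and "next configured tier" is read as "the next tier in the configured order".

```python
from typing import Callable, Optional, Sequence

# Assumed tier order; the real order comes from the deployment parameters.
TIERS: Sequence[str] = ("simple", "complex", "full-context")


def route_with_fallback(
    selected: str,
    is_available: Callable[[str], bool],
    emit_metric: Callable[[str, str], None],
    tiers: Sequence[str] = TIERS,
) -> Optional[str]:
    """Return the first usable tier, starting at `selected`.

    `is_available` and `emit_metric` are hypothetical hooks standing in for
    the model availability check and the CloudWatch EMF emitter.
    """
    start = tiers.index(selected)
    for tier in tiers[start:]:
        if is_available(tier):
            if tier != selected:
                # Record that the request did not run on its classified tier.
                emit_metric("RoutingFallback", tier)
            return tier
    return None  # No later tier is usable; the caller decides what to do.
```

Returning `None` instead of raising keeps the policy decision (retry, queue, or surface an error) with the caller.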

Implementation

The classifier analyzes query characteristics — keyword count, presence of analytical terms, document references, context size — and routes to the appropriate tier:

// complexity-classifier.ts
export function classifyQuery(
  query: string, contextSize: number, threshold: number
): ClassificationResult {
  const features = extractFeatures(query);

  if (features.isGreeting || features.wordCount < 5) 
    return { classification: 'simple', confidence: 0.9 };
  if (features.hasAnalyticalTerms || contextSize > threshold) 
    return { classification: 'full-context', confidence: 0.8 };
  return { classification: 'complex', confidence: 0.7 };
}
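
The `extractFeatures` helper is not shown in this post. A rough Python sketch of what such a feature extractor might look like is below; the term lists, the `QueryFeatures` type, and the tokenization are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical term lists; the repository's actual lists are not shown here.
GREETINGS = {"hi", "hello", "hey", "thanks"}
ANALYTICAL_TERMS = {"analyze", "compare", "summarize", "identify", "evaluate"}


@dataclass(frozen=True)
class QueryFeatures:
    word_count: int
    is_greeting: bool
    has_analytical_terms: bool


def extract_features(query: str) -> QueryFeatures:
    """Derive the signals the classifier branches on from the raw query text."""
    words = re.findall(r"[a-z0-9']+", query.lower())
    return QueryFeatures(
        word_count=len(words),
        is_greeting=bool(words) and words[0] in GREETINGS,
        has_analytical_terms=any(w in ANALYTICAL_TERMS for w in words),
    )
```

One consequence of the branch order in `classifyQuery`: the `wordCount < 5` check runs first, so a short analytical query such as "Analyze Q4 costs" is classified as simple before the analytical-terms check is reached.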

CloudWatch EMF metrics track routing decisions, enabling cost analysis and route distribution monitoring:

Namespace: SmartRouting
Metrics: RoutingCount
Dimensions: RoutingTier (simple | complex | full-context | manual)

Sample CloudWatch EMF output:

{
  "_aws": {
    "Timestamp": 1716000000000,
    "CloudWatchMetrics": [{
      "Namespace": "SmartRouting",
      "Dimensions": [["RoutingTier"]],
      "Metrics": [{"Name": "RoutingCount", "Unit": "Count"}]
    }]
  },
  "RoutingTier": "simple",
  "RoutingCount": 1,
  "queryLength": 12,
  "confidence": 0.9,
  "modelId": "anthropic.claude-haiku-4-5-20251001-v1:0"
}
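
For readers who want to emit the same record from Python, a minimal sketch is below. The function name and parameters are illustrative; the field layout mirrors the sample output above.

```python
import json
import time


def build_routing_emf(tier: str, model_id: str, confidence: float, query_length: int) -> str:
    """Serialize one routing decision as a CloudWatch EMF log line."""
    record = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # EMF expects epoch milliseconds
            "CloudWatchMetrics": [{
                "Namespace": "SmartRouting",
                "Dimensions": [["RoutingTier"]],
                "Metrics": [{"Name": "RoutingCount", "Unit": "Count"}],
            }],
        },
        "RoutingTier": tier,   # dimension value
        "RoutingCount": 1,     # metric value
        # Extra properties are searchable in the logs but are not metrics.
        "queryLength": query_length,
        "confidence": confidence,
        "modelId": model_id,
    }
    return json.dumps(record)


print(build_routing_emf("simple", "anthropic.claude-haiku-4-5-20251001-v1:0", 0.9, 12))
```

Keeping `modelId` and `confidence` as plain properties rather than dimensions avoids multiplying the number of metric series while still allowing log queries on them.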

Full implementation: docker/nextjs/src/lib/complexity-classifier.ts

2. Transfer Family FSx ONTAP Ingestion

The Problem

Many enterprise partners — law firms, auditors, regulatory bodies — exchange documents via SFTP. They won't adopt a web UI. But their documents still need to flow into the RAG knowledge base with proper permission metadata.

Prerequisites and Limits

This pattern assumes:

FSx for ONTAP is running ONTAP 9.17.1 or later
The FSx file system and S3 Access Point are in the same AWS Region
The same AWS account owns the file system and access point
Transfer Family file operations follow the FSx S3 Access Point compatibility limits, including the 5 GB upload limit and unsupported rename/append operations

The Solution: SFTP → S3 Access Point → Bedrock KB

This feature bridges AWS Transfer Family with the existing permission-aware RAG pipeline. The architecture aligns with the approach described in the AWS Storage Blog — internal users access data via SMB/NFS, while external partners use SFTP, all reading/writing to the same FSx for ONTAP file system through S3 Access Points.

┌──────────┐     ┌─────────────────┐     ┌──────────────────┐
│  Partner │     │ Transfer Family │     │ FSx ONTAP        │
│  (SFTP)  │────▶│ SFTP Server     │────▶│ S3 Access Point  │
└──────────┘     └─────────────────┘     └────────┬─────────┘
                                                   │
                                    ┌──────────────▼──────────────┐
                                    │  EventBridge Scheduler      │
                                    │  (5-min polling)            │
                                    └──────────────┬──────────────┘
                                                   │
                              ┌─────────────────────▼─────────────────────┐
                              │         Ingestion Trigger Lambda          │
                              │  • ListObjectsV2 → detect changes         │
                              │  • Invoke Metadata Generator (async)      │
                              │  • StartIngestionJob (deduplicated)       │
                              └─────────────────────┬─────────────────────┘
                                                    │
                    ┌──────────────────────────────┬┘
                    ▼                              ▼
        ┌───────────────────┐          ┌────────────────────┐
        │ Metadata Generator│          │ Bedrock KB         │
        │ (.metadata.json)  │          │ StartIngestionJob  │
        └───────────────────┘          └────────────────────┘

This remains a polling-based sync path; an event-based CloudTrail/EventBridge mode is listed in What's Next.
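
The polling step can be sketched in Python as follows. This is an illustration under assumptions, not the repository's Lambda: `s3_client` is expected to be a boto3 S3 client, the `uploads/` prefix and the `<key>.metadata.json` sidecar naming are taken from the examples in this post, and both function names are hypothetical.

```python
from typing import Any, Dict, List


def scan_access_point(s3_client: Any, access_point: str, prefix: str = "uploads/") -> Dict[str, str]:
    """Return {key: etag} for every object under `prefix`.

    `access_point` is the access point alias or ARN, passed where a bucket
    name normally goes.
    """
    inventory: Dict[str, str] = {}
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=access_point, Prefix=prefix):
        for obj in page.get("Contents", []):  # Empty pages have no "Contents"
            inventory[obj["Key"]] = obj["ETag"]
    return inventory


def documents_missing_metadata(inventory: Dict[str, str]) -> List[str]:
    """List documents that have no `<key>.metadata.json` sidecar yet."""
    keys = set(inventory)
    return sorted(
        key for key in keys
        if not key.endswith(".metadata.json") and f"{key}.metadata.json" not in keys
    )
```

Using the paginator matters here because a single ListObjectsV2 call returns at most 1,000 keys.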

Key Design Decisions

1. HomeDirectoryMappings uses S3 AP Alias, not ARN

The Transfer Family documentation explains that FSx-backed Transfer Family access uses S3 Access Point aliases, but the failure mode is not obvious: using the full ARN in HomeDirectoryMappings.Target produced cryptic access-denied errors in my deployment.

// Correct: use alias (e.g., "my-ap-ext-s3alias")
homeDirectoryMappings: [{
  entry: '/',
  target: `/${s3AccessPointAlias}/uploads/${userName}`,
}]
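
When users are created from a script instead of CDK, the same mapping can be built in Python. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the ARN guard simply encodes the lesson above.

```python
from typing import Dict, List


def build_home_directory_mappings(s3_access_point_alias: str, user_name: str) -> List[Dict[str, str]]:
    """Build the mapping in the Entry/Target shape used by the Transfer Family API."""
    if s3_access_point_alias.startswith("arn:"):
        # Guard against the failure mode described above.
        raise ValueError("Pass the S3 Access Point alias, not the full ARN")
    return [{
        "Entry": "/",
        "Target": f"/{s3_access_point_alias}/uploads/{user_name}",
    }]
```

The result can be passed as `HomeDirectoryMappings` to the boto3 Transfer Family `create_user` call together with `HomeDirectoryType="LOGICAL"`.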

2. Deduplication via IN_PROGRESS check

Before triggering StartIngestionJob, the Lambda checks if a job is already running:

from typing import Optional

def should_trigger_ingestion(has_changes: bool, current_job_status: Optional[str]) -> bool:
    if not has_changes:
        return False
    if current_job_status == 'IN_PROGRESS':
        return False
    return True
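
How `current_job_status` is obtained is not shown in this post. One possible approach, sketched below under the assumption that `bedrock_agent` is a boto3 `bedrock-agent` client, is to ask for the most recently started job. The function name is hypothetical.

```python
from typing import Any, Optional


def latest_ingestion_job_status(bedrock_agent: Any, kb_id: str, ds_id: str) -> Optional[str]:
    """Return the status of the most recently started ingestion job, if any."""
    response = bedrock_agent.list_ingestion_jobs(
        knowledgeBaseId=kb_id,
        dataSourceId=ds_id,
        sortBy={"attribute": "STARTED_AT", "order": "DESCENDING"},
        maxResults=1,
    )
    jobs = response.get("ingestionJobSummaries", [])
    return jobs[0]["status"] if jobs else None
```

A job that was just submitted may still report `STARTING`; whether to treat that like `IN_PROGRESS` is a design choice this sketch leaves to the caller.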

3. Permission metadata auto-generation and trust boundary

When a new file is detected without a corresponding .metadata.json, the Metadata Generator Lambda creates one based on the SFTP user's permission mapping in DynamoDB:

{
  "allowed_sids": ["S-1-5-21-xxx-1001"],
  "allowed_uids": ["1001"],
  "allowed_gids": ["1001"],
  "source": "transfer-family",
  "uploaded_by": "partner-a",
  "uploaded_at": "2026-05-14T10:30:00Z"
}

The SFTP user does not supply permission metadata directly. The Metadata Generator derives it from an administrator-managed DynamoDB mapping and writes .metadata.json using a service role. Partner upload roles are scoped to their home directory (/uploads/{userName}/*).
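
A minimal sketch of that derivation is below. It is not the repository's Lambda: `mapping` stands in for the administrator-managed DynamoDB item, and its attribute names, along with both function names, are assumptions. The output shape follows the sample JSON above.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def build_metadata(user_name: str, mapping: Dict[str, Any], now: Optional[datetime] = None) -> str:
    """Render the sidecar JSON for a file uploaded by `user_name`."""
    now = now or datetime.now(timezone.utc)
    return json.dumps({
        # Permissions come only from the admin-managed mapping, never the upload.
        "allowed_sids": list(mapping.get("allowed_sids", [])),
        "allowed_uids": list(mapping.get("allowed_uids", [])),
        "allowed_gids": list(mapping.get("allowed_gids", [])),
        "source": "transfer-family",
        "uploaded_by": user_name,
        "uploaded_at": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    })


def metadata_key(object_key: str) -> str:
    """Sidecar key for a document, e.g. contract.pdf -> contract.pdf.metadata.json."""
    return f"{object_key}.metadata.json"
```

In this sketch a user with no mapping gets empty allow lists, so the document stays invisible to everyone until an administrator adds a mapping.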

Security note: The SFTP user's IAM role includes an explicit Deny statement for s3:PutObject and s3:DeleteObject on *.metadata.json keys within their home directory. This prevents partners from overwriting permission metadata generated by the service role.
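
The Deny statement might look roughly like the following, expressed as a Python dict that can be serialized into the role policy. The function name and `Sid` are illustrative, and the resource path assumes the standard S3 access point object ARN layout and the `uploads/{userName}` home directory described above.

```python
from typing import Any, Dict


def metadata_deny_statement(access_point_arn: str, user_name: str) -> Dict[str, Any]:
    """Explicit Deny on sidecar keys under the partner's home directory."""
    return {
        "Sid": "DenyMetadataOverwrite",
        "Effect": "Deny",
        "Action": ["s3:PutObject", "s3:DeleteObject"],
        # Objects behind an access point are addressed as <ap-arn>/object/<key>.
        "Resource": f"{access_point_arn}/object/uploads/{user_name}/*.metadata.json",
    }
```

Because an explicit Deny overrides any Allow in IAM policy evaluation, the statement holds even if the partner role's Allow on the home directory is later broadened.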

The generated .metadata.json is then consumed by the existing permission-filtering RAG pipeline, so SFTP uploads need no separate retrieval path.

CDK Deployment

npx cdk deploy --all \
  -c enableTransferFamily=true \
  -c s3AccessPointArn="arn:aws:s3:ap-northeast-1:ACCOUNT:accesspoint/my-ap" \
  -c transferFamilyS3ApAlias="my-ap-ext-s3alias"

Full implementation: lib/stacks/demo/demo-transfer-family-stack.ts

3. KB Auto-Sync

The Problem

Documents on FSx for ONTAP change continuously — new files added, existing files updated. Without automatic synchronization, the Bedrock Knowledge Base becomes stale.

The Solution

A lightweight Lambda (Python 3.12) polls the S3 Access Point every 5 minutes, compares the listing against a DynamoDB inventory, and triggers StartIngestionJob only when changes are detected. The inventory is updated once StartIngestionJob is accepted (i.e., a job_id is returned). A future enhancement will move this to a pending/commit model so that ingestion jobs that fail after starting do not hide changes from the next scan. The current flow:

# Scan → Diff → Start job → Update inventory (on job accepted)
current_files = scan_s3_access_point(s3_ap_arn)
previous = get_inventory(table)
diff = compute_diff(current_files, previous)

if diff.has_changes:
    job_id = trigger_ingestion_if_needed(kb_id, ds_id, diff)
    if job_id:
        # Inventory updated after StartIngestionJob is accepted.
        # Future: move to pending/commit model keyed on job SUCCEEDED.
        update_inventory(table, current_files, previous, job_id)
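
The `compute_diff` step is not shown in this post. A plausible sketch, assuming both inventories are `{key: etag}` dictionaries and that the `Diff` type is defined as below, is:

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Diff:
    new: List[str] = field(default_factory=list)
    changed: List[str] = field(default_factory=list)
    unchanged: List[str] = field(default_factory=list)

    @property
    def has_changes(self) -> bool:
        return bool(self.new or self.changed)


def compute_diff(current: Dict[str, str], previous: Dict[str, str]) -> Diff:
    """Classify each current key by comparing its ETag with the stored one."""
    diff = Diff()
    for key, etag in sorted(current.items()):
        if key not in previous:
            diff.new.append(key)
        elif previous[key] != etag:
            diff.changed.append(key)
        else:
            diff.unchanged.append(key)
    return diff
```

Keys that exist only in `previous` (deleted files) are deliberately ignored here, which matches the additions-and-updates scope of the current auto-sync path.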

Enable with a single context parameter:

npx cdk deploy --all -c enableKbAutoSync=true

Full implementation: lambda/kb-auto-sync/handler.py

4. Capacity Guardrails

The Problem

The FSx ONTAP operations automation (volume resize, snapshot management) can be dangerous if triggered too frequently — especially during incidents where monitoring alerts cascade.

The Solution

A guardrails module that enforces:

Per-action rate limit: Max N executions per action per time window
Daily cap: Maximum total operations per day
Cooldown: Minimum interval between consecutive executions of the same action

@with_guardrails(action_name="volume_resize", max_per_hour=3, daily_cap=10, cooldown_seconds=300)
def resize_volume(volume_id: str, new_size_gb: int):
    # Only executes if guardrails pass
    ...

State is tracked in DynamoDB with TTL-based cleanup. The update_item call uses a ConditionExpression (attribute_not_exists(action_count) OR action_count < :max_actions) to prevent concurrent requests from bypassing the daily cap. Concurrent resize requests can still succeed while capacity remains under the configured cap, but the conditional update prevents them from collectively exceeding it. CloudWatch metrics expose guardrail rejections for operational visibility.
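
The conditional update can be sketched as follows. This is an illustration, not the code in guardrails.py: `table` is expected to be a boto3 DynamoDB Table resource, and the `pk` key schema and function name are assumptions. The ConditionExpression is the one quoted above.

```python
from typing import Any


def try_reserve_daily_slot(table: Any, action_name: str, day: str, daily_cap: int) -> bool:
    """Atomically increment today's counter unless the cap is already reached."""
    try:
        table.update_item(
            Key={"pk": f"{action_name}#{day}"},  # hypothetical key schema
            UpdateExpression="ADD action_count :one",
            ConditionExpression=(
                "attribute_not_exists(action_count) OR action_count < :max_actions"
            ),
            ExpressionAttributeValues={":one": 1, ":max_actions": daily_cap},
        )
        return True
    except Exception as exc:  # botocore ClientError in practice
        error = getattr(exc, "response", {}).get("Error", {})
        if error.get("Code") == "ConditionalCheckFailedException":
            return False  # Cap reached, possibly by a concurrent caller.
        raise
```

Because the check and the increment happen in a single DynamoDB write, two concurrent callers cannot both observe `cap - 1` and both succeed.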

Full implementation: automation/fsxn-ops/lambda/common/guardrails.py

5. Voice Chat WebRTC (Phase 2)

The Problem

Knowledge workers often want to ask questions hands-free — during meetings, while reviewing physical documents, or when multitasking.

The Solution

A Strategy pattern implementation supporting both REST-based (Phase 1) and WebRTC-based (Phase 2) voice interaction:

interface VoiceSessionStrategy {
  connect(): Promise<void>;
  disconnect(): Promise<void>;
  sendAudio(data: ArrayBuffer): Promise<void>;
  onTranscript(callback: (text: string) => void): void;
}

Phase 2 uses:

Amazon Kinesis Video Streams Signaling Channel for WebRTC negotiation
Pipecat Voice Agent on Bedrock AgentCore Runtime for speech-to-text-to-RAG-to-speech
Automatic fallback: If WebRTC connection fails, seamlessly falls back to REST-based voice

Phase 2 implements the client/server strategy and fallback behavior; full AgentCore Runtime deployment automation remains in What's Next.
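
The fallback behavior can be illustrated with a small Python sketch of the same Strategy pattern. The project's client is TypeScript; the classes below are stand-ins written for this example, and the `reachable` flag only simulates whether WebRTC signaling succeeds.

```python
import asyncio
from typing import Optional, Protocol, Sequence


class VoiceSessionStrategy(Protocol):
    name: str

    async def connect(self) -> None: ...


class WebRtcStrategy:
    name = "webrtc"

    def __init__(self, reachable: bool) -> None:
        # `reachable` simulates signaling; a real client would negotiate
        # through the KVS signaling channel here.
        self._reachable = reachable

    async def connect(self) -> None:
        if not self._reachable:
            raise ConnectionError("signaling channel unreachable")


class RestStrategy:
    name = "rest"

    async def connect(self) -> None:
        return None  # REST needs no persistent connection in this sketch.


async def connect_with_fallback(
    strategies: Sequence[VoiceSessionStrategy],
) -> VoiceSessionStrategy:
    """Try each strategy in order and return the first one that connects."""
    last_error: Optional[Exception] = None
    for strategy in strategies:
        try:
            await strategy.connect()
            return strategy
        except (ConnectionError, asyncio.TimeoutError) as exc:
            last_error = exc  # Remember why this strategy failed, then move on.
    raise RuntimeError("no voice strategy could connect") from last_error
```

Ordering the list as WebRTC first and REST second gives the behavior described above without the caller needing to know which transport ended up in use.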

The WebRTC path is implemented behind the existing voice strategy interface, but production deployments should add authentication, rate limiting, CORS tightening, sanitized logging, and input validation around the signaling and session launch APIs — as noted in the Pipecat AgentCore WebRTC KVS example.

Full implementation: docker/nextjs/src/lib/voice/

Testing Strategy

All features are backed by comprehensive tests:

Category	Framework	Tests
CDK Assertion	Jest + aws-cdk-lib/assertions	42
Python Lambda Unit	pytest + moto	85
Property-Based	Hypothesis (Python)	6
Property-Based	fast-check (TypeScript)	12
Voice WebRTC	Jest	61
Smart Routing	Jest + fast-check	64

The Hypothesis property-based tests verify invariants like:

Change detection correctly classifies new/changed/unchanged files for any input combination
Ingestion deduplication logic is correct for all (changes × job_status) combinations
Metadata JSON always conforms to the required schema regardless of input permissions
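
To make the second invariant concrete, the sketch below checks the deduplication function over every (changes × job_status) combination. It uses plain enumeration with itertools as a stand-in, since the input space is small; the repository's tests use Hypothesis. The status list and the `check_dedup_invariants` name are assumptions for this example.

```python
from itertools import product
from typing import Optional


# Copied from the deduplication section so the check is self-contained.
def should_trigger_ingestion(has_changes: bool, current_job_status: Optional[str]) -> bool:
    if not has_changes:
        return False
    if current_job_status == 'IN_PROGRESS':
        return False
    return True


# Assumed set of statuses to exercise; None means "no previous job".
STATUSES = [None, "STARTING", "IN_PROGRESS", "COMPLETE", "FAILED"]


def check_dedup_invariants() -> int:
    """Assert the invariants for every combination; return how many were checked."""
    checked = 0
    for has_changes, status in product([False, True], STATUSES):
        result = should_trigger_ingestion(has_changes, status)
        if not has_changes:
            assert result is False  # Never trigger without changes.
        elif status == "IN_PROGRESS":
            assert result is False  # Never start a second concurrent job.
        else:
            assert result is True   # Changes with no running job must trigger.
        checked += 1
    return checked
```

With Hypothesis, the same property would typically draw the inputs from `st.booleans()` and `st.sampled_from(STATUSES)` instead of enumerating them.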

Security & Portability

Before publishing, we ensured:

No hardcoded AWS account IDs in any public source file
Parameterized ECR repository name (ecrRepositoryName CDK prop)
Parameterized REGION in all shell scripts (${AWS_REGION:-ap-northeast-1})
Masked screenshots — AWS account IDs in console screenshots are covered
.gitignore coverage — cdk.context.json, cdk.out/, .env, .hypothesis/ all excluded

What's Next

AgentCore Runtime deployment for the Pipecat Voice Agent (currently requires CLI — CloudFormation support pending)
CloudTrail/EventBridge mode for Transfer Family ingestion (near-real-time event-based detection instead of 5-minute polling)
End-to-end SFTP upload test with actual SSH keys and partner simulation
Partner onboarding guide: Step-by-step SFTP setup for external partners — see docs/transfer-family-partner-onboarding.md
KB Auto-Sync pending/commit model: Inventory updates only on job SUCCEEDED ✅ Delivered in v4.3
Capacity Guardrails BREAK_GLASS mode: Emergency bypass with SNS audit trail ✅ Delivered in v4.3
Hybrid Search: Semantic + keyword search toggle in chat UI ✅ Delivered in v4.3
RAG Evaluation Pipeline: RAGAS-based quality metrics with CI integration ✅ Delivered in v4.3

End-to-End Architecture Flow

┌──────────────┐     ┌─────────────────┐     ┌──────────────────────────┐
│ External     │     │ Transfer Family │     │ FSx for ONTAP            │
│ Partner      │────▶│ SFTP Server     │────▶│ S3 Access Point          │
│ (SFTP)       │     └─────────────────┘     │ (data stays on FSxN)     │
└──────────────┘                             └─────────────┬────────────┘
                                                           │
                                            ┌──────────────▼──────────────┐
                                            │ Metadata Generator Lambda   │
                                            │ (admin-managed permissions) │
                                            └──────────────┬──────────────┘
                                                           │
                                            ┌──────────────▼──────────────┐
                                            │ KB Auto-Sync / Ingestion    │
                                            │ Trigger Lambda              │
                                            └──────────────┬──────────────┘
                                                           │
                                            ┌──────────────▼──────────────┐
                                            │ Amazon Bedrock              │
                                            │ Knowledge Base              │
                                            └──────────────┬──────────────┘
                                                           │
┌──────────────┐     ┌─────────────────┐     ┌─────────────▼────────────┐
│ End User     │────▶│ Smart Routing   │────▶│ Permission-Aware RAG     │
│ (Chat/Voice) │     │ (Haiku/Sonnet/  │     │ (fail-closed: missing    │
└──────────────┘     │  Opus)          │     │  metadata = excluded)    │
                     └─────────────────┘     └──────────────────────────┘

The RAG retrieval path is designed to fail closed: if permission metadata is missing, malformed, or unverifiable for a document, that document is excluded from retrieval results rather than exposed broadly. This is the core safety boundary of the permission-aware RAG design.
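
A fail-closed filter of this kind can be sketched in a few lines of Python. This is an illustration, not the project's retrieval code: the result shape, the `caller` structure, and the rule that all three allow lists must be present and well-formed are assumptions. The field names follow the metadata sample shown earlier.

```python
from typing import Any, Dict, Iterable, List, Set

# Metadata field -> kind of principal it lists.
FIELDS = {"allowed_sids": "sids", "allowed_uids": "uids", "allowed_gids": "gids"}


def is_permitted(metadata: Any, caller: Dict[str, Set[str]]) -> bool:
    """True only when metadata is well-formed and names one of the caller's principals."""
    if not isinstance(metadata, dict):
        return False  # Missing or malformed metadata: fail closed.
    matched = False
    for field_name, kind in FIELDS.items():
        values = metadata.get(field_name)
        if not isinstance(values, list) or not all(isinstance(v, str) for v in values):
            return False  # One malformed field invalidates the whole record.
        if set(values) & caller.get(kind, set()):
            matched = True
    return matched


def filter_results(results: Iterable[Dict[str, Any]], caller: Dict[str, Set[str]]) -> List[Dict[str, Any]]:
    """Drop every result the caller is not explicitly allowed to see."""
    return [r for r in results if is_permitted(r.get("metadata"), caller)]
```

SIDs, UIDs, and GIDs are compared per field rather than pooled, so a UID of "1001" never matches a GID of "1001" by accident.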

Known Limitations

v4.2 is production-oriented, but a few items remain follow-up work:

KB Auto-Sync currently updates inventory when StartIngestionJob is accepted rather than when the job reaches SUCCEEDED. Failed ingestion jobs may mask unprocessed changes until the pending/commit model is implemented.
Transfer Family ingestion is implemented and unit-tested; full partner-style E2E validation with SSH keys is still planned. The current auto-sync path focuses on detecting additions and updates — delete reconciliation is follow-up work.
AgentCore Runtime deployment automation is not yet CloudFormation-based; the Pipecat Voice Agent requires CLI/SDK deployment.
Voice sessions require production policies for authentication, rate limiting, transcript retention, and sanitized logging before production rollout.
Smart Routing emits routing metrics, but monthly cost dashboards, budget enforcement, and savings-vs-baseline reporting are follow-up work.
Fail-closed enforcement happens in the retrieval filtering layer: documents without valid, trusted permission metadata are excluded before the model receives context. Audit events for retrieval decisions (DocumentSuppressedByPermission) are candidates for the next release.

Manual high-cost or preview model selection (GPT-5.5) should be governed by application-level authorization and audited separately from automatic routing. The networking model — public Transfer Family endpoint vs VPC-hosted endpoint, partner IP allowlists, and private DNS requirements — should be selected per customer environment.

Demo Verification Highlights

Transfer Family: IAM Deny for Metadata Overwrite

A key security property of the ingestion pipeline is that SFTP partners cannot overwrite permission metadata. The partner's IAM role includes an explicit Deny for *.metadata.json:

# Partner uploads a normal document — succeeds
sftp> put contract.pdf /uploads/partner-a/contract.pdf
Uploading contract.pdf to /uploads/partner-a/contract.pdf
contract.pdf                                         100%  1234KB  123.4KB/s   00:10

# Partner attempts to overwrite metadata — denied
sftp> put fake-metadata.json /uploads/partner-a/contract.pdf.metadata.json
remote open("/uploads/partner-a/contract.pdf.metadata.json"): Permission denied

This ensures the trust boundary: only the Metadata Generator Lambda (running with a service role) can write .metadata.json files.

Transfer Family: E2E Ingestion Flow (Verified 2026-05-13)

The full SFTP-to-KB pipeline was verified end-to-end:

Step	Result	Details
SSH key generation	✅	RSA 4096-bit
Transfer Family user key registration	✅	`import-ssh-public-key` API
SFTP connection	✅	Public key authentication
File listing (`ls`)	✅	2 files displayed
File upload (`put`)	✅	`sftp-uploaded.txt`
Ingestion Trigger Lambda	✅	1 file change detected
KB StartIngestionJob	✅	Job ID `JIGLRZMPEU`
Ingestion complete	✅	`COMPLETE`, 1 document newly indexed

Full verification report: docs/transfer-family-e2e-verification.md

Smart Routing: 3-Tier Cost Comparison

Tested with the same analytical query across all three tiers:

Tier	Response Time	Input Tokens	Output Tokens	Estimated Cost
Haiku 4.5	1.2s	1,024	256	~$0.001
Sonnet 3.5 v2	3.8s	1,024	512	~$0.008
Opus 4	8.5s	1,024	1,024	~$0.075

The classifier routes simple queries to Haiku (90%+ of typical enterprise queries), reserving Opus for multi-document analysis.

KB Auto-Sync: Detection and Ingestion Flow

Expected Lambda log output when a new file is detected:

{
  "level": "INFO",
  "message": "Scan completed",
  "scanId": "scan-20260523-103000",
  "filesScanned": 15,
  "newFiles": 1,
  "changedFiles": 0,
  "unchangedFiles": 14,
  "triggerIngestion": true
}

Capacity Guardrails: Rate Limit Enforcement

When the same action exceeds the configured rate limit (default: 3 per hour):

{
  "level": "WARN",
  "message": "Guardrail BLOCKED",
  "action": "volume_resize",
  "reason": "Rate limit exceeded: max 3 per hour for volume_resize",
  "action_count": 3,
  "window_start": "2026-05-23T10:00:00Z",
  "cooldown_remaining_seconds": 180
}

Automated verification script: Run bash demo-data/scripts/v4.2-verification-test.sh after deployment to collect masked logs for all use cases. See docs/v4.2-demo-verification-supplement.md for detailed test procedures.

Who Should Care About v4.2?

AI platform teams get model routing that balances quality and cost without manual intervention.
Security teams get administrator-derived permission metadata and explicit IAM protection against metadata overwrite.
Data teams get automatic KB synchronization from FSx for ONTAP through S3 Access Points.
Partners and SIs get an SFTP-to-RAG ingestion path for customers who exchange documents with external organizations.
Operations teams get guardrails for FSx ONTAP automation actions with conditional write protection.
Application teams get a WebRTC voice strategy with REST fallback.

Conclusion

v4.2 moves the permission-aware RAG system from a secure document Q&A application toward an enterprise ingestion and interaction platform.

Smart Routing reduces model cost without removing access to stronger models. Transfer Family ingestion lets partners keep using SFTP while documents land directly on FSx for ONTAP through S3 Access Points. KB Auto-Sync keeps Bedrock Knowledge Bases fresh, Capacity Guardrails make ONTAP automation safer, and WebRTC Voice Chat opens a lower-friction interaction path.

The common theme is the same as the FSx for ONTAP S3 Access Points pattern series: keep enterprise file data on FSx for ONTAP, expose it safely through S3-compatible access paths, and automate around it with serverless and managed AWS services.

Resources

GitHub: FSx-for-ONTAP-Agentic-Access-Aware-RAG
Release: v4.2.0
Related series: FSx for ONTAP S3 Access Points Serverless Patterns

AWS Official Documentation

Feature	Documentation
Smart Routing	Bedrock model invocation, Cross-Region Inference, CloudWatch EMF
Transfer Family	Transfer Family + FSx S3 AP, Security policies, HomeDirectoryMappings
KB Auto-Sync	StartIngestionJob API, EventBridge Scheduler, KB data source sync
Capacity Guardrails	DynamoDB Conditional Writes, DynamoDB TTL
Voice Chat	KVS WebRTC, Bedrock AgentCore
FSx for ONTAP	S3 Access Points for FSx ONTAP, FSx ONTAP User Guide

AWS Blog Posts

Secure SFTP file sharing with AWS Transfer Family, Amazon FSx for NetApp ONTAP, and S3 Access Points

Update: Enterprise Readiness Guides Added

Based on feedback from several AWS solutions architecture perspectives (storage, partner/SaaS, public sector/healthcare, and generative AI/business value), I added a set of enterprise readiness documents covering:

Production readiness checklist — Demo/PoC/Production maturity levels with security, audit, DR, and operations checklists
Permission consistency and cache invalidation model — ACL change propagation flow, max delay (RPO-style), emergency revocation procedure
FSx for ONTAP sizing and performance guide — Scale-based configurations (10K/100K/1M files), S3 AP considerations, QoS design
Partner / SaaS deployment patterns — Multi-tenant isolation (account/SVM/hybrid), cost estimation templates
Governance and audit log design — Audit log schema (JSON), Responsible AI, Guardrails policy examples, industry use cases (healthcare, government, finance, education)
RAG and Agent evaluation metrics — 4-axis evaluation framework with PoC report template
Safe experimentation guide — Safe scope definition, prohibited actions, rollback procedures

I also added a permission test suite (31 scenarios) covering ACL filtering, group nesting, inherited permissions, fail-closed behavior, permission propagation, and edge cases — all passing.

All documentation is available in 8 languages (Japanese, English, Korean, Simplified Chinese, Traditional Chinese, French, German, Spanish).

Update: v4.3 Delivery Status (2026-05-28)

The following items from the original "What's Next" list have been delivered in v4.3:

Feature	Status	Details
KB Auto-Sync pending/commit model	✅ Delivered	Inventory updates keyed on job `SUCCEEDED`. Failed jobs no longer mask unprocessed changes
Capacity Guardrails BREAK_GLASS mode	✅ Delivered	Emergency bypass with SNS audit trail + structured audit log. 3-mode evaluation (ENFORCE / DRY_RUN / BREAK_GLASS)
Hybrid Search	✅ Delivered	`kbSearchType=HYBRID` CDK context parameter. Semantic + keyword search toggle in chat UI
RAG Evaluation Pipeline	✅ Delivered	RAGAS-based quality metrics in `tests/rag-evaluation/`. 4-axis evaluation framework (business KPI, RAG quality, permission control, agent performance)

Additionally, documentation quality improvements were applied following a review through an AWS Solutions Architect lens:

Architecture Decision Records: Fixed Smart Routing cost comparison logic (weighted average ~$0.005/query with 60% Haiku routing, ~50% cost reduction vs all-Sonnet)
Transfer Family verification: Added production HostKey verification guidance
Partner Central MCP integration: Applied least-privilege IAM policy (7 specific actions instead of wildcard)
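
The weighted-average figure in the first item can be reproduced approximately with the illustrative per-query costs from the Smart Routing section. In the sketch below only the 60% Haiku share comes from this post; assigning the remaining 40% to Sonnet and 0% to Opus is an assumption chosen for the example.

```python
from typing import Dict


def weighted_cost(mix: Dict[str, float], cost_per_query: Dict[str, float]) -> float:
    """Blend per-tier costs by the share of queries routed to each tier."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("routing shares must sum to 1")
    return sum(share * cost_per_query[tier] for tier, share in mix.items())


# Illustrative per-query costs from the tier table (USD).
COSTS = {"haiku": 0.001, "sonnet": 0.01, "opus": 0.10}
# Assumed split: 60% Haiku is from the post, the rest is hypothetical.
MIX = {"haiku": 0.60, "sonnet": 0.40, "opus": 0.00}

blended = weighted_cost(MIX, COSTS)
reduction = 1 - blended / COSTS["sonnet"]
print(f"blended=${blended:.4f}/query, reduction vs all-Sonnet={reduction:.0%}")
```

Under these assumptions the blended cost is about $0.0046 per query, roughly 54% below an all-Sonnet baseline. Even a small Opus share shifts the result noticeably, since its illustrative cost is ten times Sonnet's.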

Full changelog: v4.3.0 Release

Update: Industry Demo Data Packs

I also added industry-specific demo data packs covering 7 sectors (35 documents with permission metadata):

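Sector	Documents	Permission Groups	S3AP use case (UC)
Government	5	政策企画課 (Policy Planning Division), 財政課 (Finance Division), 危機管理課 (Crisis Management Division)	UC16
Healthcare	5	内科 (Internal Medicine), 看護部 (Nursing Department), 薬剤部 (Pharmacy Department)	UC5
Legal	5	法務部 (Legal Department)	UC1
Manufacturing	5	品質管理部 (Quality Control Department), 生産管理部 (Production Control Department)	UC3
Construction	5	設計部 (Design Department), 工事管理部 (Construction Management Department)	UC10
Education	5	研究室 (Research Lab), 教務課 (Academic Affairs Division)	UC13
Insurance	5	損害査定部 (Claims Assessment Department), 不正対策室 (Fraud Prevention Office)	UC14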

Each pack demonstrates department-level access control with realistic Japanese business documents. The packs integrate with the FSx for ONTAP S3AP Serverless Patterns repository: processing results from its 17 use cases (UCs) can be used as RAG search sources in this project.