Draw the boundary first. Then write Terraform.
A federal customer was ready to procure. The architecture was not.
This is a redacted write-up of a real engagement: a FedRAMP Moderate authorization boundary built on AWS GovCloud for an AI SaaS vendor selling into federal buyers, against a customer-driven timeline tied to a fiscal year.
The context
The client ran production on commercial AWS with a strong engineering culture, modern Terraform practice, and zero prior federal experience. A federal customer had committed to procurement contingent on a FedRAMP Moderate path, with an aggressive deadline.
The internal team understood the application deeply and had read enough FedRAMP material to know they were in trouble. The architecture decisions that worked beautifully for commercial customers each failed boundary review:
- shared accounts with the commercial environment
- a hosted vector store outside the cloud
- OpenAI behind the application
- an observability stack running outside the cloud account
The remediation list grew faster than the team could keep up with, and the timeline did not move. They reached out for boundary architecture help. Not policy writing, not 3PAO selection. Engineering work to redesign the cloud footprint so the boundary could be drawn cleanly and the assessment could proceed.
The approach: boundary before Terraform
The most common FedRAMP failure pattern is to start with the existing environment and try to bend it to fit the boundary. We do not do that. The first deliverable was a boundary diagram drawn from scratch, before any IaC was touched, identifying every service inside, outside, and at the edge of the authorization boundary.
Every other architectural decision derived from the boundary. The Terraform module library was written against the boundary, not the existing accounts. Identity federation, network architecture, KMS topology, and logging all followed.
┌─────────────────────────── FedRAMP Moderate Boundary (AWS GovCloud) ───────────────────────────┐
│                                                                                                │
│  ┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐  │
│  │  Prod Account    │    │ Staging Account  │    │  Logging Acct    │    │ Shared Services  │  │
│  │                  │    │                  │    │  (write-only)    │    │  (IdP / KMS)     │  │
│  │ - EKS cluster    │    │ - EKS cluster    │    │                  │    │                  │  │
│  │ - RDS (PHI/CUI)  │    │ - RDS            │    │ - CloudTrail     │    │ - IAM IdC        │  │
│  │ - Bedrock (BAA)  │    │ - Bedrock        │    │ - VPC Flow Logs  │    │ - KMS keys       │  │
│  │ - Vector store   │    │ - Vector store   │    │ - EKS audit      │    │ - Config rules   │  │
│  │                  │    │                  │    │ - Retention lock │    │                  │  │
│  └────────┬─────────┘    └────────┬─────────┘    └────────▲─────────┘    └────────▲─────────┘  │
│           │                       │                       │                       │            │
│           └───────────────────────┴───────────────────────┴───────────────────────┘            │
│                               All cross-account via PrivateLink                                │
│                               Federation via IAM Identity Center                               │
│                                                                                                │
└─────────────────────────────────────────────────────────────────────────────────────────────────┘
┃                                      Inheritance Boundary                                      ┃
┃                                                                                                ┃
FedRAMP services inherited (AWS GovCloud P-ATO):
CloudTrail · KMS · S3 · IAM · VPC · EKS · RDS · Bedrock (BAA)
Out of boundary (excluded):
Corporate IdP source · Marketing site · Customer support tooling
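One way to keep a boundary diagram like this reviewable is to hold it as data next to the Terraform. The sketch below is a minimal illustration, not the tooling used on the engagement: `Placement`, `Component`, `validate`, and `placement_of` are hypothetical names, the account labels are shorthand for the four accounts in the diagram, and the entries are copied from the diagram itself.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Placement(Enum):
    INSIDE = "inside"
    EDGE = "edge"        # straddles the boundary; not used in this sample
    OUTSIDE = "outside"


# Shorthand labels for the four GovCloud accounts in the diagram.
BOUNDARY_ACCOUNTS = {"prod", "staging", "logging", "shared-services"}


@dataclass(frozen=True)
class Component:
    name: str
    placement: Placement
    account: Optional[str] = None  # hosting account, for in-boundary components


INVENTORY = [
    Component("EKS cluster", Placement.INSIDE, "prod"),
    Component("RDS", Placement.INSIDE, "prod"),
    Component("Vector store", Placement.INSIDE, "prod"),
    Component("CloudTrail", Placement.INSIDE, "logging"),
    Component("IAM Identity Center", Placement.INSIDE, "shared-services"),
    Component("Corporate IdP source", Placement.OUTSIDE),
    Component("Marketing site", Placement.OUTSIDE),
    Component("Customer support tooling", Placement.OUTSIDE),
]


def validate(inventory: List[Component]) -> List[str]:
    """Return a list of problems; an empty list means the inventory is consistent."""
    errors = []
    for c in inventory:
        if c.placement is Placement.INSIDE and c.account not in BOUNDARY_ACCOUNTS:
            errors.append(f"{c.name}: inside the boundary but not in a boundary account")
        if c.placement is Placement.OUTSIDE and c.account is not None:
            errors.append(f"{c.name}: outside the boundary but assigned to an account")
    return errors


def placement_of(name: str, inventory: List[Component] = INVENTORY) -> Placement:
    """Look up the documented answer; an unknown component is an error, not a guess."""
    for c in inventory:
        if c.name == name:
            return c.placement
    raise KeyError(f"{name} has no documented placement")
```

A component that is missing from the inventory raises instead of defaulting to either side, which mirrors the rule that scope questions are settled in the diagram and not mid-engagement.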
Key decisions
- Account topology. Four-account model in AWS GovCloud: production, staging, logging, and shared services. Cross-account access exclusively via PrivateLink with mutual authentication. No shared accounts with the commercial environment.
- Identity. IAM Identity Center federating from the existing corporate IdP (source outside the boundary). Time-bounded role assumption, MFA enforced via a FIPS-validated authenticator, no long-lived access keys anywhere in the boundary.
- Model endpoints. Inference served exclusively via Amazon Bedrock under the GovCloud BAA path. No commercial model endpoints inside the boundary. RAG vector store inside the boundary on a managed pgvector deployment.
- Logging. Centralized logging account with S3 Object Lock retention. Engineers in production and staging have zero write access. SIEM integration is read-only across PrivateLink.
- KMS. Customer-managed keys with automatic annual rotation. Key usage logged to the centralized logging account. Separate keys for data classifications; key policy denies cross-classification use.
- Pipeline. Self-hosted GitHub Actions runners inside the boundary. No commercial hosted runners touching boundary state. Signed container images, OPA admission control on the EKS clusters.
- Documentation. Every Terraform module shipped with a control narrative file mapping the module's resources to the NIST 800-53 controls it satisfies. The SSP is generated from those narratives, not written separately.
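To make the KMS decision concrete, here is a rough sketch of a key policy that denies cross-classification use. It is an illustration under assumptions, not the policy that shipped: the `DataClassification` principal tag, the function names, and the account ID in the usage note are hypothetical. The `aws:PrincipalTag/...` condition key and the `aws-us-gov` ARN partition are standard AWS.

```python
import json

# Data-plane actions only, so key administration is not caught by the deny.
DATA_PLANE_ACTIONS = [
    "kms:Encrypt",
    "kms:Decrypt",
    "kms:ReEncrypt*",
    "kms:GenerateDataKey*",
]


def cross_classification_deny(classification: str) -> dict:
    """Deny data-plane use by principals not tagged for this classification."""
    return {
        "Sid": "DenyCrossClassificationUse",
        "Effect": "Deny",
        "Principal": {"AWS": "*"},
        "Action": DATA_PLANE_ACTIONS,
        "Resource": "*",
        "Condition": {
            "StringNotEquals": {
                # Hypothetical tag key; the real tagging scheme is not in the write-up.
                "aws:PrincipalTag/DataClassification": classification
            }
        },
    }


def key_policy(account_id: str, classification: str,
               partition: str = "aws-us-gov") -> dict:
    """Key policy: delegate to IAM in the owning account, then apply the deny."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "EnableIamPolicies",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:{partition}:iam::{account_id}:root"},
                "Action": "kms:*",
                "Resource": "*",
            },
            cross_classification_deny(classification),
        ],
    }
```

Calling `json.dumps(key_policy("111111111111", "cui"), indent=2)` renders the document for review. Because `StringNotEquals` also matches when the tag is absent, untagged principals are denied too, so a production policy would need to account for service integrations that use the key.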
What we built
The engagement delivered the full set of artifacts a 3PAO expects during readiness review, plus the engineering primitives the client team needed to keep operating inside the boundary after we left.
Terraform module library. Account scaffolding, IAM federation, KMS topology, EKS cluster baseline, RDS with regulated defaults, Bedrock endpoint configuration, and the centralized logging account. Every module ships a control narrative file. Policy gates (OPA + Sentinel) reject non-regulated services, non-FIPS endpoints, and unsigned container images at plan time.
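The engagement's gates were written in OPA and Sentinel; the Python below is a stand-in that shows the same idea against the JSON that `terraform show -json` produces for a saved plan. The allowlist prefixes and the function name are assumptions made for illustration, not the real policy set.

```python
from __future__ import annotations

# Hypothetical allowlist of Terraform resource-type prefixes, loosely matching
# the inherited services listed in the boundary diagram.
ALLOWED_TYPE_PREFIXES = (
    "aws_eks_",
    "aws_db_",
    "aws_rds_",
    "aws_kms_",
    "aws_s3_",
    "aws_iam_",
    "aws_cloudtrail",
    "aws_vpc",
    "aws_flow_log",
    "aws_bedrock",
)


def plan_violations(plan: dict,
                    allowed: tuple[str, ...] = ALLOWED_TYPE_PREFIXES) -> list[str]:
    """Return one message per created or updated resource outside the allowlist."""
    violations = []
    for rc in plan.get("resource_changes", []):
        actions = set(rc.get("change", {}).get("actions", []))
        # Only resources being created or changed matter; skip no-op, read, delete.
        if not actions & {"create", "update"}:
            continue
        if not rc["type"].startswith(allowed):
            violations.append(
                f"{rc['address']}: resource type {rc['type']} is not on the boundary allowlist"
            )
    return violations
```

In a pipeline, a non-empty result would fail the job before `terraform apply` runs, so the rejection happens at plan time and not after a resource exists.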
Identity and access. IAM Identity Center as the only path into the boundary. Role catalog with time-bounded assumption windows. Break-glass procedure documented and rehearsed. Access reviews automated quarterly, output flowing to the logging account.
Logging architecture. Logging account isolated to write-only from production and staging. S3 Object Lock retention configured for 7 years. SIEM (Splunk) integration via PrivateLink, read-only. CloudTrail organization trail, VPC Flow Logs, EKS audit logs, and KMS key usage all flowing in.
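As a sketch of what the retention and write-only properties look like as configuration, the helpers below build an S3 Object Lock configuration and a bucket policy as plain dictionaries. The function names, bucket name, and account IDs are hypothetical, and `COMPLIANCE` mode is an assumption, since the write-up gives only the 7-year period. The Object Lock dictionary has the shape that boto3's `put_object_lock_configuration` accepts as `ObjectLockConfiguration`.

```python
from typing import Dict, List


def object_lock_configuration(years: int = 7, mode: str = "COMPLIANCE") -> Dict:
    """Default retention applied to every new object version in the bucket."""
    if mode not in ("COMPLIANCE", "GOVERNANCE"):
        raise ValueError(f"unknown Object Lock mode: {mode}")
    if years < 1:
        raise ValueError("retention must be at least one year")
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": mode, "Years": years}},
    }


def write_only_bucket_policy(bucket: str, writer_account_ids: List[str],
                             partition: str = "aws-us-gov") -> Dict:
    """Writers may put objects; they may not read them back or delete them."""
    principals = [f"arn:{partition}:iam::{a}:root" for a in writer_account_ids]
    objects = f"arn:{partition}:s3:::{bucket}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowWritersToPut",
                "Effect": "Allow",
                "Principal": {"AWS": principals},
                "Action": "s3:PutObject",
                "Resource": objects,
            },
            {
                "Sid": "DenyWritersReadAndDelete",
                "Effect": "Deny",
                "Principal": {"AWS": principals},
                "Action": ["s3:GetObject", "s3:DeleteObject", "s3:DeleteObjectVersion"],
                "Resource": objects,
            },
        ],
    }
```

The two pieces do different jobs: the bucket policy limits what production and staging principals can do, while Object Lock keeps existing log versions from being removed during the retention period regardless of who asks.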
Pipeline. GitHub Actions self-hosted runners inside the boundary on isolated EKS node groups. Container images built and signed inside the boundary. Bedrock-backed inference deployed via GitOps (Argo CD) with admission control rejecting unsigned images and non-baseline configurations.
Control documentation. SSP-ready control narratives generated from the Terraform module library. Continuous monitoring plan documented. Incident response runbooks rehearsed with the client team. A 3PAO readiness pack delivered with the boundary diagram, data flow diagrams, and inherited-service mapping.
"Lucas handled a FedRAMP compliance project for us and it was a huge win. He architected the infrastructure to align with FedRAMP Moderate requirements, documented everything thoroughly for auditors, and didn't cut corners. Communication was excellent throughout and he proactively flagged issues before they became problems."
Ryan S., CTO @ AI SaaS
Results
| Outcome | Detail |
|---|---|
| Passed | 3PAO readiness review on first-party assessment |
| On track | authorization timeline matched the customer deadline |
| 0 | significant changes required after boundary lock |
The client passed first-party 3PAO readiness review on schedule. The authorization boundary in the SSP matches the architecture diagrams; the architecture diagrams match the Terraform; the Terraform produces the resources the auditor will examine. There is no gap between document and reality, and that gap is the single most common cause of 3PAO findings.
What made it work
Drawing the boundary before writing Terraform. Every subsequent decision derived from the boundary diagram. When a question arose ("should this service be inside?"), the answer was already documented. There was no negotiation mid-engagement about whether a service was in scope.
Control narratives shipped alongside modules. Writing the control narrative when the Terraform module is written is the highest-leverage discipline in FedRAMP engineering. The narrative becomes a property of the module; the SSP is generated; the auditor's questions are answered by the same source of truth that produced the resources.
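A minimal sketch of the generation step, assuming each module's narrative file has already been parsed into a mapping from control ID to narrative text. The write-up does not describe the file format, so the input shape, the function names, and the module-to-control assignments in the usage note are illustrative only.

```python
from __future__ import annotations

import re
from collections import defaultdict


def control_sort_key(control_id: str) -> tuple[str, int, str]:
    """Order by family, then numerically, so AU-9 sorts before AU-11."""
    match = re.match(r"([A-Z]{2})-(\d+)", control_id)
    if match is None:
        raise ValueError(f"not a NIST 800-53 control ID: {control_id}")
    return (match.group(1), int(match.group(2)), control_id)


def build_ssp_sections(narratives: dict[str, dict[str, str]]) -> str:
    """Merge per-module narratives into one control-ordered text document.

    `narratives` maps module name -> {control ID -> narrative text}.
    """
    by_control: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for module in sorted(narratives):
        for control_id, text in narratives[module].items():
            by_control[control_id].append((module, text))

    lines = []
    for control_id in sorted(by_control, key=control_sort_key):
        lines.append(control_id)
        for module, text in by_control[control_id]:
            lines.append(f"  [{module}] {text}")
    return "\n".join(lines)
```

For example, passing `{"logging": {"AU-11": "Logs retained under S3 Object Lock.", "AU-9": "Writers cannot read or delete logs."}, "kms": {"SC-12": "Customer-managed keys with rotation."}}` yields sections in the order AU-9, AU-11, SC-12, each tagged with the module that supplied the text. Because the narrative travels with the module, changing the module and regenerating is the only way the SSP text changes.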
Policy gates rejecting non-regulated patterns. OPA and Sentinel gates at plan time prevented the most common drift pattern: an engineer reaching for a familiar commercial-region service that is not FedRAMP-authorized. The gate is boundary discipline at the speed of terraform plan.
Originally published at stonebridgetechsolutions.com.
Stonebridge Tech Solutions builds compliance-grade cloud infrastructure for healthcare and defense teams. If you want a rough read on your own control count and first-cycle audit cost, the scope estimator takes about two minutes.