Operating in the AWS GovCloud regions often feels very similar to working in AWS commercial regions. Over the years, the gap has narrowed significantly, and feature parity continues to improve. From a day-to-day operational perspective, most workloads behave exactly as you would expect in either environment.
That similarity can be misleading. Simply running in GovCloud is rarely enough to deliver real security benefits, achieve compliance certifications, or confidently include federal customers as part of your ideal customer profile (ICP). The real complexity begins when your environment must meet federal compliance requirements. Achieving a FedRAMP High authorization is not a paperwork exercise. It is an architectural challenge that directly influences how you design, operate, and secure your platform.
Among the many controls and requirements involved, audit logging consistently stands out as one of the most difficult areas to get right. Collecting, protecting, and retaining security-relevant logs across dozens or hundreds of AWS services, spread over a multi-account environment, quickly becomes a non-trivial problem. You need consistent log formats, strong access controls, long-term retention, and clear ownership, all while staying aligned with NIST 800-53 expectations.
This is where AWS Security Lake becomes a real game-changer for GovCloud environments. Earlier this year, Security Lake achieved FedRAMP High authorization, making it a viable and compliant foundation for centralized security logging. It is a purpose-built data lake that aggregates security data into a dedicated account using the Open Cybersecurity Schema Framework (OCSF), while addressing durability, availability, and retention requirements out of the box.
This post presents a production-ready Security Lake implementation designed specifically for FedRAMP High workloads in AWS GovCloud, with a focus on practical architecture decisions, operational realities, and lessons learned from real-world environments.
Why Security Lake for FedRAMP High?
Meeting the NIST 800-53 controls required for FedRAMP High often pushes teams toward building custom log ingestion pipelines for every service. CloudTrail, VPC Flow Logs, Route 53, ELB, EKS, OS logs, and more all tend to end up with their own dedicated flows, schemas, and retention logic. Over time, this becomes fragile (even when done the “correct” way with IaC), expensive to operate, and difficult to audit.
AWS Security Lake simplifies this problem by mapping directly to several of the most critical audit and logging control families.
- AU-2 Audit Events — Security Lake provides centralized and automated collection of security relevant events across AWS services and accounts. Instead of managing per-service pipelines, logs are automatically collected, normalized, and delivered into a single security owned account, making coverage easier to reason about and defend during assessments.
- AU-3 Content of Audit Records — By using the Open Cybersecurity Schema Framework, Security Lake enforces a consistent and complete structure for audit data. Core fields such as source IPs, timestamps, identities, and resource context are standardized, which removes ambiguity and reduces the need for custom parsing logic during investigations or audits.
- AU-4 Audit Storage Capacity — Security Lake automatically scales to petabytes using S3, eliminating the need to perform capacity planning, which can be difficult in a fast-growing environment.
- AU-6 Audit Review and Reporting — Security Lake decouples storage from analysis. Data is stored once and can be queried at scale using Amazon Athena or integrated with downstream security tooling. This allows security teams to run ad hoc queries across large datasets without impacting ingestion pipelines or production workloads.
- AU-9 and AU-11 Protection and Retention — Encryption, access control, and lifecycle management are handled natively. Security Lake integrates with KMS for key management and S3 lifecycle policies for retention, making it easier to enforce long-term storage requirements while maintaining least privilege access and data durability expectations.
- SC-7 Boundary Protection — VPC endpoints ensure all ingestion traffic stays within the AWS private network.
The net result is a logging architecture that is easier to operate, easier to secure, and significantly easier to explain to auditors. Instead of stitching together dozens of pipelines, Security Lake gives you a compliant foundation that aligns naturally with how FedRAMP High environments are expected to be built.
The Architecture: Multi-Region & Multi-Source
In AWS GovCloud, high availability is not open for debate; it is a hard requirement. Security controls, logging, and audit data are all expected to survive regional failures. For that reason, Security Lake should be deployed in both GovCloud regions, us-gov-east-1 and us-gov-west-1, as part of a single logical logging architecture.
Each region ingests and stores its own data locally, which aligns with FedRAMP expectations around availability and fault isolation while avoiding unnecessary cross region dependencies.
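As a rough illustration, the sketch below enables the data lake in both GovCloud regions with a single call from the delegated administrator account, using the boto3 securitylake client. The account ID, role name, and key aliases are placeholders, and the exact parameter shapes should be verified against the current boto3 documentation.

# Hypothetical sketch: enabling Security Lake in both GovCloud regions from
# the delegated administrator account. ARNs and aliases are placeholders.
import boto3

securitylake = boto3.client("securitylake", region_name="us-gov-east-1")

regions = ["us-gov-east-1", "us-gov-west-1"]

securitylake.create_data_lake(
    metaStoreManagerRoleArn="arn:aws-us-gov:iam::111111111111:role/SecurityLakeMetaStoreManager",
    configurations=[
        {
            "region": region,
            # Customer managed key per region (see the KMS strategy later in this post)
            "encryptionConfiguration": {"kmsKeyId": f"alias/security-lake-{region}"},
            # Lifecycle mirrors the Terraform example shown later
            "lifecycleConfiguration": {
                "expiration": {"days": 365},
                "transitions": [{"days": 90, "storageClass": "STANDARD_IA"}],
            },
        }
        for region in regions
    ],
)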
Data Sources
Security Lake handles ingestion for many native AWS sources out of the box, which significantly reduces operational overhead and cost. With several AWS services, exporting vended logs to a centralized location or external logging platform requires enabling higher CloudWatch logging tiers, which can quickly increase complexity and spend.
This limitation exists in commercial regions as well, but it becomes far more impactful in regulated environments. In GovCloud and FedRAMP High setups, centralized logging is mandatory, long-term retention is required, and cost predictability is critical. Security Lake avoids many of these tradeoffs by ingesting supported service logs directly, without relying on expensive CloudWatch export paths or custom pipelines.
The result is a simpler architecture that is easier to operate, easier to secure, and easier to justify during compliance reviews, while keeping logging costs under control.
Common native sources include:
- VPC Flow Logs — Network traffic analysis and threat detection
- Route 53 Resolver Query Logs — DNS query logging for command & control detection
- CloudTrail — Management events and S3 data events across all accounts
- Security Hub Findings — Aggregated security findings from GuardDuty, Config, IAM Access Analyzer
- Lambda Execution Logs — Function invocation and error tracking
- AWS WAF Logs — Web application firewall events
- EKS Audit Logs — Kubernetes control plane activity
For custom or non-AWS sources such as Windows Event Logs or Linux syslog, a lightweight transformation layer is introduced. Raw logs are normalized and converted into OCSF-compliant Parquet files before being delivered into Security Lake. This ensures the data remains consistent with native sources and avoids the need for a parallel logging system.
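As a minimal sketch of that transformation layer, the snippet below batches OCSF-mapped events into a Parquet file and uploads it under a partitioned prefix. The bucket name, prefix layout, and partition values are illustrative assumptions; the real location comes from the custom source registered with Security Lake.

# Minimal sketch: writing normalized OCSF events as Parquet into the
# custom-source location registered with Security Lake. Bucket, prefix,
# and partition values below are illustrative assumptions.
import datetime
import boto3
import pyarrow as pa
import pyarrow.parquet as pq

s3 = boto3.client("s3")

def write_ocsf_batch(events: list[dict], bucket: str, source_name: str) -> None:
    """Convert a batch of OCSF-mapped events to Parquet and upload to S3."""
    table = pa.Table.from_pylist(events)

    local_path = "/tmp/batch.parquet"  # Lambda-writable scratch space
    pq.write_table(table, local_path, compression="snappy")

    event_day = datetime.datetime.utcnow().strftime("%Y%m%d")
    key = (
        f"ext/{source_name}/region=us-gov-east-1/"
        f"accountId=111111111111/eventDay={event_day}/batch.parquet"
    )
    s3.upload_file(local_path, bucket, key)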
The Ingestion Pipeline
In practice, three ingestion patterns are typically used depending on the nature of the data and the latency requirements.
Path 1 — Automatic Ingestion
The common native sources are ingested automatically by Security Lake. All that is required is to enable them, and ingestion happens automagically: no heavy lifting, no custom infrastructure, just a configuration change and logs start flowing.
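For teams that prefer automation over the console, the hedged sketch below shows what enabling a handful of native sources across both regions might look like with the boto3 securitylake client. The source names and version are assumptions to verify against the current API.

# Hypothetical sketch: turning on native source ingestion from the delegated
# administrator account. Source names and versions are assumptions.
import boto3

securitylake = boto3.client("securitylake", region_name="us-gov-east-1")

securitylake.create_aws_log_source(
    sources=[
        {
            "regions": ["us-gov-east-1", "us-gov-west-1"],
            "sourceName": source,
            "sourceVersion": "2.0",
        }
        for source in ["CLOUD_TRAIL_MGMT", "VPC_FLOW", "ROUTE53", "SH_FINDINGS"]
    ]
)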
Path 2 — S3-based ingestion
This path is best suited for batch-oriented logs such as ALB access logs or service-generated log files that are stored in S3. Data flows from S3 into SQS, is processed by a Lambda transformation function, and is then written into Security Lake. This model is simple, resilient, and easy to reason about during audits.
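A minimal sketch of this handler, assuming SQS delivers standard S3 event notifications, might look like the following. transform_to_ocsf is a hypothetical normalization helper, and write_ocsf_batch is the Parquet writer sketched earlier.

# Minimal sketch of the Path 2 handler: SQS delivers S3 event notifications,
# the Lambda fetches each object, normalizes it, and writes OCSF Parquet.
# transform_to_ocsf() is a hypothetical helper; write_ocsf_batch() is the
# sketch shown in the Data Sources section.
import json
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    for sqs_record in event["Records"]:
        s3_event = json.loads(sqs_record["body"])
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            key = s3_record["s3"]["object"]["key"]

            raw = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
            ocsf_events = transform_to_ocsf(raw)  # hypothetical normalization step
            write_ocsf_batch(ocsf_events, "security-lake-custom-bucket", "alb_access")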
Path 3 — Streaming ingestion
This path is designed for near-real-time logs, such as host-level telemetry or security agent output. Logs are streamed through a Kinesis Data Stream, transformed by Lambda, and delivered directly into Security Lake. This approach provides lower latency while still maintaining schema consistency and backpressure handling.
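A corresponding sketch for the streaming path, assuming JSON events on the stream, is shown below; it reuses the same hypothetical helpers.

# Minimal sketch of the Path 3 handler: Kinesis delivers base64-encoded
# records, which are decoded, normalized, and written as OCSF Parquet.
import base64
import json

def handler(event, context):
    ocsf_events = []
    for record in event["Records"]:
        payload = base64.b64decode(record["kinesis"]["data"])
        raw_event = json.loads(payload)
        ocsf_events.append(transform_to_ocsf(raw_event))  # hypothetical helper

    if ocsf_events:
        write_ocsf_batch(ocsf_events, "security-lake-custom-bucket", "linux_auth")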
Together, these three ingestion paths allow you to onboard a wide range of log sources without over-engineering the pipeline. More importantly, they give you a consistent and auditable architecture that scales cleanly across regions and accounts.
FedRAMP High Data Lifecycle Management
Retention is where many GovCloud environments quietly lose control of cost, and no one likes that end-of-month bill shock. Under FedRAMP High, audit data typically needs to be readily available for roughly ninety days and retained for at least one year. Keeping everything in hot storage for the full retention window is rarely necessary and almost always expensive.
Security Lake stores data in S3, which makes lifecycle-based cost optimization straightforward and defensible during assessments. The key is to align storage tiers with how the data is actually used.
For the first ninety days, logs remain in S3 Standard. This supports fast access for SOC investigations, incident response, and ad hoc Athena queries. After that window, access patterns drop sharply, but the data still needs to be searchable. Transitioning to S3 Standard-IA balances cost savings with continued query capability. After one year, data can either expire or transition to long-term archival storage depending on agency policy and contractual requirements.
Below is an example Terraform lifecycle configuration that aligns well with common FedRAMP High expectations while keeping costs predictable.
# Example Terraform lifecycle configuration
lifecycle_configuration = {
  # 0-90 days:   S3 Standard (instant access for SOC/IR)
  # 90-365 days: S3 Standard-IA (cost-optimized but queryable)
  # 365+ days:   expiration (or move to Glacier if agency policy dictates)
  expiration_days = 365

  transitions = [
    {
      days          = 90
      storage_class = "STANDARD_IA"
    }
  ]
}
The important takeaway is not the exact numbers, but the model. Security Lake gives you native control over retention and storage tiers without introducing custom archival pipelines. That makes lifecycle management easier to operate, easier to audit, and far less likely to become a surprise line item on your AWS bill.
Security Controls: Encryption & IAM
Strong encryption and tightly scoped access controls are foundational for FedRAMP High. This is not an area where defaults are sufficient, and auditors will look closely at how keys are managed and who can access security data.
KMS Strategy
For FedRAMP High, AWS managed keys are not enough. You need customer managed keys with rotation enabled and clear ownership. A typical implementation separates concerns by using distinct keys for different parts of the system, which limits blast radius and simplifies access reviews; a minimal creation sketch follows the list below.
- Main key — This key encrypts the Security Lake S3 storage. It protects all raw and normalized security data at rest and is the most sensitive key in the system.
- Ingestion key — This key is used for encrypting streaming services such as Amazon Kinesis. Separating ingestion encryption ensures that pipeline level access does not automatically grant access to stored data.
- Analysis key — This key encrypts query results generated by Amazon Athena. This prevents analysts or downstream tooling from accessing decrypted outputs unless explicitly authorized.
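The sketch below shows what creating one of these customer managed keys with rotation enabled might look like. The alias, description, and tags are illustrative, and in practice this belongs in the same Terraform that deploys the rest of the platform.

# Sketch: creating a customer managed key with annual rotation enabled.
# Alias and description are illustrative placeholders.
import boto3

kms = boto3.client("kms", region_name="us-gov-east-1")

key = kms.create_key(
    Description="Security Lake main key - encrypts S3 storage at rest",
    KeyUsage="ENCRYPT_DECRYPT",
    KeySpec="SYMMETRIC_DEFAULT",
    Tags=[{"TagKey": "Purpose", "TagValue": "security-lake-main"}],
)
key_id = key["KeyMetadata"]["KeyId"]

kms.enable_key_rotation(KeyId=key_id)  # automatic annual rotation
kms.create_alias(AliasName="alias/security-lake-main", TargetKeyId=key_id)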
Least Privilege Access
Granting broad access to S3 buckets is one of the fastest ways to fail a FedRAMP review. Instead, AWS Lake Formation can be used to enforce fine-grained permissions on top of Security Lake.
Rather than assigning wide S3 permissions, access is defined at the database, table, and, when needed, column level. This allows SOC analysts, security engineers, and external auditors to query only the data they are authorized to see, without exposing the underlying storage layer.
This approach significantly simplifies access reviews, supports least privilege by design, and aligns well with how FedRAMP assessors expect sensitive security data to be protected.
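As a rough sketch, a column-scoped grant for a SOC analyst role might look like the following; the role ARN, database, and table names are placeholders.

# Sketch: Lake Formation grant giving a SOC analyst role SELECT on a single
# Security Lake table, restricted to specific columns. Names are placeholders.
import boto3

lakeformation = boto3.client("lakeformation", region_name="us-gov-east-1")

lakeformation.grant_permissions(
    Principal={
        "DataLakePrincipalIdentifier": "arn:aws-us-gov:iam::111111111111:role/soc-analyst"
    },
    Resource={
        "TableWithColumns": {
            "DatabaseName": "amazon_security_lake_glue_db_us_gov_east_1",
            "Name": "ext_linux_auth_table",
            "ColumnNames": ["time", "status", "user", "src_endpoint"],
        }
    },
    Permissions=["SELECT"],
)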
Custom Log Transformation: The Power of OCSF
Normalization is the real force multiplier behind Security Lake. Converting Windows XML events or Linux syslog into OCSF means your security analysts no longer need to understand several different log formats just to answer basic questions.
Instead of thinking in terms of Windows events, syslog lines, or service specific fields, analysts work with consistent concepts such as authentication, network activity, or process execution. This dramatically reduces cognitive load during investigations and makes cross source correlation realistic at scale.
For custom sources, the transformation step is where this normalization happens.
Example Linux Security Logs Transformation
In this example, a Lambda function receives a raw Linux security log entry and maps it into the OCSF Authentication class. The goal is not to preserve every original field, but to extract the security-relevant signals and express them in a consistent schema.
AWS Lambda handles this transformation inline as part of the ingestion pipeline.
# Simplified OCSF Mapping Logic
mapped_event = {
    'source': 'linux-security',
    'target_schema': 'AUTHENTICATION',
    'target_mapping': {
        'activity_id': 1,       # Logon
        'class_uid': 3002,      # Authentication class
        'user': {'name': 'admin', 'uid': '1000'},   # Linux user name and UID
        'time': 1640995200,     # Epoch seconds
        'status': 'Success'
    }
}
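The parsing step that produces this mapping can stay very small. The sketch below handles the common sshd password-authentication pattern and is illustrative only; a production transform would cover more event types and edge cases.

# Illustrative parsing step for sshd auth log lines, producing the OCSF
# Authentication mapping shown above. The regex covers only the common
# "Accepted/Failed password" pattern and is not exhaustive.
import re
import time

SSHD_PATTERN = re.compile(
    r"(?P<result>Accepted|Failed) password for (?P<user>\S+) "
    r"from (?P<src_ip>\S+) port (?P<port>\d+)"
)

def parse_sshd_line(line: str) -> dict | None:
    match = SSHD_PATTERN.search(line)
    if not match:
        return None
    return {
        "source": "linux-security",
        "target_schema": "AUTHENTICATION",
        "target_mapping": {
            "activity_id": 1,                     # Logon
            "class_uid": 3002,                    # Authentication class
            "user": {"name": match.group("user")},
            "src_endpoint": {"ip": match.group("src_ip")},
            "dst_endpoint": {"port": int(match.group("port"))},
            "time": int(time.time()),
            "status": "Success" if match.group("result") == "Accepted" else "Failure",
        },
    }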
Once transformed, this event is stored alongside native AWS authentication events in Security Lake. From the perspective of Athena queries, detections, or dashboards, there is no distinction between cloud native and custom sourced data.
This is where OCSF delivers real value. Normalization happens once at ingestion time, not repeatedly during every investigation. The result is faster queries, simpler detections, and a security dataset that actually scales as your environment grows.
Querying the Lake with Amazon Athena
Once logs are normalized into OCSF and consistently partitioned by eventday, querying Security Lake becomes a straightforward SQL exercise. At that point, you are no longer thinking in terms of individual services or log formats. You are querying security events across your entire fleet using a common schema.
This is where Security Lake starts to pay off operationally. Engineers and analysts do not need to remember where a specific log lives or how it was parsed. They can focus on asking questions and getting answers.
For example, the query below searches for failed SSH login attempts across all Linux hosts in the last twenty-four hours. The same approach works across regions, accounts, and services, without any custom joins or parsing logic.
-- Find failed SSH attempts across the entire fleet in the last 24h
SELECT
time,
src_endpoint.ip,
user.name,
status
FROM "amazon_security_lake_glue_db_us_gov_east_1"."ext_linux_auth_table"
WHERE eventday = '20251222'
AND activity_name = 'Logon'
AND status = 'Failure'
AND dst_endpoint.port = 22
ORDER BY time DESC;
Because the data is partitioned by day and stored in columnar Parquet format, these queries scale well even as data volume grows into the terabyte or petabyte range. Just as importantly, query behavior remains predictable and auditable, which is critical in FedRAMP High environments.
This model allows SOC teams, security engineers, and auditors to run meaningful investigations without standing up separate analytics infrastructure or duplicating data. Athena becomes the query layer, Security Lake remains the system of record, and access is governed centrally through Lake Formation.
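For automated reporting or scheduled reviews, the same kind of query can also be run programmatically. The sketch below uses the Athena API directly; the results bucket and database name are placeholders, and in practice results should land in an encrypted, access-controlled location.

# Sketch: running an investigation query programmatically via the Athena API.
# Output location and query text are placeholders.
import time
import boto3

athena = boto3.client("athena", region_name="us-gov-east-1")

QUERY = """
SELECT time, src_endpoint.ip, status
FROM "amazon_security_lake_glue_db_us_gov_east_1"."ext_linux_auth_table"
WHERE eventday = '20251222' AND status = 'Failure'
"""

execution = athena.start_query_execution(
    QueryString=QUERY,
    QueryExecutionContext={"Database": "amazon_security_lake_glue_db_us_gov_east_1"},
    ResultConfiguration={"OutputLocation": "s3://security-lake-athena-results/"},
)
query_id = execution["QueryExecutionId"]

# Poll until the query finishes, then fetch the first page of results
while athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"] in ("QUEUED", "RUNNING"):
    time.sleep(2)

rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]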
Operational Best Practices
Security Lake is not a set-it-and-forget-it service. In FedRAMP High environments, how you operate and monitor the platform matters just as much as how you deploy it.
- AWS Organizations Integration — Always designate a dedicated Security or Log Archive account as the Security Lake delegated administrator. This account should be isolated from application workloads and owned by the security or platform team. Centralizing ownership simplifies cross-account ingestion, enforces clear separation of duties, and makes it much easier to explain the architecture to auditors.
- Monitor the Monitor — Log ingestion health is itself a security control. For streaming sources, you should continuously monitor the Kinesis Iterator Age metric. If this value starts to climb, ingestion is falling behind, which means security events are no longer arriving in near real time. During a FedRAMP audit, this is a major red flag.
CloudWatch alarms on iterator age, Lambda errors, and throttles should be treated as high-severity alerts, not informational noise; a minimal alarm sketch follows this list.
- Network Boundary Protection — All ingestion traffic should stay within the AWS private network. Use VPC Interface Endpoints for S3, Kinesis, and Lambda to ensure log data never traverses the public internet. This directly supports SC-7 Boundary Protection requirements and reduces the attack surface of the logging pipeline.
- Infrastructure as Code — Consistency matters in regulated environments. Use Terraform modules to deploy Security Lake and its supporting infrastructure across your GovCloud production, staging, and development environments. This reduces configuration drift, simplifies change reviews, and provides a clear audit trail for how the logging platform is built and maintained.
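As a minimal sketch of the iterator-age monitoring mentioned above, the alarm below watches the Kinesis iterator age for the ingestion stream; the stream name, threshold, and SNS topic are placeholders to adapt per environment.

# Sketch: iterator-age alarm for the streaming ingestion path.
# Stream name, threshold, and SNS topic are placeholders.
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-gov-east-1")

cloudwatch.put_metric_alarm(
    AlarmName="security-lake-ingestion-iterator-age",
    Namespace="AWS/Kinesis",
    MetricName="GetRecords.IteratorAgeMilliseconds",
    Dimensions=[{"Name": "StreamName", "Value": "security-lake-ingestion"}],
    Statistic="Maximum",
    Period=300,
    EvaluationPeriods=3,
    Threshold=300000,                  # five minutes of lag
    ComparisonOperator="GreaterThanThreshold",
    TreatMissingData="breaching",      # silence is also a failure signal
    AlarmActions=["arn:aws-us-gov:sns:us-gov-east-1:111111111111:security-alerts"],
)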
Operational discipline is what turns Security Lake from a compliant design into a compliant system. These practices help ensure the platform remains reliable, defensible, and auditable long after the initial authorization is complete.
Conclusion
AWS Security Lake in GovCloud shifts organizations from simply collecting logs to truly operationalizing security data. It removes much of the custom plumbing traditionally required to meet FedRAMP High expectations, while still giving teams full control over architecture, access, and cost.
For security teams, it delivers a centralized, normalized, and auditable source of truth that aligns naturally with NIST 800-53 controls. For platform engineers, it provides a scalable and predictable way to manage petabytes of audit data without maintaining fragile ingestion pipelines or over-provisioning hot storage.
Used correctly, Security Lake becomes more than a compliance checkbox. It becomes a foundational security service that supports investigations, audits, and long term operational maturity in GovCloud environments.
About the Authors
Avner Vidal | DevOps Tech Lead | LinkedIn
Ophir Zahavi | Director of Cloud Engineering | LinkedIn