DEV Community

Sorin Costea

Posted on • Originally published at tryingthings.wordpress.com

Poor man’s static web site protection in AWS S3 (with Terraform)

We’re talking internet so arguably the best protection would be to not use it at all, but here we are, serving static files from AWS S3 and hoping not every site and forum is going to link to them. Why? Because we’re the ones paying the AWS bills, aren’t we.

I have this web application – you could call it an SPA even though it’s little more than a REST API client – and it uses a few assets like images and the page itself (duh). So here’s the easy way to get some security:

  1. serve the files with a CloudFront distribution (that’s AWS talk for their own CDN)
  2. allow only CloudFront to read files from S3 (by setting up an OAI – origin access identity)
  3. always upgrade the connection to HTTPS and allow only GET, HEAD and OPTIONS
  4. enable a WAF (AWS web application firewall, version 2) ACL which blocks by default and allows only what its rules match
  5. and finally, accept only requests which carry a custom header with a known value

Did I say easy? WAF cost me an entire day, and even now I have no idea what was wrong initially or why it works now. But here’s what works, in Terraform because I hate CloudFormation – the concepts should be clear either way.

1) the CloudFront distribution itself, which references the S3 bucket where your assets live (you already have your bucket, right?) and the soon-to-be-defined ACL:

locals {
  s3_origin_id = "my_origin"
}

resource "aws_cloudfront_distribution" "my_distribution" {
  origin {
    domain_name = aws_s3_bucket.my_bucket.bucket_regional_domain_name
    origin_id   = local.s3_origin_id
    s3_origin_config {
      origin_access_identity = aws_cloudfront_origin_access_identity.my_oai.cloudfront_access_identity_path
    }
  }
  web_acl_id          = aws_wafv2_web_acl.my_acl.arn
  enabled             = true
  is_ipv6_enabled     = true
  default_root_object = "index.html"
  default_cache_behavior {
    allowed_methods  = ["GET", "HEAD", "OPTIONS"]
    cached_methods   = ["GET", "HEAD"]
    target_origin_id = local.s3_origin_id
    forwarded_values {
      query_string = false
      cookies {
        forward = "none"
      }
    }
    viewer_protocol_policy = "redirect-to-https"
  }
  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }
  tags = {
    maypp = "test"
  }
  viewer_certificate {
    cloudfront_default_certificate = true
  }
}

2) the OAI of course. It's only this, really.

resource "aws_cloudfront_origin_access_identity" "my_oai" {
  comment = "Serve securely S3 assets"
}

Just don't forget to add it to your S3 bucket policy, otherwise nothing (good) will happen:

data "aws_iam_policy_document" "my_s3_policy" {
...
  # Let the CloudFront OAI (and nobody else in this statement) read the objects
  statement {
    actions   = ["s3:GetObject"]
    # Same bucket the distribution uses as its origin
    resources = ["${aws_s3_bucket.my_bucket.arn}/*"]
    principals {
      type        = "AWS"
      identifiers = [aws_cloudfront_origin_access_identity.my_oai.iam_arn]
    }
  }
...
}
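
The policy document above is only data, so it still has to be attached to the bucket. A minimal sketch of that step, assuming the bucket resource is named "my_bucket" as in the distribution; the resource name "my_bucket_policy" is made up for illustration:

```hcl
# Hypothetical resource name; attaches the rendered policy document to the bucket.
resource "aws_s3_bucket_policy" "my_bucket_policy" {
  bucket = aws_s3_bucket.my_bucket.id
  policy = data.aws_iam_policy_document.my_s3_policy.json
}
```

If you already have a bucket policy resource, add the statement to its document instead, since a bucket carries a single policy.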

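3) the connection and method filters are already visible in the CloudFront distribution definition above: "viewer_protocol_policy" redirects to HTTPS and "allowed_methods" lists GET, HEAD and OPTIONS.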

4) The WAFv2 access control list (ACL for the advanced). This is where most of my time got burned. Maybe there are better ways to do it, but heck if I want to invest more sweat into it any time soon.

Notice its scope is "CLOUDFRONT", and even though CloudFront is global, the ACL MUST be created in the us-east-1 region. For this I needed the multi-provider Terraform hack, see below. It also demands a mandatory "visibility_config" block for everything and its mother (see also the next point), even if you don't need metrics right now, because if AWS is a mess, why shouldn't Terraform imitate it.

resource "aws_wafv2_web_acl" "my_acl" {
  name     = "my-acl"
  scope    = "CLOUDFRONT"
  provider = aws.us-east
  default_action {
    block {}
  }
  rule {
    name     = "listlik-acl-rule"
    priority = 1
    override_action {
      none {}
    }
    statement {
      rule_group_reference_statement {
        arn = aws_wafv2_rule_group.my_rule_group.arn
      }
    }
    visibility_config {
      cloudwatch_metrics_enabled = false
      metric_name                = "my-acl-rule-metric"
      sampled_requests_enabled   = false
    }
  }
  tags = {
    maypp = "test"
  }
  visibility_config {
    cloudwatch_metrics_enabled = false
    metric_name                = "my-acl-metric"
    sampled_requests_enabled   = false
  }
}
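
If you end up debugging blocked requests like I did, the same mandatory block can at least earn its keep. A sketch of a fragment (not a complete resource) which could stand in for any of the "visibility_config" blocks while you investigate; the metric name is the one from the ACL above:

```hcl
# Fragment: goes inside the ACL, a rule or the rule group in place of the existing block.
visibility_config {
  cloudwatch_metrics_enabled = true           # publish allow/block counts to CloudWatch
  metric_name                = "my-acl-metric"
  sampled_requests_enabled   = true           # keep a sample of matched requests for inspection
}
```

Switch both flags back to false once things work if you don't want the extra metrics.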

As mentioned, Terraform needed two providers: the regular AWS one and a second one pinned to us-east-1 for the CLOUDFRONT-scoped WAF resources, which you always refer to by its alias:

provider "aws" {
  region = var.region
}
provider "aws" {
  alias  = "us-east"
  region = "us-east-1"
}

5) And finally the rule group, with no geographical restrictions but a single rule which lets through only requests whose custom header contains exactly a custom value. Notice that because the ACL's default action (taken when no rule matches) is to block, and the ACL doesn't override the rule group's actions, this rule's allow action is the only thing that lets you receive the assets.

resource "aws_wafv2_rule_group" "my_rule_group" {
  name     = "my-rule-group"
  scope    = "CLOUDFRONT"
  provider = aws.us-east
  capacity = 2
  rule {
    name     = "my-rule"
    priority = 1
    action {
      allow {}
    }
    statement {
      byte_match_statement {
        positional_constraint = "EXACTLY"
        search_string         = "megasecretstring"
        field_to_match {
          single_header {
            name = "x-my-referrer"
          }
        }
        text_transformation {
          priority = 1
          type     = "NONE"
        }
      }
    }
    visibility_config {
      cloudwatch_metrics_enabled = false
      metric_name                = "my-rule-metric"
      sampled_requests_enabled   = false
    }
  }
  visibility_config {
    cloudwatch_metrics_enabled = false
    metric_name                = "my-rule-group-metric"
    sampled_requests_enabled   = false
  }
  tags = {
    maypp = "test"
  }
}
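
The header value is hardcoded in the rule above. A possible variation, assuming you'd rather keep it out of the source files, is to pass it in as a variable; the variable name "header_secret" is invented for this sketch:

```hcl
# Hypothetical variable; supply the value via a tfvars file or TF_VAR_header_secret.
variable "header_secret" {
  description = "Value the custom header must carry for WAF to allow the request"
  type        = string
  sensitive   = true # needs Terraform 0.14 or newer
}

# Inside the byte_match_statement of "my-rule", the literal would then become:
#   search_string = var.header_secret
```

Marking it sensitive only hides the value in plan and apply output; it still ends up in the state file.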

Now you can add an extension to your browser (like Simple Modify Headers if you use Firefox) which, for a specific domain (the domain of your CloudFront distribution), always attaches the header configured above to your requests. I know I could have used the standard "Referer" header, but then it wouldn't be so obvious that CloudFront doesn't care at all: it will filter the requests by anything you like.
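
To know which domain to configure in the extension, you could let Terraform print it after apply. A small sketch, with an output name picked just for illustration:

```hcl
# Hypothetical output name; exposes the generated *.cloudfront.net domain of the distribution.
output "my_distribution_domain" {
  description = "Domain name to use in the browser and in the header extension"
  value       = aws_cloudfront_distribution.my_distribution.domain_name
}
```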

(Published as part of the #100DaysToOffload challenge https://100daystooffload.com/)
