Hi! In this post, I'm going to run through what I did to get my site https://man-yells-at.cloud onto the internet - the intent is to be a useful guide for anybody who's looking to do something similar, but as I'm writing this at the same time as I build everything, it'll likely include a bit of stream-of-consciousness on the process!
The best bit for you is that you get to skip the parts where I called AWS services foul things because I'd configured something wrong, and get right to the bits that work, with some explanations wherever something had me Googling or backtracking on a decision.
The Stack
Gatsby
I won't be focusing too much on this in this post, but to get this built, I followed the Gatsby tutorial for a bit, threw a strop when I realised how much Gatsby loves GraphQL, and then begrudgingly came back when I realised that building much else was going to be a fair bit more work than I wanted to do.
Once I finished the guide, I made a couple of modifications so that I could messily cram it all into a blog template theme built by Xiaoying Riley (you can find their templates here). That's what I'm hoping to build and throw into AWS.
At a certain point, I realised that the routing wouldn't work without a plugin called gatsby-plugin-s3, which I gave the following config in my gatsby-config:
{
  resolve: `gatsby-plugin-s3`,
  options: {
    bucketName: "<BUCKET_NAME>",
    protocol: "https",
    hostname: "<DOMAIN_NAME>",
  },
}
AWS
Most of my interactions with AWS are through Terraform at the moment (and then a bit of ClickOps when I'm confused or trying to ascertain the state of a system).
Route53
Route53 is AWS' DNS service, which lets me manage where my domain points. By updating my domain to use its nameservers, I can then automate setting up new records via Terraform.
One caveat to doing this: the Route53 control plane currently lives in AWS' us-east-1 region, which has seen a couple of major outages in the last year. If you have critical applications in production and DNS changes are part of your disaster recovery plan, make sure to consider this whilst building!
S3
AWS' Simple Storage Service serves as a suitable service to situate my very serious site. Whilst primarily advertised as a 'file storage' service, S3 is one of my favourite AWS services because of its versatility. In this instance, we're going to use it as the place where we host our site, meaning we don't have to spin up and manage any pesky servers. I'm hoping this pleases the Serverless cult crowd.
Cloudfront
Cloudfront is a CDN that helps tie the whole project together by making sure my site is available as close to users as possible at 'edge locations', and caching the site there. This reduces latency for users and cuts the number of requests that have to fetch content from the S3 bucket itself, which is a massive cost saver.
The Guide
In broad steps, I'll be walking through:
- Managing my externally purchased domain using AWS Route53
- Hosting a Gatsby site in S3
- Serving that site from Cloudfront
So let's get into it.
My setup
This is not prescriptive, but if you're following along then this is how I'm doing things:
- Using Terraform to manage infrastructure, installed via tfenv
- Using a .tfvars file that I feed into my terraform apply for things like the domain name that might get repeated
- Running everything in a new AWS Organizations sub account
- The AWS CLI
- Authenticating using Granted
Here's what my variables.tf file looks like:
variable "domain_name" {
  type = string
}

variable "bucket_name" {
  type = string
}

variable "environment" {
  type = string
}
I then fill out these variables in a .tfvars file and feed it into my terraform apply runs.
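For illustration, the file is just the three values - everything below is a placeholder (including the production.tfvars name itself) - and it gets fed in with terraform apply -var-file=production.tfvars:

production.tfvars

domain_name = "man-yells-at.cloud"
bucket_name = "<BUCKET_NAME>"
environment = "production"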
Managing my domain through AWS
Before setting up this site, I had purchased the domain https://man-yells-at.cloud through Namecheap (they always seem to be decently priced, making DNS changes isn't frustrating as hell, and they're familiar).
I could manually point domains at things later on in the guide, but I like being able to manage everything through Terraform, so I'm going to point my Namecheap domain at AWS. To do this, I'm going to create a little Terraform module to manage a Route53 Hosted Zone. First, I set up the provider and remote backend in a main.tf file.
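There isn't anything exciting in that main.tf - it's roughly the sketch below, where the provider version, state bucket, key, and region are placeholders to swap for your own:

main.tf

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }

  # Remote state lives in a pre-existing S3 bucket - the bucket and key here are placeholders
  backend "s3" {
    bucket = "<STATE_BUCKET_NAME>"
    key    = "man-yells-at-cloud/terraform.tfstate"
    region = "eu-west-1"
  }
}

provider "aws" {
  region = "eu-west-1"
}

With the provider and backend in place, I then created a new file for the Hosted Zone itself: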
Route53
route53.tf
resource "aws_route53_zone" "main" {
name = var.domain_name
tags = {
Name = var.bucket_name
Environment = var.environment
}
}
output "nameservers" {
value = aws_route53_zone.main.name_servers
}
And running terraform apply. I included the output so that I'd get something that looks like this:
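Yours will have different values, but it's a list of the four nameserver hostnames AWS assigned to the hosted zone - something along these lines (these are placeholders):

Outputs:

nameservers = [
  "ns-123.awsdns-01.com",
  "ns-456.awsdns-02.net",
  "ns-789.awsdns-03.org",
  "ns-1011.awsdns-04.co.uk",
]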
I then used these to change the nameserver settings in Namecheap. To check that everything propagated and that I'd used the correct nameservers, I created a TXT record by adding this block:
route53.tf
resource "aws_route53_record" "propagation_check" {
zone_id = aws_route53_zone.main.zone_id
name = "test.${var.domain_name}"
type = "TXT"
ttl = 300
records = ["teststring"]
}
And applying again. After a couple of minutes, I ran
dig -t txt test.man-yells-at.cloud
And saw the record I'd just set. So far, so good.
Creating an S3 bucket and uploading my site
I had to come back to this bit after reminding myself how some kinds of site behave on S3. This was initially a bucket with absolutely no public read, where our Cloudfront distribution had the permissions necessary to serve the objects in it - a setup that is generally best practice, as it means nobody is interacting directly with the bucket. Unfortunately this isn't feasible without breaking the way some routes work in Gatsby, so I'll write a different post about how to do that securely at some point!
With the above addendum in mind, I switched to making a publicly readable bucket.
S3 Bucket
resource "aws_s3_bucket" "site_bucket" {
bucket = var.bucket_name
tags = {
Name = var.bucket_name
Environment = var.environment
}
}
resource "aws_s3_bucket_acl" "site_bucket_acl" {
bucket = aws_s3_bucket.site_bucket.id
acl = "public-read"
}
resource "aws_s3_bucket_website_configuration" "site_config" {
bucket = aws_s3_bucket.site_bucket.id
index_document {
suffix = "index.html"
}
error_document {
key = "/404.html"
}
}
resource "aws_s3_bucket_public_access_block" "public_access_block" {
bucket = aws_s3_bucket.site_bucket.id
block_public_acls = false
block_public_policy = false
ignore_public_acls = false
restrict_public_buckets = false
}
locals {
s3_origin_id = "myS3Origin"
}
And another terraform apply.
Once the bucket and its corresponding website configuration were set up, it was a simple matter of running the deploy command from gatsby-plugin-s3 (after a gatsby build) to push the site's files into the bucket.
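One thing that's easy to miss here: the public-read ACL on the bucket doesn't by itself make the uploaded objects readable - gatsby-plugin-s3 handles that, since (as far as I can tell) it uploads each object with a public-read ACL by default. If you'd rather grant read access via a bucket policy instead, a sketch like the one below should do it - this is an alternative I didn't use in this build, so treat it as a starting point rather than gospel:

resource "aws_s3_bucket_policy" "public_read" {
  bucket = aws_s3_bucket.site_bucket.id

  # Allow anyone to GET objects from the bucket - this is what the S3
  # website endpoint needs in order to serve the site publicly.
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "PublicReadGetObject"
        Effect    = "Allow"
        Principal = "*"
        Action    = "s3:GetObject"
        Resource  = "${aws_s3_bucket.site_bucket.arn}/*"
      }
    ]
  })
}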
Getting this into Cloudfront
Okay so this got out of hand...
This bit is where it got frustrating. At first, as noted earlier, I had intended for the bucket to be locked down and only available via Cloudfront. It quickly became clear, due to the way routing works in Gatsby, that this couldn't be the case, and so I slowly had to undo the pieces I had put in place to protect the bucket from direct access.
The most frustrating part of this was the subtle difference (when you're looking at code, at least) that if you serve an S3 bucket from its website endpoint, the origin changes from an s3_origin_config to a custom_origin_config, which follows some different rules. It's a small gripe, but I am a drama queen and I will make a mountain out of this molehill.
Anyway, let's get building
I knew I needed a certificate that I could attach to my Cloudfront distribution, so I made that first. Cloudfront only accepts certificates created in the us-east-1 region, so remember this when you're building. I used a separate aliased Terraform provider for that region, and considering how simple it makes things I used the AWS ACM module to set up the certificate - it handily does the minor lifting of validating the certificate via Route53 for me (but if you want to roll your own, Terraform gives you all the component pieces to do so pretty easily!):
Certificate
certificate.tf
provider "aws" {
alias = "us-east-1"
region = "us-east-1"
}
module "acm" {
source = "terraform-aws-modules/acm/aws"
version = "~> 3.0"
domain_name = var.domain_name
zone_id = aws_route53_zone.main.zone_id
subject_alternative_names = [
"*.${var.domain_name}",
]
providers = {
aws = aws.us-east-1
}
wait_for_validation = true
}
Applied that - this can take slightly longer if the DNS validation doesn't happen straight away - and I had my certificate!
Cloudfront
With the certificate made, I proceeded onto the Cloudfront distribution. This uses the S3 Website Endpoint as an origin (which doesn't accept HTTPS - this revelation took me another short while to figure out as I refreshed to multiple 504 pages). I made sure to include a redirect-to-https in the configuration so that we don't have anybody accessing the site over HTTP:
cloudfront.tf
resource "aws_cloudfront_distribution" "s3_distribution" {
origin {
domain_name = aws_s3_bucket.site_bucket.website_endpoint
origin_id = local.s3_origin_id
custom_origin_config {
http_port = 80
https_port = 443
origin_protocol_policy = "http-only"
origin_ssl_protocols = ["SSLv3", "TLSv1.2", "TLSv1.1", "TLSv1"]
}
}
enabled = true
is_ipv6_enabled = true
comment = "Some comment"
default_root_object = "index.html"
aliases = [ var.domain_name ]
default_cache_behavior {
allowed_methods = ["GET", "HEAD", "OPTIONS"]
cached_methods = ["GET", "HEAD"]
target_origin_id = local.s3_origin_id
forwarded_values {
query_string = false
cookies {
forward = "none"
}
}
viewer_protocol_policy = "redirect-to-https"
min_ttl = 0
default_ttl = 3600
max_ttl = 86400
}
price_class = "PriceClass_200"
restrictions {
geo_restriction {
restriction_type = "none"
}
}
tags = {
Environment = var.environment
}
viewer_certificate {
acm_certificate_arn = module.acm.acm_certificate_arn
ssl_support_method = "sni-only"
}
}
Finally, I added one more block to my route53.tf:
route53.tf
resource "aws_route53_record" "site" {
zone_id = aws_route53_zone.main.zone_id
name = var.domain_name
type = "A"
alias {
name = aws_cloudfront_distribution.s3_distribution.domain_name
zone_id = aws_cloudfront_distribution.s3_distribution.hosted_zone_id
evaluate_target_health = false
}
}
Applying this all took the longest because Cloudfront distributions can take a little while (in their defence, they sort of have to take over the world).
Once the route had propagated... well, if you're reading this on https://man-yells-at.cloud then you're looking at it!
Conclusion
MVP is best
Some bits didn't work out the way I wanted, and that's okay! This is meant to be a rough side project that:
- gives me the ability to post ✅
- without having to worry about significant cost or security issues ✅
- doesn't take long to set up and / or maintain ✅
It's hit the minimum of what I set out to do, and if it becomes infeasible to manage later down the line, I can spend a bit of time moving the React bits into their own thing and making something a bit more fully formed later on. Maybe it's Lambdas!
Can we get more friendly?
Definitely the biggest issue I saw was the Cloudfront -> S3 connection. Whilst typical setups are well documented, it took a fair bit of exploration and trial-and-error to get what I'd assume is a fairly common outlier up and running. Maybe it doesn't get documented much because there are friendlier platforms for Gatsby out there, and I added complication by trying to put it on S3?
Stay tuned for more!
Getting this published once is all well and good, but I'm going to need to automate the deployment. In my next post, I'll build a secure pipeline that authenticates with AWS without needing an access key for deploying changes / new posts on my site.