
Setting up automatic failover for your static websites hosted on S3

Danny Perez · Originally published at intricatecloud.io · 5 min read

Up until late last year, you couldn't set up automatic failover in CloudFront to look at a different bucket. You could only have one primary origin at a time, which meant that you needed additional scripts to monitor and trigger failover. (It was one of the gotchas I had written about previously when using CloudFront.)

Then AWS dropped the bombshell during re:Invent 2018 that CloudFront now supports this via Origin Groups - Announcement

In this post, I'll show how you can configure this for an existing site via the console.

Adding an origin group to your CloudFront distribution

I was working on this for one of my websites - gotothat.link - it's not a static website, but it's a site that serves a bunch of redirects, so all the same rules apply.

Step 1 - Configure replication on your S3 bucket

Go to the replication settings page for your S3 bucket.
S3 Bucket Replication Settings page

If you haven't yet enabled Versioning, it will tell you that you're required to do so.
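If you prefer the CLI, versioning can be enabled with a single call (the bucket name below is a placeholder for your own):

```shell
# Enable versioning on the source bucket (required before replication).
# Replace my-bucket-name with your own bucket.
aws s3api put-bucket-versioning \
  --bucket my-bucket-name \
  --versioning-configuration Status=Enabled

# Verify it took effect - the Status field should read "Enabled".
aws s3api get-bucket-versioning --bucket my-bucket-name
```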

Enable replication for the entire bucket by clicking Add Rule. For our use case, we don't need to encrypt the replicated contents.

replication rule page

You can follow the prompt to create a new bucket in us-west-1 - N. California.

replication bucket options

Once you've created the S3 bucket, make sure to update the bucket policy on your new replication bucket to allow the static website hosting to work:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::gotothat.link-west-1/*"
        }
    ]
}

For the IAM role, I'll let the console create one for me - this is what's generated automatically:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:Get*",
                "s3:ListBucket"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::gotothat.link",
                "arn:aws:s3:::gotothat.link/*"
            ]
        },
        {
            "Action": [
                "s3:ReplicateObject",
                "s3:ReplicateDelete",
                "s3:ReplicateTags",
                "s3:GetObjectVersionTagging"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::gotothat.link-west-1/*"
        }
    ]
}

Once you click Save, you should see this panel appear now when you're on the Management page of your S3 bucket.

S3 management console - replication settings page

Step 2: Replicate data

Replication is now enabled. But we're not quite done.

When you enable replication, any objects created or modified from that point forward are replicated to your target bucket. Existing objects are NOT automatically transferred - you have to do a one-time sync when you first set it up.

If all you have in your bucket is your website files and nothing else, you can run aws s3 sync s3://source.bucket s3://target.bucket to import all your existing objects. Depending on how much is in there, it can take a while.
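As a sketch, the one-time sync looks like this (bucket names are placeholders for your own):

```shell
# One-time copy of existing objects into the replica bucket.
# Replication only covers objects written AFTER the rule was created.
aws s3 sync s3://my-bucket-name s3://my-bucket-name-us-west-1

# Tip: run with --dryrun first to preview what would be copied:
#   aws s3 sync --dryrun s3://my-bucket-name s3://my-bucket-name-us-west-1
```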

If you have objects in your source S3 bucket that require the Website-Redirect-Location metadata (this is needed if you want to serve redirects from your domain like this):

  • You cannot sync the redirects into the replicated bucket with something like aws s3 sync s3://my-bucket-name s3://my-bucket-name-us-west-1. The sync may succeed, but your redirects will not work. This is annoying.
  • The S3 docs state that Website-Redirect-Location isn't copied over by the sync command, although they don't specify why. See docs here
  • However, objects that are automatically replicated keep their Website-Redirect-Location metadata. Any files that are aws s3 sync'd between two buckets will not.
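If you do need to recreate a redirect object in the replica bucket by hand, aws s3api put-object accepts the redirect target directly (the key and URL below reuse the example from later in this post; substitute your own):

```shell
# Re-create a redirect object in the replica bucket.
# Website-Redirect-Location is set explicitly, since an s3 sync
# won't carry it over.
aws s3api put-object \
  --bucket my-bucket-name-us-west-1 \
  --key newvideo \
  --website-redirect-location "https://youtube.com/video"
```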

The last step is to go to the replicated bucket, select Properties, and enable Versioning + Static Website Hosting for it. If you forget this, you won't receive the 301 redirect.
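The same two settings can be flipped from the CLI; the index document below is an assumption - use whatever your primary bucket is configured with:

```shell
# Enable versioning on the replica bucket.
aws s3api put-bucket-versioning \
  --bucket my-bucket-name-us-west-1 \
  --versioning-configuration Status=Enabled

# Enable static website hosting on the replica.
# index.html is an assumed index document - match your primary bucket.
aws s3 website s3://my-bucket-name-us-west-1 --index-document index.html
```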

At this point, you should have a bucket in another region, with all the same contents as your primary bucket, with static website hosting enabled.

Configure Failover in CloudFront

Create a new origin with an origin id like gotothatlink-failover-prod and use the URL of the form gotothat.link-west-1.s3-website-us-west-1.amazonaws.com

Create a new origin group and add your two origins at the prompt. The primary origin will be our original S3 bucket, and the secondary origin will be the new failover origin we created.

For the failover criteria, we can check all the 5xx status codes, since those would imply issues within S3. You can also check 404s if you want to protect against someone accidentally deleting all your data, or 403 Forbidden errors to protect against accidental bucket policy changes.

origin group settings

Next, go to the Behaviors tab, and make sure to update the behavior to route all paths to your Origin Group (and not your Origin). Otherwise, even if it fails over, it will still send requests to your primary origin.

edit behavior settings page

After you've saved these changes, you'll have to wait for the CloudFront distribution to finish updating (this can take up to a half hour).
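Rather than refreshing the console, the CLI can block until the distribution finishes deploying (the distribution ID below is a placeholder):

```shell
# Wait until the distribution status goes from "InProgress" to "Deployed".
# E2EXAMPLE1234 is a placeholder - find your ID with:
#   aws cloudfront list-distributions --query "DistributionList.Items[].Id"
aws cloudfront wait distribution-deployed --id E2EXAMPLE1234
```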

Test it by "accidentally" trashing your primary bucket

Since I've enabled failover on 404 errors, I'll test it by deleting everything in my bucket: aws s3 rm --recursive s3://gotothat.link/.
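One way to confirm the failover actually kicked in is to request a redirect path and check the response headers (the domain and key are from my setup; swap in your own):

```shell
# After emptying the primary bucket, the site should still serve a 301
# from the failover origin. -I fetches headers only.
curl -I https://gotothat.link/newvideo

# Look for a 301 status and a Location header pointing at the target URL.
```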

And now to save you boatloads of time - here is every error I ran into when setting this up.

Troubleshooting:

  • You get a 404 - CloudFront is still looking at your primary bucket. Turns out I hadn't updated the Behaviors for the CloudFront distribution.
  • You get a 403 - your replicated bucket has incorrect permissions. See the section above labeled "Configure replication on your S3 bucket" for the required bucket policy.
  • You now get a 200 - your object doesn't have the Website-Redirect-Location metadata associated with it
    • Take a look at the replicated object by doing the following - note that it includes WebsiteRedirectLocation
    • Check that you haven't done an aws s3 sync to import your redirects to your bucket.
aws s3api get-object --bucket gotothat.link-west-1 --key newvideo newvideo-1
{
    "AcceptRanges": "bytes",
    "LastModified": "Sun, 22 Sep 2019 05:21:08 GMT",
    "ContentLength": 0,
    "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
    "VersionId": "dynNjEGPeE3.z6kCCp.zgvVlbYkA.h3X",
    "ContentType": "binary/octet-stream",
    "WebsiteRedirectLocation": "https://youtube.com/video",
    "Metadata": {},
    "ReplicationStatus": "REPLICA"
}

Wrapping up

If you're doing it via the console, this doesn't take long to set up. It's a handful of steps you can do in under 30 minutes that will give you extra confidence that your site is backed up and still functional even if S3 goes down (it usually has a high-profile outage at least once a year).

With this solution in place, you have an HTTPS website served from a global CDN with automatic failover in case of an S3 outage or data loss in your primary bucket so that you don't need to take any downtime. All this for just a few bucks a month. :mindblown:
