Automating fetching of wildcard LetsEncrypt HTTPS certificates for your domain with Terraform

Yuan Gao ・6 min read

This post is about automatically fetching a wildcard certificate for your whole domain, so that each subdomain doesn't have to fetch its own certificate. It is intended for those who already manage their cloud deployment with Terraform.

Introduction to LetsEncrypt

These days accessing websites using HTTPS is the norm. Browsers will generally warn you when you are using an HTTP website, and in some instances refuse to connect to HTTP resources from sites that are served with HTTPS.

Fortunately, the process of getting an HTTPS certificate using LetsEncrypt is pretty trivial, especially if you use Docker. Before setting up a wildcard request (the subject of this post), I had my VMs all do this on startup:

docker run -d -v /etc/letsencrypt:/etc/letsencrypt -p 80:80 certbot/certbot certonly --standalone --preferred-challenges http -d site.example.com -m letsencrypt@example.com --agree-tos --no-eff-email

This is basically an unattended certificate fetch. It starts a temporary webserver on port 80 of the machine, runs certbot in "certonly" mode (i.e. it only obtains the certificate and doesn't try to detect and configure an existing webserver), and responds to the HTTP challenges that LetsEncrypt issues. This assumes the machine has already been set up as site.example.com in the DNS (which, in my setups, Terraform handles). How this command works exactly is outside the scope of this post, but check out the certbot docker image documentation and the certbot documentation for more details.

This method has been relatively straightforward and quite reliable for managing LetsEncrypt certs for a small number of VMs. I even ended up writing a shell script that would accept the requested domain as an argument, do the above, and then sit in a sleep loop trying to renew the cert every few days. I built this into a Google VM image using Packer so that every machine booted up with certbot installed and ready to fetch the required subdomain cert. But as the deployment grew, and as I started needing more certs for internal-only hosts that weren't externally accessible, I needed a better solution.

Wildcard certificates

Fetching a certificate for site.example.com means that the certificate is valid for that one subdomain only. A wildcard certificate like *.example.com, on the other hand, matches every direct subdomain of example.com (a.example.com, but not a.b.example.com or the bare example.com). This means you only need one certificate issued for all the hosts directly under your domain.

The slight catch is that LetsEncrypt won't accept an HTTP challenge for *.example.com as proof that you own the domain. Instead, you need to do a DNS challenge: upon connecting to the LetsEncrypt servers, they tell you to put a special value in your DNS records, which proves you control the DNS for the whole domain.

Doing this manually isn't too bad. You can run it in Docker as well (but this time as an interactive program):

docker run --rm -it -v /etc/letsencrypt:/etc/letsencrypt certbot/certbot certonly --manual --preferred-challenges dns -d '*.example.com' -m letsencrypt@example.com --agree-tos --no-eff-email

This will create the following prompt:

Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator manual, Installer None
Obtaining a new certificate
Performing the following challenges:
dns-01 challenge for example.com

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
NOTE: The IP of this machine will be publicly logged as having requested this
certificate. If you're running certbot in manual mode on a machine that is not
your server, please ensure you're okay with that.


Are you OK with your IP being logged?
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
(Y)es/(N)o: y

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Please deploy a DNS TXT record under the name
_acme-challenge.example.com with the following value:

F2np-hIEy7ajPLK6OaWztedukdTQCNGJgzB-PfOaT24

Before continuing, verify the record is deployed.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Press Enter to Continue

At this point, you need to create a TXT record named _acme-challenge.example.com with the value F2np-hIEy7ajPLK6OaWztedukdTQCNGJgzB-PfOaT24 in the DNS of your domain, and then wait a little for it to propagate. This usually doesn't take long, as LetsEncrypt is smart enough to query the authoritative DNS servers of your domain rather than public resolvers, which take longer to catch up. Then hit Enter to let LetsEncrypt check the DNS and issue a certificate, which you can then collect and store somewhere for use in future deployments.

This is a manual, multi-step process. You have to:

  1. Run certbot to collect the challenge text
  2. Put that challenge text in the DNS
  3. Wait a while for the DNS to propagate
  4. Hit enter back at your challenge prompt and wait for LetsEncrypt to generate a certificate
  5. Pick up that certificate and put it somewhere for distribution

Doing this manually every three months or so isn't a bad choice, but we can do better.

Wildcard Certificates under Terraform

If you're not familiar with Terraform already, it's IaC (Infrastructure as Code): it lets you set up and configure your public cloud resources using a declarative language, rather than by hand using the web interface or CLI. Its strength is that once you write your declarative cloud configs (which are checked into version control for easy versioning and collaboration), it communicates with the cloud provider, works out what changes are needed, and applies only those changes. You don't need to worry about accidentally creating duplicate resources.

Fortunately for us, Terraform already has a provider that does DNS-based LetsEncrypt challenges. The Terraform file can be quite simple, and looks like this (assuming we're using GCP, but there is support for AWS Route53 and others):

provider "acme" {
  # staging
  # server_url = "https://acme-staging-v02.api.letsencrypt.org/directory"

  # production
  server_url = "https://acme-v02.api.letsencrypt.org/directory"
}

resource "tls_private_key" "reg_private_key" {
  algorithm = "RSA"
}

resource "acme_registration" "reg" {
  account_key_pem = tls_private_key.reg_private_key.private_key_pem
  email_address   = "letsencrypt@example.com"
}

resource "tls_private_key" "cert_private_key" {
  algorithm = "RSA"
}

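resource "tls_cert_request" "req" {
  # Required by tls provider versions before 4.0; newer versions infer the
  # algorithm from private_key_pem, so remove this line there.
  key_algorithm   = "RSA"
  private_key_pem = tls_private_key.cert_private_key.private_key_pem

  subject {
    common_name  = "*.example.com"
    organization = "Example Organization"
  }
}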

resource "acme_certificate" "certificate" {
  account_key_pem         = acme_registration.reg.account_key_pem
  certificate_request_pem = tls_cert_request.req.cert_request_pem

  dns_challenge {
    provider = "gcloud"
    config = {
      GCE_SERVICE_ACCOUNT_FILE = "path/to/credentials.json"
      # Seconds between DNS propagation checks
      GCE_POLLING_INTERVAL     = 240
      # Maximum seconds to wait for DNS propagation
      GCE_PROPAGATION_TIMEOUT  = 600
    }
  }
}
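
The configuration above relies on the acme and tls providers. As a minimal sketch, assuming Terraform 0.13 or later (where providers outside the hashicorp namespace have to be declared with a source address), the provider requirements could look like this. Version constraints are deliberately left out here; pin them to whatever versions you have tested.

```hcl
terraform {
  required_providers {
    # Community-maintained ACME provider used by the acme_* resources
    acme = {
      source = "vancluever/acme"
    }

    # Generates the account key, certificate key and CSR
    tls = {
      source = "hashicorp/tls"
    }
  }
}
```

After adding or changing this block, run terraform init again so the providers get downloaded.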

The main gotcha here was how dns_challenge works in the acme_certificate resource. The "acme" provider maintains its own client to do the DNS updates, separate from any other Terraform providers you may be using. It appears to be able to automatically find your DNS zone and add the TXT record to it, though it does have some variables to help tell it where to look.

In the above example, I opted to specify a GCE_SERVICE_ACCOUNT_FILE that I already had (it's the same one that the Terraform Google provider uses, so it was already on disk somewhere). But you could also have Terraform generate a service account and key with just the DNS privileges if you want.
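
Generating a dedicated service account could be sketched roughly as follows. This is not the setup used in this post: the variable project_id, the resource names, the account_id and the credentials filename are all made up for illustration, and it assumes the Google and local providers are configured. Note that the generated key ends up both in the Terraform state and on local disk, so both need protecting.

```hcl
# Hypothetical variable: the GCP project that hosts the DNS zone.
variable "project_id" {
  type = string
}

# Dedicated service account used only for the ACME DNS challenge.
resource "google_service_account" "acme_dns" {
  project      = var.project_id
  account_id   = "acme-dns-challenge"
  display_name = "ACME DNS challenge"
}

# Allow the account to manage Cloud DNS records in the project.
resource "google_project_iam_member" "acme_dns_admin" {
  project = var.project_id
  role    = "roles/dns.admin"
  member  = "serviceAccount:${google_service_account.acme_dns.email}"
}

# Create a key for the account. The private_key attribute holds the
# base64-encoded JSON credentials file.
resource "google_service_account_key" "acme_dns" {
  service_account_id = google_service_account.acme_dns.name
}

# Decode the key and write it to disk so that a file path can be handed
# to GCE_SERVICE_ACCOUNT_FILE.
resource "local_file" "acme_dns_credentials" {
  content         = base64decode(google_service_account_key.acme_dns.private_key)
  filename        = "${path.module}/acme-dns-credentials.json"
  file_permission = "0600"
}
```

The dns_challenge config would then use GCE_SERVICE_ACCOUNT_FILE = local_file.acme_dns_credentials.filename instead of a hard-coded path. Adding depends_on = [google_project_iam_member.acme_dns_admin] to the acme_certificate resource is probably wise too, so the challenge isn't attempted before the role has been granted.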

The other issue was that it was checking the DNS too soon, before the records had propagated to the point where LetsEncrypt could read them, so I set a custom polling interval and timeout, which fixed it.

Once this is applied, the cert files are available as attributes of acme_certificate.certificate. An example usage of copying them into a VM is:


resource "google_compute_instance" "a_vm" {
  ...

  provisioner "file" {
    content = tls_private_key.cert_private_key.private_key_pem
    destination = "/etc/ssl/privkey.pem"

    connection {
      type        = "ssh"
      ...
    }
  }

  provisioner "file" {
    content = "${acme_certificate.certificate.certificate_pem}${acme_certificate.certificate.issuer_pem}"
    destination = "/etc/ssl/fullchain.pem"

    connection {
      type        = "ssh"
      ...
    }
  }
}

This pulls in the private key used to request the certificate, and concatenates the certificate and the issuer certificate to form fullchain.pem.
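
If the certificate needs to be consumed outside of this Terraform configuration, the same attributes can also be exposed as outputs. A minimal sketch, with output names chosen purely for illustration:

```hcl
# Certificate followed by the issuer certificate, i.e. the fullchain.pem contents
output "wildcard_fullchain_pem" {
  value = "${acme_certificate.certificate.certificate_pem}${acme_certificate.certificate.issuer_pem}"
}

# Private key that the certificate request was generated from.
# Marked sensitive so it is not printed in plan/apply output.
output "wildcard_private_key_pem" {
  value     = tls_private_key.cert_private_key.private_key_pem
  sensitive = true
}
```

After an apply, terraform output wildcard_fullchain_pem prints the chain. Keep in mind that the private key is stored in the Terraform state either way, so the state itself should be treated as a secret.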

Unfortunately, certificate renewals add a slight complexity to this method: provisioners only run when a resource is created, and Terraform doesn't automatically taint VM resources when file provisioner contents change. With this method, I would have to manually taint the VM so that Terraform recreates it with the new certificate. Fortunately, there are other ways to handle this without manual tainting, ranging from using a load balancer or proxy to putting the certificate into something else that manages it.
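
As one illustration of the load balancer route (a sketch only, and not something set up in this post), the certificate could be handed to a Google Cloud HTTPS load balancer as a google_compute_ssl_certificate. The resource name and name_prefix below are made up, and the load balancer itself is assumed to exist elsewhere in your configuration.

```hcl
resource "google_compute_ssl_certificate" "wildcard" {
  # A prefix instead of a fixed name, so a renewed certificate can be
  # created alongside the old one under a fresh generated name.
  name_prefix = "wildcard-example-com-"

  private_key = tls_private_key.cert_private_key.private_key_pem
  certificate = "${acme_certificate.certificate.certificate_pem}${acme_certificate.certificate.issuer_pem}"

  lifecycle {
    # Create the replacement before destroying the old certificate,
    # since the old one is still attached to the load balancer.
    create_before_destroy = true
  }
}
```

A target HTTPS proxy would then reference google_compute_ssl_certificate.wildcard in its ssl_certificates list. When a later terraform apply renews the certificate, Terraform replaces this resource and repoints the proxy, with no VM needing to be tainted or recreated.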

Either way, the problem of keeping wildcard certs updated across resources is much the same whether you use Terraform or do it manually. What has changed is that fetching a wildcard certificate for a domain, and making it programmatically available to whatever you use to manage certs, has gone from 5 or 6 manual steps down to a single terraform apply. I call that a win.


Cover Photo by Mackenzie Marco on Unsplash
