Pavel

Originally published at hostim.dev

Let's Encrypt Wildcard Certs in Kubernetes: cert-manager + DNS-01 (and When We Skipped It)

If you run Kubernetes and want a wildcard TLS cert from Let's Encrypt, say *.example.com, you need a DNS-01 challenge. Let's Encrypt issues wildcards only through DNS-01: an HTTP-01 response proves control of one concrete hostname, not of every name under a domain. That single fact rules out the easy path most tutorials show.

This post is what we actually run at Hostim.dev for our shared *.region.hostim.dev wildcard. We use cert-manager for per-app certs and a plain certbot Ansible playbook for the wildcard. Two different tools for two different jobs. We will explain why, then show the code for both.

Why two tools for one cluster?

You can do everything with cert-manager. It supports DNS-01 with a long list of providers. So why are we running a second tool?

Three reasons:

  1. Our DNS provider (Namecheap) does not have a stable cert-manager webhook. There are community webhooks, but they break on upgrades. Maintaining one for a single cert is more work than running certbot every couple of months.
  2. The wildcard cert covers our shared ingress, not user apps. It rotates rarely, lives in one namespace, and is read by every ingress as a TLS secret. cert-manager is built for the opposite case: many short-lived certs per Ingress.
  3. A failed cert-manager renewal at 3 a.m. is hard to debug. A failed Ansible run on our laptop is a stack trace we can read.

For per-app domains (my-app.user.tld with cert-manager + HTTP-01), the controller-driven model wins. For the one shared wildcard, the manual model wins. Use the right tool.

Path A: cert-manager + HTTP-01 (per-app domains)

This is the standard path. Most apps want a cert for one or two hostnames. HTTP-01 is the simplest challenge: cert-manager spins up a temporary pod, the ACME server hits http://app.example.com/.well-known/acme-challenge/..., the pod responds, the cert is issued.

1. Install cert-manager

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.16.1/cert-manager.yaml

Wait for the three pods (cert-manager, cert-manager-webhook, cert-manager-cainjector) to be ready.
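
One way to block until the install has settled is `kubectl wait` on the Deployments. This is a minimal sketch, assuming the default `cert-manager` namespace that the upstream manifest creates. The function name and the 180-second default timeout are our own choices for illustration.

```shell
# Sketch: wait until every Deployment in the cert-manager namespace
# reports the Available condition.
# Assumption: the manifest was applied unmodified, so the namespace is
# "cert-manager". The function name and default timeout are illustrative.
wait_for_cert_manager() {
  local ns="${1:-cert-manager}" timeout="${2:-180s}"
  kubectl wait --namespace "$ns" \
    --for=condition=Available deployment --all \
    --timeout="$timeout"
}

# Usage (needs a working kubeconfig):
#   wait_for_cert_manager
#   wait_for_cert_manager cert-manager 5m
```

The command exits non-zero on timeout, which makes it usable as a gate in a setup script.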

2. Create a ClusterIssuer

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: you@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-account
    solvers:
      - http01:
          ingress:
            class: nginx

Apply it. cert-manager registers an ACME account with Let's Encrypt once the issuer is created and stores the account key in the letsencrypt-prod-account secret.
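
Before annotating any Ingress, it can help to confirm that the issuer is usable. The sketch below reads the issuer's Ready condition. The function name is hypothetical, and the default issuer name is taken from the manifest in this step.

```shell
# Sketch: print the Ready condition of a ClusterIssuer.
# "True" means cert-manager considers the issuer usable.
# Assumption: the issuer is named as in the manifest (letsencrypt-prod).
issuer_ready_status() {
  local name="${1:-letsencrypt-prod}"
  kubectl get clusterissuer "$name" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}

# Usage:
#   issuer_ready_status                 # checks letsencrypt-prod
#   kubectl describe clusterissuer letsencrypt-prod   # full detail on failure
```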

3. Annotate your Ingress

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts: ["app.example.com"]
      secretName: app-example-com-tls
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 80

That is it. cert-manager sees the annotation, requests the cert, solves the HTTP-01 challenge, and writes the result into the app-example-com-tls secret. Renewal is automatic.

This works for any number of distinct hostnames. We do this exact thing for every user app on hostim.dev.
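
When a cert does not show up, the usual place to look is the chain of resources cert-manager creates for each request. A small sketch that lists them for one namespace follows. The function name is ours, and it assumes the cert-manager CRDs are installed so that `kubectl` recognizes these resource types.

```shell
# Sketch: list the cert-manager resources involved in issuing a cert,
# from the Certificate down to the ACME Challenge.
# Assumption: cert-manager CRDs are installed in the cluster.
cert_status() {
  local ns="${1:-default}"
  kubectl get certificate,certificaterequest,order,challenge --namespace "$ns"
}

# Usage:
#   cert_status            # namespace "default"
#   cert_status my-apps    # hypothetical namespace name
# A Challenge that stays pending is usually the one to `kubectl describe`.
```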

Path B: certbot + DNS-01 (the wildcard)

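For *.region.hostim.dev, HTTP-01 cannot work: an HTTP request can only demonstrate control of one concrete hostname, and Let's Encrypt accepts only DNS-01 for wildcard names. With DNS-01 we prove control over the base domain (region.hostim.dev) by publishing a TXT record at _acme-challenge.region.hostim.dev.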

You can do this with cert-manager and a DNS-01 webhook for your provider. We chose not to. Here is the Ansible playbook we run instead.

The flow

  1. Ansible writes two scripts: an auth hook (creates the TXT record) and a cleanup hook (deletes it).
  2. certbot --manual --preferred-challenges dns runs the auth hook, waits for DNS to propagate, lets ACME verify, then runs the cleanup hook.
  3. The resulting fullchain.pem and privkey.pem get loaded into a Kubernetes Secret of type kubernetes.io/tls.
  4. Every ingress in the shared namespace references that secret.
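
The TXT record from step 1 always lives at `_acme-challenge.` followed by the base domain, with the leading `*.` of a wildcard stripped. A tiny helper makes that rule explicit. The function name is hypothetical and the domains are placeholders.

```shell
# Sketch: derive the DNS-01 challenge record name for a certificate name.
# For a wildcard, the "*." label is dropped before prefixing.
acme_challenge_name() {
  local domain="${1#\*.}"   # remove a leading "*." if present
  printf '_acme-challenge.%s\n' "$domain"
}

# Usage:
#   acme_challenge_name '*.eu-center.example.com'
#   dig TXT "$(acme_challenge_name '*.eu-center.example.com')" +short
```

Because the wildcard and the bare base domain map to the same record name, a stale TXT value left behind by an earlier run can get in the way, which is one reason the cleanup hook matters.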

The playbook (trimmed)

- name: Issue and upload wildcard TLS certificate
  hosts: localhost
  vars:
    sld: "example"
    tld: "com"
    region: "eu-center"
    wildcard_domain: "*.{{ region }}.{{ sld }}.{{ tld }}"
    local_tmp: "/tmp/wildcard-{{ region }}"
    k8s_namespace: "ingress-nginx"
    k8s_secret_name: "wildcard-{{ region }}-tls"

  tasks:
    - name: Create certbot auth hook (creates the TXT record)
      copy:
        dest: "/tmp/certbot-auth-{{ region }}.sh"
        mode: "0755"
        content: |
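          #!/bin/bash
          set -e
          # Publish the challenge token that certbot passes in CERTBOT_VALIDATION.
          namecheap-cli setone \
            --sld {{ sld }} --tld {{ tld }} \
            --type TXT --name "_acme-challenge.{{ region }}" \
            --address "${CERTBOT_VALIDATION}" --ttl 60
          # Poll a public resolver until the record is visible (30 tries, 10 s apart).
          propagated=0
          for i in {1..30}; do
            val=$(dig TXT _acme-challenge.{{ region }}.{{ sld }}.{{ tld }} @1.1.1.1 +short | tr -d '"')
            if [[ "$val" == "${CERTBOT_VALIDATION}" ]]; then propagated=1; break; fi
            sleep 10
          done
          # Exit non-zero with a clear message if the record never showed up.
          [[ "$propagated" == 1 ]] || { echo "TXT record did not propagate" >&2; exit 1; }
          sleep 30  # belt and suspenders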

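    # Trimmed from this excerpt: a second copy task that writes the cleanup
    # hook (/tmp/certbot-cleanup-{{ region }}.sh), which deletes the TXT record.
    - name: Issue wildcard certificate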
      command: >
        certbot certonly --manual --preferred-challenges dns
        --manual-auth-hook /tmp/certbot-auth-{{ region }}.sh
        --manual-cleanup-hook /tmp/certbot-cleanup-{{ region }}.sh
        --agree-tos -m you@example.com
        --server https://acme-v02.api.letsencrypt.org/directory
        -d "{{ wildcard_domain }}"
        --work-dir {{ local_tmp }} --config-dir {{ local_tmp }}
        --logs-dir {{ local_tmp }} --non-interactive

    - name: Create or update TLS Secret
      kubernetes.core.k8s:
        state: present
        namespace: "{{ k8s_namespace }}"
        definition:
          apiVersion: v1
          kind: Secret
          metadata:
            name: "{{ k8s_secret_name }}"
          type: kubernetes.io/tls
          data:
            tls.crt: "{{ lookup('file', local_tmp + '/live/.../fullchain.pem') | b64encode }}"
            tls.key: "{{ lookup('file', local_tmp + '/live/.../privkey.pem') | b64encode }}"
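
If you are not using Ansible, the final task has a plain `kubectl` equivalent. This is a sketch of the common create-or-update idiom, not what our playbook runs. The function name is ours, and the certificate paths are arguments because they depend on where certbot wrote its `live/` directory.

```shell
# Sketch: create or update a kubernetes.io/tls Secret from two PEM files.
# `--dry-run=client -o yaml` renders the Secret locally; piping it to
# `kubectl apply` updates the Secret if it already exists.
upload_tls_secret() {
  local name="$1" ns="$2" cert="$3" key="$4"
  kubectl create secret tls "$name" --namespace "$ns" \
    --cert="$cert" --key="$key" \
    --dry-run=client -o yaml | kubectl apply -f -
}

# Usage (paths are placeholders for certbot's output files):
#   upload_tls_secret wildcard-eu-center-tls ingress-nginx \
#     /path/to/fullchain.pem /path/to/privkey.pem
```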

Reference the secret in your Ingress

spec:
  tls:
    - hosts: ["*.region.example.com"]
      secretName: wildcard-region-tls
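
Once an Ingress references the secret, one way to confirm the wildcard is really being served is to ask the ingress for its certificate over TLS. The sketch below uses `openssl s_client`. The function name is hypothetical and the hostname in the usage line is a placeholder for any name under your wildcard.

```shell
# Sketch: print subject, issuer and expiry of the certificate a host serves.
# -servername sets SNI so the ingress picks the right certificate.
served_cert_info() {
  local host="$1" port="${2:-443}"
  openssl s_client -connect "${host}:${port}" -servername "$host" \
    </dev/null 2>/dev/null \
    | openssl x509 -noout -subject -issuer -enddate
}

# Usage:
#   served_cert_info anything.region.example.com
```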

When does it run?

We run the playbook roughly every 60 days. Let's Encrypt certs are valid for 90 days, so a 60-day cadence leaves a 30-day buffer if a run fails. A simple cron job on a bastion host would be enough to automate it, but we have not needed to: the cost of a manual run every two months is lower than the cost of debugging a webhook.
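
The buffer arithmetic is easy to check mechanically. The sketch below has two helpers: one computes whole days until expiry from two epoch timestamps, the other wraps `openssl x509 -checkend` to ask whether a PEM file expires within a given number of days. Both function names are our own and the file path is a placeholder.

```shell
# days_left <not_after_epoch> <now_epoch>
# Whole days between "now" and the certificate's notAfter time.
days_left() {
  echo $(( ($1 - $2) / 86400 ))
}

# cert_expires_within <pem_file> <days>
# Succeeds when the certificate expires within <days> days.
# `openssl x509 -checkend N` exits non-zero if the cert expires within
# N seconds; note it also exits non-zero if the file cannot be read.
cert_expires_within() {
  ! openssl x509 -noout -checkend "$(( $2 * 86400 ))" -in "$1" >/dev/null 2>&1
}

# Usage:
#   days_left 7776000 5184000      # a 90-day cert checked on day 60
#   cert_expires_within /path/to/fullchain.pem 30 && echo "re-run the playbook"
```

A check like this could serve as a reminder between manual runs without automating the issuance itself.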

"cannot import name 'appengine'": a real gotcha we hit

If you copy this playbook and your certbot is from your distro's package manager, you may hit:

ImportError: cannot import name 'appengine' from 'urllib3.contrib'

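This is a Python environment collision. The distro's certbot (often 1.21) was built against urllib3 1.x, but Python finds a newer urllib3 first in your user site-packages (for example ~/.local/lib/python3.10/site-packages). urllib3 2.x no longer ships the urllib3.contrib.appengine module, so the import fails.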

Quick fix: add PYTHONNOUSERSITE: "1" to the certbot task's environment so Python ignores the per-user site-packages directory:

- name: Issue wildcard certificate
  environment:
    PYTHONNOUSERSITE: "1"
  command: >
    certbot certonly --manual ...

Long-term fix — install certbot via snap or pipx so it has its own Python env.
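
To confirm the diagnosis on your own machine, compare which urllib3 Python resolves with and without the user site-packages directory. This is a sketch with a helper name of our own. It assumes `python3` is the interpreter your system certbot uses, which may not hold on every distro.

```shell
# Sketch: show the version and location of the urllib3 that python3 imports.
# Assumption: python3 is the interpreter behind the system certbot.
show_urllib3() {
  python3 -c 'import urllib3; print(urllib3.__version__, urllib3.__file__)'
}

# Usage:
#   show_urllib3                      # may resolve to ~/.local/lib/...
#   PYTHONNOUSERSITE=1 show_urllib3   # user site-packages ignored
#
# Isolated installs for the long-term fix:
#   pipx install certbot
#   sudo snap install --classic certbot
```

If the two calls print different paths, the user-level copy is shadowing the system one.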

Should you do it this way?

Probably not. If cert-manager supports your DNS provider, either through a built-in DNS-01 solver (Cloudflare, Route53, DigitalOcean, Google Cloud DNS) or a well-maintained webhook, use cert-manager for both per-app and wildcard certs. It is simpler and renews automatically.

The hybrid model only makes sense when:

  • Your DNS provider has no first-party or stable cert-manager support
  • You have one wildcard, not many
  • You would rather audit a 30-line shell script than a webhook deployment

For us those three are all true. For most teams, only the first might be — and even then, switching DNS provider is often easier than maintaining a webhook.

TL;DR

  • Per-app domains → cert-manager + HTTP-01 + ClusterIssuer. One annotation per Ingress, automatic renewals.
  • Wildcards → DNS-01 is mandatory. Use cert-manager's built-in DNS-01 solver or a stable webhook for your DNS provider if one exists. Otherwise, run certbot --manual from Ansible every 60 days and load the result into a TLS Secret.
  • Two tools is fine. Don't force one model onto two different problems.

Want to skip TLS entirely?

Hostim.dev does this for you. Bring a Docker image or a git repo, get a cert and a domain.
