DEV Community

Zoe

Posted on • Originally published at s1lv3r.codes

Automatic LetsEncrypt certificates with Tailscale and Traefik in Docker

Just want it to work? Skip to "The solution" below. Want to know why it works, along with some backstory? Read on :)

Recently (well, not really: back in 2022), Traefik released support for obtaining LetsEncrypt certs for Tailscale hosts. When configured correctly, Traefik can talk to a locally running Tailscale daemon and request a certificate for the local machine through Tailscale's certificate services.

After some research, this seemed to align perfectly with my goals, so I gave it a try. Reading up on the documentation a bit, I figured it would be enough to put Traefik on the same network as Tailscale, so I tried the following:

services:
  tailscale:
    # redacted for example
  traefik:
    network_mode: service:tailscale

However, when trying this out I kept getting a strange error... Traefik was unable to communicate with the Tailscale daemon even though it was running on the same network!

Looking into the logs and source, I discovered that Traefik tries to reach Tailscale through the local socket at /var/run/tailscale/tailscaled.sock. That obviously isn't passed through by the network option above, so I added a shared volume:

services:
  tailscale:
    volumes: [var_run_tailscale:/var/run/tailscale]
  traefik:
    volumes: [var_run_tailscale:/var/run/tailscale]

volumes:
  var_run_tailscale:

But, this also didn't work! At this point I got stumped for a solid while. It should've worked, no? Traefik has access to the Tailscale socket? But no! Digging into the containers themselves and checking the status of the socket, I see this:

❯ docker compose exec traefik sh
/ # stat /var/run/tailscale/tailscaled.sock
  File: '/var/run/tailscale/tailscaled.sock' -> '/tmp/tailscaled.sock'
#...

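Aha! The socket, for whatever reason, is actually a symlink to /tmp?? That explains the failure: a symlink is resolved inside the container that follows it, and Traefik's own /tmp contains no tailscaled.sock, so the link dangles there. It still confused me, as on my non-Docker install of Tailscale the socket is simply a regular socket.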
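To make the dangling-symlink effect concrete, here is a minimal Go sketch. It does not touch Docker or Tailscale at all: the helper names isDangling and demo are made up for illustration, and the temp directory merely stands in for the shared volume. The link's target path does not exist for the process that follows it, which is the situation Traefik appears to be in.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// isDangling reports whether path is a symlink whose target does not exist
// from the point of view of the current process.
func isDangling(path string) (bool, error) {
	info, err := os.Lstat(path) // Lstat inspects the link itself
	if err != nil {
		return false, err
	}
	if info.Mode()&fs.ModeSymlink == 0 {
		return false, nil // not a symlink at all
	}
	_, err = os.Stat(path) // Stat follows the link to its target
	if errors.Is(err, fs.ErrNotExist) {
		return true, nil
	}
	return false, err
}

// demo recreates the situation in a temp dir (a stand-in for the shared
// volume): a link named tailscaled.sock pointing at a path that is missing.
func demo() (bool, error) {
	dir, err := os.MkdirTemp("", "sockdemo")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	link := filepath.Join(dir, "tailscaled.sock")
	missing := filepath.Join(dir, "not-mounted", "tailscaled.sock")
	if err := os.Symlink(missing, link); err != nil {
		return false, err
	}
	return isDangling(link)
}

func main() {
	dangling, err := demo()
	fmt.Println("dangling:", dangling, "err:", err)
}
```

The link itself is visible, just as in the stat output above, but anything that follows it gets "no such file or directory".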

Talking with a friend, I was told that sockets are usually put in /tmp in rootless containers. However, Tailscale seems to run rootful: running ps aux in the container reveals tailscaled running as root, so I don't think that's the explanation in this case.

So, more source code digging is required. Looking into the container, I see it starts a binary called containerboot. I checked the tailscale repo and found the matching file: cmd/containerboot/tailscaled.go. Nestled in there, on lines 78-82, there is the following:

case cfg.StateDir != "":
  args = append(args, "--statedir="+cfg.StateDir)
default:
  args = append(args, "--state=mem:", "--statedir=/tmp")
}

And where does cfg come from? Well, it is initialized in cmd/containerboot/main.go by calling configFromEnv(), which is defined in cmd/containerboot/settings.go like this:

func configFromEnv() (*settings, error) {
  cfg := &settings{
    // ...
    Socket: defaultEnv("TS_SOCKET", "/tmp/tailscaled.sock"),
    // ...
  }
}

Aha! It defaults to putting the socket in /tmp (for whatever reason)! Why this was introduced I have NO idea. I tried to dig around a bit and got nowhere [1], but this does tell me what to change to get it to work properly.
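The excerpt above doesn't show defaultEnv itself. Judging by its name and arguments, it presumably returns the environment variable when it is set and the fallback otherwise. The sketch below illustrates that assumed behaviour with a hypothetical helper called envOr; it is not the Tailscale implementation.

```go
package main

import (
	"fmt"
	"os"
)

// envOr returns the value of the variable name as reported by lookup,
// or def when the variable is not set. The lookup function is a parameter
// so the behaviour can be exercised without touching the real environment.
func envOr(lookup func(string) (string, bool), name, def string) string {
	if v, ok := lookup(name); ok {
		return v
	}
	return def
}

func main() {
	// With TS_SOCKET unset this prints the /tmp default seen in the excerpt;
	// with TS_SOCKET set it prints whatever path was configured.
	socket := envOr(os.LookupEnv, "TS_SOCKET", "/tmp/tailscaled.sock")
	fmt.Println("tailscaled socket path:", socket)
}
```

If that reading is right, setting TS_SOCKET on the Tailscale container is all it takes to move the socket out of /tmp.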

The solution

So with these in mind, there are a few required properties for the compose file:

  • Traefik has to use Tailscale's networking to allow using the tailnet
  • Traefik needs R/W access to tailscaled.sock to be able to request a certificate
  • Tailscale needs to put the socket in the correct location

This gives me the following file in the end (only including keys required for this specific config):

services:
  traefik:
    network_mode: service:tailscale
    command:
      - --certificatesResolvers.tailscale.tailscale=true
      - --entrypoints.websecure.address=:443
      - --entrypoints.websecure.http.tls.certResolver=tailscale
    volumes:
      - var_run_tailscale:/var/run/tailscale:rw,Z
    depends_on:
      tailscale:
        required: true
        condition: service_healthy

  tailscale:
    volumes:
      - var_run_tailscale:/var/run/tailscale:rw,Z
    environment:
      TS_SOCKET: /var/run/tailscale/tailscaled.sock
      TS_USERSPACE: "false" # Traffic goes through traefik, not directly through tailscale itself
      TS_HOSTNAME: your-app.tailscale-domain.ts.net

Those options will expose Traefik to the tailnet, and allow it to get certificates for itself. Note that Traefik can only ever get a single cert, namely whatever matches the TS_HOSTNAME env var. This is a limitation of Tailscale, and while there is an open issue to get this changed (#7081), it has not seen activity in almost two years as of this writing, so I am not hopeful for it being implemented anytime soon.
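To sanity-check the result, one option is to dial the socket directly from wherever Traefik is supposed to see it. The sketch below is a generic Unix-socket probe, not anything Traefik or Tailscale ships; canConnect is a hypothetical helper, and the default path is the one Traefik looks for according to the logs mentioned earlier.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// canConnect dials a Unix domain socket and closes the connection again.
// A nil result means something is listening on path.
func canConnect(path string) error {
	conn, err := net.DialTimeout("unix", path, 2*time.Second)
	if err != nil {
		return err
	}
	return conn.Close()
}

func main() {
	// Default path taken from the article; pass another as the first argument.
	path := "/var/run/tailscale/tailscaled.sock"
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	if err := canConnect(path); err != nil {
		fmt.Println("socket not reachable:", err)
		return
	}
	fmt.Println("socket reachable:", path)
}
```

A dangling symlink or a mistyped TS_SOCKET shows up here as a dial error, while a successful connect only proves that something is listening, not that certificate requests will succeed.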


  1. Well, actually that's not completely true. I did find a few issues on it, most notably #6849 - Change default socket path in containers, as well as the commit where this was introduced (2c403cb). However, neither of these explains why it was done the way it was.
