TL;DR: The thing that finally broke me wasn't the compute minute cap itself — it was watching a critical hotfix sit in the queue for 14 minutes during a Friday afternoon deploy because every shared runner was saturated. GitLab's free tier gives you 400 minutes per month on shared runners. A cheap VPS running your own runner removes the cap, the queue time, and the shared-infrastructure limits; the rest of this article walks through the setup step by step.
📖 Reading time: ~31 min
What's in this article
- Why I Stopped Using GitLab's Shared Runners
- What You Actually Need Before Starting
- Step 1 — Install the GitLab Runner Binary
- Step 2 — Register the Runner with Your GitLab Instance
- Step 3 — Configure the Runner for Docker Executor
- Step 4 — Write a .gitlab-ci.yml That Actually Uses Your Runner
- The Rough Edges I Hit (And How to Fix Them)
- Locking Down the Runner: Security Basics You Shouldn't Skip
- Scaling Up: Running Multiple Runners or Moving to Kubernetes
- Quick Comparison: Self-Hosted Runner vs. GitLab SaaS Runners
Why I Stopped Using GitLab's Shared Runners
The thing that finally broke me wasn't the compute minute cap itself — it was watching a critical hotfix sit in the queue for 14 minutes during a Friday afternoon deploy because every shared runner was saturated. GitLab's free tier gives you 400 minutes per month on shared runners, and the paid tiers aren't much better when you're running a team with multiple pipelines firing constantly. On GitLab Premium you get 10,000 minutes/month per group, which sounds generous until you have a monorepo with a 25-minute test suite running on every merge request.
The cost math is where it gets embarrassing. GitLab SaaS charges around $10 per 1,000 additional minutes when you exhaust your allocation. If your team burns through 20,000 extra minutes in a month — completely normal for a mid-sized team with integration tests — that's $200 on top of your seat licenses. A $20/month Hetzner VPS (their CX21, 2 vCPU, 4GB RAM) or a $24/month DigitalOcean Droplet runs unlimited minutes. I ran the numbers after three months of overages and the self-hosted runner paid for itself in the first week of that billing cycle. The math only gets more extreme if you're on AWS and can reserve EC2 capacity.
Docker-in-Docker is where shared runners fall apart architecturally, not just economically. GitLab's shared fleet uses a locked-down executor config that makes DinD either broken or requires the --privileged flag, which shared runners won't give you. If you're building container images in CI — and most teams are — you're either fighting with kaniko workarounds or you're getting inexplicable failures that disappear the moment you run it locally. Same story with large artifacts: shared runners have a 1GB artifact size limit and will silently truncate or fail uploads beyond that. I have a pipeline that produces a 2.3GB compiled binary for an embedded target. That pipeline simply cannot run on shared infrastructure.
Monorepos accelerate the pain in a specific way. If you're using rules: changes: to only run affected service pipelines, you still end up with a lot of concurrent jobs that each need 8–12 minutes. Queue time on shared runners compounds this — I've seen total wall-clock time from push to green badge hit 45 minutes, where the actual compute was only 18 minutes of that. A self-hosted runner with a concurrent = 8 config on a machine you control collapses that to near-actual-compute-time. That's the difference between developers context-switching three times per PR and just waiting once.
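For reference, the rules: changes: pattern mentioned above looks roughly like this. A minimal sketch; the service name and paths are placeholders for your own layout:
# Only run the billing tests when files under services/billing/ change
test-billing:
  stage: test
  image: node:20-alpine
  rules:
    - changes:
        - services/billing/**/*
  script:
    - npm --prefix services/billing ci
    - npm --prefix services/billing test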
One other scenario that doesn't get mentioned enough: network access. Shared runners can't reach your internal services — your staging database, your private npm registry, your internal Docker registry. You end up with elaborate workarounds like exposing services over the public internet with auth tokens, or maintaining a second "integration" pipeline that's semi-manual. Self-hosted runners live inside your VPC by default. Your pipeline can talk to postgres.internal:5432 directly. That single change eliminated an entire category of flaky tests for us.
What You Actually Need Before Starting
The thing that trips most people up isn't the runner registration — it's showing up with a 1GB RAM droplet because it "looks fine" in the GitLab docs. I've done this. Your pipeline starts, spins up a Docker container, pulls a Node image, and then just... hangs. Then OOM-killer fires. Then you're debugging a zombie job at 11pm wondering why your CI is flaky. 2 vCPU and 4GB RAM is the real minimum for anything that involves Docker-in-Docker, building images, or running test suites. For a simple shell executor running bash scripts, you can get away with 2GB — but I wouldn't bother trying to go lower.
You've got four main executor types to choose from, and the naming is confusing enough that I want to be direct about this:
- shell — runs jobs directly on the host. Fast, but jobs share state, leave artifacts behind, and can poison each other. Use this only for quick internal tooling where isolation doesn't matter.
- docker — spins up a fresh container per job using a specified image. This is what you should start with. Clean environment every time, no leftover state, reproducible builds.
- docker+machine — autoscales by provisioning new VMs on demand via Docker Machine. Docker Machine is effectively abandoned upstream, so I wouldn't start anything new on this in 2024.
- kubernetes — runs each job as a pod. Powerful, but the configuration surface area is large. Save this for when you actually need horizontal scaling and already know Kubernetes.
Start with the docker executor. The mental model is simple, the debugging is straightforward (docker ps and docker logs are your friends), and it handles 90% of real CI workloads cleanly. You can always migrate to Kubernetes later — but starting there means you're debugging pod scheduling issues before you've even gotten a green pipeline.
GitLab runner versioning is one of those things where the docs bury the important bit. The runner version should match your GitLab instance version, or be no more than one minor version behind. So if you're on GitLab 16.11, runner 16.10 is fine — runner 15.x is not. Running a significantly older runner against a newer GitLab instance causes silent weirdness: job variables don't get passed correctly, certain CI features get ignored without error. Check your GitLab version at Help → About GitLab or via the API at /api/v4/version before you download anything.
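If you'd rather script that check than click through the UI, the version endpoint returns the same thing. A quick sketch, assuming a personal access token with read_api scope and gitlab.example.com as a stand-in for your instance:
# Returns the running GitLab version as JSON
curl --silent --header "PRIVATE-TOKEN: <your_access_token>" \
  "https://gitlab.example.com/api/v4/version"
# expect something like {"version":"16.11.1","revision":"..."}; match the runner's minor version to that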
Here's the exact prerequisite checklist I run through before touching a single command:
- A GitLab account with Maintainer or Owner role on the project or group you're setting the runner up for — Reporter and Developer roles can't register runners.
- A Linux VM running Ubuntu 22.04 LTS. This is what I use throughout, and it's the distro with the longest documented support path for the GitLab runner package repo.
- Docker installed and the daemon running. Verify with:
# Both of these should return without error before you proceed
docker --version
# Docker version 25.0.3, build 4debf41
sudo systemctl is-active docker
# active
If Docker is installed but not running, sudo systemctl enable --now docker handles it. Also make sure the user you'll run the runner as is in the docker group — otherwise every job will fail with a socket permission error that looks completely unrelated to permissions at first glance: sudo usermod -aG docker gitlab-runner, then log out and back in for it to take effect.
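One more sanity check worth running once the gitlab-runner user exists (the package install in Step 1 creates it): confirm that user can actually reach the Docker socket before any job tries to. A small sketch:
# Run a harmless Docker command as the service user
sudo -u gitlab-runner -H docker info --format '{{.ServerVersion}}'
# a version number back means socket permissions are fine; a permission denied
# error on /var/run/docker.sock means the group change hasn't taken effect yet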
Step 1 — Install the GitLab Runner Binary
The thing that bit me first time around was installing the runner binary from the default Ubuntu repos — don't. The version there is months behind and you'll hit registration API mismatches that produce completely unhelpful error messages. Always pull from GitLab's own package repo.
Add the official repo and install in two commands:
# Adds GitLab's apt repo and GPG key — safe to re-run on existing machines
curl -L https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh | sudo bash
# Installs latest stable release
sudo apt-get install gitlab-runner
That script auto-detects your distro (Debian, Ubuntu, Raspbian) and sets up the right apt sources. After the install finishes, the runner process starts automatically as a systemd service and sets up a gitlab-runner system user. You don't need to do anything extra to make it persist across reboots — that's already handled.
If your self-hosted GitLab instance isn't running the absolute latest version, pin the runner to match. GitLab's compatibility policy says the runner minor version should not exceed the GitLab server minor version — violate this and you'll see subtle job failures that are hard to trace. Pin like this:
# Check available versions first
apt-cache madison gitlab-runner
# Install a specific version — here matching a GitLab 16.9 instance
sudo apt-get install gitlab-runner=16.9.0
To lock it so apt upgrade doesn't silently bump it later:
sudo apt-mark hold gitlab-runner
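When you later upgrade GitLab itself, the hold means the runner won't follow automatically, which is exactly what you want; lifting it is a deliberate step. A sketch of the lockstep upgrade, with the version as a placeholder for whatever your new GitLab minor is:
# Lift the hold, move to the matching runner release, re-apply the hold
sudo apt-mark unhold gitlab-runner
sudo apt-get update
sudo apt-get install gitlab-runner=16.10.0
sudo apt-mark hold gitlab-runner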
Verify the install and confirm the service is actually running:
# Should output version, architecture, and Git revision
gitlab-runner --version
# Look for "active (running)" — if it's failed, check journalctl -u gitlab-runner
sudo systemctl status gitlab-runner
The ARM gotcha is real. The install script should detect architecture automatically, but I've seen it pull x86_64 binaries on Graviton instances when the OS metadata was misconfigured. Before you do anything else on an ARM machine — AWS Graviton2/3, Raspberry Pi 4, Ampere Altra — run uname -m and confirm you get aarch64 back. Then cross-check with gitlab-runner --version which prints the architecture field explicitly. If those don't match, you'll get a silent crash on first job execution with no useful output in the log.
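For the record, here are the two checks from that paragraph as I'd run them on an ARM box; the output lines are what I'd expect, not guaranteed formatting:
# Kernel architecture: should be aarch64 on Graviton, Pi 4, or Ampere
uname -m
# aarch64
# Runner binary architecture: look at the OS/Arch line in the version output
gitlab-runner --version
# if it reports amd64 on an aarch64 host, remove the package and reinstall the arm64 build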
Step 2 — Register the Runner with Your GitLab Instance
The token location changed in GitLab 16.x and it trips up a lot of people following older tutorials. You no longer grab a shared registration token from the admin panel — you create a runner object first through the UI, and then get the token. Go to your project → Settings → CI/CD → Runners → New project runner. Fill in the platform (Linux), add your tags, optionally check "Run untagged jobs," hit Create, and only then do you get a one-time registration token. That token is prefixed with glrt- now, not the old 20-character alphanumeric string. Don't confuse the two — the old gitlab-runner register --registration-token flow is deprecated since 16.0 and slated for full removal in a later major release.
Once you have that glrt- token, run the interactive registration on your server:
sudo gitlab-runner register
Here's exactly what I enter for each prompt — and what the wrong answer costs you later:
- GitLab instance URL: https://gitlab.com or your self-hosted domain. No trailing slash. Get this wrong and the runner silently fails to connect.
- Registration token: paste the glrt- token exactly. It's single-use — if you mess up the rest of the prompts, you'll need to generate a new one from the UI.
- Description: something machine-readable like prod-runner-01-do-nyc3. You'll thank yourself when you have four runners and need to know which one is hung.
- Tags: docker,linux,x86_64 — I'll expand on this below.
- Optional maintenance note: skip it, just hit Enter.
- Executor: docker — this is the only answer that makes sense for isolated CI jobs.
- Default Docker image: docker:24.0 if most of your jobs build images, or ubuntu:22.04 if they're general-purpose scripts. This is the fallback when a job doesn't specify image: in its YAML.
Tags are where I've seen teams shoot themselves in the foot. Whatever you type here gets embedded in config.toml, and your .gitlab-ci.yml will need to match exactly. Pick a convention once and never change it mid-project. I use docker,linux,x86_64 because I have ARM runners too (docker,linux,arm64), and this lets me pin architecture-sensitive builds without any ambiguity. If you just use a vague tag like self-hosted, you'll regret it the first time you introduce a second machine with a different arch.
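If you provision runners from a script or cloud-init instead of typing answers, the same registration goes in one command. A sketch using the glrt- token flow, with every value a placeholder:
# Non-interactive registration, equivalent to the prompts above
sudo gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.com" \
  --token "glrt-YOUR_ONE_TIME_TOKEN" \
  --executor "docker" \
  --docker-image "ubuntu:22.04" \
  --description "prod-runner-01-do-nyc3"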
After registration completes, your runner config lives at /etc/gitlab-runner/config.toml. Here's what the real file looks like — not a stripped-down example:
concurrent = 4
check_interval = 0
shutdown_timeout = 0
[session_server]
session_timeout = 1800
[[runners]]
name = "prod-runner-01-do-nyc3"
url = "https://gitlab.com"
id = 28374651
token = "glrt-t3_Abc123xyz789EXAMPLE"
token_obtained_at = 2024-03-15T10:22:11Z
token_expires_at = 0001-01-01T00:00:00Z
executor = "docker"
[runners.custom_build_dir]
[runners.cache]
MaxUploadedArchiveSize = 0
[runners.cache.s3]
[runners.cache.gcs]
[runners.cache.azure]
[runners.docker]
tls_verify = false
image = "ubuntu:22.04"
privileged = false
disable_entrypoint_overwrite = false
oom_kill_disable = false
disable_cache = false
volumes = ["/cache"]
shm_size = 0
network_mtu = 0
Two things in that config to address immediately after registration. First, concurrent = 4 is the global job limit across all runners in this process — the default is 1, which means your runner queues everything serially. I bump it to match the CPU core count of the machine. Second, if any of your jobs build Docker images (not just run inside Docker), you need privileged = true under [runners.docker]. Don't set it globally out of laziness — create a separate tagged runner with privilege for those specific jobs. Running all jobs as privileged defeats the entire isolation model.
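In practice that means two [[runners]] entries in the same config.toml, pointing at two runners created separately in GitLab. A sketch, with names and tokens as placeholders; the second runner is the one you'd create with a dind tag:
concurrent = 4

# General-purpose runner: unprivileged, takes the bulk of jobs
[[runners]]
  name = "prod-runner-01-do-nyc3"
  url = "https://gitlab.com"
  token = "glrt-FIRST_RUNNER_TOKEN"
  executor = "docker"
  [runners.docker]
    image = "ubuntu:22.04"
    privileged = false
    volumes = ["/cache"]

# Image-build runner: privileged, created in the UI with a dind tag so only
# jobs that explicitly ask for it land here
[[runners]]
  name = "prod-runner-01-dind"
  url = "https://gitlab.com"
  token = "glrt-SECOND_RUNNER_TOKEN"
  executor = "docker"
  [runners.docker]
    image = "docker:24.0"
    privileged = true
    volumes = ["/certs/client", "/cache"]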
Step 3 — Configure the Runner for Docker Executor
The config.toml Block That Actually Works
The registration command spits out a basic config.toml, but the defaults will bite you. Here's the block I land on for most Docker executor setups — stored at /etc/gitlab-runner/config.toml:
concurrent = 4 # tune to your vCPU count — don't go higher than 2x cores
[[runners]]
name = "my-docker-runner"
url = "https://gitlab.com"
token = "YOUR_RUNNER_TOKEN"
executor = "docker"
[runners.docker]
image = "alpine:3.19"
privileged = false # flip to true only if you need dind
disable_entrypoint_overwrite = false
oom_kill_disable = false
disable_cache = false
pull_policy = ["if-not-present", "always"] # saves bandwidth on repeated jobs
volumes = ["/cache", "/var/run/docker.sock:/var/run/docker.sock"]
shm_size = 0
The pull_policy array is something a lot of people miss. With ["if-not-present", "always"], GitLab Runner tries the local image cache first, then falls back to pulling. This cuts job startup time meaningfully on runners that keep running the same base image (say, node:20-alpine a hundred times a day). Set it to just "always" if you're paranoid about stale layers in CI, but expect slower cold starts.
concurrent = 4 Is Not a Suggestion — It's a Throttle
The concurrent value at the top of the file is a global cap on how many jobs run simultaneously across all runners defined in that config. I've seen teams set this to 1 by accident, then wonder why their pipeline queues up even though the box has 16 cores and 32 GB of RAM. A sensible starting rule: match it to your physical or vCPU count. If jobs are memory-bound (big Node installs, Java builds), use half the core count. If they're mostly shell scripts and linting, you can go 1.5–2x. Watch htop during a busy pipeline run and tune from there.
The Docker Socket Mount: Convenient, But Know What You're Doing
The /var/run/docker.sock:/var/run/docker.sock volume mount lets your CI jobs talk directly to the host Docker daemon. That means a docker build inside a job uses the real host Docker, with real layer caching, with no nested containers. The speed advantage is real — Docker-in-Docker (dind) adds 10–30 seconds of daemon startup per job.
The tradeoff is significant though. Any job that runs on this runner has effective root on the host machine. A malicious or compromised job could read secrets from other containers, modify host volumes, or pull images and run arbitrary processes. I use socket mounting on internal runners on private networks where all developers already have SSH access to the box anyway. If you're running shared runners, or if third-party code runs in your pipelines, don't do this. Use dind instead, and accept the startup cost.
Docker-in-Docker When You Actually Need to Build Images
If your pipeline builds Docker images and you can't use socket mounting, you need dind. That means setting privileged = true in your runner config and referencing the docker:dind service in your .gitlab-ci.yml:
# In your .gitlab-ci.yml
build-image:
  image: docker:26.1
  services:
    - docker:26.1-dind
  variables:
    DOCKER_TLS_CERTDIR: "/certs" # required since Docker 19.03 — don't skip this
    DOCKER_HOST: tcp://docker:2376
    DOCKER_TLS_VERIFY: 1
    DOCKER_CERT_PATH: "/certs/client"
  script:
    - docker build -t my-app:$CI_COMMIT_SHORT_SHA .
And in config.toml, under [runners.docker], set privileged = true. The DOCKER_TLS_CERTDIR variable is non-negotiable — skipping it drops you back to unencrypted TCP communication between the job container and the dind daemon, and GitLab's own docs had an inconsistency on this for a while that caused a lot of confusion. With privileged = true, you're giving the container full kernel capabilities. On a dedicated build box that nothing else sensitive touches, acceptable. On a shared host running your database? Absolutely not.
Restart the Runner or Nothing You Just Did Matters
Every single time you edit config.toml, run this:
sudo systemctl restart gitlab-runner
# verify it picked up your changes
sudo gitlab-runner verify
sudo gitlab-runner status
The runner does not hot-reload its config. I've lost a frustrating chunk of an afternoon wondering why concurrent = 8 wasn't taking effect, only to realize I'd never restarted after editing the file. The verify command talks to your GitLab instance and confirms the runner token is still valid and the connection is healthy — run it after every config change, not just restarts. If you're doing this on a remote box, also check sudo journalctl -u gitlab-runner -f to watch the logs live during your next pipeline trigger.
Step 4 — Write a .gitlab-ci.yml That Actually Uses Your Runner
The tag system is where most people get tripped up first. Your self-hosted runner won't automatically pick up jobs — GitLab will keep sending them to shared runners unless you explicitly tag your jobs to match the tags you registered your runner with. If you registered with docker,linux, every job that needs to land on your machine needs those same tags in its definition. No tags on the job means shared runners get it by default (assuming you haven't disabled them at the project level).
Here's a minimal pipeline that actually runs on your runner and does something useful — a Node.js 20 build with a real test command:
stages:
  - build
  - test

variables:
  NODE_ENV: "test"

build:
  stage: build
  image: node:20-alpine
  tags:
    - docker
    - linux
  cache:
    key: $CI_COMMIT_REF_SLUG
    paths:
      - node_modules/
  script:
    - node --version # sanity check — confirms you're on 20.x not some cached layer
    - npm ci # use ci not install; it respects package-lock.json exactly
    - npm run build

test:
  stage: test
  image: node:20-alpine
  tags:
    - docker
    - linux
  cache:
    key: $CI_COMMIT_REF_SLUG
    paths:
      - node_modules/
    policy: pull # don't re-upload cache from test job; build already did it
  script:
    - npm test
The policy: pull on the test job is the thing the docs bury three pages deep. Without it, both jobs try to upload the cache at the end, the second one wins, and you might end up with a cache state that reflects post-test installs rather than post-build. More importantly: the default cache key example in the official docs is often just key: "$CI_COMMIT_REF_NAME" — that works until you have branch names with slashes (like feature/auth-redesign), which silently fails to write the cache on some runner configurations. $CI_COMMIT_REF_SLUG is the slug-ified version, safe for any branch name. Use that one, always.
The other silent cache failure I've hit: if your runner's cache_dir isn't on persistent storage, the cache exists only for the lifetime of the container and does nothing across pipeline runs. Check your /etc/gitlab-runner/config.toml — the [runners.docker] section needs a volume mount if you want the cache to survive between jobs on different pipeline runs:
[[runners]]
name = "my-docker-runner"
executor = "docker"
[runners.docker]
image = "node:20-alpine"
volumes = ["/cache", "/var/run/docker.sock:/var/run/docker.sock"]
# /cache is the default cache path; GitLab runner manages it
# without this volume, cache writes succeed but reads find nothing next run
Once a job triggers, the single fastest way to confirm it landed on your runner and not a GitLab shared runner: open the job log and look at the very first few lines. You'll see something like this:
Running with gitlab-runner 16.11.0 (91a27b2c)
on my-homelab-runner xK3pQ9sR, system ID: r_abc123xyz
Preparing the "docker" executor
Using Docker executor with image node:20-alpine ...
That second line is the money line — it shows your runner's description and its short token ID. If the line names a GitLab.com shared runner instead, your tags aren't matching. Double-check the tags GitLab actually has on record for the runner (in the UI under Settings → CI/CD → Runners, or via the API call below), because it's easy to type docker,linux during registration and forget that docker, linux (with a space) registers as two separate tags. Safest bet: if there's any ambiguity, fix the runner's tag list or re-register; it's faster than reverse-engineering a mismatch from a pile of pending jobs.
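A sketch of that API check; the runner ID is the numeric id from config.toml or the runner's page in the UI, and the token is any personal access token with read_api scope:
# Returns the runner's details, including its tag_list
curl --silent --header "PRIVATE-TOKEN: <your_access_token>" \
  "https://gitlab.com/api/v4/runners/28374651"
# the tag_list array in the response is what your job-level tags: must match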
The Rough Edges I Hit (And How to Fix Them)
The tag mismatch thing trips up almost everyone the first time. You register the runner, it shows as online in GitLab's UI, you push a commit, and the job just sits at "pending" forever. Before you start debugging network issues or permissions, check two things: whether your runner has runs untagged jobs enabled, and whether your .gitlab-ci.yml has a tags: field that doesn't match what the runner was registered with. GitLab's matching logic is strict — if a job lists tags: [docker, linux] and your runner only has docker, that job will never pick up. Either fix the tags or enable untagged job pickup in the runner's settings under Settings → CI/CD → Runners.
Docker image pull failures inside jobs are almost always a network access problem, not a Docker problem. Your runner VM needs a clear path to whatever registry you're pulling from. If you're behind a corporate proxy, the runner process itself needs to know about it — setting HTTP_PROXY in your shell profile does nothing because systemd spawns the runner outside your user environment. The fix goes in /etc/gitlab-runner/config.toml:
[[runners]]
name = "my-docker-runner"
url = "https://gitlab.com"
executor = "docker"
environment = [
"HTTP_PROXY=http://proxy.corp.internal:3128",
"HTTPS_PROXY=http://proxy.corp.internal:3128",
"NO_PROXY=localhost,127.0.0.1,.corp.internal"
]
[runners.docker]
image = "alpine:3.19"
After editing config.toml, restart with sudo systemctl restart gitlab-runner. The runner picks up config changes on restart — no re-registration needed. One subtle gotcha: if you're pulling from a private registry that requires auth, proxy settings alone won't get you there; registry credentials are configured separately. Don't conflate proxy issues with auth issues — the error messages look different. Proxy failures usually say "no route to host" or timeout; auth failures say "unauthorized" or 403.
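For the auth side, the usual route with the Docker executor is a DOCKER_AUTH_CONFIG CI/CD variable holding a Docker-style credential blob. A sketch of generating the value; the registry hostname and credentials are placeholders, and the output belongs in a masked variable under Settings → CI/CD → Variables, never in the repo:
# Print a value suitable for a DOCKER_AUTH_CONFIG variable
printf '{"auths":{"registry.corp.internal":{"auth":"%s"}}}\n' \
  "$(printf 'ci-user:REGISTRY_PASSWORD' | base64 -w0)"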
The permission denied on /cache error shows up when you configure a shared cache volume and the runner process can't write to it. The gitlab-runner service runs as the gitlab-runner user, and if you created that directory as root, the fix is exactly one command:
# Run this on the host, not inside a job
sudo chown -R gitlab-runner:gitlab-runner /cache
sudo chmod 755 /cache
If you're mounting /cache as a Docker volume in the executor config, make sure the mount is also reflected in volumes inside [runners.docker] in config.toml — something like volumes = ["/cache:/cache"]. Missing that line means the directory exists on the host but the container never sees it.
Runners going offline randomly is almost always a network timeout, not a crash. The runner polls GitLab on a regular interval, and if the connection drops mid-poll, the runner marks itself offline and takes a while to reconnect. First, check what's actually happening:
sudo journalctl -u gitlab-runner -f
You'll usually see something like ERROR: Checking for jobs... error followed by a reconnect attempt. The default check_interval is 3 seconds — that's aggressive on flaky connections and can flood GitLab's API. Bumping it to 10 or 15 in config.toml reduces noise and makes the runner more resilient to brief outages:
# Top-level in config.toml, not inside [[runners]]
check_interval = 10
Disk filling up from Docker layer cache is a slow-burn problem that bites you after a few weeks of heavy CI usage. Each build pulls images, runs containers, and leaves behind dangling layers. On a runner with dozens of jobs per day, that fills a 50GB disk in under a month. The pragmatic fix is a daily cron:
# /etc/cron.daily/docker-prune
#!/bin/bash
docker system prune -f --filter "until=24h"
Make it executable with chmod +x /etc/cron.daily/docker-prune. The --filter "until=24h" flag is important — without it you'll prune images that are actively used in jobs and cause unnecessary re-pulls. The other lever is setting pull_policy = ["if-not-present"] in your runner's Docker config, which tells it to reuse locally cached images instead of pulling every time. Combine both approaches: prune stale stuff daily, but don't pull fresh layers you already have.
Locking Down the Runner: Security Basics You Shouldn't Skip
Start Narrow: Project-Level Runners First
The default mental model most people have when they first register a runner is "this thing runs my CI jobs." The security model you actually need is "this thing has access to my secrets, deploy keys, and production environment." That reframe changes everything about how you scope it. Register runners at the project level first. Don't click "Register an instance runner" and call it a day — instance-level runners are visible and available to every project on your GitLab instance by default. If someone on your team creates a throwaway repo to test something, that repo can now queue jobs on the same runner that has your AWS credentials in it.
Group-level runners are the right middle ground once you have a monorepo or a team where five projects legitimately share the same deploy pipeline. You get shared capacity without blasting credentials across your entire instance. The hierarchy is clear: project < group < instance. Move up only when you have a concrete reason, not because it's easier to register once and forget it.
Protected Runners Are Not Optional When Production Credentials Are Involved
GitLab's "protected" runner setting sounds like a generic checkbox but it does something specific and genuinely useful: it restricts the runner to only process jobs triggered by protected branches or protected tags. So your runner that holds the production deploy key won't fire on a branch some contractor pushed last Tuesday. Enable this in Settings → CI/CD → Runners → Edit runner → Protected. The corresponding side of this equation is making sure your main and release/* branches are actually marked as protected in the repository settings — otherwise the toggle does nothing.
I burned about two hours once debugging why a runner wasn't picking up jobs before realizing I'd marked the runner as protected but hadn't protected the branch. No error, just silent queue skipping. The jobs sat there labeled "pending" with no explanation in the UI until I dug into the runner logs.
The privileged = true Problem Is Worse Than the Docs Suggest
The GitLab docs mention that privileged = true is required for Docker-in-Docker builds and leave it at that. What they don't say loudly enough is that on a shared machine, this is essentially handing anyone who can push to the repo a root shell on the host. The container escape path is well-documented — mounting the host filesystem from inside a privileged container takes one command:
# What a malicious CI job can do inside a privileged container
docker run -v /:/host --rm alpine chroot /host sh
That's it. They're on your runner host with full access. If your runners share a VM with anything else — other projects, your monitoring stack, your secrets manager agent — you've got a problem. The fix is boring but correct: use a dedicated VM per runner if you need privileged = true, or switch to the Kubernetes executor with a proper pod security policy that drops capabilities. The Kubernetes executor runs each job in an isolated pod and you can set privileged = false in the runner's config.toml. Docker Machine executor is officially deprecated as of GitLab 16.x, so don't build new infrastructure around it.
# config.toml — safe Kubernetes executor config
[[runners]]
name = "k8s-runner"
executor = "kubernetes"
[runners.kubernetes]
namespace = "gitlab-runners"
privileged = false # never true unless you own every pod in that namespace
pull_policy = "always" # don't let stale images mask build problems
Token Rotation: GitLab 16.x Changed the Model, Use It
Before GitLab 16.0, runner registration tokens were static strings that lived in your group or project settings forever. If one leaked into a commit — and they do, I've seen it happen in .gitlab-ci.yml accidentally included in a variable dump — your only option was to reset the token and re-register every runner that used it, which meant touching config files on every runner host. GitLab 16.x introduced a new authentication token system where each registered runner gets its own token, and you can rotate individual runner tokens without affecting others.
The rotation flow from the CLI looks like this:
# Rotate a specific runner's token (GitLab 16.x+)
# First, get the runner ID from the GitLab UI or API
curl --request POST \
--header "PRIVATE-TOKEN: " \
"https://gitlab.example.com/api/v4/runners/42/reset_authentication_token"
# Response gives you the new token — update config.toml on the runner host
# Old token is immediately invalidated
If you're still on the old registration token model, GitLab is deprecating it — the exact cutoff date has shifted across minor versions, but 17.x is expected to fully remove it. Migrate now by re-registering runners using gitlab-runner register with --token instead of the legacy --registration-token flag. Also set up a secret scanning job in your pipeline to catch tokens before they hit the remote — GitLab's built-in secret detection template catches runner tokens specifically.
Scaling Up: Running Multiple Runners or Moving to Kubernetes
Registering a Second Runner vs. Bumping concurrent — You're Probably Reaching for the Wrong One
The instinct when jobs start queuing is to register another runner. I did this on a $20 VPS and ended up with two runners competing for the same 2 vCPUs and 4GB RAM, which made everything slower, not faster. The right call depends on what's actually bottlenecked. If your jobs are CPU-bound and you have headroom on the machine, increase concurrent in config.toml first — it's a one-line change and you don't pay the overhead of spawning a second runner process:
# /etc/gitlab-runner/config.toml
concurrent = 4 # was 1 — lets this single runner process 4 jobs simultaneously
[[runners]]
name = "primary"
executor = "docker"
...
Register a second runner (on the same machine or a different one) when you need isolation, not raw parallelism. Separate runners make sense when you have jobs that need different Docker socket access levels, or when you want a dedicated runner for a specific project that shouldn't compete with everything else. On the same machine, two runners still share the host's resources — you're not getting more compute, you're just getting separate queues. On a different machine, now you're actually scaling horizontally, which matters once a single host genuinely can't keep up.
Docker Machine Executor: Still Works, But Read the Exit Sign
Docker Machine auto-scaling was the way to spin up cloud VMs on demand per job. A job comes in, GitLab Runner calls Docker Machine, a VM appears on AWS or GCP, the job runs, the VM dies. For a while this was genuinely impressive. The config looked like this:
[[runners]]
executor = "docker+machine"
[runners.machine]
IdleCount = 1
MachineDriver = "amazonec2"
MachineName = "gitlab-runner-%s"
MachineOptions = [
"amazonec2-instance-type=t3.medium",
"amazonec2-region=us-east-1",
"amazonec2-vpc-id=vpc-xxxxxxxx"
]
The problem is Docker Machine itself is effectively dead upstream. GitLab maintains a fork, but the project isn't getting meaningful new features. GitLab has been explicit that the replacement is Next Runner Auto-scaling with Fleeting — a plugin-based approach where Fleeting manages the instance lifecycle and the runner handles job routing on top of it. If you're building a new auto-scaling setup, don't start with Docker Machine. The new taskscaler + Fleeting combo is documented at docs.gitlab.com/runner/fleet_scaling and while the config is more verbose, it's not going anywhere.
Kubernetes Executor: The Setup That Actually Scales, With a Config Tax
If you already run Kubernetes — meaning you have a cluster, you use kubectl regularly, you're not standing up k8s just for CI — the Kubernetes executor is genuinely elegant. Each job gets its own pod. No shared state between jobs at the OS level. You get resource requests and limits per job, and you can target specific node pools for specific runners. The setup is straightforward: point the runner at your kubeconfig and specify a namespace.
[[runners]]
name = "k8s-runner"
executor = "kubernetes"
[runners.kubernetes]
host = "" # empty = uses in-cluster config or KUBECONFIG env
namespace = "gitlab-runner"
image = "alpine:3.19"
# these apply to the build container in every job pod
cpu_request = "500m"
memory_request = "512Mi"
cpu_limit = "2"
memory_limit = "2Gi"
# pull secrets if your jobs use private registries
image_pull_secrets = ["registry-credentials"]
# attach a PVC for caching — this is where it gets complicated
[[runners.kubernetes.volumes.pvc]]
name = "runner-cache"
mount_path = "/cache"
The thing that caught me off guard: the Kubernetes executor runs three containers per job pod by default — the build container, a helper container, and a services container for things like postgres or redis defined in your .gitlab-ci.yml. Each of those needs resource limits. Miss one and you'll hit OOMKill on jobs that seemed fine locally. Caching is also non-trivial — the default is no persistent cache between jobs, so you either mount a PVC, use S3-compatible storage (MinIO works fine for self-hosted), or accept rebuilding your node_modules on every run. Add distributed caching config on top of the executor config and config.toml grows fast.
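The helper and service limits that paragraph warns about have their own keys under [runners.kubernetes]. A sketch with starting-point values, not recommendations:
  [runners.kubernetes]
    # build-container limits are the ones shown above; the helper and service
    # containers need their own, or they run effectively unbounded
    helper_cpu_request = "100m"
    helper_memory_request = "128Mi"
    helper_cpu_limit = "500m"
    helper_memory_limit = "256Mi"
    service_cpu_request = "250m"
    service_memory_request = "256Mi"
    service_cpu_limit = "1"
    service_memory_limit = "1Gi"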
The Honest Trade-off
Kubernetes executor gives you real resilience — a node dies, your runner pod reschedules, jobs retry. You get actual multi-tenant isolation and you can scale the runner deployment itself with HPA. But the complexity cost is real. I've seen teams spend two days debugging why jobs were failing with exit code 137 (OOMKill on the helper container) because nobody set limits on the helper image. The config.toml for a production k8s executor with caching, custom node selectors, pod annotations for Datadog, and pull secrets is 80+ lines before you've added a second runner definition. Start here only if k8s is already part of your operational vocabulary. If you're running a team of three on a couple of VMs, bump concurrent, add a second shell runner on a beefier host, and revisit Fleeting-based auto-scaling when the queue times actually hurt.
Quick Comparison: Self-Hosted Runner vs. GitLab SaaS Runners
The thing that surprises most people: shared SaaS runners aren't bad. I ran a side project on them for over a year and never hit a wall. The problems start when you scale — either by job volume, job duration, or infrastructure requirements your pipelines can't fake their way around.
Here's the honest breakdown side by side:
| Factor | GitLab SaaS Runners | Self-Hosted Runners |
| --- | --- | --- |
| Cost | 400 CI/CD minutes free/month on Free tier; Premium starts at $29/user/month with 10,000 min | Your infra cost only — EC2, bare metal, or a spare machine in the closet |
| Job queue time | Unpredictable — I've seen 30s waits and I've seen 4-minute queues during peak hours | Near-zero if you size your runner pool correctly; you control concurrency |
| Customization | Limited to what GitLab exposes — pre-installed tools, fixed OS images | Full control: custom Docker images, kernel flags, mounted NFS volumes, anything |
| Maintenance burden | Zero — GitLab manages upgrades, scaling, and incident response | You own it: runner upgrades, token rotation, executor config, autoscaling logic |
| Docker-in-Docker | Supported but slow; shared runners use overlay2 with restrictions | Works cleanly with privileged = true in your executor config — full control |
| Caching speed | S3-backed distributed cache — adds latency, especially for large node_modules | Local disk cache is dramatically faster; I cut a Node.js install step from 90s to 8s |
When shared SaaS runners are genuinely fine
Small teams with infrequent commits don't burn through 400 minutes fast. If your team pushes a few times a day and your jobs finish in under 5 minutes each, you may never hit the limit. Open source projects on GitLab.com free tier actually get a solid deal — public projects have historically received more generous minute allocations. If your pipeline is: lint → test → build and nothing in there needs special hardware or secrets that can't live in GitLab CI variables, shared runners cover you completely.
When self-hosting is worth the operational pain
Four situations where I'd self-host without hesitation:
- Long-running jobs — SaaS runners enforce a job timeout (default 60 minutes, max varies by plan). If you're running integration test suites, ML training scripts, or large binary builds that push past that, you're fighting the platform instead of using it. On a self-hosted runner you set the ceiling yourself; see the snippet after this list.
- Custom hardware — GPU-accelerated tests, ARM cross-compilation, FPGA toolchains. You can't get a shared runner with an A100 on it. I've run GitLab runners on bare metal with NVIDIA drivers and it just works once the executor is configured right.
- Air-gapped environments — Regulated industries (finance, defense, healthcare) where your build artifacts can't touch the public internet. Self-hosted is the only real answer here; you run the runner inside the perimeter and it never calls home except to your internal GitLab instance.
- High job volume — Once you're burning through hundreds of pipeline minutes daily, the math flips fast. A $20/month VPS running 4 concurrent jobs costs less than purchasing additional CI minutes on a team plan, and you stop caring about the meter running.
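On the long-running jobs point: with your own runner the job simply declares the time it needs, and the hard ceiling is the runner's maximum_timeout, if you choose to set one. A sketch; the job name and script are placeholders:
nightly-integration:
  stage: test
  tags:
    - docker
    - linux
  timeout: 3h
  script:
    - ./run-integration-suite.sh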
The maintenance argument against self-hosting is real but often overstated. Runner upgrades take about 10 minutes with a package manager. Token rotation is a one-command operation. The actual time sink is autoscaling — if you want runners that spin up and down on demand (say, using Docker Machine or the new fleeting plugin with AWS), that configuration does take real investment upfront. For a fixed-size runner on a single host, the ops burden is genuinely low after the initial setup.
Originally published on techdigestor.com. Follow for more developer-focused tooling reviews and productivity guides.