Lyra
Stop Shipping Broken systemd Units: Practical `systemd-analyze verify` for Linux Services

If you write or package systemd units regularly, you have probably hit this pattern at least once.

You edit a service file, run systemctl daemon-reload, try to start it, and only then discover a typo, a missing binary path, or a dependency name you misspelled while half asleep.

systemd-analyze verify is a simple way to catch a lot of that before the unit ever reaches production.

In this guide, I will show a practical workflow for:

  • validating unit files before reload or deploy
  • catching unknown directives and bad dependency names
  • verifying a service and its timer together
  • making verification fail your CI job when warnings appear
  • understanding what verify catches, and what it does not

What systemd-analyze verify actually checks

According to the systemd-analyze(1) manual, systemd-analyze verify FILE... loads the specified unit files and also loads units referenced by them.

The manual says it currently detects at least these classes of problems:

  • unknown sections and directives
  • missing dependencies required to start the unit
  • Documentation= man pages that are not present
  • commands in ExecStart= and similar directives that are missing or not executable

That makes it a very good lint step for systemd unit authoring.

A broken service example

Here is a deliberately bad unit:

# bad-demo.service
[Unit]
Description=Bad demo
After=network-online.targt

[Service]
Typ=oneshot
ExecStart=/usr/bin/not-a-real-binary
Restart=on-failure

Now verify it:

systemd-analyze verify ./bad-demo.service

On a current Debian system, this produces errors like:

./bad-demo.service:3: Failed to add dependency on network-online.targt, ignoring: Invalid argument
./bad-demo.service:6: Unknown key 'Typ' in section [Service], ignoring.
bad-demo.service: Command /usr/bin/not-a-real-binary is not executable: No such file or directory

That is exactly the kind of breakage you want to catch before a reload.
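Besides the log lines, verify can signal problems through its exit status, so you can gate scripts on it (exactly which diagnostics flip the exit status depends on your systemd version and on --recursive-errors=, discussed later). A sketch that recreates the broken unit above, guarded so it degrades gracefully on machines without systemd:

```shell
# Recreate the intentionally broken unit from above.
cat > bad-demo.service <<'EOF'
[Unit]
Description=Bad demo
After=network-online.targt

[Service]
Typ=oneshot
ExecStart=/usr/bin/not-a-real-binary
Restart=on-failure
EOF

# Guard: skip the check on machines without systemd (macOS, minimal containers).
if command -v systemd-analyze >/dev/null 2>&1; then
  if systemd-analyze verify ./bad-demo.service; then
    echo "verify exited zero"
  else
    echo "verify reported problems"
  fi
fi
```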

A clean service and timer pair

A more realistic pattern is a service plus a timer.

Create the service:

# demo-backup.service
[Unit]
Description=Demo backup job
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/bin/env bash -lc 'echo backing up; exit 0'
ProtectSystem=strict
ReadWritePaths=/var/backups
NoNewPrivileges=yes

Create the timer:

# demo-backup.timer
[Unit]
Description=Run demo backup every night

[Timer]
OnCalendar=03:15
Persistent=true
Unit=demo-backup.service

[Install]
WantedBy=timers.target

Verify both together:

systemd-analyze verify ./demo-backup.service ./demo-backup.timer

If verification succeeds cleanly, the command prints nothing and exits successfully.

I like verifying related units in one command because a timer that points at the wrong service name is just as broken as a bad service file.

A practical local workflow before install

When I am editing units by hand, this is the order I prefer:

  1. Write or update the unit files in a working directory.
  2. Run systemd-analyze verify against the service, timer, socket, or path units involved.
  3. Copy them into /etc/systemd/system/ only after they verify cleanly.
  4. Run systemctl daemon-reload.
  5. Start the unit and inspect logs.

Example:

systemd-analyze verify ./myjob.service ./myjob.timer && \
sudo install -m 0644 ./myjob.service ./myjob.timer /etc/systemd/system/ && \
sudo systemctl daemon-reload && \
sudo systemctl enable --now myjob.timer

Then confirm both the unit state and recent logs:

systemctl status myjob.timer myjob.service --no-pager
journalctl -u myjob.service -u myjob.timer -b --no-pager

Make warnings fail CI with --recursive-errors=

One subtle detail from the manual matters a lot for automation.

If you do not pass --recursive-errors=, the exit status you get for a given warning depends on your systemd version (the option itself only appeared in systemd 250), so verify may print warnings while still returning zero.

For CI or packaging checks, use one of these:

systemd-analyze verify --recursive-errors=yes ./myjob.service ./myjob.timer

Useful modes:

  • yes: fail on warnings in the unit or any associated dependencies
  • one: fail on warnings in the unit or its immediate dependencies
  • no: fail only on warnings in the explicitly specified unit

For most CI checks, I would choose yes if the build environment contains the full dependency set, or one if you want a stricter signal on the files you directly touched without turning unrelated environment noise into failures.
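If different pipelines need different strictness, a thin wrapper keeps the choice in one place. This is only a sketch: VERIFY_MODE is a hypothetical environment variable of our own, not something systemd reads, and the script skips the real check when no units are passed or systemd-analyze is absent.

```shell
#!/usr/bin/env bash
# verify-units.sh (hypothetical name): run verify with a configurable mode.
set -euo pipefail

# VERIFY_MODE is our own convention; systemd-analyze only sees the flag below.
mode="${VERIFY_MODE:-yes}"
case "$mode" in
  yes|one|no) ;;
  *) echo "invalid VERIFY_MODE: $mode (expected yes, one, or no)" >&2; exit 2 ;;
esac

echo "verifying with --recursive-errors=$mode"
# Only run when unit files were passed and systemd exists on this machine.
if [ "$#" -gt 0 ] && command -v systemd-analyze >/dev/null 2>&1; then
  systemd-analyze verify --recursive-errors="$mode" "$@"
fi
```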

Verifying staged files in a package or image root

systemd-analyze also supports --root=PATH for verification against a different filesystem tree.

That is useful when you build packages, chroots, or machine images and want to validate units before they land on the live host.

Example layout:

pkgroot/
└── etc/systemd/system/
    └── app.service

Example command:

systemd-analyze verify --root="$PWD/pkgroot" app.service

A practical warning here: this works best when the alternate root actually contains the unit dependencies and executable paths your unit references. If the staged root is too minimal, you can get errors about missing units or binaries that exist on the final system but not inside the staging tree.

So --root= is excellent for representative chroots and image roots, but less useful on a skeletal directory tree that only contains one unit file.
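One way to make a minimal staging tree representative enough is to stub in the paths the unit references. The sketch below extends the pkgroot layout above; the app.service contents and the /usr/bin/app stub are illustrative, not taken from a real package:

```shell
set -eu

# Stage a minimal but representative root: the unit plus the binary it runs.
mkdir -p pkgroot/etc/systemd/system pkgroot/usr/bin

cat > pkgroot/etc/systemd/system/app.service <<'EOF'
[Unit]
Description=Staged app

[Service]
ExecStart=/usr/bin/app
EOF

# Stub executable so the ExecStart= path resolves inside the staging tree.
printf '#!/bin/sh\nexit 0\n' > pkgroot/usr/bin/app
chmod 0755 pkgroot/usr/bin/app

# Guarded so the sketch is harmless on machines without systemd.
if command -v systemd-analyze >/dev/null 2>&1; then
  systemd-analyze verify --root="$PWD/pkgroot" app.service || echo "verify reported problems"
fi
```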

What verify does not replace

systemd-analyze verify is valuable, but it is not the whole test plan.

It does not prove that:

  • your service logic is correct
  • the command behaves correctly with real environment variables or credentials
  • the service has all required runtime permissions
  • the timer schedule is what you intended
  • the service will stay healthy after startup

After a clean verify, I still recommend testing the real activation path.

For timer units, this is especially useful:

systemd-analyze calendar '03:15'
systemctl start myjob.service
journalctl -u myjob.service -n 50 --no-pager

That way you validate both the unit syntax and the real runtime behavior.

A simple repo-friendly check script

If you keep your units in Git, add a small verifier script:

#!/usr/bin/env bash
set -euo pipefail

units=(
  systemd/myjob.service
  systemd/myjob.timer
)

systemd-analyze verify --recursive-errors=one "${units[@]}"

Then run it locally before commits, or in CI before packaging and deployment.
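The same idea scales past a hard-coded list if you discover the unit files instead of enumerating them. The layout (a systemd/ directory at the repo root) is an assumption, and the demo setup lines exist only to make the sketch self-contained:

```shell
#!/usr/bin/env bash
# Sketch: discover and verify every unit file under systemd/ in one pass.
set -euo pipefail

# Demo setup only -- a real repository already contains these files.
mkdir -p systemd
printf '[Service]\nType=oneshot\nExecStart=/bin/true\n' > systemd/myjob.service
printf '[Timer]\nOnCalendar=daily\n' > systemd/myjob.timer

# Collect every unit type verify understands from the systemd/ directory.
mapfile -t units < <(find systemd -type f \
  \( -name '*.service' -o -name '*.timer' -o -name '*.socket' -o -name '*.path' \) | sort)

printf 'found unit: %s\n' "${units[@]}"

# Guarded so the script is a no-op on machines without systemd.
if command -v systemd-analyze >/dev/null 2>&1; then
  systemd-analyze verify --recursive-errors=one "${units[@]}" || echo "verify reported problems"
fi
```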

For a GitHub Actions step on a Linux runner that ships systemd (such as ubuntu-latest), the core check is as simple as:

- name: Verify systemd units
  run: |
    systemd-analyze verify --recursive-errors=one \
      systemd/myjob.service \
      systemd/myjob.timer

That one step catches a surprising number of avoidable mistakes.

Final take

If you work with systemd, systemd-analyze verify is one of those small tools that pays for itself fast.

It will not replace actually starting the service, but it is excellent at catching the boring, expensive mistakes early: typos, wrong dependency names, and broken command paths.

My rule of thumb is simple:

  • verify before install
  • reload only after verify passes
  • start the unit and inspect logs before calling it done

That turns unit-file edits from guesswork into a repeatable workflow.
