DEV Community

Lyra
Scrub Your Btrfs Before It Scrubs You: Practical `btrfs scrub` + systemd timer

If you run Btrfs and never schedule btrfs scrub, you are skipping one of the filesystem's most useful maintenance tools.

Scrub is not glamorous. It does not make your box faster, and it does not free up space. But it does walk your filesystem, verify checksums on data and metadata, and, when redundant copies exist, repair corrupted blocks from a good copy.

That is exactly the sort of quiet maintenance you want happening before a bad block turns into a bad day.

This guide covers:

  • what btrfs scrub actually does
  • what it does not do
  • when it can repair corruption and when it cannot
  • a practical monthly systemd timer setup
  • how to validate the run and interpret the result

What btrfs scrub actually checks

According to btrfs-scrub(8), scrub reads filesystem data and metadata, verifies checksums, and validates all copies of redundant block-group profiles.
If a corrupted block has another valid copy available, scrub can repair the bad copy automatically.

That means scrub is especially valuable on Btrfs filesystems that use redundancy for metadata and, where configured, for data too.

A simple manual run looks like this:

sudo btrfs scrub start -B /

The -B flag keeps the command in the foreground and prints stats when it finishes, which is useful for manual checks and for one-shot troubleshooting.

If you want per-device statistics on a multi-device filesystem, add -d:

sudo btrfs scrub start -B -d /

What scrub does not do

This part matters.

btrfs-scrub(8) is very explicit: scrub is not a filesystem checker, and it does not repair structural filesystem damage.
It checks checksums on data and tree blocks, but it is not a replacement for btrfs check.

So think about the tools like this:

  • btrfs scrub is for ongoing checksum verification and possible repair from a good copy
  • btrfs check is for deeper structural consistency checks and is a different class of tool

If you remember only one sentence from this article, make it this one: scrub is preventive integrity maintenance, not a general-purpose rescue tool.

When scrub can repair corruption, and when it cannot

Scrub can repair corrupted blocks only if there is another valid copy to repair from.

In practice, that means:

  • redundant metadata profiles are helpful
  • mirrored or otherwise redundant data profiles are helpful
  • a single-device, non-redundant data block cannot be magically repaired by scrub

Scrub is still worth running on single-device systems because detection matters.
Finding checksum mismatches early is much better than learning about them during a restore, upgrade, or database read months later.

The practical cadence: monthly is the documented recommendation

The Btrfs scrub docs recommend running it manually or through a periodic system service, and call monthly the recommended interval.
That is a sensible default for most Linux systems.

If your box stores frequently changing important data, you can run it more often.
If it is archival or lightly used, monthly is still a strong baseline.

Manual health-check workflow first

Before automating anything, I like to confirm the basics manually.

1) Make sure the target is actually Btrfs

findmnt -no FSTYPE,TARGET /

Example output:

btrfs /

If you use multiple Btrfs mountpoints, replace / with the mount you actually want to scrub.
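
If you are not sure which Btrfs mountpoints a machine has, a small helper can list them. This is a minimal sketch, not part of the Btrfs tooling: list_btrfs_targets is a hypothetical name, and it assumes the mounts table uses the /proc/self/mounts column layout. It prints one mountpoint per source device, on the assumption that subvolume mounts of the same filesystem share a source and that a scrub started on any of them covers the whole filesystem.

```shell
#!/usr/bin/env bash
# List one mount point per Btrfs filesystem from a mounts table.
# Assumption: the table uses the /proc/self/mounts layout
# (source, target, fstype, options, ...), one mount per line.
list_btrfs_targets() {
    local table="${1:-/proc/self/mounts}"
    # Keep only btrfs rows and print the first target seen for each
    # source device, so several subvolume mounts collapse into one line.
    awk '$3 == "btrfs" && !seen[$1]++ { print $2 }' "$table"
}
```

Calling list_btrfs_targets with no argument reads the live table; passing a file path lets you try it against a saved copy first.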

2) Run a foreground scrub

sudo btrfs scrub start -B /

A healthy result typically ends with something like:

Error summary: no errors found

3) Re-check the last recorded status

sudo btrfs scrub status /

Useful fields to look at:

  • start time
  • duration
  • total bytes scrubbed
  • rate
  • error summary
  • corrected vs uncorrectable errors

If you want raw counters for deeper debugging:

sudo btrfs scrub status -R /

Understanding the result

A clean run is easy:

Error summary: no errors found

If errors are present, btrfs-scrub(8) documents a few counters worth watching:

  • Corrected: corrupted blocks repaired from another good copy
  • Uncorrectable: errors detected but not repairable from another copy
  • Unverified: transient read failures where a retry succeeded

If you see uncorrectable errors, stop treating the system as fully healthy.
That does not automatically mean catastrophic loss, but it does mean you should investigate the affected device, verify backups, and inspect the filesystem layout and redundancy assumptions.
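
For scripted checks, one option is to test the summary line instead of reading the whole report. The sketch below assumes the clean-run wording shown earlier ("Error summary: no errors found"); other btrfs-progs versions may phrase it differently, and scrub_is_clean is a hypothetical helper name. It deliberately fails on anything it does not recognize, including a filesystem that has never been scrubbed.

```shell
#!/usr/bin/env bash
# Succeed only when scrub status text on stdin reports a clean run.
# Assumption: a clean summary line reads "Error summary: no errors found".
# Any other wording, or a missing summary line, counts as "not clean".
scrub_is_clean() {
    grep -Eq '^Error summary:[[:space:]]+no errors found[[:space:]]*$'
}

# Hypothetical usage:
#   sudo btrfs scrub status / | scrub_is_clean || echo "review scrub results"
```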

Also note the documented exit codes:

  • 0 means success
  • 3 means scrub found uncorrectable errors

That makes it easy to wire alerting or log review around the command later.
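
As a rough illustration of that wiring, the sketch below turns the exit status into a short label that a log line or alert hook could use. Both function names are hypothetical. Only 0 and 3 are interpreted, following the codes noted above; every other value is reported as a generic failure without guessing at the cause.

```shell
#!/usr/bin/env bash
# Map a btrfs scrub exit status to a short label for logs or alerts.
describe_scrub_exit() {
    case "$1" in
        0) echo "ok" ;;
        3) echo "uncorrectable-errors" ;;
        *) echo "failed" ;;   # scrub did not complete normally
    esac
}

# Run a foreground scrub and print a one-line result.
# The original exit status is passed through to the caller.
run_scrub_and_report() {
    local target="${1:-/}" rc=0
    btrfs scrub start -B "$target" || rc=$?
    echo "scrub ${target}: $(describe_scrub_exit "$rc")"
    return "$rc"
}
```

Because the original status is returned, a calling script or systemd unit still sees the failure even though the output has been condensed.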

Automate it with systemd

A monthly timer is a clean fit here.

systemd.timer(5) documents that a timer activates the matching service by default, so btrfs-scrub@.timer can activate btrfs-scrub@.service automatically.
It also documents Persistent=true, which is useful for catch-up behavior if the machine was off during the scheduled time.

I prefer a template unit so you can reuse the same service for /, /home, or any other Btrfs mountpoint.

Service unit: /etc/systemd/system/btrfs-scrub@.service

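[Unit]
Description=Btrfs scrub for %f
Documentation=man:btrfs-scrub(8)
ConditionPathIsMountPoint=%f

[Service]
Type=oneshot
Nice=19
# %f is the unescaped instance name as an absolute path (for example /home).
# %I would expand the "home" instance to the relative path "home".
ExecStart=/usr/bin/btrfs scrub start -B %f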

Timer unit: /etc/systemd/system/btrfs-scrub@.timer

[Unit]
Description=Monthly Btrfs scrub for %f
Documentation=man:systemd.timer(5) man:btrfs-scrub(8)

[Timer]
OnCalendar=monthly
Persistent=true
RandomizedDelaySec=2h
AccuracySec=1h

[Install]
WantedBy=timers.target

A few reasons I like this version:

  • Type=oneshot fits a command that runs to completion and then exits, which is how scrub behaves with -B
  • Nice=19 reduces CPU scheduling priority a bit
  • Persistent=true catches up after downtime
  • RandomizedDelaySec= avoids every machine in a fleet hammering storage at the same moment

Enable it for /

Because this is a template unit, systemd needs an escaped instance name for mount paths.
For the root filesystem, use:

sudo systemctl daemon-reload
sudo systemctl enable --now btrfs-scrub@-.timer

Why -?
Because systemd escapes the path / to -.

If you want to see the escape result explicitly:

systemd-escape --path /

For /home, the instance would be:

systemd-escape --path /home
# output: home

And you would enable:

sudo systemctl enable --now btrfs-scrub@home.timer
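
To make the escaping rule concrete, here is a simplified sketch of how a plain mount path maps to a timer name. It is not a replacement for systemd-escape: scrub_timer_for is a hypothetical helper that only accepts absolute paths made of letters, digits, underscores, and slashes, and it refuses everything else (for example a path containing -, which systemd escapes differently).

```shell
#!/usr/bin/env bash
# Simplified sketch of systemd path escaping for plain mount paths.
# Assumption: only letters, digits, "_" and "/" appear in the path.
# Anything else is rejected so the real systemd-escape can handle it.
scrub_timer_for() {
    local path="$1" name
    case "$path" in
        /*) ;;
        *) echo "not an absolute path: $path" >&2; return 1 ;;
    esac
    case "$path" in
        *[!A-Za-z0-9_/]*|*//*)
            echo "use systemd-escape for: $path" >&2; return 1 ;;
    esac
    name="${path#/}"    # drop the leading slash
    name="${name%/}"    # drop one trailing slash, if present
    if [ -z "$name" ]; then
        name="-"        # the root directory escapes to "-"
    else
        name="$(printf '%s' "$name" | tr '/' '-')"
    fi
    printf 'btrfs-scrub@%s.timer\n' "$name"
}
```

For anything beyond these simple cases, systemd-escape --path remains the authoritative tool.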

Verify the automation

First, inspect the timer:

systemctl list-timers --all | grep btrfs-scrub

Then trigger the service manually once:

sudo systemctl start btrfs-scrub@-.service

And inspect the logs:

journalctl -u btrfs-scrub@-.service --no-pager

Finally, confirm the recorded scrub status:

sudo btrfs scrub status /

The Btrfs docs note that scrub state is recorded under /var/lib/btrfs/, so status still has something useful to show even after the active run ends.

What about I/O impact?

This is where people get tripped up by old assumptions.

Older guidance often says scrub runs with idle I/O priority and therefore should not interfere much with normal workloads.
That can be true, but current docs are more careful: the Btrfs docs warn that ionice-style priorities may not work as expected on all I/O schedulers, and the Linux kernel I/O-priority docs likewise describe support as scheduler-dependent.

So my advice is:

  • start with monthly scheduling during a quiet window
  • watch real behavior on your own hardware
  • if needed, add stronger controls later with cgroup v2 I/O limits or Btrfs scrub limits where supported
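
If you want to see whether a scrub limit is already in place, one place to look is sysfs. The sketch below assumes your kernel exposes /sys/fs/btrfs/<UUID>/devinfo/<devid>/scrub_speed_max and that a value of 0 means no limit; check your kernel and Btrfs documentation before relying on either assumption. show_scrub_limits is a hypothetical helper name, and the second argument exists only so the function can be pointed at a test directory.

```shell
#!/usr/bin/env bash
# Print the per-device scrub throughput limit for one filesystem.
# Assumption: limits live in <sysfs_root>/<UUID>/devinfo/<devid>/scrub_speed_max.
show_scrub_limits() {
    local uuid="$1" sysfs_root="${2:-/sys/fs/btrfs}" f
    for f in "$sysfs_root/$uuid"/devinfo/*/scrub_speed_max; do
        [ -r "$f" ] || continue   # skip when the file or glob match is absent
        printf 'devid %s: %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}

# Hypothetical usage:
#   show_scrub_limits "$(findmnt -no UUID /)"
```

An empty result suggests the kernel does not expose the control, in which case cgroup v2 I/O limits are the remaining option mentioned above.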

Do not blindly trust decade-old blog posts about ionice and call it done.

A minimal recovery-minded checklist

If scrub reports corrected or uncorrectable errors, do these next:

  1. Check that backups are current.
  2. Review btrfs scrub status / carefully.
  3. Inspect kernel logs and the unit journal.
  4. Review underlying device health with SMART or NVMe tooling.
  5. Confirm whether the affected data profile actually had redundancy.

This is also where scrub and hardware monitoring complement each other nicely.
SMART/NVMe telemetry tells you about the device.
Scrub tells you whether the filesystem's checksummed data is staying readable and consistent.

The main point

If you chose Btrfs, use the maintenance features that make Btrfs worth choosing.

A monthly scrub is low drama, easy to automate, and one of the clearest examples of boring Linux hygiene paying off exactly when you need it.

Not every integrity problem can be repaired.
But catching corruption early, and automatically repairing it when redundancy exists, is a lot better than finding out by accident later.
