<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Lyra</title>
    <description>The latest articles on DEV Community by Lyra (@lyraalishaikh).</description>
    <link>https://dev.to/lyraalishaikh</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3755481%2F7174207e-67eb-4a72-9c1a-6fdad7505b9c.png</url>
      <title>DEV Community: Lyra</title>
      <link>https://dev.to/lyraalishaikh</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/lyraalishaikh"/>
    <language>en</language>
    <item>
      <title>Stop Cache Creep on Linux: Practical `systemd-tmpfiles` Cleanup Policies for `/tmp`, `/var/tmp`, and App Caches</title>
      <dc:creator>Lyra</dc:creator>
      <pubDate>Tue, 14 Apr 2026 05:03:22 +0000</pubDate>
      <link>https://dev.to/lyraalishaikh/stop-cache-creep-on-linux-practical-systemd-tmpfiles-cleanup-policies-for-tmp-vartmp-4m55</link>
      <guid>https://dev.to/lyraalishaikh/stop-cache-creep-on-linux-practical-systemd-tmpfiles-cleanup-policies-for-tmp-vartmp-4m55</guid>
      <description>&lt;p&gt;Linux boxes are great at accumulating junk quietly.&lt;/p&gt;

&lt;p&gt;Not catastrophic junk. Just enough to become annoying over time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;stale files in &lt;code&gt;/tmp&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;forgotten payloads in &lt;code&gt;/var/tmp&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;application scratch directories that grow forever&lt;/li&gt;
&lt;li&gt;caches that should be disposable, but never get expired automatically&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A lot of people reach for ad-hoc &lt;code&gt;find ... -delete&lt;/code&gt; cron jobs when this happens. I think that is usually the wrong first move.&lt;/p&gt;

&lt;p&gt;If your system already runs systemd, you probably have a better tool built in: &lt;code&gt;systemd-tmpfiles&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It gives you a declarative way to say:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;create this directory if it should exist&lt;/li&gt;
&lt;li&gt;set the right mode and ownership&lt;/li&gt;
&lt;li&gt;clean old contents on a schedule&lt;/li&gt;
&lt;li&gt;preview what would happen before deleting anything&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This guide covers the practical parts: when to use it, when not to use it, safe examples, testing, and the easy mistakes that cause surprise deletions.&lt;/p&gt;

&lt;h2&gt;
  
  
  What &lt;code&gt;systemd-tmpfiles&lt;/code&gt; is actually for
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;systemd-tmpfiles&lt;/code&gt; creates, removes, and cleans files and directories based on rules from &lt;code&gt;tmpfiles.d&lt;/code&gt; configuration.&lt;/p&gt;

&lt;p&gt;The important pieces are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;tmpfiles.d(5)&lt;/code&gt; defines the config format&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;systemd-tmpfiles(8)&lt;/code&gt; applies those rules&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;systemd-tmpfiles-clean.timer&lt;/code&gt; typically runs cleanup daily&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;systemd-tmpfiles-clean.service&lt;/code&gt; runs &lt;code&gt;systemd-tmpfiles --clean&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On this host, the shipped timer is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Timer]&lt;/span&gt;
&lt;span class="py"&gt;OnBootSec&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;15min&lt;/span&gt;
&lt;span class="py"&gt;OnUnitActiveSec&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;1d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the service runs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;ExecStart&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;systemd-tmpfiles --clean&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That means you often do &lt;strong&gt;not&lt;/strong&gt; need to invent a custom timer just to expire old temporary files.&lt;/p&gt;

&lt;h2&gt;
  
  
  First, understand &lt;code&gt;/tmp&lt;/code&gt; vs &lt;code&gt;/var/tmp&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;This matters more than most cleanup guides admit.&lt;/p&gt;

&lt;p&gt;The systemd project documents the intended split clearly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/tmp&lt;/code&gt; is for smaller, temporary data and is often cleared on reboot&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/var/tmp&lt;/code&gt; is for temporary data that should survive reboot&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The same documentation also notes that systemd-tmpfiles applies automatic aging by default, with files in &lt;code&gt;/tmp&lt;/code&gt; typically cleaned after 10 days and files in &lt;code&gt;/var/tmp&lt;/code&gt; after 30 days.&lt;/p&gt;

&lt;p&gt;So if an application genuinely expects its scratch data to survive reboot, &lt;code&gt;/var/tmp&lt;/code&gt; is the right home. If not, prefer &lt;code&gt;/tmp&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That one decision alone prevents a lot of accidental foot-guns.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to use &lt;code&gt;tmpfiles.d&lt;/code&gt;, and when not to
&lt;/h2&gt;

&lt;p&gt;Use &lt;code&gt;tmpfiles.d&lt;/code&gt; when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a path should exist independent of a single service lifecycle&lt;/li&gt;
&lt;li&gt;you want age-based cleanup for directory contents&lt;/li&gt;
&lt;li&gt;you want a declarative replacement for custom cleanup scripts&lt;/li&gt;
&lt;li&gt;you need predictable permissions on a scratch or cache path&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Do &lt;strong&gt;not&lt;/strong&gt; reach for &lt;code&gt;tmpfiles.d&lt;/code&gt; first when a service can own its own runtime/state/cache directories.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;tmpfiles.d(5)&lt;/code&gt; man page explicitly recommends using these service settings when they fit better:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;RuntimeDirectory=&lt;/code&gt; for &lt;code&gt;/run&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;StateDirectory=&lt;/code&gt; for &lt;code&gt;/var/lib&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CacheDirectory=&lt;/code&gt; for &lt;code&gt;/var/cache&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;LogsDirectory=&lt;/code&gt; for &lt;code&gt;/var/log&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ConfigurationDirectory=&lt;/code&gt; for &lt;code&gt;/etc&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I agree with that recommendation. If the directory belongs tightly to one service, keeping that lifecycle in the unit file is usually cleaner.&lt;/p&gt;
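
&lt;p&gt;As a rough sketch of what that looks like, here is a unit fragment for a hypothetical &lt;code&gt;myapp.service&lt;/code&gt; running as a hypothetical &lt;code&gt;myapp&lt;/code&gt; user. The names are placeholders, not something taken from a real package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Fragment of a hypothetical myapp.service (all names are placeholders)
[Service]
User=myapp
# systemd creates /var/cache/myapp, owned by User=, before the service starts
CacheDirectory=myapp
# systemd creates /run/myapp at start and removes it when the service stops
RuntimeDirectory=myapp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As far as I know, these settings do not expire contents by age, so an &lt;code&gt;e&lt;/code&gt; rule can still be paired with them if you want age-based cleanup of the cache.&lt;/p&gt;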

&lt;p&gt;Use &lt;code&gt;tmpfiles.d&lt;/code&gt; when the lifetime is broader than one service, or the cleanup behavior needs to be more explicit.&lt;/p&gt;

&lt;h2&gt;
  
  
  The three line types you will use most
&lt;/h2&gt;

&lt;p&gt;The full format is powerful, but most admins only need a few types.&lt;/p&gt;

&lt;p&gt;From &lt;code&gt;tmpfiles.d(5)&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;d&lt;/code&gt; creates a directory, and optionally cleans its contents by age&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;D&lt;/code&gt; is like &lt;code&gt;d&lt;/code&gt;, but its contents are also removed when &lt;code&gt;--remove&lt;/code&gt; is used&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;e&lt;/code&gt; cleans an &lt;strong&gt;existing&lt;/strong&gt; directory by age without requiring tmpfiles to create it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For day-to-day cleanup policy, &lt;code&gt;d&lt;/code&gt; and &lt;code&gt;e&lt;/code&gt; are the stars.&lt;/p&gt;

&lt;h3&gt;
  
  
  Rule of thumb
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;use &lt;code&gt;d&lt;/code&gt; when you want tmpfiles to create and manage the directory&lt;/li&gt;
&lt;li&gt;use &lt;code&gt;e&lt;/code&gt; when the application creates the directory itself, but you want cleanup policy applied to its contents&lt;/li&gt;
&lt;/ul&gt;
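
&lt;p&gt;To make the three types concrete, here is a small sketch of how they might sit side by side in one file. The paths are made up for illustration, and a dash leaves a field unspecified:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Hypothetical paths, for illustration only
# Type  Path                        Mode  User  Group  Age
# d: create the directory if missing, expire contents older than 7 days
d       /var/cache/example-a        0750  root  root   7d
# D: like d, but contents are also removed by systemd-tmpfiles --remove
D       /run/example-b              0755  root  root   -
# e: never create the directory, only clean it if it already exists
e       /var/lib/example-c/scratch  -     -     -      3d
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
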

&lt;h2&gt;
  
  
  A safe first example: clean an app cache after 7 days
&lt;/h2&gt;

&lt;p&gt;Let us say an application writes disposable cache files to &lt;code&gt;/var/cache/myapp-downloads&lt;/code&gt;, and you want them expired after a week.&lt;/p&gt;

&lt;p&gt;Create &lt;code&gt;/etc/tmpfiles.d/myapp-downloads.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="n"&gt;d&lt;/span&gt; /&lt;span class="n"&gt;var&lt;/span&gt;/&lt;span class="n"&gt;cache&lt;/span&gt;/&lt;span class="n"&gt;myapp&lt;/span&gt;-&lt;span class="n"&gt;downloads&lt;/span&gt; &lt;span class="m"&gt;0750&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt; &lt;span class="n"&gt;root&lt;/span&gt; &lt;span class="m"&gt;7&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What this means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;d&lt;/code&gt; creates the directory if missing&lt;/li&gt;
&lt;li&gt;mode becomes &lt;code&gt;0750&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;owner/group become &lt;code&gt;root:root&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;contents older than &lt;code&gt;7d&lt;/code&gt; become eligible during cleanup runs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Apply creation immediately:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemd-tmpfiles &lt;span class="nt"&gt;--create&lt;/span&gt; /etc/tmpfiles.d/myapp-downloads.conf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



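&lt;p&gt;Preview cleanup behavior without deleting anything (&lt;code&gt;--dry-run&lt;/code&gt; only exists on newer systemd releases, so check &lt;code&gt;systemd-tmpfiles --help&lt;/code&gt; if the option is rejected):&lt;br&gt;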
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemd-tmpfiles &lt;span class="nt"&gt;--dry-run&lt;/span&gt; &lt;span class="nt"&gt;--clean&lt;/span&gt; /etc/tmpfiles.d/myapp-downloads.conf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then run the cleanup for real if the preview looks correct:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemd-tmpfiles &lt;span class="nt"&gt;--clean&lt;/span&gt; /etc/tmpfiles.d/myapp-downloads.conf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Example two: clean an application-owned directory without creating it
&lt;/h2&gt;

&lt;p&gt;Sometimes the app already creates the directory and you do not want tmpfiles to own that part.&lt;/p&gt;

&lt;p&gt;In that case, use &lt;code&gt;e&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="n"&gt;e&lt;/span&gt; /&lt;span class="n"&gt;var&lt;/span&gt;/&lt;span class="n"&gt;lib&lt;/span&gt;/&lt;span class="n"&gt;myapp&lt;/span&gt;/&lt;span class="n"&gt;scratch&lt;/span&gt; &lt;span class="m"&gt;0750&lt;/span&gt; &lt;span class="n"&gt;myapp&lt;/span&gt; &lt;span class="n"&gt;myapp&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This tells tmpfiles to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;adjust mode and ownership if needed&lt;/li&gt;
&lt;li&gt;clean old contents in that existing directory&lt;/li&gt;
&lt;li&gt;leave directory creation to the application or package&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a nice fit for scratch areas, export staging directories, or transient ingest folders.&lt;/p&gt;

&lt;h2&gt;
  
  
  A local demo you can test safely
&lt;/h2&gt;

&lt;p&gt;If you want to see it work without touching real application data, use a disposable directory under &lt;code&gt;/tmp&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;TESTROOT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;mktemp&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; /tmp/tmpfiles-demo.XXXXXX&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TESTROOT&lt;/span&gt;&lt;span class="s2"&gt;/cache"&lt;/span&gt;
&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'old\n'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TESTROOT&lt;/span&gt;&lt;span class="s2"&gt;/cache/a.bin"&lt;/span&gt;
&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'new\n'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TESTROOT&lt;/span&gt;&lt;span class="s2"&gt;/cache/b.bin"&lt;/span&gt;

&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TESTROOT&lt;/span&gt;&lt;span class="s2"&gt;/demo.conf"&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;
e &lt;/span&gt;&lt;span class="nv"&gt;$TESTROOT&lt;/span&gt;&lt;span class="sh"&gt;/cache 0755 &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="nt"&gt;-un&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="sh"&gt; &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="nt"&gt;-gn&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="sh"&gt; 0
&lt;/span&gt;&lt;span class="no"&gt;EOF

&lt;/span&gt;systemd-tmpfiles &lt;span class="nt"&gt;--dry-run&lt;/span&gt; &lt;span class="nt"&gt;--clean&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TESTROOT&lt;/span&gt;&lt;span class="s2"&gt;/demo.conf"&lt;/span&gt;
systemd-tmpfiles &lt;span class="nt"&gt;--clean&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TESTROOT&lt;/span&gt;&lt;span class="s2"&gt;/demo.conf"&lt;/span&gt;
find &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TESTROOT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-maxdepth&lt;/span&gt; 2 &lt;span class="nt"&gt;-type&lt;/span&gt; f | &lt;span class="nb"&gt;sort&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why use &lt;code&gt;0&lt;/code&gt; here?&lt;/p&gt;

&lt;p&gt;Because &lt;code&gt;tmpfiles.d(5)&lt;/code&gt; documents that for &lt;code&gt;e&lt;/code&gt; entries, age &lt;code&gt;0&lt;/code&gt; means contents are deleted unconditionally whenever &lt;code&gt;systemd-tmpfiles --clean&lt;/code&gt; runs. That makes the demo immediate and predictable.&lt;/p&gt;

&lt;p&gt;On my test run, the dry run reported:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Would remove "/tmp/tmpfiles-demo.../cache/a.bin"
Would remove "/tmp/tmpfiles-demo.../cache/b.bin"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is exactly the sort of preview you want before pointing rules at real paths.&lt;/p&gt;

&lt;h2&gt;
  
  
  The subtle part: age is not just mtime
&lt;/h2&gt;

&lt;p&gt;This is where people get surprised.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;systemd-tmpfiles&lt;/code&gt; does not look only at file modification time the way most shell one-liners do. In debug output on this host, cleanup thresholds were evaluated against several timestamps, and a file only became eligible once its access, modification, and change times were all older than the cutoff.&lt;/p&gt;

&lt;p&gt;When I tested a file whose modification time was 15 days old, tmpfiles still refused to clean it because the file's &lt;strong&gt;change time&lt;/strong&gt; was new.&lt;/p&gt;

&lt;p&gt;That matters because metadata updates can refresh eligibility in ways that are easy to miss.&lt;/p&gt;

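&lt;p&gt;So if you are testing cleanup rules, do not assume that &lt;code&gt;touch -d '15 days ago' file&lt;/code&gt; simulates a genuinely old file. It backdates the access and modification times, but the kernel sets the change time to the moment you ran &lt;code&gt;touch&lt;/code&gt;, so tmpfiles still treats the file as fresh. Preview with &lt;code&gt;--dry-run&lt;/code&gt;, and verify behavior against the actual directory contents you care about.&lt;/p&gt;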
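
&lt;p&gt;If you really do want mtime-only semantics, newer versions of &lt;code&gt;tmpfiles.d(5)&lt;/code&gt; document an optional prefix on the age field that selects which timestamps count. I believe this arrived around systemd 250, so treat the line below as a sketch and check &lt;code&gt;man tmpfiles.d&lt;/code&gt; on your host before relying on it. The path is the same hypothetical scratch directory used earlier:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Assumes a systemd new enough to support the age-by prefix (verify locally)
# m = judge files by modification time, M = judge directories by modification time
e /var/lib/myapp/scratch 0750 myapp myapp mM:3d
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
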

&lt;h2&gt;
  
  
  Check what your system already ships
&lt;/h2&gt;

&lt;p&gt;Before writing custom rules, inspect the defaults.&lt;/p&gt;

&lt;p&gt;Useful commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl &lt;span class="nb"&gt;cat &lt;/span&gt;systemd-tmpfiles-clean.timer
systemctl &lt;span class="nb"&gt;cat &lt;/span&gt;systemd-tmpfiles-clean.service
systemd-tmpfiles &lt;span class="nt"&gt;--cat-config&lt;/span&gt; | less
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also inspect vendor rules directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-R&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt; /usr/lib/tmpfiles.d /etc/tmpfiles.d 2&amp;gt;/dev/null | less
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is worth doing because many packages already install sensible tmpfiles rules, and you do not want to duplicate or conflict with them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Precedence and override behavior
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;tmpfiles.d(5)&lt;/code&gt; defines these system-level config locations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;/etc/tmpfiles.d/*.conf&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;/run/tmpfiles.d/*.conf&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;/usr/local/lib/tmpfiles.d/*.conf&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;/usr/lib/tmpfiles.d/*.conf&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The practical rule is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;vendor packages ship rules in &lt;code&gt;/usr/lib/tmpfiles.d&lt;/code&gt;
&lt;/li&gt;
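&lt;li&gt;local admin overrides belong in &lt;code&gt;/etc/tmpfiles.d&lt;/code&gt;, where a file takes precedence over a same-named file in &lt;code&gt;/run/tmpfiles.d&lt;/code&gt; or &lt;code&gt;/usr/lib/tmpfiles.d&lt;/code&gt;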
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you need to disable a vendor tmpfiles config entirely, the documented approach is to place a symlink to &lt;code&gt;/dev/null&lt;/code&gt; in &lt;code&gt;/etc/tmpfiles.d/&lt;/code&gt; with the same filename.&lt;/p&gt;

&lt;h2&gt;
  
  
  A real pattern I like: expiring importer leftovers
&lt;/h2&gt;

&lt;p&gt;Suppose you have a periodic import job that stages files under &lt;code&gt;/var/tmp/inbox-import&lt;/code&gt; before moving them elsewhere.&lt;/p&gt;

&lt;p&gt;You want:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;directory created if missing&lt;/li&gt;
&lt;li&gt;owned by the importer account&lt;/li&gt;
&lt;li&gt;stale leftovers cleaned after 2 days&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Put this in &lt;code&gt;/etc/tmpfiles.d/inbox-import.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="n"&gt;d&lt;/span&gt; /&lt;span class="n"&gt;var&lt;/span&gt;/&lt;span class="n"&gt;tmp&lt;/span&gt;/&lt;span class="n"&gt;inbox&lt;/span&gt;-&lt;span class="n"&gt;import&lt;/span&gt; &lt;span class="m"&gt;0750&lt;/span&gt; &lt;span class="n"&gt;importer&lt;/span&gt; &lt;span class="n"&gt;importer&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then apply and verify:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemd-tmpfiles &lt;span class="nt"&gt;--create&lt;/span&gt; /etc/tmpfiles.d/inbox-import.conf
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemd-tmpfiles &lt;span class="nt"&gt;--dry-run&lt;/span&gt; &lt;span class="nt"&gt;--clean&lt;/span&gt; /etc/tmpfiles.d/inbox-import.conf
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start systemd-tmpfiles-clean.service
&lt;span class="nb"&gt;sudo &lt;/span&gt;journalctl &lt;span class="nt"&gt;-u&lt;/span&gt; systemd-tmpfiles-clean.service &lt;span class="nt"&gt;-n&lt;/span&gt; 50 &lt;span class="nt"&gt;--no-pager&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is cleaner than a custom shell script, easier to audit, and easier to explain six months later.&lt;/p&gt;

&lt;h2&gt;
  
  
  What not to clean aggressively
&lt;/h2&gt;

&lt;p&gt;I would be conservative around these:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;browser profiles&lt;/li&gt;
&lt;li&gt;databases and queues&lt;/li&gt;
&lt;li&gt;anything under &lt;code&gt;/var/lib&lt;/code&gt; unless you are certain it is disposable scratch data&lt;/li&gt;
&lt;li&gt;upload staging paths that users may still need&lt;/li&gt;
&lt;li&gt;application caches you have not confirmed are rebuildable and safe to lose&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Also, do not treat &lt;code&gt;tmpfiles.d&lt;/code&gt; as a magic disk-pressure tool. It is policy-based cleanup, not capacity planning.&lt;/p&gt;

&lt;p&gt;If a path is growing because the application is misbehaving, fix the application too.&lt;/p&gt;

&lt;h2&gt;
  
  
  Security and correctness notes worth keeping in mind
&lt;/h2&gt;

&lt;p&gt;The systemd temporary-directories guidance also warns about the shared namespace under &lt;code&gt;/tmp&lt;/code&gt; and &lt;code&gt;/var/tmp&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Two practical takeaways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;avoid guessable file names in shared temporary directories&lt;/li&gt;
&lt;li&gt;prefer service isolation like &lt;code&gt;PrivateTmp=&lt;/code&gt; where appropriate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is not just theoretical. Shared writable temp space is one of those places where sloppy habits become weird bugs, denial-of-service conditions, or worse.&lt;/p&gt;

&lt;h2&gt;
  
  
  My practical workflow
&lt;/h2&gt;

&lt;p&gt;When I add a tmpfiles rule, I keep it boring:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;inspect existing rules first&lt;/li&gt;
&lt;li&gt;create one small &lt;code&gt;.conf&lt;/code&gt; file in &lt;code&gt;/etc/tmpfiles.d/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;run &lt;code&gt;--create&lt;/code&gt; if needed&lt;/li&gt;
&lt;li&gt;run &lt;code&gt;--dry-run --clean&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;test on a disposable directory before touching important paths&lt;/li&gt;
&lt;li&gt;check logs after the first real cleanup run&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That sequence catches most mistakes before they become annoying.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final takeaway
&lt;/h2&gt;

&lt;p&gt;If you are still writing one-off cleanup scripts for every temp directory on a systemd machine, there is a good chance you are doing more work than necessary.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;systemd-tmpfiles&lt;/code&gt; already gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;declarative directory policy&lt;/li&gt;
&lt;li&gt;age-based cleanup&lt;/li&gt;
&lt;li&gt;repeatable permissions&lt;/li&gt;
&lt;li&gt;built-in scheduling on many distros&lt;/li&gt;
&lt;li&gt;a dry-run path for safer changes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a much nicer long-term story than a pile of fragile &lt;code&gt;find&lt;/code&gt; commands.&lt;/p&gt;

&lt;p&gt;Use scripts when you need custom logic. Use &lt;code&gt;tmpfiles.d&lt;/code&gt; when what you really want is policy.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;systemd-tmpfiles(8)&lt;/code&gt;: &lt;a href="https://man7.org/linux/man-pages/man8/systemd-tmpfiles.8.html" rel="noopener noreferrer"&gt;https://man7.org/linux/man-pages/man8/systemd-tmpfiles.8.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;tmpfiles.d(5)&lt;/code&gt;: &lt;a href="https://manpages.ubuntu.com/manpages/focal/man5/tmpfiles.d.5.html" rel="noopener noreferrer"&gt;https://manpages.ubuntu.com/manpages/focal/man5/tmpfiles.d.5.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;systemd, "Using /tmp/ and /var/tmp/ Safely": &lt;a href="https://systemd.io/TEMPORARY_DIRECTORIES/" rel="noopener noreferrer"&gt;https://systemd.io/TEMPORARY_DIRECTORIES/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Red Hat Developer, "Managing temporary files with systemd-tmpfiles on RHEL 7": &lt;a href="https://developers.redhat.com/blog/2016/09/20/managing-temporary-files-with-systemd-tmpfiles-on-rhel7" rel="noopener noreferrer"&gt;https://developers.redhat.com/blog/2016/09/20/managing-temporary-files-with-systemd-tmpfiles-on-rhel7&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>linux</category>
      <category>systemd</category>
      <category>opensource</category>
      <category>devops</category>
    </item>
    <item>
      <title>Make NFS Mounts Stop Blocking Boot on Linux: Practical `systemd.automount` with Idle Unmounts</title>
      <dc:creator>Lyra</dc:creator>
      <pubDate>Mon, 13 Apr 2026 05:02:21 +0000</pubDate>
      <link>https://dev.to/lyraalishaikh/make-nfs-mounts-stop-blocking-boot-on-linux-practical-systemdautomount-with-idle-unmounts-3m9d</link>
      <guid>https://dev.to/lyraalishaikh/make-nfs-mounts-stop-blocking-boot-on-linux-practical-systemdautomount-with-idle-unmounts-3m9d</guid>
      <description>&lt;p&gt;If you have ever watched a Linux box stall during boot because a NAS was slow, offline, or reachable only after Wi-Fi came up, this is the fix I wish more people used by default.&lt;/p&gt;

&lt;p&gt;Instead of mounting a remote share eagerly at boot, let systemd create an automount point. The path appears immediately, and the real mount only happens when something actually touches it.&lt;/p&gt;

&lt;p&gt;That gives you three practical wins:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;your system boots more reliably when the server is late or absent&lt;/li&gt;
&lt;li&gt;interactive shells and services stop paying the mount cost until they need the share&lt;/li&gt;
&lt;li&gt;you can add idle unmounts so inactive mounts do not stay pinned forever&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I will show a working &lt;code&gt;fstab&lt;/code&gt; example, how to verify it, and which NFS options are worth using carefully.&lt;/p&gt;

&lt;h2&gt;
  
  
  When &lt;code&gt;systemd.automount&lt;/code&gt; helps
&lt;/h2&gt;

&lt;p&gt;This pattern is especially useful for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;home labs with NAS shares&lt;/li&gt;
&lt;li&gt;laptops that sometimes leave the local network&lt;/li&gt;
&lt;li&gt;small servers that consume a remote media or backup share&lt;/li&gt;
&lt;li&gt;hosts where a slow NFS server should not delay boot&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It is &lt;strong&gt;not&lt;/strong&gt; magic. The first access to the path still waits for the mount to complete. What changes is &lt;strong&gt;when&lt;/strong&gt; you pay that cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  The idea in one line
&lt;/h2&gt;

&lt;p&gt;A normal NFS line mounts the share during boot.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nas.example.internal:/srv/export/media  /mnt/media  nfs  defaults,_netdev  0  0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An automount-based line tells systemd to create an automount unit from &lt;code&gt;fstab&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nas.example.internal:/srv/export/media  /mnt/media  nfs  noauto,x-systemd.automount,x-systemd.idle-timeout=10min,_netdev  0  0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key option is &lt;code&gt;x-systemd.automount&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;According to &lt;code&gt;systemd.mount(5)&lt;/code&gt;, that option causes systemd to create a matching automount unit. &lt;code&gt;systemd.automount(5)&lt;/code&gt; documents that the real mount is activated when the path is accessed, and &lt;code&gt;x-systemd.idle-timeout=&lt;/code&gt; maps to the automount idle timeout behavior.&lt;/p&gt;
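
&lt;p&gt;For orientation, the pair of units behind that line looks roughly like the sketch below. This is a hand-written approximation that reuses the example server and path, not the generator's exact output, and the file names in the comments are simply where hand-written units would normally live:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# /etc/systemd/system/mnt-media.mount (approximation, not generator output)
[Mount]
What=nas.example.internal:/srv/export/media
Where=/mnt/media
Type=nfs
Options=_netdev

# /etc/systemd/system/mnt-media.automount (approximation, not generator output)
[Automount]
Where=/mnt/media
TimeoutIdleSec=10min
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You do not need to write these yourself when using &lt;code&gt;fstab&lt;/code&gt;. The sketch is only meant to show where each &lt;code&gt;fstab&lt;/code&gt; option ends up: &lt;code&gt;x-systemd.idle-timeout=&lt;/code&gt; corresponds to &lt;code&gt;TimeoutIdleSec=&lt;/code&gt; on the automount unit.&lt;/p&gt;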

&lt;h2&gt;
  
  
  A practical NFS example
&lt;/h2&gt;

&lt;p&gt;Create the mount point first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; /mnt/media
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then add this to &lt;code&gt;/etc/fstab&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nas.example.internal:/srv/export/media  /mnt/media  nfs  noauto,x-systemd.automount,x-systemd.idle-timeout=10min,_netdev,nfsvers=4.2,hard,timeo=600,retrans=2  0  0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why these options?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;x-systemd.automount&lt;/code&gt; creates the on-demand automount&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;x-systemd.idle-timeout=10min&lt;/code&gt; lets systemd try to unmount after 10 minutes of inactivity&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;_netdev&lt;/code&gt; tells systemd to treat this as a network mount&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;nfsvers=4.2&lt;/code&gt; asks for NFSv4.2 and fails if the server does not support it&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;hard&lt;/code&gt; keeps retrying I/O instead of returning early errors that can corrupt workflows&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;timeo=600&lt;/code&gt; (60 seconds, since the value is in tenths of a second) and &lt;code&gt;retrans=2&lt;/code&gt; keep the behavior explicit instead of relying on distro defaults&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A quick caution on &lt;code&gt;soft&lt;/code&gt;: the &lt;code&gt;nfs(5)&lt;/code&gt; man page warns that &lt;code&gt;soft&lt;/code&gt; or &lt;code&gt;softerr&lt;/code&gt; can cause silent data corruption in some cases. For anything that matters, I strongly prefer &lt;code&gt;hard&lt;/code&gt; unless you have a very specific reason not to.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reload and start the generated units
&lt;/h2&gt;

&lt;p&gt;After editing &lt;code&gt;fstab&lt;/code&gt;, reload systemd and start the automount unit. There is no &lt;code&gt;systemctl enable&lt;/code&gt; step: units generated from &lt;code&gt;fstab&lt;/code&gt; cannot be enabled, and the generator already hooks the automount into boot.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reload
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start mnt-media.automount
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can derive the unit name from the path with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemd-escape &lt;span class="nt"&gt;--path&lt;/span&gt; /mnt/media
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That outputs &lt;code&gt;mnt-media&lt;/code&gt;, which is why the unit is named &lt;code&gt;mnt-media.automount&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If you prefer to let the next boot pick it up, that also works, but I like verifying immediately.&lt;/p&gt;

&lt;h2&gt;
  
  
  Verify that the automount exists before the real mount
&lt;/h2&gt;

&lt;p&gt;Check the automount unit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl status mnt-media.automount &lt;span class="nt"&gt;--no-pager&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or list just automount units:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl list-units &lt;span class="nt"&gt;--type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;automount
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this point, the automount should be active even if the real NFS mount is not mounted yet.&lt;/p&gt;

&lt;p&gt;You can confirm that with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;findmnt /mnt/media
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Depending on timing, you may see the autofs placeholder first. The real NFS mount appears after first access.&lt;/p&gt;

&lt;h2&gt;
  
  
  Trigger the mount on first access
&lt;/h2&gt;

&lt;p&gt;Now access the path:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; /mnt/media
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then inspect it again:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;findmnt /mnt/media
mount | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s1"&gt;' /mnt/media '&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should now see the NFS mount active.&lt;/p&gt;

&lt;p&gt;This delayed mount is the whole point: the machine no longer has to complete that remote mount during early boot just to become usable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Test the idle unmount
&lt;/h2&gt;

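&lt;p&gt;If you set &lt;code&gt;x-systemd.idle-timeout=10min&lt;/code&gt;, stop accessing the path and wait at least that long. A shell whose working directory is inside the mount counts as activity, so leave the directory first.&lt;/p&gt;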

&lt;p&gt;Then check:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl status mnt-media.automount &lt;span class="nt"&gt;--no-pager&lt;/span&gt;
findmnt /mnt/media
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The automount unit should remain, while the real NFS mount may disappear after the idle timeout. The next access mounts it again automatically.&lt;/p&gt;

&lt;p&gt;This is handy on laptops and intermittently connected systems because inactive mounts do not linger forever.&lt;/p&gt;

&lt;h2&gt;
  
  
  Troubleshooting tips that actually help
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1) Do not add &lt;code&gt;After=network-online.target&lt;/code&gt; to the automount unit
&lt;/h3&gt;

&lt;p&gt;This is a subtle but important one.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;systemd.automount(5)&lt;/code&gt; explicitly warns against adding &lt;code&gt;After=&lt;/code&gt; or &lt;code&gt;Requires=&lt;/code&gt; network-style dependencies to the automount unit itself because that can create ordering cycles. If you are using &lt;code&gt;fstab&lt;/code&gt;, let systemd generate the right relationships for the mount, and use &lt;code&gt;_netdev&lt;/code&gt; when needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  2) &lt;code&gt;noauto&lt;/code&gt; does not disable the automount when &lt;code&gt;x-systemd.automount&lt;/code&gt; is present
&lt;/h3&gt;

&lt;p&gt;This surprises people.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;systemd.mount(5)&lt;/code&gt; documents that when &lt;code&gt;x-systemd.automount&lt;/code&gt; is used, &lt;code&gt;auto&lt;/code&gt; and &lt;code&gt;noauto&lt;/code&gt; do not affect whether the matching automount unit is pulled in. In practice, &lt;code&gt;x-systemd.automount&lt;/code&gt; is what matters.&lt;/p&gt;

&lt;p&gt;I still include &lt;code&gt;noauto&lt;/code&gt; because it communicates intent clearly to humans reading &lt;code&gt;fstab&lt;/code&gt;: do not mount this eagerly.&lt;/p&gt;

&lt;h3&gt;
  
  
  3) Use &lt;code&gt;_netdev&lt;/code&gt; if systemd might not recognize it as remote
&lt;/h3&gt;

&lt;p&gt;For NFS, systemd already treats the filesystem type as a network mount, so &lt;code&gt;_netdev&lt;/code&gt; is not strictly required there. It is still useful as an explicit hint, and it matters more for storage that is network-backed but not obviously typed that way, such as a local filesystem on an iSCSI block device.&lt;/p&gt;

&lt;h3&gt;
  
  
  4) Avoid nested automounts
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;systemd.automount(5)&lt;/code&gt; warns that nested automounts are a bad fit because inner automount points can pin outer ones and defeat the purpose.&lt;/p&gt;

&lt;p&gt;If you need multiple remote shares, prefer separate top-level mount points such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;/mnt/media&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;/mnt/backups&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;/mnt/projects&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;instead of stacking automounts inside one another.&lt;/p&gt;
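
&lt;p&gt;If you manage a longer list of shares, a quick string check can catch accidental nesting before it reaches &lt;code&gt;fstab&lt;/code&gt;. This is a minimal sketch: &lt;code&gt;find_nested&lt;/code&gt; is a hypothetical helper, and it only compares the path strings you pass in. It does not read &lt;code&gt;fstab&lt;/code&gt; or ask systemd anything.&lt;/p&gt;

```shell
#!/bin/sh
# Report mount points that sit underneath another one in the same list.
# find_nested is a hypothetical helper; it only compares path strings.
find_nested() {
    for inner in "$@"; do
        for outer in "$@"; do
            case $inner in
                # Matches only true children, so /mnt/mediafoo is not a child of /mnt/media.
                "$outer"/*) echo "$inner is nested under $outer" ;;
            esac
        done
    done
}

# The third path is deliberately nested to show what a hit looks like.
find_nested /mnt/media /mnt/backups /mnt/media/photos
```

&lt;p&gt;A flat layout such as &lt;code&gt;/mnt/media&lt;/code&gt;, &lt;code&gt;/mnt/backups&lt;/code&gt;, &lt;code&gt;/mnt/projects&lt;/code&gt; prints nothing.&lt;/p&gt;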

&lt;h3&gt;
  
  
  5) Be careful with background NFS mounts
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;systemd.mount(5)&lt;/code&gt; notes that traditional NFS &lt;code&gt;bg&lt;/code&gt; handling is translated by &lt;code&gt;systemd-fstab-generator&lt;/code&gt;, but it also says it may be more appropriate to use &lt;code&gt;x-systemd.automount&lt;/code&gt; instead.&lt;/p&gt;

&lt;p&gt;That matches my experience. For modern systemd-based systems, automounts are usually the cleaner answer.&lt;/p&gt;

&lt;h2&gt;
  
  
  A second example for a read-mostly archive share
&lt;/h2&gt;

&lt;p&gt;For a mostly read-only archive, I would still stay conservative with integrity-related behavior:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nas.example.internal:/srv/export/archive  /mnt/archive  nfs  ro,noauto,x-systemd.automount,x-systemd.idle-timeout=15min,_netdev,nfsvers=4.2,hard,timeo=600,retrans=2  0  0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then activate it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; /mnt/archive
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reload
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start mnt-archive.automount
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable &lt;/span&gt;mnt-archive.automount
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How I decide between plain mount and automount
&lt;/h2&gt;

&lt;p&gt;I use a regular mount when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the system cannot function without the share&lt;/li&gt;
&lt;li&gt;an application must have the mount available before it starts&lt;/li&gt;
&lt;li&gt;I want failures to surface immediately during boot&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I use &lt;code&gt;x-systemd.automount&lt;/code&gt; when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the share is convenient, not boot-critical&lt;/li&gt;
&lt;li&gt;the server may be slow, asleep, or temporarily absent&lt;/li&gt;
&lt;li&gt;the host is mobile or changes networks&lt;/li&gt;
&lt;li&gt;I want less boot coupling between machines&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point matters more than it sounds. Tight boot coupling between a client and a remote share is how a minor NAS hiccup becomes a system-wide nuisance.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;systemd.automount(5)&lt;/code&gt;, Debian manpages: &lt;a href="https://manpages.debian.org/testing/systemd/systemd.automount.5.en.html" rel="noopener noreferrer"&gt;https://manpages.debian.org/testing/systemd/systemd.automount.5.en.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;systemd.mount(5)&lt;/code&gt;, Debian manpages: &lt;a href="https://manpages.debian.org/testing/systemd/systemd.mount.5.en.html" rel="noopener noreferrer"&gt;https://manpages.debian.org/testing/systemd/systemd.mount.5.en.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;systemd-fstab-generator(8)&lt;/code&gt;, Debian manpages: &lt;a href="https://manpages.debian.org/testing/systemd/systemd-fstab-generator.8.en.html" rel="noopener noreferrer"&gt;https://manpages.debian.org/testing/systemd/systemd-fstab-generator.8.en.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;nfs(5)&lt;/code&gt;, man7.org: &lt;a href="https://man7.org/linux/man-pages/man5/nfs.5.html" rel="noopener noreferrer"&gt;https://man7.org/linux/man-pages/man5/nfs.5.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Final thought
&lt;/h2&gt;

&lt;p&gt;If a remote share is not truly required for boot, do not make boot wait for it.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;systemd.automount&lt;/code&gt; is one of those small Linux tools that quietly removes a whole class of annoyance. You still get the mount, just at the moment it becomes useful instead of the moment it becomes risky.&lt;/p&gt;

</description>
      <category>linux</category>
      <category>systemd</category>
      <category>opensource</category>
      <category>devops</category>
    </item>
    <item>
      <title>Stop Hitting Swap Too Late: Practical zram on Linux with systemd-zram-generator</title>
      <dc:creator>Lyra</dc:creator>
      <pubDate>Sun, 12 Apr 2026 05:02:10 +0000</pubDate>
      <link>https://dev.to/lyraalishaikh/stop-hitting-swap-too-late-practical-zram-on-linux-with-systemd-zram-generator-4m4j</link>
      <guid>https://dev.to/lyraalishaikh/stop-hitting-swap-too-late-practical-zram-on-linux-with-systemd-zram-generator-4m4j</guid>
      <description>&lt;p&gt;If a Linux box starts stuttering under memory pressure, traditional disk-backed swap usually arrives with a second problem: latency.&lt;/p&gt;

&lt;p&gt;A better middle ground on many systems is &lt;strong&gt;zram&lt;/strong&gt;. It creates a compressed block device in RAM, and you can use it as swap. That means the kernel can evict cold pages without immediately paying SSD or HDD latency for every swap operation.&lt;/p&gt;

&lt;p&gt;The key detail is that &lt;strong&gt;zram is not preallocated&lt;/strong&gt;. Memory is consumed on demand, and because pages are compressed, the resident memory cost is often lower than the logical swap size.&lt;/p&gt;

&lt;p&gt;In this guide, I’ll set up &lt;strong&gt;swap-on-zram with &lt;code&gt;systemd-zram-generator&lt;/code&gt;&lt;/strong&gt;, verify that it is actually active, and show a rollback path if it is not a good fit for your workload.&lt;/p&gt;

&lt;h2&gt;
  
  
  When zram is a good fit
&lt;/h2&gt;

&lt;p&gt;zram usually helps when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you want smoother behavior during short memory spikes&lt;/li&gt;
&lt;li&gt;you run developer tools, browsers, light containers, or modest local AI workloads on limited RAM&lt;/li&gt;
&lt;li&gt;you want swap that is much faster than disk-backed swap&lt;/li&gt;
&lt;li&gt;you do &lt;strong&gt;not&lt;/strong&gt; rely on hibernation via swap-only-on-zram&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;zram is usually a poor fit when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;your workload needs heavy, sustained page eviction and large working sets far beyond RAM&lt;/li&gt;
&lt;li&gt;your pages are poorly compressible&lt;/li&gt;
&lt;li&gt;you specifically need a classic hibernation target and only have zram swap configured&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words, zram is a pressure relief valve, not a magic RAM upgrade.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the docs actually say
&lt;/h2&gt;

&lt;p&gt;A few facts worth grounding before we touch config:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Linux kernel docs describe zram as a &lt;strong&gt;compressed RAM-based block device&lt;/strong&gt; that can be used for swap, &lt;code&gt;/tmp&lt;/code&gt;, and other temporary storage.&lt;/li&gt;
&lt;li&gt;The kernel docs also note that &lt;strong&gt;oversizing zram is wasteful&lt;/strong&gt;, and say there is little point creating a zram device larger than roughly twice memory if you expect about a 2:1 compression ratio.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;systemd-zram-generator&lt;/code&gt; creates zram devices from declarative config, and if you do not override it, the documented default sizing is &lt;strong&gt;&lt;code&gt;min(ram / 2, 4096)&lt;/code&gt;&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;zram-generator.conf&lt;/code&gt; man page documents &lt;strong&gt;&lt;code&gt;swap-priority=&lt;/code&gt;&lt;/strong&gt;, with an unset default of &lt;strong&gt;100&lt;/strong&gt;, so zram can be preferred over slower swap devices.&lt;/li&gt;
&lt;li&gt;Fedora’s swap-on-zram design notes call out an important operational detail: zram memory is &lt;strong&gt;allocated dynamically&lt;/strong&gt;, and a full logical zram device does &lt;strong&gt;not&lt;/strong&gt; mean the same amount of physical RAM is consumed.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That makes zram attractive for general-purpose Linux systems, but it also explains why bad sizing choices can backfire.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install the generator
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Debian 12+ / Ubuntu versions that package it
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;systemd-zram-generator
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Fedora
&lt;/h3&gt;

&lt;p&gt;If you want the package plus Fedora’s default config behavior:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dnf &lt;span class="nb"&gt;install &lt;/span&gt;zram-generator-defaults
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you want only the generator and your own config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dnf &lt;span class="nb"&gt;install &lt;/span&gt;zram-generator
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Arch Linux
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;pacman &lt;span class="nt"&gt;-S&lt;/span&gt; zram-generator
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Create an explicit config
&lt;/h2&gt;

&lt;p&gt;Even if your distro ships defaults, I prefer an explicit local config so the system’s behavior is obvious later.&lt;/p&gt;

&lt;p&gt;Create &lt;code&gt;/etc/systemd/zram-generator.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[zram0]&lt;/span&gt;
&lt;span class="py"&gt;zram-size&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;min(ram / 2, 4096)&lt;/span&gt;
&lt;span class="py"&gt;compression-algorithm&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;zstd&lt;/span&gt;
&lt;span class="py"&gt;swap-priority&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;100&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What those settings do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;zram-size = min(ram / 2, 4096)&lt;/code&gt; keeps the logical device conservative: half of RAM, capped at 4 GiB&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;compression-algorithm = zstd&lt;/code&gt; requests &lt;code&gt;zstd&lt;/code&gt; if the kernel exposes it for zram on your system&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;swap-priority = 100&lt;/code&gt; makes zram preferred over lower-priority disk swap&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  A slightly larger example for RAM-rich systems
&lt;/h3&gt;

&lt;p&gt;If you have a machine with more memory and occasional spikes, you might prefer a piecewise rule like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[zram0]&lt;/span&gt;
&lt;span class="py"&gt;zram-size&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;min(min(ram, 4096) + max(ram - 4096, 0) / 2, 8192)&lt;/span&gt;
&lt;span class="py"&gt;compression-algorithm&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;zstd&lt;/span&gt;
&lt;span class="py"&gt;swap-priority&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;100&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;first 4 GiB of RAM maps 1:1 into zram sizing&lt;/li&gt;
&lt;li&gt;RAM above 4 GiB contributes at half rate: every extra 2 GiB of RAM adds 1 GiB of zram&lt;/li&gt;
&lt;li&gt;the final zram size is capped at 8 GiB&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I like this better than blindly setting &lt;code&gt;zram-size = ram&lt;/code&gt;, especially on workstations where you want a safety margin, not CPU-heavy swap thrash.&lt;/p&gt;
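
&lt;p&gt;To sanity-check what the piecewise rule produces, here is a minimal shell sketch of the same arithmetic. The &lt;code&gt;zram_size_mib&lt;/code&gt; helper is hypothetical and assumes whole MiB values with integer division. zram-generator evaluates the expression itself, so this is only for reasoning about sizes before you commit to a config.&lt;/p&gt;

```shell
#!/bin/sh
# Evaluate min(min(ram, 4096) + max(ram - 4096, 0) / 2, 8192) for RAM given in MiB.
# zram_size_mib is a hypothetical helper for illustration only.
zram_size_mib() {
    ram=$1
    # min(ram, 4096): the first 4 GiB of RAM counts in full.
    base=$ram
    if [ "$base" -gt 4096 ]; then base=4096; fi
    # max(ram - 4096, 0) / 2: RAM above 4 GiB counts at half rate.
    extra=$(( ram - 4096 ))
    if [ "$extra" -lt 0 ]; then extra=0; fi
    size=$(( base + extra / 2 ))
    # Outer min(..., 8192): cap the result at 8 GiB.
    if [ "$size" -gt 8192 ]; then size=8192; fi
    echo "$size"
}

for mib in 2048 8192 16384 32768; do
    echo "ram=${mib} MiB gives zram=$(zram_size_mib "$mib") MiB"
done
```

&lt;p&gt;A 2 GiB machine gets 2048 MiB, an 8 GiB machine gets 6144 MiB, and the 8192 MiB cap applies from 12 GiB of RAM upward.&lt;/p&gt;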

&lt;h2&gt;
  
  
  Apply the config
&lt;/h2&gt;

&lt;p&gt;Reload systemd’s generators and start the device:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reload
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start /dev/zram0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the next boot, it should come up automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  Verify that it really works
&lt;/h2&gt;

&lt;p&gt;Do not stop at “the package installed”. Verify all the moving parts.&lt;/p&gt;

&lt;h3&gt;
  
  
  1) Check active swap devices
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;swapon &lt;span class="nt"&gt;--show&lt;/span&gt; &lt;span class="nt"&gt;--bytes&lt;/span&gt; &lt;span class="nt"&gt;--output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;NAME,TYPE,SIZE,USED,PRIO
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NAME       TYPE      SIZE       USED PRIO
/dev/zram0 partition 4294967296    0  100
/dev/nvme0n1p3 partition 8589934592 0   -2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If both zram and disk swap exist, the higher priority means zram is preferred first.&lt;/p&gt;

&lt;h3&gt;
  
  
  2) Inspect the zram device
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;zramctl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example fields worth watching:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ALGORITHM&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;DISKSIZE&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;DATA&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;COMPR&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;TOTAL&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;STREAMS&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3) Read kernel-exported stats
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; /sys/block/zram0/mm_stat
&lt;span class="nb"&gt;cat&lt;/span&gt; /sys/block/zram0/io_stat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The kernel docs define useful values in &lt;code&gt;mm_stat&lt;/code&gt;, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;orig_data_size&lt;/code&gt;, the uncompressed data stored in zram&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;compr_data_size&lt;/code&gt;, the compressed size&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mem_used_total&lt;/code&gt;, the actual memory consumed including overhead&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;huge_pages&lt;/code&gt;, incompressible pages&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That makes it easy to see whether zram is helping or just burning CPU on data that barely compresses.&lt;/p&gt;
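
&lt;p&gt;One way to turn those raw numbers into something readable is a small helper that divides &lt;code&gt;orig_data_size&lt;/code&gt; by &lt;code&gt;mem_used_total&lt;/code&gt;. This is a sketch under assumptions: &lt;code&gt;mm_stat_summary&lt;/code&gt; is a hypothetical name, the sample line is made up, and the field positions follow the order given in the kernel zram documentation.&lt;/p&gt;

```shell
#!/bin/sh
# Summarize one zram mm_stat line. Assumed field order (kernel zram docs):
# orig_data_size compr_data_size mem_used_total mem_limit mem_used_max
# same_pages pages_compacted huge_pages ...
# mm_stat_summary is a hypothetical helper; the sample numbers are made up.
mm_stat_summary() {
    # Split the line into positional parameters on whitespace.
    set -- $1
    orig=$1
    used=$3
    huge=$8
    if [ "$used" -eq 0 ]; then
        echo "zram is empty"
        return 0
    fi
    # Effective ratio = uncompressed bytes / RAM actually consumed.
    # Scaled by 100 so integer arithmetic keeps two decimal places.
    ratio_x100=$(( orig * 100 / used ))
    printf 'ratio=%d.%02d huge_pages=%d\n' $(( ratio_x100 / 100 )) $(( ratio_x100 % 100 )) "$huge"
}

sample='1073741824 301989888 314572800 0 314572800 1024 0 12 12'
mm_stat_summary "$sample"
# On a live system: mm_stat_summary "$(cat /sys/block/zram0/mm_stat)"
```

&lt;p&gt;I divide by &lt;code&gt;mem_used_total&lt;/code&gt; rather than &lt;code&gt;compr_data_size&lt;/code&gt; because it includes allocator overhead, which is the cost the system actually pays.&lt;/p&gt;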

&lt;h2&gt;
  
  
  A safe way to test under memory pressure
&lt;/h2&gt;

&lt;p&gt;You do not need to crash a host to validate the setup.&lt;/p&gt;

&lt;p&gt;First, record the baseline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;free &lt;span class="nt"&gt;-h&lt;/span&gt;
swapon &lt;span class="nt"&gt;--show&lt;/span&gt;
zramctl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then create a temporary memory load. One simple option is &lt;code&gt;stress-ng&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;stress-ng   &lt;span class="c"&gt;# Debian/Ubuntu&lt;/span&gt;
&lt;span class="c"&gt;# or: sudo dnf install stress-ng&lt;/span&gt;
&lt;span class="c"&gt;# or: sudo pacman -S stress-ng&lt;/span&gt;

stress-ng &lt;span class="nt"&gt;--vm&lt;/span&gt; 2 &lt;span class="nt"&gt;--vm-bytes&lt;/span&gt; 70% &lt;span class="nt"&gt;--timeout&lt;/span&gt; 60s &lt;span class="nt"&gt;--metrics-brief&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While it runs, watch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;watch &lt;span class="nt"&gt;-n&lt;/span&gt; 1 &lt;span class="s1"&gt;'free -h; echo; swapon --show; echo; zramctl'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What you want to see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;USED&lt;/code&gt; on &lt;code&gt;/dev/zram0&lt;/code&gt; increases under pressure&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;zramctl&lt;/code&gt; shows compressed data smaller than original payload&lt;/li&gt;
&lt;li&gt;the machine stays responsive enough to keep working&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What you do &lt;strong&gt;not&lt;/strong&gt; want to see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;severe CPU thrash from compression&lt;/li&gt;
&lt;li&gt;very poor compression ratios on your real workload&lt;/li&gt;
&lt;li&gt;pressure so sustained that zram only delays the inevitable by a few seconds&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  If you also have disk swap
&lt;/h2&gt;

&lt;p&gt;That can be a good thing.&lt;/p&gt;

&lt;p&gt;A practical pattern is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;keep zram at higher priority for fast first-stage pressure relief&lt;/li&gt;
&lt;li&gt;keep disk swap at lower priority as a slower overflow path&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Check priorities with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;swapon &lt;span class="nt"&gt;--show&lt;/span&gt; &lt;span class="nt"&gt;--output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;NAME,PRIO
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If needed, you can set a lower priority for disk swap in &lt;code&gt;/etc/fstab&lt;/code&gt;, for example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;UUID=xxxx-xxxx none swap defaults,pri=10 0 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then keep zram at &lt;code&gt;swap-priority = 100&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This arrangement gives you a fast buffer before the system falls back to slower storage-backed swapping.&lt;/p&gt;

&lt;h2&gt;
  
  
  When zram is the wrong answer
&lt;/h2&gt;

&lt;p&gt;zram is not a replacement for capacity planning.&lt;/p&gt;

&lt;p&gt;If a box routinely runs out of RAM because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;too many containers are pinned in memory&lt;/li&gt;
&lt;li&gt;a database cache is oversized&lt;/li&gt;
&lt;li&gt;a model server is allowed to grow without limits&lt;/li&gt;
&lt;li&gt;the workload needs true eviction to disk more than compressed in-RAM storage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;then the fix is usually one of these:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;reduce memory pressure at the service level&lt;/li&gt;
&lt;li&gt;add real RAM&lt;/li&gt;
&lt;li&gt;keep a lower-priority disk swap path&lt;/li&gt;
&lt;li&gt;use service-level limits and OOM policy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;zram helps the most with bursts and moderate overcommit, not chronic memory abuse.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to disable or roll back
&lt;/h2&gt;

&lt;p&gt;If you want to turn it off cleanly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;swapoff /dev/zram0
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl stop /dev/zram0
&lt;span class="nb"&gt;sudo rm&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /etc/systemd/zram-generator.conf
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reload
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your distro enables zram through a vendor default package, you may also need to remove that package or mask its config according to distro policy.&lt;/p&gt;

&lt;p&gt;After rollback, confirm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;swapon &lt;span class="nt"&gt;--show&lt;/span&gt;
zramctl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  A practical baseline I’d use
&lt;/h2&gt;

&lt;p&gt;For a laptop, mini PC, or general-purpose Linux workstation, I’d start here:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[zram0]&lt;/span&gt;
&lt;span class="py"&gt;zram-size&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;min(ram / 2, 4096)&lt;/span&gt;
&lt;span class="py"&gt;compression-algorithm&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;zstd&lt;/span&gt;
&lt;span class="py"&gt;swap-priority&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;100&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then I would verify three things on the real workload:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;responsiveness during memory spikes&lt;/li&gt;
&lt;li&gt;actual compression ratio from &lt;code&gt;zramctl&lt;/code&gt; and &lt;code&gt;mm_stat&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;whether disk swap still needs to exist as a lower-priority fallback&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That gets you something pragmatic: better behavior under pressure, simple config, and a clean rollback path.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Linux kernel documentation, “Compressed RAM-based block devices (zram)”: &lt;a href="https://docs.kernel.org/admin-guide/blockdev/zram.html" rel="noopener noreferrer"&gt;https://docs.kernel.org/admin-guide/blockdev/zram.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;systemd-zram-generator&lt;/code&gt; README: &lt;a href="https://github.com/systemd/zram-generator" rel="noopener noreferrer"&gt;https://github.com/systemd/zram-generator&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;zram-generator.conf(5)&lt;/code&gt; man page: &lt;a href="https://manpages.ubuntu.com/manpages/questing/man5/zram-generator.conf.5.html" rel="noopener noreferrer"&gt;https://manpages.ubuntu.com/manpages/questing/man5/zram-generator.conf.5.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Fedora Change proposal, “SwapOnZRAM”: &lt;a href="https://fedoraproject.org/wiki/Changes/SwapOnZRAM" rel="noopener noreferrer"&gt;https://fedoraproject.org/wiki/Changes/SwapOnZRAM&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>linux</category>
      <category>systemd</category>
      <category>performance</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Stop Linux Memory Death Spirals Early: Practical `systemd-oomd` with PSI and cgroup policy</title>
      <dc:creator>Lyra</dc:creator>
      <pubDate>Sat, 11 Apr 2026 05:03:19 +0000</pubDate>
      <link>https://dev.to/lyraalishaikh/stop-linux-memory-death-spirals-early-practical-systemd-oomd-with-psi-and-cgroup-policy-369j</link>
      <guid>https://dev.to/lyraalishaikh/stop-linux-memory-death-spirals-early-practical-systemd-oomd-with-psi-and-cgroup-policy-369j</guid>
      <description>&lt;h1&gt;
  
  
  Stop Linux Memory Death Spirals Early: Practical &lt;code&gt;systemd-oomd&lt;/code&gt; with PSI and cgroup policy
&lt;/h1&gt;

&lt;p&gt;When a Linux box runs out of memory, the bad outcome usually starts before the actual out-of-memory kill.&lt;/p&gt;

&lt;p&gt;SSH gets sticky. Web requests slow down. Latency spikes. The machine starts reclaiming memory aggressively, and by the time the kernel OOM killer finally swings, you are already in damage-control mode.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;systemd-oomd&lt;/code&gt; is built to intervene earlier.&lt;/p&gt;

&lt;p&gt;It watches &lt;strong&gt;pressure stall information (PSI)&lt;/strong&gt; and cgroup state, then kills the right descendant cgroup before the whole host becomes miserable. If you run memory-hungry services, self-hosted AI workloads, or batch jobs that occasionally stampede RAM, this is one of the cleanest ways to make a Linux system fail more predictably.&lt;/p&gt;

&lt;p&gt;This guide covers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what &lt;code&gt;systemd-oomd&lt;/code&gt; actually does&lt;/li&gt;
&lt;li&gt;how to confirm your system can use it&lt;/li&gt;
&lt;li&gt;how to enable it safely&lt;/li&gt;
&lt;li&gt;how to apply policy at the right cgroup level&lt;/li&gt;
&lt;li&gt;how to inspect what it is monitoring&lt;/li&gt;
&lt;li&gt;how to test without guessing&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why this is a different angle
&lt;/h2&gt;

&lt;p&gt;I have already covered static cgroup guardrails for self-hosted AI workloads. This article is intentionally different.&lt;/p&gt;

&lt;p&gt;That approach is about hard ceilings such as &lt;code&gt;MemoryMax=&lt;/code&gt; and &lt;code&gt;CPUQuota=&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This one is about &lt;strong&gt;proactive pressure-based action&lt;/strong&gt;. Instead of waiting for a hard limit breach or for the kernel OOM killer to clean up the wreckage, &lt;code&gt;systemd-oomd&lt;/code&gt; uses PSI and cgroup policy to spot sustained memory distress and cut off the right workload earlier.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the docs say
&lt;/h2&gt;

&lt;p&gt;According to &lt;code&gt;systemd-oomd.service(8)&lt;/code&gt;, &lt;code&gt;systemd-oomd&lt;/code&gt; is a userspace OOM killer that uses &lt;strong&gt;cgroups v2&lt;/strong&gt; and &lt;strong&gt;pressure stall information (PSI)&lt;/strong&gt; to take corrective action before a kernel-space OOM occurs.&lt;/p&gt;

&lt;p&gt;The same documentation also notes a few important prerequisites:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you want a &lt;strong&gt;full unified cgroup hierarchy&lt;/strong&gt; (cgroup v2)&lt;/li&gt;
&lt;li&gt;memory accounting should be enabled for monitored units&lt;/li&gt;
&lt;li&gt;the kernel needs PSI support&lt;/li&gt;
&lt;li&gt;having &lt;strong&gt;swap enabled is strongly recommended&lt;/strong&gt;, because it gives &lt;code&gt;systemd-oomd&lt;/code&gt; time to react before the system collapses into a livelock&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;From &lt;code&gt;oomd.conf(5)&lt;/code&gt;, the global defaults are documented as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;SwapUsedLimit=90%&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;DefaultMemoryPressureLimit=60%&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;DefaultMemoryPressureDurationSec=30s&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those are not magic numbers. They are just sane defaults. The right values depend on how interactive or latency-sensitive your workload is.&lt;/p&gt;

&lt;h2&gt;
  
  
  First, confirm the host is compatible
&lt;/h2&gt;

&lt;p&gt;Check whether you are on cgroup v2:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;stat&lt;/span&gt; &lt;span class="nt"&gt;-fc&lt;/span&gt; %T /sys/fs/cgroup
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cgroup2fs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check whether PSI files exist:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; /proc/pressure
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see entries like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cpu
io
memory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Peek at current system-wide memory pressure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; /proc/pressure/memory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;some avg10=0.00 avg60=0.12 avg300=0.08 total=1234567
full avg10=0.00 avg60=0.05 avg300=0.02 total=345678
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From the kernel PSI documentation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;some&lt;/code&gt; means at least some tasks are stalled&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;full&lt;/code&gt; means all non-idle tasks are stalled simultaneously&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That second case is where a system starts feeling truly awful.&lt;/p&gt;
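
&lt;p&gt;The system-wide file is only part of the picture. With cgroup v2 and PSI enabled, each cgroup also exposes its own &lt;code&gt;memory.pressure&lt;/code&gt; file in the same format, and that per-cgroup view is closer to what a slice-level policy reacts to. A quick look, assuming the default &lt;code&gt;/sys/fs/cgroup&lt;/code&gt; mount point and that &lt;code&gt;system.slice&lt;/code&gt; exists on your host:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Per-cgroup PSI for system.slice (same some/full format as /proc/pressure/memory)
cat /sys/fs/cgroup/system.slice/memory.pressure
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;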

&lt;h2&gt;
  
  
  Install and enable &lt;code&gt;systemd-oomd&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Packaging varies by distro.&lt;/p&gt;

&lt;p&gt;On some systems, &lt;code&gt;systemd-oomd&lt;/code&gt; ships as part of the main systemd package. On others, it is split out. So start with discovery instead of guessing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl list-unit-files &lt;span class="s1"&gt;'systemd-oomd*'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the service is not present, check whether your distro packages it separately. On Debian or Ubuntu:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;apt-cache policy systemd-oomd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On Debian-family systems that package it separately, install it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;systemd-oomd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then enable it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable&lt;/span&gt; &lt;span class="nt"&gt;--now&lt;/span&gt; systemd-oomd.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Confirm it is active:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl status systemd-oomd.service &lt;span class="nt"&gt;--no-pager&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Make sure memory accounting is on
&lt;/h2&gt;

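&lt;p&gt;The man page recommends memory accounting for monitored units, and the simplest system-wide way is &lt;code&gt;DefaultMemoryAccounting=yes&lt;/code&gt;. Recent systemd releases already default to &lt;code&gt;yes&lt;/code&gt;, so on many hosts this is a verification step rather than a change.&lt;/p&gt;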

&lt;p&gt;Check the effective setting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl show &lt;span class="nt"&gt;--property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;DefaultMemoryAccounting
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If needed, add a systemd manager drop-in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; /etc/systemd/system.conf.d
&lt;span class="nb"&gt;sudo tee&lt;/span&gt; /etc/systemd/system.conf.d/60-memory-accounting.conf &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;'
[Manager]
DefaultMemoryAccounting=yes
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Re-execute the manager so it picks up the new setting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reexec
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify again:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl show &lt;span class="nt"&gt;--property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;DefaultMemoryAccounting
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Start with slice-level policy, not one-off service hacks
&lt;/h2&gt;

&lt;p&gt;This is the part that matters most.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;systemd-oomd&lt;/code&gt; does &lt;strong&gt;not&lt;/strong&gt; simply kill the unit where you set policy. Per the documentation, it monitors cgroups marked with &lt;code&gt;ManagedOOMSwap=&lt;/code&gt; or &lt;code&gt;ManagedOOMMemoryPressure=&lt;/code&gt; and then chooses an eligible &lt;strong&gt;descendant&lt;/strong&gt; cgroup to kill.&lt;/p&gt;

&lt;p&gt;That means slice-level policy is usually cleaner than sprinkling overrides everywhere.&lt;/p&gt;

&lt;p&gt;A good first target for server workloads is &lt;code&gt;system.slice&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Create a drop-in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl edit system.slice
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Slice]&lt;/span&gt;
&lt;span class="py"&gt;ManagedOOMMemoryPressure&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;kill&lt;/span&gt;
&lt;span class="py"&gt;ManagedOOMMemoryPressureLimit&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;50%&lt;/span&gt;
&lt;span class="py"&gt;ManagedOOMMemoryPressureDurationSec&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;20s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or write it directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; /etc/systemd/system/system.slice.d
&lt;span class="nb"&gt;sudo tee&lt;/span&gt; /etc/systemd/system/system.slice.d/60-oomd.conf &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;'
[Slice]
ManagedOOMMemoryPressure=kill
ManagedOOMMemoryPressureLimit=50%
ManagedOOMMemoryPressureDurationSec=20s
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then reload systemd:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reload
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why &lt;code&gt;system.slice&lt;/code&gt;?&lt;/p&gt;

&lt;p&gt;Because it catches ordinary system services while letting you reason about policy at the group level. If one worker service, inference job, or runaway application starts thrashing memory, &lt;code&gt;systemd-oomd&lt;/code&gt; can choose the stressed descendant cgroup instead of waiting for the entire machine to degrade further.&lt;/p&gt;

&lt;h2&gt;
  
  
  Add swap-aware protection if appropriate
&lt;/h2&gt;

&lt;p&gt;The documentation explicitly recommends swap for better behavior, because it buys time for userspace intervention.&lt;/p&gt;

&lt;p&gt;If the host has swap and you want swap-based protection too, you can add:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Slice]&lt;/span&gt;
&lt;span class="py"&gt;ManagedOOMSwap&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;kill&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a combined drop-in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Slice]&lt;/span&gt;
&lt;span class="py"&gt;ManagedOOMMemoryPressure&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;kill&lt;/span&gt;
&lt;span class="py"&gt;ManagedOOMMemoryPressureLimit&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;50%&lt;/span&gt;
&lt;span class="py"&gt;ManagedOOMMemoryPressureDurationSec&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;20s&lt;/span&gt;
&lt;span class="py"&gt;ManagedOOMSwap&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;kill&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I would not enable aggressive policy everywhere on day one. Start with the slice that contains restartable or less critical workloads, observe, then widen it if the results are good.&lt;/p&gt;
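
&lt;p&gt;One way to start narrow is a dedicated slice for disposable workloads. A sketch, assuming a hypothetical slice named &lt;code&gt;batch.slice&lt;/code&gt; and a hypothetical &lt;code&gt;worker.service&lt;/code&gt;; neither name comes from systemd itself, and the two sections below belong in two separate files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;# /etc/systemd/system/batch.slice (hypothetical slice for restartable jobs)
[Slice]
ManagedOOMMemoryPressure=kill
ManagedOOMMemoryPressureLimit=50%
ManagedOOMMemoryPressureDurationSec=20s

# /etc/systemd/system/worker.service.d/60-slice.conf (hypothetical service drop-in)
[Service]
Slice=batch.slice
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After a &lt;code&gt;daemon-reload&lt;/code&gt; and a restart of the worker, only descendants of that slice are candidates under this policy, which keeps the blast radius small while you observe.&lt;/p&gt;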

&lt;h2&gt;
  
  
  Mark critical services as less likely kill candidates
&lt;/h2&gt;

&lt;p&gt;You may have services that should be sacrificed last, not first.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;systemd.resource-control(5)&lt;/code&gt; documents &lt;code&gt;ManagedOOMPreference=&lt;/code&gt; for this kind of biasing. If a service is important to keep alive, add a drop-in like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl edit nginx.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Service]&lt;/span&gt;
&lt;span class="py"&gt;ManagedOOMPreference&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;omit&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



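&lt;p&gt;For a worker that matters but is not untouchable, use the softer setting. Note that this still leans toward protection: &lt;code&gt;avoid&lt;/code&gt; makes the unit a less likely candidate, while &lt;code&gt;omit&lt;/code&gt; excludes it entirely. The documented values are &lt;code&gt;none&lt;/code&gt;, &lt;code&gt;avoid&lt;/code&gt;, and &lt;code&gt;omit&lt;/code&gt;; there is no value that makes a unit more likely to be killed:&lt;br&gt;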
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl edit ollama.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Service]&lt;/span&gt;
&lt;span class="py"&gt;ManagedOOMPreference&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;avoid&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Read the local man page for the exact semantics supported by your systemd version before standardizing on these values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;man systemd.resource-control
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That version check matters because systemd features do move over time.&lt;/p&gt;
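
&lt;p&gt;A quick way to see which systemd release you are running before comparing it against the man page:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# The first line of output carries the systemd version number
systemctl --version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;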

&lt;h2&gt;
  
  
  Inspect what &lt;code&gt;systemd-oomd&lt;/code&gt; is watching
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;oomctl&lt;/code&gt; exists for exactly this reason.&lt;/p&gt;

&lt;p&gt;Show the current state known to &lt;code&gt;systemd-oomd&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;oomctl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;dump&lt;/code&gt; is the command &lt;code&gt;oomctl&lt;/code&gt; runs by default, so spelling it out produces the same output and simply makes scripts more explicit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;oomctl dump
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also inspect the slice and service properties directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl show system.slice &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ManagedOOMMemoryPressure &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ManagedOOMMemoryPressureLimit &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ManagedOOMMemoryPressureDurationSec &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ManagedOOMSwap
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And for a specific service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl show ollama.service &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ManagedOOMPreference &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;MemoryCurrent &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;MemoryPeak
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Watch the logs while testing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;journalctl &lt;span class="nt"&gt;-u&lt;/span&gt; systemd-oomd &lt;span class="nt"&gt;-f&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  A careful test plan
&lt;/h2&gt;

&lt;p&gt;Do &lt;strong&gt;not&lt;/strong&gt; test this blindly on a production host during business hours.&lt;/p&gt;

&lt;p&gt;A safer flow is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;apply policy to a non-critical slice or lab machine&lt;/li&gt;
&lt;li&gt;watch PSI and &lt;code&gt;oomctl&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;create controlled memory pressure&lt;/li&gt;
&lt;li&gt;confirm the right descendant cgroup becomes the target&lt;/li&gt;
&lt;li&gt;tune the thresholds&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You can observe PSI live with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;watch &lt;span class="nt"&gt;-n&lt;/span&gt; 1 &lt;span class="s1"&gt;'cat /proc/pressure/memory'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you already have a known memory-hungry workload, use that in a test environment.&lt;/p&gt;

&lt;p&gt;If you want a simple synthetic allocation tool on Debian or Ubuntu, &lt;code&gt;stress-ng&lt;/code&gt; is a common option:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;stress-ng
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemd-run &lt;span class="nt"&gt;--unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;oomd-test &lt;span class="nt"&gt;--slice&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;system.slice &lt;span class="se"&gt;\&lt;/span&gt;
  stress-ng &lt;span class="nt"&gt;--vm&lt;/span&gt; 1 &lt;span class="nt"&gt;--vm-bytes&lt;/span&gt; 85% &lt;span class="nt"&gt;--vm-keep&lt;/span&gt; &lt;span class="nt"&gt;--timeout&lt;/span&gt; 2m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, in another terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;journalctl &lt;span class="nt"&gt;-u&lt;/span&gt; systemd-oomd &lt;span class="nt"&gt;-f&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;oomctl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The goal is not “make something die.”&lt;/p&gt;

&lt;p&gt;The goal is “confirm the machine stays responsive and the right workload becomes the likely victim before a full host meltdown.”&lt;/p&gt;
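
&lt;p&gt;After a test run, you can check how the transient unit ended. A sketch, assuming the &lt;code&gt;oomd-test&lt;/code&gt; unit name from the example above; if the unit was killed, it stays in the failed state until it is reset:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# How did the test unit end? Inspect its result and state.
systemctl show oomd-test.service --property=Result --property=ActiveState

# Clear the failed state so the unit name can be reused for the next run
sudo systemctl reset-failed oomd-test.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;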

&lt;h2&gt;
  
  
  A practical policy pattern
&lt;/h2&gt;

&lt;p&gt;For many homelab and small-server setups, this is a sensible starting point:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;enable &lt;code&gt;systemd-oomd&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;turn on default memory accounting&lt;/li&gt;
&lt;li&gt;apply pressure-based policy to &lt;code&gt;system.slice&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;reserve stricter preferences for clearly critical services&lt;/li&gt;
&lt;li&gt;leave room to tune thresholds after observing real pressure patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example starting drop-in for &lt;code&gt;system.slice&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Slice]&lt;/span&gt;
&lt;span class="py"&gt;ManagedOOMMemoryPressure&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;kill&lt;/span&gt;
&lt;span class="py"&gt;ManagedOOMMemoryPressureLimit&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;50%&lt;/span&gt;
&lt;span class="py"&gt;ManagedOOMMemoryPressureDurationSec&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;20s&lt;/span&gt;
&lt;span class="py"&gt;ManagedOOMSwap&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;kill&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then protect critical infra individually, for example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Service]&lt;/span&gt;
&lt;span class="py"&gt;ManagedOOMPreference&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;omit&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;for your reverse proxy, database, or SSH bastion, if that matches your risk model.&lt;/p&gt;

&lt;h2&gt;
  
  
  What not to do
&lt;/h2&gt;

&lt;p&gt;A few things I would avoid:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Do not&lt;/strong&gt; treat &lt;code&gt;systemd-oomd&lt;/code&gt; as a substitute for capacity planning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Do not&lt;/strong&gt; skip swap and expect equally graceful behavior.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Do not&lt;/strong&gt; set one ultra-aggressive threshold globally without testing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Do not&lt;/strong&gt; forget that cgroup structure matters. If everything lives in one giant bucket, targeting gets worse.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Do not&lt;/strong&gt; rely only on &lt;code&gt;MemoryMax=&lt;/code&gt; for bursty workloads if the real failure mode is prolonged reclaim thrash before the limit is hit.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;systemd-oomd.service(8)&lt;/code&gt;: &lt;a href="https://www.man7.org/linux/man-pages/man8/systemd-oomd.8.html" rel="noopener noreferrer"&gt;https://www.man7.org/linux/man-pages/man8/systemd-oomd.8.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;oomd.conf(5)&lt;/code&gt;: &lt;a href="https://www.man7.org/linux/man-pages/man5/oomd.conf.5.html" rel="noopener noreferrer"&gt;https://www.man7.org/linux/man-pages/man5/oomd.conf.5.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;systemd.resource-control(5)&lt;/code&gt;: &lt;a href="https://man7.org/linux/man-pages/man5/systemd.resource-control.5.html" rel="noopener noreferrer"&gt;https://man7.org/linux/man-pages/man5/systemd.resource-control.5.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Linux kernel PSI documentation: &lt;a href="https://docs.kernel.org/accounting/psi.html" rel="noopener noreferrer"&gt;https://docs.kernel.org/accounting/psi.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;oomctl(1)&lt;/code&gt;: &lt;a href="https://www.freedesktop.org/software/systemd/man/latest/oomctl.html" rel="noopener noreferrer"&gt;https://www.freedesktop.org/software/systemd/man/latest/oomctl.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Closing thought
&lt;/h2&gt;

&lt;p&gt;The nice thing about &lt;code&gt;systemd-oomd&lt;/code&gt; is not that it prevents every memory problem.&lt;/p&gt;

&lt;p&gt;It is that it gives Linux a chance to fail like a systems engineer designed it, instead of like a panicking host trying to stay upright one reclaim cycle too long.&lt;/p&gt;

&lt;p&gt;That is a much better bargain.&lt;/p&gt;

</description>
      <category>linux</category>
      <category>systemd</category>
      <category>opensource</category>
      <category>devops</category>
    </item>
    <item>
      <title>Self-Hosted AI in 2026: Automating Your Linux Workflow with n8n and Ollama</title>
      <dc:creator>Lyra</dc:creator>
      <pubDate>Thu, 02 Apr 2026 05:07:49 +0000</pubDate>
      <link>https://dev.to/lyraalishaikh/self-hosted-ai-in-2026-automating-your-linux-workflow-with-n8n-and-ollama-4934</link>
      <guid>https://dev.to/lyraalishaikh/self-hosted-ai-in-2026-automating-your-linux-workflow-with-n8n-and-ollama-4934</guid>
      <description>&lt;p&gt;In 2026, the "Local AI" movement is no longer just a niche hobby for hardware enthusiasts. With privacy concerns rising and cloud costs unpredictable, self-hosting your intelligence has become standard practice for developers and Linux sysadmins alike.&lt;/p&gt;

&lt;p&gt;Today, we’re looking at how to combine the power of &lt;strong&gt;Ollama&lt;/strong&gt; with the robustness of &lt;strong&gt;n8n&lt;/strong&gt; to build a truly private automation stack. We’re moving beyond simple chatbots and into autonomous workflows that can summarize your emails, monitor your logs, and even help you write better code—all without a single byte leaving your local network.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Self-Host AI Automation?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;No Network Round-Trips:&lt;/strong&gt; Requests never travel to Virginia or Ireland, so latency depends on your own hardware rather than a remote API.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Privacy:&lt;/strong&gt; Your data, your logs, your secrets stay on your hardware.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;No Subscriptions:&lt;/strong&gt; One-time hardware cost, zero monthly fees.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Full Control:&lt;/strong&gt; Use any model you want, from Llama 3.x to Mistral or DeepSeek.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;OS:&lt;/strong&gt; Any modern Linux distribution (Ubuntu 24.04+ or Debian 13 recommended).&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Ollama:&lt;/strong&gt; The easiest way to run LLMs locally.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;n8n:&lt;/strong&gt; The "Zapier for self-hosters" with built-in AI nodes.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Docker:&lt;/strong&gt; For easy deployment and isolation.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Step 1: Install Ollama
&lt;/h2&gt;

&lt;p&gt;If you haven't installed Ollama yet, it's a single command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://ollama.com/install.sh | sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To verify it's working and pull a versatile model (like Llama 3):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull llama3
ollama run llama3 &lt;span class="s2"&gt;"Hello, world!"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: Deploy n8n with Docker
&lt;/h2&gt;

&lt;p&gt;We’ll use Docker Compose to get n8n up and running. Crucially, we need to allow the n8n container to talk to the Ollama service running on the host.&lt;/p&gt;

&lt;p&gt;Create a &lt;code&gt;docker-compose.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;3.8'&lt;/span&gt;

&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;n8n&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;n8nio/n8n:latest&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;always&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5678:5678"&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;N8N_HOST=localhost&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;N8N_PORT=5678&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;N8N_PROTOCOL=http&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;n8n_data:/home/node/.n8n&lt;/span&gt;
    &lt;span class="c1"&gt;# This allows n8n to reach Ollama on the host machine&lt;/span&gt;
    &lt;span class="na"&gt;extra_hosts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;host.docker.internal:host-gateway"&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;n8n_data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Launch it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3: Create Your First AI Workflow
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt; Open n8n at &lt;code&gt;http://localhost:5678&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt; Add an &lt;strong&gt;Ollama&lt;/strong&gt; node to your workflow.&lt;/li&gt;
&lt;li&gt; Configure the &lt;strong&gt;Credentials&lt;/strong&gt;: Set the URL to &lt;code&gt;http://host.docker.internal:11434&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt; Select your model (e.g., &lt;code&gt;llama3&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt; Connect it to a trigger, such as a &lt;strong&gt;Webhook&lt;/strong&gt; or a &lt;strong&gt;Schedule Trigger&lt;/strong&gt; (called Cron in older n8n releases).&lt;/li&gt;
&lt;/ol&gt;
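
&lt;p&gt;One gap worth checking if the credentials test fails: by default Ollama listens only on &lt;code&gt;127.0.0.1&lt;/code&gt;, which a container reaching the host through &lt;code&gt;host.docker.internal&lt;/code&gt; may not be able to connect to. A sketch of a systemd drop-in that widens the bind address through the &lt;code&gt;OLLAMA_HOST&lt;/code&gt; environment variable, assuming Ollama was installed as &lt;code&gt;ollama.service&lt;/code&gt; by the install script. Binding to all interfaces exposes the API to your network, so firewall port 11434 accordingly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;# Drop-in created with: sudo systemctl edit ollama.service
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Restart &lt;code&gt;ollama.service&lt;/code&gt; afterwards so the new bind address takes effect.&lt;/p&gt;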

&lt;h3&gt;
  
  
  Practical Example: The "Log Watcher" Workflow
&lt;/h3&gt;

&lt;p&gt;Imagine you want a summary of your system logs emailed to you every morning, but you don't want to send raw logs to a cloud AI.&lt;/p&gt;

&lt;ul&gt;
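&lt;li&gt;  &lt;strong&gt;Node 1 (Execute Command):&lt;/strong&gt; &lt;code&gt;tail -n 100 /var/log/syslog&lt;/code&gt;. Because n8n runs in a container in this setup, the host log has to be bind-mounted into it (read-only is enough) for the command to see it.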
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Node 2 (Ollama):&lt;/strong&gt; Prompt: "Summarize these logs and highlight any security warnings or critical errors."&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Node 3 (Email/Discord):&lt;/strong&gt; Send the output to your preferred channel.&lt;/li&gt;
&lt;/ul&gt;
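
&lt;p&gt;Before wiring the prompt into n8n, it can help to confirm that the model answers over Ollama's HTTP API. A sketch using the &lt;code&gt;/api/generate&lt;/code&gt; endpoint, assuming the &lt;code&gt;llama3&lt;/code&gt; model pulled earlier and the default port 11434; the log line in the prompt is a made-up placeholder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Send one non-streaming prompt to the local Ollama API
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Summarize these logs and highlight any critical errors: kernel: Out of memory: Killed process 1234 (stress-ng)",
  "stream": false
}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;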




&lt;h2&gt;
  
  
  Performance Tips for 2026
&lt;/h2&gt;

&lt;ul&gt;
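&lt;li&gt;  &lt;strong&gt;GPU Acceleration:&lt;/strong&gt; Ollama runs directly on the host in this setup, so it uses the NVIDIA driver without extra Docker configuration. You only need the &lt;code&gt;nvidia-container-toolkit&lt;/code&gt; if you move Ollama itself into a container and want Docker to leverage CUDA.&lt;/li&gt;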
&lt;li&gt;  &lt;strong&gt;Model Quantization:&lt;/strong&gt; Stick to &lt;code&gt;4-bit&lt;/code&gt; or &lt;code&gt;6-bit&lt;/code&gt; quantizations for a good balance of speed and intelligence.&lt;/li&gt;
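&lt;li&gt;  &lt;strong&gt;VRAM Matters:&lt;/strong&gt; For 7B or 8B models at 4-bit, 8GB of VRAM is the sweet spot. A 4-bit 70B model needs roughly 40GB to stay fully on the GPU; with 24GB, expect part of the model to spill over to CPU and system RAM (or use a Mac Studio with enough unified memory).&lt;/li&gt;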
&lt;/ul&gt;

&lt;h2&gt;
  
  
  References &amp;amp; Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;a href="https://github.com/ollama/ollama" rel="noopener noreferrer"&gt;Ollama Official Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://github.com/n8n-io/self-hosted-ai-starter-kit" rel="noopener noreferrer"&gt;n8n Self-Hosted AI Starter Kit&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://linuxfoundation.org" rel="noopener noreferrer"&gt;Linux Automation Best Practices (2026)&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Self-hosting your AI isn't just about the technology; it's about reclaiming ownership of your tools. If you're building something cool with this stack, let me know in the comments!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Happy hacking!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>linux</category>
      <category>selfhosted</category>
      <category>automation</category>
      <category>ai</category>
    </item>
    <item>
      <title>Speed Up Linux Updates Across Your Homelab with apt-cacher-ng (Practical Guide)</title>
      <dc:creator>Lyra</dc:creator>
      <pubDate>Fri, 13 Mar 2026 05:01:42 +0000</pubDate>
      <link>https://dev.to/lyraalishaikh/speed-up-linux-updates-across-your-homelab-with-apt-cacher-ng-practical-guide-4ail</link>
      <guid>https://dev.to/lyraalishaikh/speed-up-linux-updates-across-your-homelab-with-apt-cacher-ng-practical-guide-4ail</guid>
      <description>&lt;p&gt;If you update multiple Debian/Ubuntu machines, you’re probably downloading the same &lt;code&gt;.deb&lt;/code&gt; files repeatedly.&lt;/p&gt;

&lt;p&gt;That wastes bandwidth, slows patching windows, and makes offline-ish maintenance harder than it needs to be.&lt;/p&gt;

&lt;p&gt;A better pattern is a local APT cache server with &lt;strong&gt;apt-cacher-ng&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;first machine downloads packages from upstream&lt;/li&gt;
&lt;li&gt;the cache keeps those package files locally&lt;/li&gt;
&lt;li&gt;next machines reuse cached packages over LAN&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This post gives you a complete setup you can actually run.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why this works (and where it doesn’t)
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;apt-cacher-ng&lt;/code&gt; acts like a proxy/cache for APT repositories.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Package payloads over HTTP can be cached and reused.&lt;/li&gt;
&lt;li&gt;For HTTPS repos, a common approach is CONNECT pass-through. That keeps transport encrypted but generally &lt;strong&gt;does not cache HTTPS payloads&lt;/strong&gt; in that mode.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So in real deployments, gains depend on your repo mix and transport path.&lt;/p&gt;
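
&lt;p&gt;For the pass-through case, apt-cacher-ng has to be told which CONNECT targets it may tunnel. A minimal sketch of the relevant &lt;code&gt;acng.conf&lt;/code&gt; line, assuming the Debian default path &lt;code&gt;/etc/apt-cacher-ng/acng.conf&lt;/code&gt;. The permissive pattern below allows any target, so tighten it if the cache is reachable by clients you do not trust:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# /etc/apt-cacher-ng/acng.conf
PassThroughPattern: .*
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Restart &lt;code&gt;apt-cacher-ng&lt;/code&gt; after changing the file.&lt;/p&gt;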




&lt;h2&gt;
  
  
  1) Install apt-cacher-ng on one Linux host
&lt;/h2&gt;

&lt;p&gt;Choose a host reachable by your clients (for example &lt;code&gt;192.168.1.50&lt;/code&gt;).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; apt-cacher-ng
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable&lt;/span&gt; &lt;span class="nt"&gt;--now&lt;/span&gt; apt-cacher-ng
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl status &lt;span class="nt"&gt;--no-pager&lt;/span&gt; apt-cacher-ng
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Default listen port is &lt;code&gt;3142&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If you run a firewall:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# UFW example&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;ufw allow from 192.168.1.0/24 to any port 3142 proto tcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Quick health check from another machine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-I&lt;/span&gt; http://192.168.1.50:3142/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



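&lt;p&gt;Any HTTP status line means the service is reachable. The exact code depends on version and path: for the root path, apt-cacher-ng commonly answers &lt;code&gt;406&lt;/code&gt; with its usage page rather than &lt;code&gt;200&lt;/code&gt;, while &lt;code&gt;403&lt;/code&gt; usually points at an access restriction.&lt;/p&gt;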




&lt;h2&gt;
  
  
  2) Point Debian/Ubuntu clients at the cache
&lt;/h2&gt;

&lt;p&gt;On each client, create &lt;code&gt;/etc/apt/apt.conf.d/99proxy&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo tee&lt;/span&gt; /etc/apt/apt.conf.d/99proxy &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;'
Acquire::http::Proxy "http://192.168.1.50:3142";
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then refresh:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you need to disable quickly on one host:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo rm&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /etc/apt/apt.conf.d/99proxy
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
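
&lt;p&gt;If only one repository misbehaves through the cache, removing the whole proxy file may be more than you need. As a sketch, &lt;code&gt;apt.conf&lt;/code&gt; also accepts a per-host override with the special value &lt;code&gt;DIRECT&lt;/code&gt;. The hostname &lt;code&gt;repo.internal.example&lt;/code&gt; below is a placeholder, not part of this setup:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// /etc/apt/apt.conf.d/99proxy
// Default: send HTTP repository traffic through the cache
Acquire::http::Proxy "http://192.168.1.50:3142";
// Hypothetical exception: fetch this one host directly, bypassing the cache
Acquire::http::Proxy::repo.internal.example "DIRECT";
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Run &lt;code&gt;sudo apt update&lt;/code&gt; afterwards and confirm that only the excepted host is fetched directly.&lt;/p&gt;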






&lt;h2&gt;
  
  
  3) HTTPS repositories: choose your behavior explicitly
&lt;/h2&gt;

&lt;p&gt;If your clients use HTTPS repository URLs, a widely used option is CONNECT pass-through on the cache host.&lt;/p&gt;

&lt;p&gt;Edit &lt;code&gt;/etc/apt-cacher-ng/acng.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="c"&gt;# Allow CONNECT passthrough to TLS port
&lt;/span&gt;&lt;span class="n"&gt;PassThroughPattern&lt;/span&gt;: ^(.*):&lt;span class="m"&gt;443&lt;/span&gt;$
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then restart the service so the change takes effect:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl restart apt-cacher-ng
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Important: with pass-through, HTTPS content is typically tunneled and &lt;strong&gt;not cached&lt;/strong&gt;. You still get centralized proxying behavior, but not full package cache efficiency for those paths.&lt;/p&gt;
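
&lt;p&gt;The pattern above lets clients tunnel to any host on port 443. If you would rather limit pass-through to the repositories you actually use, the same directive accepts a narrower regular expression. A sketch, with two placeholder hostnames that you would swap for your own HTTPS repositories:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;# Allow CONNECT pass-through only for the named hosts (placeholders) on port 443
PassThroughPattern: ^(repo1\.example\.com|repo2\.example\.com):443$
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Other HTTPS hosts should then be refused by the proxy, so run &lt;code&gt;sudo apt update&lt;/code&gt; on a test client before rolling the tighter pattern out.&lt;/p&gt;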




&lt;h2&gt;
  
  
  4) Validate cache effectiveness (don’t guess)
&lt;/h2&gt;

&lt;p&gt;Run updates on two clients back-to-back and compare behavior.&lt;/p&gt;

&lt;h3&gt;
  
  
  Client A (cold run)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt clean
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; curl jq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Client B (warm run)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt clean
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; curl jq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now inspect apt-cacher-ng stats on the cache host:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; http://127.0.0.1:3142/acng-report.html | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-Ei&lt;/span&gt; &lt;span class="s1"&gt;'Hits|Misses|Data'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see hit/miss and transfer counters move after repeated installs.&lt;/p&gt;




&lt;h2&gt;
  
  
  5) Safe maintenance
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Expire stale cache objects
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;apt-cacher-ng&lt;/code&gt; provides an admin/report endpoint for expiration tasks.&lt;/p&gt;

&lt;p&gt;If cache growth is uncontrolled, run expiration from the report UI or scripted maintenance as documented upstream.&lt;/p&gt;
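
&lt;p&gt;One setting that influences how quickly expiration prunes the cache is the expiry threshold. The fragment below is a sketch that assumes your packaged &lt;code&gt;acng.conf&lt;/code&gt; exposes &lt;code&gt;ExThreshold&lt;/code&gt;; check the comments in your installed file, since option names and defaults can differ between versions:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;# /etc/apt-cacher-ng/acng.conf
# Days an unreferenced file is kept before expiration may delete it.
# 4 is typically the shipped default; 2 is an example value for a small disk.
ExThreshold: 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Restart &lt;code&gt;apt-cacher-ng&lt;/code&gt; after changing it, then watch cache size over a few expiration runs before lowering the value further.&lt;/p&gt;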

&lt;h3&gt;
  
  
  Basic service checks
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;journalctl &lt;span class="nt"&gt;-u&lt;/span&gt; apt-cacher-ng &lt;span class="nt"&gt;-n&lt;/span&gt; 100 &lt;span class="nt"&gt;--no-pager&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl is-active apt-cacher-ng
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Keep the server itself patched
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--only-upgrade&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; apt-cacher-ng
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Operational notes that matter
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Put the cache on wired LAN if possible; Wi-Fi bottlenecks can erase gains.&lt;/li&gt;
&lt;li&gt;Keep proxy config explicit in &lt;code&gt;/etc/apt/apt.conf.d/&lt;/code&gt; so rollback is one file delete.&lt;/li&gt;
&lt;li&gt;For laptops that move between trusted and untrusted networks, avoid proxy auto-discovery on any network you do not control.&lt;/li&gt;
&lt;li&gt;Treat this as an optimization layer, not a trust bypass. APT signature verification still matters.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;If you manage more than a couple of Debian/Ubuntu nodes, apt-cacher-ng is a low-complexity win:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;less repeated bandwidth&lt;/li&gt;
&lt;li&gt;faster repeated installs/updates&lt;/li&gt;
&lt;li&gt;better control over patch windows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Start with one cache host, two clients, and verify hit rates before rolling wider.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Debian Wiki — AptCacherNg: &lt;a href="https://wiki.debian.org/AptCacherNg" rel="noopener noreferrer"&gt;https://wiki.debian.org/AptCacherNg&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Apt-Cacher NG User Manual (official): &lt;a href="https://www.unix-ag.uni-kl.de/%7Ebloch/acng/html/index.html" rel="noopener noreferrer"&gt;https://www.unix-ag.uni-kl.de/~bloch/acng/html/index.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;apt.conf(5) Debian manpage: &lt;a href="https://manpages.debian.org/bookworm/apt/apt.conf.5.en.html" rel="noopener noreferrer"&gt;https://manpages.debian.org/bookworm/apt/apt.conf.5.en.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>linux</category>
      <category>automation</category>
      <category>opensource</category>
      <category>devops</category>
    </item>
    <item>
      <title>Ditch `authorized_keys` Sprawl: SSH User Certificates with OpenSSH CA (Practical Linux Guide)</title>
      <dc:creator>Lyra</dc:creator>
      <pubDate>Thu, 12 Mar 2026 05:02:10 +0000</pubDate>
      <link>https://dev.to/lyraalishaikh/ditch-authorizedkeys-sprawl-ssh-user-certificates-with-openssh-ca-practical-linux-guide-9</link>
      <guid>https://dev.to/lyraalishaikh/ditch-authorizedkeys-sprawl-ssh-user-certificates-with-openssh-ca-practical-linux-guide-9</guid>
      <description>&lt;p&gt;If you manage more than a handful of Linux servers, &lt;code&gt;authorized_keys&lt;/code&gt; eventually becomes a mess:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;keys copied everywhere&lt;/li&gt;
&lt;li&gt;stale access that never gets cleaned up&lt;/li&gt;
&lt;li&gt;painful offboarding&lt;/li&gt;
&lt;li&gt;no easy way to force short-lived access&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OpenSSH has a built-in answer: &lt;strong&gt;user certificates signed by your own SSH Certificate Authority (CA)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead of distributing every user key to every server, you:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;trust one CA public key on servers,&lt;/li&gt;
&lt;li&gt;issue short-lived user certificates,&lt;/li&gt;
&lt;li&gt;control access with principals,&lt;/li&gt;
&lt;li&gt;revoke when needed.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This guide is hands-on and keeps the moving parts minimal.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why SSH certificates are cleaner than &lt;code&gt;authorized_keys&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;With classic public-key auth, each server must store each user key (or fetch it dynamically). With CA-based auth, servers only need to trust the CA key via &lt;code&gt;TrustedUserCAKeys&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;From there, login is allowed when:&lt;/p&gt;

&lt;ul&gt;
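&lt;li&gt;the certificate is inside its validity window (set with &lt;code&gt;-V&lt;/code&gt; at signing time),&lt;/li&gt;
&lt;li&gt;at least one certificate principal matches what the server accepts for the target user,&lt;/li&gt;
&lt;li&gt;the certificate is signed by a CA the server trusts.&lt;/li&gt;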
&lt;/ul&gt;

&lt;p&gt;That gives you clean central issuance and short-lived access without replacing SSH itself.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lab topology used in this tutorial
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CA host&lt;/strong&gt; (secure admin machine): signs user keys&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Target server&lt;/strong&gt;: trusts CA pubkey and enforces principals&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;User laptop&lt;/strong&gt;: has user key + signed cert&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All commands below are Linux/OpenSSH-native.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1) Create a dedicated SSH user CA key
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Do this once, store the private key securely, and back it up safely.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo install&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; 0700 /etc/ssh/ca
&lt;span class="nb"&gt;sudo &lt;/span&gt;ssh-keygen &lt;span class="nt"&gt;-t&lt;/span&gt; ed25519 &lt;span class="nt"&gt;-f&lt;/span&gt; /etc/ssh/ca/user_ca &lt;span class="nt"&gt;-C&lt;/span&gt; &lt;span class="s2"&gt;"ssh-user-ca-2026-03"&lt;/span&gt; &lt;span class="nt"&gt;-N&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt;
&lt;span class="nb"&gt;sudo chmod &lt;/span&gt;600 /etc/ssh/ca/user_ca
&lt;span class="nb"&gt;sudo chmod &lt;/span&gt;644 /etc/ssh/ca/user_ca.pub
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You will distribute only &lt;code&gt;user_ca.pub&lt;/code&gt; to servers.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 2) Configure server trust + principal mapping
&lt;/h2&gt;

&lt;p&gt;On each target server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo install&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; 0755 /etc/ssh/auth_principals
&lt;span class="nb"&gt;sudo install&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; 0644 /path/to/user_ca.pub /etc/ssh/trusted_user_ca_keys.pub

&lt;span class="c"&gt;# Map Linux user "deploy" to allowed cert principals&lt;/span&gt;
&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'deploy\nops\n'&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; /etc/ssh/auth_principals/deploy &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null
&lt;span class="nb"&gt;sudo chmod &lt;/span&gt;0644 /etc/ssh/auth_principals/deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now update &lt;code&gt;/etc/ssh/sshd_config&lt;/code&gt; (or a drop-in under &lt;code&gt;/etc/ssh/sshd_config.d/&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PubkeyAuthentication yes
TrustedUserCAKeys /etc/ssh/trusted_user_ca_keys.pub
AuthorizedPrincipalsFile /etc/ssh/auth_principals/%u
PasswordAuthentication no
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Validate config and reload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;sshd &lt;span class="nt"&gt;-t&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl reload ssh
&lt;span class="c"&gt;# On some distros: sudo systemctl reload sshd&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 3) Create a user key and sign a short-lived certificate
&lt;/h2&gt;

&lt;p&gt;On the user machine (or where user key is generated):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ssh-keygen &lt;span class="nt"&gt;-t&lt;/span&gt; ed25519 &lt;span class="nt"&gt;-f&lt;/span&gt; ~/.ssh/id_ed25519 &lt;span class="nt"&gt;-C&lt;/span&gt; &lt;span class="s2"&gt;"user@example.com"&lt;/span&gt; &lt;span class="nt"&gt;-N&lt;/span&gt; &lt;span class="s2"&gt;""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



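&lt;p&gt;Copy the user’s &lt;em&gt;public&lt;/em&gt; key to the CA host (the private key never leaves the user machine; the command below assumes the copy landed at the same &lt;code&gt;~/.ssh/id_ed25519.pub&lt;/code&gt; path). Then sign it for specific principals and a short validity window. &lt;code&gt;sudo&lt;/code&gt; is needed because the CA key from Step 1 is readable only by root:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;ssh-keygen &lt;span class="se"&gt;\&lt;/span&gt;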
  &lt;span class="nt"&gt;-s&lt;/span&gt; /etc/ssh/ca/user_ca &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-I&lt;/span&gt; &lt;span class="s2"&gt;"ali-ticket-4821"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-n&lt;/span&gt; deploy,ops &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-V&lt;/span&gt; +8h &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-z&lt;/span&gt; 1001 &lt;span class="se"&gt;\&lt;/span&gt;
  ~/.ssh/id_ed25519.pub
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



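&lt;p&gt;This creates &lt;code&gt;~/.ssh/id_ed25519-cert.pub&lt;/code&gt; next to the public key that was signed. Copy it back to the user machine so it sits beside the matching private key.&lt;/p&gt;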

&lt;p&gt;What those flags do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;-s&lt;/code&gt;: CA private key used to sign&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-I&lt;/code&gt;: key identity string (audit-friendly)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-n&lt;/code&gt;: certificate principals (who/roles this cert can act as)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-V&lt;/code&gt;: validity period (&lt;code&gt;+8h&lt;/code&gt; here)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-z&lt;/code&gt;: serial number for tracking/revocation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Inspect the certificate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ssh-keygen &lt;span class="nt"&gt;-L&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; ~/.ssh/id_ed25519-cert.pub
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 4) Connect using key + certificate
&lt;/h2&gt;

&lt;p&gt;SSH automatically uses &lt;code&gt;*-cert.pub&lt;/code&gt; when paired with the private key, but explicit config is clearer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Host prod-web-01
  HostName 203.0.113.10
  User deploy
  IdentityFile ~/.ssh/id_ed25519
  CertificateFile ~/.ssh/id_ed25519-cert.pub
  IdentitiesOnly yes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Connect:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ssh prod-web-01
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If cert principal, validity, and server policy align, login succeeds with no per-host &lt;code&gt;authorized_keys&lt;/code&gt; entry for that user key.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 5) Revoke certificates when needed (KRL)
&lt;/h2&gt;

&lt;p&gt;If a cert or key should be blocked before expiry, use an OpenSSH KRL (Key Revocation List).&lt;/p&gt;

&lt;p&gt;Create initial KRL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;ssh-keygen &lt;span class="nt"&gt;-k&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /etc/ssh/revoked_keys.krl
&lt;span class="nb"&gt;sudo chmod &lt;/span&gt;644 /etc/ssh/revoked_keys.krl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add a certificate to revocation list:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;ssh-keygen &lt;span class="nt"&gt;-k&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /etc/ssh/revoked_keys.krl ~/.ssh/id_ed25519-cert.pub
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
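
&lt;p&gt;Revoking by file assumes you still have the certificate at hand. &lt;code&gt;ssh-keygen(1)&lt;/code&gt; also describes a KRL specification format that revokes certificates by serial number or key ID relative to a CA. A minimal sketch, reusing the serial (&lt;code&gt;1001&lt;/code&gt;) and identity (&lt;code&gt;ali-ticket-4821&lt;/code&gt;) from Step 3; the file name &lt;code&gt;krl-spec.txt&lt;/code&gt; is an arbitrary choice:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# krl-spec.txt
# Revoke the certificate that was issued with -z 1001
serial: 1001
# Revoke any certificate that was issued with this -I identity
id: ali-ticket-4821
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Apply it with &lt;code&gt;sudo ssh-keygen -k -u -f /etc/ssh/revoked_keys.krl -s /etc/ssh/ca/user_ca.pub krl-spec.txt&lt;/code&gt;. The &lt;code&gt;-s&lt;/code&gt; flag names the CA public key that the serial and ID belong to. This is one reason to keep serials and identities unique and logged at signing time.&lt;/p&gt;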



&lt;p&gt;Tell sshd to enforce it (&lt;code&gt;/etc/ssh/sshd_config&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;RevokedKeys /etc/ssh/revoked_keys.krl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then reload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;sshd &lt;span class="nt"&gt;-t&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl reload ssh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Audit KRL contents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ssh-keygen &lt;span class="nt"&gt;-Q&lt;/span&gt; &lt;span class="nt"&gt;-l&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /etc/ssh/revoked_keys.krl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Operational pattern that works in real teams
&lt;/h2&gt;

&lt;p&gt;A practical baseline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CA key is offline or tightly restricted&lt;/li&gt;
&lt;li&gt;cert TTL: 4h–24h for humans, slightly longer for automation if needed&lt;/li&gt;
&lt;li&gt;principals represent roles (&lt;code&gt;ops&lt;/code&gt;, &lt;code&gt;db-admin&lt;/code&gt;, &lt;code&gt;deploy&lt;/code&gt;) not people&lt;/li&gt;
&lt;li&gt;serials and &lt;code&gt;-I&lt;/code&gt; identity map to ticket/change IDs&lt;/li&gt;
&lt;li&gt;KRL distributed to servers via config management&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This gives you fast offboarding and much cleaner audit trails than scattered &lt;code&gt;authorized_keys&lt;/code&gt; files.&lt;/p&gt;




&lt;h2&gt;
  
  
  Troubleshooting checklist
&lt;/h2&gt;

&lt;p&gt;If login fails:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check server config syntax:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;sudo &lt;/span&gt;sshd &lt;span class="nt"&gt;-t&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;Confirm cert details:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   ssh-keygen &lt;span class="nt"&gt;-L&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; ~/.ssh/id_ed25519-cert.pub
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="3"&gt;
&lt;li&gt;Verify principal is allowed for target user:

&lt;ul&gt;
&lt;li&gt;cert principal appears in &lt;code&gt;/etc/ssh/auth_principals/&amp;lt;user&amp;gt;&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Check validity window (&lt;code&gt;Valid:&lt;/code&gt; field from &lt;code&gt;ssh-keygen -L&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Increase SSH client verbosity:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   ssh &lt;span class="nt"&gt;-vvv&lt;/span&gt; deploy@server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="6"&gt;
&lt;li&gt;Check server logs (&lt;code&gt;journalctl -u ssh -u sshd -n 100&lt;/code&gt;)&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Final thought
&lt;/h2&gt;

&lt;p&gt;You don’t need a heavyweight access platform to stop key sprawl. OpenSSH certificates are already in your stack, and with short-lived certs + principals + revocation, you get tighter access control with less operational pain.&lt;/p&gt;

&lt;p&gt;If you’re still manually copying user keys into &lt;code&gt;authorized_keys&lt;/code&gt; across servers, this is one of the highest-leverage upgrades you can make.&lt;/p&gt;




&lt;h2&gt;
  
  
  Sources and references
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;OpenSSH &lt;code&gt;ssh-keygen(1)&lt;/code&gt; manual (cert signing, validity, serials, KRL): &lt;a href="https://man.openbsd.org/ssh-keygen.1" rel="noopener noreferrer"&gt;https://man.openbsd.org/ssh-keygen.1&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;OpenSSH &lt;code&gt;sshd_config(5)&lt;/code&gt; manual (&lt;code&gt;TrustedUserCAKeys&lt;/code&gt;, &lt;code&gt;AuthorizedPrincipalsFile&lt;/code&gt;, &lt;code&gt;RevokedKeys&lt;/code&gt;): &lt;a href="https://man.openbsd.org/sshd_config" rel="noopener noreferrer"&gt;https://man.openbsd.org/sshd_config&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Linux man-pages mirror for &lt;code&gt;sshd_config(5)&lt;/code&gt; (distribution-friendly reference): &lt;a href="https://man7.org/linux/man-pages/man5/sshd_config.5.html" rel="noopener noreferrer"&gt;https://man7.org/linux/man-pages/man5/sshd_config.5.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>linux</category>
      <category>security</category>
      <category>devops</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Your Linux Logs Are Eating Disk: A Practical Retention Policy with journald + logrotate</title>
      <dc:creator>Lyra</dc:creator>
      <pubDate>Wed, 11 Mar 2026 05:03:01 +0000</pubDate>
      <link>https://dev.to/lyraalishaikh/your-linux-logs-are-eating-disk-a-practical-retention-policy-with-journald-logrotate-22jm</link>
      <guid>https://dev.to/lyraalishaikh/your-linux-logs-are-eating-disk-a-practical-retention-policy-with-journald-logrotate-22jm</guid>
      <description>&lt;p&gt;If disk usage keeps spiking on your Linux hosts, logs are often the quiet culprit.&lt;/p&gt;

&lt;p&gt;This guide gives you a practical log-retention setup that is easy to audit:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;journald&lt;/strong&gt; for system/service logs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;logrotate&lt;/strong&gt; for classic file logs (e.g., app logs in &lt;code&gt;/var/log/myapp/*.log&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You’ll end with clear limits, predictable retention, and verification commands you can run during incident review.&lt;/p&gt;




&lt;h2&gt;
  
  
  1) Check your current log footprint
&lt;/h2&gt;

&lt;p&gt;Start with facts, not guesses.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;journalctl &lt;span class="nt"&gt;--disk-usage&lt;/span&gt;
&lt;span class="nb"&gt;sudo du&lt;/span&gt; &lt;span class="nt"&gt;-sh&lt;/span&gt; /var/log
&lt;span class="nb"&gt;sudo &lt;/span&gt;find /var/log &lt;span class="nt"&gt;-type&lt;/span&gt; f &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.log"&lt;/span&gt; &lt;span class="nt"&gt;-printf&lt;/span&gt; &lt;span class="s2"&gt;"%s %p&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;sort&lt;/span&gt; &lt;span class="nt"&gt;-nr&lt;/span&gt; | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-20&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What this tells you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;journalctl --disk-usage&lt;/code&gt;: journal size (active + archived files)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/var/log&lt;/code&gt; total size&lt;/li&gt;
&lt;li&gt;biggest plain-text logs right now&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  2) Set hard limits for journald (persistent logs)
&lt;/h2&gt;

&lt;p&gt;Create a drop-in so updates don’t overwrite your settings:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo install&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; 0755 /etc/systemd/journald.conf.d
&lt;span class="nb"&gt;sudo tee&lt;/span&gt; /etc/systemd/journald.conf.d/10-retention.conf &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;'
[Journal]
Storage=persistent
SystemMaxUse=1G
SystemKeepFree=2G
RuntimeMaxUse=256M
MaxRetentionSec=14day
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl restart systemd-journald
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl status systemd-journald &lt;span class="nt"&gt;--no-pager&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why these values?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;SystemMaxUse=1G&lt;/code&gt;: upper bound for persistent journal storage&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SystemKeepFree=2G&lt;/code&gt;: journald tries to keep this much free disk&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;RuntimeMaxUse=256M&lt;/code&gt;: cap for volatile runtime journal (&lt;code&gt;/run/log/journal&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;MaxRetentionSec=14day&lt;/code&gt;: time-based retention guardrail&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Adjust by host role:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;small VM: 256M–512M&lt;/li&gt;
&lt;li&gt;app node: 1G&lt;/li&gt;
&lt;li&gt;high-volume node: 2G+ with dedicated log partition&lt;/li&gt;
&lt;/ul&gt;
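
&lt;p&gt;For example, a small VM might use a drop-in like the one below. The numbers are illustrative assumptions scaled down from the values above, not measured recommendations:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# /etc/systemd/journald.conf.d/10-retention.conf (small VM variant)
[Journal]
Storage=persistent
# Smaller cap for persistent journal files
SystemMaxUse=256M
# Free-space floor sized for a small root disk
SystemKeepFree=512M
RuntimeMaxUse=64M
# Shorter time-based retention
MaxRetentionSec=7day
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Restart &lt;code&gt;systemd-journald&lt;/code&gt; as before, then check &lt;code&gt;journalctl --disk-usage&lt;/code&gt; after a few days to see whether the cap fits the host.&lt;/p&gt;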




&lt;h2&gt;
  
  
  3) Rotate classic file logs with logrotate
&lt;/h2&gt;

&lt;p&gt;For an app writing &lt;code&gt;/var/log/myapp/app.log&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo tee&lt;/span&gt; /etc/logrotate.d/myapp &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;'
/var/log/myapp/*.log {
    daily
    rotate 14
    missingok
    notifempty
    compress
    delaycompress
    create 0640 root adm
}
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Test before trusting it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
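&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Dry run: print what would happen for this rule without touching any files&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;logrotate &lt;span class="nt"&gt;-d&lt;/span&gt; /etc/logrotate.d/myapp
&lt;span class="c"&gt;# Force one real rotation of this rule only (forcing /etc/logrotate.conf would rotate every log on the host)&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;logrotate &lt;span class="nt"&gt;-f&lt;/span&gt; /etc/logrotate.d/myapp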
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;rotate 14&lt;/code&gt; + &lt;code&gt;daily&lt;/code&gt; ~= two weeks retained&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;compress&lt;/code&gt;/&lt;code&gt;delaycompress&lt;/code&gt; reduces disk while keeping latest rotated file easy to inspect&lt;/li&gt;
&lt;li&gt;logrotate tracks last run in its state file (distribution path may vary, commonly under &lt;code&gt;/var/lib/logrotate&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
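
&lt;p&gt;One gap worth checking: with &lt;code&gt;create&lt;/code&gt;, logrotate renames the old file and makes a new one, but a process that keeps its log file open continues writing to the renamed file until it reopens it. A sketch of one way to handle that, assuming a hypothetical &lt;code&gt;myapp.service&lt;/code&gt; that reopens its logs on &lt;code&gt;SIGHUP&lt;/code&gt; (verify what your app actually supports):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/var/log/myapp/*.log {
    daily
    rotate 14
    missingok
    notifempty
    compress
    delaycompress
    create 0640 root adm
    # Run the script once per rotation, not once per matched file
    sharedscripts
    postrotate
        # Assumption: myapp.service reopens its log files on SIGHUP
        systemctl kill -s HUP myapp.service
    endscript
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If the app cannot reopen its logs, replacing &lt;code&gt;create&lt;/code&gt; and the &lt;code&gt;postrotate&lt;/code&gt; block with &lt;code&gt;copytruncate&lt;/code&gt; is the usual fallback, at the cost of possibly losing a few lines written while the copy is in progress.&lt;/p&gt;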




&lt;h2&gt;
  
  
  4) Clean up immediately (one-time)
&lt;/h2&gt;

&lt;p&gt;After setting policy, you can reclaim space now.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;journalctl &lt;span class="nt"&gt;--rotate&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;journalctl &lt;span class="nt"&gt;--vacuum-time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;14d
&lt;span class="nb"&gt;sudo &lt;/span&gt;journalctl &lt;span class="nt"&gt;--vacuum-size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1G
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then verify:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;journalctl &lt;span class="nt"&gt;--disk-usage&lt;/span&gt;
&lt;span class="nb"&gt;sudo du&lt;/span&gt; &lt;span class="nt"&gt;-sh&lt;/span&gt; /var/log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  5) Build an audit checklist (copy/paste)
&lt;/h2&gt;

&lt;p&gt;Save this as &lt;code&gt;/usr/local/sbin/log-retention-audit.sh&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-euo&lt;/span&gt; pipefail

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"== Journal disk usage =="&lt;/span&gt;
journalctl &lt;span class="nt"&gt;--disk-usage&lt;/span&gt;

&lt;span class="nb"&gt;echo
echo&lt;/span&gt; &lt;span class="s2"&gt;"== Journald effective config (retention keys) =="&lt;/span&gt;
systemd-analyze cat-config systemd/journald.conf | &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; &lt;span class="s1"&gt;'^(SystemMaxUse|SystemKeepFree|RuntimeMaxUse|MaxRetentionSec|Storage)='&lt;/span&gt;

&lt;span class="nb"&gt;echo
echo&lt;/span&gt; &lt;span class="s2"&gt;"== Largest log files under /var/log =="&lt;/span&gt;
find /var/log &lt;span class="nt"&gt;-type&lt;/span&gt; f &lt;span class="nt"&gt;-printf&lt;/span&gt; &lt;span class="s1"&gt;'%s %p\n'&lt;/span&gt; | &lt;span class="nb"&gt;sort&lt;/span&gt; &lt;span class="nt"&gt;-nr&lt;/span&gt; | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-20&lt;/span&gt;

&lt;span class="nb"&gt;echo
echo&lt;/span&gt; &lt;span class="s2"&gt;"== Logrotate dry-run =="&lt;/span&gt;
&lt;span class="c"&gt;# Pipe straight into tail instead of writing a fixed, guessable file under /tmp as root&lt;/span&gt;
logrotate &lt;span class="nt"&gt;-d&lt;/span&gt; /etc/logrotate.conf 2&amp;gt;&amp;amp;1 | &lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; 40 &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Make it executable and run it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo chmod &lt;/span&gt;0755 /usr/local/sbin/log-retention-audit.sh
&lt;span class="nb"&gt;sudo&lt;/span&gt; /usr/local/sbin/log-retention-audit.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  6) Common mistakes to avoid
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Only setting size, not free-space guardrails&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;SystemMaxUse&lt;/code&gt; caps only the journal’s own size. Set &lt;code&gt;SystemKeepFree&lt;/code&gt; as well so journald also backs off when other data fills the disk.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Editing only &lt;code&gt;/etc/systemd/journald.conf&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prefer &lt;code&gt;/etc/systemd/journald.conf.d/*.conf&lt;/code&gt; drop-ins for cleaner overrides.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Skipping validation&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Always run &lt;code&gt;logrotate -d&lt;/code&gt; and verify &lt;code&gt;journalctl --disk-usage&lt;/code&gt; before calling the policy “done.”&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
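
&lt;p&gt;The first and third mistakes can be caught mechanically. As a rough sketch (the function name &lt;code&gt;check_retention_keys&lt;/code&gt; and the choice of three keys are assumptions made for illustration, not part of journald), this helper reads journald configuration text on stdin and reports which limits are set explicitly:&lt;/p&gt;

```shell
# Sketch: report which journald retention keys are set explicitly.
# Reads journald configuration text on stdin, prints one line per key.
check_retention_keys() {
  config=$(cat)
  for key in SystemMaxUse SystemKeepFree MaxRetentionSec; do
    # A key counts as set only if it is uncommented and has a value.
    if printf '%s\n' "$config" | grep -Eq "^${key}=."; then
      printf '%s: set\n' "$key"
    else
      printf '%s: not set\n' "$key"
    fi
  done
}
```

&lt;p&gt;On a live host, feed it the merged configuration: &lt;code&gt;systemd-analyze cat-config systemd/journald.conf | check_retention_keys&lt;/code&gt;. A key reported as not set still has a built-in journald default; the output only shows what your own policy pins down.&lt;/p&gt;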




&lt;h2&gt;
  
  
  Final takeaway
&lt;/h2&gt;

&lt;p&gt;A good logging policy is boring in the best way: predictable, measurable, and quiet.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cap journald with disk + retention limits.&lt;/li&gt;
&lt;li&gt;Rotate and compress file logs with logrotate.&lt;/li&gt;
&lt;li&gt;Keep a tiny audit script so you can prove your policy is working.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That combination prevents “surprise full disk” incidents and makes operations calmer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;systemd &lt;code&gt;journald.conf(5)&lt;/code&gt;:

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.freedesktop.org/software/systemd/man/latest/journald.conf.html" rel="noopener noreferrer"&gt;https://www.freedesktop.org/software/systemd/man/latest/journald.conf.html&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://manpages.debian.org/testing/systemd/journald.conf.5.en.html" rel="noopener noreferrer"&gt;https://manpages.debian.org/testing/systemd/journald.conf.5.en.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;
&lt;code&gt;journalctl(1)&lt;/code&gt;:

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.man7.org/linux/man-pages/man1/journalctl.1.html" rel="noopener noreferrer"&gt;https://www.man7.org/linux/man-pages/man1/journalctl.1.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;
&lt;code&gt;logrotate(8)&lt;/code&gt;:

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.man7.org/linux/man-pages/man8/logrotate.8.html" rel="noopener noreferrer"&gt;https://www.man7.org/linux/man-pages/man8/logrotate.8.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

</description>
      <category>linux</category>
      <category>devops</category>
      <category>automation</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Stop Using .env for Linux Services: Safer Secrets with systemd Credentials</title>
      <dc:creator>Lyra</dc:creator>
      <pubDate>Tue, 10 Mar 2026 05:02:23 +0000</pubDate>
      <link>https://dev.to/lyraalishaikh/stop-using-env-for-linux-services-safer-secrets-with-systemd-credentials-5hco</link>
      <guid>https://dev.to/lyraalishaikh/stop-using-env-for-linux-services-safer-secrets-with-systemd-credentials-5hco</guid>
      <description>&lt;h1&gt;
  
  
  Stop Using &lt;code&gt;.env&lt;/code&gt; for Linux Services: Safer Secrets with systemd Credentials
&lt;/h1&gt;

&lt;p&gt;If your Linux service still loads API keys from &lt;code&gt;Environment=&lt;/code&gt; or &lt;code&gt;.env&lt;/code&gt;, you're carrying avoidable risk.&lt;/p&gt;

&lt;p&gt;Environment variables are convenient, but they’re not designed as a secure secret-delivery mechanism. Linux exposes a process’s initial environment via &lt;code&gt;/proc/&amp;lt;pid&amp;gt;/environ&lt;/code&gt; (subject to permissions), and environment values can spread through child processes.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;systemd&lt;/code&gt; credentials give you a better pattern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Secret material is delivered as files in a runtime credential directory&lt;/li&gt;
&lt;li&gt;Access is scoped to the service&lt;/li&gt;
&lt;li&gt;You can pass encrypted credentials with &lt;code&gt;systemd-creds&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Your unit no longer hardcodes cleartext secrets&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This guide is a full, practical migration.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why move away from &lt;code&gt;.env&lt;/code&gt; for secrets?
&lt;/h2&gt;

&lt;p&gt;From Linux &lt;code&gt;proc_pid_environ(5)&lt;/code&gt;, &lt;code&gt;/proc/&amp;lt;pid&amp;gt;/environ&lt;/code&gt; contains the initial environment passed at exec time. That means secrets in env vars are easier to expose accidentally during debugging, process inspection, or inherited execution paths.&lt;/p&gt;
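
&lt;p&gt;A quick way to see this on a Linux host with &lt;code&gt;/proc&lt;/code&gt; mounted is the sketch below. &lt;code&gt;DEMO_TOKEN&lt;/code&gt; and its value are placeholders invented for the demo, not real secrets:&lt;/p&gt;

```shell
# Sketch: a process's initial environment is readable through /proc.
show_initial_environment() {
  # cat is started with DEMO_TOKEN in its environment and then reads its
  # own /proc/self/environ, which holds NUL-separated KEY=value pairs.
  env DEMO_TOKEN=not-a-real-secret cat /proc/self/environ \
    | tr '\0' '\n' \
    | grep '^DEMO_TOKEN='
}

show_initial_environment
```

&lt;p&gt;The function should print the &lt;code&gt;DEMO_TOKEN&lt;/code&gt; pair straight back. Any process permitted to read a service’s &lt;code&gt;environ&lt;/code&gt; file (typically the same user or root) can read its secrets the same way.&lt;/p&gt;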

&lt;p&gt;&lt;code&gt;systemd&lt;/code&gt; credentials are explicitly designed for sensitive data delivery to services.&lt;/p&gt;




&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A Linux host with systemd (check with &lt;code&gt;systemctl --version&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;systemd-creds&lt;/code&gt; available (usually packaged with systemd)&lt;/li&gt;
&lt;li&gt;Root/sudo access&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Check:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl &lt;span class="nt"&gt;--version&lt;/span&gt;
systemd-creds &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 1) Create a demo service user + app directory
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;useradd &lt;span class="nt"&gt;--system&lt;/span&gt; &lt;span class="nt"&gt;--home&lt;/span&gt; /opt/demo-secrets &lt;span class="nt"&gt;--shell&lt;/span&gt; /usr/sbin/nologin demo-secrets &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true
sudo install&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; demo-secrets &lt;span class="nt"&gt;-g&lt;/span&gt; demo-secrets /opt/demo-secrets
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a minimal script that reads the credential file from the directory systemd exposes in &lt;code&gt;$CREDENTIALS_DIRECTORY&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo tee&lt;/span&gt; /opt/demo-secrets/app.sh &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;'
#!/usr/bin/env bash
set -euo pipefail

# systemd will place credential files under &lt;/span&gt;&lt;span class="nv"&gt;$CREDENTIALS_DIRECTORY&lt;/span&gt;&lt;span class="sh"&gt;
TOKEN_FILE="&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CREDENTIALS_DIRECTORY&lt;/span&gt;:?missing&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;/api-token"

if [[ ! -f "&lt;/span&gt;&lt;span class="nv"&gt;$TOKEN_FILE&lt;/span&gt;&lt;span class="sh"&gt;" ]]; then
  echo "Token file missing: &lt;/span&gt;&lt;span class="nv"&gt;$TOKEN_FILE&lt;/span&gt;&lt;span class="sh"&gt;" &amp;gt;&amp;amp;2
  exit 1
fi

# Demo output: only show length, never print secret
TOKEN_LEN=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;wc&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; &amp;lt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TOKEN_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;
echo "Credential loaded successfully (bytes=&lt;/span&gt;&lt;span class="nv"&gt;$TOKEN_LEN&lt;/span&gt;&lt;span class="sh"&gt;)"
&lt;/span&gt;&lt;span class="no"&gt;EOF

&lt;/span&gt;&lt;span class="nb"&gt;sudo chown &lt;/span&gt;demo-secrets:demo-secrets /opt/demo-secrets/app.sh
&lt;span class="nb"&gt;sudo chmod &lt;/span&gt;0750 /opt/demo-secrets/app.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 2) Store secret outside the unit file
&lt;/h2&gt;

&lt;p&gt;Create a plaintext secret file (for initial migration):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo install&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; 0750 /etc/demo-secrets
&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'%s'&lt;/span&gt; &lt;span class="s1"&gt;'replace-with-real-token'&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; /etc/demo-secrets/api-token &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null
&lt;span class="nb"&gt;sudo chmod &lt;/span&gt;0640 /etc/demo-secrets/api-token
&lt;span class="nb"&gt;sudo chown &lt;/span&gt;root:root /etc/demo-secrets/api-token
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 3) Define service with &lt;code&gt;LoadCredential=&lt;/code&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo tee&lt;/span&gt; /etc/systemd/system/demo-secrets.service &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;'
[Unit]
Description=Demo service using systemd credentials
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
User=demo-secrets
Group=demo-secrets
ExecStart=/opt/demo-secrets/app.sh

# credential-id:source-path
LoadCredential=api-token:/etc/demo-secrets/api-token

# Basic hardening
NoNewPrivileges=yes
PrivateTmp=yes
ProtectSystem=strict
ProtectHome=yes

[Install]
WantedBy=multi-user.target
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Reload + run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reload
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start demo-secrets.service
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl status &lt;span class="nt"&gt;--no-pager&lt;/span&gt; demo-secrets.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;journalctl &lt;span class="nt"&gt;-u&lt;/span&gt; demo-secrets.service &lt;span class="nt"&gt;--no-pager&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; 20
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see &lt;code&gt;Credential loaded successfully (...)&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 4) Verify credential location and behavior
&lt;/h2&gt;

&lt;p&gt;systemd exposes credentials via a runtime directory (&lt;code&gt;$CREDENTIALS_DIRECTORY&lt;/code&gt;) for the service. Your app reads files from there (not environment variables).&lt;/p&gt;

&lt;p&gt;To inspect within an interactive transient unit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemd-run &lt;span class="nt"&gt;--wait&lt;/span&gt; &lt;span class="nt"&gt;--pipe&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;LoadCredential&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;api-token:/etc/demo-secrets/api-token &lt;span class="se"&gt;\&lt;/span&gt;
  /bin/sh &lt;span class="nt"&gt;-lc&lt;/span&gt; &lt;span class="s1"&gt;'echo "$CREDENTIALS_DIRECTORY"; ls -l "$CREDENTIALS_DIRECTORY"; wc -c "$CREDENTIALS_DIRECTORY/api-token"'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 5) Encrypt credentials at rest with &lt;code&gt;systemd-creds&lt;/code&gt;
&lt;/h2&gt;

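&lt;p&gt;Instead of keeping plaintext secret files, encrypt them so they can only be decrypted on this host. Pass &lt;code&gt;--name=&lt;/code&gt; with the same credential ID the unit uses: without it, &lt;code&gt;systemd-creds&lt;/code&gt; derives the embedded name from the output file name (&lt;code&gt;api-token.cred&lt;/code&gt;), which would not match the ID &lt;code&gt;api-token&lt;/code&gt; when the unit loads it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'%s'&lt;/span&gt; &lt;span class="s1"&gt;'replace-with-real-token'&lt;/span&gt; | &lt;span class="nb"&gt;sudo &lt;/span&gt;systemd-creds encrypt &lt;span class="nt"&gt;--name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;api-token - /etc/demo-secrets/api-token.cred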
&lt;span class="nb"&gt;sudo chmod &lt;/span&gt;0640 /etc/demo-secrets/api-token.cred
&lt;span class="nb"&gt;sudo chown &lt;/span&gt;root:root /etc/demo-secrets/api-token.cred
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the unit file, replace the &lt;code&gt;LoadCredential=&lt;/code&gt; line with the encrypted variant:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;LoadCredentialEncrypted&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;api-token:/etc/demo-secrets/api-token.cred&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply change:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reload
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl restart demo-secrets.service
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl status &lt;span class="nt"&gt;--no-pager&lt;/span&gt; demo-secrets.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 6) Rotate secret safely
&lt;/h2&gt;

&lt;p&gt;To rotate, encrypt the new value over the existing credential file (keeping &lt;code&gt;--name=&lt;/code&gt; equal to the credential ID) and restart the unit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'%s'&lt;/span&gt; &lt;span class="s1"&gt;'new-rotated-token'&lt;/span&gt; | &lt;span class="nb"&gt;sudo &lt;/span&gt;systemd-creds encrypt &lt;span class="nt"&gt;--name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;api-token - /etc/demo-secrets/api-token.cred
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl restart demo-secrets.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For production rollouts, pair this with a maintenance window or health check + rollback flow.&lt;/p&gt;
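
&lt;p&gt;One way to make the file swap itself safer is to encrypt into a temporary file and rename it into place, so the unit never sees a half-written credential. This is only a sketch: the function name &lt;code&gt;rotate_credential&lt;/code&gt; and the &lt;code&gt;ENCRYPT_CMD&lt;/code&gt; override are hypothetical, and &lt;code&gt;--name=&lt;/code&gt; is passed so the embedded name equals the credential ID.&lt;/p&gt;

```shell
# Sketch: replace an encrypted credential file in a single rename.
# ENCRYPT_CMD is a hypothetical override for testing; by default the real
# systemd-creds binary is used, which normally needs root.
rotate_credential() {
  cred_name=$1    # credential ID used in LoadCredentialEncrypted=
  cred_file=$2    # final path of the encrypted file
  tmp_file="$cred_file.new.$$"

  # Encrypt the new secret from stdin into a temporary file first.
  if ! "${ENCRYPT_CMD:-systemd-creds}" encrypt --name="$cred_name" - "$tmp_file"; then
    rm -f "$tmp_file"
    return 1
  fi

  chmod 0600 "$tmp_file"
  # A rename inside one directory is atomic, so a reader sees either the
  # old file or the new one, never a partial write.
  mv -f "$tmp_file" "$cred_file"
}
```

&lt;p&gt;Run as root, it would be called as &lt;code&gt;printf '%s' 'new-rotated-token' | rotate_credential api-token /etc/demo-secrets/api-token.cred&lt;/code&gt;, followed by the same &lt;code&gt;systemctl restart&lt;/code&gt; as in this step.&lt;/p&gt;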




&lt;h2&gt;
  
  
  Migration checklist (real services)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Remove secret values from &lt;code&gt;Environment=&lt;/code&gt; and &lt;code&gt;.env&lt;/code&gt; files&lt;/li&gt;
&lt;li&gt;[ ] Move secret inputs to &lt;code&gt;LoadCredential=&lt;/code&gt; / &lt;code&gt;LoadCredentialEncrypted=&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;[ ] Update app code to read from &lt;code&gt;$CREDENTIALS_DIRECTORY/&amp;lt;id&amp;gt;&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;[ ] Ensure logs never print secret values&lt;/li&gt;
&lt;li&gt;[ ] Restrict service permissions (&lt;code&gt;NoNewPrivileges&lt;/code&gt;, &lt;code&gt;ProtectSystem&lt;/code&gt;, etc.)&lt;/li&gt;
&lt;li&gt;[ ] Document rotation runbook&lt;/li&gt;
&lt;/ul&gt;
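
&lt;p&gt;The first checklist item is easy to spot-check with a filter. The sketch below is a heuristic only: the function name &lt;code&gt;find_env_secrets&lt;/code&gt; and the keyword list are assumptions, and it will miss secrets with innocuous names and flag harmless variables such as &lt;code&gt;KEYBOARD&lt;/code&gt;.&lt;/p&gt;

```shell
# Sketch: list Environment= variables whose names look like secrets.
# Reads unit file text on stdin and prints one variable name per line.
find_env_secrets() {
  grep -E '^[[:space:]]*Environment=' \
    | sed -E 's/^[[:space:]]*Environment=//' \
    | tr -d '"' \
    | tr ' ' '\n' \
    | grep -Ei '^[a-z0-9_]*(token|secret|passw|key)[a-z0-9_]*=' \
    | cut -d= -f1 \
    | sort -u
}
```

&lt;p&gt;For an installed unit, &lt;code&gt;systemctl cat demo-secrets.service | find_env_secrets&lt;/code&gt; prints the suspicious names, or nothing when no match is found.&lt;/p&gt;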




&lt;h2&gt;
  
  
  Common gotchas
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Wrong credential ID/file mismatch&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;LoadCredential=name:path&lt;/code&gt; exposes the secret as &lt;code&gt;$CREDENTIALS_DIRECTORY/name&lt;/code&gt;, so the file name the app opens must match the credential ID exactly.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;App still expects env vars&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If the app cannot be changed to read the file directly, add a small startup shim that reads the credential file and sets the variable only for the app process.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Permissions confusion&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The service manager reads the source file when the unit starts and exposes a copy in the credential directory that only the service can read, so the source file itself can stay root-only.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Printing secrets while debugging&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Never &lt;code&gt;cat&lt;/code&gt; secret values into the journal. Log hashes or lengths only.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
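
&lt;p&gt;For the second gotcha, a shim along these lines might be enough. It is a sketch under stated assumptions: &lt;code&gt;API_TOKEN&lt;/code&gt; is a hypothetical variable name the app expects, &lt;code&gt;api-token&lt;/code&gt; is the credential ID from this guide, and the function names are made up.&lt;/p&gt;

```shell
# Sketch of a startup shim for an app that only reads environment variables.
read_credential() {
  # Abort with a message when the unit was started without credentials.
  cred_dir=${CREDENTIALS_DIRECTORY:?not set, unit has no credentials}
  cat "$cred_dir/$1"
}

run_with_token() {
  token=$(read_credential api-token) || return 1
  # Set the variable only in the wrapped command's environment; the shim
  # itself never exports it.
  env API_TOKEN="$token" "$@"
}

# A shim script would end with:  run_with_token "$@"
```

&lt;p&gt;Keep in mind that the wrapped process then carries the secret in its environment again, so this trades away part of the benefit and is best treated as a stopgap.&lt;/p&gt;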




&lt;h2&gt;
  
  
  Final thought
&lt;/h2&gt;

&lt;p&gt;This is one of those upgrades that reduces risk without adding operational pain. Once you switch to systemd credentials, secret handling becomes explicit, auditable, and less fragile than &lt;code&gt;.env&lt;/code&gt;-driven service configs.&lt;/p&gt;

&lt;p&gt;If you’re already using systemd units in production, this is low-effort, high-impact hardening.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;systemd credential docs: &lt;a href="https://systemd.io/CREDENTIALS/" rel="noopener noreferrer"&gt;https://systemd.io/CREDENTIALS/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;systemd.exec(5)&lt;/code&gt; (&lt;code&gt;LoadCredential=&lt;/code&gt;, &lt;code&gt;LoadCredentialEncrypted=&lt;/code&gt;): &lt;a href="https://www.freedesktop.org/software/systemd/man/latest/systemd.exec.html" rel="noopener noreferrer"&gt;https://www.freedesktop.org/software/systemd/man/latest/systemd.exec.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;systemd-creds(1)&lt;/code&gt;: &lt;a href="https://www.freedesktop.org/software/systemd/man/latest/systemd-creds.html" rel="noopener noreferrer"&gt;https://www.freedesktop.org/software/systemd/man/latest/systemd-creds.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Linux &lt;code&gt;/proc/&amp;lt;pid&amp;gt;/environ&lt;/code&gt; semantics (&lt;code&gt;proc_pid_environ(5)&lt;/code&gt;): &lt;a href="https://man7.org/linux/man-pages/man5/proc_pid_environ.5.html" rel="noopener noreferrer"&gt;https://man7.org/linux/man-pages/man5/proc_pid_environ.5.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>linux</category>
      <category>systemd</category>
      <category>security</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Stop Running Risky One-Off Commands as Root: Sandbox Them with systemd-run</title>
      <dc:creator>Lyra</dc:creator>
      <pubDate>Mon, 09 Mar 2026 05:02:18 +0000</pubDate>
      <link>https://dev.to/lyraalishaikh/stop-running-risky-one-off-commands-as-root-sandbox-them-with-systemd-run-neo</link>
      <guid>https://dev.to/lyraalishaikh/stop-running-risky-one-off-commands-as-root-sandbox-them-with-systemd-run-neo</guid>
      <description>&lt;p&gt;If you’ve ever run a one-off command like this on a production box:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;bash suspicious-script.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;…you already know the risk: it has your full filesystem, full network, full privileges, and no guardrails.&lt;/p&gt;

&lt;p&gt;For long-running services, we usually harden unit files. But for &lt;strong&gt;ad-hoc commands&lt;/strong&gt;, people often skip safety.&lt;/p&gt;

&lt;p&gt;This is where &lt;code&gt;systemd-run&lt;/code&gt; is underrated: it lets you launch a &lt;strong&gt;transient unit&lt;/strong&gt; with hardening flags and resource limits &lt;em&gt;without writing a permanent service file&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;In this guide, I’ll show a practical pattern you can reuse.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why &lt;code&gt;systemd-run&lt;/code&gt; for one-off tasks?
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;systemd-run&lt;/code&gt; creates transient &lt;code&gt;.service&lt;/code&gt; or &lt;code&gt;.scope&lt;/code&gt; units and passes normal unit properties via &lt;code&gt;-p/--property&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That means you can apply the same controls you’d use in hardened service files, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Filesystem restrictions (&lt;code&gt;ProtectSystem&lt;/code&gt;, &lt;code&gt;ProtectHome&lt;/code&gt;, &lt;code&gt;ReadWritePaths&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Privilege hardening (&lt;code&gt;NoNewPrivileges&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Namespace isolation (&lt;code&gt;PrivateTmp&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Resource caps (&lt;code&gt;MemoryMax&lt;/code&gt;, &lt;code&gt;CPUQuota&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This gives you a “safer blast radius” for temporary jobs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Linux host with systemd&lt;/li&gt;
&lt;li&gt;Root or sudo privileges&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;systemd-run&lt;/code&gt; available (usually from systemd package)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Quick check:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemd-run &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Pattern 1: Safe default sandbox for an untrusted script
&lt;/h2&gt;

&lt;p&gt;Assume you need to execute &lt;code&gt;vendor-maintenance.sh&lt;/code&gt; from the current directory, but you don’t fully trust what it might touch. A transient service starts in &lt;code&gt;/&lt;/code&gt; rather than in your shell’s working directory, so pass the script by absolute path.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemd-run &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;adhoc-sandbox-&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%s&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--wait&lt;/span&gt; &lt;span class="nt"&gt;--collect&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;NoNewPrivileges&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;yes&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;PrivateTmp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;yes&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;ProtectHome&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;read-only &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;ProtectSystem&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;strict &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;ReadWritePaths&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/var/tmp &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;MemoryMax&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1G &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;CPUQuota&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;50% &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--service-type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  /usr/bin/bash &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PWD&lt;/span&gt;&lt;span class="s2"&gt;/vendor-maintenance.sh"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  What these settings do
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
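&lt;code&gt;ProtectSystem=strict&lt;/code&gt;: mounts the whole filesystem hierarchy read-only for the unit, apart from &lt;code&gt;/dev&lt;/code&gt;, &lt;code&gt;/proc&lt;/code&gt;, and &lt;code&gt;/sys&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ReadWritePaths=/var/tmp&lt;/code&gt;: explicitly allows write access only where needed.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ProtectHome=read-only&lt;/code&gt;: makes &lt;code&gt;/home&lt;/code&gt;, &lt;code&gt;/root&lt;/code&gt;, and &lt;code&gt;/run/user&lt;/code&gt; read-only for the unit.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;PrivateTmp=yes&lt;/code&gt;: gives the unit its own private &lt;code&gt;/tmp&lt;/code&gt; and &lt;code&gt;/var/tmp&lt;/code&gt;. Files left there are removed when the unit stops, so anything you want to keep belongs in a different writable path.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;NoNewPrivileges=yes&lt;/code&gt;: prevents the process and its children from gaining privileges through setuid/setgid binaries or file capabilities.&lt;/li&gt;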
&lt;li&gt;
&lt;code&gt;MemoryMax&lt;/code&gt;/&lt;code&gt;CPUQuota&lt;/code&gt;: keeps runaway jobs from starving the host.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--wait --collect&lt;/code&gt;: wait for completion and clean up transient unit metadata.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Tip: start restrictive, then open only what the task truly needs.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Pattern 2: Allow a specific writable work dir (and nothing else)
&lt;/h2&gt;

&lt;p&gt;For backup or report scripts that must write artifacts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo install&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; 0750 /var/lib/adhoc-jobs/output

&lt;span class="nb"&gt;sudo &lt;/span&gt;systemd-run &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;adhoc-report-&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%s&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--wait&lt;/span&gt; &lt;span class="nt"&gt;--collect&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;ProtectSystem&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;strict &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;ProtectHome&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;yes&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;PrivateTmp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;yes&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;NoNewPrivileges&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;yes&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;ReadWritePaths&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/var/lib/adhoc-jobs/output &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--service-type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  /usr/local/bin/generate-report.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the script fails with permission errors, that’s often good news: your policy is actually blocking unexpected writes.&lt;/p&gt;
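
&lt;p&gt;Patterns 1 and 2 differ mainly in the writable path and the command, so the shared flags can live in a small wrapper. A minimal sketch, assuming a hypothetical function name &lt;code&gt;sandbox_run&lt;/code&gt; and a hypothetical &lt;code&gt;DRY_RUN&lt;/code&gt; switch that prints the command line instead of executing it:&lt;/p&gt;

```shell
# Sketch: run a command in a transient sandbox with one writable path.
# Usage: sandbox_run WRITABLE_DIR COMMAND [ARGS...]
sandbox_run() {
  [ "$#" -ge 2 ] || return 2
  writable=$1
  shift
  # Prepend the systemd-run invocation to the command and its arguments.
  set -- systemd-run --wait --collect \
    -p NoNewPrivileges=yes \
    -p PrivateTmp=yes \
    -p ProtectHome=yes \
    -p ProtectSystem=strict \
    -p "ReadWritePaths=$writable" \
    -p MemoryMax=1G \
    -p CPUQuota=50% \
    --service-type=exec \
    "$@"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    # Show what would be executed without touching systemd.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

&lt;p&gt;Run as root, &lt;code&gt;sandbox_run /var/lib/adhoc-jobs/output /usr/local/bin/generate-report.sh&lt;/code&gt; would start the report job with the same policy as above. Setting &lt;code&gt;DRY_RUN=1&lt;/code&gt; first lets you review the assembled command line.&lt;/p&gt;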




&lt;h2&gt;
  
  
  Pattern 3: Dry-run your policy with a harmless probe
&lt;/h2&gt;

&lt;p&gt;Before running the real script, validate that writes are constrained.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemd-run &lt;span class="nt"&gt;--wait&lt;/span&gt; &lt;span class="nt"&gt;--collect&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;ProtectSystem&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;strict &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="nv"&gt;ReadWritePaths&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/var/tmp &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--service-type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  /usr/bin/bash &lt;span class="nt"&gt;-lc&lt;/span&gt; &lt;span class="s1"&gt;'touch /etc/should-fail; touch /var/tmp/should-pass'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Expected outcome:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/etc/should-fail&lt;/code&gt; should fail (read-only path)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/var/tmp/should-pass&lt;/code&gt; should succeed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This quick test catches bad policy assumptions early.&lt;/p&gt;
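
&lt;p&gt;If you probe more than two paths, a tiny reporter makes the result easier to read than raw &lt;code&gt;touch&lt;/code&gt; errors. The following is a sketch with a made-up function name, &lt;code&gt;probe_writes&lt;/code&gt;. It skips paths that already exist so it never deletes real files.&lt;/p&gt;

```shell
# Sketch: report which of the given paths can be created from here.
probe_writes() {
  for path in "$@"; do
    if [ -e "$path" ]; then
      # Never touch or remove something that was already there.
      printf 'SKIPPED %s\n' "$path"
      continue
    fi
    if touch "$path"; then
      printf 'ALLOWED %s\n' "$path"
      rm -f "$path"
    else
      printf 'BLOCKED %s\n' "$path"
    fi
  done
}

# A probe script would end with:  probe_writes "$@"
```

&lt;p&gt;Saved as a script (the path is up to you) and launched through &lt;code&gt;systemd-run&lt;/code&gt; with the policy under test plus &lt;code&gt;--pipe&lt;/code&gt;, it prints one line per probed path, for example one under &lt;code&gt;/etc&lt;/code&gt; and one under &lt;code&gt;/var/tmp&lt;/code&gt;.&lt;/p&gt;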




&lt;h2&gt;
  
  
  Auditing and debugging transient runs
&lt;/h2&gt;

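&lt;p&gt;List transient units that are still loaded (a unit started with &lt;code&gt;--collect&lt;/code&gt; is unloaded as soon as it finishes, so only running ones show up):&lt;br&gt;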
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl list-units &lt;span class="s1"&gt;'adhoc-*'&lt;/span&gt; &lt;span class="nt"&gt;--all&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inspect logs for a specific run (journal entries remain available after the unit itself has been collected):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;journalctl &lt;span class="nt"&gt;-u&lt;/span&gt; adhoc-sandbox-1234567890 &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nt"&gt;--no-pager&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check the applied unit properties while the unit is still running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl show adhoc-sandbox-1234567890 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; ProtectSystem &lt;span class="nt"&gt;-p&lt;/span&gt; ProtectHome &lt;span class="nt"&gt;-p&lt;/span&gt; PrivateTmp &lt;span class="nt"&gt;-p&lt;/span&gt; MemoryMax &lt;span class="nt"&gt;-p&lt;/span&gt; CPUQuotaPerSecUSec
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Common mistakes to avoid
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Using &lt;code&gt;ProtectSystem=strict&lt;/code&gt; without &lt;code&gt;ReadWritePaths&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your script may break because everything is read-only. Add minimal allowlist paths.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Skipping &lt;code&gt;--wait&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
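&lt;li&gt;Without &lt;code&gt;--wait&lt;/code&gt;, &lt;code&gt;systemd-run&lt;/code&gt; returns as soon as the unit has been started, so scripts and CI jobs never see the command’s exit status.&lt;/li&gt;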
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Giving broad writable paths too early&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ReadWritePaths=/&lt;/code&gt; defeats the point. Keep the write allowlist tiny.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Forgetting resource caps for unknown scripts&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add at least &lt;code&gt;MemoryMax&lt;/code&gt; and &lt;code&gt;CPUQuota&lt;/code&gt; for safer host behavior.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  When to use this vs a normal unit file
&lt;/h2&gt;

&lt;p&gt;Use &lt;code&gt;systemd-run&lt;/code&gt; when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you need an ad-hoc or infrequent operation,&lt;/li&gt;
&lt;li&gt;you want hardening without maintaining permanent unit files,&lt;/li&gt;
&lt;li&gt;you’re testing an execution policy quickly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use a persistent unit file when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the task is repeated/scheduled long-term,&lt;/li&gt;
&lt;li&gt;you need version-controlled service definitions,&lt;/li&gt;
&lt;li&gt;multiple operators need stable, named config.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Final takeaway
&lt;/h2&gt;

&lt;p&gt;If a command is risky enough that you’d hesitate to run it as root directly, run it in a transient sandbox instead.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;systemd-run&lt;/code&gt; gives you the speed of one-off execution with much better safety boundaries.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;systemd-run&lt;/code&gt; manual (freedesktop): &lt;a href="https://www.freedesktop.org/software/systemd/man/latest/systemd-run.html" rel="noopener noreferrer"&gt;https://www.freedesktop.org/software/systemd/man/latest/systemd-run.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;systemd.exec&lt;/code&gt; manual (freedesktop): &lt;a href="https://www.freedesktop.org/software/systemd/man/latest/systemd.exec.html" rel="noopener noreferrer"&gt;https://www.freedesktop.org/software/systemd/man/latest/systemd.exec.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;systemd.resource-control&lt;/code&gt; manual (freedesktop): &lt;a href="https://www.freedesktop.org/software/systemd/man/latest/systemd.resource-control.html" rel="noopener noreferrer"&gt;https://www.freedesktop.org/software/systemd/man/latest/systemd.resource-control.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;systemd-run(1)&lt;/code&gt; man page mirror (man7): &lt;a href="https://man7.org/linux/man-pages/man1/systemd-run.1.html" rel="noopener noreferrer"&gt;https://man7.org/linux/man-pages/man1/systemd-run.1.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>linux</category>
      <category>systemd</category>
      <category>security</category>
      <category>automation</category>
    </item>
    <item>
      <title>Never Miss TLS Expiry Again on Linux: OpenSSL Checks + systemd Timer + Actionable Alerts</title>
      <dc:creator>Lyra</dc:creator>
      <pubDate>Sun, 08 Mar 2026 10:25:51 +0000</pubDate>
      <link>https://dev.to/lyraalishaikh/never-miss-tls-expiry-again-on-linux-openssl-checks-systemd-timer-actionable-alerts-4mna</link>
      <guid>https://dev.to/lyraalishaikh/never-miss-tls-expiry-again-on-linux-openssl-checks-systemd-timer-actionable-alerts-4mna</guid>
      <description>&lt;h1&gt;
  
  
  Never Miss TLS Expiry Again on Linux: OpenSSL Checks + systemd Timer + Actionable Alerts
&lt;/h1&gt;

&lt;p&gt;Expired TLS certificates still cause outages, and they are among the easiest outages to avoid.&lt;/p&gt;

&lt;p&gt;In this guide, we’ll build a small, auditable monitor that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;checks multiple domains daily,&lt;/li&gt;
&lt;li&gt;uses proper SNI (&lt;code&gt;-servername&lt;/code&gt;) so you inspect the right certificate,&lt;/li&gt;
&lt;li&gt;fails when expiry is within your threshold,&lt;/li&gt;
&lt;li&gt;logs to &lt;code&gt;journalctl&lt;/code&gt;,&lt;/li&gt;
&lt;li&gt;and optionally sends alerts to a webhook.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No SaaS required.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why this approach works
&lt;/h2&gt;

&lt;p&gt;Two OpenSSL features do most of the heavy lifting:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;openssl s_client&lt;/code&gt; can fetch a live server certificate chain from &lt;code&gt;host:443&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;openssl x509 -checkend &amp;lt;seconds&amp;gt;&lt;/code&gt; exits non-zero if the cert expires within the specified window.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Because the result is an exit code, the check slots directly into scripts and systemd timers.&lt;/p&gt;




&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Linux host with &lt;code&gt;systemd&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;&lt;code&gt;openssl&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;bash&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;curl&lt;/code&gt; (optional, for webhook alerts)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Install on Debian/Ubuntu:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; openssl curl
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 1) Create a domain inventory
&lt;/h2&gt;

&lt;p&gt;Create &lt;code&gt;/etc/tls-monitor/domains.txt&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;example.com
api.example.com
status.example.net
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



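&lt;p&gt;One FQDN per line; the script skips blank lines and lines starting with &lt;code&gt;#&lt;/code&gt;. Create the directory and the file:&lt;br&gt;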
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo install&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; 0755 /etc/tls-monitor
sudoedit /etc/tls-monitor/domains.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 2) Add the monitor script
&lt;/h2&gt;

&lt;p&gt;Create &lt;code&gt;/usr/local/bin/tls-expiry-check.sh&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-euo&lt;/span&gt; pipefail

&lt;span class="nv"&gt;DOMAINS_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/etc/tls-monitor/domains.txt"&lt;/span&gt;
&lt;span class="nv"&gt;WARN_DAYS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;WARN_DAYS&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;21&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nv"&gt;WARN_SECONDS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt; WARN_DAYS &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="m"&gt;24&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="m"&gt;3600&lt;/span&gt; &lt;span class="k"&gt;))&lt;/span&gt;

&lt;span class="c"&gt;# Optional webhook endpoint (Slack/Discord/ntfy/custom)&lt;/span&gt;
&lt;span class="nv"&gt;WEBHOOK_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;WEBHOOK_URL&lt;/span&gt;&lt;span class="k"&gt;:-}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

log&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  logger &lt;span class="nt"&gt;-t&lt;/span&gt; tls-expiry-check &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$*&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'%s\n'&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$*&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

alert&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;msg&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$WEBHOOK_URL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;curl &lt;span class="nt"&gt;-fsS&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="se"&gt;\&lt;/span&gt;
      &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s1"&gt;'Content-Type: text/plain; charset=utf-8'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
      &lt;span class="nt"&gt;--data&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$msg&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
      &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$WEBHOOK_URL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;true
  &lt;/span&gt;&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DOMAINS_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;log &lt;span class="s2"&gt;"ERROR: Missing domains file: &lt;/span&gt;&lt;span class="nv"&gt;$DOMAINS_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="nb"&gt;exit &lt;/span&gt;2
&lt;span class="k"&gt;fi

&lt;/span&gt;&lt;span class="nv"&gt;rc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0

&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nv"&gt;IFS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; domain&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$domain&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$domain&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;~ ^# &lt;span class="o"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;

  &lt;span class="c"&gt;# Fetch the leaf cert once (as PEM) with SNI; timeout prevents hangs.&lt;/span&gt;
  &lt;span class="c"&gt;# The domain is passed as $1 so it is never parsed as shell code.&lt;/span&gt;
  &lt;span class="nv"&gt;cert_pem&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;timeout &lt;/span&gt;20s bash &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="s1"&gt;'echo | openssl s_client -connect "$1:443" -servername "$1" 2&amp;gt;/dev/null \
      | openssl x509'&lt;/span&gt; _ &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$domain&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
      log &lt;span class="s2"&gt;"CRIT: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;domain&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; connection/cert retrieval failed"&lt;/span&gt;
      alert &lt;span class="s2"&gt;"CRIT: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;domain&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; connection/cert retrieval failed"&lt;/span&gt;
      &lt;span class="nv"&gt;rc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
      &lt;span class="k"&gt;continue&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

  &lt;span class="c"&gt;# Derive the display fields and the expiry verdict from the same fetched cert.&lt;/span&gt;
  &lt;span class="nv"&gt;enddate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'%s\n'&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$cert_pem&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | openssl x509 &lt;span class="nt"&gt;-noout&lt;/span&gt; &lt;span class="nt"&gt;-enddate&lt;/span&gt; | &lt;span class="nb"&gt;cut&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nt"&gt;-f2-&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
  &lt;span class="nv"&gt;subject&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'%s\n'&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$cert_pem&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | openssl x509 &lt;span class="nt"&gt;-noout&lt;/span&gt; &lt;span class="nt"&gt;-subject&lt;/span&gt; | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="s1"&gt;'s/^subject=//'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'%s\n'&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$cert_pem&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
      | openssl x509 &lt;span class="nt"&gt;-noout&lt;/span&gt; &lt;span class="nt"&gt;-checkend&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$WARN_SECONDS&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;log &lt;span class="s2"&gt;"OK: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;domain&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; cert valid &amp;gt; &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;WARN_DAYS&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;d (notAfter=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;enddate&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;; subject=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;subject&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;)"&lt;/span&gt;
  &lt;span class="k"&gt;else
    &lt;/span&gt;log &lt;span class="s2"&gt;"WARN: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;domain&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; cert expires within &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;WARN_DAYS&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;d (notAfter=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;enddate&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;; subject=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;subject&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;)"&lt;/span&gt;
    alert &lt;span class="s2"&gt;"WARN: &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;domain&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; cert expires within &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;WARN_DAYS&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;d (notAfter=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;enddate&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;)"&lt;/span&gt;
    &lt;span class="nv"&gt;rc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
  &lt;span class="k"&gt;fi
done&lt;/span&gt; &amp;lt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DOMAINS_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="nb"&gt;exit&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$rc&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set permissions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo chmod &lt;/span&gt;0755 /usr/local/bin/tls-expiry-check.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Tip: if you have strict egress controls, allow outbound TCP/443 from this monitoring host.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Step 3) Add a systemd service and timer
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;/etc/systemd/system/tls-expiry-check.service&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Unit]&lt;/span&gt;
&lt;span class="py"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;TLS certificate expiry check&lt;/span&gt;
&lt;span class="py"&gt;Wants&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;network-online.target&lt;/span&gt;
&lt;span class="py"&gt;After&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;network-online.target&lt;/span&gt;

&lt;span class="nn"&gt;[Service]&lt;/span&gt;
&lt;span class="py"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;oneshot&lt;/span&gt;
&lt;span class="py"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;WARN_DAYS=21&lt;/span&gt;
&lt;span class="c"&gt;# Optional webhook
# Environment=WEBHOOK_URL=https://hooks.example.net/notify
&lt;/span&gt;&lt;span class="py"&gt;ExecStart&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/usr/local/bin/tls-expiry-check.sh&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;/etc/systemd/system/tls-expiry-check.timer&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Unit]&lt;/span&gt;
&lt;span class="py"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;Daily TLS certificate expiry check&lt;/span&gt;

&lt;span class="nn"&gt;[Timer]&lt;/span&gt;
&lt;span class="py"&gt;OnCalendar&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;*-*-* 06:30:00&lt;/span&gt;
&lt;span class="py"&gt;Persistent&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;true&lt;/span&gt;
&lt;span class="py"&gt;RandomizedDelaySec&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;10m&lt;/span&gt;

&lt;span class="nn"&gt;[Install]&lt;/span&gt;
&lt;span class="py"&gt;WantedBy&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;timers.target&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enable and start:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reload
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable&lt;/span&gt; &lt;span class="nt"&gt;--now&lt;/span&gt; tls-expiry-check.timer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Validate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;systemctl list-timers &lt;span class="nt"&gt;--all&lt;/span&gt; | &lt;span class="nb"&gt;grep &lt;/span&gt;tls-expiry-check
systemctl status tls-expiry-check.timer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 4) Test before trusting it
&lt;/h2&gt;

&lt;p&gt;Run once manually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start tls-expiry-check.service
&lt;span class="nb"&gt;sudo &lt;/span&gt;journalctl &lt;span class="nt"&gt;-u&lt;/span&gt; tls-expiry-check.service &lt;span class="nt"&gt;-n&lt;/span&gt; 100 &lt;span class="nt"&gt;--no-pager&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Temporary aggressive threshold test (warn on certs expiring within 400 days):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl edit tls-expiry-check.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Drop-in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Service]&lt;/span&gt;
&lt;span class="py"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;WARN_DAYS=400&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reload
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start tls-expiry-check.service
&lt;span class="nb"&gt;sudo &lt;/span&gt;journalctl &lt;span class="nt"&gt;-u&lt;/span&gt; tls-expiry-check.service &lt;span class="nt"&gt;-n&lt;/span&gt; 100 &lt;span class="nt"&gt;--no-pager&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Revert the override after testing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl revert tls-expiry-check.service
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reload
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Common pitfalls (and fixes)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1) Wrong certificate due to missing SNI
&lt;/h3&gt;

&lt;p&gt;Without &lt;code&gt;-servername domain&lt;/code&gt;, multi-tenant endpoints can return a default cert. Always set SNI.&lt;/p&gt;

&lt;h3&gt;
  
  
  2) “It worked in browser but fails in script”
&lt;/h3&gt;

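&lt;p&gt;The certificate check itself does not depend on the host trust store: &lt;code&gt;openssl s_client&lt;/code&gt;, as invoked here, reports chain verification errors but still completes the handshake. The &lt;code&gt;curl&lt;/code&gt; webhook call does verify the server certificate, so an outdated trust store can break alerting. On Debian-family systems, refresh it via the &lt;code&gt;ca-certificates&lt;/code&gt; package and &lt;code&gt;update-ca-certificates&lt;/code&gt;.&lt;/p&gt;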

&lt;h3&gt;
  
  
  3) Silent failures from hung connections
&lt;/h3&gt;

&lt;p&gt;Use &lt;code&gt;timeout&lt;/code&gt; so one bad endpoint doesn’t stall the whole run.&lt;/p&gt;
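
&lt;p&gt;systemd can add a second guard on top of the per-connection &lt;code&gt;timeout&lt;/code&gt;. &lt;code&gt;Type=oneshot&lt;/code&gt; services have no start timeout by default, so a drop-in (created with &lt;code&gt;sudo systemctl edit tls-expiry-check.service&lt;/code&gt;) can cap the whole run. A minimal sketch, assuming five minutes is generous for your domain list; the value is a placeholder, not a recommendation:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Service]&lt;/span&gt;
&lt;span class="c"&gt;# Placeholder: terminate the run if it has not finished after 5 minutes&lt;/span&gt;
&lt;span class="py"&gt;TimeoutStartSec&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;5min&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;When the limit is hit, the unit ends up in a failed state, which shows in &lt;code&gt;journalctl&lt;/code&gt; like any other failed run.&lt;/p&gt;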

&lt;h3&gt;
  
  
  4) Timer missed during downtime
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;Persistent=true&lt;/code&gt; makes the job run when the machine comes back, instead of skipping the missed window.&lt;/p&gt;




&lt;h2&gt;
  
  
  Optional hardening ideas
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Run the script as a dedicated non-root user.&lt;/li&gt;
&lt;li&gt;Move domains and threshold to &lt;code&gt;/etc/tls-monitor/*.env&lt;/code&gt; and load via &lt;code&gt;EnvironmentFile=&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Send alerts to your existing stack (Alertmanager, ntfy, Slack, Discord).&lt;/li&gt;
&lt;li&gt;Add a second check from a different network path (internal + external perspective).&lt;/li&gt;
&lt;/ul&gt;
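
&lt;p&gt;The first two ideas can be combined in one drop-in. This is a sketch under assumptions: &lt;code&gt;DynamicUser=yes&lt;/code&gt; is used in place of a manually created account, and the file name &lt;code&gt;/etc/tls-monitor/monitor.env&lt;/code&gt; is a hypothetical choice. The script reads the domain list from a fixed path, so only &lt;code&gt;WARN_DAYS&lt;/code&gt; and &lt;code&gt;WEBHOOK_URL&lt;/code&gt; move into the environment file.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Service]&lt;/span&gt;
&lt;span class="c"&gt;# Run as a transient unprivileged user instead of root;&lt;/span&gt;
&lt;span class="c"&gt;# this also turns on several sandboxing options such as PrivateTmp=&lt;/span&gt;
&lt;span class="py"&gt;DynamicUser&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;yes&lt;/span&gt;
&lt;span class="c"&gt;# Hypothetical file containing lines such as WARN_DAYS=21 and WEBHOOK_URL=...&lt;/span&gt;
&lt;span class="py"&gt;EnvironmentFile&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/etc/tls-monitor/monitor.env&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Values loaded through &lt;code&gt;EnvironmentFile=&lt;/code&gt; override those set with &lt;code&gt;Environment=&lt;/code&gt;, so the &lt;code&gt;WARN_DAYS=21&lt;/code&gt; line in the main unit can stay as a fallback.&lt;/p&gt;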




&lt;h2&gt;
  
  
  Quick rollback / disable
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl disable &lt;span class="nt"&gt;--now&lt;/span&gt; tls-expiry-check.timer
&lt;span class="nb"&gt;sudo rm&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /etc/systemd/system/tls-expiry-check.&lt;span class="o"&gt;{&lt;/span&gt;service,timer&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reload
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Final thought
&lt;/h2&gt;

&lt;p&gt;This is one of those tiny automations with outsized impact: a few lines of script can prevent a very public outage.&lt;/p&gt;

&lt;p&gt;If you already run systemd timers for backups, FIM, or patch checks, cert lifecycle monitoring belongs in that same baseline.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;OpenSSL &lt;code&gt;x509&lt;/code&gt; manual (&lt;code&gt;-checkend&lt;/code&gt;): &lt;a href="https://docs.openssl.org/3.2/man1/openssl-x509/" rel="noopener noreferrer"&gt;https://docs.openssl.org/3.2/man1/openssl-x509/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;OpenSSL &lt;code&gt;s_client&lt;/code&gt; manual (&lt;code&gt;-connect&lt;/code&gt;, &lt;code&gt;-servername&lt;/code&gt;): &lt;a href="https://docs.openssl.org/3.4/man1/openssl-s_client/" rel="noopener noreferrer"&gt;https://docs.openssl.org/3.4/man1/openssl-s_client/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;systemd.timer(5)&lt;/code&gt;: &lt;a href="https://man7.org/linux/man-pages/man5/systemd.timer.5.html" rel="noopener noreferrer"&gt;https://man7.org/linux/man-pages/man5/systemd.timer.5.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;ArchWiki, &lt;code&gt;Persistent=true&lt;/code&gt; timer behavior example: &lt;a href="https://wiki.archlinux.org/title/Systemd/Timers" rel="noopener noreferrer"&gt;https://wiki.archlinux.org/title/Systemd/Timers&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;curl TLS verification behavior: &lt;a href="https://curl.se/docs/sslcerts.html" rel="noopener noreferrer"&gt;https://curl.se/docs/sslcerts.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Debian &lt;code&gt;update-ca-certificates(8)&lt;/code&gt;: &lt;a href="https://manpages.debian.org/testing/ca-certificates/update-ca-certificates.8.en.html" rel="noopener noreferrer"&gt;https://manpages.debian.org/testing/ca-certificates/update-ca-certificates.8.en.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>linux</category>
      <category>security</category>
      <category>automation</category>
      <category>devops</category>
    </item>
    <item>
      <title>Stop Guessing Disk Health on Linux: SMART + NVMe Checks with systemd Timer Alerts</title>
      <dc:creator>Lyra</dc:creator>
      <pubDate>Sat, 07 Mar 2026 05:01:59 +0000</pubDate>
      <link>https://dev.to/lyraalishaikh/stop-guessing-disk-health-on-linux-smart-nvme-checks-with-systemd-timer-alerts-3kgj</link>
      <guid>https://dev.to/lyraalishaikh/stop-guessing-disk-health-on-linux-smart-nvme-checks-with-systemd-timer-alerts-3kgj</guid>
      <description>&lt;p&gt;Your backups can be perfect and your services can be hardened, but if storage health drifts silently, you still lose weekends (and sometimes data).&lt;/p&gt;

&lt;p&gt;This guide gives you a &lt;strong&gt;practical, auditable disk-health workflow&lt;/strong&gt; on Linux:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;scan ATA/SATA/SAS/NVMe devices&lt;/li&gt;
&lt;li&gt;run health checks with &lt;code&gt;smartctl&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;pull NVMe telemetry with &lt;code&gt;nvme smart-log&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;fail loudly in systemd/journald when something is wrong&lt;/li&gt;
&lt;li&gt;schedule checks with a persistent timer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No dashboards required. Just signals you can trust.&lt;/p&gt;




&lt;h2&gt;
  
  
  1) Install tools
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Debian/Ubuntu
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; smartmontools nvme-cli jq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  RHEL/Fedora
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dnf &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; smartmontools nvme-cli jq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;smartmontools&lt;/code&gt; provides &lt;code&gt;smartctl&lt;/code&gt; and &lt;code&gt;smartd&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  2) Discover devices safely
&lt;/h2&gt;

&lt;p&gt;Use &lt;code&gt;smartctl --scan-open&lt;/code&gt; to enumerate devices that smartctl can probe:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;smartctl &lt;span class="nt"&gt;--scan-open&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You’ll see lines like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/dev/sda -d sat # /dev/sda, ATA device
/dev/nvme0 -d nvme # /dev/nvme0, NVMe device
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Keep the &lt;code&gt;-d&lt;/code&gt; type from scan output. It avoids ambiguous probing on some controllers.&lt;/p&gt;




&lt;h2&gt;
  
  
  3) Create a robust health-check script
&lt;/h2&gt;

&lt;p&gt;Save as &lt;code&gt;/usr/local/sbin/check-disk-health.sh&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-euo&lt;/span&gt; pipefail

&lt;span class="nv"&gt;LOG_TAG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"disk-health-check"&lt;/span&gt;
&lt;span class="nv"&gt;RC&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0

log&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  systemd-cat &lt;span class="nt"&gt;-t&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOG_TAG&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$*&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;# Logs the result; sets RC=1 when smartctl exits non-zero (warning/failure bits or a command error).&lt;/span&gt;
check_smart&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;dev&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$2&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

  &lt;span class="c"&gt;# -H overall health, -A attributes, -l error/selftest logs&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;smartctl &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="nt"&gt;-A&lt;/span&gt; &lt;span class="nt"&gt;-l&lt;/span&gt; error &lt;span class="nt"&gt;-l&lt;/span&gt; selftest &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$dtype&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$dev&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/tmp/smart-&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;dev&lt;/span&gt;&lt;span class="p"&gt;##*/&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;.log 2&amp;gt;&amp;amp;1&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;log &lt;span class="s2"&gt;"OK SMART: &lt;/span&gt;&lt;span class="nv"&gt;$dev&lt;/span&gt;&lt;span class="s2"&gt; (&lt;/span&gt;&lt;span class="nv"&gt;$dtype&lt;/span&gt;&lt;span class="s2"&gt;)"&lt;/span&gt;
  &lt;span class="k"&gt;else
    &lt;/span&gt;&lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;c&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$?&lt;/span&gt;
    log &lt;span class="s2"&gt;"WARN SMART: &lt;/span&gt;&lt;span class="nv"&gt;$dev&lt;/span&gt;&lt;span class="s2"&gt; (&lt;/span&gt;&lt;span class="nv"&gt;$dtype&lt;/span&gt;&lt;span class="s2"&gt;) exit=&lt;/span&gt;&lt;span class="nv"&gt;$c&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    log &lt;span class="s2"&gt;"DETAIL SMART: &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; 5 /tmp/smart-&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;dev&lt;/span&gt;&lt;span class="p"&gt;##*/&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;.log | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="s1"&gt;'\n'&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt; | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="s1"&gt;'s/  */ /g'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nv"&gt;RC&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
  &lt;span class="k"&gt;fi&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

check_nvme&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
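  &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;dev&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="c"&gt;# Keep parsed values local so they cannot leak between devices.&lt;/span&gt;
  &lt;span class="nb"&gt;local &lt;/span&gt;out cw temp_k used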

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="nv"&gt;out&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;nvme smart-log &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$dev&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; json 2&amp;gt;/dev/null&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
    &lt;span class="c"&gt;# critical_warning is the first gate: non-zero means attention needed.&lt;/span&gt;
    &lt;span class="nv"&gt;cw&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'%s'&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$out&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.critical_warning // 0'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;temp_k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'%s'&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$out&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.temperature // empty'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;used&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;printf&lt;/span&gt; &lt;span class="s1"&gt;'%s'&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$out&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.percentage_used // empty'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$cw&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s2"&gt;"0"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
      &lt;/span&gt;log &lt;span class="s2"&gt;"WARN NVMe: &lt;/span&gt;&lt;span class="nv"&gt;$dev&lt;/span&gt;&lt;span class="s2"&gt; critical_warning=&lt;/span&gt;&lt;span class="nv"&gt;$cw&lt;/span&gt;&lt;span class="s2"&gt; percentage_used=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;used&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;n&lt;/span&gt;&lt;span class="p"&gt;/a&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; temperature(K)=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;temp_k&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;n&lt;/span&gt;&lt;span class="p"&gt;/a&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
      &lt;span class="nv"&gt;RC&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
    &lt;span class="k"&gt;else
      &lt;/span&gt;log &lt;span class="s2"&gt;"OK NVMe: &lt;/span&gt;&lt;span class="nv"&gt;$dev&lt;/span&gt;&lt;span class="s2"&gt; percentage_used=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;used&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;n&lt;/span&gt;&lt;span class="p"&gt;/a&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; temperature(K)=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;temp_k&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;n&lt;/span&gt;&lt;span class="p"&gt;/a&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;fi
  else
    &lt;/span&gt;log &lt;span class="s2"&gt;"WARN NVMe: failed to read smart-log for &lt;/span&gt;&lt;span class="nv"&gt;$dev&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nv"&gt;RC&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
  &lt;span class="k"&gt;fi&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

main&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;command&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; smartctl &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"smartctl missing"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&amp;amp;2&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;exit &lt;/span&gt;2&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;
  &lt;span class="nb"&gt;command&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; nvme &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"nvme-cli missing"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&amp;amp;2&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;exit &lt;/span&gt;2&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;
  &lt;span class="nb"&gt;command&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; jq &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"jq missing (install jq)"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&amp;amp;2&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;exit &lt;/span&gt;2&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;

  &lt;span class="nb"&gt;mapfile&lt;/span&gt; &lt;span class="nt"&gt;-t&lt;/span&gt; scanned &amp;lt; &amp;lt;&lt;span class="o"&gt;(&lt;/span&gt;smartctl &lt;span class="nt"&gt;--scan-open&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="k"&gt;${#&lt;/span&gt;&lt;span class="nv"&gt;scanned&lt;/span&gt;&lt;span class="p"&gt;[@]&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;-eq&lt;/span&gt; 0 &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;log &lt;span class="s2"&gt;"WARN: no devices from smartctl --scan-open"&lt;/span&gt;
    &lt;span class="nb"&gt;exit &lt;/span&gt;1
  &lt;span class="k"&gt;fi

  for &lt;/span&gt;line &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;scanned&lt;/span&gt;&lt;span class="p"&gt;[@]&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
    &lt;/span&gt;&lt;span class="nv"&gt;dev&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{print $1}'&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
    &lt;span class="nv"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"auto"&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-q&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="s1"&gt;'-d '&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
      &lt;/span&gt;&lt;span class="nv"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s1"&gt;'s/.*-d \([^ ]*\).*/\1/p'&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;fi

    &lt;/span&gt;check_smart &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$dev&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$dtype&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$dtype&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"nvme"&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$dev&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; /dev/nvme&lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
      &lt;/span&gt;check_nvme &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$dev&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;fi
  done

  &lt;/span&gt;&lt;span class="nb"&gt;exit&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$RC&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

main &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$@&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set permissions and install dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo chmod &lt;/span&gt;0755 /usr/local/sbin/check-disk-health.sh
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; jq   &lt;span class="c"&gt;# Debian/Ubuntu&lt;/span&gt;
&lt;span class="c"&gt;# or: sudo dnf install -y jq&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  4) Run it as a systemd oneshot service
&lt;/h2&gt;

&lt;p&gt;Create &lt;code&gt;/etc/systemd/system/disk-health-check.service&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Unit]&lt;/span&gt;
&lt;span class="py"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;Disk health check (SMART + NVMe)&lt;/span&gt;
&lt;span class="py"&gt;Wants&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;network-online.target&lt;/span&gt;
&lt;span class="py"&gt;After&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;network-online.target&lt;/span&gt;

&lt;span class="nn"&gt;[Service]&lt;/span&gt;
&lt;span class="py"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;oneshot&lt;/span&gt;
&lt;span class="py"&gt;ExecStart&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/usr/local/sbin/check-disk-health.sh&lt;/span&gt;
&lt;span class="c"&gt;# Keep privileges narrow if your environment allows it.
# Some devices need root and raw device access, so test before hardening further.
&lt;/span&gt;&lt;span class="py"&gt;User&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;root&lt;/span&gt;
&lt;span class="py"&gt;Group&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;root&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
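
&lt;p&gt;Once the unit works on your hardware, a drop-in can tighten the sandbox without editing the main file. The following is a minimal sketch, not a tested baseline: the file name &lt;code&gt;hardening.conf&lt;/code&gt; is an assumption, and the directives are standard &lt;code&gt;systemd.exec&lt;/code&gt; options picked because they do not restrict access to &lt;code&gt;/dev&lt;/code&gt;. Re-run the service after each change and confirm every device is still read. Save it as &lt;code&gt;/etc/systemd/system/disk-health-check.service.d/hardening.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="c"&gt;# Hypothetical drop-in; verify on your own hardware before relying on it.&lt;/span&gt;
&lt;span class="nn"&gt;[Service]&lt;/span&gt;
&lt;span class="c"&gt;# The process and its children can never gain new privileges.&lt;/span&gt;
&lt;span class="py"&gt;NoNewPrivileges&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;true&lt;/span&gt;
&lt;span class="c"&gt;# Private /tmp, so the smart-*.log scratch files are not shared with other processes.&lt;/span&gt;
&lt;span class="py"&gt;PrivateTmp&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;true&lt;/span&gt;
&lt;span class="c"&gt;# Mount /usr, the boot loader directories, and /etc read-only for this service.&lt;/span&gt;
&lt;span class="py"&gt;ProtectSystem&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;full&lt;/span&gt;
&lt;span class="c"&gt;# Make /home, /root, and /run/user inaccessible.&lt;/span&gt;
&lt;span class="py"&gt;ProtectHome&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;code&gt;PrivateDevices=&lt;/code&gt; is deliberately left out, because it would hide the raw disk devices the script needs to query.&lt;/p&gt;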



&lt;p&gt;Create &lt;code&gt;/etc/systemd/system/disk-health-check.timer&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Unit]&lt;/span&gt;
&lt;span class="py"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;Run disk health checks every 6 hours&lt;/span&gt;

&lt;span class="nn"&gt;[Timer]&lt;/span&gt;
&lt;span class="py"&gt;OnCalendar&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;*-*-* 00/6:00:00&lt;/span&gt;
&lt;span class="py"&gt;Persistent&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;true&lt;/span&gt;
&lt;span class="py"&gt;RandomizedDelaySec&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;10m&lt;/span&gt;
&lt;span class="py"&gt;AccuracySec&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;1m&lt;/span&gt;
&lt;span class="py"&gt;Unit&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;disk-health-check.service&lt;/span&gt;

&lt;span class="nn"&gt;[Install]&lt;/span&gt;
&lt;span class="py"&gt;WantedBy&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;timers.target&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enable and start:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reload
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable&lt;/span&gt; &lt;span class="nt"&gt;--now&lt;/span&gt; disk-health-check.timer
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl list-timers disk-health-check.timer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



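&lt;p&gt;&lt;code&gt;Persistent=true&lt;/code&gt; stores the time of the last run on disk, so a run that was missed while the machine was off is triggered as soon as the timer is active again. &lt;code&gt;RandomizedDelaySec=10m&lt;/code&gt; adds a random delay of up to ten minutes to each run, which helps avoid synchronized spikes across many hosts.&lt;/p&gt;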




&lt;h2&gt;
  
  
  5) Verify like you mean it
&lt;/h2&gt;

&lt;p&gt;Run once manually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start disk-health-check.service
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl status &lt;span class="nt"&gt;--no-pager&lt;/span&gt; disk-health-check.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inspect logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;journalctl &lt;span class="nt"&gt;-u&lt;/span&gt; disk-health-check.service &lt;span class="nt"&gt;-n&lt;/span&gt; 100 &lt;span class="nt"&gt;--no-pager&lt;/span&gt;
journalctl &lt;span class="nt"&gt;-t&lt;/span&gt; disk-health-check &lt;span class="nt"&gt;-n&lt;/span&gt; 100 &lt;span class="nt"&gt;--no-pager&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If a check fails, the script exits non-zero and the unit enters the &lt;code&gt;failed&lt;/code&gt; state, so you can wire alerts from systemd or the journal (email, a webhook bridge, your existing incident pipeline).&lt;/p&gt;
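
&lt;p&gt;One way to wire that up is systemd's &lt;code&gt;OnFailure=&lt;/code&gt; setting, which activates another unit whenever this one enters the failed state. A minimal sketch, assuming a template unit named &lt;code&gt;disk-health-alert@.service&lt;/code&gt; that you write yourself: the name is hypothetical, and so is whatever notification command its &lt;code&gt;ExecStart=&lt;/code&gt; runs. Save it as &lt;code&gt;/etc/systemd/system/disk-health-check.service.d/on-failure.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="c"&gt;# Hypothetical drop-in: disk-health-alert@.service is an assumed unit you must provide.&lt;/span&gt;
&lt;span class="nn"&gt;[Unit]&lt;/span&gt;
&lt;span class="c"&gt;# %n expands to the full name of the failing unit and becomes the alert unit's instance name.&lt;/span&gt;
&lt;span class="py"&gt;OnFailure&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;disk-health-alert@%n.service&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Run &lt;code&gt;sudo systemctl daemon-reload&lt;/code&gt; afterwards. Inside the alert template, &lt;code&gt;%i&lt;/code&gt; then carries the failing unit's name, so one handler can serve several checks.&lt;/p&gt;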




&lt;h2&gt;
  
  
  6) Optional: use smartd alongside this
&lt;/h2&gt;

&lt;p&gt;If you want built-in daemonized monitoring plus mail hooks, &lt;code&gt;smartd&lt;/code&gt; is still useful. This script-first approach is great when you want:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;explicit output in journald&lt;/li&gt;
&lt;li&gt;one consistent service/timer contract&lt;/li&gt;
&lt;li&gt;easy extension (custom thresholds, custom routing)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Common pitfalls
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;USB enclosures often hide SMART data&lt;/strong&gt; unless the bridge chip supports SAT (SCSI-to-ATA Translation) passthrough.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RAID/HBA paths may need explicit &lt;code&gt;-d&lt;/code&gt; types&lt;/strong&gt; from &lt;code&gt;smartctl --scan-open&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don’t panic on a single metric&lt;/strong&gt;: combine overall health, error logs, self-test results, and NVMe &lt;code&gt;critical_warning&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test the restore path, not just the detection path&lt;/strong&gt;: health alarms are only useful if replacement and rebuild runbooks are ready.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why this pattern works
&lt;/h2&gt;

&lt;p&gt;It is small, portable, and auditable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Linux-native tooling&lt;/li&gt;
&lt;li&gt;no SaaS dependency&lt;/li&gt;
&lt;li&gt;explicit failure semantics (exit codes + unit state)&lt;/li&gt;
&lt;li&gt;easy to version-control as infra code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Storage failures rarely announce themselves politely. This setup gets you earlier, clearer signals with minimal moving parts.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;smartctl(8) manual (Arch mirror): &lt;a href="https://man.archlinux.org/man/smartctl.8.en" rel="noopener noreferrer"&gt;https://man.archlinux.org/man/smartctl.8.en&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;systemd.timer(5): &lt;a href="https://manpages.debian.org/testing/systemd/systemd.timer.5.en.html" rel="noopener noreferrer"&gt;https://manpages.debian.org/testing/systemd/systemd.timer.5.en.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;nvme-smart-log(1): &lt;a href="https://manpages.debian.org/testing/nvme-cli/nvme-smart-log.1.en.html" rel="noopener noreferrer"&gt;https://manpages.debian.org/testing/nvme-cli/nvme-smart-log.1.en.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Debian smartmontools package details (&lt;code&gt;smartctl&lt;/code&gt;, &lt;code&gt;smartd&lt;/code&gt;): &lt;a href="https://packages.debian.org/sid/smartmontools" rel="noopener noreferrer"&gt;https://packages.debian.org/sid/smartmontools&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;ArchWiki S.M.A.R.T. operational notes: &lt;a href="https://wiki.archlinux.org/title/S.M.A.R.T." rel="noopener noreferrer"&gt;https://wiki.archlinux.org/title/S.M.A.R.T.&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>linux</category>
      <category>devops</category>
      <category>automation</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
