<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Anguishe</title>
    <description>The latest articles on DEV Community by Anguishe (@bashsnippets).</description>
    <link>https://dev.to/bashsnippets</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3909567%2F885dee1e-f72c-48d7-965f-91ee8ade012a.jpeg</url>
      <title>DEV Community: Anguishe</title>
      <link>https://dev.to/bashsnippets</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/bashsnippets"/>
    <language>en</language>
    <item>
      <title>I Spent 40 Minutes at 11pm Debugging a Deploy That Wasn't Broken</title>
      <dc:creator>Anguishe</dc:creator>
      <pubDate>Thu, 02 Jul 2026 03:21:37 +0000</pubDate>
      <link>https://dev.to/bashsnippets/i-spent-40-minutes-at-11pm-debugging-a-deploy-that-wasnt-broken-1p2</link>
      <guid>https://dev.to/bashsnippets/i-spent-40-minutes-at-11pm-debugging-a-deploy-that-wasnt-broken-1p2</guid>
      <description>&lt;p&gt;I once spent forty minutes at eleven at night debugging a deploy that wasn't broken. The release script ran the database migration, the migration threw &lt;code&gt;connection refused&lt;/code&gt;, the script exited non-zero, the deploy rolled itself back, and I got paged.&lt;/p&gt;

&lt;p&gt;So I did the things you do. I read the migration. I read the logs. I checked the database — it was up, it was healthy, it accepted my connection instantly. I re-ran the deploy and it worked. I chalked it up to gremlins and went to bed, which is the part I'm not proud of, because it happened again two days later. That time I watched the timing: the script brought up a fresh database container and started the migration about six seconds before Postgres finished initializing and began accepting connections. The migration was racing the database's boot. Most of the time it won. The times it lost, I lost forty minutes.&lt;/p&gt;

&lt;p&gt;The script wasn't wrong about anything except one assumption: that a dependency is ready the instant you ask for it.&lt;/p&gt;

&lt;h2&gt;
  
  
  In production, dependencies are &lt;em&gt;eventually&lt;/em&gt; ready
&lt;/h2&gt;

&lt;p&gt;That's the mental model shift. Networks blip. A service you call returns a 503 for the two seconds it takes to finish a rolling restart. An API rate-limits you with a 429 it fully expects you to retry. A fresh container's database isn't accepting connections for its first few seconds. Treating the first failure as fatal turns every one of these normal, transient conditions into a paged engineer. The script that handles them isn't smarter; it just declines to give up on the first try.&lt;/p&gt;

&lt;p&gt;But retrying naively is its own trap. Retry instantly and you hammer a recovering service into staying down. Retry forever and a genuinely dead dependency hangs your script indefinitely. Retry a 404 and you wait a minute to confirm what you already knew. Good retries are bounded, backed off, and selective.&lt;/p&gt;

&lt;h2&gt;
  
  
  A retry function you can reuse anywhere
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# Purpose: survive transient failures instead of dying on the first error&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-euo&lt;/span&gt; pipefail

&lt;span class="nv"&gt;CHECK&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"✓"&lt;/span&gt;
&lt;span class="nv"&gt;CROSS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"✗"&lt;/span&gt;

&lt;span class="c"&gt;# retry &amp;lt;max_attempts&amp;gt; &amp;lt;command&amp;gt; [args...]&lt;/span&gt;
retry&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;max_attempts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;shift
    local &lt;/span&gt;&lt;span class="nv"&gt;attempt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;delay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1            &lt;span class="c"&gt;# base delay — doubles each round&lt;/span&gt;
    &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;max_delay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;30       &lt;span class="c"&gt;# cap so the backoff never runs away&lt;/span&gt;

    &lt;span class="k"&gt;until&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$@&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
        if&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt; attempt &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; max_attempts &lt;span class="o"&gt;))&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
            &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CROSS&lt;/span&gt;&lt;span class="s2"&gt; '&lt;/span&gt;&lt;span class="nv"&gt;$*&lt;/span&gt;&lt;span class="s2"&gt;' failed after &lt;/span&gt;&lt;span class="nv"&gt;$attempt&lt;/span&gt;&lt;span class="s2"&gt; attempts"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&amp;amp;2
            &lt;span class="k"&gt;return &lt;/span&gt;1
        &lt;span class="k"&gt;fi&lt;/span&gt;
        &lt;span class="c"&gt;# 0–2s of jitter so parallel callers don't all retry on the same beat&lt;/span&gt;
        &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;pause&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt; delay &lt;span class="o"&gt;+&lt;/span&gt; RANDOM &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="k"&gt;))&lt;/span&gt;
        &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CROSS&lt;/span&gt;&lt;span class="s2"&gt; attempt &lt;/span&gt;&lt;span class="nv"&gt;$attempt&lt;/span&gt;&lt;span class="s2"&gt; failed — retrying in &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;pause&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;s"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&amp;amp;2
        &lt;span class="nb"&gt;sleep&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$pause&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
        &lt;span class="nv"&gt;attempt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt; attempt &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="k"&gt;))&lt;/span&gt;
        &lt;span class="nv"&gt;delay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt; delay &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="k"&gt;))&lt;/span&gt;
        &lt;span class="o"&gt;((&lt;/span&gt; delay &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; max_delay &lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nv"&gt;delay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$max_delay&lt;/span&gt;
    &lt;span class="k"&gt;done

    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CHECK&lt;/span&gt;&lt;span class="s2"&gt; '&lt;/span&gt;&lt;span class="nv"&gt;$*&lt;/span&gt;&lt;span class="s2"&gt;' succeeded on attempt &lt;/span&gt;&lt;span class="nv"&gt;$attempt&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;# The actual fix for my 11pm deploy: wait for Postgres to accept connections.&lt;/span&gt;
retry 6 nc &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="nt"&gt;-w&lt;/span&gt; 2 db.internal 5432
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CHECK&lt;/span&gt;&lt;span class="s2"&gt; database reachable — running migration"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The whole engine is &lt;code&gt;until "$@"; do ... done&lt;/code&gt;. &lt;code&gt;until&lt;/code&gt; runs the command and executes the loop body only when it &lt;em&gt;fails&lt;/em&gt;, exiting the instant it succeeds. Passing the command as &lt;code&gt;"$@"&lt;/code&gt; (after &lt;code&gt;shift&lt;/code&gt;-ing past the attempt count) means the function retries &lt;em&gt;anything&lt;/em&gt; — a &lt;code&gt;curl&lt;/code&gt;, an &lt;code&gt;ssh&lt;/code&gt;, a port check, your own script — without caring what it is.&lt;/p&gt;

&lt;p&gt;The backoff is the last few lines of the loop body: sleep for the current delay plus a little jitter, then double the delay, capped at &lt;code&gt;max_delay&lt;/code&gt;. That gives base delays of 1s, 2s, 4s, 8s, 16s, 30s, 30s…, each with 0 to 2 seconds of jitter added on top.&lt;/p&gt;
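
&lt;p&gt;To see the schedule without waiting for it, here is a minimal sketch that repeats the doubling-and-cap arithmetic from the loop and prints the base delays. &lt;code&gt;backoff_schedule&lt;/code&gt; is a hypothetical helper, not part of the retry function, and it leaves the jitter out.&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical helper: print the base delay before each retry, without sleeping.
# Same doubling-and-cap arithmetic as retry(); jitter is deliberately left out.
backoff_schedule() {
    local rounds="$1"
    local delay="${2:-1}"        # base delay in seconds
    local max_delay="${3:-30}"   # cap
    local out=""
    local i
    for (( i = 1; rounds >= i; i++ )); do
        out+="${delay}s "
        delay=$(( delay * 2 ))
        if (( delay > max_delay )); then
            delay=$max_delay
        fi
    done
    echo "${out% }"
}

backoff_schedule 7    # prints: 1s 2s 4s 8s 16s 30s 30s
```

&lt;p&gt;A call like &lt;code&gt;retry 6&lt;/code&gt; sleeps at most five times, so its base wait tops out at 1 + 2 + 4 + 8 + 16 = 31 seconds before jitter.&lt;/p&gt;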

&lt;h2&gt;
  
  
  The jitter is not decoration
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;RANDOM % 3&lt;/code&gt; looks trivial, and it's the line people delete to "clean up." Don't. Without jitter, a fleet of machines that all failed at the same instant (because the same service went down) will all retry at the same instant, and the same instant after that, producing a synchronized thundering herd that knocks the recovering service straight back over on every round. Zero to two seconds of randomness per client spreads the retries out so the service actually gets room to recover. At one machine it does nothing; at fifty it's the difference between recovery and a retry storm.&lt;/p&gt;
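
&lt;p&gt;A rough way to picture the herd is to simulate it. The sketch below is a toy model, not a measurement: &lt;code&gt;herd_histogram&lt;/code&gt; is a hypothetical helper that pretends N clients all failed at the same moment and counts how many wake up in each second, with and without the &lt;code&gt;RANDOM % 3&lt;/code&gt; term.&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical simulation: how many of N clients retry at each second?
# Arguments: client count, base delay in seconds, jitter range (0 disables jitter).
herd_histogram() {
    local clients="$1" delay="$2" jitter="$3"
    local -a bucket=()
    local i t
    for (( i = 0; clients > i; i++ )); do
        t=$delay
        if (( jitter > 0 )); then
            t=$(( delay + RANDOM % jitter ))   # same formula as the retry loop
        fi
        bucket[t]=$(( ${bucket[t]:-0} + 1 ))
    done
    for t in "${!bucket[@]}"; do
        echo "t=${t}s clients=${bucket[t]}"
    done
}

echo "without jitter:"
herd_histogram 50 1 0     # every client lands on the same second
echo "with RANDOM % 3:"
herd_histogram 50 1 3     # clients spread over seconds 1, 2 and 3
```

&lt;p&gt;Without jitter all fifty clients hit the service in the same second; with it, the same fifty are split across three, and the split changes on every run.&lt;/p&gt;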

&lt;h2&gt;
  
  
  The mistake that makes retries dangerous
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Good: a transient failure that retrying can fix&lt;/span&gt;
retry 6 nc &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="nt"&gt;-w&lt;/span&gt; 2 db.internal 5432

&lt;span class="c"&gt;# Bad: retrying a deterministic failure just delays the error by half a minute or more&lt;/span&gt;
retry 6 curl &lt;span class="nt"&gt;-fsS&lt;/span&gt; https://api.example.com/v1/thing-that-returns-404
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A retry loop is only as smart as what you point it at. A port check belongs in a loop because the answer &lt;em&gt;changes&lt;/em&gt; — "no" until the database boots, then "yes." A request that returns 404 returns 404 on attempt one and attempt six; the loop just postpones the failure and buries the real status under retry noise. Retry transient failures — timeouts, connection-refused, 429, 5xx, DNS hiccups. Don't retry deterministic ones — a 404, a 401, a syntax error, a missing file. When you can, branch on the exit code or HTTP status and loop only on the codes worth looping on.&lt;/p&gt;
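
&lt;p&gt;One way to loop only on the codes worth looping on is to classify the status before deciding to sleep. This is a sketch under assumptions: &lt;code&gt;is_retryable_status&lt;/code&gt;, &lt;code&gt;try_http&lt;/code&gt; and the &lt;code&gt;RETRY_PAUSE&lt;/code&gt; variable are hypothetical names, the pause is fixed rather than backed off to keep the example short, and the probe is any command that prints an HTTP status.&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit 0) only for statuses a retry might fix.
is_retryable_status() {
    case "$1" in
        000|408|429|5??) return 0 ;;   # no response, timeout, rate limit, server error
        *)               return 1 ;;   # 404, 401 and friends will not change on retry
    esac
}

# try_http MAX_ATTEMPTS PROBE [ARGS...]: the probe must print an HTTP status code.
try_http() {
    local max_attempts="$1"; shift
    local attempt=1 status
    while true; do
        status="$("$@")"
        if [[ "$status" == 2?? ]]; then
            echo "ok after $attempt attempt(s)"
            return 0
        fi
        if ! is_retryable_status "$status"; then
            echo "giving up: status $status is not retryable"
            return 1
        fi
        if (( attempt >= max_attempts )); then
            echo "giving up: still $status after $attempt attempts"
            return 1
        fi
        sleep "${RETRY_PAUSE:-1}"      # fixed pause for brevity; real code would back off
        attempt=$(( attempt + 1 ))
    done
}
```

&lt;p&gt;With curl as the probe, a call might look like &lt;code&gt;try_http 6 curl -s -o /dev/null -w '%{http_code}' https://api.example.com/v1/thing&lt;/code&gt;. A 404 fails on the first attempt instead of the sixth, and curl reports &lt;code&gt;000&lt;/code&gt; when it got no response at all, which the sketch treats as transient.&lt;/p&gt;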

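&lt;p&gt;For plain &lt;code&gt;curl&lt;/code&gt;, its built-in &lt;code&gt;--retry 5 --retry-delay 2&lt;/code&gt; does most of this and is simpler. By default it treats only timeouts and HTTP 408, 429, 500, 502, 503 and 504 as transient, so add &lt;code&gt;--retry-connrefused&lt;/code&gt; if a refused connection should be retried too. Reach for the function when the thing you're retrying isn't curl, or when you want one backoff policy across a database probe, an ssh call, and a download at once.&lt;/p&gt;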

&lt;h2&gt;
  
  
  Back to 11pm
&lt;/h2&gt;

&lt;p&gt;That deploy never paged me again once the migration &lt;em&gt;waited&lt;/em&gt; for the port instead of assuming it. The database still took its six seconds to boot, the network still blipped occasionally — retrying didn't make the dependencies faster. It stopped a normal, transient slowness from being treated as a fatal error, which is most of what "production-ready" means for a script.&lt;/p&gt;

&lt;p&gt;Full function with the wait-for-port pattern and the guidance on which failures to retry: &lt;a href="https://bashsnippets.xyz/snippets/bash-retry-with-backoff" rel="noopener noreferrer"&gt;https://bashsnippets.xyz/snippets/bash-retry-with-backoff&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Retries are the third leg of an unattended job: &lt;a href="https://bashsnippets.xyz/snippets/bash-flock-single-instance" rel="noopener noreferrer"&gt;flock&lt;/a&gt; stops overlap, &lt;a href="https://bashsnippets.xyz/snippets/bash-timeout-command" rel="noopener noreferrer"&gt;timeout&lt;/a&gt; stops hangs, retry rides out the blip. The &lt;a href="https://bashsnippets.xyz/tools/cron-wrapper-generator" rel="noopener noreferrer"&gt;Hardened Cron Wrapper Generator&lt;/a&gt; wires all three into one wrapper, &lt;a href="https://bashsnippets.xyz/guides/bash-scripts-that-survive-cron" rel="noopener noreferrer"&gt;Bash Scripts That Survive Cron&lt;/a&gt; is the end-to-end version, and the rest of the library is at &lt;a href="https://bashsnippets.xyz" rel="noopener noreferrer"&gt;https://bashsnippets.xyz&lt;/a&gt;&lt;/p&gt;

</description>
      <category>bash</category>
      <category>devops</category>
      <category>linux</category>
      <category>cicd</category>
    </item>
    <item>
      <title>My Backup Hadn't Run in 9 Days and Nothing Told Me</title>
      <dc:creator>Anguishe</dc:creator>
      <pubDate>Fri, 26 Jun 2026 03:39:56 +0000</pubDate>
      <link>https://dev.to/bashsnippets/my-backup-hadnt-run-in-9-days-and-nothing-told-me-20hg</link>
      <guid>https://dev.to/bashsnippets/my-backup-hadnt-run-in-9-days-and-nothing-told-me-20hg</guid>
      <description>&lt;p&gt;The backup ran fine every night for fourteen months, and then it didn't run for nine days, and nothing told me. No error in the log. No failed-job alert. No bounced cron mail. The nightly &lt;code&gt;mysqldump&lt;/code&gt; had hung — the database had a long-held lock from a runaway analytics query, the dump opened its transaction and sat there waiting for it, forever.&lt;/p&gt;

&lt;p&gt;Cron launched it at 2am, it never exited, and here's the cruel part: because I'd been &lt;em&gt;smart&lt;/em&gt; enough to wrap it in a lock so two dumps couldn't run at once, every subsequent night's run saw the lock still held by the hung dump from the first night and skipped quietly. The clever lock turned a one-night hang into a nine-day outage. I found it when I went to restore a table and discovered my newest "backup" was a &lt;code&gt;mysqldump&lt;/code&gt; process that had been running since the previous Tuesday.&lt;/p&gt;

&lt;p&gt;That's nine days I'd have lost if anything had gone wrong with the live database. The ten minutes of feeling foolish when I traced it back was nothing next to that.&lt;/p&gt;

&lt;h2&gt;
  
  
  A hung command is worse than a failed one
&lt;/h2&gt;

&lt;p&gt;This is the lesson worth internalizing, because it's counterintuitive. A &lt;em&gt;failed&lt;/em&gt; command exits, frees its lock, and the next run tries again — the system self-heals. A &lt;em&gt;hung&lt;/em&gt; command exits never. It holds resources, blocks its own future runs, and produces exactly zero signal because it never gets far enough to log anything. Failures are loud. Hangs are silent, and silence is what kills you in unattended automation.&lt;/p&gt;

&lt;p&gt;You can't rely on a command to bound its own runtime, either. The whole problem is that it's wedged somewhere it can't time itself out of — blocked in the kernel waiting on a lock, or on a dead socket that will never send the FIN it's waiting for. So you bound it from the outside.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bounding the runtime with timeout
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# Script: bounded-dump.sh&lt;/span&gt;
&lt;span class="c"&gt;# Purpose: Stop a hung command from running forever and jamming the cron slot&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-euo&lt;/span&gt; pipefail

&lt;span class="nv"&gt;CHECK&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"✓"&lt;/span&gt;
&lt;span class="nv"&gt;CROSS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"✗"&lt;/span&gt;

&lt;span class="nv"&gt;MAX_RUNTIME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"5m"&lt;/span&gt;   &lt;span class="c"&gt;# longer than the normal worst case, well under the interval&lt;/span&gt;
&lt;span class="nv"&gt;KILL_GRACE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"20s"&lt;/span&gt;   &lt;span class="c"&gt;# after SIGTERM, wait this long, then SIGKILL&lt;/span&gt;
&lt;span class="nv"&gt;DEST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/backup/mydb.sql"&lt;/span&gt;

&lt;span class="c"&gt;# In an `if` so set -e doesn't abort before we read the exit code.&lt;/span&gt;
&lt;span class="c"&gt;# Write to a .partial file so a timed-out run never leaves a corrupt "backup".&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="nb"&gt;timeout&lt;/span&gt; &lt;span class="nt"&gt;-k&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$KILL_GRACE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$MAX_RUNTIME&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        mysqldump &lt;span class="nt"&gt;--single-transaction&lt;/span&gt; mydb &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DEST&lt;/span&gt;&lt;span class="s2"&gt;.partial"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;mv&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DEST&lt;/span&gt;&lt;span class="s2"&gt;.partial"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DEST&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CHECK&lt;/span&gt;&lt;span class="s2"&gt; dump completed within &lt;/span&gt;&lt;span class="nv"&gt;$MAX_RUNTIME&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;else
    &lt;/span&gt;&lt;span class="nv"&gt;code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$?&lt;/span&gt;
    &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DEST&lt;/span&gt;&lt;span class="s2"&gt;.partial"&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$code&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="k"&gt;in
        &lt;/span&gt;124&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CROSS&lt;/span&gt;&lt;span class="s2"&gt; dump exceeded &lt;/span&gt;&lt;span class="nv"&gt;$MAX_RUNTIME&lt;/span&gt;&lt;span class="s2"&gt; — terminated (SIGTERM)"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&amp;amp;2 &lt;span class="p"&gt;;;&lt;/span&gt;
        137&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CROSS&lt;/span&gt;&lt;span class="s2"&gt; dump ignored SIGTERM — force-killed (SIGKILL)"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&amp;amp;2 &lt;span class="p"&gt;;;&lt;/span&gt;
        &lt;span class="k"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CROSS&lt;/span&gt;&lt;span class="s2"&gt; dump failed with exit code &lt;/span&gt;&lt;span class="nv"&gt;$code&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&amp;amp;2 &lt;span class="p"&gt;;;&lt;/span&gt;
    &lt;span class="k"&gt;esac&lt;/span&gt;
    &lt;span class="nb"&gt;exit&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$code&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;timeout -k "$KILL_GRACE" "$MAX_RUNTIME" mysqldump ...&lt;/code&gt; is the entire mechanism. At five minutes, &lt;code&gt;timeout&lt;/code&gt; sends the dump a SIGTERM. A well-behaved program treats SIGTERM as "wrap up and exit." But a process can catch or ignore SIGTERM, and I wasn't willing to bet that a dump wedged on a lock would act on a polite request. That's what &lt;code&gt;-k 20s&lt;/code&gt; is for: twenty seconds after the polite signal, &lt;code&gt;timeout&lt;/code&gt; sends SIGKILL, which a process cannot catch, block, or ignore. The one caveat is a process stuck in uninterruptible sleep (state &lt;code&gt;D&lt;/code&gt;, typically on a hung NFS mount or a failing disk): the kernel can only finish killing it once that I/O returns.&lt;/p&gt;
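
&lt;p&gt;Both outcomes are easy to reproduce on a laptop. The sketch below assumes GNU coreutils &lt;code&gt;timeout&lt;/code&gt;; &lt;code&gt;bounded_status&lt;/code&gt; is a hypothetical helper that runs a command under a limit and prints the exit status it got back, and the one-second values are only there to keep the demo quick.&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical demo helper: bounded_status GRACE LIMIT COMMAND [ARGS...]
# Runs COMMAND under timeout and prints the resulting exit status.
bounded_status() {
    local grace="$1" limit="$2"; shift 2
    local code=0
    timeout -k "$grace" "$limit" "$@" || code=$?
    echo "$code"
}

# sleep exits on SIGTERM, so the deadline alone ends it.
bounded_status 1 1 sleep 5                            # prints 124

# This child ignores SIGTERM, so only the follow-up SIGKILL ends it.
bounded_status 1 1 bash -c 'trap "" TERM; sleep 5'    # prints 137
```

&lt;p&gt;The second case takes about two seconds: one for the limit and one more for the kill grace period.&lt;/p&gt;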

&lt;h2&gt;
  
  
  Read the exit code — it's the difference between a log and a mystery
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;124  &lt;span class="c"&gt;# still running at the deadline — SIGTERM was sent&lt;/span&gt;
137  &lt;span class="c"&gt;# 128 + 9 — ignored SIGTERM, had to be force-killed&lt;/span&gt;
&lt;span class="k"&gt;*&lt;/span&gt;    &lt;span class="c"&gt;# anything else is the command's own failure&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Collapsing all three into "backup failed" throws away the one piece of information that tells you whether you have a slow database, a wedged one, or a broken dump command. A &lt;code&gt;124&lt;/code&gt; says "this is taking too long — investigate the query." A &lt;code&gt;137&lt;/code&gt; says "this is wedged in I/O — investigate the lock or the mount." They point at different problems. (If you ever blank on which code is which, &lt;a href="https://bashsnippets.xyz/tools/bash-exit-code-lookup" rel="noopener noreferrer"&gt;bashsnippets.xyz/tools/bash-exit-code-lookup&lt;/a&gt; decodes 124 and 137 directly.)&lt;/p&gt;

&lt;h2&gt;
  
  
  The &lt;code&gt;.partial&lt;/code&gt; trick matters more than it looks
&lt;/h2&gt;

&lt;p&gt;If you redirect straight to the real backup file and the command times out mid-write, you've just replaced last night's good backup with a half-written, unrestorable file — and you won't know until the night you need it. Writing to a &lt;code&gt;.partial&lt;/code&gt; path and &lt;code&gt;mv&lt;/code&gt;-ing into place only on a clean exit means a failed or timed-out run leaves the previous good backup untouched. A &lt;code&gt;mv&lt;/code&gt; on the same filesystem is atomic; the redirect is not.&lt;/p&gt;
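
&lt;p&gt;The same pattern can be pulled out into a small wrapper. This is a sketch, with &lt;code&gt;write_atomically&lt;/code&gt; as a hypothetical name: it sends a command's output to &lt;code&gt;DEST.partial&lt;/code&gt; and only renames it over &lt;code&gt;DEST&lt;/code&gt; when the command exits cleanly.&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical helper: write_atomically DEST COMMAND [ARGS...]
# COMMAND's stdout goes to DEST.partial; DEST is replaced only on success.
write_atomically() {
    local dest="$1"; shift
    local code=0
    if "$@" > "$dest.partial"; then
        mv "$dest.partial" "$dest"     # same directory, so a same-filesystem rename
    else
        code=$?
        rm -f "$dest.partial"          # discard the half-written output
    fi
    return "$code"
}

# Demo in a throwaway directory: a good run, then a failing one.
demo_dir="$(mktemp -d)"
write_atomically "$demo_dir/backup.sql" echo "good dump"
write_atomically "$demo_dir/backup.sql" false || echo "second run failed with exit $?"
cat "$demo_dir/backup.sql"             # prints: good dump
```

&lt;p&gt;The failed second run leaves the first run's file exactly as it was, and no &lt;code&gt;.partial&lt;/code&gt; file is left lying around.&lt;/p&gt;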

&lt;p&gt;For commands that talk to the network, layer the tool's own timeout underneath — &lt;code&gt;curl --max-time&lt;/code&gt;, &lt;code&gt;ssh -o ConnectTimeout&lt;/code&gt;, a &lt;code&gt;net_read_timeout&lt;/code&gt; on the dump. Those fire first and fail cleanly. &lt;code&gt;timeout&lt;/code&gt; is the outer hard stop for the night the inner one doesn't.&lt;/p&gt;

&lt;h2&gt;
  
  
  Back to the nine days
&lt;/h2&gt;

&lt;p&gt;A timeout is what makes a lock &lt;em&gt;safe&lt;/em&gt;. Locking a job to a single instance stops overlap, but a hang inside the locked job holds that lock forever — which is precisely how my nine-day gap happened. Bound the runtime and the lock always gets released, on a deadline, every time, and the failure becomes a loud &lt;code&gt;124&lt;/code&gt; in the log instead of a silent gap you discover during a restore.&lt;/p&gt;
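
&lt;p&gt;Put together, the lock and the deadline fit on one line. The sketch below assumes util-linux &lt;code&gt;flock&lt;/code&gt; and GNU &lt;code&gt;timeout&lt;/code&gt;; &lt;code&gt;run_guarded&lt;/code&gt; is a hypothetical wrapper, and the conflict exit code 75 is an arbitrary choice made here so that "skipped" is distinguishable from the command's own failures.&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical wrapper: run_guarded LOCKFILE LIMIT COMMAND [ARGS...]
run_guarded() {
    local lock_file="$1" limit="$2"; shift 2
    # flock -n skips when another run holds the lock; -E 75 picks the exit
    # code for that case. timeout bounds the run, so the lock cannot be held
    # much longer than LIMIT plus the 20s kill grace.
    flock -n -E 75 "$lock_file" timeout -k 20s "$limit" "$@"
}

# Demo with a throwaway lock file (a real job would use /run/lock).
lock="$(mktemp)"
run_guarded "$lock" 1s sleep 5 || echo "first run exit code: $?"   # 124: timed out
run_guarded "$lock" 5s echo "second run got the lock"
```

&lt;p&gt;The first run hits its deadline and exits 124, and because the process is gone the second run acquires the lock straight away instead of skipping.&lt;/p&gt;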

&lt;p&gt;Full script with the exit-code branching and the FAQ on timing out a whole pipeline: &lt;a href="https://bashsnippets.xyz/snippets/bash-timeout-command" rel="noopener noreferrer"&gt;https://bashsnippets.xyz/snippets/bash-timeout-command&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The lock that made my hang invisible is &lt;a href="https://bashsnippets.xyz/snippets/bash-flock-single-instance" rel="noopener noreferrer"&gt;flock&lt;/a&gt;, and the third guard is &lt;a href="https://bashsnippets.xyz/snippets/bash-retry-with-backoff" rel="noopener noreferrer"&gt;retry with backoff&lt;/a&gt;. The &lt;a href="https://bashsnippets.xyz/tools/cron-wrapper-generator" rel="noopener noreferrer"&gt;Hardened Cron Wrapper Generator&lt;/a&gt; composes all three, &lt;a href="https://bashsnippets.xyz/guides/bash-scripts-that-survive-cron" rel="noopener noreferrer"&gt;Bash Scripts That Survive Cron&lt;/a&gt; walks the whole decision, and the rest of the library is at &lt;a href="https://bashsnippets.xyz" rel="noopener noreferrer"&gt;https://bashsnippets.xyz&lt;/a&gt;&lt;/p&gt;

</description>
      <category>bash</category>
      <category>linux</category>
      <category>devops</category>
      <category>sysadmin</category>
    </item>
    <item>
      <title>A Cron Job Took Our Server to Load 41 by Attacking Itself</title>
      <dc:creator>Anguishe</dc:creator>
      <pubDate>Tue, 23 Jun 2026 00:18:35 +0000</pubDate>
      <link>https://dev.to/bashsnippets/a-cron-job-took-our-server-to-load-41-by-attacking-itself-3p6g</link>
      <guid>https://dev.to/bashsnippets/a-cron-job-took-our-server-to-load-41-by-attacking-itself-3p6g</guid>
      <description>&lt;p&gt;A &lt;code&gt;*/1&lt;/code&gt; rsync took our staging box to a load average of 41 one afternoon, and it took me longer than I want to admit to work out why. The sync normally finished in about twenty seconds. That day the backup target's NFS mount went sluggish, the sync started taking ninety seconds, and cron — which does not know or care whether the last run is still going — launched a fresh copy every single minute on top of it.&lt;/p&gt;

&lt;p&gt;Inside ten minutes there were a half-dozen rsyncs all reading the same tree off the same slow disk, each one making the disk slower, each new minute adding another. The box wasn't under attack. It was attacking itself, one polite copy at a time. The thing that stung was that nothing was &lt;em&gt;broken&lt;/em&gt; — every individual rsync was correct, the disk eventually recovered on its own, and the only reason it became an outage is that cron has no concept of "the last one is still running."&lt;/p&gt;

&lt;p&gt;That's the trap with scheduled jobs: a command that's perfectly fine when you run it by hand can take down a server the first time it runs longer than its interval with nobody watching.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix everyone reaches for first is the wrong one
&lt;/h2&gt;

&lt;p&gt;The instinct is a PID file: write &lt;code&gt;$$&lt;/code&gt; to &lt;code&gt;/var/run/job.pid&lt;/code&gt; on start, check whether that file exists on the next run, bail if it does. It almost works. Then one run gets &lt;code&gt;kill -9&lt;/code&gt;'d, or the box reboots mid-job, and the PID file is left behind pointing at a process that died on Tuesday. Now every future run sees a "lock" owned by a PID that no longer exists, and the job never runs again — the opposite failure, just as silent.&lt;/p&gt;

&lt;p&gt;There's also a race between the check and the write, and the times you most need the lock to be clean are exactly the times cleanup didn't happen, because the process died before it could clean up.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;flock&lt;/code&gt; has none of that. The lock isn't a file you create and delete — it's a lock the kernel holds on an &lt;em&gt;open file descriptor&lt;/em&gt;, and the kernel releases it automatically the instant that descriptor closes. The process exiting closes it. So does crashing. So does &lt;code&gt;kill -9&lt;/code&gt;. There is no state to leave behind, which is the entire reason it survives the failure modes a PID file can't.&lt;/p&gt;

&lt;h2&gt;
  
  
  The single-instance pattern
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# Script: backup-with-lock.sh&lt;/span&gt;
&lt;span class="c"&gt;# Purpose: Stop a cron job from overlapping itself when one run runs long&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-euo&lt;/span&gt; pipefail

&lt;span class="nv"&gt;CHECK&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"✓"&lt;/span&gt;
&lt;span class="nv"&gt;CROSS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"✗"&lt;/span&gt;

&lt;span class="c"&gt;# /run/lock is tmpfs, cleared cleanly on reboot. Never /tmp — temp-cleaners&lt;/span&gt;
&lt;span class="c"&gt;# delete files there, and a deleted lock mid-run lets a second copy run.&lt;/span&gt;
&lt;span class="nv"&gt;LOCK_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/run/lock/&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;basename&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$0&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;.lock"&lt;/span&gt;

&lt;span class="c"&gt;# The &amp;gt; opens (and creates) the lock file on fd 200 and holds it open for the&lt;/span&gt;
&lt;span class="c"&gt;# whole script. The lock lives on this descriptor, not on the file existing.&lt;/span&gt;
&lt;span class="nb"&gt;exec &lt;/span&gt;200&amp;gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOCK_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# -n = non-blocking: if a previous run still holds the lock, give up now&lt;/span&gt;
&lt;span class="c"&gt;# instead of queueing another copy behind it.&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; flock &lt;span class="nt"&gt;-n&lt;/span&gt; 200&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CROSS&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="s1"&gt;'+%F %T'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt; previous run still active — skipping"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&amp;amp;2
    &lt;span class="nb"&gt;exit &lt;/span&gt;0
&lt;span class="k"&gt;fi

&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CHECK&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="s1"&gt;'+%F %T'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt; lock acquired — starting"&lt;/span&gt;
rsync &lt;span class="nt"&gt;-a&lt;/span&gt; &lt;span class="nt"&gt;--delete&lt;/span&gt; /data/ /mnt/backup/data/
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CHECK&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="s1"&gt;'+%F %T'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt; finished — kernel releases the lock on exit"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The two lines doing the work are &lt;code&gt;exec 200&amp;gt;"$LOCK_FILE"&lt;/code&gt; and &lt;code&gt;flock -n 200&lt;/code&gt;. The first opens the lock file on a descriptor that stays open for the life of the process. The second tries to grab the lock without waiting; if a sibling process already holds it, &lt;code&gt;flock&lt;/code&gt; returns non-zero, we log it and exit &lt;code&gt;0&lt;/code&gt; — a skipped run is normal, not an error, so we don't want it lighting up cron's mail.&lt;/p&gt;

&lt;p&gt;Notice there is no cleanup. No &lt;code&gt;trap&lt;/code&gt; to remove a PID file, no &lt;code&gt;rm&lt;/code&gt; at the end. When this script exits for any reason, fd 200 closes and the lock is gone. That "for any reason" is the whole point.&lt;/p&gt;

&lt;h2&gt;
  
  
  You can lock a job without editing it at all
&lt;/h2&gt;

&lt;p&gt;If the misbehaving job is already deployed and you don't want to touch it, wrap it from the crontab line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Skip the run if the last one is still going&lt;/span&gt;
&lt;span class="k"&gt;*&lt;/span&gt;/1 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; /usr/bin/flock &lt;span class="nt"&gt;-n&lt;/span&gt; /run/lock/sync.lock /usr/local/bin/sync.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;flock&lt;/code&gt; runs &lt;code&gt;sync.sh&lt;/code&gt; only if it can grab the lock; if last minute's run is still holding it, this minute's run exits immediately and does nothing. It's the fastest retrofit for a job that's already on fire — no redeploy.&lt;/p&gt;

&lt;p&gt;One thing worth burning into memory: &lt;code&gt;-n&lt;/code&gt; skips, &lt;code&gt;-w 30&lt;/code&gt; waits up to thirty seconds then gives up, and a &lt;em&gt;bare&lt;/em&gt; &lt;code&gt;flock&lt;/code&gt; with neither blocks forever. On a fast cron schedule that bare form turns your "skipped" runs into a pile of stuck processes — the exact thing you were trying to prevent.&lt;/p&gt;
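
&lt;p&gt;The skip behaviour of &lt;code&gt;-n&lt;/code&gt; is easy to check by hand. The sketch below is only an illustration, assuming the util-linux &lt;code&gt;flock&lt;/code&gt; binary is on the path; the temporary lock file and the &lt;code&gt;try_lock_while_held&lt;/code&gt; helper are hypothetical names, not part of the script above. It holds the lock with one &lt;code&gt;flock&lt;/code&gt; and, while it is held, tries to take it again with &lt;code&gt;flock -n&lt;/code&gt;:&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical demo: the lock file path and helper name are made up for illustration.
lock_file=$(mktemp)

try_lock_while_held() {
  # The outer flock holds the lock for as long as the inner command runs.
  # The inner flock -n opens the file again, finds the lock taken, and fails at once.
  flock -n "$lock_file" flock -n "$lock_file" true
}

if try_lock_while_held; then
  echo "second lock acquired (unexpected)"
else
  echo "second lock refused: this run would be skipped"
fi

# Once the holder has exited, the lock is free again with no cleanup step.
if flock -n "$lock_file" true; then
  echo "lock is free again"
fi

rm -f "$lock_file"
```

&lt;p&gt;Swapping the inner &lt;code&gt;-n&lt;/code&gt; for &lt;code&gt;-w 2&lt;/code&gt; should make the same call wait about two seconds before giving up, which is a quick way to feel the difference between the two flags.&lt;/p&gt;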

&lt;h2&gt;
  
  
  The part that actually mattered
&lt;/h2&gt;

&lt;p&gt;The load-41 afternoon ended the moment I wrapped that rsync in &lt;code&gt;flock -n&lt;/code&gt;. The slow NFS mount was still slow, but now exactly one sync ran at a time and the extras skipped harmlessly until the disk recovered. Locking didn't fix the slow disk — it stopped a transient slow disk from becoming a self-inflicted outage. That's the difference between a script that works when you run it and one that survives unattended.&lt;/p&gt;

&lt;p&gt;A lock alone isn't the whole story, though. If the locked job itself &lt;em&gt;hangs&lt;/em&gt;, it holds the lock forever and every future run skips, so the job silently stops running and you find out days later. That's why locking belongs next to two other guards: a &lt;code&gt;timeout&lt;/code&gt; that bounds the runtime, and a retry for transient failures.&lt;/p&gt;
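
&lt;p&gt;A rough sketch of the lock-plus-timeout half of that pairing, assuming GNU coreutils &lt;code&gt;timeout&lt;/code&gt; and util-linux &lt;code&gt;flock&lt;/code&gt;. The one-second limit and the &lt;code&gt;sleep 5&lt;/code&gt; standing in for a hung job are illustrative values, not the ones from the real sync:&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical demo: sleep 5 stands in for a job that hangs.
lock_file=$(mktemp)

# flock takes the lock, timeout kills the job after 1 second,
# and the lock is released when the killed job exits.
flock -n "$lock_file" timeout 1 sleep 5
status=$?
echo "job exit status: $status"   # GNU timeout reports 124 when the limit is hit

# The next run is not blocked by the hung one.
if flock -n "$lock_file" true; then
  echo "next run can take the lock"
fi

rm -f "$lock_file"
```

&lt;p&gt;The order matters in this sketch: &lt;code&gt;flock&lt;/code&gt; wraps &lt;code&gt;timeout&lt;/code&gt;, so the limit applies to the job rather than to waiting for the lock.&lt;/p&gt;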

&lt;p&gt;Full script, the &lt;code&gt;-n&lt;/code&gt; vs &lt;code&gt;-w&lt;/code&gt; decision, and the FAQ on where the lock file should live: &lt;a href="https://bashsnippets.xyz/snippets/bash-flock-single-instance" rel="noopener noreferrer"&gt;https://bashsnippets.xyz/snippets/bash-flock-single-instance&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you're hardening a cron job, the next two guards are &lt;a href="https://bashsnippets.xyz/snippets/bash-timeout-command" rel="noopener noreferrer"&gt;timeout&lt;/a&gt; and &lt;a href="https://bashsnippets.xyz/snippets/bash-retry-with-backoff" rel="noopener noreferrer"&gt;retry with backoff&lt;/a&gt;; the &lt;a href="https://bashsnippets.xyz/tools/cron-wrapper-generator" rel="noopener noreferrer"&gt;Hardened Cron Wrapper Generator&lt;/a&gt; stitches all three into one wrapper, and the full reasoning is in &lt;a href="https://bashsnippets.xyz/guides/bash-scripts-that-survive-cron" rel="noopener noreferrer"&gt;Bash Scripts That Survive Cron&lt;/a&gt;. The rest of the library is at &lt;a href="https://bashsnippets.xyz" rel="noopener noreferrer"&gt;https://bashsnippets.xyz&lt;/a&gt;&lt;/p&gt;

</description>
      <category>bash</category>
      <category>linux</category>
      <category>devops</category>
      <category>sysadmin</category>
    </item>
    <item>
      <title>A Function Without local Overwrote My Variable and rm -rf Deleted the Wrong Directory</title>
      <dc:creator>Anguishe</dc:creator>
      <pubDate>Mon, 22 Jun 2026 02:46:38 +0000</pubDate>
      <link>https://dev.to/bashsnippets/a-function-without-local-overwrote-my-variable-and-rm-rf-deleted-the-wrong-directory-36ne</link>
      <guid>https://dev.to/bashsnippets/a-function-without-local-overwrote-my-variable-and-rm-rf-deleted-the-wrong-directory-36ne</guid>
      <description>&lt;p&gt;The deploy script had been running in production for four months without a problem. It built releases into a temp directory, ran some validation, and then cleaned up by removing whatever &lt;code&gt;$target&lt;/code&gt; pointed at. &lt;code&gt;$target&lt;/code&gt; was set near the top of the script to the current release directory — the one the running application was serving from. A helper function called &lt;code&gt;prepare()&lt;/code&gt; also used a variable named &lt;code&gt;target&lt;/code&gt;, because the person who wrote it (me, four months earlier) did not think about scope.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;prepare()&lt;/code&gt; built the new release into a temp directory. On a good run, it set &lt;code&gt;target&lt;/code&gt; to the temp path, did its work, and returned. The main script then did its cleanup at the end and removed &lt;code&gt;$target&lt;/code&gt; — which, after calling &lt;code&gt;prepare()&lt;/code&gt;, was the temp directory. That worked correctly for four months.&lt;/p&gt;

&lt;p&gt;Then a deploy failed partway through &lt;code&gt;prepare()&lt;/code&gt;. The temp directory was half-built. &lt;code&gt;target&lt;/code&gt; was now pointing at the half-built temp path. The main script caught the failure, started its cleanup, and ran &lt;code&gt;rm -rf "$target"&lt;/code&gt;. It removed the half-built temp directory, which was correct. But then it kept going — there was a second cleanup step that also used &lt;code&gt;$target&lt;/code&gt; and expected it to still be the release directory. By the time the script finished, it had removed the running application's release directory. The application restarted and found nothing to serve.&lt;/p&gt;

&lt;p&gt;The users noticed before I did. The restart loop was filling logs, the application was returning 502, and I was sitting in the deploy output trying to figure out what had gone wrong in a failure path I had tested against a stub.&lt;/p&gt;

&lt;h2&gt;
  
  
  Variables in bash are global by default
&lt;/h2&gt;

&lt;p&gt;This is the single most surprising thing about bash functions if you have written code in almost any other language. In Python, a variable assigned inside a function is local to that function unless you explicitly declare it &lt;code&gt;global&lt;/code&gt;. In bash, it is the opposite. A variable assigned inside a function is visible — and writable — everywhere in the current shell unless you declare it &lt;code&gt;local&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/srv/release/current"&lt;/span&gt;

prepare&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nv"&gt;target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;mktemp&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;    &lt;span class="c"&gt;# No local — this overwrites the global $target&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"building in &lt;/span&gt;&lt;span class="nv"&gt;$target&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

prepare
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"target is now: &lt;/span&gt;&lt;span class="nv"&gt;$target&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;   &lt;span class="c"&gt;# Prints the temp dir, not /srv/release/current&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The function does exactly what it looks like — it sets &lt;code&gt;target&lt;/code&gt;. The problem is that it sets &lt;code&gt;target&lt;/code&gt; everywhere, not just inside itself. The caller's &lt;code&gt;target&lt;/code&gt; is gone.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;local&lt;/code&gt; confines the assignment to the function scope:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;prepare&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;local &lt;/span&gt;target              &lt;span class="c"&gt;# confined to this function&lt;/span&gt;
  &lt;span class="nv"&gt;target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;mktemp&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$target&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;            &lt;span class="c"&gt;# hand the value out via stdout&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="nv"&gt;build_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;prepare&lt;span class="si"&gt;)&lt;/span&gt;        &lt;span class="c"&gt;# capture what the function echoed&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"built in: &lt;/span&gt;&lt;span class="nv"&gt;$build_dir&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"release dir still: &lt;/span&gt;&lt;span class="nv"&gt;$target&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;   &lt;span class="c"&gt;# unchanged&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;local target&lt;/code&gt; means: this variable belongs to this call of the function. It is visible inside the function and, because bash scoping is dynamic, inside any functions it calls, but when the function returns, the variable and its value vanish. The caller's &lt;code&gt;target&lt;/code&gt; is never touched. The habit I now enforce on every bash function I write: &lt;code&gt;local&lt;/code&gt; for every variable the function introduces, not just the ones I think might conflict. The conflict I do not predict is the one that deletes the wrong directory.&lt;/p&gt;
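
&lt;p&gt;One detail in the fixed &lt;code&gt;prepare()&lt;/code&gt; is easy to miss: the declaration and the assignment sit on separate lines. A minimal sketch of why, using hypothetical functions &lt;code&gt;fails&lt;/code&gt;, &lt;code&gt;combined&lt;/code&gt;, and &lt;code&gt;split&lt;/code&gt; that are not part of the deploy script: &lt;code&gt;local&lt;/code&gt; is itself a command, so when the two are combined its own success status replaces the status of the command substitution.&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical helpers for illustration only.
fails() { return 3; }

combined() {
  local value=$(fails)      # status of fails is replaced by the status of local
  echo "combined: $?"
}

split() {
  local value
  value=$(fails)            # status of fails survives the assignment
  echo "split: $?"
}

combined   # prints "combined: 0"
split      # prints "split: 3"
```

&lt;p&gt;With the combined form, a failing &lt;code&gt;mktemp&lt;/code&gt; would go unnoticed and the function would carry on with an empty path.&lt;/p&gt;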

&lt;h2&gt;
  
  
  return is a status, not a value
&lt;/h2&gt;

&lt;p&gt;After the incident, I audited every function in the deploy script. I found a second bug:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;count_pending&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;local &lt;/span&gt;n
  &lt;span class="nv"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;find &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$QUEUE_DIR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-type&lt;/span&gt; f | &lt;span class="nb"&gt;wc&lt;/span&gt; &lt;span class="nt"&gt;-l&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;      &lt;span class="c"&gt;# WRONG if n &amp;gt; 255&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

count_pending
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nv"&gt;$?&lt;/span&gt; &lt;span class="nt"&gt;-gt&lt;/span&gt; 0 &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"queue has items"&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;return&lt;/code&gt; sets an exit status. Exit statuses are a single byte: &lt;code&gt;0&lt;/code&gt; to &lt;code&gt;255&lt;/code&gt;. &lt;code&gt;return 300&lt;/code&gt; wraps to &lt;code&gt;44&lt;/code&gt;. For months the queue count had been above 255 on busy days, and the &lt;code&gt;$?&lt;/code&gt; check was comparing against a wrapped value. The logic had been wrong all that time and had accidentally worked because the wrapped values still satisfied the &lt;code&gt;-gt 0&lt;/code&gt; condition. Even that was luck: a count of exactly 256 or 512 wraps to &lt;code&gt;0&lt;/code&gt; and would have reported an empty queue. Any script making real decisions based on the actual count (how many workers to spin up, whether to page someone) would have been working with garbage.&lt;/p&gt;

&lt;p&gt;The correct pattern is to echo the value and capture it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;count_pending&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;local &lt;/span&gt;n
  &lt;span class="nv"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;find &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$QUEUE_DIR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-type&lt;/span&gt; f | &lt;span class="nb"&gt;wc&lt;/span&gt; &lt;span class="nt"&gt;-l&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;         &lt;span class="c"&gt;# data goes to stdout&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;0          &lt;span class="c"&gt;# status: success&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="nv"&gt;pending&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;count_pending&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"queue depth: &lt;/span&gt;&lt;span class="nv"&gt;$pending&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;return&lt;/code&gt; answers "did this succeed." &lt;code&gt;echo&lt;/code&gt; plus command substitution answers "what is the value." These are two different questions and bash gives you two different mechanisms for a reason. Mixing them is how a function that counts 300 items makes the caller think it counted 44.&lt;/p&gt;
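
&lt;p&gt;The wrap-around is quick to reproduce. This sketch uses two hypothetical functions that do nothing but return an out-of-range number, so the truncation to one byte is the only thing on display:&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical functions showing how return keeps only the low 8 bits.
returns_300() { return 300; }
returns_256() { return 256; }

returns_300
echo "return 300 arrives as: $?"   # 44

returns_256
echo "return 256 arrives as: $?"   # 0, which looks like success
```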

&lt;h2&gt;
  
  
  Arguments and the $@ quoting rule
&lt;/h2&gt;

&lt;p&gt;Inside a function, arguments arrive as positional parameters: &lt;code&gt;$1&lt;/code&gt;, &lt;code&gt;$2&lt;/code&gt;, all of them as &lt;code&gt;"$@"&lt;/code&gt;, the count as &lt;code&gt;$#&lt;/code&gt;. Quoting &lt;code&gt;"$@"&lt;/code&gt; is what keeps multi-word arguments intact:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;process_hosts&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"processing &lt;/span&gt;&lt;span class="nv"&gt;$# &lt;/span&gt;&lt;span class="s2"&gt;hosts"&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;host &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$@&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"  checking: &lt;/span&gt;&lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;done&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

process_hosts &lt;span class="s2"&gt;"web-01"&lt;/span&gt; &lt;span class="s2"&gt;"db primary"&lt;/span&gt; &lt;span class="s2"&gt;"cache-02"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without the quotes around &lt;code&gt;"$@"&lt;/code&gt;, &lt;code&gt;db primary&lt;/code&gt; splits into two loop iterations and you are back to the word-splitting problem from a different angle. The quoted &lt;code&gt;"$@"&lt;/code&gt; is the way bash passes an array of arguments through a function call with each element preserved.&lt;/p&gt;
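
&lt;p&gt;Counting the arguments that survive each form makes the split visible. The helper names here (&lt;code&gt;count_args&lt;/code&gt;, &lt;code&gt;pass_quoted&lt;/code&gt;, &lt;code&gt;pass_unquoted&lt;/code&gt;) are hypothetical and exist only for this sketch:&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical helpers: count how many arguments survive each form.
count_args() { echo "$#"; }
pass_quoted()   { count_args "$@"; }
pass_unquoted() { count_args $@; }   # deliberately unquoted to show the split

echo "quoted:   $(pass_quoted "web-01" "db primary" "cache-02")"     # 3
echo "unquoted: $(pass_unquoted "web-01" "db primary" "cache-02")"   # 4
```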

&lt;p&gt;This matters most when you are writing wrapper functions — functions that receive arguments and pass them to another command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;run_with_retry&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;1&lt;/span&gt;:?&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="nb"&gt;shift
  local &lt;/span&gt;&lt;span class="nv"&gt;attempt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0
  &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt;attempt &amp;lt; retries&lt;span class="o"&gt;))&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$@&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="k"&gt;return &lt;/span&gt;0
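    &lt;span class="nv"&gt;attempt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$((&lt;/span&gt;attempt &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="k"&gt;))&lt;/span&gt;   &lt;span class="c"&gt;# ((attempt++)) returns non-zero when attempt is 0, which aborts under set -e&lt;/span&gt;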
    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"retry &lt;/span&gt;&lt;span class="nv"&gt;$attempt&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;$retries&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="nb"&gt;sleep &lt;/span&gt;2
  &lt;span class="k"&gt;done
  return &lt;/span&gt;1
&lt;span class="o"&gt;}&lt;/span&gt;

run_with_retry 3 rsync &lt;span class="nt"&gt;-av&lt;/span&gt; &lt;span class="s2"&gt;"source dir/"&lt;/span&gt; remote:/dest/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;"$@"&lt;/code&gt; after the &lt;code&gt;shift&lt;/code&gt; is everything after the retry count — the command and all its arguments, each preserved as a separate item even if they contain spaces. Without the quotes, &lt;code&gt;"source dir/"&lt;/code&gt; splits and &lt;code&gt;rsync&lt;/code&gt; receives the wrong arguments.&lt;/p&gt;

&lt;h2&gt;
  
  
  getopts for anything with flags
&lt;/h2&gt;

&lt;p&gt;For one or two fixed positional arguments, reading &lt;code&gt;$1&lt;/code&gt; and &lt;code&gt;$2&lt;/code&gt; directly is fine. The moment a function or script takes optional flags in any order, do not parse them by hand:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0
&lt;span class="nv"&gt;output_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;

&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nb"&gt;getopts&lt;/span&gt; &lt;span class="s2"&gt;"vd:"&lt;/span&gt; opt&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  case&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$opt&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="k"&gt;in
    &lt;/span&gt;v&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="p"&gt;;;&lt;/span&gt;
    d&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;output_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$OPTARG&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="p"&gt;;;&lt;/span&gt;
    &lt;span class="k"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"usage: &lt;/span&gt;&lt;span class="nv"&gt;$0&lt;/span&gt;&lt;span class="s2"&gt; [-v] [-d dir]"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&amp;amp;2&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;exit &lt;/span&gt;1 &lt;span class="p"&gt;;;&lt;/span&gt;
  &lt;span class="k"&gt;esac&lt;/span&gt;
&lt;span class="k"&gt;done
&lt;/span&gt;&lt;span class="nb"&gt;shift&lt;/span&gt; &lt;span class="k"&gt;$((&lt;/span&gt;OPTIND &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="k"&gt;))&lt;/span&gt;   &lt;span class="c"&gt;# move past the flags to positional args&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The colon after &lt;code&gt;d&lt;/code&gt; marks it as requiring an argument, which arrives in &lt;code&gt;$OPTARG&lt;/code&gt;. &lt;code&gt;getopts&lt;/code&gt; handles flag bundling (&lt;code&gt;-vd dir&lt;/code&gt;), missing arguments (&lt;code&gt;-d&lt;/code&gt; with no path generates an error automatically), and unknown flags. Hand-rolled &lt;code&gt;$1&lt;/code&gt; parsing gets all of these wrong in subtle ways — it accepts &lt;code&gt;-d&lt;/code&gt; at the end without a value, it does not handle &lt;code&gt;-vd dir&lt;/code&gt;, and it requires the flags in a specific order.&lt;/p&gt;
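
&lt;p&gt;The same loop also works inside a function, with one extra step. This is a sketch under assumptions: &lt;code&gt;parse_flags&lt;/code&gt; is a hypothetical name, and declaring &lt;code&gt;OPTIND&lt;/code&gt; local is one common way to make sure each call starts parsing from the first argument rather than where a previous call left off.&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical function: the same getopts loop, scoped to a function.
parse_flags() {
  local OPTIND=1 opt verbose=0 output_dir=""
  while getopts "vd:" opt; do
    case "$opt" in
      v) verbose=1 ;;
      d) output_dir="$OPTARG" ;;
      *) return 2 ;;                 # unknown flag or missing value
    esac
  done
  shift $((OPTIND - 1))              # what remains are the positional args
  echo "verbose=$verbose output_dir=$output_dir rest=$*"
}

parse_flags -vd /tmp/out file.txt    # bundled flags, then a positional arg

# -d with no value: getopts reports the error and the function returns non-zero
if ! parse_flags -d 2>/dev/null; then
  echo "missing value for -d was rejected"
fi
```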

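&lt;p&gt;The deploy script that caused the incident had hand-rolled argument parsing. Among other things, it silently accepted a &lt;code&gt;--target&lt;/code&gt; flag with no value and proceeded with an empty string, which caused a different class of problem I had also not fully traced before the bigger incident made the whole thing visible. One caveat when migrating: the &lt;code&gt;getopts&lt;/code&gt; builtin only understands single-letter options, so a long flag like &lt;code&gt;--target&lt;/code&gt; has to become a short one such as &lt;code&gt;-t&lt;/code&gt;.&lt;/p&gt;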

&lt;h2&gt;
  
  
  What the deploy script looks like now
&lt;/h2&gt;

&lt;p&gt;Every function declares &lt;code&gt;local&lt;/code&gt; for every variable. Values that need to cross function boundaries go through &lt;code&gt;echo&lt;/code&gt; and command substitution. Exit statuses communicate success or failure. &lt;code&gt;getopts&lt;/code&gt; handles the flags. There is a &lt;code&gt;trap&lt;/code&gt; on &lt;code&gt;EXIT&lt;/code&gt; that cleans up the temp directory using a local variable that only the cleanup function can see:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cleanup&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;local &lt;/span&gt;&lt;span class="nv"&gt;tmp_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TEMP_BUILD_DIR&lt;/span&gt;&lt;span class="k"&gt;:-}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;   &lt;span class="c"&gt;# local ref to the temp dir&lt;/span&gt;
  &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$tmp_dir&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$tmp_dir&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$tmp_dir&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="nb"&gt;trap &lt;/span&gt;cleanup EXIT
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;$target&lt;/code&gt; variable in the main script is set once at the top and never touched by any function. Functions that need a temp directory create one, store it in a local variable, use it, and the cleanup trap handles removal. The variable naming conflict that caused the incident cannot happen because the pattern prevents it structurally.&lt;/p&gt;
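
&lt;p&gt;The failure path is the one worth rehearsing. The sketch below is a hypothetical harness, not the real deploy script: it runs a stand-in deploy in a subshell, forces it to exit non-zero, and then checks from the outside whether the temp directory is still there. The &lt;code&gt;marker&lt;/code&gt; file is only a way to pass the path back to the parent.&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical demo: run a failing "deploy" in a subshell and check what is left behind.
marker=$(mktemp)

(
  TEMP_BUILD_DIR=$(mktemp -d)
  cleanup() {
    local tmp_dir="${TEMP_BUILD_DIR:-}"
    if [[ -n "$tmp_dir" ]]; then
      if [[ -d "$tmp_dir" ]]; then
        rm -rf "$tmp_dir"
      fi
    fi
  }
  trap cleanup EXIT
  echo "$TEMP_BUILD_DIR" > "$marker"   # record the path so the parent can inspect it
  exit 1                               # simulate a failure partway through
)

built_in=$(cat "$marker")
if [[ -d "$built_in" ]]; then
  echo "temp dir leaked: $built_in"
else
  echo "temp dir removed by the EXIT trap"
fi
rm -f "$marker"
```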

&lt;p&gt;The application has been running cleanly since then. The deploy script has hit the failure path twice since the fix — different failures, unrelated causes — and both times the cleanup ran correctly and left the running release untouched.&lt;/p&gt;

&lt;p&gt;Full version with the local-scope fix, the echo-for-values pattern, the return-as-status trap, and a getopts template: &lt;a href="https://bashsnippets.xyz/snippets/bash-functions-arguments" rel="noopener noreferrer"&gt;https://bashsnippets.xyz/snippets/bash-functions-arguments&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A function that fails should fail loudly — wrap the script in &lt;a href="https://bashsnippets.xyz/snippets/bash-error-handling" rel="noopener noreferrer"&gt;set -euo pipefail&lt;/a&gt; — and the &lt;a href="https://bashsnippets.xyz/tools/bash-boilerplate-generator" rel="noopener noreferrer"&gt;bash boilerplate generator&lt;/a&gt; can scaffold all of this with the right traps and argument parsing wired in from the start. The rest is at &lt;a href="https://bashsnippets.xyz" rel="noopener noreferrer"&gt;https://bashsnippets.xyz&lt;/a&gt;&lt;/p&gt;

</description>
      <category>bash</category>
      <category>linux</category>
      <category>sysadmin</category>
      <category>devops</category>
    </item>
    <item>
      <title>The Alert Never Fired Because the Loop Skipped the Last Line of the File</title>
      <dc:creator>Anguishe</dc:creator>
      <pubDate>Fri, 19 Jun 2026 17:58:09 +0000</pubDate>
      <link>https://dev.to/bashsnippets/the-alert-never-fired-because-the-loop-skipped-the-last-line-of-the-file-3il8</link>
      <guid>https://dev.to/bashsnippets/the-alert-never-fired-because-the-loop-skipped-the-last-line-of-the-file-3il8</guid>
      <description>&lt;p&gt;We kept a plaintext file of hostnames, one per line, and a monitoring script read the file and pinged each host every five minutes. When a host failed to respond, the script sent an email alert. The system had been running for months and it worked — we had caught three actual outages with it, which gave us real confidence in the setup.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;app-07&lt;/code&gt; was added to the list on a Thursday afternoon. The engineer who added it was using VS Code on a Mac, and VS Code by default does not add a trailing newline to a file when you append to it using certain editing workflows. The file had ended in a newline before the edit. After the edit, the last line — &lt;code&gt;app-07&lt;/code&gt; — had no trailing newline.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;app-07&lt;/code&gt; went down the following Sunday afternoon at 2:17pm. The monitoring script ran at 2:20, 2:25, 2:30, all the way through the evening. No alert ever fired. The on-call engineer found out at 8pm when a client emailed. The system had been down for almost six hours.&lt;/p&gt;

&lt;p&gt;When I looked at the script, the bug was immediately obvious once I knew what to look for. But I had written that script, I had tested it, and I had been looking at the monitoring confirmation emails for months without ever noticing. The confirmation email listed the hosts it checked. &lt;code&gt;app-07&lt;/code&gt; was never in the list. I had been reading those emails without actually counting the hosts. I just scanned for the OK lines and moved on.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why read drops the last line
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;read&lt;/code&gt; returns a success exit status when it reads a line and finds the newline that terminates it. When the file does not end in a newline, &lt;code&gt;read&lt;/code&gt; still populates the variable with the final line's content, but it returns a non-zero (failure) exit status because it hit end-of-file before finding a terminator. A &lt;code&gt;while read host&lt;/code&gt; loop checks the return status to decide whether to execute the loop body. On the final, newline-less line, &lt;code&gt;read&lt;/code&gt; puts &lt;code&gt;app-07&lt;/code&gt; into &lt;code&gt;host&lt;/code&gt; and then returns failure. The &lt;code&gt;while&lt;/code&gt; loop sees failure and exits without running the body. The content is there. The variable is populated. The loop throws it away.&lt;/p&gt;

&lt;p&gt;This behavior is documented in the POSIX spec for &lt;code&gt;read&lt;/code&gt;. It is not a bash quirk. Any POSIX shell handles the missing-final-newline case this way. A plain &lt;code&gt;while read line&lt;/code&gt; loop is incorrect for any file you do not personally control the formatting of, which in practice means nearly any file.&lt;/p&gt;

&lt;p&gt;The fix is one extra clause:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# Script: check-hosts.sh&lt;/span&gt;
&lt;span class="c"&gt;# Purpose: ping every host in a file, including a newline-less final line&lt;/span&gt;
&lt;span class="c"&gt;# Usage: ./check-hosts.sh hosts.txt&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-euo&lt;/span&gt; pipefail

&lt;span class="nv"&gt;CHECK&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"✓"&lt;/span&gt;
&lt;span class="nv"&gt;CROSS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"✗"&lt;/span&gt;
&lt;span class="nv"&gt;HOST_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;1&lt;/span&gt;:?Usage:&lt;span class="p"&gt; check-hosts.sh &amp;lt;host-file&amp;gt;&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nv"&gt;IFS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; host &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="se"&gt;\#&lt;/span&gt;&lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="k"&gt;continue
  if &lt;/span&gt;ping &lt;span class="nt"&gt;-c1&lt;/span&gt; &lt;span class="nt"&gt;-W2&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null 2&amp;gt;&amp;amp;1&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CHECK&lt;/span&gt;&lt;span class="s2"&gt; up:   &lt;/span&gt;&lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;else
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CROSS&lt;/span&gt;&lt;span class="s2"&gt; down: &lt;/span&gt;&lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;fi
done&lt;/span&gt; &amp;lt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$HOST_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;|| [[ -n "$host" ]]&lt;/code&gt; says: if &lt;code&gt;read&lt;/code&gt; returned failure but the variable is non-empty, run the loop body anyway. That is precisely the leftover-final-line case. &lt;code&gt;read&lt;/code&gt; failed because it hit end-of-file, but it populated &lt;code&gt;host&lt;/code&gt; with &lt;code&gt;app-07&lt;/code&gt; before returning. The &lt;code&gt;||&lt;/code&gt; catches it. &lt;code&gt;app-07&lt;/code&gt; gets pinged.&lt;/p&gt;
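
&lt;p&gt;Both behaviours can be reproduced without touching a real host file. In this sketch the input is a hypothetical two-line list generated by &lt;code&gt;printf&lt;/code&gt;, with no newline after the last entry:&lt;/p&gt;

```shell
#!/bin/bash
# Hypothetical input: two hosts, no newline after the last one.
printf 'app-06\napp-07' | while IFS= read -r host; do
  echo "plain loop saw: $host"
done

printf 'app-06\napp-07' | while IFS= read -r host || [[ -n "$host" ]]; do
  echo "guarded loop saw: $host"
done
```

&lt;p&gt;The plain loop reports only &lt;code&gt;app-06&lt;/code&gt;; the guarded loop reports both hosts.&lt;/p&gt;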

&lt;h2&gt;
  
  
  What IFS= and -r actually do
&lt;/h2&gt;

&lt;p&gt;You see &lt;code&gt;while IFS= read -r line&lt;/code&gt; written in every correct read-loop example, and it is worth being specific about what each piece prevents because both have their own failure mode.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;IFS=&lt;/code&gt; sets the field separator to empty for the duration of the &lt;code&gt;read&lt;/code&gt; command. Without it, &lt;code&gt;read&lt;/code&gt; strips leading and trailing whitespace from each line. A hostname like &lt;code&gt;  app-07&lt;/code&gt; (with leading spaces, which some editors produce) becomes &lt;code&gt;app-07&lt;/code&gt;, which happens to be what you want. An indented config value, an indented YAML string, or a log line that starts with spaces for alignment is silently modified in exactly the same way, and there the whitespace matters. Setting &lt;code&gt;IFS=&lt;/code&gt; tells &lt;code&gt;read&lt;/code&gt; to take the line exactly as it appears.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;-r&lt;/code&gt; prevents &lt;code&gt;read&lt;/code&gt; from interpreting backslash sequences. Without &lt;code&gt;-r&lt;/code&gt;, a line like &lt;code&gt;C:\temp\logs&lt;/code&gt; has its backslashes consumed as escape characters and arrives as &lt;code&gt;C:templogs&lt;/code&gt;. This matters less for hostname files and enormously for any script that processes Windows paths, config files that use backslash as a line-continuation character, or log files from mixed-OS environments. The &lt;code&gt;-r&lt;/code&gt; flag is essentially free protection; there is no reason not to include it.&lt;/p&gt;

&lt;p&gt;A bare &lt;code&gt;read line&lt;/code&gt; with neither &lt;code&gt;IFS=&lt;/code&gt; nor &lt;code&gt;-r&lt;/code&gt; silently mangles both whitespace and backslashes. The script works correctly on clean input and produces wrong output on input with edge cases. The wrong output does not produce an error. You find out when the data that mattered was the indented or backslash-containing kind.&lt;/p&gt;
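&lt;p&gt;To see all three behaviors side by side, the same line can be fed to each form. This is a minimal sketch: the sample line and the variable names are made up for illustration, and the input arrives through a pipe only to keep the demo self-contained.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Illustrative sample: two leading spaces, backslashes, two trailing spaces.
sample='  C:\temp\logs  '

# Each variant reads the same line; the brackets make the whitespace visible.
bare=$(printf '%s\n' "$sample" | { read line; printf '[%s]' "$line"; })
raw=$(printf '%s\n' "$sample" | { read -r line; printf '[%s]' "$line"; })
exact=$(printf '%s\n' "$sample" | { IFS= read -r line; printf '[%s]' "$line"; })

printf '%s\n' "read line:         $bare"    # [C:templogs]
printf '%s\n' "read -r line:      $raw"     # [C:\temp\logs]
printf '%s\n' "IFS= read -r line: $exact"   # [  C:\temp\logs  ]
```

&lt;p&gt;Only the last form returns the line unchanged; the other two succeed without complaint, which is why the damage tends to go unnoticed.&lt;/p&gt;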

&lt;h2&gt;
  
  
  The subshell trap that kills your counters
&lt;/h2&gt;

&lt;p&gt;This is the one that is most likely to make you question your sanity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# This looks correct. It is not.&lt;/span&gt;
&lt;span class="nv"&gt;fails&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0
&lt;span class="nb"&gt;cat &lt;/span&gt;hosts.txt | &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nv"&gt;IFS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; host&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;ping &lt;span class="nt"&gt;-c1&lt;/span&gt; &lt;span class="nt"&gt;-W2&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null 2&amp;gt;&amp;amp;1 &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt;fails++&lt;span class="o"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;done
&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Total failures: &lt;/span&gt;&lt;span class="nv"&gt;$fails&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;   &lt;span class="c"&gt;# Always prints 0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The pipe creates a subshell for the right side. The &lt;code&gt;while&lt;/code&gt; loop runs inside that subshell. &lt;code&gt;fails&lt;/code&gt; increments correctly inside the subshell. When the subshell exits, the parent shell's &lt;code&gt;fails&lt;/code&gt; is still &lt;code&gt;0&lt;/code&gt;, because the increment happened in a different process. The parent echo sees the original value.&lt;/p&gt;

&lt;p&gt;This catches people because the loop body itself works — the pings happen, the increment logic is correct — but any state the loop was supposed to accumulate for later use is silently discarded. I spent forty minutes on a version of this problem before I remembered that pipes create subshells. It is the kind of thing that feels like a bash bug until you understand that it is behaving exactly as documented.&lt;/p&gt;
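&lt;p&gt;The separate process is easy to make visible. The sketch below assumes bash with default settings (in particular, &lt;code&gt;lastpipe&lt;/code&gt; is off) and uses &lt;code&gt;printf&lt;/code&gt; with two invented hostnames in place of the hosts file, so nothing is actually pinged.&lt;/p&gt;

```shell
#!/usr/bin/env bash
fails=0
parent_pid=$BASHPID

# printf stands in for the hosts file; every line counts as a failure here.
printf '%s\n' db-01 db-02 | while IFS= read -r host; do
  fails=$((fails + 1))
  echo "loop PID $BASHPID (parent is $parent_pid): fails=$fails after $host"
done

# Back in the parent shell, the increments made in the loop are gone.
echo "after the loop, in PID $BASHPID: fails=$fails"   # fails=0
```

&lt;p&gt;The PID printed inside the loop differs from the parent's, and the final line reports &lt;code&gt;fails=0&lt;/code&gt; even though the loop counted to two.&lt;/p&gt;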

&lt;p&gt;The fix is to redirect the file into the loop instead of piping into it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;fails&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0
&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nv"&gt;IFS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; host &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;ping &lt;span class="nt"&gt;-c1&lt;/span&gt; &lt;span class="nt"&gt;-W2&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/dev/null 2&amp;gt;&amp;amp;1 &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt;fails++&lt;span class="o"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt; &amp;lt; hosts.txt
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Total failures: &lt;/span&gt;&lt;span class="nv"&gt;$fails&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;   &lt;span class="c"&gt;# Now correct&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;done &amp;lt; hosts.txt&lt;/code&gt; feeds the file to the loop's stdin without a pipe. The loop runs in the current shell. &lt;code&gt;fails&lt;/code&gt; accumulates in the current shell. The echo sees the real count.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why a for loop is wrong for this
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Never do this — iterates words, not lines&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;line &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat &lt;/span&gt;hosts.txt&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;ping &lt;span class="nt"&gt;-c1&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;$(cat hosts.txt)&lt;/code&gt; is command substitution. Bash captures the text output and word-splits it on IFS — spaces, tabs, newlines. For a file with one hostname per line and no spaces in the hostnames, this accidentally produces the right behavior. For any file with spaces — log lines, config values, paths with spaces, anything a non-developer might have generated — it splits lines into fragments and each fragment becomes a loop iteration.&lt;/p&gt;

&lt;p&gt;There is no version of &lt;code&gt;for line in $(cat file)&lt;/code&gt; that is correct for reading lines. The right tool is always &lt;code&gt;while IFS= read -r line || [[ -n "$line" ]]; do ... done &amp;lt; file&lt;/code&gt;. The for loop is the right tool for iterating a known list that you control directly, not for reading file content.&lt;/p&gt;

&lt;h2&gt;
  
  
  The monitoring system, after the fix
&lt;/h2&gt;

&lt;p&gt;After the &lt;code&gt;app-07&lt;/code&gt; incident we made three changes. The obvious one was fixing the read loop with the &lt;code&gt;|| [[ -n "$host" ]]&lt;/code&gt; guard. The second was adding a sanity check to the script that counted the lines in the hosts file and compared that count to the number of hosts the loop actually processed. A mismatch meant the file was malformed or something else was wrong. The third was adding a nightly email that included the count of hosts checked, not just the status of each one, so a future addition to the file that somehow got lost would show up as "expected 12 hosts, checked 11."&lt;/p&gt;

&lt;p&gt;The confirmation email count was something I should have had from the start. If I had been looking at "checked 11/12 hosts" instead of a list of OK lines, I would have noticed &lt;code&gt;app-07&lt;/code&gt; missing on the first night. The monitoring was working. The observability of the monitoring was not.&lt;/p&gt;

&lt;p&gt;Full version with CSV-field parsing, comment-skipping, and the subshell-safe redirect form: &lt;a href="https://bashsnippets.xyz/snippets/bash-read-file-line-by-line" rel="noopener noreferrer"&gt;https://bashsnippets.xyz/snippets/bash-read-file-line-by-line&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To iterate a list of files rather than a file's contents, reach for a &lt;a href="https://bashsnippets.xyz/snippets/bash-for-loop-examples" rel="noopener noreferrer"&gt;for loop&lt;/a&gt; instead, and wrap anything that acts on what it reads in &lt;a href="https://bashsnippets.xyz/snippets/bash-error-handling" rel="noopener noreferrer"&gt;set -euo pipefail&lt;/a&gt;. More at &lt;a href="https://bashsnippets.xyz" rel="noopener noreferrer"&gt;https://bashsnippets.xyz&lt;/a&gt;&lt;/p&gt;

</description>
      <category>bash</category>
      <category>linux</category>
      <category>sysadmin</category>
      <category>devops</category>
    </item>
    <item>
      <title>A For Loop Skipped Every File With a Space and Called the Backup a Success</title>
      <dc:creator>Anguishe</dc:creator>
      <pubDate>Thu, 18 Jun 2026 19:29:49 +0000</pubDate>
      <link>https://dev.to/bashsnippets/a-for-loop-skipped-every-file-with-a-space-and-called-the-backup-a-success-392e</link>
      <guid>https://dev.to/bashsnippets/a-for-loop-skipped-every-file-with-a-space-and-called-the-backup-a-success-392e</guid>
      <description>&lt;p&gt;The nightly backup looped over &lt;code&gt;for f in $(ls /data/exports)&lt;/code&gt; and copied each file to a backup volume. It exited clean every night. For three weeks, green exit codes, no errors, nothing in the logs to suggest anything was wrong. The backup script had been written by someone who left the company six months before I started, and it had never been tested against files with spaces in their names because the original export directory only ever had files like &lt;code&gt;Q3.xlsx&lt;/code&gt; and &lt;code&gt;report-final.xlsx&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Then someone generated &lt;code&gt;Q3 final.xlsx&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It took three weeks because that file was generated once a quarter and the next time someone needed it was a quarter later. The person who needed it was the CFO. The CFO does not particularly enjoy being told that the backup system that was supposed to protect the quarterly export has been silently failing for an indeterminate period of time and we are not sure which other files it missed.&lt;/p&gt;

&lt;p&gt;I know the backup had been failing for three weeks because that was when the file first appeared in the exports directory. Every night since then, the loop had been splitting &lt;code&gt;Q3 final.xlsx&lt;/code&gt; into two items, &lt;code&gt;Q3&lt;/code&gt; and &lt;code&gt;final.xlsx&lt;/code&gt;, trying to copy two paths that did not exist, logging two harmless-looking "no such file or directory" lines that nobody read, and moving on. Every file without a space in its name backed up fine. The script looked correct because most of the time it was.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why $(ls) word-splits
&lt;/h2&gt;

&lt;p&gt;When you write &lt;code&gt;for f in $(ls /data/exports)&lt;/code&gt;, bash runs &lt;code&gt;ls&lt;/code&gt;, captures its text output, and splits it on IFS — the internal field separator, which defaults to spaces, tabs, and newlines. Filenames are just text in the output of &lt;code&gt;ls&lt;/code&gt;. A file named &lt;code&gt;Q3 final.xlsx&lt;/code&gt; is one filename, but &lt;code&gt;ls&lt;/code&gt; outputs it as the string &lt;code&gt;Q3 final.xlsx&lt;/code&gt;, and bash splits that string on the space into two separate items before the loop ever starts.&lt;/p&gt;

&lt;p&gt;This is not a bug in &lt;code&gt;ls&lt;/code&gt;. This is not a bug in bash. This is exactly what &lt;code&gt;$( )&lt;/code&gt; does to any command's output — it captures text and bash processes it as text. The problem is that filenames are not reliably text-safe; they can contain any character except null and the path separator. Spaces are common. Tabs are less common but legal. Newlines are technically legal. Treating command output as a filename list breaks the moment any of those show up.&lt;/p&gt;

&lt;p&gt;The fix is to stop treating command output as a filename list and let bash build the list from the filesystem directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# Script: backup-exports.sh&lt;/span&gt;
&lt;span class="c"&gt;# Purpose: copy export files without losing ones with spaces in their names&lt;/span&gt;
&lt;span class="c"&gt;# Usage: ./backup-exports.sh&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-euo&lt;/span&gt; pipefail

&lt;span class="nv"&gt;CHECK&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"✓"&lt;/span&gt;
&lt;span class="nv"&gt;CROSS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"✗"&lt;/span&gt;
&lt;span class="nv"&gt;SRC_DIR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/data/exports"&lt;/span&gt;
&lt;span class="nv"&gt;DEST_DIR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/backup/exports"&lt;/span&gt;
&lt;span class="nb"&gt;shopt&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; nullglob   &lt;span class="c"&gt;# empty glob expands to nothing, not the literal pattern&lt;/span&gt;

&lt;span class="k"&gt;for &lt;/span&gt;f &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SRC_DIR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;/&lt;span class="k"&gt;*&lt;/span&gt;.xlsx&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  if &lt;/span&gt;&lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DEST_DIR&lt;/span&gt;&lt;span class="s2"&gt;/"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CHECK&lt;/span&gt;&lt;span class="s2"&gt; backed up: &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;basename&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;else
    &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CROSS&lt;/span&gt;&lt;span class="s2"&gt; failed: &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;basename&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;fi
done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;for f in "$SRC_DIR"/*.xlsx&lt;/code&gt; asks bash to expand the glob. Bash talks directly to the filesystem and gets back a properly-separated list of matching paths. No command output, no text splitting, no ambiguity about what the separator is. A file named &lt;code&gt;Q3 final.xlsx&lt;/code&gt; stays one item because it was never turned into text and split back apart. The &lt;code&gt;--&lt;/code&gt; before &lt;code&gt;"$f"&lt;/code&gt; tells &lt;code&gt;cp&lt;/code&gt; to stop reading flags, so a filename that starts with a dash does not get interpreted as a &lt;code&gt;cp&lt;/code&gt; option.&lt;/p&gt;
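&lt;p&gt;The difference is easy to reproduce in a scratch directory. This is a minimal sketch that assumes &lt;code&gt;mktemp -d&lt;/code&gt; is available; the two filenames mirror the ones in the story and the counter names are made up for illustration.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Scratch directory with one plain name and one name containing a space.
tmp=$(mktemp -d)
touch "$tmp/Q3.xlsx" "$tmp/Q3 final.xlsx"

# Parsing ls output: the text is word-split, so the space creates an extra item.
ls_count=0
for f in $(ls "$tmp"); do
  ls_count=$((ls_count + 1))
done

# Globbing: bash builds the list itself, one item per file.
glob_count=0
for f in "$tmp"/*.xlsx; do
  glob_count=$((glob_count + 1))
done

echo "items from \$(ls): $ls_count"    # 3
echo "items from glob:  $glob_count"   # 2
rm -rf -- "$tmp"
```

&lt;p&gt;Two files on disk, three loop iterations from &lt;code&gt;$(ls)&lt;/code&gt;, two from the glob.&lt;/p&gt;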

&lt;h2&gt;
  
  
  The second half: quoting on use
&lt;/h2&gt;

&lt;p&gt;The glob fixes the loop header. Quoting fixes the point of use. These are separate.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;for &lt;/span&gt;f &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SRC_DIR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;/&lt;span class="k"&gt;*&lt;/span&gt;.xlsx&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="nv"&gt;$f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DEST_DIR&lt;/span&gt;&lt;span class="s2"&gt;/"&lt;/span&gt;    &lt;span class="c"&gt;# WRONG — $f re-splits here&lt;/span&gt;
  &lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DEST_DIR&lt;/span&gt;&lt;span class="s2"&gt;/"&lt;/span&gt;  &lt;span class="c"&gt;# RIGHT — the quotes prevent the split&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even with a correct glob, an unquoted &lt;code&gt;$f&lt;/code&gt; in the &lt;code&gt;cp&lt;/code&gt; command re-splits on spaces. The glob produced &lt;code&gt;/data/exports/Q3 final.xlsx&lt;/code&gt; as one item, but the result of an unquoted variable expansion goes through word splitting, so &lt;code&gt;cp&lt;/code&gt; receives two arguments: &lt;code&gt;/data/exports/Q3&lt;/code&gt; and &lt;code&gt;final.xlsx&lt;/code&gt;. The quotes around &lt;code&gt;"$f"&lt;/code&gt; tell bash to pass the entire value as a single argument. The rule is: glob over parse to build the list, quote on use to keep each item intact.&lt;/p&gt;
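&lt;p&gt;One way to check what a command actually receives is to count its arguments. In this sketch &lt;code&gt;count_args&lt;/code&gt; is a hypothetical helper, not a standard tool, and the path is illustrative; the file does not need to exist.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helper: reports how many arguments it was given.
count_args() { echo "$#"; }

f="/data/exports/Q3 final.xlsx"   # illustrative path only

unquoted=$(count_args $f)     # the expansion is word-split on the space
quoted=$(count_args "$f")     # the quotes keep the value as one argument

echo "unquoted: $unquoted argument(s)"   # 2
echo "quoted:   $quoted argument(s)"     # 1
```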

&lt;p&gt;I have seen this mistake repeated in scripts at three different companies. In each case the scripts had been running for months or years and the problem was invisible because the majority of files had no spaces. The ones that did — client names, quarterly reports, anything a non-technical person named — were being silently skipped or mishandled.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ranges and counters: where the brace trap hides
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;5

&lt;span class="c"&gt;# This does NOT produce a range — it produces the literal string {1..5}&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;i &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;1..&lt;span class="nv"&gt;$n&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"attempt &lt;/span&gt;&lt;span class="nv"&gt;$i&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;done&lt;/span&gt;

&lt;span class="c"&gt;# This works — C-style, evaluates variables at runtime&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt;i &lt;span class="o"&gt;=&lt;/span&gt; 1&lt;span class="p"&gt;;&lt;/span&gt; i &amp;lt;&lt;span class="o"&gt;=&lt;/span&gt; n&lt;span class="p"&gt;;&lt;/span&gt; i++&lt;span class="o"&gt;))&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"attempt &lt;/span&gt;&lt;span class="nv"&gt;$i&lt;/span&gt;&lt;span class="s2"&gt; of &lt;/span&gt;&lt;span class="nv"&gt;$n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Brace expansion happens before variable expansion in bash's order of operations. By the time &lt;code&gt;$n&lt;/code&gt; is replaced with its value, the brace expansion step has already passed and &lt;code&gt;{1..$n}&lt;/code&gt; is just a string. This is the kind of thing that passes in a test environment where the count is small and hardcoded, and then produces weird output in production where it is a variable.&lt;/p&gt;

&lt;p&gt;The C-style loop is the correct form any time the bound is a variable. It evaluates at runtime and handles arithmetic naturally. If the count is genuinely fixed at write time, &lt;code&gt;{1..10}&lt;/code&gt; works fine. If it is ever going to be a variable, use &lt;code&gt;for (( ))&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Arrays: the original bug wearing a different hat
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;servers&lt;/span&gt;&lt;span class="o"&gt;=(&lt;/span&gt;&lt;span class="s2"&gt;"web-01"&lt;/span&gt; &lt;span class="s2"&gt;"db primary"&lt;/span&gt; &lt;span class="s2"&gt;"cache-02"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# WRONG — word-splits "db primary" into two iterations&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;s &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;servers&lt;/span&gt;&lt;span class="p"&gt;[@]&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;ping &lt;span class="nt"&gt;-c1&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$s&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;

&lt;span class="c"&gt;# RIGHT — quotes keep each element intact&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;s &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;servers&lt;/span&gt;&lt;span class="p"&gt;[@]&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;ping &lt;span class="nt"&gt;-c1&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$s&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;"${servers[@]}"&lt;/code&gt; with the quotes and &lt;code&gt;[@]&lt;/code&gt; is the form that preserves each element as a single item regardless of what is in it. Without the quotes, bash word-splits the array expansion and &lt;code&gt;db primary&lt;/code&gt; becomes two separate loop iterations — neither of which is a real hostname. This is exactly the same word-splitting mechanism as the &lt;code&gt;$(ls)&lt;/code&gt; problem, just manifesting in arrays instead of command output.&lt;/p&gt;

&lt;p&gt;Once you internalize that word-splitting happens wherever an unquoted variable or expansion appears, the rule becomes one rule instead of several: always quote expansions. The specific context changes; the mechanism does not.&lt;/p&gt;

&lt;h2&gt;
  
  
  The nullglob case
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;shopt&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; nullglob
&lt;span class="k"&gt;for &lt;/span&gt;f &lt;span class="k"&gt;in&lt;/span&gt; /data/exports/&lt;span class="k"&gt;*&lt;/span&gt;.xlsx&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$f&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without &lt;code&gt;shopt -s nullglob&lt;/code&gt;, if no &lt;code&gt;.xlsx&lt;/code&gt; files exist, the glob &lt;code&gt;*.xlsx&lt;/code&gt; does not expand — it stays as the literal string &lt;code&gt;*.xlsx&lt;/code&gt;. The loop runs once with &lt;code&gt;f&lt;/code&gt; set to the literal string &lt;code&gt;/data/exports/*.xlsx&lt;/code&gt;. Your script then tries to process a file with that exact path, which does not exist, and produces an error or silently does nothing depending on what you do with it.&lt;/p&gt;

&lt;p&gt;Setting &lt;code&gt;nullglob&lt;/code&gt; tells bash to expand a non-matching glob to nothing (an empty list), so the loop simply does not run. This is almost always the right behavior when you are iterating files that might not exist.&lt;/p&gt;
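&lt;p&gt;Both behaviors can be observed against an empty scratch directory. This is a minimal sketch assuming &lt;code&gt;mktemp -d&lt;/code&gt; is available; the counter names are made up for illustration.&lt;/p&gt;

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)   # empty directory: no .xlsx files at all

# Default behavior: the unmatched pattern is passed through literally.
shopt -u nullglob
without=0
for f in "$tmp"/*.xlsx; do
  without=$((without + 1))
  echo "loop body ran with: $f"
done

# With nullglob: the unmatched pattern expands to nothing.
shopt -s nullglob
with=0
for f in "$tmp"/*.xlsx; do
  with=$((with + 1))
done
shopt -u nullglob   # restore the default for anything that runs afterwards

echo "iterations without nullglob: $without"   # 1
echo "iterations with nullglob:    $with"      # 0
rm -rf -- "$tmp"
```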

&lt;h2&gt;
  
  
  What happened to the Q3 export
&lt;/h2&gt;

&lt;p&gt;We recovered it from the CFO's local machine, where she had downloaded it before the backup was supposed to preserve it. The fix to the script took four minutes. The conversation about why the backup system had been failing silently for three weeks took longer. The monitoring that we added afterwards — a nightly check that the backup directory has at least as many files as the source directory — took another twenty minutes.&lt;/p&gt;
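&lt;p&gt;A check along those lines might look like the sketch below. It is not the script we actually run: &lt;code&gt;backup_count_ok&lt;/code&gt; is a hypothetical name, the &lt;code&gt;.xlsx&lt;/code&gt; filter is an assumption carried over from the backup script, and the demo uses scratch directories rather than the real paths.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the backup directory holds at least
# as many .xlsx files as the source. The body runs in a subshell (note the
# parentheses), so the nullglob setting does not leak into the caller.
backup_count_ok() (
  shopt -s nullglob
  src_files=("$1"/*.xlsx)
  dest_files=("$2"/*.xlsx)
  echo "source: ${#src_files[@]} file(s), backup: ${#dest_files[@]} file(s)"
  [ "${#dest_files[@]}" -ge "${#src_files[@]}" ]
)

# Demo: the file with a space in its name never reached the backup.
src=$(mktemp -d)
dest=$(mktemp -d)
touch "$src/Q3.xlsx" "$src/Q3 final.xlsx" "$dest/Q3.xlsx"

if backup_count_ok "$src" "$dest"; then
  echo "backup looks complete"
else
  echo "backup is missing files"
fi
rm -rf -- "$src" "$dest"
```

&lt;p&gt;A count comparison is deliberately crude. It does not prove the files are the same ones, but it would have flagged the missing export on the first night.&lt;/p&gt;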

&lt;p&gt;The monitoring should have been there from the start. So should the glob. So should the &lt;code&gt;set -euo pipefail&lt;/code&gt; that would have made the copy failures loud instead of silent. These are things you add before something breaks, and the only reason to know you need them is to have seen, or caused, or read about what happens when they are missing.&lt;/p&gt;

&lt;p&gt;Full examples with the safe glob, counter, C-style loop, array form, and nullglob guard: &lt;a href="https://bashsnippets.xyz/snippets/bash-for-loop-examples" rel="noopener noreferrer"&gt;https://bashsnippets.xyz/snippets/bash-for-loop-examples&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For reading a file's lines one at a time, a for loop is the wrong tool — use &lt;a href="https://bashsnippets.xyz/snippets/bash-read-file-line-by-line" rel="noopener noreferrer"&gt;while IFS= read -r&lt;/a&gt; — and wrap any loop that touches real files in &lt;a href="https://bashsnippets.xyz/snippets/bash-error-handling" rel="noopener noreferrer"&gt;set -euo pipefail&lt;/a&gt;. The rest is at &lt;a href="https://bashsnippets.xyz" rel="noopener noreferrer"&gt;https://bashsnippets.xyz&lt;/a&gt;&lt;/p&gt;

</description>
      <category>bash</category>
      <category>linux</category>
      <category>sysadmin</category>
      <category>devops</category>
    </item>
    <item>
      <title>find . -delete Ran Before the Filter and Emptied the Whole Tree</title>
      <dc:creator>Anguishe</dc:creator>
      <pubDate>Wed, 17 Jun 2026 23:59:08 +0000</pubDate>
      <link>https://dev.to/bashsnippets/find-delete-ran-before-the-filter-and-emptied-the-whole-tree-3298</link>
      <guid>https://dev.to/bashsnippets/find-delete-ran-before-the-filter-and-emptied-the-whole-tree-3298</guid>
      <description>&lt;p&gt;I meant to delete the &lt;code&gt;.cache&lt;/code&gt; files under a data directory. The server had been running for two months and the cache layer had grown to about 14GB. The application team told me it was safe to purge it: they'd rebuilt the cache logic and the old files were just dead weight. I typed &lt;code&gt;find /data -delete -name "*.cache"&lt;/code&gt; because I was moving fast and I figured the order of arguments to &lt;code&gt;find&lt;/code&gt; did not matter. It does. &lt;code&gt;find&lt;/code&gt; evaluates its expression left to right, and &lt;code&gt;-delete&lt;/code&gt; is not a filter; it is an action. It fired on every path &lt;code&gt;find&lt;/code&gt; walked, deepest entries first because &lt;code&gt;-delete&lt;/code&gt; implies &lt;code&gt;-depth&lt;/code&gt;, and &lt;code&gt;-name "*.cache"&lt;/code&gt; never got a chance to narrow anything. By the time I hit Ctrl-C the tree was roughly 80% gone.&lt;/p&gt;

&lt;p&gt;That was not a test server.&lt;/p&gt;

&lt;p&gt;The restore from backup took forty minutes. The application was down for those forty minutes. The postmortem was a forty-five minute conversation with people who did not particularly enjoy having it. I have run hundreds of &lt;code&gt;find&lt;/code&gt; commands since then and I verify the expression order before every single one that has any destructive action attached to it — not because I've forgotten the rule, but because the cost of forgetting it once is not recoverable with an apology.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the order is the program
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;find&lt;/code&gt; does not have a flag parser that groups tests and actions separately. It walks a directory tree and evaluates its arguments as a logical expression, left to right, short-circuiting on false. When you write:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;find /data &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.cache"&lt;/span&gt; &lt;span class="nt"&gt;-delete&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;it evaluates &lt;code&gt;-name "*.cache"&lt;/code&gt; first on each path. If the name does not match, it short-circuits and &lt;code&gt;-delete&lt;/code&gt; never runs on that path. When you write:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;find /data &lt;span class="nt"&gt;-delete&lt;/span&gt; &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.cache"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;it evaluates &lt;code&gt;-delete&lt;/code&gt; first. &lt;code&gt;-delete&lt;/code&gt; removes the path and returns true whenever the removal succeeds, which means the expression continues to &lt;code&gt;-name&lt;/code&gt;. The name check runs after the deletion, when its result can no longer change anything. The effect is that everything gets deleted and nothing is filtered.&lt;/p&gt;

&lt;p&gt;This is not a bug. It is exactly how the man page says &lt;code&gt;find&lt;/code&gt; works. It is just not how anyone instinctively reads a command the first few times they use it.&lt;/p&gt;

&lt;p&gt;The rule is: &lt;strong&gt;tests filter, actions act, and actions must come after the tests that are supposed to narrow them.&lt;/strong&gt; Write it in that order every time.&lt;/p&gt;
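&lt;p&gt;The ordering can be rehearsed safely by swapping &lt;code&gt;-delete&lt;/code&gt; for &lt;code&gt;-print&lt;/code&gt;, which is also an action and obeys the same left-to-right rule. This is a minimal sketch on a throwaway tree; the directory layout and filenames are invented for the demo.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Throwaway tree so nothing real is touched.
tmp=$(mktemp -d)
mkdir "$tmp/keep"
touch "$tmp/a.cache" "$tmp/keep/b.cache" "$tmp/keep/data.db"

# Action before test: -print fires on every path; -name comes too late to filter.
action_first=$(find "$tmp" -print -name "*.cache" | wc -l)

# Test before action: only matching paths ever reach -print.
test_first=$(find "$tmp" -name "*.cache" -print | wc -l)

echo "action first: $((action_first)) paths"   # 5: the root, keep/, and all three files
echo "test first:   $((test_first)) paths"     # 2: only the .cache files

# Same ordering with the real action: data.db survives.
find "$tmp" -name "*.cache" -delete
remaining=$(find "$tmp" -type f)
echo "left behind: $remaining"
rm -rf -- "$tmp"
```

&lt;p&gt;Running the command with &lt;code&gt;-print&lt;/code&gt; first and reading the list before swapping in &lt;code&gt;-delete&lt;/code&gt; is a cheap habit for any destructive &lt;code&gt;find&lt;/code&gt;.&lt;/p&gt;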

&lt;h2&gt;
  
  
  The quoting trap, right behind the ordering one
&lt;/h2&gt;

&lt;p&gt;Here is the one that is even more subtle, because it causes the wrong behavior and produces no error at all:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# WRONG — the shell expands *.cache before find ever sees it&lt;/span&gt;
find /data &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt;.cache

&lt;span class="c"&gt;# RIGHT — quote the pattern so find does the matching&lt;/span&gt;
find /data &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.cache"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If there happens to be a single &lt;code&gt;.cache&lt;/code&gt; file in your current working directory when you run this, the shell expands &lt;code&gt;*.cache&lt;/code&gt; to that one filename and passes it to &lt;code&gt;find -name&lt;/code&gt; as a literal string. &lt;code&gt;find&lt;/code&gt; then searches for files named exactly that, everywhere under &lt;code&gt;/data&lt;/code&gt;. It may find some or it may find none, but it is definitely not doing a wildcard search. If there are multiple &lt;code&gt;.cache&lt;/code&gt; files in your current directory, &lt;code&gt;find&lt;/code&gt; receives too many arguments for &lt;code&gt;-name&lt;/code&gt; and errors out with something confusing about paths needing to precede the expression.&lt;/p&gt;

&lt;p&gt;Either way you did not get what you intended, and if you added &lt;code&gt;-delete&lt;/code&gt;, you deleted the wrong things silently.&lt;/p&gt;

&lt;p&gt;Quoting the pattern is the fix. The quotes prevent the shell from expanding the glob so &lt;code&gt;find&lt;/code&gt; receives the literal &lt;code&gt;*.cache&lt;/code&gt; pattern and handles the wildcard itself, across the directory tree you pointed it at.&lt;/p&gt;
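&lt;p&gt;To watch the difference happen, here is a small sketch that plants one matching file in the working directory and two under a &lt;code&gt;data&lt;/code&gt; subdirectory. The layout and names are assumptions chosen for the demonstration.&lt;/p&gt;

```shell
start_dir=$PWD
demo=$(mktemp -d)
mkdir "$demo/data"
touch "$demo/local.cache" "$demo/data/x.cache" "$demo/data/y.cache"
cd "$demo" || exit 1

# Unquoted: the shell turns *.cache into local.cache before find runs,
# so find looks for that exact name under data/ and matches nothing.
unquoted=$(find data -name *.cache | wc -l)

# Quoted: find receives the pattern itself and matches both files.
quoted=$(find data -name "*.cache" | wc -l)

echo "unquoted matches: $unquoted, quoted matches: $quoted"
cd "$start_dir" || exit 1
rm -r "$demo"
```

&lt;p&gt;The unquoted run exits cleanly with zero matches, which is exactly why this mistake goes unnoticed.&lt;/p&gt;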

&lt;h2&gt;
  
  
  The age sign that everyone reverses at least once
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;find /var/log &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.log"&lt;/span&gt; &lt;span class="nt"&gt;-mtime&lt;/span&gt; +30   &lt;span class="c"&gt;# older than 30 days&lt;/span&gt;
find /var/log &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.log"&lt;/span&gt; &lt;span class="nt"&gt;-mtime&lt;/span&gt; &lt;span class="nt"&gt;-1&lt;/span&gt;    &lt;span class="c"&gt;# modified within the last day&lt;/span&gt;
find /var/log &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.log"&lt;/span&gt; &lt;span class="nt"&gt;-mtime&lt;/span&gt; 30    &lt;span class="c"&gt;# exactly day 30 (almost never what you want)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;+30&lt;/code&gt; is older than thirty days. &lt;code&gt;-1&lt;/code&gt; is within the last day. A bare &lt;code&gt;30&lt;/code&gt; matches only files whose age, counted in whole 24-hour periods with the fraction dropped, is exactly thirty, which is almost never the thing you're trying to match. The same rounding means &lt;code&gt;+30&lt;/code&gt; really selects files at least thirty-one days old. The sign is easy to misread: &lt;code&gt;+&lt;/code&gt; means "more than this many days ago", yet a lot of people read &lt;code&gt;+30&lt;/code&gt; as "in the last thirty days" the first time they see it.&lt;/p&gt;

&lt;p&gt;I have reversed this twice in production. Once I kept the wrong logs. Once I deleted logs I needed for an audit the following week. Neither was catastrophic but both were embarrassing, and both happened under the kind of mild time pressure where double-checking the man page feels slower than it actually is. The builder I built for this labels the output as "older than" or "within the last" in plain English next to the value, which removes the one decision in a &lt;code&gt;find -delete&lt;/code&gt; job most likely to go wrong when you are already stressed.&lt;/p&gt;
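&lt;p&gt;One way to check the sign before trusting it is to backdate two scratch files and see which test picks up which. This sketch assumes GNU &lt;code&gt;touch&lt;/code&gt;, whose &lt;code&gt;-d&lt;/code&gt; option accepts relative dates; the file names are illustrative.&lt;/p&gt;

```shell
demo=$(mktemp -d)
# Backdate the files (GNU touch syntax; an assumption about your platform).
touch -d "40 days ago" "$demo/old.log"
touch -d "2 hours ago" "$demo/new.log"

older=$(find "$demo" -name "*.log" -mtime +30 -exec basename {} \;)
recent=$(find "$demo" -name "*.log" -mtime -1 -exec basename {} \;)

echo "older than 30 days: $older"
echo "within the last day: $recent"
rm -r "$demo"
```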

&lt;h2&gt;
  
  
  The -exec batching difference nobody explains
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Runs the command once per file — slower, one PID per file&lt;/span&gt;
find /data &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.cache"&lt;/span&gt; &lt;span class="nt"&gt;-exec&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt; &lt;span class="se"&gt;\;&lt;/span&gt;

&lt;span class="c"&gt;# Batches files into one invocation — faster, one rm for many files&lt;/span&gt;
find /data &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.cache"&lt;/span&gt; &lt;span class="nt"&gt;-exec&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt; +
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;\;&lt;/code&gt; runs the command once per file. &lt;code&gt;+&lt;/code&gt; collects as many paths as it can and passes them to a single invocation of the command (or a few, if the list would exceed the system's argument-length limit), the same way &lt;code&gt;xargs&lt;/code&gt; does. For something like &lt;code&gt;rm&lt;/code&gt;, which accepts multiple arguments, the &lt;code&gt;+&lt;/code&gt; form is faster and incurs less process overhead. For something like a custom script that must process exactly one file at a time, &lt;code&gt;\;&lt;/code&gt; is correct.&lt;/p&gt;

&lt;p&gt;Most resources either do not mention this difference or mention it once in a reference table. The practical consequence is real — on a directory with ten thousand files, &lt;code&gt;\;&lt;/code&gt; spawns ten thousand processes. The &lt;code&gt;+&lt;/code&gt; form spawns a handful. On a cleanup job that runs in cron, the difference shows up in CPU load.&lt;/p&gt;
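&lt;p&gt;The process count is easy to measure on a small scale. In this sketch the command being run is a tiny inner &lt;code&gt;sh&lt;/code&gt; that prints one line each time it starts, so counting lines counts invocations. The three file names are placeholders.&lt;/p&gt;

```shell
demo=$(mktemp -d)
touch "$demo/a.cache" "$demo/b.cache" "$demo/c.cache"

# Each invocation of the inner sh prints one line, so counting lines
# counts how many processes find started.
per_file=$(find "$demo" -name "*.cache" -exec sh -c 'echo run' sh {} \; | wc -l)
batched=$(find "$demo" -name "*.cache" -exec sh -c 'echo run' sh {} + | wc -l)

echo "semicolon form: $per_file runs, plus form: $batched run"
rm -r "$demo"
```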

&lt;h2&gt;
  
  
  The preview step I skipped the day I broke things
&lt;/h2&gt;

&lt;p&gt;The most reliable way to avoid the ordering and quoting mistakes is to build the command in two steps. First, run it with &lt;code&gt;-print&lt;/code&gt; instead of &lt;code&gt;-delete&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;find /data &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"*.cache"&lt;/span&gt; &lt;span class="nt"&gt;-print&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Read every line of that output. Confirm the list is what you intended. Then, and only then, swap &lt;code&gt;-print&lt;/code&gt; for &lt;code&gt;-delete&lt;/code&gt;. This adds maybe thirty seconds to the workflow. It would have saved me forty minutes and a postmortem. I skipped it because I was confident. Confidence is not a useful substitute for verification when the operation is irreversible.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://bashsnippets.xyz/tools/find-command-builder" rel="noopener noreferrer"&gt;find command builder&lt;/a&gt; enforces this by showing a warning whenever you select &lt;code&gt;-delete&lt;/code&gt; or &lt;code&gt;-exec&lt;/code&gt; as the action, and offering to generate the &lt;code&gt;-print&lt;/code&gt; version of the command first. It is the kind of nudge I would have appreciated having that day.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the builder actually does
&lt;/h2&gt;

&lt;p&gt;It assembles the expression in the correct order, tests first and action last, so you cannot accidentally replicate the mistake I made. You set the starting path, add tests in whatever order feels natural to you (name, type, age, size, exclude path), and pick an action. The output command always has the tests before the action, regardless of the order you clicked things in.&lt;/p&gt;

&lt;p&gt;Every active flag gets a plain-English description inline. &lt;code&gt;-mtime +30&lt;/code&gt; reads as "modified more than 30 days ago." &lt;code&gt;-name "*.cache"&lt;/code&gt; reads as "name matches the glob &lt;code&gt;*.cache&lt;/code&gt;, quoted so find handles the wildcard." The &lt;code&gt;-exec {} +&lt;/code&gt; form is the default when you pick &lt;code&gt;-exec&lt;/code&gt;, with a note explaining why it is faster.&lt;/p&gt;

&lt;p&gt;The whole point is to get from "I need to find and delete files matching these conditions" to a verified, copy-paste command without the thirty-second loop of man-page reading and second-guessing that I used to do and, on one bad morning, skipped.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I upgraded while I was at it
&lt;/h2&gt;

&lt;p&gt;The same discipline applies to the other tools. The rsync command builder now has presets for the three setups people build most often (local backup, push to remote, mirror), because those cover maybe ninety percent of rsync jobs and getting the flags right from scratch every time is where the dangerous ones like &lt;code&gt;--delete&lt;/code&gt; get misapplied. The cron builder previews the next five run times after you build an expression, because a cron job I once deployed ran at 3am UTC instead of 3am local time and I did not notice until it had fired on the wrong schedule for a week. You can also paste an existing crontab line and get the human-readable schedule back. The chmod builder now accepts an octal you paste in and sets the checkboxes. It works in both directions because reading a file's permissions and understanding what they mean is just as common a task as setting them from scratch.&lt;/p&gt;

&lt;p&gt;These are not features I planned in advance. They are all things I needed at 2am and did not have. That is the pattern most of this site runs on.&lt;/p&gt;

&lt;p&gt;Build a &lt;code&gt;find&lt;/code&gt; command with tests ordered before actions and every flag explained: &lt;a href="https://bashsnippets.xyz/tools/find-command-builder" rel="noopener noreferrer"&gt;https://bashsnippets.xyz/tools/find-command-builder&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you are scoping files before searching or transforming them, the whole pipeline is documented in &lt;a href="https://bashsnippets.xyz/guides/bash-text-processing" rel="noopener noreferrer"&gt;Bash Text Processing: find, grep, sed, and awk&lt;/a&gt;. The rest of the free tools are at &lt;a href="https://bashsnippets.xyz/tools" rel="noopener noreferrer"&gt;https://bashsnippets.xyz/tools&lt;/a&gt;&lt;/p&gt;

</description>
      <category>bash</category>
      <category>linux</category>
      <category>devops</category>
      <category>sysadmin</category>
    </item>
    <item>
      <title>A for Loop Skipped 23 Files and Called It a Successful Backup</title>
      <dc:creator>Anguishe</dc:creator>
      <pubDate>Sun, 14 Jun 2026 17:26:03 +0000</pubDate>
      <link>https://dev.to/bashsnippets/a-for-loop-skipped-23-files-and-called-it-a-successful-backup-2j3f</link>
      <guid>https://dev.to/bashsnippets/a-for-loop-skipped-23-files-and-called-it-a-successful-backup-2j3f</guid>
      <description>&lt;p&gt;The backup ran every night at 2am and emailed me a green "847 files archived" summary. I'd built it, tested it against my own home directory where every file was named like &lt;code&gt;report_2024.csv&lt;/code&gt;, watched it sail through, and shipped it. For weeks the summary said everything was fine.&lt;/p&gt;

&lt;p&gt;Then a coworker asked me to restore a file. &lt;code&gt;Q3 forecast.xlsx&lt;/code&gt;. It wasn't in the archive. Neither was &lt;code&gt;Annual Review FINAL.pdf&lt;/code&gt;, or &lt;code&gt;meeting notes (draft).md&lt;/code&gt;, or any of the rest of the 23 files someone had named the way normal humans name files: with spaces in them. The backup had been quietly skipping the most important files on the share for a week, and the nightly email had been telling me the whole time that nothing was wrong.&lt;/p&gt;

&lt;p&gt;Here's the line that did it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;for &lt;/span&gt;file &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DIR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Processing &lt;/span&gt;&lt;span class="nv"&gt;$file&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The problem is word splitting, and it's invisible until the day it isn't. An unquoted &lt;code&gt;$(ls ...)&lt;/code&gt; splits its output on every space, tab, and newline, so &lt;code&gt;Q3 forecast.xlsx&lt;/code&gt; doesn't arrive as one filename. It arrives as two iterations, &lt;code&gt;Q3&lt;/code&gt; and &lt;code&gt;forecast.xlsx&lt;/code&gt;. Neither one exists on disk. The loop tries both, finds nothing, shrugs, and moves on without a single error. The script "succeeds" because from its point of view it did exactly what it was told.&lt;/p&gt;

&lt;p&gt;I've now got two habits burned in, and I don't write a loop without both. The first: glob the directory, never parse &lt;code&gt;ls&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;for &lt;/span&gt;file &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DIR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;/&lt;span class="k"&gt;*&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$file&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;   &lt;span class="c"&gt;# on an empty dir the unmatched glob stays as the literal pattern&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Processing &lt;/span&gt;&lt;span class="nv"&gt;$file&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;"$DIR"/*&lt;/code&gt; hands you each real path as a single unit. No subshell, no string to re-split, no &lt;code&gt;ls&lt;/code&gt; output to misread. The &lt;code&gt;[[ -e "$file" ]] || continue&lt;/code&gt; guard covers the one quirk of globbing: when a directory is empty, the pattern matches nothing and bash leaves it unexpanded, so without the guard the loop would run once on a nonexistent path ending in a literal &lt;code&gt;*&lt;/code&gt;.&lt;/p&gt;
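&lt;p&gt;Counting iterations makes both behaviors concrete. This is a sketch against scratch directories with made-up file names; it uses the single-bracket &lt;code&gt;[ -e ]&lt;/code&gt; test, which behaves the same as the &lt;code&gt;[[ -e ]]&lt;/code&gt; guard for this purpose, and it assumes the default &lt;code&gt;IFS&lt;/code&gt;.&lt;/p&gt;

```shell
demo=$(mktemp -d)
touch "$demo/Q3 forecast.xlsx" "$demo/report_2024.csv"

# Parsing ls: the name with a space is split into two words.
ls_count=0
for file in $(ls "$demo"); do
  ls_count=$((ls_count + 1))
done

# Globbing: one iteration per real file.
glob_count=0
for file in "$demo"/*; do
  [ -e "$file" ] || continue
  glob_count=$((glob_count + 1))
done

# Empty directory: the guard skips the unexpanded pattern.
empty=$(mktemp -d)
empty_count=0
for file in "$empty"/*; do
  [ -e "$file" ] || continue
  empty_count=$((empty_count + 1))
done

echo "ls: $ls_count, glob: $glob_count, empty: $empty_count"
rm -r "$demo" "$empty"
```

&lt;p&gt;Two files on disk produce three iterations through &lt;code&gt;ls&lt;/code&gt; and two through the glob, and the empty directory produces none once the guard is in place.&lt;/p&gt;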

&lt;p&gt;The second habit: quote every expansion, every time. &lt;code&gt;"$file"&lt;/code&gt;, never &lt;code&gt;$file&lt;/code&gt;. The day a filename has a space in it, the unquoted version splits into two arguments and your command operates on paths that were never there.&lt;/p&gt;

&lt;p&gt;Arrays follow the exact same rule, and the distinction that matters is &lt;code&gt;[@]&lt;/code&gt; versus &lt;code&gt;[*]&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;servers&lt;/span&gt;&lt;span class="o"&gt;=(&lt;/span&gt;&lt;span class="s2"&gt;"web-01"&lt;/span&gt; &lt;span class="s2"&gt;"db-prod 02"&lt;/span&gt; &lt;span class="s2"&gt;"cache-03"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for &lt;/span&gt;host &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;servers&lt;/span&gt;&lt;span class="p"&gt;[@]&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Connecting to &lt;/span&gt;&lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;   &lt;span class="c"&gt;# three clean iterations, the space survives&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;"${servers[@]}"&lt;/code&gt; in double quotes gives you one word per element — &lt;code&gt;db-prod 02&lt;/code&gt; stays whole. &lt;code&gt;"${servers[*]}"&lt;/code&gt; joins everything into a single string and is almost never what you want in a loop. Drop the quotes on either and you're back to the bug that ate my backup.&lt;/p&gt;

&lt;p&gt;When you genuinely need a counter (walking &lt;code&gt;app.log.1&lt;/code&gt; through &lt;code&gt;app.log.9&lt;/code&gt;, numbering batches, counting retries), use the C-style form. Do not reach for &lt;code&gt;for i in {1..$n}&lt;/code&gt;; brace expansion runs &lt;em&gt;before&lt;/em&gt; variable expansion, so the braces never see a number: bash leaves them alone, substitutes &lt;code&gt;$n&lt;/code&gt; afterwards, and you loop once over literal text such as &lt;code&gt;{1..9}&lt;/code&gt;. Ask me how I know.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt;i &lt;span class="o"&gt;=&lt;/span&gt; 1&lt;span class="p"&gt;;&lt;/span&gt; i &amp;lt;&lt;span class="o"&gt;=&lt;/span&gt; MAX_ROTATED&lt;span class="p"&gt;;&lt;/span&gt; i++&lt;span class="o"&gt;))&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nv"&gt;log&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOG_DIR&lt;/span&gt;&lt;span class="s2"&gt;/app.log.&lt;/span&gt;&lt;span class="nv"&gt;$i&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$log&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="k"&gt;continue
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Scanning &lt;/span&gt;&lt;span class="nv"&gt;$log&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
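&lt;p&gt;For anyone who wants to see the brace-expansion pitfall rather than take it on faith, this short sketch records what the loop variable actually holds. The variable names are illustrative.&lt;/p&gt;

```shell
n=3
seen=""
passes=0
# Brace expansion cannot use $n, so the word is left as-is and only
# afterwards does $n get substituted into it.
for i in {1..$n}; do
  seen=$i
  passes=$((passes + 1))
done
echo "passes: $passes, value: $seen"
```

&lt;p&gt;The loop body runs once, and the value it sees is the text &lt;code&gt;{1..3}&lt;/code&gt;, not a number.&lt;/p&gt;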



&lt;p&gt;And reading a file line by line — &lt;code&gt;for line in $(cat file)&lt;/code&gt; is wrong in two directions at once. It splits on whitespace instead of newlines, so a line with spaces becomes several iterations and blank lines vanish, and it globs, so a line containing &lt;code&gt;*&lt;/code&gt; expands to filenames in your current directory. The form that survives real files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nv"&gt;IFS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; line&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Line: &lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt; &amp;lt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$INPUT_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;IFS=&lt;/code&gt; stops bash from trimming leading and trailing whitespace. &lt;code&gt;-r&lt;/code&gt; stops it from eating backslashes. One line per iteration, exactly as written.&lt;/p&gt;

&lt;p&gt;The production script I keep on the site ties all of this together — globs the directory, quotes every use, counts successes and failures separately, and exits non-zero if anything failed. That last part is the one people skip, and it's the one that matters: a loop that processes files but always exits 0 is how you end up with a nightly email that says 847 when the real number is 824.&lt;/p&gt;

&lt;p&gt;The cost of my mistake was a week of bad backups and the specific discomfort of a coworker finding the gap before my own tooling did. I'm not interested in repeating that, so the quoting is muscle memory now. Yours might as well be too.&lt;/p&gt;

&lt;p&gt;Full script with all four loop forms — files, arrays, counters, and line-by-line reads — production-ready and ShellCheck-clean: &lt;a href="https://bashsnippets.xyz/snippets/bash-for-loop-examples" rel="noopener noreferrer"&gt;https://bashsnippets.xyz/snippets/bash-for-loop-examples&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If your loops are the thing crashing scripts, the &lt;a href="https://bashsnippets.xyz/snippets/bash-error-handling" rel="noopener noreferrer"&gt;Bash Error Handling&lt;/a&gt; snippet pairs with this one, and the rest of the library is at &lt;a href="https://bashsnippets.xyz" rel="noopener noreferrer"&gt;https://bashsnippets.xyz&lt;/a&gt;&lt;/p&gt;

</description>
      <category>bash</category>
      <category>linux</category>
      <category>devops</category>
      <category>sysadmin</category>
    </item>
    <item>
      <title>The Pipeline Was Green for Three Weeks. It Had Been Shipping a Build That Never Compiled.</title>
      <dc:creator>Anguishe</dc:creator>
      <pubDate>Fri, 12 Jun 2026 16:27:11 +0000</pubDate>
      <link>https://dev.to/bashsnippets/the-pipeline-was-green-for-three-weeks-it-had-been-shipping-a-build-that-never-compiled-3k91</link>
      <guid>https://dev.to/bashsnippets/the-pipeline-was-green-for-three-weeks-it-had-been-shipping-a-build-that-never-compiled-3k91</guid>
      <description>&lt;p&gt;For three weeks a deployment pipeline reported every step green and shipped a build that had failed to compile on every single run. The build step ended in &lt;code&gt;npm run build | tee build.log&lt;/code&gt; so the output could be archived. That pipe is the whole story: bash returns the exit status of the &lt;em&gt;last&lt;/em&gt; command in a pipeline, which was &lt;code&gt;tee&lt;/code&gt;, and &lt;code&gt;tee&lt;/code&gt; always succeeds at copying text. The compiler's non-zero exit got thrown away the instant the pipe handed off. The error was sitting right there in &lt;code&gt;build.log&lt;/code&gt;. GitHub Actions saw exit code 0, painted the step green, and deployed the broken artifact. Nobody read the log, because the checkmark said there was nothing to read.&lt;/p&gt;

&lt;p&gt;That's the defining property of bash in CI, and it's why I treat pipeline scripts differently from anything I run in a terminal: &lt;strong&gt;a silent failure can present as success.&lt;/strong&gt; On a server you watch a command fail in front of you. In a pipeline, a swallowed exit code produces a green checkmark over broken code, and the gap between "the logs show an error" and "the pipeline reports failure" is exactly where outages are born. I wrote the full guide because I've now been burned by every variation of this, and there's a consistent set of habits that close the gap.&lt;/p&gt;

&lt;p&gt;There are four failure modes that are specific to CI and barely ever bite you at an interactive prompt. &lt;strong&gt;Exit codes swallowed by a pipe&lt;/strong&gt; — the story above, any &lt;code&gt;command | tee&lt;/code&gt;, &lt;code&gt;command | grep&lt;/code&gt;, &lt;code&gt;command | sort&lt;/code&gt;. &lt;strong&gt;Shell provisioning differences&lt;/strong&gt; — &lt;code&gt;ubuntu-latest&lt;/code&gt; gives you bash 5.x, &lt;code&gt;macos-latest&lt;/code&gt; gives you bash 3.2 from 2007, and a script using associative arrays or &lt;code&gt;${var,,}&lt;/code&gt; passes on one runner and throws a syntax error on the other in the same workflow. &lt;strong&gt;Environment variable gaps&lt;/strong&gt; — CI sets variables you don't control and omits ones you assume exist, and without &lt;code&gt;set -u&lt;/code&gt; a missing &lt;code&gt;$DEPLOY_TARGET&lt;/code&gt; becomes an empty string and does something quietly wrong. &lt;strong&gt;Interactive-shell assumptions&lt;/strong&gt; — CI runs a non-interactive, non-login shell that never sources your &lt;code&gt;.bashrc&lt;/code&gt;, so a command that works when you type it dies with &lt;code&gt;command not found&lt;/code&gt; because the thing that defined it was never loaded.&lt;/p&gt;

&lt;p&gt;The header that closes most of these is short, and every line earns its place:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-euo&lt;/span&gt; pipefail
&lt;span class="nv"&gt;IFS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;$'&lt;/span&gt;&lt;span class="se"&gt;\n\t&lt;/span&gt;&lt;span class="s1"&gt;'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;set -e&lt;/code&gt; exits the moment a command fails (outside of conditions such as &lt;code&gt;if&lt;/code&gt; tests and &lt;code&gt;||&lt;/code&gt; lists), so the step exits non-zero and the workflow actually registers a failure. &lt;code&gt;set -u&lt;/code&gt; treats an unset variable as an error, so a typo'd &lt;code&gt;$DPLOY_TARGET&lt;/code&gt; dies immediately instead of expanding to nothing and corrupting a path. &lt;code&gt;set -o pipefail&lt;/code&gt; makes a pipeline return the exit status of the rightmost command that failed, instead of always reporting the last command's status. That one flag is the direct fix for the &lt;code&gt;| tee&lt;/code&gt; bug that ran green for three weeks.&lt;/p&gt;
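&lt;p&gt;The effect of &lt;code&gt;pipefail&lt;/code&gt; can be checked in a few lines. In this sketch &lt;code&gt;false&lt;/code&gt; and &lt;code&gt;(exit N)&lt;/code&gt; stand in for a failing build step and &lt;code&gt;true&lt;/code&gt; stands in for &lt;code&gt;tee&lt;/code&gt;; each case runs in its own &lt;code&gt;bash -c&lt;/code&gt; so the option does not leak into your shell.&lt;/p&gt;

```shell
# Each snippet runs in its own bash so the options do not leak out.
default_rc=$(bash -c 'false | true; echo $?')
pipefail_rc=$(bash -c 'set -o pipefail; false | true; echo $?')
rightmost_rc=$(bash -c 'set -o pipefail; (exit 3) | (exit 5) | true; echo $?')

echo "default: $default_rc, pipefail: $pipefail_rc, rightmost failure: $rightmost_rc"
```

&lt;p&gt;Without the option the failure disappears. With it, the pipeline reports the status of the rightmost command that failed.&lt;/p&gt;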

&lt;p&gt;Secrets deserve their own paragraph because CI hands you &lt;code&gt;env:&lt;/code&gt; values and &lt;code&gt;secrets:&lt;/code&gt; values identically — the shell can't tell them apart, the only difference is that Actions masks the secret's literal string in the log. The trap is that the moment you transform a secret (base64-decode it, slice it, interpolate it), the transformed value no longer matches the mask and prints in clear text. Validate required values up front with &lt;code&gt;${VAR:?}&lt;/code&gt; so a missing secret fails at startup with a clear message instead of on line 47 with a cryptic &lt;code&gt;permission denied&lt;/code&gt;, and be very careful with &lt;code&gt;set -x&lt;/code&gt; in any step that touches a secret.&lt;/p&gt;

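&lt;p&gt;The pipe-exit-code problem is worth one concrete tool beyond &lt;code&gt;pipefail&lt;/code&gt;: &lt;code&gt;PIPESTATUS&lt;/code&gt; is an array holding the exit code of every command in the most recent pipeline. Read it on the very next line, because the next command overwrites it. With &lt;code&gt;set -e&lt;/code&gt; and &lt;code&gt;pipefail&lt;/code&gt; both on, a failing pipeline aborts the script before that line runs, so this pattern belongs where you want to inspect the failure yourself:&lt;br&gt;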
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm run build | &lt;span class="nb"&gt;tee &lt;/span&gt;build.log
&lt;span class="nv"&gt;build_rc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;PIPESTATUS&lt;/span&gt;&lt;span class="p"&gt;[0]&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;   &lt;span class="c"&gt;# npm's code, not tee's&lt;/span&gt;
&lt;span class="o"&gt;[[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$build_rc&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-eq&lt;/span&gt; 0 &lt;span class="o"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"build failed: &lt;/span&gt;&lt;span class="nv"&gt;$build_rc&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&amp;amp;2&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nb"&gt;exit&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$build_rc&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;pipefail&lt;/code&gt; has one well-known false positive — &lt;code&gt;grep&lt;/code&gt; returns 1 when it finds no matches, which is often fine, and under &lt;code&gt;pipefail&lt;/code&gt; plus &lt;code&gt;set -e&lt;/code&gt; that aborts the script. Absorb it deliberately with &lt;code&gt;|| true&lt;/code&gt; only where a non-match is genuinely acceptable, and nowhere else, because blanketing every command in &lt;code&gt;|| true&lt;/code&gt; just reinvents the silent-success problem you're trying to kill.&lt;/p&gt;

&lt;p&gt;Docker has its own landmine: every entrypoint script must end with &lt;code&gt;exec "$@"&lt;/code&gt;. Without it, your script stays PID 1 and your app runs as a child, so when the orchestrator sends SIGTERM on &lt;code&gt;docker stop&lt;/code&gt; or a rolling deploy, the signal hits the &lt;em&gt;script&lt;/em&gt;, which doesn't forward it, and after the grace period the orchestrator escalates to SIGKILL — abrupt termination, dropped connections, lost in-flight work. &lt;code&gt;exec "$@"&lt;/code&gt; replaces the shell with your app so it &lt;em&gt;becomes&lt;/em&gt; PID 1 and receives signals directly. The guide pairs this with a &lt;code&gt;wait_for&lt;/code&gt; dependency-check pattern and a trap, and the &lt;a href="https://bashsnippets.xyz/tools/bash-trap-builder" rel="noopener noreferrer"&gt;Bash trap &amp;amp; Signal Handler Builder&lt;/a&gt; generates the exact signal block an entrypoint needs.&lt;/p&gt;
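&lt;p&gt;What &lt;code&gt;exec&lt;/code&gt; does to the process can be seen outside Docker. This is a sketch, not an entrypoint: an outer shell prints its PID, then uses &lt;code&gt;exec&lt;/code&gt; to hand over to a second shell that prints its own.&lt;/p&gt;

```shell
# The outer sh prints its PID, then exec replaces it with a new sh,
# which prints its own PID. No new process is created in between.
pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"')
before=$(echo "$pids" | head -n 1)
after=$(echo "$pids" | tail -n 1)
echo "before exec: $before, after exec: $after"
```

&lt;p&gt;Both numbers match, which is the property an entrypoint relies on: the program started by &lt;code&gt;exec&lt;/code&gt; inherits the script's PID, and in a container that PID is 1.&lt;/p&gt;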

&lt;p&gt;Deploys get the same treatment: deploy into a fresh timestamped directory, flip a &lt;code&gt;current&lt;/code&gt; symlink atomically with &lt;code&gt;ln -sfn&lt;/code&gt; so traffic never sees a half-written release, keep the last several releases so rollback is just re-pointing the symlink, run a health check after the swap and fail the deploy if it doesn't pass, and stamp the git SHA into the release so "what's running right now" always has an answer. And when a step fails and the logs won't say why, &lt;code&gt;set -x&lt;/code&gt; around just the suspect section shows you each command with its variables expanded — a doubled slash or an empty segment in the trace is usually your bug standing in plain sight.&lt;/p&gt;

&lt;p&gt;The full guide is the field manual version of all of this — the four failure modes, the safe header, secret validation with a &lt;code&gt;validate_env&lt;/code&gt; function, &lt;code&gt;PIPESTATUS&lt;/code&gt; and &lt;code&gt;pipefail&lt;/code&gt;, Docker entrypoints, the atomic-symlink deploy script with rollback and health check, debugging with &lt;code&gt;set -x&lt;/code&gt;, and a production-ready checklist at the end: &lt;a href="https://bashsnippets.xyz/guides/bash-scripting-for-ci-cd-pipelines" rel="noopener noreferrer"&gt;https://bashsnippets.xyz/guides/bash-scripting-for-ci-cd-pipelines&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If your pipeline scripts are dying on the small stuff first — unquoted loops, bad argument parsing, missing traps — the snippet library that feeds into this guide is at &lt;a href="https://bashsnippets.xyz" rel="noopener noreferrer"&gt;https://bashsnippets.xyz&lt;/a&gt;&lt;/p&gt;

</description>
      <category>bash</category>
      <category>devops</category>
      <category>cicd</category>
      <category>docker</category>
    </item>
    <item>
      <title>My Script Crashed and Left a Lock File Behind. Every Run After That Refused to Start.</title>
      <dc:creator>Anguishe</dc:creator>
      <pubDate>Thu, 11 Jun 2026 16:56:45 +0000</pubDate>
      <link>https://dev.to/bashsnippets/my-script-crashed-and-left-a-lock-file-behind-every-run-after-that-refused-to-start-1ik4</link>
      <guid>https://dev.to/bashsnippets/my-script-crashed-and-left-a-lock-file-behind-every-run-after-that-refused-to-start-1ik4</guid>
      <description>&lt;p&gt;A backup script of mine created a lock file on startup so two copies couldn't run at once — sensible. Then one night it hit an error partway through, &lt;code&gt;set -e&lt;/code&gt; killed it on the spot, and it died without ever reaching the line that removes the lock. The lock file sat there. Every scheduled run for the next three days started up, saw the lock, printed "already running," and exited immediately. No backups ran. The cron job was firing perfectly on time and doing nothing, and the only symptom was an absence — backups that simply weren't there — until I went looking and found a stale lock from Tuesday.&lt;/p&gt;

&lt;p&gt;The fix is a &lt;code&gt;trap&lt;/code&gt;. An EXIT trap registers a cleanup handler that bash runs when the script exits for almost any reason: clean finish, &lt;code&gt;set -e&lt;/code&gt; failure, Ctrl+C, a plain &lt;code&gt;kill&lt;/code&gt;. The one exception is &lt;code&gt;kill -9&lt;/code&gt;, which no process can trap. Put the lock removal in an EXIT trap and it runs however the script dies, short of that. The lock would have been gone the instant that backup crashed, and the next run would have started fine.&lt;/p&gt;
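&lt;p&gt;A stripped-down sketch of that failure and its fix: a child script takes a lock, registers cleanup on EXIT, and then dies under &lt;code&gt;set -e&lt;/code&gt;. The &lt;code&gt;LOCK_FILE&lt;/code&gt; variable and the &lt;code&gt;cleanup&lt;/code&gt; function are hypothetical names for this example, not the code from my backup script.&lt;/p&gt;

```shell
lock=$(mktemp)            # stands in for the lock file
export LOCK_FILE="$lock"

bash -c '
  cleanup() { rm -f "$LOCK_FILE"; }
  trap cleanup EXIT
  set -e
  false                   # simulated mid-run failure
  echo "never reached"
' || true

if [ -e "$lock" ]; then lock_state=stale; else lock_state=removed; fi
echo "lock is $lock_state"
```

&lt;p&gt;The child exits on the failing command without reaching its last line, and the lock is still gone afterwards because the handler ran on the way out.&lt;/p&gt;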

&lt;p&gt;So why did I write the script without one? Because I could never remember the syntax cold. Single quotes or double? Which signals? Does EXIT fire on &lt;code&gt;exit 1&lt;/code&gt; or only on a clean finish? How do I get the exit code inside the handler? Every time I needed a trap I ended up with three browser tabs open, reading the same Stack Overflow answers, second-guessing the quoting. Under pressure, in the middle of fixing something else, that friction is exactly when people skip the trap entirely — which is how I ended up with the stale lock in the first place.&lt;/p&gt;

&lt;p&gt;So I built the thing I kept wishing existed: a &lt;strong&gt;Bash trap &amp;amp; Signal Handler Builder&lt;/strong&gt;. You pick the signals you want to handle, check off the cleanup actions you need, and it writes a correct, ShellCheck-clean trap block you paste into your script. No tabs, no second-guessing the quoting.&lt;/p&gt;

&lt;p&gt;It covers the signals that actually come up. &lt;strong&gt;EXIT&lt;/strong&gt;: fires on every exit, the one that should carry your cleanup. &lt;strong&gt;ERR&lt;/strong&gt;: fires when a command exits non-zero (the same failures &lt;code&gt;set -e&lt;/code&gt; aborts on), the one that logs the exact failing line with &lt;code&gt;$LINENO&lt;/code&gt;. &lt;strong&gt;INT&lt;/strong&gt;: Ctrl+C. &lt;strong&gt;TERM&lt;/strong&gt;: what &lt;code&gt;kill&lt;/code&gt;, &lt;code&gt;systemctl stop&lt;/code&gt;, and &lt;code&gt;docker stop&lt;/code&gt; send by default. &lt;strong&gt;HUP&lt;/strong&gt;: terminal closed or SSH dropped. &lt;strong&gt;PIPE&lt;/strong&gt;: writing to a closed pipe. Each one has a one-line reminder of when it fires and what it's good for, because half the battle is just remembering that TERM is the one Docker sends.&lt;/p&gt;
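&lt;p&gt;To make the EXIT/ERR split concrete, here is a small sketch, assuming bash. It is not the builder's output; &lt;code&gt;demo_err&lt;/code&gt; is a hypothetical name, and the failure is simulated with &lt;code&gt;false&lt;/code&gt;.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch: ERR reports where things broke, EXIT carries the cleanup.
# demo_err is a hypothetical name used only for this illustration.
demo_err() (
  # -e exits on failure; -E makes functions and subshells called
  # from here inherit the ERR trap as well.
  set -eE
  trap 'echo "ERR: command failed at line $LINENO"' ERR
  trap 'echo "EXIT: cleaning up (status $?)"' EXIT
  echo "working"
  false               # simulated failing command
  echo "not reached"
)

demo_err
```

&lt;p&gt;Both handlers are single-quoted, so &lt;code&gt;$LINENO&lt;/code&gt; and &lt;code&gt;$?&lt;/code&gt; are read when the trap fires, not when it is defined.&lt;/p&gt;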

&lt;p&gt;The cleanup actions are the things people forget until a crash makes them care: remove temp files (with the &lt;code&gt;TMPFILE=$(mktemp)&lt;/code&gt; declaration wired in up top), remove a lock file — the exact failure that bit me — stop background jobs the script started, log the exit reason with the code, and restore the terminal cursor. Tick the ones you need and they land in the handler.&lt;/p&gt;

&lt;p&gt;The generated code isn't a toy snippet. It single-quotes the trap so the handler resolves when the signal fires instead of at definition time — the SC2064 gotcha most hand-written traps get wrong. It includes an idempotency guard so that when ERR and EXIT both fire on the same failure, your cleanup runs exactly once instead of twice. The ERR handler captures &lt;code&gt;$LINENO&lt;/code&gt; so you find out which line actually blew up. I ran the generator's output through ShellCheck across a dozen different configurations while building it, and every one comes back clean — the whole point was that you can paste it and trust it, not paste it and then go debug the thing that was supposed to save you debugging.&lt;/p&gt;
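&lt;p&gt;Both gotchas are easy to see in a few lines. This is a sketch under my own naming (&lt;code&gt;TARGET&lt;/code&gt;, &lt;code&gt;CLEANED&lt;/code&gt; and the three function names are invented for the demo), not a copy of what the generator emits.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch only: TARGET, CLEANED and the function names are made up.

# Double quotes: $TARGET is expanded when the trap is DEFINED (SC2064).
expanded_early() (
  TARGET="first.tmp"
  trap "echo \"would remove $TARGET\"" EXIT
  TARGET="second.tmp"
)

# Single quotes: $TARGET is expanded when the trap FIRES.
expanded_late() (
  TARGET="first.tmp"
  trap 'echo "would remove $TARGET"' EXIT
  TARGET="second.tmp"
)

# Guard so the cleanup body runs once even though ERR and EXIT both fire.
run_once() (
  set -e
  CLEANED=0
  cleanup() {
    if [ "$CLEANED" -eq 1 ]; then return 0; fi
    CLEANED=1
    echo "cleanup ran"
  }
  trap 'cleanup' ERR EXIT
  false   # triggers ERR, then set -e exits and EXIT fires
)

expanded_early   # prints: would remove first.tmp
expanded_late    # prints: would remove second.tmp
run_once         # prints "cleanup ran" a single time
```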

&lt;p&gt;There are two copy buttons, because there are two situations. "Copy trap block only" when you've already got a script and just want to drop the traps in. "Copy complete script header" when you're starting fresh and want the shebang, &lt;code&gt;set -euo pipefail&lt;/code&gt;, the &lt;code&gt;CHECK&lt;/code&gt;/&lt;code&gt;CROSS&lt;/code&gt; vars, the resource declarations, and the handler all assembled in the right order. It only declares the variables the generated code actually uses, so you never paste in a &lt;code&gt;CROSS&lt;/code&gt; you never reference and get a ShellCheck warning for your trouble.&lt;/p&gt;

&lt;p&gt;The lock-file incident cost me three days of silently missing backups and ten minutes of feeling foolish when I found the cause. The trap that would have prevented it is four lines. The reason I didn't have those four lines was pure friction — I couldn't recall the syntax fast enough to be bothered in the moment. This tool removes the friction, which is the only thing that was ever standing between me and a correct script.&lt;/p&gt;

&lt;p&gt;Build your trap block here — pick signals, pick cleanup actions, copy clean code: &lt;a href="https://bashsnippets.xyz/tools/bash-trap-builder" rel="noopener noreferrer"&gt;https://bashsnippets.xyz/tools/bash-trap-builder&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you want the full picture of where traps fit — strict mode, cleanup, the failure modes that make them necessary — the &lt;a href="https://bashsnippets.xyz/snippets/bash-error-handling" rel="noopener noreferrer"&gt;Bash Error Handling&lt;/a&gt; snippet is the companion read, and the rest of the tools are at &lt;a href="https://bashsnippets.xyz/tools" rel="noopener noreferrer"&gt;https://bashsnippets.xyz/tools&lt;/a&gt;&lt;/p&gt;

</description>
      <category>bash</category>
      <category>linux</category>
      <category>devops</category>
      <category>beginners</category>
    </item>
    <item>
      <title>I Packaged the Scripts I Copy to Every New Server Into a $9 Toolkit. Here's What's In It and Why.</title>
      <dc:creator>Anguishe</dc:creator>
      <pubDate>Tue, 09 Jun 2026 15:44:19 +0000</pubDate>
      <link>https://dev.to/bashsnippets/i-packaged-the-scripts-i-copy-to-every-new-server-into-a-9-toolkit-heres-whats-in-it-and-why-cn6</link>
      <guid>https://dev.to/bashsnippets/i-packaged-the-scripts-i-copy-to-every-new-server-into-a-9-toolkit-heres-whats-in-it-and-why-cn6</guid>
      <description>&lt;p&gt;Every time I provision a new server — whether it's a $5 DigitalOcean droplet for a side project or a client's production box — there's a set of scripts I copy to &lt;code&gt;/opt/scripts&lt;/code&gt; before I do anything else.&lt;/p&gt;

&lt;p&gt;Not after the app is deployed. Not after the first incident. Before I touch the application at all. Before I configure nginx. Before I set up the database. The monitoring layer goes in first because the first time you need it, you needed it yesterday.&lt;/p&gt;

&lt;p&gt;Disk monitoring that fires before the outage. A backup pipeline with automatic retention. SSL certificate checks that run daily at 8am so I'm not finding out from a user's email. A service watchdog that restarts nginx or Postgres within 60 seconds of a crash instead of six hours later when someone notices the site is down.&lt;/p&gt;

&lt;p&gt;These scripts took me about two years of production incidents to build. Not because they're complicated — they're not. Each one is 15-50 lines. Because each one was built in response to a specific failure where I didn't have the thing I needed and had to build it under pressure at a bad time of day.&lt;/p&gt;

&lt;p&gt;I open-sourced the basic versions on BashSnippets.xyz. Those are free and they'll stay free. But the versions I actually run in production are different from the tutorial versions in ways that matter — and that gap is what I packaged into the toolkit.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Different Between the Free Snippets and the Toolkit
&lt;/h2&gt;

&lt;p&gt;The free snippets on the site are single-purpose scripts that each solve one problem. They're complete. They work. If all you need is a disk space check, the free version does that.&lt;/p&gt;

&lt;p&gt;The toolkit versions are built as a system.&lt;/p&gt;

&lt;p&gt;There's a shared library — &lt;code&gt;bashlib.sh&lt;/code&gt; — with 31 functions that every script sources. Logging, color output, email alerts, error handling, lock file management, threshold checks, dry-run support. Instead of each script reimplementing its own &lt;code&gt;log()&lt;/code&gt; function and its own error handling and its own email logic, they all call &lt;code&gt;bashlib.sh&lt;/code&gt; and get consistent behavior across the board.&lt;/p&gt;

&lt;p&gt;That means when I change how logging works, it changes everywhere. When I add Slack webhook support to the alert function, every script that calls &lt;code&gt;alert()&lt;/code&gt; gets Slack notifications without any changes to the script itself. The shared library is the infrastructure layer that turns six standalone scripts into a cohesive system.&lt;/p&gt;

&lt;p&gt;The free snippets don't have this because a shared library adds a dependency — &lt;code&gt;bashlib.sh&lt;/code&gt; has to exist at a known path, the scripts have to source it at startup, and if someone downloads one script without the library it breaks. That's fine for a toolkit you install as a package. It's bad for a tutorial page where someone wants to copy-paste one script and have it work immediately. Both approaches are correct for their context.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's in the Box
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;6 production scripts:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each one follows the same structure — &lt;code&gt;set -euo pipefail&lt;/code&gt;, sourcing &lt;code&gt;bashlib.sh&lt;/code&gt;, named variables for every threshold and path (no magic numbers buried in command pipelines), comments explaining not just what each line does but why it exists, and explicit non-zero exits on every failure path.&lt;/p&gt;
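&lt;p&gt;As a rough picture of that structure, here is a stripped-down disk check. It is a sketch, not one of the toolkit scripts: it does not source &lt;code&gt;bashlib.sh&lt;/code&gt;, and &lt;code&gt;THRESHOLD_PERCENT&lt;/code&gt;, &lt;code&gt;MOUNT_POINT&lt;/code&gt;, &lt;code&gt;usage_percent&lt;/code&gt; and &lt;code&gt;check_disk&lt;/code&gt; are illustrative names and values.&lt;/p&gt;

```shell
#!/usr/bin/env bash
set -euo pipefail

# Named settings up top instead of magic numbers inside the pipeline.
# Both values are illustrative assumptions, not the toolkit's defaults.
THRESHOLD_PERCENT=90
MOUNT_POINT="/"

# Print the used-space percentage of a mount point as a bare integer.
# df -P forces the POSIX one-line-per-filesystem layout.
usage_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return non-zero when usage is at or above the threshold.
check_disk() {
  local mount="$1" threshold="$2" used
  used="$(usage_percent "$mount")"
  if [ "$used" -ge "$threshold" ]; then
    echo "ALERT: $mount is at ${used}% (threshold ${threshold}%)"
    return 1
  fi
  echo "OK: $mount is at ${used}%"
}

status=0
check_disk "$MOUNT_POINT" "$THRESHOLD_PERCENT" || status=$?
echo "check finished with status $status"
# A real script would finish with: exit "$status"
```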

&lt;p&gt;The scripts cover disk space monitoring, database backup with retention, SSL certificate expiry checking across multiple domains, service watchdog with automatic restart, log rotation and cleanup, and system health reporting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;bashlib.sh — the shared library (31 functions):&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the part I'm most particular about. Functions for timestamped logging with severity levels. Color-coded terminal output that degrades gracefully when piped to a file (no escape codes in your log files). Email and webhook alerting. Lock file acquisition with stale-lock detection. Dry-run mode that every function respects. PID file management. Threshold comparison helpers. Configuration file loading.&lt;/p&gt;
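&lt;p&gt;Two of those ideas fit in a short sketch. These are hypothetical helpers written for this post, not the actual &lt;code&gt;bashlib.sh&lt;/code&gt; functions; the names &lt;code&gt;log&lt;/code&gt;, &lt;code&gt;run&lt;/code&gt; and &lt;code&gt;DRY_RUN&lt;/code&gt; are assumptions.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helpers in the spirit of a shared library. These are NOT
# the real bashlib.sh functions; names and behavior are assumptions.

# Timestamped log line with a severity level.
# Color codes are added only when stdout is a terminal, so output
# piped or redirected to a file stays free of escape sequences.
log() {
  local level="$1"
  shift
  local color="" reset=""
  if [ -t 1 ]; then
    color=$'\033[33m'
    reset=$'\033[0m'
  fi
  printf '%s%s [%s] %s%s\n' "$color" "$(date '+%Y-%m-%d %H:%M:%S')" "$level" "$*" "$reset"
}

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "[dry-run] $*"
  else
    "$@"
  fi
}

log INFO "starting cleanup"
DRY_RUN=1 run rm -f /tmp/example-cache.tmp   # prints the command, deletes nothing
```

&lt;p&gt;Checking &lt;code&gt;[ -t 1 ]&lt;/code&gt; at call time rather than once at startup means the same function behaves correctly whether a given call's output goes to the terminal or into a pipe.&lt;/p&gt;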

&lt;p&gt;Every function is documented with a usage comment. Every function handles its own error cases. The library passes ShellCheck with zero warnings at every severity level.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;template.sh:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A starter template that sources &lt;code&gt;bashlib.sh&lt;/code&gt;, sets up traps, parses arguments, and has placeholder sections for your own logic. When I need a new script on a server, I copy this template, fill in the business logic, and the error handling and logging are already done.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;52-page field guide (PDF):&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not an API reference. Not a man page reformatted as a PDF. A field guide — structured as the things you need to know in the order you need to know them when you're setting up automation on a new server.&lt;/p&gt;

&lt;p&gt;Covers the why behind every pattern in the scripts. Why &lt;code&gt;set -euo pipefail&lt;/code&gt; and what each flag actually prevents. Why traps on EXIT instead of just INT. Why lock files need stale detection. Why backup retention has to be a separate step from the backup itself. Why SSL monitoring should be independent of your renewal tool.&lt;/p&gt;

&lt;p&gt;Each section includes the real failure scenario that motivated the pattern, because "best practice" without the consequence attached is advice that gets skipped.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why $9
&lt;/h2&gt;

&lt;p&gt;I thought about this for a while. The scripts are worth more than $9 to anyone who's going to use them — preventing one 4am incident pays for the toolkit immediately. But I also know what it's like to be the person running a $5 VPS on a budget, and I wanted the price to be low enough that buying it doesn't require approval from anyone or a second thought.&lt;/p&gt;

&lt;p&gt;$9. MIT license. Unlimited personal and commercial use. You can deploy these on client servers, modify them however you want, include them in your own automation. No subscription. No upsell. No "starter tier" with premium features behind another paywall.&lt;/p&gt;

&lt;p&gt;The free snippets on the site are not going away. They're not a demo. They're complete, working scripts that I actively maintain. The toolkit is for people who want the production layer — the shared library, the integrated system, the field guide that ties it together — and want it in one download instead of building it themselves.&lt;/p&gt;




&lt;h2&gt;
  
  
  Who This Is For
&lt;/h2&gt;

&lt;p&gt;If you're managing one or more Linux servers and you don't already have automated monitoring, backups, and alerting set up — this is the fastest path to having all three. Copy the scripts to the server, edit the config variables at the top of each file, add the cron entries from the guide, and you have a monitoring layer that didn't exist 10 minutes ago.&lt;/p&gt;

&lt;p&gt;If you already have these things set up and you built them yourself, you probably don't need this. You might find something useful in &lt;code&gt;bashlib.sh&lt;/code&gt; — the stale lock detection or the dry-run mode — but the scripts themselves won't tell you anything new.&lt;/p&gt;

&lt;p&gt;If you're learning bash and want to see how production scripts are structured differently from tutorial scripts, the field guide is probably the most useful part. The "why" sections explain patterns that most bash tutorials skip because they're not relevant to a single-file script running on a laptop.&lt;/p&gt;




&lt;h2&gt;
  
  
  Every Script Passes ShellCheck at Zero Warnings
&lt;/h2&gt;

&lt;p&gt;This one matters to me. ShellCheck is the static analysis tool for bash. It catches the bugs that work on happy-path input and break on edge cases — unquoted variables that split on whitespace, pipelines that swallow exit codes, deprecated syntax that newer bash versions handle differently.&lt;/p&gt;

&lt;p&gt;Every script in the toolkit, including &lt;code&gt;bashlib.sh&lt;/code&gt;, passes ShellCheck at the strictest severity level with zero warnings. Not "a few style notes we decided were fine." Zero. I treat ShellCheck warnings the same way I treat compiler warnings in C — they are bugs I haven't hit yet, and ignoring them is technical debt with a guaranteed due date.&lt;/p&gt;

&lt;p&gt;If you run ShellCheck on these files and get output, something went wrong and I want to know about it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Link
&lt;/h2&gt;

&lt;p&gt;→ &lt;a href="https://bashsnippets.xyz/starter-kit" rel="noopener noreferrer"&gt;bashsnippets.xyz/starter-kit&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;$9. Instant download. 6 production scripts + &lt;code&gt;bashlib.sh&lt;/code&gt; shared library (31 functions) + &lt;code&gt;template.sh&lt;/code&gt; + 52-page field guide. ShellCheck-clean. MIT license.&lt;/p&gt;

&lt;p&gt;Already have scripts and need somewhere to run them? DigitalOcean droplets start at $4/month and you can get $200 in free credit to start:&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://m.do.co/c/7a196437764c" rel="noopener noreferrer"&gt;Get $200 free credit — DigitalOcean&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The free snippet library with 17+ scripts and 7 interactive tools is at:&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://bashsnippets.xyz" rel="noopener noreferrer"&gt;bashsnippets.xyz&lt;/a&gt;&lt;/p&gt;

</description>
      <category>bash</category>
      <category>linux</category>
      <category>sysadmin</category>
      <category>beginners</category>
    </item>
    <item>
      <title>6 small things we shipped across the BashSnippets tools this week</title>
      <dc:creator>Anguishe</dc:creator>
      <pubDate>Tue, 09 Jun 2026 03:49:07 +0000</pubDate>
      <link>https://dev.to/bashsnippets/6-small-things-we-shipped-across-the-bashsnippets-tools-this-week-287d</link>
      <guid>https://dev.to/bashsnippets/6-small-things-we-shipped-across-the-bashsnippets-tools-this-week-287d</guid>
      <description>&lt;p&gt;Nobody announces small features. You ship them, they're in there, and the people who find them either notice or they don't. I want to start documenting these because some of them are the kind of thing that makes a tool actually worth using day-to-day instead of being something you visit once and close.&lt;/p&gt;

&lt;p&gt;Six updates across six tools this week. None of them are headline features. All of them are fixes for specific annoyances that came up in my own usage, which is the only real source I trust for "does this actually help."&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Bash Boilerplate Generator — Trap &amp;amp; Cleanup Handler
&lt;/h2&gt;

&lt;p&gt;There's now a toggle for trap handling. When on, the generated template includes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cleanup&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="c"&gt;# Remove temp files, release locks, undo partial changes&lt;/span&gt;
  &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TMPFILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="nb"&gt;trap &lt;/span&gt;cleanup EXIT INT TERM
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why this matters: &lt;code&gt;trap cleanup EXIT&lt;/code&gt; means the &lt;code&gt;cleanup()&lt;/code&gt; function runs no matter how the script exits. Normal completion. &lt;code&gt;Ctrl+C&lt;/code&gt;. An uncaught error. A &lt;code&gt;kill&lt;/code&gt; signal. The cleanup function runs. Every time, short of &lt;code&gt;kill -9&lt;/code&gt;, which no script can trap.&lt;/p&gt;

&lt;p&gt;Without this pattern, scripts that create temp files leave them behind when they're interrupted. Scripts that acquire lock files leave them locked. Scripts that make partial changes — creating a directory, writing part of a config — leave the system in an inconsistent state because the cleanup code at the end of the script never ran.&lt;/p&gt;

&lt;p&gt;The trap-on-exit pattern is not optional for anything running unattended. It's the thing that separates "runs fine when nothing goes wrong" from "also recovers gracefully when something does." I've seen this pattern left out of boilerplate generators enough times that I wanted to make it explicit and easy to include.&lt;/p&gt;

&lt;p&gt;The generated stub includes a &lt;code&gt;TMPFILE=$(mktemp)&lt;/code&gt; line as a concrete example of something that needs cleanup. Replace it with whatever state your script actually manages.&lt;/p&gt;
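&lt;p&gt;One bash detail worth knowing when you adapt the stub: a handler bound directly to INT or TERM replaces the default "terminate" behavior, so unless the handler itself exits, the script carries on after it returns. A common arrangement, sketched here as an assumption about how you might wire it and not as the generator's output, is to keep the cleanup on EXIT and turn the signals into exits. &lt;code&gt;term_demo&lt;/code&gt; is a hypothetical name, and the signal is self-sent so the example is reproducible.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch: cleanup lives on EXIT; INT and TERM just convert the signal
# into an exit so the EXIT trap runs exactly once.
term_demo() (
  cleanup() {
    echo "cleanup"
  }
  trap cleanup EXIT
  trap 'exit 130' INT    # 128 + 2, the conventional status for SIGINT
  trap 'exit 143' TERM   # 128 + 15, the conventional status for SIGTERM

  kill -TERM "$BASHPID"  # simulate an external kill of this subshell
  echo "not reached"
)

term_demo
echo "exit status: $?"
```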

&lt;p&gt;→ &lt;a href="https://bashsnippets.xyz/tools/bash-boilerplate-generator" rel="noopener noreferrer"&gt;bashsnippets.xyz/tools/bash-boilerplate-generator&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Bash Exit Code Lookup — Export as &lt;code&gt;case&lt;/code&gt; Statement
&lt;/h2&gt;

&lt;p&gt;After looking up an exit code, there's a button that generates a ready-to-paste &lt;code&gt;case $? in ... esac&lt;/code&gt; block with the explanation inline as a comment.&lt;/p&gt;

&lt;p&gt;Before this, the lookup gave you the meaning of the exit code and you had to write the case handler yourself. That's not hard — the case syntax is simple — but it's friction. You looked up what &lt;code&gt;126&lt;/code&gt; means ("command found but not executable — check permissions"), now you have to translate that into:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="nv"&gt;$?&lt;/span&gt; &lt;span class="k"&gt;in
  &lt;/span&gt;0&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Success"&lt;/span&gt; &lt;span class="p"&gt;;;&lt;/span&gt;
  1&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"General error"&lt;/span&gt; &lt;span class="p"&gt;;;&lt;/span&gt;
  126&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Command found but not executable — check file permissions"&lt;/span&gt; &lt;span class="p"&gt;;;&lt;/span&gt;  &lt;span class="c"&gt;# ← the one you just looked up&lt;/span&gt;
  127&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Command not found — check PATH or typo"&lt;/span&gt; &lt;span class="p"&gt;;;&lt;/span&gt;
  &lt;span class="k"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Unknown exit code: &lt;/span&gt;&lt;span class="nv"&gt;$?&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="p"&gt;;;&lt;/span&gt;
&lt;span class="k"&gt;esac&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The export button generates that block for you, with the specific code you looked up highlighted in the right place and the explanation preserved as a comment. Paste it directly into your script. No rewriting the syntax from scratch.&lt;/p&gt;

&lt;p&gt;The case template includes the four most common exit codes (0, 1, 126, 127) plus the wildcard catch-all. Delete the ones you don't need.&lt;/p&gt;
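&lt;p&gt;When pasting a block like this, it helps to remember that every command overwrites &lt;code&gt;$?&lt;/code&gt;, so the status has to be captured straight away. A small sketch, where &lt;code&gt;describe_exit&lt;/code&gt; and &lt;code&gt;rc&lt;/code&gt; are names I made up and the failing command is simulated with a subshell:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch: save the exit status first, then branch on the saved copy.
# describe_exit and rc are hypothetical names for this illustration.
describe_exit() {
  case "$1" in
    0)   echo "Success" ;;
    1)   echo "General error" ;;
    126) echo "Command found but not executable" ;;
    127) echo "Command not found" ;;
    *)   echo "Unknown exit code: $1" ;;
  esac
}

( exit 126 )   # stand-in for a real command that fails with 126
rc=$?          # capture immediately; the next command would overwrite $?
echo "some unrelated logging in between"
describe_exit "$rc"
```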

&lt;p&gt;→ &lt;a href="https://bashsnippets.xyz/tools/bash-exit-code-lookup" rel="noopener noreferrer"&gt;bashsnippets.xyz/tools/bash-exit-code-lookup&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Cron Job Builder — Next 5 Run Times
&lt;/h2&gt;

&lt;p&gt;This is the one I wanted for myself.&lt;/p&gt;

&lt;p&gt;Enter any cron expression — say, &lt;code&gt;0 3 * * 1-5&lt;/code&gt; — and it now immediately shows the next five times it will fire:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Next run times for: 0 3 * * 1-5

1.  Tue Jun 09 2026  03:00:00
2.  Wed Jun 10 2026  03:00:00
3.  Thu Jun 11 2026  03:00:00
4.  Fri Jun 12 2026  03:00:00
5.  Mon Jun 15 2026  03:00:00
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The problem this solves: cron expressions are easy to get subtly wrong. &lt;code&gt;0 3 * * *&lt;/code&gt; fires at 3am. But 3am what: your server's local time or UTC? And is your server set to UTC? And does &lt;code&gt;*/6&lt;/code&gt; mean every 6 hours starting at midnight, or starting at the first minute it's defined? And does &lt;code&gt;0 9 * * 1&lt;/code&gt; fire on Monday or Sunday, given that not every scheduler numbers weekdays the same way?&lt;/p&gt;

&lt;p&gt;Without something that shows you the actual fire times, you add the job, wait until the next day, and find out whether your assumptions were right or wrong. If they were wrong, you change it and wait another day. It can take three or four days to confirm a cron expression fires when you want it to, purely because of iteration time.&lt;/p&gt;

&lt;p&gt;Showing the next five run times eliminates that cycle entirely. You know immediately whether your expression does what you think it does. Change it, verify it, add it to your crontab — all in under a minute.&lt;/p&gt;
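&lt;p&gt;If you want to sanity-check a weekday schedule from a terminal, the same idea can be roughed out in bash. This sketch assumes GNU &lt;code&gt;date&lt;/code&gt; (the &lt;code&gt;-d&lt;/code&gt; option) and only handles the "weekdays at 03:00" case from the example; it is not how the Cron Job Builder works, and &lt;code&gt;next_weekday_runs&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch: list the next N Monday-to-Friday dates on or after a start
# date, i.e. the days on which "0 3 * * 1-5" fires.
# Assumes GNU date (-d). It does not check whether 03:00 has already
# passed on the start date.
next_weekday_runs() {
  local start="$1" count="$2" i=0 found=0 dow
  while [ "$found" -lt "$count" ]; do
    dow="$(TZ=UTC date -d "$start +$i day" +%u)"   # 1 = Monday .. 7 = Sunday
    if [ "$dow" -le 5 ]; then
      LC_ALL=C TZ=UTC date -d "$start +$i day" '+%a %b %d %Y  03:00:00'
      found=$((found + 1))
    fi
    i=$((i + 1))
  done
}

next_weekday_runs 2026-06-09 5
```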

&lt;p&gt;→ &lt;a href="https://bashsnippets.xyz/tools/cron-job-builder" rel="noopener noreferrer"&gt;bashsnippets.xyz/tools/cron-job-builder&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Chmod Calculator — World-Writable Warning and Live Symbolic Mirror
&lt;/h2&gt;

&lt;p&gt;Two updates here bundled together because they're related.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;World-writable warning:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Enabling any world-write bit (the &lt;code&gt;w&lt;/code&gt; in the "others" column) now shows a warning banner:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ World-writable permissions mean any user on the system can modify this file or directory. This is rarely correct. Verify this is intentional.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is the &lt;code&gt;chmod 777&lt;/code&gt; trap. I've seen &lt;code&gt;chmod 777&lt;/code&gt; applied as a "quick fix" for permission errors more times than I can count. It works immediately, which is why people do it. The fact that it means "literally every user and process on this system can write to this file" gets lost in the urgency of making the error go away.&lt;/p&gt;

&lt;p&gt;The warning doesn't prevent you from setting world-writable permissions. It just makes sure you've seen the words "any user on the system can modify this" before you click confirm. If you intended that, the warning is just noise. If you didn't, it might save you.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Live symbolic ↔ octal mirror:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every permission you configure now shows both forms simultaneously:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Octal:    &lt;span class="nb"&gt;chmod &lt;/span&gt;755
Symbolic: &lt;span class="nb"&gt;chmod &lt;/span&gt;&lt;span class="nv"&gt;u&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;rwx,go&lt;span class="o"&gt;=&lt;/span&gt;rx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both update in real time as you toggle permissions. This is useful for two reasons: you see the octal code for when you need to type it quickly in a terminal, and you see the symbolic form for when you need to express the same permission in a script where the explicit form is easier to read and maintain. Neither is "more correct" — they're the same thing expressed two ways, and having both removes the mental step of converting between them.&lt;/p&gt;
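&lt;p&gt;The conversion itself is mechanical: each octal digit is the sum of read (4), write (2) and execute (1). A sketch in bash, with function names invented for the example; it prints group and others separately, so 755 comes out as &lt;code&gt;u=rwx,g=rx,o=rx&lt;/code&gt;, which is equivalent to the shorter &lt;code&gt;u=rwx,go=rx&lt;/code&gt;.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch: three-digit octal mode to symbolic form, plus a
# world-writable check. Function names are hypothetical.

# Turn one octal digit (0-7) into its rwx letters.
digit_to_rwx() {
  local d="$1" s=""
  if [ $((d / 4 % 2)) -eq 1 ]; then s="${s}r"; fi   # 4 = read
  if [ $((d / 2 % 2)) -eq 1 ]; then s="${s}w"; fi   # 2 = write
  if [ $((d % 2)) -eq 1 ]; then s="${s}x"; fi       # 1 = execute
  echo "$s"
}

# Expects exactly three digits: user, group, others.
octal_to_symbolic() {
  local mode="$1"
  echo "u=$(digit_to_rwx "${mode:0:1}"),g=$(digit_to_rwx "${mode:1:1}"),o=$(digit_to_rwx "${mode:2:1}")"
}

# Succeeds when the "others" digit has the write bit set.
is_world_writable() {
  local others="${1:2:1}"
  [ $((others / 2 % 2)) -eq 1 ]
}

octal_to_symbolic 755
octal_to_symbolic 644
if is_world_writable 777; then
  echo "777 is world-writable"
fi
```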

&lt;p&gt;→ &lt;a href="https://bashsnippets.xyz/tools/chmod-permissions-builder" rel="noopener noreferrer"&gt;bashsnippets.xyz/tools/chmod-permissions-builder&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  5. PATH Debugger — Duplicate Entry Detector
&lt;/h2&gt;

&lt;p&gt;Duplicate entries in &lt;code&gt;$PATH&lt;/code&gt; accumulate over years. Every time a tool's installer adds itself to your PATH, every time you add an entry to &lt;code&gt;.bashrc&lt;/code&gt;, every time a package manager prepends its bin directory to your environment — the list grows. On machines that have been around for a while, or on developer laptops where tools get installed and removed and reinstalled, you can end up with &lt;code&gt;/usr/local/bin&lt;/code&gt; listed four times.&lt;/p&gt;

&lt;p&gt;Duplicate entries don't usually cause obvious breakage. The correct binary still runs. It just runs with slightly more PATH resolution overhead on every command, and the duplicate entries make it harder to reason about which version of a tool will actually be picked when you have multiple installed. If &lt;code&gt;/usr/local/bin&lt;/code&gt; appears before &lt;code&gt;/usr/bin&lt;/code&gt;, the local install wins. When it appears four times, you've lost track of the actual resolution order.&lt;/p&gt;

&lt;p&gt;The debugger now flags repeated entries and generates a one-liner to deduplicate and export a clean PATH:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Generated dedup command:&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PATH&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="s1"&gt;':'&lt;/span&gt; &lt;span class="s1"&gt;'\n'&lt;/span&gt; | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'!seen[$0]++'&lt;/span&gt; | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="s1"&gt;'\n'&lt;/span&gt; &lt;span class="s1"&gt;':'&lt;/span&gt; | &lt;span class="nb"&gt;sed&lt;/span&gt; &lt;span class="s1"&gt;'s/:$//'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That pipeline: converts PATH from colon-separated to newline-separated, uses awk to keep only the first occurrence of each entry, converts back to colon-separated, and strips the trailing colon. The resulting PATH has the same resolution order as the original but with all duplicates removed.&lt;/p&gt;

&lt;p&gt;Add that line to the bottom of your &lt;code&gt;.bashrc&lt;/code&gt; and your PATH will deduplicate itself on every new shell session.&lt;/p&gt;
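&lt;p&gt;Before wiring it into &lt;code&gt;.bashrc&lt;/code&gt;, you can try the same pipeline on a sample string without touching your real PATH. The wrapper name &lt;code&gt;dedupe_path&lt;/code&gt; and the sample value are mine, added for the demonstration:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch: the same pipeline wrapped in a function so it can be tried
# on any colon-separated string. dedupe_path is a hypothetical name.
dedupe_path() {
  echo "$1" | tr ':' '\n' | awk '!seen[$0]++' | tr '\n' ':' | sed 's/:$//'
}

sample="/usr/local/bin:/usr/bin:/usr/local/bin:/bin:/usr/bin"
dedupe_path "$sample"
echo
# First occurrences are kept in order, later repeats are dropped:
# /usr/local/bin:/usr/bin:/bin
```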

&lt;p&gt;→ &lt;a href="https://bashsnippets.xyz/tools/path-debugger" rel="noopener noreferrer"&gt;bashsnippets.xyz/tools/path-debugger&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  6. ShellCheck Error Decoder — Inline Script Scanner
&lt;/h2&gt;

&lt;p&gt;ShellCheck is the best static analysis tool for bash scripts. If you're writing bash and you're not running your scripts through ShellCheck, you're missing real bugs. I'll say that plainly.&lt;/p&gt;

&lt;p&gt;The problem is that ShellCheck error codes (SC2086, SC2046, SC1091, etc.) are meaningful if you know what they mean and opaque if you don't. When ShellCheck tells you &lt;code&gt;SC2086: Double quote to prevent globbing and word splitting&lt;/code&gt;, that makes sense. When it just gives you &lt;code&gt;SC2086&lt;/code&gt; in isolation, you have to look it up.&lt;/p&gt;
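&lt;p&gt;For a sense of what SC2086 is actually warning about, here is a tiny demonstration. It assumes the default &lt;code&gt;IFS&lt;/code&gt;; &lt;code&gt;count_args&lt;/code&gt; and &lt;code&gt;FILE&lt;/code&gt; are names invented for the example.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch: the word splitting behind SC2086.
# count_args and FILE are hypothetical names for this illustration.
count_args() {
  echo "$#"
}

FILE="my report.txt"

count_args $FILE      # unquoted: split on the space, prints 2
count_args "$FILE"    # quoted: one argument, prints 1
```

&lt;p&gt;Swap &lt;code&gt;count_args&lt;/code&gt; for &lt;code&gt;rm&lt;/code&gt; and the unquoted form tries to delete two files named &lt;code&gt;my&lt;/code&gt; and &lt;code&gt;report.txt&lt;/code&gt;, which is the kind of bug that only shows up once a filename has a space in it.&lt;/p&gt;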

&lt;p&gt;The inline script scanner adds a paste box where you put your bash script directly. The tool maps recognized SC error codes to the lines in your script where they appear — not in the abstract documentation, but in your specific code. You see the line, the SC code, and the explanation together.&lt;/p&gt;

&lt;p&gt;Important caveat I want to be direct about: &lt;strong&gt;this is not a replacement for running actual ShellCheck&lt;/strong&gt;. The real ShellCheck tool parses bash properly, handles edge cases, and catches issues that a pattern-matcher won't. If you're writing scripts that matter, install ShellCheck and run it directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
apt &lt;span class="nb"&gt;install &lt;/span&gt;shellcheck       &lt;span class="c"&gt;# Debian/Ubuntu&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;shellcheck      &lt;span class="c"&gt;# macOS&lt;/span&gt;

&lt;span class="c"&gt;# Run&lt;/span&gt;
shellcheck yourscript.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What the decoder in BashSnippets tools is useful for: understanding what a specific SC code means in the context of your own code rather than in a reference page. It's a learning tool and a quick lookup, not a substitute for the real thing.&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://bashsnippets.xyz/tools/shellcheck-error-decoder" rel="noopener noreferrer"&gt;bashsnippets.xyz/tools/shellcheck-error-decoder&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;All tools are free, no login, no account required. The full tools index is at:&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://bashsnippets.xyz/tools" rel="noopener noreferrer"&gt;bashsnippets.xyz/tools&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://bashsnippets.xyz" rel="noopener noreferrer"&gt;bashsnippets.xyz&lt;/a&gt;&lt;/p&gt;

</description>
      <category>bash</category>
      <category>linux</category>
      <category>devops</category>
      <category>resources</category>
    </item>
  </channel>
</rss>
