The Alert Never Fired Because the Loop Skipped the Last Line of the File

#devops #linux #bash #sysadmin

We kept a plaintext file of hostnames, one per line, and a monitoring script read the file and pinged each host every five minutes. When a host failed to respond, the script sent an email alert. The system had been running for months and it worked — we had caught three actual outages with it, which gave us real confidence in the setup.

app-07 was added to the list on a Thursday afternoon. The engineer who added it was using VS Code on a Mac, and VS Code does not add a final newline on save unless files.insertFinalNewline is turned on, which it is not by default. The file had ended in a newline before the edit. After the edit, the last line, app-07, had no trailing newline.

app-07 went down the following Sunday afternoon at 2:17pm. The monitoring script ran at 2:20, 2:25, 2:30, all the way through the evening. No alert ever fired. The on-call engineer found out at 8pm when a client emailed. The system had been down for almost six hours.

When I looked at the script, the bug was immediately obvious once I knew what to look for. But I had written that script, I had tested it, and I had been looking at the monitoring confirmation emails for months without ever noticing. The confirmation email listed the hosts it checked. app-07 was never in the list. I had been reading those emails without actually counting the hosts. I just scanned for the OK lines and moved on.

Why read drops the last line

read returns a success exit status when it reads a line and finds the newline that terminates it. When the file does not end in a newline, read still populates the variable with the final line's content, but it returns a non-zero (failure) exit status because it hit end-of-file before finding a terminator. A while read host loop checks the return status to decide whether to execute the loop body. On the final, newline-less line, read puts app-07 into host and then returns failure. The while loop sees failure and exits without running the body. The content is there. The variable is populated. The loop throws it away.

This behavior is documented in the POSIX spec for read. It is not a bash quirk. Any POSIX shell handles the missing-final-newline case this way. A plain while read line loop is incorrect for any file whose formatting you do not personally control, which in practice means nearly any file.
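To see the drop without a monitoring system attached, here is a minimal sketch that feeds the same two hostnames to a plain loop, once with a final newline and once without. The function names count_plain and trace_read are made up for this illustration.

```bash
#!/usr/bin/env bash

# Count how many times a plain `while read` loop body runs for stdin.
count_plain() {
  local line n=0
  while IFS= read -r line; do
    n=$((n + 1))
  done
  printf '%s\n' "$n"
}

# Print read's exit status and the variable's content after every call,
# stopping after the first call that reports failure.
trace_read() {
  local line status
  while :; do
    status=0
    IFS= read -r line || status=$?
    printf 'status=%s line=[%s]\n' "$status" "$line"
    if [ "$status" -ne 0 ]; then break; fi
  done
}

printf 'app-06\napp-07\n' | count_plain   # prints 2: file ends in a newline
printf 'app-06\napp-07'   | count_plain   # prints 1: app-07 is skipped
printf 'app-06\napp-07'   | trace_read
# status=0 line=[app-06]
# status=1 line=[app-07]
```

The last trace line is the whole bug in one row: the status says failure while the variable still holds app-07.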

The fix is one extra clause:

#!/bin/bash
# Script: check-hosts.sh
# Purpose: ping every host in a file, including a newline-less final line
# Usage: ./check-hosts.sh hosts.txt
set -euo pipefail

CHECK="✓"
CROSS="✗"
HOST_FILE="${1:?Usage: check-hosts.sh <host-file>}"

while IFS= read -r host || [[ -n "$host" ]]; do
  [[ -z "$host" || "$host" == \#* ]] && continue
  if ping -c1 -W2 "$host" >/dev/null 2>&1; then
    echo "$CHECK up:   $host"
  else
    echo "$CROSS down: $host"
  fi
done < "$HOST_FILE"

|| [[ -n "$host" ]] says: if read returned failure but the variable is non-empty, run the loop body anyway. That is precisely the leftover-final-line case. read failed because it hit end-of-file, but it populated host with app-07 before returning. The || catches it. app-07 gets pinged.
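A reasonable worry is that the guard might add a spurious empty iteration when the file does end in a newline, or when it is empty. A quick sketch suggests it does not. The function name count_guarded is hypothetical and exists only for this check.

```bash
#!/usr/bin/env bash

# Count loop iterations for stdin using the guarded form of the loop.
count_guarded() {
  local line n=0
  while IFS= read -r line || [[ -n "$line" ]]; do
    n=$((n + 1))
  done
  printf '%s\n' "$n"
}

printf 'app-06\napp-07'   | count_guarded   # prints 2: last line recovered
printf 'app-06\napp-07\n' | count_guarded   # prints 2: no extra iteration
printf ''                 | count_guarded   # prints 0: empty input, no iteration
```

In the newline-terminated case the final read fails with an empty variable, so the -n test is false and the loop ends cleanly.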

What IFS= and -r actually do

You see while IFS= read -r line written in every correct read-loop example, and it is worth being specific about what each piece prevents because both have their own failure mode.

IFS= sets the field separator to empty for the duration of the read command. Without it, read strips leading and trailing whitespace from each line. A hostname like "  app-07" (with leading spaces, which some editors produce) becomes "app-07", which might be what you want. An indented config value, a nested YAML key, a log line that starts with spaces for alignment: all of these are silently modified. Setting IFS= tells read to take the line exactly as it appears.
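The whitespace stripping is easy to check side by side. This sketch assumes IFS still has its default value (space, tab, newline) when the script starts; the two function names are invented for the comparison.

```bash
#!/usr/bin/env bash

# Read one line with the default IFS and show it between brackets.
read_default_ifs() {
  local line
  read -r line
  printf '[%s]\n' "$line"
}

# Read one line with IFS emptied for the read command only.
read_empty_ifs() {
  local line
  IFS= read -r line
  printf '[%s]\n' "$line"
}

printf '  app-07  \n' | read_default_ifs   # prints [app-07]
printf '  app-07  \n' | read_empty_ifs     # prints [  app-07  ]
```

Because IFS= is written as a prefix to read, the change applies to that one command and the shell's IFS is left alone afterwards.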

-r prevents read from interpreting backslash sequences. Without -r, a line like C:\temp\logs has its backslashes consumed as escape characters and arrives as C:templogs. This matters less for hostname files and enormously for any script that processes Windows paths, config files that use backslash as a line-continuation character, or log files from mixed-OS environments. The -r flag is essentially free protection; there is no reason not to include it.
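The same kind of side-by-side check works for -r. This is a small sketch with hypothetical function names, covering both the Windows path from the paragraph above and the line-continuation case.

```bash
#!/usr/bin/env bash

# Read one logical line without -r: backslashes are treated as escapes.
read_cooked() {
  local line
  IFS= read line
  printf '%s\n' "$line"
}

# Read one line with -r: backslashes are kept as ordinary characters.
read_raw() {
  local line
  IFS= read -r line
  printf '%s\n' "$line"
}

printf '%s\n' 'C:\temp\logs' | read_cooked   # prints C:templogs
printf '%s\n' 'C:\temp\logs' | read_raw      # prints C:\temp\logs

# A trailing backslash joins the next physical line when -r is missing.
printf '%s\n' 'one \' 'two' | read_cooked    # prints one two
printf '%s\n' 'one \' 'two' | read_raw       # prints one \
```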

A bare read line without either flag silently mangles both whitespace and backslashes. The script works correctly on clean input and produces wrong output on input with edge cases. The wrong output does not produce an error. You find out when the data that mattered was the indented or backslash-containing kind.

The subshell trap that kills your counters

This is the one that is most likely to make you question your sanity:

# This looks correct. It is not.
fails=0
cat hosts.txt | while IFS= read -r host; do
  ping -c1 -W2 "$host" >/dev/null 2>&1 || ((fails++))
done
echo "Total failures: $fails"   # Always prints 0

In bash, each side of a pipe runs in its own subshell by default, so the while loop runs in a child process. fails increments correctly inside that subshell. When the subshell exits, the parent shell's fails is still 0, because the increment happened in a different process. The parent echo sees the original value.

This catches people because the loop body itself works — the pings happen, the increment logic is correct — but any state the loop was supposed to accumulate for later use is silently discarded. I spent forty minutes on a version of this problem before I remembered that pipes create subshells. It is the kind of thing that feels like a bash bug until you understand that it is behaving exactly as documented.

The fix is to redirect the file into the loop instead of piping into it:

fails=0
while IFS= read -r host || [[ -n "$host" ]]; do
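  # fails=$((fails + 1)) instead of ((fails++)): the latter returns non-zero
  # when fails is 0, which would abort the script under set -e
  ping -c1 -W2 "$host" >/dev/null 2>&1 || fails=$((fails + 1))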
done < hosts.txt
echo "Total failures: $fails"   # Now correct

done < hosts.txt feeds the file to the loop's stdin without a pipe. The loop runs in the current shell. fails accumulates in the current shell. The echo sees the real count.
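When the list comes from a command rather than a file, there is nothing to redirect from, and the pipe is tempting again. One option in bash is process substitution, which keeps the loop in the current shell. The sketch below compares the two forms. check_host is a hypothetical stand-in for ping that treats any name ending in -down as a failure, so the example runs without a network, and it assumes bash's lastpipe option is off, which is the default.

```bash
#!/usr/bin/env bash

# Hypothetical stand-in for ping: names ending in "-down" count as failures.
check_host() {
  [[ "$1" != *-down ]]
}

# Pipe form: the loop runs in a subshell, so the count is lost.
count_failures_piped() {
  local fails=0 host
  printf '%s\n' "$@" | while IFS= read -r host || [[ -n "$host" ]]; do
    check_host "$host" || fails=$((fails + 1))
  done
  printf '%s\n' "$fails"
}

# Process substitution form: the loop runs in the current shell.
count_failures_procsub() {
  local fails=0 host
  while IFS= read -r host || [[ -n "$host" ]]; do
    check_host "$host" || fails=$((fails + 1))
  done < <(printf '%s\n' "$@")
  printf '%s\n' "$fails"
}

count_failures_piped   app-01 app-02-down app-03-down   # prints 0
count_failures_procsub app-01 app-02-down app-03-down   # prints 2
```

Both functions run the same loop body on the same input; only the way the input reaches the loop differs.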

Why a for loop is wrong for this

# Never do this — iterates words, not lines
for line in $(cat hosts.txt); do
  ping -c1 "$line"
done

$(cat hosts.txt) is command substitution. Bash captures the text output and word-splits it on IFS, which by default means spaces, tabs, and newlines, then glob-expands each resulting word. For a file with one hostname per line and no spaces in the hostnames, this accidentally produces the right behavior. For any file with spaces (log lines, config values, paths with spaces, anything a non-developer might have generated), it splits lines into fragments and each fragment becomes a loop iteration. A line containing * or ? can also expand into filenames from the current directory.
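A two-line file with a space on each line is enough to show the fragmenting. This is an illustrative sketch; the function names and the sample lines are made up, and it assumes the default IFS.

```bash
#!/usr/bin/env bash

# Count iterations of the for/$(cat) form over the file named in $1.
count_via_for() {
  local item n=0
  for item in $(cat "$1"); do
    n=$((n + 1))
  done
  printf '%s\n' "$n"
}

# Count iterations of the guarded while/read form over the same file.
count_via_while() {
  local line n=0
  while IFS= read -r line || [[ -n "$line" ]]; do
    n=$((n + 1))
  done < "$1"
  printf '%s\n' "$n"
}

sample=$(mktemp)
printf 'db primary\napp 07' > "$sample"   # two lines, no final newline

count_via_for   "$sample"   # prints 4: db, primary, app, 07
count_via_while "$sample"   # prints 2: one iteration per line

rm -f "$sample"
```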

There is no version of for line in $(cat file) that is correct for reading lines. The right tool is always while IFS= read -r line || [[ -n "$line" ]]; do ... done < file. The for loop is the right tool for iterating a known list that you control directly, not for reading file content.

The monitoring system, after the fix

After the app-07 incident we made three changes. The obvious one was fixing the read loop with the || [[ -n "$host" ]] guard. The second was adding a sanity check to the script that counted the entries in the hosts file and compared that count to the number of hosts the loop actually processed; a mismatch meant the file was malformed or something else was wrong. The third was adding a nightly email that included the count of hosts checked, not just the status of each one, so a future addition to the file that somehow got lost would show up as "expected 12 hosts, checked 11."
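The count check could look roughly like the sketch below. This is not the production script: expected_hosts and report_coverage are hypothetical names, and counting with grep is an assumption chosen because grep counts a final line even when its newline is missing, whereas wc -l counts newline characters and would undercount that same file.

```bash
#!/usr/bin/env bash

# Count host entries independently of the read loop: every line that is
# neither empty nor a comment. "|| true" keeps a zero count from tripping set -e.
expected_hosts() {
  grep -c -v -e '^$' -e '^#' "$1" || true
}

# Compare the independent count with the number the loop says it checked.
# $1 is the hosts file, $2 is the count reported by the monitoring loop.
report_coverage() {
  local expected checked=$2
  expected=$(expected_hosts "$1")
  if [ "$checked" -eq "$expected" ]; then
    printf 'checked %s/%s hosts\n' "$checked" "$expected"
  else
    printf 'MISMATCH: expected %s hosts, checked %s\n' "$expected" "$checked"
    return 1
  fi
}

hosts=$(mktemp)
printf 'app-06\n# staging\napp-07' > "$hosts"   # no final newline

report_coverage "$hosts" 2           # prints checked 2/2 hosts
report_coverage "$hosts" 1 || true   # prints MISMATCH: expected 2 hosts, checked 1

rm -f "$hosts"
```

The point of counting with a different tool is that the check does not share the loop's blind spot, so the unguarded loop would have produced the mismatch line on the first run after app-07 was added.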

The confirmation email count was something I should have had from the start. If I had been looking at "checked 11/12 hosts" instead of a list of OK lines, I would have noticed app-07 missing on the first night. The monitoring was working. The observability of the monitoring was not.

Full version with CSV-field parsing, comment-skipping, and the subshell-safe redirect form: https://bashsnippets.xyz/snippets/bash-read-file-line-by-line

To iterate a list of files rather than a file's contents, reach for a for loop instead, and start any script that acts on what it reads with set -euo pipefail. More at https://bashsnippets.xyz