The deploy script had been running in production for four months without a problem. It built releases into a temp directory, ran some validation, and then cleaned up by removing whatever $target pointed at. $target was set near the top of the script to the current release directory — the one the running application was serving from. A helper function called prepare() also used a variable named target, because the person who wrote it (me, four months earlier) did not think about scope.
prepare() built the new release into a temp directory. On a good run, it set target to the temp path, did its work, and returned. The main script then did its cleanup at the end and removed $target — which, after calling prepare(), was the temp directory. That worked correctly for four months.
Then a deploy failed partway through prepare(). The temp directory was half-built. target was now pointing at the half-built temp path. The main script caught the failure, started its cleanup, and ran rm -rf "$target". It removed the half-built temp directory, which was correct. But then it kept going — there was a second cleanup step that also used $target and expected it to still be the release directory. By the time the script finished, it had removed the running application's release directory. The application restarted and found nothing to serve.
The users noticed before I did. The restart loop was filling logs, the application was returning 502, and I was sitting in the deploy output trying to figure out what had gone wrong in a failure path I had tested against a stub.
Variables in bash are global by default
This is the single most surprising thing about bash functions if you have written code in almost any other language. In Python, a variable assigned inside a function is local to that function unless you explicitly declare it global. In bash, it is the opposite. A variable assigned inside a function is visible — and writable — everywhere in the current shell unless you declare it local.
target="/srv/release/current"
prepare() {
    target=$(mktemp -d)    # No local: this overwrites the global $target
    echo "building in $target"
}
prepare
echo "target is now: $target" # Prints the temp dir, not /srv/release/current
The function does exactly what it looks like — it sets target. The problem is that it sets target everywhere, not just inside itself. The caller's target is gone.
local confines the assignment to the function scope:
prepare() {
    local target           # confined to this function
    target=$(mktemp -d)
    echo "$target"         # hand the value out via stdout
}
build_dir=$(prepare) # capture what the function echoed
echo "built in: $build_dir"
echo "release dir still: $target" # unchanged
local target means: this variable exists only for the duration of this function call. (Bash scoping is dynamic, so functions called from inside this one can still see it, but the caller cannot.) When the function returns, the variable and its value vanish. The caller's target is never touched. The habit I now enforce on every bash function I write: local for every variable the function introduces, not just the ones I think might conflict. The conflict I do not predict is the one that deletes the wrong directory.
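One detail in the fixed function is worth spelling out: local target and the assignment sit on separate lines. The sketch below shows why that matters, using false as a stand-in for a command that fails. The function names combined and separate are hypothetical, chosen only for this illustration.

```bash
#!/usr/bin/env bash
# Sketch: why declaration and assignment are kept on separate lines.
# combined() and separate() are hypothetical names for illustration.

combined() {
    local out=$(false)    # $? becomes the status of `local` (0), hiding the failure
    echo "$?"
}

separate() {
    local out
    out=$(false)          # $? is the status of the command substitution
    echo "$?"
}

combined_status=$(combined)
separate_status=$(separate)
echo "combined: $combined_status, separate: $separate_status"
```

This prints combined: 0, separate: 1. With the one-line form, a failing mktemp or find would look like a success to any check that follows it.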
return is a status, not a value
After the incident, I audited every function in the deploy script. I found a second bug:
count_pending() {
    local n
    n=$(find "$QUEUE_DIR" -type f | wc -l)
    return "$n"    # WRONG if n > 255
}
count_pending
if [[ $? -gt 0 ]]; then
    echo "queue has items"
fi
return sets an exit status. Exit statuses are a single byte: 0 to 255. return 300 wraps to 44 (300 mod 256). For months the queue count had been above 255 on busy days, and the $? check was comparing against a wrapped value. The logic had been wrong all that time and had accidentally worked because the wrapped values still satisfied the -gt 0 condition. But any script making real decisions based on the actual count (how many workers to spin up, whether to page someone) would have been working with garbage.
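The wraparound is easy to reproduce in isolation. The following is a minimal sketch, not code from the deploy script; status_of is a hypothetical helper that returns whatever number it is handed.

```bash
#!/usr/bin/env bash
# Sketch: exit statuses are reduced modulo 256.
# status_of() is a hypothetical helper that returns the number it is given.

status_of() {
    return "$1"
}

for n in 44 255 256 300; do
    status_of "$n"
    echo "return $n -> \$? is $?"
done
```

The 256 case is the one to worry about: it comes back as 0, so a queue holding exactly 256 files would have looked empty to the -gt 0 check.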
The correct pattern is to echo the value and capture it:
count_pending() {
    local n
    n=$(find "$QUEUE_DIR" -type f | wc -l)
    echo "$n"    # data goes to stdout
    return 0     # status: success
}
pending=$(count_pending)
echo "queue depth: $pending"
return answers "did this succeed." echo plus command substitution answers "what is the value." These are two different questions and bash gives you two different mechanisms for a reason. Mixing them is how a function that counts 300 items makes the caller think it counted 44.
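Both mechanisms can be used in the same call. Here is a sketch under a few assumptions: QUEUE_DIR points at a throwaway directory created just for the demo, and the directory check that makes the function fail is an addition for illustration, not something the original count_pending did.

```bash
#!/usr/bin/env bash
# Sketch: status and value travel on separate channels.
# QUEUE_DIR is a throwaway directory created for this demo.

QUEUE_DIR=$(mktemp -d)
touch "$QUEUE_DIR/job-1" "$QUEUE_DIR/job-2" "$QUEUE_DIR/job-3"

count_pending() {
    local n
    [[ -d "$QUEUE_DIR" ]] || return 1    # status: the queue could not be read
    n=$(find "$QUEUE_DIR" -type f | wc -l)
    echo "$((n))"                        # value on stdout; $((...)) drops any padding from wc
}

# The assignment's status is the function's status, so one line checks both.
if pending=$(count_pending); then
    echo "queue depth: $pending"
else
    echo "could not read queue" >&2
fi

rm -rf "$QUEUE_DIR"
```

The caller learns whether the count succeeded from the if, and reads the count itself from pending, with no byte-sized limit on the number.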
Arguments and the $@ quoting rule
Inside a function, arguments arrive as positional parameters: $1, $2, all of them as "$@", the count as $#. Quoting "$@" is what keeps multi-word arguments intact:
process_hosts() {
    echo "processing $# hosts"
    for host in "$@"; do
        echo "  checking: $host"
    done
}
process_hosts "web-01" "db primary" "cache-02"
Without the quotes around "$@", db primary splits into two loop iterations and you are back to the word-splitting problem from a different angle. The quoted "$@" is the way bash passes an array of arguments through a function call with each element preserved.
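The difference can be counted directly. This is a small sketch with hypothetical function names (count_args, pass_quoted, pass_unquoted) that forwards the same three hosts both ways and reports how many arguments arrive.

```bash
#!/usr/bin/env bash
# Sketch: how many arguments survive a pass-through, quoted vs unquoted.
# count_args, pass_quoted and pass_unquoted are hypothetical names.

count_args() { echo "$#"; }

pass_quoted()   { count_args "$@"; }
pass_unquoted() { count_args $@; }    # deliberately wrong: each argument is word-split

quoted=$(pass_quoted "web-01" "db primary" "cache-02")
unquoted=$(pass_unquoted "web-01" "db primary" "cache-02")
echo "quoted: $quoted, unquoted: $unquoted"
```

With the default IFS this prints quoted: 3, unquoted: 4, because db primary becomes two separate words on the unquoted path.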
This matters most when you are writing wrapper functions — functions that receive arguments and pass them to another command:
run_with_retry() {
    local retries="${1:?}"
    shift
    local attempt=0
    while ((attempt < retries)); do
        "$@" && return 0
        attempt=$((attempt + 1))    # not ((attempt++)): that yields status 1 when attempt is 0 and aborts under set -e
        echo "retry $attempt/$retries" >&2
        sleep 2
    done
    return 1
}
run_with_retry 3 rsync -av "source dir/" remote:/dest/
"$@" after the shift is everything after the retry count — the command and all its arguments, each preserved as a separate item even if they contain spaces. Without the quotes, "source dir/" splits and rsync receives the wrong arguments.
getopts for anything with flags
For one or two fixed positional arguments, reading $1 and $2 directly is fine. The moment a function or script takes optional flags in any order, do not parse them by hand:
verbose=0
output_dir=""
while getopts "vd:" opt; do
    case "$opt" in
        v) verbose=1 ;;
        d) output_dir="$OPTARG" ;;
        *) echo "usage: $0 [-v] [-d dir]" >&2; exit 1 ;;    # usage errors go to stderr
    esac
done
shift $((OPTIND - 1)) # move past the flags to positional args
The colon after d marks it as requiring an argument, which arrives in $OPTARG. getopts handles flag bundling (-vd dir), missing arguments (-d with no path generates an error automatically), and unknown flags. Hand-rolled $1 parsing tends to get all of these wrong in subtle ways: it accepts -d at the end without a value, it does not handle -vd dir, and it requires the flags in a specific order. One limit to know about: the getopts builtin only understands single-letter flags, so long options such as --target are outside what it parses.
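The same loop works inside a function, with one extra step: getopts tracks its position in OPTIND, which the shell does not reset between calls. Below is a sketch that declares OPTIND local so every call starts from the first argument. The function name parse_flags and the example paths are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: getopts inside a function. parse_flags is a hypothetical name.
# OPTIND is declared local so each call starts parsing at the first argument.

parse_flags() {
    local OPTIND=1 opt
    local verbose=0 output_dir=""
    while getopts "vd:" opt; do
        case "$opt" in
            v) verbose=1 ;;
            d) output_dir="$OPTARG" ;;
            *) echo "usage: parse_flags [-v] [-d dir] [args...]" >&2; return 2 ;;
        esac
    done
    shift $((OPTIND - 1))    # drop the parsed flags, keep the positional args
    echo "verbose=$verbose dir=$output_dir rest=$*"
}

parse_flags -vd /tmp/out one two           # bundled flags
parse_flags -d "/tmp/other dir" three      # second call in the same shell
```

Without the local OPTIND, the second call would resume where the first one stopped and skip its own flags.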
The deploy script that caused the incident had hand-rolled argument parsing. Among other things, it silently accepted a --target flag with no value and proceeded with an empty string, which caused a different class of problem I had also not fully traced before the bigger incident made the whole thing visible.
What the deploy script looks like now
Every function declares local for every variable. Values that need to cross function boundaries go through echo and command substitution. Exit statuses communicate success or failure. getopts handles the flags. There is a trap on EXIT that cleans up the temp directory. The cleanup function reads the path from a dedicated TEMP_BUILD_DIR variable into a local of its own, so it never depends on a shared name like target:
cleanup() {
    local tmp_dir="${TEMP_BUILD_DIR:-}"    # local copy of the temp dir path
    [[ -n "$tmp_dir" && -d "$tmp_dir" ]] && rm -rf "$tmp_dir"
}
trap cleanup EXIT
The $target variable in the main script is set once at the top and never touched by any function. Functions that need a temp directory create one, store it in a local variable, use it, and the cleanup trap handles removal. The variable naming conflict that caused the incident cannot happen because the pattern prevents it structurally.
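The structure can be exercised end to end with a simulated failure. What follows is a sketch, not the real deploy script: the paths are throwaway stand-ins, build_release and the marker file are hypothetical, and the whole thing runs in a subshell so the EXIT trap fires right away. Only TEMP_BUILD_DIR and cleanup are taken from the snippet above.

```bash
#!/usr/bin/env bash
# Sketch: a failed build removes only the temp directory.
# build_release, release_root and marker are hypothetical stand-ins.

release_root=$(mktemp -d)          # stand-in for the real release directory
marker="$release_root/tmp_path"    # lets the demo inspect the temp path afterwards

(
    target="$release_root"         # set once, never reassigned by any function
    TEMP_BUILD_DIR=""

    cleanup() {
        local tmp_dir="${TEMP_BUILD_DIR:-}"
        if [[ -n "$tmp_dir" && -d "$tmp_dir" ]]; then
            rm -rf "$tmp_dir"
        fi
    }
    trap cleanup EXIT

    build_release() {
        TEMP_BUILD_DIR=$(mktemp -d)    # registered for cleanup; never named target
        echo "$TEMP_BUILD_DIR" > "$marker"
        return 1                       # simulate a build that fails partway
    }

    build_release || exit 1
)
build_status=$?
tmp_path=$(cat "$marker")

[[ -d "$tmp_path" ]] && temp_state="present" || temp_state="removed"
[[ -d "$release_root" ]] && release_state="present" || release_state="removed"
echo "build status: $build_status, temp dir: $temp_state, release dir: $release_state"

rm -rf "$release_root"    # tidy up the demo's stand-in directory
```

This prints build status: 1, temp dir: removed, release dir: present. The trap only ever looks at TEMP_BUILD_DIR, so a failure inside the build cannot redirect it at the release directory.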
The application has been running cleanly since then. The deploy script has hit the failure path twice since the fix — different failures, unrelated causes — and both times the cleanup ran correctly and left the running release untouched.
Full version with the local-scope fix, the echo-for-values pattern, the return-as-status trap, and a getopts template: https://bashsnippets.xyz/snippets/bash-functions-arguments
A function that fails should fail loudly: start the script with set -euo pipefail. The bash boilerplate generator can scaffold all of this with the right traps and argument parsing wired in from the start. The rest is at https://bashsnippets.xyz