Why your bash scripts keep breaking — and how to escape the trap

#bash #ruby #shell #devops

The script that ate my afternoon

I had a "five-minute" bash script last Tuesday. By Thursday it was 380 lines, three functions deep, and silently swallowing errors that took me an hour to track down. The actual bug? A filename with a space in it. Of course.

If you've written shell scripts for more than a year, you know exactly the feeling. Bash is wonderful for one-liners and glue code. Anything past that and it starts pushing back. Hard.

This post is about why that happens, and what to actually do when you hit the wall — including a relatively new approach I've been poking at lately: writing your shell scripts in Ruby instead.

Why bash scripts rot so quickly

The root cause is simple but easy to miss when you're knee-deep in sed calls: bash has one data type, and it's a string.

Arrays exist, but they're awkward and don't nest. Associative arrays (hashes) only arrived in bash 4, and they don't nest either. Every unquoted expansion gets word-split and glob-expanded again by the shell, which is why you've memorized incantations like "${arr[@]}" and still occasionally guess wrong.
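
To make the quoting point concrete, here is a small sketch of how one array expands under three different spellings. The helper name count_args is made up for illustration, and the counts assume the default IFS.

```bash
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { printf '%s\n' "$#"; }

arr=("first file.log" "second.log")

quoted=$(count_args "${arr[@]}")    # each element stays one argument  -> 2
unquoted=$(count_args ${arr[@]})    # elements are re-split on spaces  -> 3
joined=$(count_args "${arr[*]}")    # all elements joined into one word -> 1

printf 'quoted=%s unquoted=%s joined=%s\n' "$quoted" "$unquoted" "$joined"
```

Only the first spelling hands the two filenames to the command unchanged, which is why it is the one worth memorizing.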

Three specific failure modes show up over and over:

Quoting bugs. Filenames with spaces, newlines, or glob characters break naive scripts in ways that pass code review.
Error handling. Without set -euo pipefail, your script will cheerfully continue after a failed command. With it, the script stops at the first failure, but you get no stack trace and often no hint of which command died.
No real structure. Once you need a list of objects with properties, you're either parsing strings or shelling out to jq for everything.
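
The error-handling item is the easiest one to see in action. The sketch below is one way to do it, not the only one: it runs each experiment in a child bash so the strict options don't leak into the calling shell, and it uses an ERR trap to print the context that a bare set -e exit leaves out. The variable names are illustrative.

```bash
#!/usr/bin/env bash
# 1) Without pipefail, a pipeline reports only the status of its last command.
default_status=$(bash -c 'false | true; echo "$?"')
pipefail_status=$(bash -c 'set -o pipefail; false | true; echo "$?"')
printf 'default=%s pipefail=%s\n' "$default_status" "$pipefail_status"

# 2) An ERR trap names the failing command before set -e ends the script.
#    -E makes shell functions inherit the trap as well.
trap_report=$(bash -s 2>&1 <<'EOF' || true
set -Eeuo pipefail
trap 'echo "error: [$BASH_COMMAND] exited with status $?" >&2' ERR
echo "before"
false
echo "never reached"
EOF
)
printf '%s\n' "$trap_report"
```

The first experiment prints default=0 pipefail=1. The second prints "before" followed by the trap's message, and never reaches the last echo.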

Here's the classic foot-cannon, slightly disguised:

#!/bin/bash
# Looks innocent. Isn't.
for file in $(ls *.log); do        # word-splits on whitespace
  size=$(du -k $file | cut -f1)    # unquoted $file: broken on spaces
  if [ $size -gt 1000 ]; then      # numeric compare; hopes size isn't empty
    gzip $file                     # exit status never checked; the loop carries on if gzip fails
  fi
done

Every line above has a known sharp edge. The "fixed" version uses find -print0, xargs -0, proper quoting, and exit-code checks. By the time it works robustly, it is no longer readable.
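
For reference, here is a rough sketch of what a hardened version might look like. It assumes a find that supports -maxdepth, -size with a k suffix, and -print0 (GNU and BSD find both do), and it swaps du for find's own size test, so it compares file size rather than disk usage. The function name compress_large_logs and its arguments are made up for illustration.

```bash
#!/usr/bin/env bash
# compress_large_logs DIR MIN_KIB
# Hypothetical helper: gzip every *.log directly inside DIR that is larger
# than MIN_KIB kibibytes. Returns non-zero if anything could not be compressed.
compress_large_logs() {
  local dir=$1 min_kib=$2 file failures=0

  [ -d "$dir" ] || { printf 'error: %s is not a directory\n' "$dir" >&2; return 2; }

  # find filters by size itself and emits NUL-terminated names, so spaces,
  # newlines, and glob characters in filenames pass through intact.
  while IFS= read -r -d '' file; do
    if ! gzip -- "$file"; then
      printf 'warning: could not compress %s\n' "$file" >&2
      failures=$((failures + 1))
    fi
  done < <(find "$dir" -maxdepth 1 -type f -name '*.log' -size +"${min_kib}k" -print0)

  [ "$failures" -eq 0 ]
}
```

Calling compress_large_logs . 1000 covers roughly what the original loop intended. It is about three times the length of the naive version, which is the readability cost in question.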

Step one: stop trying to make bash do real programming

The first thing I do now when a script crosses about 50 lines is ask whether it should still be bash. If the answer involves "I need a hash map," "I need to parse JSON," or "I need real error handling" — the answer is no.

The traditional escape hatch is Python or Ruby with subprocess / backticks. That works, but it has its own friction: every external command becomes a multi-line ceremony involving argument arrays, stdout capture, and exception handling.

# Ruby's stdlib approach: safer, but verbose
require 'open3'

stdout, stderr, status = Open3.capture3('ls', '-la', '/tmp')
raise "ls failed: #{stderr}" unless status.success?
puts stdout

That's safer than bash, but you've traded one kind of verbosity for another. The reason people keep writing bash despite the pain is that piping commands together is genuinely ergonomic in a shell, and reproducing those ergonomics in a general-purpose language is hard.

The middle path: a real language that looks like a shell

This is where shells written in higher-level languages get interesting. The general idea has been around for a while — xonsh for Python, elvish and nushell with their own languages, and more recently Rubish, which is described as a Unix shell written in pure Ruby.

I haven't run Rubish in production (it's clearly an experiment, and you should treat it that way), but the concept is worth understanding even if you don't adopt the specific tool. The pitch is: your shell IS Ruby. Commands are method calls, pipes are operators, variables are real Ruby objects.

What that buys you, conceptually:

Real data structures inside pipelines (hashes, arrays of objects, not strings).
Exceptions instead of silent exit codes.
Method chaining instead of fragile string parsing.
The Ruby standard library, one require away inside any script.

The tradeoff is real too: you're now depending on a Ruby runtime for ops work, your colleagues need to read Ruby, and a shell like this is nowhere near as battle-tested as bash. For one-off automation on your own machine, that's fine. For a production deploy script that runs on a stranger's server, it's a harder sell.

A pragmatic prevention checklist

After enough afternoons lost to broken bash, here's the checklist I actually use before writing a new script:

Will this be under 30 lines and run in one place? Use bash. Add set -euo pipefail on line 2. Quote every variable.
Do I need any data structure more complex than a flat list of strings? Use a real language.
Will anyone other than me run this? Use a real language with explicit error messages.
Is this part of a build/deploy pipeline? Use a real language. Future-you will be debugging it at 2am.
Am I parsing JSON, YAML, or anything structured? Use a real language. jq is amazing but not a programming environment.

And for the bash you do keep, a few habits that have saved me repeatedly:

#!/usr/bin/env bash
set -euo pipefail   # exit on error, undefined var, or pipe failure
IFS=$'\n\t'         # safer word-splitting than the default space-tab-newline

# Use arrays, not space-separated strings
files=()
while IFS= read -r -d '' file; do   # -d '' handles null-delimited input
  files+=("$file")
done < <(find . -name "*.log" -print0)

# Quote every expansion, every time.
# (Under set -u, bash older than 4.4 treats an empty array as unbound.)
for file in "${files[@]}"; do
  printf 'processing %s\n' "$file"
done

That's not pretty, but it's robust against filenames that try to ruin your day. The IFS=$'\n\t' trick alone has saved me from at least three production incidents — the default IFS splitting on spaces is what makes naive loops blow up.
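
To see what the IFS setting changes, here is a small sketch that counts how many words an unquoted expansion produces under each value. The helper count_fields is hypothetical and exists only for this demonstration.

```bash
#!/usr/bin/env bash
# count_fields TEXT IFS_VALUE
# Hypothetical helper: word-splits TEXT using IFS_VALUE and prints the
# number of resulting fields.
count_fields() {
  local text=$1
  local IFS=$2
  set -f            # disable globbing so only word splitting is measured
  set -- $text      # intentionally unquoted: this is the split under test
  set +f
  printf '%s\n' "$#"
}

name='access log 2024.log'

default_fields=$(count_fields "$name" $' \t\n')            # default IFS -> 3
strict_fields=$(count_fields "$name" $'\n\t')              # strict IFS  -> 1
newline_fields=$(count_fields $'two\nlines.log' $'\n\t')   # still split -> 2

printf 'default=%s strict=%s newline=%s\n' \
  "$default_fields" "$strict_fields" "$newline_fields"
```

The last case is the caveat: a stricter IFS does not protect names containing newlines, and it does nothing about glob expansion, so it works as a safety net alongside quoting rather than a replacement for it.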

When to actually try a Ruby-based shell

If you write a lot of Ruby already, experimenting with something like Rubish on your local machine is genuinely fun. The friction of shelling out from a Ruby script disappears, and you get to use the language you already know for tasks where bash would normally win on ergonomics.

I'd hold off on putting it in CI or shipping it to teammates until it has more mileage — I haven't tested it thoroughly enough to vouch for edge cases. But for personal automation (backup scripts, log triage, one-off data wrangling), it's exactly the kind of tool that pays off if it sticks.

The deeper lesson, though, isn't really about Ruby or any specific shell. It's about recognizing the moment when bash stops being the right tool. If you can spot that moment 100 lines earlier than you used to, you've already won.