I meant to delete the .cache files under a data directory. The server had been running for two months and the cache layer had grown to about 14GB. The application team told me it was safe to purge it: they'd rebuilt the cache logic and the old files were just dead weight. I typed find /data -delete -name "*.cache" because I was moving fast and I figured the order of arguments to find did not matter. It does. find evaluates its expression left to right, and -delete is not a filter, it is an action. It fired on every path find walked under /data (contents first, because -delete implies -depth), and -name "*.cache" never got a chance to narrow anything. By the time I hit Ctrl-C the tree was roughly 80% gone.
That was not a test server.
The restore from backup took forty minutes. The application was down for those forty minutes. The postmortem was a forty-five minute conversation with people who did not particularly enjoy having it. I have run hundreds of find commands since then and I verify the expression order before every single one that has any destructive action attached to it — not because I've forgotten the rule, but because the cost of forgetting it once is not recoverable with an apology.
Why the order is the program
find does not have a flag parser that groups tests and actions separately. It walks a directory tree and evaluates its arguments as a logical expression, left to right, short-circuiting on false. When you write:
find /data -name "*.cache" -delete
it evaluates -name "*.cache" first on each path. If the name does not match, it short-circuits and -delete never runs on that path. When you write:
find /data -delete -name "*.cache"
it evaluates -delete first. -delete returns true whenever the removal succeeds, so the expression continues to -name. The name check runs after the deletion, on a path that no longer exists, and its result cannot undo anything. The effect is that everything find is able to remove gets deleted and nothing is filtered.
This is not a bug. It is exactly how the man page says find works. It is just not how anyone instinctively reads a command the first few times they use it.
The rule is: tests filter, actions act, and actions must come after the tests that are supposed to narrow them. Write it in that order every time.
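A safe way to see the ordering rule for yourself is to swap -delete for -print and count what each ordering acts on. This is a minimal sketch that assumes a throwaway directory from mktemp; the scratch name and the three files in it are made up for illustration.

```shell
# Scratch tree so nothing real is touched (directory and file names are hypothetical).
scratch=$(mktemp -d)
mkdir -p "$scratch/sub"
touch "$scratch/a.cache" "$scratch/sub/b.cache" "$scratch/keep.txt"

# Tests first, action last: -print fires only on paths that passed -name.
filtered=$(find "$scratch" -name "*.cache" -print | wc -l)

# Action first: -print fires on every path, then -name runs too late to matter.
# Replace -print with -delete here and you have the command that emptied /data.
unfiltered=$(find "$scratch" -print -name "*.cache" | wc -l)

echo "tests first: $filtered paths, action first: $unfiltered paths"
rm -rf "$scratch"
```

On this tree the first count is 2 (the two .cache files) and the second is 5 (every file and directory, including the starting point).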
The quoting trap, right behind the ordering one
Here is the one that is even more subtle, because it causes the wrong behavior and often produces no error at all:
# WRONG — the shell expands *.cache before find ever sees it
find /data -name *.cache
# RIGHT — quote the pattern so find does the matching
find /data -name "*.cache"
If there happens to be a single .cache file in your current working directory when you run this, the shell expands *.cache to that one filename and passes it to find -name as a literal string. find then searches for files named exactly that, everywhere under /data. It may find some, it may find none, but it is not doing a wildcard search. If there are multiple .cache files in your current directory, find receives too many arguments for -name and errors out with something confusing about paths needing to precede the expression.
Either way you did not get what you intended, and in the single-file case, if you added -delete, you deleted the wrong things silently.
Quoting the pattern is the fix. The quotes prevent the shell from expanding the glob so find receives the literal *.cache pattern and handles the wildcard itself, across the directory tree you pointed it at.
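You can watch the shell do this without running find at all. The sketch below assumes a scratch directory holding exactly one .cache file (old.cache is a made-up name) and captures the argument list that find would have received in each case.

```shell
# Work in a scratch directory that contains exactly one .cache file.
orig_dir=$PWD
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch old.cache

# Unquoted: the shell replaces *.cache with the matching filename first.
# (set -- overwrites the positional parameters; fine for a throwaway demo.)
set -- find /data -name *.cache
unquoted="$*"

# Quoted: the pattern reaches find untouched.
set -- find /data -name "*.cache"
quoted="$*"

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

cd "$orig_dir" && rm -rf "$scratch"
```

The unquoted line comes out as find /data -name old.cache, which is a search for one exact filename. The quoted line still carries the *.cache pattern.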
The age sign that everyone reverses at least once
find /var/log -name "*.log" -mtime +30 # older than 30 days
find /var/log -name "*.log" -mtime -1 # modified within the last day
find /var/log -name "*.log" -mtime 30 # between 30 and 31 days old (almost never what you want)
+30 is older than thirty days. -1 is within the last day. A bare 30 matches only files whose age, counted in whole 24-hour periods with the fraction dropped, is exactly thirty, which is almost never the thing you're trying to match. The same truncation means +30 strictly matches files at least 31 days old. The sign reads naturally once you think of it as age: + means a bigger age, so older. The trap is that a lot of people read +30 as "in the last thirty days" the first time they see it.
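If you would rather test the sign than trust your memory of it, a scratch directory with two files of known age settles it in a few seconds. This sketch assumes GNU coreutils, because touch -d with a relative date such as "40 days ago" is a GNU extension; the file names are made up.

```shell
scratch=$(mktemp -d)

# Assumption: GNU touch, which accepts relative dates with -d.
touch -d "40 days ago" "$scratch/old.log"
touch -d "2 hours ago" "$scratch/new.log"

# +30 should pick up only the 40-day-old file.
older=$(find "$scratch" -name "*.log" -mtime +30)

# -1 should pick up only the file touched two hours ago.
recent=$(find "$scratch" -name "*.log" -mtime -1)

echo "older than 30 days:  ${older##*/}"
echo "within the last day: ${recent##*/}"
rm -rf "$scratch"
```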
I have reversed this twice in production. Once I kept the wrong logs. Once I deleted logs I needed for an audit the following week. Neither was catastrophic but both were embarrassing, and both happened under the kind of mild time pressure where double-checking the man page feels slower than it actually is. The builder I built for this labels the output as "older than" or "within the last" in plain English next to the value, which removes the one decision in a find -delete job most likely to go wrong when you are already stressed.
The -exec batching difference nobody explains
# Runs the command once per file — slower, one PID per file
find /data -name "*.cache" -exec rm {} \;
# Batches files into one invocation — faster, one rm for many files
find /data -name "*.cache" -exec rm {} +
\; runs the command once per file. + collects as many paths as it can and passes them all to one invocation of the command, the same way xargs does. For something like rm, which accepts multiple arguments, the + form is faster and produces less process overhead. For something like a custom script that must process exactly one file at a time, \; is correct.
Most resources either do not mention this difference or mention it once in a reference table. The practical consequence is real — on a directory with ten thousand files, \; spawns ten thousand processes. The + form spawns a handful. On a cleanup job that runs in cron, the difference shows up in CPU load.
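The difference is easy to measure on a small scale. In this sketch the command run by -exec is a tiny sh -c that prints one line per invocation, so counting lines counts processes. The scratch directory and its ten files are made up for the demonstration.

```shell
scratch=$(mktemp -d)
for i in 1 2 3 4 5 6 7 8 9 10; do touch "$scratch/$i.cache"; done

# \; form: sh starts once per matching file, so one line per file.
per_file=$(find "$scratch" -name "*.cache" -exec sh -c 'echo run' sh {} \; | wc -l)

# + form: find appends all ten paths to a single sh invocation.
batched=$(find "$scratch" -name "*.cache" -exec sh -c 'echo run' sh {} + | wc -l)

echo "per-file invocations: $per_file, batched invocations: $batched"
rm -rf "$scratch"
```

With ten files this gives 10 invocations against 1. With very long file lists the + form may split into a few invocations to stay under the system's argument-length limit, which is the "handful" mentioned above.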
The preview step I skipped the day I broke things
The most reliable way to avoid the ordering and quoting mistakes is to build the command in two steps. First, run it with -print instead of -delete:
find /data -name "*.cache" -print
Read every line of that output. Confirm the list is what you intended. Then, and only then, swap -print for -delete. This adds maybe thirty seconds to the workflow. It would have saved me forty minutes and a postmortem. I skipped it because I was confident. Confidence is not a useful substitute for verification when the operation is irreversible.
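Here is the two-step habit rehearsed against a scratch directory, as a rough sketch. The directory and file names are made up, and in real use the pause between the two find commands is where you actually read the list.

```shell
scratch=$(mktemp -d)
touch "$scratch/a.cache" "$scratch/b.cache" "$scratch/keep.txt"

# Step 1: preview. Same path and tests, harmless action.
find "$scratch" -name "*.cache" -print

# Step 2: only after reading that list, rerun with just the action swapped.
find "$scratch" -name "*.cache" -delete

# Confirm that only the non-matching file survived.
remaining=$(find "$scratch" -type f | wc -l)
echo "files left: $remaining"
rm -rf "$scratch"
```

Recalling the previewed command from shell history and changing only the final word keeps the path and the tests identical between the two runs.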
The find command builder enforces this by showing a warning whenever you select -delete or -exec as the action, and offering to generate the -print version of the command first. It is the kind of nudge I would have appreciated having that day.
What the builder actually does
It assembles the expression in the correct order (tests first, action last) so you cannot accidentally replicate the mistake I made. You set the starting path, add tests in whatever order feels natural to you (name, type, age, size, exclude path), and pick an action. The output command always has the tests before the action, regardless of the order you clicked things in.
Every active flag gets a plain-English description inline. -mtime +30 reads as "modified more than 30 days ago." -name "*.cache" reads as "name matches the glob *.cache, quoted so find handles the wildcard." The -exec {} + form is the default when you pick -exec, with a note explaining why it is faster.
The whole point is to get from "I need to find and delete files matching these conditions" to a verified, copy-paste command without the thirty-second loop of man-page reading and second-guessing that I used to do and, on one bad morning, skipped.
What I upgraded while I was at it
The same discipline applies to the other tools. The rsync command builder now has presets for the three setups people build most often (local backup, push to remote, mirror) because those cover maybe ninety percent of rsync jobs, and getting the flags right from scratch every time is where the dangerous ones like --delete get misapplied. The cron builder previews the next five run times after you build an expression, because a cron job I once deployed ran at 3am UTC instead of 3am local time and I did not notice until it had fired on the wrong schedule for a week. You can also paste an existing crontab line and get the human-readable schedule back. The chmod builder now accepts an octal you paste in and sets the checkboxes. It works in both directions, because reading a file's permissions and understanding what they mean is just as common a task as setting them from scratch.
These are not features I planned in advance. They are all things I needed at 2am and did not have. That is the pattern most of this site runs on.
Build a find command with tests ordered before actions and every flag explained: https://bashsnippets.xyz/tools/find-command-builder
If you are scoping files before searching or transforming them, the whole pipeline is documented in Bash Text Processing: find, grep, sed, and awk. The rest of the free tools are at https://bashsnippets.xyz/tools