Mickey Hu

I tested two ways to keep cron jobs healthy. The boring one won.

I used to think small automation bugs were cheap.

A temp file left behind? Fine.
A cron job that logs a little too much? Fine.
A shell script that works only if you run it from one exact directory? Also fine, apparently.

That mindset is how you end up with a pile of tiny failures that all look harmless on their own and then eat your afternoon when they line up.

So I ran a simple comparison in my own setup.

On one side: ignore the little stuff and keep shipping.
On the other: fix the tiny problem as soon as I spot it.

I expected the first one to be faster. It is faster, if you only count the five minutes you save right now.
But if you count the time I lost later, the second one wins hard.

The setup

I had a few cron jobs doing basic work:

  • publishing tasks
  • cleanup jobs
  • log rotation
  • a couple of scripts that chained together other scripts

Nothing fancy. Just enough moving parts to get annoying when one link breaks.

The pain was not one huge bug. It was a bunch of small ones:

  • paths that depended on where the command was launched
  • temp files that stayed around after failures
  • scripts that kept running after a command failed
  • logs that grew until they became useless
  • cleanup jobs that were supposed to be harmless but failed silently

Each issue was small. The combined effect was not small at all.

Comparison: ignore it vs fix it now

Here’s the part that mattered.

Thing I compared            | Ignore it     | Fix it now
----------------------------|---------------|-------------------
Time spent today            | lower         | a bit higher
Time spent this week        | higher        | lower
Confidence in automation    | shaky         | boringly solid
Debugging on Friday night   | common        | rare
Number of weird edge cases  | keeps growing | shrinks over time
How I feel after deploys    | tense         | calm

The “ignore it” path feels efficient because the bill arrives later.
The “fix it now” path feels slower because you actually pay the bill when it shows up.
That is the whole trick.

What happened when I ignored the small stuff

This is the version people choose when they are busy.
I did it too.

A script would fail in the middle, but the next step would still run because I had not made failure loud enough.
So I would get a half-finished output and not notice until much later.

A cron job would write logs into the same file forever.
That sounds harmless until you open the file and it has become a junk drawer.
Searching it is awful.
Reading it is worse.

One script used relative paths because it worked when I tested it manually.
That is a great way to fool yourself.
Cron does not care about the directory you tested in.
It typically starts the job from your home directory with a minimal environment and leaves you with your assumptions.

None of this broke every day.
That was the annoying part.
It broke just often enough to stay in my head.

The result was weird.
I spent less time “maintaining” the system and more time recovering from it.
That is not a win.
That is just deferred pain.

What changed when I started fixing things immediately

I stopped waiting for a big refactor.
I started making tiny changes the moment I saw them.

Not heroic changes.
Small ones.

1. I made failures loud

For shell scripts, I started using the boring flags that everyone recommends and nobody loves until they need them:

set -euo pipefail

This did not make the scripts clever.
It made them honest.
If something failed, the script stopped pretending everything was okay.

That one change saved me from a few false-success runs.
Those are the worst ones, because they waste your trust.
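
Here is a minimal sketch of what "loud" looks like in practice, assuming bash. The three-step job and its step names are made up for illustration; the middle step fails on purpose, and the job runs in a child shell so the failure can be inspected instead of killing the demo.

```bash
#!/usr/bin/env bash
# -e: exit when a command fails
# -u: treat unset variables as errors
# -o pipefail: a pipeline fails if any command in it fails
set -euo pipefail

# Hypothetical three-step job. The middle step fails on purpose.
job='
set -euo pipefail
echo "step 1: export"
false                     # stand-in for any command that exits non-zero
echo "step 3: publish"    # must not run after the failure
'

# Run the job in a child shell and capture what it printed and how it exited.
status=0
output="$(bash -c "$job")" || status=$?

echo "output: $output"
echo "exit status: $status"
```

Without the `set -euo pipefail` line inside the job, step 3 would still run and the job would exit 0, which is exactly the false-success case.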

2. I cleaned up temp files on exit

Instead of hoping cleanup would happen later, I tied it to exit.

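# Create a private scratch directory for this run.
TMP_DIR="$(mktemp -d)"
# Remove the scratch directory and everything in it.
cleanup() {
  rm -rf "$TMP_DIR"
}
# Run cleanup when the script exits, whether it finished or failed.
trap cleanup EXIT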

Simple.
Not sexy.
Also one of the best things I did.

It turned a bunch of annoying manual cleanup into something the script handled itself.
That meant fewer leftovers, fewer surprises, and fewer “wait, why is this folder full again?” moments.
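
To check that the trap really fires on the bad path too, here is a small sketch, again assuming bash. The job below is hypothetical: it creates a temp directory, writes a partial file, and then fails halfway. The outer script only checks whether the directory survived.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical job: creates a scratch dir, registers cleanup, then fails.
job='
set -euo pipefail
TMP_DIR="$(mktemp -d)"
cleanup() { rm -rf "$TMP_DIR"; }
trap cleanup EXIT
echo "$TMP_DIR"                    # report the path so the caller can check it
touch "$TMP_DIR/partial-output"
false                              # simulated mid-run failure
'

# Run the job in a child shell; keep its exit status and the path it printed.
status=0
tmp_path="$(bash -c "$job")" || status=$?

if [ -d "$tmp_path" ]; then
  echo "leftover: $tmp_path"
else
  echo "cleaned up after exit $status"
fi
```

One caveat: a trap cannot run if the process is killed with SIGKILL, so this reduces leftovers rather than making them impossible.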

3. I stopped relying on relative paths

This one is embarrassingly basic.
It still bites people all the time.

Cron jobs should not depend on your interactive shell's environment or your current folder.
If a script needs a file, I now give it the full path.

That sounds tedious, but it removes a whole class of bugs.
And honestly, I prefer being bored over being wrong.
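
A sketch of one way to do this without hardcoding every path by hand, assuming bash: resolve the directory the script lives in and build absolute paths from it. This is a variation on "write the full path", not necessarily what every script needs. The `config/publish.conf` and `logs` names, and the crontab line in the comment, are hypothetical.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Example crontab entry (hypothetical paths), absolute on both sides:
#   */15 * * * * /opt/jobs/publish.sh >> /var/log/jobs/publish.log 2>&1

# Resolve the directory this script lives in, regardless of the caller's cwd.
# Falls back to $0 when BASH_SOURCE is not set (for example under bash -c).
SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]:-$0}")" && pwd -P)"

# Hypothetical locations, used only for illustration.
CONFIG_FILE="$SCRIPT_DIR/config/publish.conf"
LOG_DIR="$SCRIPT_DIR/logs"

echo "config: $CONFIG_FILE"
echo "logs:   $LOG_DIR"
```

Because every path starts from `SCRIPT_DIR`, the script behaves the same whether cron launches it or I run it from some random folder.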

4. I made the cleanup idempotent

If a cleanup job runs twice, it should not care.
That was not true at first.
So I fixed it.

Now if the file is already gone, the script moves on.
If the folder is empty, it does not panic.
If the job is rerun after a failed deploy, it just does the same thing again.

That is what I want from automation.
I want it to be boring when it works and boring when it gets repeated.
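
Roughly what that looks like, as a sketch assuming bash. `cleanup_stale` is a hypothetical helper, and the seven-day default is an arbitrary choice for the example. A missing directory, an empty directory, and a second run are all treated as "nothing to do".

```bash
#!/usr/bin/env bash
set -euo pipefail

# cleanup_stale DIR [DAYS]: remove regular files older than DAYS days.
# Hypothetical helper; safe to run any number of times.
cleanup_stale() {
  local dir="$1" days="${2:-7}"
  # A missing directory means there is nothing to clean, not a failure.
  [ -d "$dir" ] || return 0
  # rm -f does not complain about files that are already gone.
  find "$dir" -type f -mtime +"$days" -exec rm -f -- {} +
}

# Demo in a throwaway directory: one old file, one fresh file.
WORK_DIR="$(mktemp -d)"
touch -t 200001010000 "$WORK_DIR/old.log"
touch "$WORK_DIR/fresh.log"

cleanup_stale "$WORK_DIR" 7            # removes old.log
cleanup_stale "$WORK_DIR" 7            # second run: nothing left to do, no error
cleanup_stale "$WORK_DIR/missing" 7    # directory does not exist: still fine

ls "$WORK_DIR"
rm -rf "$WORK_DIR"
```

The point of the early `return 0` is that "already clean" counts as success, so a rerun after a failed deploy does not produce a second failure.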

The weird part: the “slower” path felt faster after a week

At first, I hated this.
Every tiny fix interrupted my flow.
I would be in the middle of something and think, “Ugh, this is a ten-minute cleanup.”

But those ten-minute cleanups kept removing future interruptions.

After a week, I noticed I was opening logs less.
I was re-running jobs less.
I was checking weird output less.
That was the real gain.

Not abstract “productivity.”
Just fewer interruptions.

I think that is why this stuff matters so much in automation work.
You are not really fighting one bug.
You are fighting the tax that every small bug adds to your attention.

What I’d choose again

If I had to choose between:

  • shipping a slightly messy script today
  • or taking another 15 minutes to make it reliable

I’d pick the second one now.
Not because I became noble.
Because I got tired of paying interest on tiny mistakes.

That is the best way I can describe it.
Messy automation is debt.
It does not look huge at first.
Then you pay it with surprise debugging sessions, manual retries, and a brain that never fully relaxes.

Clean automation is not magic.
It just stops asking for attention every other day.

The lesson I actually keep

I used to treat small infra issues like background noise.
Now I treat them like signal.

If a script needs me to remember a trick, I write the trick down in the script.
If cleanup is easy to forget, I automate it.
If a path is fragile, I make it explicit.
If a job can fail silently, I make it complain.

That is the whole game.
Not bigger architecture.
Not more clever code.
Just fewer little lies.

And yeah, that sounds boring.
It is boring.
That is the point.

The boring version is the one that keeps running when you are tired, distracted, or off doing something else.
That is the version I trust now.

If you want the short version:

Ignore small automation bugs, and they become your hobby. Fix them early, and they stop stealing your weekends.

That’s the trade.
I know which side I’m on.
