If you're using symlink swaps for zero-downtime deployments, you've probably written something like this:
ln -sfn releases/20260112 current
Looks atomic. It's not.
The Bug
Run that command through strace:
symlink("releases/20260112", "current") = -1 EEXIST
unlink("current") = 0
symlink("releases/20260112", "current") = 0
See the problem? Between unlink and symlink, the current symlink doesn't exist. Under load, some percentage of requests hit that gap and get ENOENT.
Your "zero-downtime" deploy just caused downtime.
The Linux Fix
This is well-documented:
ln -s releases/20260112 .tmp/current.$$
mv -T .tmp/current.$$ current
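Create a temp symlink, then use mv -T to atomically replace current. The -T flag tells mv to treat the destination as a plain file rather than a directory to move into, so it issues a single rename(2) on the symlink itself. rename(2) replaces the destination atomically, provided both paths are on the same filesystem.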
Problem solved. Unless you're on macOS.
The macOS Problem
BSD mv doesn't have -T. Without it, mv follows the destination symlink: if current is a symlink to a directory, mv .tmp/current.$$ current moves the temp symlink into that directory instead of replacing current.
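Here's a small sketch that reproduces the pitfall in a scratch directory. The directory and link names (releases/v1, releases/v2, current.new) are made up for the demo; the only thing it relies on is that plain mv descends into a destination that resolves to a directory.

```shell
#!/usr/bin/env bash
# Demo only: shows what plain `mv` (no -T) does when the destination
# is a symlink to a directory. All names here are illustrative.
work=$(mktemp -d)
cd "$work" || exit 1

mkdir -p releases/v1 releases/v2 .tmp
ln -s releases/v1 current            # the live symlink
ln -s releases/v2 .tmp/current.new   # the replacement we want to swap in

# Intended: replace `current`. Actual: the temp link is moved INTO
# the directory that `current` points at.
mv .tmp/current.new current

echo "current -> $(readlink current)"
ls -l releases/v1
```

After the mv, current still points at releases/v1, and the stray symlink now sits at releases/v1/current.new.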
The Capistrano community has known about this for over a decade. Their workaround is clever - create the symlink in a subdirectory and move it via relative path - but it requires their Ruby runtime.
Most deployment tools just accept the race condition on Mac, or tell you to develop on Linux.
A Different Approach
I needed something that works on both platforms. I manage infrastructure across Linux servers and Mac dev machines, and "just use Linux" wasn't an option.
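Python's os.replace() calls rename(2) directly on all POSIX systems, so it can stand in for mv -T wherever GNU mv isn't available:
# Report which mv flavor is installed: GNU coreutils supports -T, BSD mv does not.
detect_platform() {
  if mv --version 2>/dev/null | grep -q 'GNU'; then
    printf 'linux'
  else
    printf 'bsd'
  fi
}

# Atomically point "current" at releases/<release_id>.
activate_release() {
  local release_id="$1"
  local tmp_link=".tmp/current.$$"
  mkdir -p .tmp
  ln -s "releases/$release_id" "$tmp_link" || return 1
  if [[ "$(detect_platform)" == "linux" ]]; then
    mv -T "$tmp_link" "current"
  else
    # Pass paths as arguments so quotes in a path can't break the Python code.
    python3 -c 'import os, sys; os.replace(sys.argv[1], sys.argv[2])' "$tmp_link" "current"
  fi
}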
Platform detection checks for GNU coreutils by running mv --version rather than relying on uname. This handles edge cases like Homebrew GNU coreutils on Mac.
Proving the Race Condition
Want to see the bug yourself? Here's a test:
#!/bin/bash
mkdir -p releases/v1 releases/v2
echo "v1" > releases/v1/version
echo "v2" > releases/v2/version
ln -s releases/v1 current

# Reader loop in background
(
  for i in {1..10000}; do
    cat current/version 2>/dev/null || echo "ENOENT"
  done
) > reads.log &
reader_pid=$!

# Writer loop - rapidly swap symlink
for i in {1..1000}; do
  ln -sfn releases/v1 current
  ln -sfn releases/v2 current
done

wait $reader_pid
errors=$(grep -c ENOENT reads.log || true)
echo "Errors: $errors / 10000 reads"
On a typical system, you'll see 10-50 errors per run, though the exact count varies with hardware and load. With the atomic approach, zero.
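For comparison, here is a sketch of the same test with the writer switched to the temp-link-plus-rename approach. The swap helper, the done.flag file and the reduced iteration count are choices made for this sketch, not part of the original test; the reader runs until the writer finishes instead of for a fixed number of reads.

```shell
#!/usr/bin/env bash
# Sketch: the race test again, but the writer uses temp symlink + rename.
# Helper and file names (swap, done.flag) are assumptions for this demo.
work=$(mktemp -d)
cd "$work" || exit 1

mkdir -p releases/v1 releases/v2 .tmp
echo "v1" > releases/v1/version
echo "v2" > releases/v2/version
ln -s releases/v1 current

# Detect GNU mv once, up front, rather than on every swap.
have_gnu_mv=0
if mv --version 2>/dev/null | grep -q 'GNU'; then
  have_gnu_mv=1
fi

swap() {
  local tmp=".tmp/current.$$"
  ln -s "$1" "$tmp"
  if [ "$have_gnu_mv" -eq 1 ]; then
    mv -T "$tmp" current
  else
    python3 -c 'import os, sys; os.replace(sys.argv[1], sys.argv[2])' "$tmp" current
  fi
}

# Reader loop in background: keeps reading until the writer is done.
(
  while [ ! -e done.flag ]; do
    cat current/version 2>/dev/null || echo "ENOENT"
  done
) > reads.log &
reader_pid=$!

# Writer loop: fewer iterations, since the python3 fallback is slow to spawn.
for i in $(seq 1 200); do
  swap releases/v1
  swap releases/v2
done

touch done.flag
wait "$reader_pid"
errors=$(grep -c ENOENT reads.log || true)
echo "Errors: $errors / $(wc -l < reads.log) reads"
```

On a local filesystem this should report zero errors, because every read resolves current to either the old or the new target and never to a missing link.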
The Full Script
I wrapped this into a deployment script with:
- Atomic symlink swap on both Linux and macOS
- Directory-based locking with stale PID detection
- Automatic rollback on SIGINT/SIGTERM
- State machine cleanup that knows whether to rollback or just clean up temp files
It's a single file, no dependencies beyond bash and python3.
GitHub: github.com/mojoatomic/atomic-deploy
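To give a feel for the directory-based locking with stale PID detection mentioned above, here is a minimal sketch. It is not the code from the repository: the lock directory name, the pid file layout and the function names are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of mkdir-based locking with stale PID detection.
# lock_dir, the pid file and both function names are illustrative only.
lock_dir="${LOCK_DIR:-.deploy.lock}"

acquire_lock() {
  # mkdir either creates the directory or fails, so only one caller wins.
  if mkdir "$lock_dir" 2>/dev/null; then
    echo "$$" > "$lock_dir/pid"
    return 0
  fi

  local holder
  holder=$(cat "$lock_dir/pid" 2>/dev/null || true)

  # kill -0 delivers no signal; it only checks that the process exists.
  if [ -n "$holder" ] && kill -0 "$holder" 2>/dev/null; then
    echo "deploy already running (pid $holder)" >&2
    return 1
  fi

  # The recorded holder is gone: treat the lock as stale and retry once.
  rm -rf "$lock_dir"
  mkdir "$lock_dir" 2>/dev/null || return 1
  echo "$$" > "$lock_dir/pid"
}

release_lock() {
  rm -rf "$lock_dir"
}
```

A deploy would call acquire_lock || exit 1 at the start and release_lock from an EXIT trap. The sketch has known gaps: two processes can both decide a lock is stale at the same moment, a lock whose pid file hasn't been written yet looks stale, and kill -0 also fails for a live process owned by another user.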
What It Doesn't Do
Intentionally out of scope:
- Shared directories - No Capistrano-style shared folder symlinking
- Remote deployment - Wrap it in ssh/rsync
- Release pruning - Add a cron job
- Service restarts - Use your init system
One thing, done right.
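If you do add that pruning cron job, something along these lines might be a starting point. It assumes release directories are named with sortable timestamps like the 20260112 used earlier, and that it runs from the deploy root; prune_releases and its keep argument are names invented for this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical pruning helper: keep the newest N releases plus whatever
# `current` points at. Assumes timestamp-style names without spaces.
prune_releases() {
  local keep="${1:-5}" active dir
  active=$(readlink current)
  active=${active##*/}   # strip the leading "releases/"

  # Timestamps sort lexically, so reverse sort puts the newest first;
  # tail then yields everything past the first $keep entries.
  ls -1 releases | sort -r | tail -n +"$((keep + 1))" | while IFS= read -r dir; do
    # Never delete the live release, however old it is.
    [ "$dir" = "$active" ] && continue
    rm -rf "releases/$dir"
  done
}
```

Calling prune_releases 5 from the deploy root keeps the five newest releases and the active one.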
FAQ
Why not just use Capistrano/Deployer?
They're great if you're already in Ruby/PHP. This is a single script you can drop into any CI pipeline.
Why not containers?
Not everyone is on Kubernetes. VMs, bare metal, and edge devices still exist.
Python is a dependency.
Yes, but python3 is preinstalled on most Linux distros, and on macOS it comes with the Xcode Command Line Tools (or Homebrew). On a machine you already deploy from, it's almost certainly there.
What about renameat2() with RENAME_EXCHANGE?
That's Linux 3.15+ with glibc 2.28+. It does a true atomic swap, but it's not portable.
Does this work on NFS?
No. rename(2) atomicity guarantees don't hold on network filesystems.
Further Reading
- Atomic symlinks - Deep dive on the problem
- Things UNIX can do atomically - The mv -T insight
- Capistrano issue #346 - Original bug report from 2013