patch: The Format That Taught Us to Ship Changes

#unix #freebsd #history #devops

Technical Beauty — Episode 38

A maintainer sends you a fix. Not as a branch on a server somewhere, not as a pull request awaiting your authentication token, not as an invitation to fetch a remote you have not heard of. A text file, attached to an email or pasted into an issue, with plus signs and minus signs and a few lines of plain context around them. You drop it onto your tree, type one command, and your code is current. No credentials, no network, no infrastructure.

The format that taught the world to ship a change in a dozen lines arrived in 1985, written by a man who would invent Perl two years later. It is, by some distance, the smallest unit of code distribution in widespread use, and it is the format every modern code-review tool, every commit, every diff in your terminal still speaks.

The Format

A patch is a recipe, not a snapshot. It does not contain the new file; it contains the change that turns the old file into the new one. That is the whole conceptual move, and it is the reason a fix to a million-line codebase can travel as twenty lines of email.

The file header names the source and destination paths. Each hunk in the file is a small unit of change with its own little header (the line ranges in the old and new versions: @@ -42,7 +42,9 @@), followed by the lines themselves: a few lines of context, each prefixed with a space; the lines to remove, each prefixed with -; the lines to add, each prefixed with +; and a few more lines of context to close. The context is what allows the apply-tool to find the right place in the file even if the surrounding code has drifted by a few lines since the patch was generated.
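To make that vocabulary concrete, here is a minimal sketch that builds two scratch files and prints the unified diff between them. The file names, the a/ and b/ labels and the temporary directory are assumptions chosen for illustration; the -L option (accepted by both GNU and FreeBSD diff) only replaces the timestamped header names with fixed labels.

```shell
# Hypothetical scratch files in a throwaway directory, for illustration only.
work=$(mktemp -d)
cd "$work"
printf 'alpha\nbeta\ngamma\ndelta\n' > old.txt
printf 'alpha\nbeta\nGAMMA\ndelta\nepsilon\n' > new.txt

# -u selects the unified format; -L sets the names shown in the file header.
# diff exits with status 1 when the files differ, hence the "|| true".
diff -u -L a/file.txt -L b/file.txt old.txt new.txt > change.patch || true

# Show the result: a two-line file header, one @@ hunk header,
# then context lines (space), removals (-) and additions (+).
cat change.patch
```

The single hunk header reads @@ -1,4 +1,5 @@: four lines of the old file starting at line 1 become five lines of the new one. Both edits land in the same hunk because they sit within the default three lines of context of each other.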

That is the entire vocabulary. A header, hunks, plus and minus and space. The whole standard fits on a page, and it has carried every code change in every open-source project for the better part of four decades.

The Surface

In daily use the idiom is two commands. Make a patch:

diff -u original modified > change.patch

Apply a patch:

patch -p1 < change.patch

The -u flag tells diff to use the unified format, which is the one every modern tool produces. The -p1 flag tells patch to strip one leading directory from the paths recorded in the file, which is how you make a patch generated against a/src/main.c apply to src/main.c in your tree. The leading-directory game is a convention that grew up around how patches were generated against named source trees and is, after some forty years, simply the gesture one learns.
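The path stripping is easiest to see on a tiny tree. The sketch below assumes a hypothetical src/main.c and a hand-written patch whose header carries the a/ and b/ prefixes; nothing in it comes from a real project.

```shell
# Hypothetical one-file source tree in a throwaway directory.
work=$(mktemp -d)
cd "$work"
mkdir -p src
printf 'int main(void)\n{\n    return 1;\n}\n' > src/main.c

# A patch recorded against a/src/main.c and b/src/main.c.
cat > change.patch <<'EOF'
--- a/src/main.c
+++ b/src/main.c
@@ -1,4 +1,4 @@
 int main(void)
 {
-    return 1;
+    return 0;
 }
EOF

# -p1 drops the first path component ("a/" or "b/"),
# so the patch is applied to src/main.c relative to the current directory.
patch -p1 < change.patch
cat src/main.c
```

With -p0 the same patch would look for a/src/main.c, which does not exist in this tree. The number after -p is simply how many leading path components to discard.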

A handful of further flags carry most of the rest of the daily work:

patch -R -p1 < change.patch           # reverse: undo the patch
patch --dry-run -p1 < change.patch    # check, do not write
patch -p1 -i change.patch             # explicit input file
bzcat change.patch.bz2 | patch -p1    # compressed input via pipeline
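A short round trip shows how the check and reverse flags behave together. This is a sketch under stated assumptions: notes.txt, notes.before and the patch content are invented for the demonstration.

```shell
# Hypothetical target file plus an untouched copy to compare against.
work=$(mktemp -d)
cd "$work"
printf 'one\ntwo\nthree\n' > notes.txt
cp notes.txt notes.before

cat > change.patch <<'EOF'
--- a/notes.txt
+++ b/notes.txt
@@ -1,3 +1,3 @@
 one
-two
+2
 three
EOF

# Dry run: patch reports what it would do but writes nothing.
patch --dry-run -p1 < change.patch
cmp -s notes.txt notes.before && echo "dry run left the file alone"

# Apply for real, then reverse the same patch with -R.
patch -p1 < change.patch
patch -R -p1 < change.patch
cmp -s notes.txt notes.before && echo "reverse restored the original"
```

Running the dry run first costs almost nothing and means a failing hunk is discovered before anything in the tree has been modified.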

When a hunk does not fit cleanly, the patch tool first looks for the context a few lines above or below where the hunk header said it would be. If that fails too, it falls back on what it calls a fuzz factor: it ignores a context line or two at the edges of the hunk, on the theory that the file has drifted but not changed beyond recognition. If even that fails, it writes the rejected hunk to a .rej file beside the target, plainly. You can read the reject, edit by hand, and try again. The tool is honest about its limits, and it tells you exactly what it could not do. Beauty, here, includes telling the truth about failure.
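Both outcomes can be reproduced in a scratch directory. In this sketch the directory names drifted and changed, the file notes.txt and the patch itself are all hypothetical; the -d option tells patch to change into a directory before doing anything else.

```shell
work=$(mktemp -d)
cd "$work"
mkdir drifted changed

# One patch, written against a three-line file.
cat > change.patch <<'EOF'
--- a/notes.txt
+++ b/notes.txt
@@ -1,3 +1,3 @@
 one
-two
+2
 three
EOF

# Case 1: two lines were added above the hunk since the patch was made.
# The context still matches, just two lines further down, so it applies.
printf 'header\nanother\none\ntwo\nthree\n' > drifted/notes.txt
patch -d drifted -p1 < change.patch
cat drifted/notes.txt

# Case 2: the file no longer contains anything the hunk can match.
# patch exits non-zero and leaves the hunk in notes.txt.rej.
printf 'uno\ndos\ntres\n' > changed/notes.txt
patch -d changed -p1 < change.patch || echo "patch reported a failed hunk"
cat changed/notes.txt.rej
```

In the second case the target file is left as it was; the only new artifact is the reject file, ready to be read and applied by hand.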

On FreeBSD

FreeBSD ships patch in the base system at /usr/bin/patch, BSD-licensed, descended directly from Larry Wall's original source line. OpenBSD and NetBSD carry the same lineage; macOS does too. The tool is simply present on a fresh install, no package required.

GNU patch, part of the GNU project and licensed under the GPL, is a separate fork from the same root. It has grown some extra extensions over the years (additional file-name heuristics, more flexible handling of edge cases) but the on-disk format both tools read and write is the same. A patch produced by git diff on one machine applies on any machine with either implementation. That interoperability after forty years is not an accident; it is what one gets when the format is small enough to specify completely and honestly.

This series prizes that property of FreeBSD's choices: keep the lean BSD-licensed original in base, where it is one of the few tools every engineer has on day one. The GPL alternative is a pkg install away if you need a specific extension. For the daily load, the in-base tool is the whole tool.

The Lineage

Larry Wall posted patch 1.3 to the mod.sources newsgroup on 8 May 1985, while working at System Development Corporation in Santa Monica. He had already written rn, the news reader, the year before; Perl was still two years away. Wall's patch borrowed the diff output that the BSD diff(1) tool already produced (in the older context-diff format) and turned it from a thing humans read into a thing programs apply. That move, treating diff output as a machine-readable changeset rather than a human report, is the conceptual reduction the rest of the story rests on.

The format Wall worked with then was the context diff: each hunk listed several lines of context before and after the change, with the old and new versions in separate blocks. It worked, but it was verbose. In August 1990, Wayne Davison posted unidiff to comp.sources.misc, volume 14: a new format that interleaved the deletions and insertions into a single block, sharing the context between them, and saved roughly a quarter of the bytes on a typical patch. Richard Stallman folded unidiff support into GNU diff 1.15 in January 1991, the patch tool learned to read it shortly after, and the unified format has been the lingua franca ever since. Git produces it. GitHub shows it. Every code-review tool you have used speaks it.
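The size difference between the two formats is easy to measure on any pair of files. The sketch below uses two hypothetical scratch files; the exact saving depends on the shape of the change, so the numbers it prints illustrate the effect rather than measure the typical case.

```shell
# Hypothetical scratch files in a throwaway directory.
work=$(mktemp -d)
cd "$work"
printf 'alpha\nbeta\ngamma\ndelta\n' > old.txt
printf 'alpha\nbeta\nGAMMA\ndelta\nepsilon\n' > new.txt

# -c writes the older context format: the old and new versions of each
# hunk appear as separate blocks, each repeating the shared context.
# -u writes the unified format: one block, context shared between both sides.
# tr strips the padding some wc implementations put around the count.
context_bytes=$(diff -c old.txt new.txt | wc -c | tr -d ' ')
unified_bytes=$(diff -u old.txt new.txt | wc -c | tr -d ' ')

echo "context diff: $context_bytes bytes"
echo "unified diff: $unified_bytes bytes"
```

Reading the two outputs side by side (diff -c old.txt new.txt, then diff -u old.txt new.txt) shows where the bytes go: the context format prints alpha, beta and delta twice, once per block.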

The two implementations alive today (the BSD line carried in FreeBSD, OpenBSD, NetBSD and macOS; the GNU fork in the GNU project's repositories) both descend from Wall's 1985 source, and both remain interoperable to this day. A bug found in 2020 (Warner Losh, restoring 2.11BSD, traced a thirty-five-year-old corner case in the original parsing) was triaged identically against both. The format is so small that two independent maintainer lines have kept it stable for forty-one years without drifting.

That is the lesson of the episode. Not "patches are simple" (they are not, in the corners), but "the standard worth keeping is the one small enough to be a sentence". Wall wrote one in 1985. Davison sharpened it in 1990. Every modern code review still depends on it, and a patch -p1 < fix.diff from a 1986 manual still works on a 2026 FreeBSD installation. That is the kind of compatibility one earns by writing a small, honest format and then leaving it alone.


By Vivian Voss, System Architect & Software Developer.