<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vivian Voss</title>
    <description>The latest articles on DEV Community by Vivian Voss (@vivian-voss).</description>
    <link>https://dev.to/vivian-voss</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3841501%2F2405ae59-aa07-4eb1-80f2-1c3517691538.png</url>
      <title>DEV Community: Vivian Voss</title>
      <link>https://dev.to/vivian-voss</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vivian-voss"/>
    <language>en</language>
    <item>
      <title>The Terms You Did Not Sign: HashiCorp's BSL, OpenTofu and IBM</title>
      <dc:creator>Vivian Voss</dc:creator>
      <pubDate>Fri, 05 Jun 2026 06:54:38 +0000</pubDate>
      <link>https://dev.to/vivian-voss/the-terms-you-did-not-sign-hashicorps-bsl-opentofu-and-ibm-36g5</link>
      <guid>https://dev.to/vivian-voss/the-terms-you-did-not-sign-hashicorps-bsl-opentofu-and-ibm-36g5</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnk4p7w5athg5576ynftd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnk4p7w5athg5576ynftd.png" alt="A wide landscape view of a late-afternoon woodland clearing where a single dirt path forks into two distinct branches. A young woman with long dark brown hair and pink cat-ear headphones, stands slightly off-centre at the fork, viewed from a three-quarter back angle, contemplating the choice. She wears a longer-cut white t-shirt, blue jeans torn at one knee, and dark brown Dr. Martens boots. One hand is raised to the back of her head, scratching, the universal gesture of " width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;In the Net — Episode 06&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;On 10 August 2023, HashiCorp announced that all future releases of Terraform, Vault, Consul, Nomad, Packer, Boundary, Waypoint and Vagrant would move from the Mozilla Public License v2.0 to the Business Source License v1.1. There was no consultation with users; there was no extended discussion period; the announcement was a press release. Forty-one days later, on 20 September 2023, the Linux Foundation accepted OpenTofu, a community fork of Terraform held under MPL 2.0, with founding sponsorship from Spacelift, Harness, Gruntwork, env0 and Scalr. By January 2024, OpenTofu 1.6 had shipped as a drop-in replacement. On 24 April 2024, IBM announced its intent to acquire HashiCorp for $6.4 billion. The acquisition closed on 27 February 2025, after the U.K. Competition and Markets Authority granted clearance.&lt;/p&gt;

&lt;p&gt;The Terraform file in your repository, written before any of this, is on a different licence today than it was on the day you wrote it. That sentence, plainly read, is what this episode is about.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Promise
&lt;/h2&gt;

&lt;p&gt;HashiCorp made infrastructure boring, in the best sense. Before Terraform, the path from "I want an EC2 instance" to "an EC2 instance exists" went through AWS CloudFormation templates (proprietary, AWS-only), Chef recipes (Ruby, mutable), Ansible playbooks (push-based imperative), or a shell script and a prayer. Each had merits; none generalised. Terraform's HCL let you describe an EC2 instance, a Postgres database, an S3 bucket, an IAM policy, a Route 53 record and the dependencies that bound them, then plan the difference between your intent and the cloud's reality, then apply that difference in dependency order, then store the resulting state file for the next plan to read.&lt;/p&gt;

&lt;p&gt;For nine years (Terraform 0.1 shipped in July 2014), the source code was Mozilla Public License v2.0. MPL 2.0 is a copyleft licence in the file-level sense: modifications to MPL-licensed files must be released under MPL, but those files may be combined with code under other licences (including proprietary code) at the file boundary. For the practical purpose of building a business around Terraform, it was permissive enough that a CI provider could host runs, a consultancy could automate it, a wrapper could extend it, all without a negotiation with HashiCorp.&lt;/p&gt;

&lt;p&gt;That permissiveness produced the ecosystem. Spacelift, env0, Scalr, Terramate, Atlantis, Atmos, Terragrunt and a long tail of internal tooling at every serious cloud-using organisation grew up around the assumption that Terraform's binary, source, and provider ecosystem were a stable commons. The de facto IaC standard had Open Source mechanics behind it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Hooks
&lt;/h2&gt;

&lt;p&gt;The Business Source License 1.1 is not Open Source by the Open Source Initiative's definition. It is "source available", which is to say that you can read the source and use the code for purposes the licensor permits, but the licence fails the Open Source Definition, most plainly its sixth criterion (no discrimination against fields of endeavour): there is no freedom to use the code, or to redistribute modified versions, for any purpose. The BSL contains what its drafters call an "Additional Use Grant", a paragraph in which the licensor names the production uses it permits and, by exclusion, those it does not; the canonical carve-out, used by MariaDB and Sentry before HashiCorp, is "you may not offer a commercial product that competes with us".&lt;/p&gt;

&lt;p&gt;HashiCorp's Additional Use Grant forbids "production use that competes with HashiCorp's commercial offering". HashiCorp's commercial offerings include Terraform Cloud, Terraform Enterprise, Vault Enterprise, Consul Enterprise and several adjacent services. What constitutes "competition with" these offerings is, on the face of the licence text, ambiguous, and that ambiguity is the design point. Two examples illustrate the range:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A consultancy that writes Terraform for a client, runs the binary on the client's CI runners, and bills the client for the work. Production use? Almost certainly. Competing with HashiCorp's commercial offering? Arguably, since Terraform Cloud sells managed runs as a SaaS. The licence text does not draw a bright line.&lt;/li&gt;
&lt;li&gt;A SaaS platform whose value proposition is "managed Terraform runs with policy guards and secrets". Production use? Yes. Competing with Terraform Cloud? Almost certainly yes; this was the design target of the BSL change.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;HashiCorp has, in its public guidance, said that "end users running Terraform on their own infrastructure" remain permitted. But end users have lawyers, and lawyers read the licence text, and the licence text is not what the public guidance says it is.&lt;/p&gt;

&lt;p&gt;Three further mechanics of the BSL change matter to the user:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every binary built from official HashiCorp source after 10 August 2023 falls under BSL until exactly four years after that version's release, when the licence on that version (and only that version) converts to MPL 2.0. Terraform 1.5.7, the last MPL release, will remain MPL 2.0 forever; Terraform 1.6.0 and onwards is BSL for four years from each individual release date.&lt;/li&gt;
&lt;li&gt;Forking BSL code as Open Source is forbidden by the BSL itself. Forking it as proprietary, source-available or BSL code is permitted. The OpenTofu fork was made from Terraform 1.5.7 (the last MPL version) precisely because the MPL-licensed code was the only fork-target the team could legally relicense.&lt;/li&gt;
&lt;li&gt;HashiCorp's APIs, SDKs, libraries and provider plugins (the things that talk to AWS, Azure and GCP on your behalf) remain MPL 2.0. The core binary changed; the surrounding ecosystem code, mostly, did not.&lt;/li&gt;
&lt;/ul&gt;
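
&lt;p&gt;The four-year clock in the first point is simple date arithmetic, and a minimal Python sketch makes it concrete. The function name and the release date below are hypothetical, chosen only for illustration; the sketch also assumes the conversion falls on the calendar anniversary of the release, which the licence text for a given version should be checked against.&lt;/p&gt;

```python
from datetime import date


def bsl_change_date(release, years=4):
    """Return the day a BSL-licensed release converts to its change licence.

    Assumption: the conversion falls on the calendar anniversary of the
    release date. The name and the default of four years are illustrative.
    """
    try:
        return release.replace(year=release.year + years)
    except ValueError:
        # 29 February has no anniversary in a non-leap target year.
        return release.replace(year=release.year + years, day=28)


# Hypothetical release date, not taken from any real changelog.
example_release = date(2024, 1, 10)
print(bsl_change_date(example_release))
```

&lt;p&gt;Each release carries its own clock, so an estate that upgrades continuously never reaches the converted version while it is still current.&lt;/p&gt;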

&lt;p&gt;The Hook, summarised: your existing Terraform code is fine. Your next upgrade is on different legal terms than your last upgrade.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Standing
&lt;/h2&gt;

&lt;p&gt;OpenTofu was forked very quickly. On 25 August 2023, fifteen days after the BSL announcement, a manifesto signed by initial supporters proposed OpenTF as a Linux Foundation project. On 20 September 2023, the Linux Foundation formally accepted the project, which had by then been renamed OpenTofu. The founding sponsors were Spacelift, Harness, Gruntwork, env0 and Scalr, with subsequent endorsements from Digger, Terrateam, Massdriver, Terramate and others. All of these are vendors whose business model HashiCorp's BSL Additional Use Grant ambiguously threatens.&lt;/p&gt;

&lt;p&gt;By January 2024 OpenTofu shipped 1.6, the first stable release, fully compatible with Terraform 1.5.x including module syntax, provider ecosystem and state file format. The tool's vocabulary changed (&lt;code&gt;terraform&lt;/code&gt; becomes &lt;code&gt;tofu&lt;/code&gt; at the command line), the lockfile differs slightly, and OpenTofu added features Terraform did not have, including OCI registry support for modules and providers. The state file written by either binary remains compatible with the other; a &lt;code&gt;terraform apply&lt;/code&gt; followed by a &lt;code&gt;tofu apply&lt;/code&gt; on the same state file is, today, a working migration path.&lt;/p&gt;

&lt;p&gt;GitHub stars are an imperfect measure of community adoption, but they are visible: OpenTofu crossed 20,000 stars within months of its 1.6 release and continued to climb. Major cloud providers, vendors and large internal platforms migrated. The community voted with its mirror.&lt;/p&gt;

&lt;p&gt;Some three months after OpenTofu 1.6 shipped, on 24 April 2024, IBM announced its intent to acquire HashiCorp for $6.4 billion (approximately $35 per share in cash). The acquisition was delayed by regulatory review at the U.S. Federal Trade Commission and the U.K. Competition and Markets Authority, longer than IBM's original "by the end of 2024" guidance suggested. The CMA cleared the deal in late February 2025; the acquisition closed on 27 February 2025. As of that date, the licence on every Terraform release after 10 August 2023 belongs, contractually, to IBM.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Exit That Isn't
&lt;/h2&gt;

&lt;p&gt;The Business Source License is reversible at the licensor's discretion only. HashiCorp could, in principle, restore MPL 2.0 to its products tomorrow. IBM, the licensor now, could do the same. Either could also extend the four-year BSL period, modify the Additional Use Grant, or replace BSL with a more restrictive licence altogether at future releases. The Terraform file you wrote in 2014 was on a contract you understood; the Terraform file you write today is on a contract that IBM holds and may, with notice, change.&lt;/p&gt;

&lt;p&gt;This is Lock-in by Retroactive Adoption. The hooks were not laid when you adopted the tool. The hooks were retrofitted onto the version-stream of the tool you had already adopted, and the retrofit happened because a press release said so, not because you renegotiated. The only practical defences are forks (OpenTofu, OpenBao) and migrations away (Pulumi, Crossplane), and both of those are work you did not budget for when you adopted the original tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Price
&lt;/h2&gt;

&lt;p&gt;HashiCorp's commercial pricing is not the headline cost. Terraform Cloud SaaS prices per applied resource, per concurrent run and per workspace; the Free tier covers small teams; the Standard, Plus and Enterprise tiers add Sentinel policy enforcement, SSO, audit logging and run pipelines. For a hundred-engineer organisation managing a few thousand resources, the annual bill comfortably reaches six figures. Terraform Enterprise (the self-hosted variant) starts at five-figure annual commitments. Vault Enterprise prices per client (per authenticated identity per month). At the upper end of large estates, the HashiCorp annual spend can reach seven figures.&lt;/p&gt;

&lt;p&gt;The pricing was the same pricing before the BSL change as after. The BSL change was not a pricing increase; it was a redefinition of the legal terms on which the free version was available, which has the effect of pushing organisations who built on the free version, at scale, toward either the commercial version or toward a migration. The licence is the leverage; the pricing is the price of staying.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Escape Route
&lt;/h2&gt;

&lt;p&gt;The migration off HashiCorp's licensed stack is, today, a more concrete proposition than the migration off VMware or Oracle Java SE (Episodes 04 and 05 of this series), because the community produced a complete replacement.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For Terraform: OpenTofu&lt;/strong&gt; (Linux Foundation, MPL 2.0). Replaces the &lt;code&gt;terraform&lt;/code&gt; binary with &lt;code&gt;tofu&lt;/code&gt;. State files are forward and backward compatible with Terraform 1.5.x. The HCL syntax is identical. Provider plugins are reusable. The migration, for the existing .tf code in your repository, is a tooling swap rather than a rewrite. Some, not all, of the features introduced in Terraform 1.6.0 and later have been reimplemented. For a sufficiently complex enterprise estate, the migration is a several-week project to audit the differences and exercise the new binary in CI; for a small estate, it is an afternoon.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For Vault: OpenBao&lt;/strong&gt; (Linux Foundation, MPL 2.0). Forked at Vault 1.14.x in late 2023; GA release December 2024. Drop-in for most Vault workloads; some enterprise features (HSM integration, MFA, namespaces in the OSS sense) require additional development or paid alternatives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For Consul: pinned-version OSS or migration to alternatives.&lt;/strong&gt; Service mesh capability has largely shifted to Istio and Linkerd; KV-store needs map onto etcd or Consul OSS pinned to a pre-BSL version.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Beyond the BSL stack:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pulumi&lt;/strong&gt; (Apache 2.0): IaC in Python, Go, TypeScript, C#. Different programming model; a real port rather than a swap. Mature; appropriate when the team prefers a general-purpose programming language over HCL.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Crossplane&lt;/strong&gt; (Apache 2.0): Kubernetes-native composition. Defines cloud resources as Kubernetes Custom Resources, reconciled by controllers. Appropriate when the team already runs Kubernetes and wants infrastructure-as-controller rather than infrastructure-as-code.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Practical hygiene for new code from today onwards:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Avoid Terraform Cloud private registries and sensitive-variable storage; those are extraction points&lt;/li&gt;
&lt;li&gt;Pin provider versions explicitly; do not float&lt;/li&gt;
&lt;li&gt;Keep CI runners self-hosted where possible; reduce dependency on HashiCorp-owned runtime&lt;/li&gt;
&lt;li&gt;Maintain state file backups that any compatible binary can read&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Coda
&lt;/h2&gt;

&lt;p&gt;The pattern of this episode is the sixth distinct shape of Lock-in this series has named. Adobe took your file format. LinkedIn took your reach. AWS took your identity and your egress. VMware took your perpetual licence in an acquisition you were not party to. Oracle took your Java users and billed your entire workforce. HashiCorp, now IBM, took the licence on the source of the tool you had already adopted, four years deep into your platform, and changed it under your feet. There is no shock; there is no audit; there is no per-employee invoice. There is a press release, a four-year BSL clock, and a Terraform file in your repository whose terms today are not its terms when you wrote it.&lt;/p&gt;

&lt;p&gt;You wrote infrastructure as code so the next engineer could read it. You did not promise the next licence-holder would let them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://vivianvoss.net/blog/the-terms-you-did-not-sign" rel="noopener noreferrer"&gt;Read the full article on vivianvoss.net →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;By &lt;a href="https://vivianvoss.net" rel="noopener noreferrer"&gt;Vivian Voss&lt;/a&gt;, System Architect and Software Developer. Follow me on &lt;a href="https://www.linkedin.com/in/vvoss/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; for daily technical writing.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>terraform</category>
      <category>opentofu</category>
      <category>opensource</category>
      <category>devops</category>
    </item>
    <item>
      <title>Unix, Everything Is a File</title>
      <dc:creator>Vivian Voss</dc:creator>
      <pubDate>Thu, 04 Jun 2026 08:02:28 +0000</pubDate>
      <link>https://dev.to/vivian-voss/unix-everything-is-a-file-4did</link>
      <guid>https://dev.to/vivian-voss/unix-everything-is-a-file-4did</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Favntzwz52i7d3vzv0sfw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Favntzwz52i7d3vzv0sfw.png" alt="A vertically composed cyberpunk illustration in three stacked era-bands. Top: a 1969 Bell Labs computer room with a Digital Equipment Corporation PDP-7 minicomputer, its main console with toggle switches and indicator lights flanked by connected equipment cabinets (memory cabinet, paper-tape reader, magnetic tape drive), thick cable runs in dim light. Middle: a 1990s beige Sun SPARCstation with a curved-glass CRT showing green phosphor text 'ls -la /' and a short directory listing. Bottom: a 2026 setup with a modern MacBook on a desk next to a smartphone, the laptop screen showing FreeBSD top(1) output with load averages, CPU and memory lines and a small process table. Spiralling vertically through all three eras in the foreground, a cyberpunk data-stream of empty cyan-glowing folder icons connected by cyan and magenta neon ribbons, conveying continuity across decades. The cover overlay reads: title EVERYTHING IS A FILE, subtitle UNIX, BELL LABS, SINCE 1969, claims " width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;By Design — Episode 07&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In the summer of 1969, Ken Thompson had three weeks of uninterrupted time. His wife and infant son were visiting relatives in California, and the AT&amp;amp;T research division at Murray Hill had recently withdrawn from the Multics project, leaving Thompson with a little-used PDP-7 (an 18-bit minicomputer with only a few thousand words of memory) and the lingering question of what a smaller, cleaner operating system might look like. By the end of those three weeks Thompson had written the first version of what became Unix. With Dennis Ritchie and Rudd Canaday over the following months, the small Bell Labs group built a hierarchical filesystem, the notion of computer processes and a command interpreter; pipes for inter-process communication followed in 1973. Through all of it ran one architectural idea that has carried half a century without much fading: the file as the universal interface. A device, a pipe, a socket, a process listing, all opened, read, written and closed through the same system calls.&lt;/p&gt;

&lt;p&gt;The idea did not arrive as a slogan. It arrived as the path of least resistance: if you had a kernel that already knew how to give a userspace program a numbered handle to a piece of named state, and a small set of operations on that handle, why invent another kind of handle for the next piece of state? Devices got file paths. Pipes got file descriptors without paths. Sockets, once they arrived in 4.2BSD in 1983, got the same file descriptors after a special creation call. The shape held, and the shape was the point.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Complaint
&lt;/h2&gt;

&lt;p&gt;"Sockets are not files. GPUs are not files. Hardware state does not fit a read()."&lt;/p&gt;

&lt;p&gt;The complaint is as old as the design, and it is not unfounded. Sockets carry connection state, message boundaries, and asynchronous events; representing them as a stream of bytes throws information away. GPUs have command queues, memory layouts, and a parallel execution model that no sequence of read() and write() calls naturally captures. A terminal has modes, signal characters and a baud rate that nobody types into the data stream. The escape hatch, ioctl, exists precisely because some operations refuse to pretend they are bytes flowing in and out, and the Unix designers did not pretend otherwise.&lt;/p&gt;

&lt;p&gt;The complaint, taken seriously, has produced several alternative designs over the decades. Microkernels handed each device its own service. Modern operating-system research has proposed message-passing kernels, capability-based systems, and explicit object models in place of file descriptors. Each is more expressive in some specific direction. None has displaced the file as the lingua franca of Unix-like systems, and the reason is not that the alternatives are wrong; it is that the file is unreasonably composable.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Decision
&lt;/h2&gt;

&lt;p&gt;The decision, plainly: when in doubt, give it a path.&lt;/p&gt;

&lt;p&gt;Devices live in /dev/, with paths the user can read aloud (/dev/null, /dev/random, /dev/zero) and that even a shell script can open. Pipes are anonymous file descriptors created by pipe(2); the shell binds them between processes with the | character that became the most-copied piece of syntax in the history of computing. Sockets are created with socket(), bound and connected with their own calls, and then used through the same read/write/close as everything else. Processes are inspectable as /proc entries on Linux and several other systems, optionally on FreeBSD; the kernel exposes itself as a filesystem because once you have decided that files are how state is named, you may as well expose your own state that way.&lt;/p&gt;
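
&lt;p&gt;The uniformity described above can be seen from userspace in a few lines. This is a minimal Python sketch, assuming a Unix-like system with /dev/zero and /dev/null present; the helper name &lt;code&gt;roundtrip&lt;/code&gt; is invented for the illustration. It pushes the same bytes through a pipe and a socket pair with identical calls, then reads from one device and writes to another.&lt;/p&gt;

```python
import os
import socket


def roundtrip(write_fd, read_fd, payload):
    """Write payload to one descriptor and read it back from another.

    The function never asks what kind of object sits behind the descriptors.
    """
    os.write(write_fd, payload)
    return os.read(read_fd, len(payload))


# An anonymous pipe: two descriptors, no path.
pipe_read, pipe_write = os.pipe()
via_pipe = roundtrip(pipe_write, pipe_read, b"hello")

# A connected pair of sockets, created by their own call and then
# used through the same read and write as the pipe.
left, right = socket.socketpair()
via_socket = roundtrip(left.fileno(), right.fileno(), b"hello")

# Devices with paths: /dev/zero supplies zero bytes, /dev/null discards writes.
zero_fd = os.open("/dev/zero", os.O_RDONLY)
zeros = os.read(zero_fd, 4)
null_fd = os.open("/dev/null", os.O_WRONLY)
discarded = os.write(null_fd, b"gone")

for fd in (pipe_read, pipe_write, zero_fd, null_fd):
    os.close(fd)
left.close()
right.close()

print(via_pipe, via_socket, zeros, discarded)
```

&lt;p&gt;Only the creation step differs between the four; everything after it is the same small vocabulary.&lt;/p&gt;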

&lt;p&gt;Ritchie and Thompson's 1974 ACM paper, "The UNIX Time-Sharing System", published in CACM volume 17 number 7, codified what had already been running on the PDP-7 since 1969 and was, by then, also running on the PDP-11. Plan 9 from Bell Labs (Pike, Thompson, Presotto, Winterbottom, first edition 1992) pushed the idea to its logical end. In Plan 9, the window system is a file system; the keyboard is a file system; remote machines are mounted as file systems through the 9P protocol; even computation can be a file system. Dennis Ritchie said of Plan 9 that "Unix had the right idea that just about everything in the system is accessible through a file" but did not, in the end, follow through. Plan 9 followed through.&lt;/p&gt;

&lt;p&gt;It is rather a sweeping promise, on the face of it. The decision was to keep it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Trade-Off
&lt;/h2&gt;

&lt;p&gt;Not every operation fits a sequence of bytes. The Unix designers knew this from the start, and ioctl is the honest answer. A terminal driver has a stty configuration that no read() will reveal; a network interface has an IP address that you do not set by writing bytes to a file. ioctl is the call you make when the file model has carried you as far as it can, and the rest needs structured arguments and structured replies.&lt;/p&gt;
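
&lt;p&gt;A short Python sketch shows what "structured arguments and structured replies" looks like in practice. It assumes a Unix-like system where a pseudo-terminal can be allocated, and uses a pseudo-terminal in place of a real one so that it does not depend on an interactive session; the chosen size of 40 by 132 is arbitrary.&lt;/p&gt;

```python
import fcntl
import os
import pty
import struct
import termios

# A pseudo-terminal pair stands in for a real terminal.
master_fd, slave_fd = pty.openpty()

# Set the window size: rows, columns, and two pixel fields left at zero.
requested = struct.pack("HHHH", 40, 132, 0, 0)
fcntl.ioctl(slave_fd, termios.TIOCSWINSZ, requested)

# Ask for it back. The reply is a packed structure, not a byte stream
# that read() on the descriptor would ever deliver.
reply = fcntl.ioctl(slave_fd, termios.TIOCGWINSZ, bytes(8))
rows, cols, _xpixel, _ypixel = struct.unpack("HHHH", reply)

os.close(master_fd)
os.close(slave_fd)

print(rows, cols)
```

&lt;p&gt;The descriptor is still an ordinary file descriptor, opened and closed like any other; only the configuration travels through the side door.&lt;/p&gt;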

&lt;p&gt;GPU drivers, in the modern era, are the most striking case. A modern GPU exposes a command queue, a memory allocator, a synchronisation primitive set and a shader compiler interface. The path-and-handle model can carry the open-the-device step, but the rest is ioctl calls that look more like RPC than like file IO. DRM (Direct Rendering Manager) on Linux and FreeBSD is the practical compromise: the GPU is a file in /dev/, opened like everything else, and then almost all the work happens through ioctl.&lt;/p&gt;

&lt;p&gt;Hardware that refuses the model outright lives behind its own ABI. Some embedded firmware, some legacy industrial controllers, some accelerator cards reach userspace through libraries that bypass the filesystem entirely. The Unix answer is to extend the file when possible (memory mapping, asynchronous IO via fd, kqueue/epoll on file descriptors), accept the ioctl when the file abstraction cannot stretch further, and keep the path-and-handle vocabulary as the default.&lt;/p&gt;

&lt;p&gt;The trade is honest. The file model is not free; it is cheap because the alternatives are expensive.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Proof
&lt;/h2&gt;

&lt;p&gt;Fifty-six years of Unix, and the file is still the unit.&lt;/p&gt;

&lt;p&gt;On every mainstream Unix-like system today, the same paths exist: /dev/null discards what is written to it, /dev/random and /dev/urandom give cryptographic randomness, /dev/zero gives an inexhaustible stream of zero bytes. Pipes hold the shell together; a one-liner like ps ax | grep nginx | awk '{print $1}' composes three programs into one query because every one of them reads from a file descriptor and writes to a file descriptor without caring whether the other end is a terminal, a file or another process.&lt;/p&gt;
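
&lt;p&gt;What the shell does with the | character can be reproduced by hand. The following Python sketch assumes grep and awk are installed, and replaces the live &lt;code&gt;ps ax&lt;/code&gt; output with a small made-up listing so that the result is reproducible; the process names and numbers in it are hypothetical.&lt;/p&gt;

```python
import subprocess

# Hypothetical stand-in for `ps ax` output.
listing = b"101 nginx: master\n202 sshd\n303 nginx: worker\n"

grep = subprocess.Popen(
    ["grep", "nginx"], stdin=subprocess.PIPE, stdout=subprocess.PIPE
)
# awk reads directly from the descriptor grep writes to.
awk = subprocess.Popen(
    ["awk", "{print $1}"], stdin=grep.stdout, stdout=subprocess.PIPE
)
grep.stdout.close()  # awk holds the read end now

grep.stdin.write(listing)
grep.stdin.close()  # end of input for grep

pids = awk.communicate()[0].split()
grep.wait()

print(pids)
```

&lt;p&gt;Neither program is told what sits at the other end of its descriptors; the parent process only wires them together.&lt;/p&gt;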

&lt;p&gt;On FreeBSD the discipline is intact. devfs (in-base since FreeBSD 5.0, 2003) presents the kernel's device tree as a regular filesystem mounted on /dev/; you can stat a device, change its permissions with chmod, and follow symbolic links to it like any other file. GELI, the FreeBSD disk-encryption framework, exposes each encrypted volume as a /dev/ node carrying an .eli suffix, a transparent overlay you can run newfs on or hand to ZFS as a vdev. ZFS volumes (zvol) appear under /dev/zvol/, named by pool and volume; you can dd to one, partition it, export it via iSCSI, all through the file interface. bhyve, FreeBSD's hypervisor, presents virtual machine handles under /dev/vmm/. Plan 9 is the pure version of the idea and is still actively maintained at 9front.org, on the same architectural premise as in 1992.&lt;/p&gt;

&lt;p&gt;Linux started the same way and has, in some corners, drifted. /sys, the sysfs hierarchy added in 2.6, is in the spirit of the idea, extending /proc with kernel object trees. So far, so good. The drift sits elsewhere. D-Bus, originating in 2002, carries the message bus that earlier Unix work would have built as files and sockets; systemd, from 2010, builds its own interfaces around cgroups, sockets, the journal, and a large set of D-Bus services. Netlink sockets are the alternative kernel channel for routing, audit and firewall configuration that traditional Unix would have done through paths and ioctls. eBPF is a programmable layer that runs verified bytecode inside the kernel rather than a path that names a piece of state; it is, in many ways, a different architectural idea.&lt;/p&gt;

&lt;p&gt;None of these are wrong. They solve real problems that the simple file model handles awkwardly. But the cumulative effect is that the Linux of 2026 has, in several important corners, become a system in which the file is one of several interfaces rather than the interface. Dennis Ritchie's verdict on Plan 9 was that "Unix had the right idea but didn't follow through". The same sentence, read against 2026 Linux, has rotated by 180 degrees: the right idea has been followed through in some corners and quietly diluted in others. The FreeBSD line is closer to the original commitment, deliberately so.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Principle
&lt;/h2&gt;

&lt;p&gt;One interface, infinite implementations.&lt;/p&gt;

&lt;p&gt;The lesson, made portable: when a small set of operations can address an arbitrarily large set of resources, the operations compose without limit. Pipes, redirections, file-descriptor passing, the entire shell tradition of composing tools by gluing their inputs and outputs together, rest on this property. The protocol the kernel offers (open, read, write, close, ioctl as the escape) and the abstraction the shell composes (the |, the &amp;lt;, the &amp;gt;, the 2&amp;gt;&amp;amp;1) are the same protocol viewed from different ends. There is no impedance to match because there is no impedance to bridge.&lt;/p&gt;

&lt;p&gt;A Unix one-liner reads almost like a sentence because every noun in it is a file. The verbs are universal. The grammar is small. The expressive range is, in practice, vast. Half a century of operational evidence sits behind the idea, and the corners of the modern Unix world that have departed from it tend to discover, after a while, that they have rebuilt parts of it with worse vocabulary.&lt;/p&gt;

&lt;p&gt;That is what an architectural decision worth keeping looks like. Not perfection, not freedom from trade-offs, not absence of escape hatches. A small, honest, composable shape that has not ceased to pay in fifty-six years.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://vivianvoss.net/blog/unix-everything-is-a-file" rel="noopener noreferrer"&gt;Read the full article on vivianvoss.net →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;





</description>
      <category>unix</category>
      <category>freebsd</category>
      <category>architecture</category>
      <category>history</category>
    </item>
    <item>
      <title>The Eighth Server: How One Missed Deploy Ended Knight Capital, 2012</title>
      <dc:creator>Vivian Voss</dc:creator>
      <pubDate>Wed, 03 Jun 2026 08:54:18 +0000</pubDate>
      <link>https://dev.to/vivian-voss/the-eighth-server-how-one-missed-deploy-ended-knight-capital-2012-34og</link>
      <guid>https://dev.to/vivian-voss/the-eighth-server-how-one-missed-deploy-ended-knight-capital-2012-34og</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffzo04vw4jrm3tmsl8jgm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffzo04vw4jrm3tmsl8jgm.png" alt="A dim Wall Street trading floor in the chaos of the first minutes after the opening bell. Dozens of LED monitors stream cascades of red tickers and falling price charts with sharp downward breaks. Cool blue and magenta tones pulse across the screens. Several trader chairs sit empty, mid-flight; a printer at the right edge is mid-spew, white paper unspooling onto the floor. A single large display on the back wall shows a sharply falling red price chart with a steep downward break. A young woman with long dark brown hair and pink cat-ear headphones, white t-shirt and torn jeans, stands in the foreground at the edge of a desk, one hand on the desk surface, watching the scene from a three-quarter back angle. Cyberpunk-noir cool deep blue dominant, magenta accents on monitor bezels. The cover overlay reads: title DEAD FLAG WALKING, subtitle KNIGHT CAPITAL, 2012, claims " width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Tales from the Bare Metal — Episode 06&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;At 09:30 on 1 August 2012, the New York Stock Exchange opened a new programme for retail-order matching, the Retail Liquidity Program. By 10:15, Knight Capital Group, one of the largest market makers on the American equities markets, had ceased to function as a going concern. The forty-five minutes between those two times cost the firm roughly $440 million in realised pre-tax loss, required emergency capital from Jefferies within days, and led to an agreed acquisition by Getco within months. The proximate cause was one server out of eight, running code from 2003 that, in a strictly source-control sense, had never gone away.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Incident
&lt;/h2&gt;

&lt;p&gt;Knight ran a routing system called SMARS, the Smart Market Access Routing System, which decided how to forward orders into the various American equity venues. The production deployment of SMARS sat on eight servers; identical, redundant, all expected to handle a share of the morning's order flow. In late July, in preparation for the NYSE's launch of the Retail Liquidity Program on 1 August, Knight's developers prepared a release that taught SMARS to recognise and route RLP-eligible orders.&lt;/p&gt;

&lt;p&gt;The deployment to the eight production servers ran on schedule the day before the launch. By the SEC's later account, the new code reached seven of the eight servers correctly. On the eighth, the deployment did not complete, and nobody knew. The server kept running the previous build.&lt;/p&gt;

&lt;p&gt;When the markets opened at 09:30, retail order flow began arriving at SMARS bearing a new piece of metadata: a particular flag in the order-routing protocol, set by the RLP programme, indicating an order eligible for retail matching. On the seven correctly-deployed servers, the routing logic recognised the flag, consulted the new RLP code path, and routed accordingly. On the eighth, the same flag was interpreted through code paths last meaningful in 2003.&lt;/p&gt;

&lt;p&gt;What happened next is the part that reads like a horror story even now. The eighth server began generating orders at an exceptional rate, accumulating positions Knight had never intended to take, paying the ask and selling at the bid, again and again, in some 150 stocks. By the time the firm halted trading and unwound, the numbers were unprecedented: roughly four million orders had been sent, roughly seven billion dollars of unintended positions had been built, and individual share prices had moved by tens of percent in minutes. Knight booked a realised pre-tax loss of approximately $440 million when it unwound the position. The firm raised roughly $400 million of emergency capital from investors led by Jefferies within days, the SEC subsequently fined Knight $12 million for violations of the Market Access Rule, and Getco Holdings agreed to acquire the firm before the year was out. The Knight name effectively ended that morning.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Diagnosis
&lt;/h2&gt;

&lt;p&gt;The defective behaviour was produced by a module called Power Peg.&lt;/p&gt;

&lt;p&gt;Power Peg had been written in 2003 to manage parent-order execution: an algorithmic test routine for slicing large orders into many smaller ones. It had been used in production briefly, judged unfit for purpose, and disabled after 2005. The disabling, however, was operational rather than structural. The Power Peg code remained in the SMARS source tree, compiled into the binary, dormant. What kept it dormant was a single flag in the order-routing protocol: when that flag was set, the SMARS code activated Power Peg; when not, the module did nothing. After 2005, the systems that produced upstream orders simply stopped setting the flag, and Power Peg slept.&lt;/p&gt;

&lt;p&gt;Years passed. Power Peg was not removed in any subsequent refactor. The 2003 code, unchanged, continued to be built into every release of SMARS.&lt;/p&gt;

&lt;p&gt;In 2012, in preparation for the RLP launch, the SMARS routing protocol gained a new feature: the ability to recognise RLP-eligible orders. The implementation, plausibly enough, repurposed the bit position that had once carried the Power Peg activation signal. The bit was no longer in use, the engineers reasoned; let it carry the RLP signal now. The new code, on the seven correctly-deployed servers, read the bit and routed by RLP rules. The old code, still resident on the eighth server, read the same bit and read it as "start Power Peg".&lt;/p&gt;
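
&lt;p&gt;The collision is easier to see in miniature. What follows is a toy model only, since the real SMARS message format is not public: the bit value and the function names are invented, and the two functions stand in for the old and new builds reading the same field.&lt;/p&gt;

```shell
#!/bin/sh
# Toy model only: the flag layout and function names are invented for
# illustration and do not describe the real SMARS protocol.
REPURPOSED_BIT=4   # numeric value of the single bit both builds inspect

# True when the bit with value $2 is set in the integer $1.
bit_is_set() {
    [ $(( ($1 / $2) % 2 )) -eq 1 ]
}

# Old build: the bit still means "wake the dormant module".
route_old_build() {
    if bit_is_set "$1" "$REPURPOSED_BIT"; then
        echo "activate legacy module"
    else
        echo "route normally"
    fi
}

# New build: the same bit now means "retail-eligible order".
route_new_build() {
    if bit_is_set "$1" "$REPURPOSED_BIT"; then
        echo "route as retail order"
    else
        echo "route normally"
    fi
}
```

&lt;p&gt;Hand both functions the same flags value, 4 for instance, and one identical message produces two different decisions. Nothing in the message itself says which reading is the current one.&lt;/p&gt;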

&lt;p&gt;Each RLP-eligible order, on the eighth server, was therefore an instruction to fire up a nine-year-old algorithm. Power Peg, as originally written, was a test routine. It had no production-grade understanding of fills, of order completion, of when to stop. Its inner loop bought at the ask and sold at the bid; the outer loop fed it more parent orders to slice. The retail order stream of a major venue at market open is a great many orders. Each one became a parent. Each parent became a flurry. Within minutes, Knight was the dominant counterparty in dozens of stocks. Within fifteen minutes, the firm's exposure was visible in the prices themselves.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Context
&lt;/h2&gt;

&lt;p&gt;Three quiet drifts compounded, and none of them, judged at the time they happened, was foolish.&lt;/p&gt;

&lt;p&gt;The first was the source tree. In 2005, the team that decided Power Peg was unfit gated it with a flag rather than deleting it. That decision was defensible at the time: the code might still be wanted, removing it risked breaking adjacent assumptions, the codebase was large and removal was real work for an uncertain payoff. The gating worked perfectly, every day, for seven years. The flag became invisible: still there, still loaded, still compiled, but inert. The engineers who made the 2005 decision were no longer present in 2012. The link between the flag, the gated module and the bit position the flag occupied existed in nobody's head, and was nowhere recorded as a constraint on future use of that bit.&lt;/p&gt;

&lt;p&gt;The second was the deployment script. The script that pushed code to the eight SMARS hosts treated "deployment" as a file-copy operation. It reported success when the copy step had run and the SMARS process had restarted on each target. It did not, as part of "success", verify that the running binary on each target was the new one. It did not interrogate each host for a build identifier. It did not require a healthcheck that the new code alone could pass. In a fleet of eight, an old binary on one looks exactly like a new binary on the others from outside, as long as you only ask "did the deploy command succeed". For the eighth server, the deploy command did succeed, in the sense the script meant; the new files had not, in fact, arrived. The script's notion of success was the wrong notion.&lt;/p&gt;

&lt;p&gt;The third was the release note. The change documentation for the RLP release recorded that the cumulative-quantity-flag bit was now being used to signal RLP eligibility. It did not, and probably could not, record that the same bit had once been the activation signal for an ancient module that was still compiled into the running binary. The reviewer reading "we now use this flag bit for RLP" had no honest way of knowing the bit had a prior life; the prior life had been dormant for seven years and lived in a comment, if it lived anywhere. To call the review careless is to misunderstand what was visible to the reviewer.&lt;/p&gt;

&lt;p&gt;All three were ordinary. Gating instead of deleting; treating "files copied" as "deploy succeeded"; documenting the new use of a flag without auditing every prior use across the source tree. Every team does at least one of these. Most teams do all three. Knight's bad luck was to do all three on the morning a stock exchange opened a new programme.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Principle
&lt;/h2&gt;

&lt;p&gt;Two architectural disciplines would have prevented this incident, and they remain useful in every modern stack.&lt;/p&gt;

&lt;p&gt;The first is to delete dead code, not to gate it. A flag that disables a module is a switch that an unrelated change can later flip. The module is still present, still loaded, still subject to whatever the runtime decides to do with it; its absence in behaviour rests on a piece of state that was never intended to be a load-bearing safety mechanism. A deleted module, on the other hand, is gone: not in the binary, not in the loaded process, not waiting for an accidental activation. Source control retains the history; the running system retains nothing. If a module is too dangerous to remove, the right response is to make it safe enough to remove, not to leave it gated.&lt;/p&gt;

&lt;p&gt;The second is to verify after deploy, on the property of the deployed code itself. The end of a deployment is not "the files copied successfully". It is "every target host reports the new build's identity". A deployment script that hits a /version endpoint on each host after restart, that compares the returned hash to the expected hash, that fails loudly if any host disagrees, is a small piece of plumbing that catches the specific class of failure that ended Knight Capital. In a FreeBSD shop, this is rc.d managing the SMARS-equivalent, plus a deploy script in plain shell that loops over the host list, hits a /version endpoint on each, and refuses to declare success until every host returns the expected build. This is not exotic engineering. It is half a page of shell.&lt;/p&gt;
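
&lt;p&gt;A minimal sketch of that half page, under stated assumptions: each host is taken to serve its build hash as plain text at &lt;code&gt;/version&lt;/code&gt; on port 8080, and the names &lt;code&gt;fetch_version&lt;/code&gt; and &lt;code&gt;verify_fleet&lt;/code&gt; are hypothetical. It is not a reconstruction of anything Knight ran.&lt;/p&gt;

```shell
#!/bin/sh
# Sketch: refuse to call a deploy done until every host reports the
# expected build. Assumption: each host serves its build hash as plain
# text at /version on port 8080; path, port and names are illustrative.

# Ask one host which build it is running. Kept as its own function so it
# can be swapped for another mechanism (ssh, a status file) if needed.
fetch_version() {
    curl -fsS --max-time 5 "http://$1:8080/version"
}

# verify_fleet EXPECTED HOST...
# Returns 0 only if every host reports EXPECTED. Otherwise prints one
# MISMATCH line per disagreeing host and returns 1.
verify_fleet() {
    expected=$1
    shift
    failed=0
    for host in "$@"; do
        # A host that cannot be queried counts as a mismatch, not a pass.
        actual=$(fetch_version "$host") || actual="unreachable"
        if [ "$actual" != "$expected" ]; then
            printf 'MISMATCH %s: expected %s, got %s\n' "$host" "$expected" "$actual"
            failed=1
        fi
    done
    return "$failed"
}
```

&lt;p&gt;A deploy script would call &lt;code&gt;verify_fleet&lt;/code&gt; with the hash it has just shipped and the full host list, and exit non-zero if it fails. The design choice that matters is that an unreachable host is treated as a failure: silence from one host in eight is exactly the case worth catching.&lt;/p&gt;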

&lt;p&gt;A great many of the practices that have become normal in the last decade, particularly the discipline of immutable infrastructure, container image hashes, and deploy verification through Kubernetes' Deployment status, are downstream of incidents like Knight's. They exist precisely because, before them, organisations could not be sure what was running on which host.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where It Travels
&lt;/h2&gt;

&lt;p&gt;The pattern wears local clothes everywhere.&lt;/p&gt;

&lt;p&gt;Kubernetes: a rolling update on a Deployment where one node has the image cached under the same tag from a previous build. With imagePullPolicy: IfNotPresent (the default for any tag other than :latest), the kubelet uses the cached image. The new pod starts, the readiness probe passes (returning 200 from either version), and the Deployment reports rolled out, with one pod silently still on the old code.&lt;/p&gt;

&lt;p&gt;Feature-flag libraries (LaunchDarkly, ConfigCat, Unleash, Flagsmith): a flag whose semantic meaning has shifted between releases. The old code path is still in the binary, gated by the same flag, waiting for somebody to wake it.&lt;/p&gt;

&lt;p&gt;Cloud auto-scaling: an EC2 launch template that points at an outdated AMI ID. Newly scaled-out instances run the old binary while the freshly-deployed ones run the new. Traffic is balanced across them as if they were identical.&lt;/p&gt;

&lt;p&gt;Helm and Kustomize: a Deployment manifest pins version 2.4.0, but one cluster node's local image cache resolves the tag to an older 2.3.7. The Pod runs from the local cache; the Deployment status reports healthy.&lt;/p&gt;

&lt;p&gt;CI/CD job matrices: a deployment job that mass-deploys to a list of targets and reports success when N out of N return zero exit codes, without verifying any target's running version.&lt;/p&gt;

&lt;p&gt;The shape is identical in each: a release that completes on N-1 out of N targets, a verification step that asks "did the deploy succeed" rather than "is the new version running everywhere", and a piece of latent surface area (a stale image, a repurposed flag, a cached AMI, an older registry layer) lying in wait on the unverified one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Coda
&lt;/h2&gt;

&lt;p&gt;Knight rebuilt nothing; Knight was absorbed. The architectural lessons of 1 August 2012 are by now in many shops' release checklists, and the SEC's Market Access Rule has acquired a fixed point in financial-services compliance training. The risk has not gone away; it has merely been moved, into stacks that have richer deployment tooling and, often, the same blind spots dressed in newer vocabulary.&lt;/p&gt;

&lt;p&gt;The single sentence worth carrying out of this episode is the question that ends it: when our deployment reports success this afternoon, on what property of the deployed code did it confirm that?&lt;/p&gt;

&lt;p&gt;A deploy that succeeds on seven of eight is one that failed quietly on one. Production does not give partial credit; the market gives none at all.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://vivianvoss.net/blog/the-eighth-server" rel="noopener noreferrer"&gt;Read the full article on vivianvoss.net →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;By &lt;a href="https://vivianvoss.net" rel="noopener noreferrer"&gt;Vivian Voss&lt;/a&gt;, System Architect and Software Developer. Follow me on &lt;a href="https://www.linkedin.com/in/vvoss/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; for daily technical writing.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>postmortem</category>
      <category>devops</category>
      <category>reliability</category>
      <category>deploy</category>
    </item>
    <item>
      <title>patch: The Format That Taught Us to Ship Changes</title>
      <dc:creator>Vivian Voss</dc:creator>
      <pubDate>Tue, 02 Jun 2026 06:58:42 +0000</pubDate>
      <link>https://dev.to/vivian-voss/patch-the-format-that-taught-us-to-ship-changes-3d7n</link>
      <guid>https://dev.to/vivian-voss/patch-the-format-that-taught-us-to-ship-changes-3d7n</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd2rzkaaj81dhxqa0desi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd2rzkaaj81dhxqa0desi.png" alt="A young woman with long dark brown hair and pink cat-ear headphones stands in a dark indigo workspace at night, both hands held slightly out as if gently conjuring. Above her float two translucent glowing document icons: original.c on the upper left, modified.c on the upper right. Between them, a luminous swirling spiral of typographic lines streams across the frame — red lines each beginning with a minus sign drift outward as deletions, green lines each beginning with a plus sign flow inward as additions, while pale-grey unmarked lines weave between them as untouched context. The swirl converges below into a third document, halfway formed and still resolving from light, labelled patched.c. The cover overlay reads: title PATCH, subtitle SINCE 1985, claims " width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Technical Beauty — Episode 38&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A maintainer sends you a fix. Not as a branch on a server somewhere, not as a pull request awaiting your authentication token, not as an invitation to fetch a remote you have not heard of. A text file, attached to an email or pasted into an issue, with plus signs and minus signs and a few lines of plain context around them. You drop it onto your tree, type one command, and your code is current. No credentials, no network, no infrastructure.&lt;/p&gt;

&lt;p&gt;The format that taught the world to ship a change in a dozen lines arrived in 1985, written by a man who would invent Perl two years later. It is, by some distance, the smallest unit of code distribution in widespread use, and it is the format every modern code-review tool, every commit, every diff in your terminal still speaks.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Format
&lt;/h2&gt;

&lt;p&gt;A patch is a recipe, not a snapshot. It does not contain the new file; it contains the change that turns the old file into the new one. That is the whole conceptual move, and it is the reason a fix to a million-line codebase can travel as twenty lines of email.&lt;/p&gt;

&lt;p&gt;The file header names the source and destination paths. Each hunk in the file is a small unit of change with its own little header (the line ranges in the old and new versions: &lt;code&gt;@@ -42,7 +42,9 @@&lt;/code&gt;), followed by the lines themselves: a few lines of context, each prefixed with a space; the lines to remove, each prefixed with &lt;code&gt;-&lt;/code&gt;; the lines to add, each prefixed with &lt;code&gt;+&lt;/code&gt;; and a few more lines of context to close. The context is what allows the apply-tool to find the right place in the file even if the surrounding code has drifted by a few lines since the patch was generated.&lt;/p&gt;
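
&lt;p&gt;To see a hunk in isolation, here is a small sketch that builds two throwaway three-line files and prints the single hunk between them. The file names and contents are invented, and &lt;code&gt;demo_hunk&lt;/code&gt; is a hypothetical helper; the two file-header lines are trimmed because they carry paths and timestamps that differ on every run.&lt;/p&gt;

```shell
#!/bin/sh
# Sketch: produce a one-hunk unified diff from two throwaway files.
# File names and contents are invented for illustration.
demo_hunk() {
    work=$(mktemp -d) || return 1
    printf 'alpha\nbeta\ngamma\n' > "$work/old.txt"
    printf 'alpha\nBETA\ngamma\n' > "$work/new.txt"
    # Drop the two file-header lines so only the hunk remains.
    # diff exits 1 when the files differ; that is expected, not an error.
    diff -u "$work/old.txt" "$work/new.txt" | sed '1,2d' || true
    rm -rf "$work"
}

demo_hunk
# Prints:
# @@ -1,3 +1,3 @@
#  alpha
# -beta
# +BETA
#  gamma
```

&lt;p&gt;The hunk header says the change covers three lines starting at line 1 in both versions; the unprefixed-looking lines are context, each carrying a leading space.&lt;/p&gt;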

&lt;p&gt;That is the entire vocabulary. A header, hunks, plus and minus and space. The whole standard fits on a page, and it has carried every code change in every open-source project for the better part of four decades.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Surface
&lt;/h2&gt;

&lt;p&gt;In daily use the idiom is two commands. Make a patch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;diff &lt;span class="nt"&gt;-u&lt;/span&gt; original modified &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; change.patch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply a patch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;patch &lt;span class="nt"&gt;-p1&lt;/span&gt; &amp;lt; change.patch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;-u&lt;/code&gt; flag tells diff to use the unified format, which is the one every modern tool produces. The &lt;code&gt;-p1&lt;/code&gt; flag tells patch to strip one leading directory from the paths recorded in the file, which is how you make a patch generated against &lt;code&gt;a/src/main.c&lt;/code&gt; apply to &lt;code&gt;src/main.c&lt;/code&gt; in your tree. The leading-directory game is a convention that grew up around how patches were generated against named source trees and is, after thirty years, simply the gesture one learns.&lt;/p&gt;

&lt;p&gt;A handful of further flags carry most of the rest of the daily work:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;patch &lt;span class="nt"&gt;-R&lt;/span&gt; &amp;lt; change.patch         &lt;span class="c"&gt;# reverse: undo the patch&lt;/span&gt;
patch &lt;span class="nt"&gt;--dry-run&lt;/span&gt; &lt;span class="nt"&gt;-p1&lt;/span&gt; &amp;lt; change.patch  &lt;span class="c"&gt;# check, do not write&lt;/span&gt;
patch &lt;span class="nt"&gt;-p1&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; change.patch       &lt;span class="c"&gt;# explicit input file&lt;/span&gt;
bzcat change.patch.bz2 | patch &lt;span class="nt"&gt;-p1&lt;/span&gt;   &lt;span class="c"&gt;# compressed input via pipeline&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When a hunk does not fit cleanly, the patch tool tries to apply it with what it calls a fuzz factor: it allows the context lines to be a little wrong, within a small tolerance, on the theory that the file has drifted but not changed beyond recognition. If even that fails, it writes the rejected hunk to a &lt;code&gt;.rej&lt;/code&gt; file beside the target, plainly, with the failing hunk in the original format. You can read the reject, edit by hand, and try again. The tool is honest about its limits, and it tells you exactly what it could not do. Beauty, here, includes telling the truth about failure.&lt;/p&gt;

&lt;h2&gt;
  
  
  On FreeBSD
&lt;/h2&gt;

&lt;p&gt;FreeBSD ships patch in the base system at &lt;code&gt;/usr/bin/patch&lt;/code&gt;, BSD-licensed, descended directly from Larry Wall's original source line. OpenBSD and NetBSD carry the same lineage; macOS does too. The tool is simply present on a fresh install, no package required.&lt;/p&gt;

&lt;p&gt;GNU patch, part of the GNU project and licensed under the GPL, is a separate fork from the same root. It has grown some extra extensions over the years (additional file-name heuristics, more flexible handling of edge cases) but the on-disk format both tools read and write is the same. A patch produced by &lt;code&gt;git diff&lt;/code&gt; on one machine applies on any machine with either implementation. That interoperability after forty years is not an accident; it is what one gets when the format is small enough to specify completely and honestly.&lt;/p&gt;

&lt;p&gt;This series prizes that property of FreeBSD's choices: keep the lean BSD-licensed original in base, where it is one of the few tools every engineer has on day one. The GPL alternative is a &lt;code&gt;pkg install&lt;/code&gt; away if you need a specific extension. For the daily load, the in-base tool is the whole tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Lineage
&lt;/h2&gt;

&lt;p&gt;Larry Wall posted patch 1.3 to the mod.sources newsgroup on 8 May 1985, from his desk at System Development Corporation in Santa Monica. He had already written rn, the news reader, the year before; Perl was still two years away. Wall's patch borrowed the diff output that the BSD &lt;code&gt;diff(1)&lt;/code&gt; tool already produced (in the older context-diff format) and turned it from a thing humans read into a thing programs apply. That move, treating diff output as a machine-readable changeset rather than a human report, is the conceptual reduction the rest of the story rests on.&lt;/p&gt;

&lt;p&gt;The format Wall worked with then was the context diff: each hunk listed several lines of context before and after the change, with the old and new versions in separate blocks. It worked, but it was verbose. In August 1990, Wayne Davison posted unidiff to comp.sources.misc, volume 14: a new format that interleaved the deletions and insertions into a single block, sharing the context between them, and saved roughly a quarter of the bytes on a typical patch. Richard Stallman folded unidiff support into GNU diff 1.15 in January 1991, the patch tool learned to read it shortly after, and the unified format has been the lingua franca ever since. Git produces it. GitHub shows it. Every code-review tool you have used speaks it.&lt;/p&gt;

&lt;p&gt;The two implementations alive today (the BSD line carried in FreeBSD, OpenBSD, NetBSD and macOS; the GNU fork in the GNU project's repositories) both descend from Wall's 1985 source, and both remain interoperable to this day. A bug found in 2020 (Warner Losh, restoring 2.11BSD, traced a thirty-five-year-old corner case in the original parsing) was triaged identically against both. The format is so small that two independent maintainer lines have kept it stable for forty-one years without drifting.&lt;/p&gt;

&lt;p&gt;That is the lesson of the episode. Not "patches are simple" (they are not, in the corners), but "the standard worth keeping is the one small enough to be a sentence". Wall wrote one in 1985. Davison sharpened it in 1990. Every modern code review still depends on it, and a &lt;code&gt;patch -p1 &amp;lt; fix.diff&lt;/code&gt; from a 1986 manual still works on a 2026 FreeBSD installation. That is the kind of compatibility one earns by writing a small, honest format and then leaving it alone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://vivianvoss.net/blog/patch" rel="noopener noreferrer"&gt;Read the full article on vivianvoss.net →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;By &lt;a href="https://vivianvoss.net" rel="noopener noreferrer"&gt;Vivian Voss&lt;/a&gt; — System Architect &amp;amp; Software Developer. Follow me on &lt;a href="https://www.linkedin.com/in/vvoss/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; for daily technical writing.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>unix</category>
      <category>freebsd</category>
      <category>history</category>
      <category>devops</category>
    </item>
    <item>
      <title>procstat vs lsof: Asking the System Which Process Holds What</title>
      <dc:creator>Vivian Voss</dc:creator>
      <pubDate>Mon, 01 Jun 2026 08:45:17 +0000</pubDate>
      <link>https://dev.to/vivian-voss/procstat-vs-lsof-asking-the-system-which-process-holds-what-9f0</link>
      <guid>https://dev.to/vivian-voss/procstat-vs-lsof-asking-the-system-which-process-holds-what-9f0</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fibzecgzgcrxgn1nosflu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fibzecgzgcrxgn1nosflu.png" alt="A young woman with long dark brown hair and pink cat-ear headphones sits at a modern standing desk in a developer's home office, working at a wide ultrawide monitor that shows a dark terminal with columnar fstat output for nginx (PID 4711) and postgres (PID 8082) across /var/log and /var/db. On the desk: a Stream Deck with illuminated keys, a mechanical keyboard with subtle RGB underglow, a steel coffee mug and an open paper notebook with a pen. Three framed posters hang on the wall behind her — a classic Atari Fuji logo (left), a Commodore Amiga Boing Ball (centre), and a Hitchhiker's Guide to the Galaxy " title="" width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The Unix Way — Episode 19&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;There is a class of error message that every engineer who has touched a long-running system knows by reflex. "Cannot remove: text file busy." "Address already in use." "Mount point is busy." And, perhaps best of all, the silent contradiction in which &lt;code&gt;df&lt;/code&gt; reports a filesystem at ninety percent while &lt;code&gt;du&lt;/code&gt; can only account for half the bytes. They are different messages, but they all refuse the same question: which process is holding this open?&lt;/p&gt;

&lt;p&gt;FreeBSD and Linux both answer the question, and they answer it properly. They keep their own tools, in their own shapes, for what is at heart the same enquiry to the kernel. The tools, and the shapes they prefer, are the substance of this episode.&lt;/p&gt;

&lt;h2&gt;
  
  
  FreeBSD: procstat (and fstat)
&lt;/h2&gt;

&lt;p&gt;The modern tool is &lt;code&gt;procstat(1)&lt;/code&gt;, written by Robert N M Watson and first shipped in FreeBSD 7.1. Since FreeBSD 9.0 in 2012 it has been built on top of &lt;code&gt;libprocstat(3)&lt;/code&gt;, a stable, well-documented C library that every program needing process state can link against directly. The FreeBSD ecosystem (&lt;code&gt;top&lt;/code&gt;, &lt;code&gt;gstat&lt;/code&gt;, debugger frontends, monitoring tools) draws from one common source, and the command-line utility is itself a thin client of that library. &lt;code&gt;procstat&lt;/code&gt; is the per-process introspection front-end:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;procstat &lt;span class="nt"&gt;-f&lt;/span&gt; 4711               &lt;span class="c"&gt;# file-descriptor view, one process&lt;/span&gt;
procstat &lt;span class="nt"&gt;-af&lt;/span&gt;                   &lt;span class="c"&gt;# file-descriptor view, all processes&lt;/span&gt;
procstat &lt;span class="nt"&gt;-t&lt;/span&gt; 4711               &lt;span class="c"&gt;# thread view, with states&lt;/span&gt;
procstat &lt;span class="nt"&gt;-v&lt;/span&gt; 4711               &lt;span class="c"&gt;# virtual-memory map&lt;/span&gt;
procstat &lt;span class="nt"&gt;-k&lt;/span&gt; 4711               &lt;span class="c"&gt;# kernel stack&lt;/span&gt;
procstat &lt;span class="nt"&gt;-s&lt;/span&gt; 4711               &lt;span class="c"&gt;# security credentials&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Different flag, same tool, same library underneath. &lt;code&gt;procstat&lt;/code&gt; is the focused interview with one process; the library is the quietly important half, because the next tool that needs to ask the same questions does not have to parse text or scrape &lt;code&gt;/proc&lt;/code&gt;. It links against &lt;code&gt;libprocstat&lt;/code&gt;, like every other tool on the system.&lt;/p&gt;

&lt;p&gt;Its classical sibling is &lt;code&gt;fstat(1)&lt;/code&gt;, which has served the base system since 4.3BSD-Tahoe in 1988 and is still installed by default. Where &lt;code&gt;procstat&lt;/code&gt; is the per-process interview, &lt;code&gt;fstat&lt;/code&gt; is the system-wide ledger: every open file, by every process, on every filesystem the kernel knows. A handful of flags narrows the listing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;fstat &lt;span class="nt"&gt;-p&lt;/span&gt; 4711                  &lt;span class="c"&gt;# every open file held by pid 4711&lt;/span&gt;
fstat &lt;span class="nt"&gt;-u&lt;/span&gt; www                   &lt;span class="c"&gt;# everything the www user has open&lt;/span&gt;
fstat /var/log/messages        &lt;span class="c"&gt;# every process holding this exact path&lt;/span&gt;
fstat &lt;span class="nt"&gt;-f&lt;/span&gt; /var                  &lt;span class="c"&gt;# everything open under the /var mountpoint&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;fstat&lt;/code&gt; is wonderfully general, and for one specific diagnostic it remains the first reach. The case that most often justifies either tool in production is the deleted-but-held file. A long-running daemon (a database, a log forwarder, a web server) opens a logfile, then someone or something unlinks the file from the filesystem while the daemon still holds it open. The directory entry is gone, so &lt;code&gt;du&lt;/code&gt; cannot see the bytes; but the inode and its blocks remain allocated until the last fd is closed, so &lt;code&gt;df&lt;/code&gt; still counts them. The disk fills, the alert fires, no obvious file is to blame.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;fstat&lt;/code&gt; will show such a file with a &lt;code&gt;-&lt;/code&gt; in the inode column where the link normally lives:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;fstat &lt;span class="nt"&gt;-p&lt;/span&gt; 4711 | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'$5=="-"'&lt;/span&gt;  &lt;span class="c"&gt;# files this pid holds but that are unlinked&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The cure is to close the fd. In practice that means a graceful restart, a SIGHUP to a daemon that knows how to reopen its log, or in well-designed services a deliberate logrotate hook. Identifying the offender takes seconds with the right tool. Without it, it takes an evening.&lt;/p&gt;

&lt;h2&gt;
  
  
  Linux: lsof
&lt;/h2&gt;

&lt;p&gt;On Linux the same question is answered by &lt;code&gt;lsof(8)&lt;/code&gt;, written by Vic Abell at Purdue University and first released to &lt;code&gt;comp.sources.unix&lt;/code&gt; in 1991. Two years before that, in 1989, he had published ports of BSD's &lt;code&gt;fstat&lt;/code&gt; and &lt;code&gt;ofiles&lt;/code&gt; commands to DYNIX, SunOS and ULTRIX, and &lt;code&gt;lsof&lt;/code&gt; is in a direct line of descent: it is, recognisably, a generalisation of the BSD idea, written to work across the many Unix-likes that the early 1990s contained.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;lsof&lt;/code&gt; is, in the kindest sense, the universal hammer. It understands regular files, directories, sockets (TCP, UDP, raw, Unix-domain), pipes, anonymous inodes, character and block devices, the various synthetic entries the kernel exposes through &lt;code&gt;/proc&lt;/code&gt;, and a great many things besides. The flag vocabulary is dense; the basics suffice for most working questions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;lsof &lt;span class="nt"&gt;-p&lt;/span&gt; 4711                   &lt;span class="c"&gt;# everything pid 4711 has open&lt;/span&gt;
lsof +D /var/log               &lt;span class="c"&gt;# everything open beneath a directory tree&lt;/span&gt;
lsof &lt;span class="nt"&gt;-i&lt;/span&gt; :443                   &lt;span class="c"&gt;# processes holding TCP/UDP port 443&lt;/span&gt;
lsof &lt;span class="nt"&gt;-u&lt;/span&gt; www-data               &lt;span class="c"&gt;# everything one user has open&lt;/span&gt;
lsof +L1                       &lt;span class="c"&gt;# files with link count &amp;lt; 1: deleted but held&lt;/span&gt;
lsof &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="nt"&gt;-P&lt;/span&gt;                     &lt;span class="c"&gt;# skip DNS and service-name lookups (faster)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;lsof +L1&lt;/code&gt; is the Linux answer to the FreeBSD &lt;code&gt;awk '$5=="-"'&lt;/code&gt; pattern: name the deleted-but-held files and the processes holding them. It is the command you reach for when &lt;code&gt;df&lt;/code&gt; and &lt;code&gt;du&lt;/code&gt; disagree, and it is the right reach the first time.&lt;/p&gt;

&lt;p&gt;The thing to understand about &lt;code&gt;lsof&lt;/code&gt; on modern Linux is that almost everything it reports is derived from &lt;code&gt;/proc&lt;/code&gt;. Each running process has a directory &lt;code&gt;/proc/&amp;lt;pid&amp;gt;/&lt;/code&gt; containing a &lt;code&gt;fd/&lt;/code&gt; subdirectory of symbolic links to the actual files (or sockets, or pipes) that file descriptors point to, and the kernel-maintained &lt;code&gt;net/&lt;/code&gt; files describe socket state. &lt;code&gt;lsof&lt;/code&gt; is, in a real sense, a structured tour of &lt;code&gt;/proc&lt;/code&gt;, presented as one consistent output. If &lt;code&gt;/proc&lt;/code&gt; is the kernel's public filing cabinet, &lt;code&gt;lsof&lt;/code&gt; is the librarian who knows where everything is.&lt;/p&gt;
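
&lt;p&gt;A minimal sketch of the same tour done by hand, for the box where &lt;code&gt;lsof&lt;/code&gt; is not installed. It assumes Linux with &lt;code&gt;/proc&lt;/code&gt; mounted, and it relies on the kernel appending &lt;code&gt;(deleted)&lt;/code&gt; to the link target of an unlinked file. The name &lt;code&gt;held_deleted&lt;/code&gt; is a hypothetical helper, not part of any tool, and descriptor 9 is an arbitrary choice.&lt;/p&gt;

```shell
#!/bin/sh
# Sketch: find deleted-but-held files for one pid by reading /proc (Linux only).
# held_deleted is a hypothetical helper name, not part of lsof.
held_deleted() {
    pid="$1"
    for link in /proc/"$pid"/fd/*; do
        # Each entry is a symlink named after the descriptor number.
        target=$(readlink "$link" 2>/dev/null) || continue
        case "$target" in
            *" (deleted)") printf '%s\t%s\n' "${link##*/}" "$target" ;;
        esac
    done
}

# Demonstration: hold a file open on descriptor 9, unlink it, then look.
tmp=$(mktemp)
exec 9>"$tmp"        # this shell now holds the file open for writing
rm -f "$tmp"         # the name is gone; the inode stays while fd 9 is open
held_deleted "$$"    # prints the descriptor number and the stale path
exec 9>/dev/null     # repointing fd 9 drops the last reference to the file
```

&lt;p&gt;This answers the question for one pid at a time. &lt;code&gt;lsof +L1&lt;/code&gt; remains the better reach when the offender is not yet known, because it walks every process in one pass.&lt;/p&gt;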

&lt;h2&gt;
  
  
  The Shape
&lt;/h2&gt;

&lt;p&gt;Two answers to the same question, two different shapes.&lt;/p&gt;

&lt;p&gt;FreeBSD splits the work. &lt;code&gt;procstat&lt;/code&gt; is the focused per-process introspection, &lt;code&gt;fstat&lt;/code&gt; the broad system view that has been part of the base system since 1988, &lt;code&gt;libprocstat&lt;/code&gt; the C library that any tool can link against to ask the same questions in its own way. Each piece does one thing and does it well; the pieces compose, and the next tool that needs to know what process holds what does not need to parse text or scrape &lt;code&gt;/proc&lt;/code&gt;. It reads from &lt;code&gt;libprocstat&lt;/code&gt;, like everyone else on the system.&lt;/p&gt;

&lt;p&gt;Linux folds the same work into one large, lovingly maintained binary. &lt;code&gt;lsof&lt;/code&gt; understands every kind of open file and every kind of socket; it reads from &lt;code&gt;/proc&lt;/code&gt; and from kernel symbol tables and from netlink and from a great many other places, and it presents the union as one consistent output. Historically there has been no stable &lt;code&gt;liblsof&lt;/code&gt;, because &lt;code&gt;lsof&lt;/code&gt; is itself the interface; its sustained existence as one program, with one author and now one maintainer organisation, is what makes its breadth possible.&lt;/p&gt;

&lt;p&gt;Neither shape is wrong. The Linux model trades modularity at the C-library layer for completeness at the command-line layer, and the result is a single tool that handles every variety of "open" the kernel exposes. The FreeBSD model trades that single-binary breadth for a layered design in which every program that needs process state asks the same library, in the same way, and the user-facing tools (&lt;code&gt;procstat&lt;/code&gt;, &lt;code&gt;fstat&lt;/code&gt;, &lt;code&gt;top&lt;/code&gt;, &lt;code&gt;gstat&lt;/code&gt;, debuggers) are thin clients of that library. The first is delightful when you have one weird question and a hurry to answer it. The second is delightful when you are writing the next tool, or when you want to be sure that &lt;code&gt;top&lt;/code&gt; and your monitoring agent and your shell prompt are all reading the same kernel reality.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Point
&lt;/h2&gt;

&lt;p&gt;Both ship with the OS. Neither asks you to install a runtime. Neither wants JSON. Whether your daily reach is &lt;code&gt;procstat -f&lt;/code&gt; or &lt;code&gt;lsof +L1&lt;/code&gt;, the operative discipline is the same: when a system refuses to do what you asked, ask it precisely what is in the way. The kernel will tell you, plainly, on either platform. The Unix way prefers parts that compose; &lt;code&gt;lsof&lt;/code&gt; prefers parts that arrive together. The right answer to "which one?" is "whichever ships with the box in front of you, and learn its flags before the page goes off."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://vivianvoss.net/blog/procstat-vs-lsof" rel="noopener noreferrer"&gt;Read the full article on vivianvoss.net →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;By &lt;a href="https://vivianvoss.net" rel="noopener noreferrer"&gt;Vivian Voss&lt;/a&gt; — System Architect &amp;amp; Software Developer. Follow me on &lt;a href="https://www.linkedin.com/in/vvoss/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; for daily technical writing.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>freebsd</category>
      <category>linux</category>
      <category>sysadmin</category>
      <category>devops</category>
    </item>
    <item>
      <title>Why We Restart to Fix It</title>
      <dc:creator>Vivian Voss</dc:creator>
      <pubDate>Sun, 31 May 2026 09:25:06 +0000</pubDate>
      <link>https://dev.to/vivian-voss/why-we-restart-to-fix-it-4c3</link>
      <guid>https://dev.to/vivian-voss/why-we-restart-to-fix-it-4c3</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo9yuvyrh5nezvahmht1x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo9yuvyrh5nezvahmht1x.png" alt="Quote on the left half, reading: " width="800" height="457"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;On Second Thought — Episode 10&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The pager has gone off. Memory on the auth service is climbing in a way it should not be. You SSH in, you observe nothing in particular, you &lt;code&gt;kubectl delete pod&lt;/code&gt;. The pod comes back, memory is fresh, the graph flattens. The on-call channel returns to silence. Nobody asks what was wrong.&lt;/p&gt;

&lt;p&gt;This is the tenth episode of &lt;em&gt;On Second Thought&lt;/em&gt;, a series about the daily routines we perform without ever quite deciding to. Today's routine is the one that runs at the top of half the world's incident response: when the machine misbehaves, restart it. Have you tried turning it off and on again. The line is a joke, until it is a runbook, until it is the production strategy, until it is the only debugging step we still know.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Axiom
&lt;/h2&gt;

&lt;p&gt;The reflex is universal. A stuck container, a frozen browser tab, a JVM that has been ruminating on a class loader for forty minutes, a Kafka consumer that fell off a partition, a connection pool that quietly stopped reaping idle handles. The runbook for half the world's incidents has three steps, and the first two are window-dressing for the third. We accept this as the natural response to a system that fails, in the same way we accept that traffic jams are simply the price of having cars. On second thought, both deserve a second thought.&lt;/p&gt;

&lt;p&gt;The strange thing is not that we restart. Restart is, in the right architecture, a perfectly reasonable response to a particular class of fault. The strange thing is that the restart has become the diagnosis, and that we have built an entire generation of platforms around the assumption that it would.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Origin
&lt;/h2&gt;

&lt;p&gt;The reflex has two parents, and we mostly inherited only one of them.&lt;/p&gt;

&lt;p&gt;The first is the consumer-electronics tradition, codified in the catchphrase of &lt;em&gt;The IT Crowd&lt;/em&gt; but older than the show. A Sky+ box from 2004, a Windows laptop, a wireless router: a device that has wandered into an unknown state can be returned to a known one only by power-cycling it, because the device offers no language in which to be asked where it had gone wrong. The assumption was reasonable for the hardware of its time. It was honest about its limits: the only legible interface to the device's interior was the off switch.&lt;/p&gt;

&lt;p&gt;The second tradition is the one engineers built deliberately to replace the first, and it is the one we mostly forgot.&lt;/p&gt;

&lt;p&gt;In 1986, at Ericsson's Computer Science Laboratory in Stockholm, Joe Armstrong, Robert Virding and Mike Williams began work on what became Erlang, a language for the kinds of telephone exchanges that simply could not be allowed to go down. The constraint was not academic. A switch that dropped calls cost regulatory fines and lost contracts. A switch that ran for a year between reboots was the point. The output of that work, the AXD301 ATM switch, runs on roughly two million lines of Erlang and is the system most often cited for "nine nines" of reliability: on the order of 31 milliseconds of downtime per year. The figure is contested in the way every figure of that shape is contested: whether the measurement was apples-to-apples, whether it included planned maintenance, whether the operational data was systematically collected. The architecture that produced it, however, is uncontested, and it is the architecture that matters here.&lt;/p&gt;

&lt;p&gt;Armstrong's principle, on the surface, looked exactly like the consumer tradition: when a process gets into a bad state, terminate it. He called it "let it crash", and the phrase has done more damage to the idea than any critic could. Read as a slogan it sounds like the Sky+ box: when in doubt, kill it. Read as architecture, it is the opposite.&lt;/p&gt;

&lt;p&gt;Three properties make it architecture.&lt;/p&gt;

&lt;p&gt;First, processes are isolated. An Erlang process is not a thread, and not a coroutine; it has its own heap, its own message queue, and shares nothing mutable with any other process. When one crashes, it cannot corrupt the state of another, because there is no shared state to corrupt. A crash takes the crashing process with it and nothing else.&lt;/p&gt;

&lt;p&gt;Second, every worker has a supervisor. The supervisor is not a vague concept; it is a specific process, with a specific role, defined in OTP, the standard Erlang library. When a worker crashes, the crash is delivered to its supervisor as a message. The supervisor decides what to do.&lt;/p&gt;

&lt;p&gt;Third, the supervisor decides according to a written strategy. The strategies have names: one-for-one (restart only the crashed worker), one-for-all (restart the crashed worker and all of its siblings), rest-for-one (restart the crashed worker and every sibling started after it). Every supervisor has a maximum restart frequency, and when the frequency is exceeded, the supervisor itself crashes, which delivers the failure to its supervisor, one level up. A failure escalates up a tree, not a runbook. The rule that handles it was written years before the outage.&lt;/p&gt;
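
&lt;p&gt;The shape of that rule can be sketched outside Erlang. What follows is an illustration in shell, not OTP: &lt;code&gt;supervise&lt;/code&gt; and &lt;code&gt;flaky_worker&lt;/code&gt; are hypothetical names, the worker runs in the foreground, and the limit counts restarts in total, whereas an OTP supervisor counts them inside a time window. It shows one worker under a one-for-one rule, with a non-zero return standing in for the supervisor's own crash.&lt;/p&gt;

```shell
#!/bin/sh
# Sketch of a one-for-one supervisor with a restart limit.
# Simplification: OTP counts restarts inside a time window; this counts in total.
supervise() {
    max_restarts="$1"
    shift
    restarts=0
    while :; do
        if "$@"; then
            return 0        # the worker finished normally
        fi
        if [ "$restarts" -ge "$max_restarts" ]; then
            echo "supervisor: restart limit reached, escalating"
            return 1        # the failure is handed one level up
        fi
        restarts=$((restarts + 1))
        echo "supervisor: worker crashed, restart $restarts of $max_restarts"
    done
}

# Hypothetical worker: fails until it has been started three times.
starts=0
flaky_worker() {
    starts=$((starts + 1))
    [ "$starts" -ge 3 ]
}

supervise 5 flaky_worker
```

&lt;p&gt;The point of the sketch is the second branch. The decision to stop restarting, and to hand the failure upwards, is written before anything has failed.&lt;/p&gt;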

&lt;p&gt;Let it crash, in its proper form, is not "have you tried turning it off and on again." It is "we have already decided what to do when this fails, and we wrote it down." The restart is the same gesture. The contract underneath is wholly different.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cost
&lt;/h2&gt;

&lt;p&gt;What we kept of let-it-crash is the let-it-crash. What we left in Stockholm is the supervisor.&lt;/p&gt;

&lt;p&gt;The first cost lands daily. Restart is the diagnosis. The pod comes back, the alert clears, the day is shipped. The cause of the alert is not investigated, because nothing in the response made room for investigation: the runbook said restart, the restart worked, the page closed. A memory leak, a file-descriptor exhaustion, a lock contention, a queue backing up because a downstream service is throttling, each leaves the same heartbeat-recovery signature on a dashboard, and each needs a different fix. The restart erases the question that distinguishes them. The bug remains exactly as resident in the code as it was before the pager went off, with the small refinement that the team is now slightly more trained to ignore it.&lt;/p&gt;

&lt;p&gt;The second cost is structural. We have built whole platforms on the assumption. Kubernetes liveness and readiness probes are, in the honest reading, a contract that the orchestrator will rotate the symptoms while the cause goes unexamined. A container that fails its liveness probe is killed and restarted. There is no concept, in the standard Kubernetes flow, of capturing the dying process's state, of preserving the crash for later inspection, of asking why before the next pod is scheduled. "Self-healing" is the marketing term for this, and it is accurate in the sense that a person who takes paracetamol every four hours has a self-healing headache. The symptom keeps disappearing. The cause has not been touched.&lt;/p&gt;

&lt;p&gt;The third cost is institutional. A team that restarts to fix gets very good at restarting and never gets good at diagnosing. The post-incident review produces a runbook with an additional command. The runbook is consulted next time the pager goes off; the additional command is added; the team's collective intuition about the system shifts from "what is this system actually doing" to "what sequence of recovery steps clears the current alert". In the worst case, the only conjecture anybody had about why the alert ever fired leaves quietly with the last engineer who maintained the service, and the new on-call rotation inherits the runbook but not the model. A few months later the system is misbehaving in a new way that the old runbook does not cover, and nobody is in a position to ask why.&lt;/p&gt;

&lt;p&gt;The fourth cost is the one this series exists to point at: we have stopped expecting our systems to be debuggable. The restart was a shortcut, originally; we took it because diagnosing the live system was hard, and the restart was cheap, and the bug was small. We then built more software on top of that shortcut, and more on top of that, until "you cannot reasonably diagnose this in production" stopped being an embarrassment and started being a feature description. Container orchestration is, among other things, a way to ship software that nobody knows in detail and to rotate it fast enough that no one has to.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Question
&lt;/h2&gt;

&lt;p&gt;There is software in operation that does not work this way, and the alternative is older and quieter than the current default.&lt;/p&gt;

&lt;p&gt;WhatsApp serves north of a billion users with around fifty engineers. Its backend is Erlang. The supervisor model from Stockholm runs the company's production. Crashes happen. They are caught by their supervisors. The strategies are written. The escalation tree handles the rest. The engineers do not spend their day power-cycling boxes; they spend it writing the rules under which the boxes manage themselves. It is a small, deliberate team operating a system that, by every comparable measure, ought to require many times its size to keep running. The supervisor architecture is why.&lt;/p&gt;

&lt;p&gt;In the unixoid tradition, the FreeBSD base provides the operator's half of the same picture. &lt;code&gt;init&lt;/code&gt; and &lt;code&gt;rc.d&lt;/code&gt; use the same model that Stockholm did: explicit start, explicit dependency, explicit recovery. A service has a script that says how it starts, what must be up before it starts, and what to do when it dies. When a service on a FreeBSD machine misbehaves, the operator has &lt;code&gt;dtrace&lt;/code&gt; to follow what the kernel and user-space code are actually doing, &lt;code&gt;ktrace&lt;/code&gt; to record system calls for later inspection, &lt;code&gt;procstat&lt;/code&gt; and &lt;code&gt;fstat&lt;/code&gt; to read what a process is holding, post-mortem core dumps that survive the crash and can be examined at leisure, and a kernel that will, with some precision, tell you what process held what lock at what time. The reboot is available, on FreeBSD as everywhere else. It is rarely the first reach, because the system is willing to speak, and the operator has been trained to listen.&lt;/p&gt;
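
&lt;p&gt;A small sketch of the listening step, under stated assumptions: before a restart, write down what the process is holding. &lt;code&gt;snapshot_open_files&lt;/code&gt; is a hypothetical helper and the output path is arbitrary. It uses &lt;code&gt;procstat -f&lt;/code&gt; where that exists, falls back to &lt;code&gt;lsof&lt;/code&gt;, and reads &lt;code&gt;/proc&lt;/code&gt; directly as a last resort.&lt;/p&gt;

```shell
#!/bin/sh
# Sketch: record what a process holds before restarting it.
# snapshot_open_files is a hypothetical helper; the output path is an assumption.
snapshot_open_files() {
    pid="$1"
    out="$2"
    if command -v procstat >/dev/null 2>/dev/null; then
        procstat -f "$pid" >"$out"        # FreeBSD: per-process descriptor table
    elif command -v lsof >/dev/null 2>/dev/null; then
        lsof -n -P -p "$pid" >"$out"      # same question, without name lookups
    else
        ls -l "/proc/$pid/fd" >"$out"     # last resort: read /proc directly
    fi
}

# Usage: snapshot first, restart second. Here the shell inspects itself.
snapshot_open_files "$$" "${TMPDIR:-/tmp}/open-files.$$.txt"
```

&lt;p&gt;It is not a diagnosis. It is the evidence a diagnosis would need, kept from the one moment at which it is still available.&lt;/p&gt;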

&lt;p&gt;So the honest question is not whether to keep the restart. The runbooks have it for a reason and they are not foolish. The restart, in a supervisor architecture, is a perfectly normal recovery step. The question is the one we did not write down: in a system that fails, was the restart the answer, or the moment the question got dropped?&lt;/p&gt;

&lt;p&gt;A restart, on second thought, is not a tool. It is a measurement. It tells you, with some precision, how much of the cause you decided you could afford to leave unknown.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://vivianvoss.net/blog/why-we-restart-to-fix-it" rel="noopener noreferrer"&gt;Read the full article on vivianvoss.net →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;By &lt;a href="https://vivianvoss.net" rel="noopener noreferrer"&gt;Vivian Voss&lt;/a&gt; — System Architect &amp;amp; Software Developer. Follow me on &lt;a href="https://www.linkedin.com/in/vvoss/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; for daily technical writing.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>reliability</category>
      <category>erlang</category>
      <category>freebsd</category>
      <category>architecture</category>
    </item>
    <item>
      <title>The OS That Brought Its Own AI: How Gemini Moved Into Android's System Slot</title>
      <dc:creator>Vivian Voss</dc:creator>
      <pubDate>Sat, 30 May 2026 10:08:04 +0000</pubDate>
      <link>https://dev.to/vivian-voss/the-os-that-brought-its-own-ai-how-gemini-moved-into-androids-system-slot-1hgh</link>
      <guid>https://dev.to/vivian-voss/the-os-that-brought-its-own-ai-how-gemini-moved-into-androids-system-slot-1hgh</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp6dl88lny4x5wlhqmesh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp6dl88lny4x5wlhqmesh.png" alt="A tech conference architecture slide on a dark navy board, comparing how the same Gemini model is integrated into Android and iOS 27 side by side. The Android panel shows the gold sparkle (Gemini) inside an ASSISTANT SLOT in the OS layer, with a red INHERITS block listing six system permissions (RECORD_AUDIO, READ_SMS, READ_CONTACTS, READ_CALL_LOG, POST_NOTIFICATIONS, ACCESSIBILITY) and a red FULL SYSTEM ACCESS banner at the bottom. The iOS 27 panel shows the same sparkle inside a red-bordered AI EXTENSION in the Apps layer, labelled on-demand and XPC-mediated, with a green ENTITLEMENTS block (per-invocation prompt only, no background mic, no SMS/contacts/calls) and a green USER-MEDIATED ACCESS banner; an empty ASSISTANT ROLE and an APPLE PRIVACY box with a closed green padlock sit in the iOS layer. A young developer stands beside the board in her t-shirt and pink cat-ear headset, pointer stick raised toward the diagram, in the posture of a security architect explaining the asymmetry to a technical audience." width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Not in the Brief, Episode 05&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Your phone changed assistants. You did not. Sometime between Android 14 and 16, Google Assistant retired and Gemini moved in, into the same slot, with the same long-press, answering to the same "Hey Google". This is the fifth episode of &lt;em&gt;Not in the Brief&lt;/em&gt;, a series on the documented things software does that the user did not ask for. We have covered a browser that brought a language model, a vault that stays open in memory, a screenshot index nobody requested, and a platform setting flipped on by default. This week the change is one layer deeper: it is not a feature inside an app, it is the assistant the operating system answers to.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Feature
&lt;/h2&gt;

&lt;p&gt;Gemini is now Android's default assistant on Pixel 10 and is rolling out across the eligible Android estate. The hardware threshold, set by Google, is Android 10 or later and at least 2 GB of system RAM. The replacement is system-level: Gemini does not sit beside the existing assistant as a competing app. It sits in the assistant slot the OS exposes to all apps. The standalone Google Assistant, having been available since 2016, retires on or around 31 March 2026. After that date, the choices presented by Android in the digital-assistant slot are Gemini or None. The third option that existed for nearly a decade is going away on a published schedule.&lt;/p&gt;

&lt;p&gt;Pixel 10 (announced 20 August 2025, shipping from 28 August 2025) shipped with Gemini already in the slot. On the rest of the eligible estate, Gemini arrived as a system update over the course of 2025, in some cases as a notification offering to "upgrade" the existing assistant (framed, in that wording, as the outdated version of itself). Devices below the threshold keep classic Google Assistant by default, not by choice; they simply lack the resources to run the new model.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Introduction
&lt;/h2&gt;

&lt;p&gt;The transition was confirmed as a roadmap step in late 2025, after an earlier 2024 target was missed. Google announced in December 2025 that the full sunset of Google Assistant on mobile would land in March 2026, with the work continuing across Wear OS, Android Auto, Google TV and Google Home. The rollout is not a single launch event; it is a slow displacement, device-class by device-class, with a hard deadline at the end.&lt;/p&gt;

&lt;p&gt;For users this means three different experiences. On a Pixel 10 the assistant has been Gemini from day one. On most other eligible Android phones it has been switched over by system update, with or without a clear notification. On devices below the eligibility line, classic Assistant continues to work until the standalone service retires, after which those devices will have no current Google assistant at all unless they are replaced.&lt;/p&gt;

&lt;p&gt;The brief signed when a phone was bought said "smart assistant". It did not say which one, and it did not promise that the answer would not change.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Mechanics
&lt;/h2&gt;

&lt;p&gt;A digital assistant on Android is not a special application; it is a role. The operating system exposes a hook (the "assistant role"), and exactly one application at a time is the holder. When a user long-presses the home button, says the wake word, or when a third-party app calls the assistant intent, the OS routes the call to the holder.&lt;/p&gt;

&lt;p&gt;Until 2024, the holder was Google Assistant. As of the rollout described above, on every eligible device, the holder is Gemini. The same long-press, the same "Hey Google", the same downstream apps; a different program at the other end.&lt;/p&gt;

&lt;p&gt;On Pixel-class hardware, Gemini Nano runs on the device for part of the work. Gemini Nano is a small foundation model designed to run locally, and it powers a number of system features the user may not know are powered by a model at all (more on those below). For anything heavier, the call leaves the phone and goes to Google's cloud, where a larger model decides what to do with what was said. The line between "handled locally" and "sent to the cloud" is set by Google; the user interface does not display it.&lt;/p&gt;

&lt;h3&gt;
  
  
  The System-App Integration
&lt;/h3&gt;

&lt;p&gt;The point worth stating, because the assistant slot is only the visible edge of the change, is that Gemini Nano is woven into the Pixel system apps, not just behind the assistant role.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Magic Compose&lt;/strong&gt; in Google Messages drafts suggested replies based on recent conversation history, using Gemini Nano on-device. It generates style variants ("Formal", "Excited", "Chill") and works without an internet connection.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Call Notes&lt;/strong&gt;, a Pixel-exclusive feature, records phone calls and uses Gemini Nano to produce a written summary of the conversation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pixel Recorder&lt;/strong&gt; uses Gemini Nano to produce automatic summaries of audio recordings, including the three-bullet summaries for recordings longer than 30 minutes that appeared in 2024 and 2025.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scam Detection&lt;/strong&gt;, launched in 2025, uses Gemini Nano to analyse phone calls in real time and warn the user when the patterns of a fraud script appear.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of these is, in isolation, a legitimate feature. The cumulative point is that the language model is now a Pixel platform service, with system apps for phone calls, messages and voice recording wired into it. The slot is not the only place the model lives. It is just the most obvious one.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Architectural Contrast: iOS 27
&lt;/h3&gt;

&lt;p&gt;The same vendor's model lands very differently on the other major mobile platform, and the comparison is the privacy story.&lt;/p&gt;

&lt;p&gt;iOS 27, due in autumn 2026, takes the opposite architectural path. Gemini ships on iOS as a sandboxed extension at the application layer, in a new "Extensions" framework Apple introduced after a public partnership with Google in early 2026. Three properties matter. The extension is opt-in: it does not appear in any assistant role unless the user enables it. It is isolated from system hooks: it cannot become the slot, because iOS does not expose the slot to extensions. It is invoked only when the user explicitly calls it; the wake word, the home gesture and the system microphone all remain Apple's.&lt;/p&gt;

&lt;p&gt;The privacy implication is structural rather than rhetorical. A sandboxed extension can be revoked by the user, audited by the platform, and constrained by permissions on a fine-grained basis. A model installed into the slot inherits the slot: the microphone access, the default-app reach, the assistant intent routed to it by every app on the system.&lt;/p&gt;

&lt;p&gt;The architecture follows the business, and the business is visible in the published filings. Apple's FY2025 revenue was approximately 74 per cent hardware and 26 per cent services; the device is the product, the data on the device is a property to be protected because it is the customer's. Alphabet's 2025 revenue was approximately 74 to 76 per cent advertising (Google Search and YouTube combined), with cloud as the third major segment; the advertising placement is the product, and the data the placement is targeted against is the input. These are not motives or speculations. They are line items in the annual 10-K filings. The implementations of the same model on the two platforms sit at the two ends of that spectrum, and they sit there for reasons their respective shareholders can read in the same documents.&lt;/p&gt;

&lt;p&gt;This is the OS-level privacy point this episode is about. It is not whether either platform's model is "safe". It is that the same model has two structurally different exposure surfaces on the two platforms, and the difference is not random.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Risk
&lt;/h2&gt;

&lt;p&gt;A new assistant in the slot is not a breach and it is not a leak. The microphone access was already there; the assistant role was already there; the user agreed, at some abstract point in the past, that their phone would have an assistant. What changed is who is at the other end of all of that.&lt;/p&gt;

&lt;p&gt;The risk worth naming, in plain terms, is exposure asymmetry. Voice input that used to be answered by a search agent is now answered by a language model with different defaults about what it stores, what it summarises and what it forwards to third-party tools (extensions, the Gemini app on iOS, integrated services on Android). A reasonable user, having bought a phone, would expect to be asked before a different program answers their microphone. The brief said "smart assistant". The execution says "model".&lt;/p&gt;

&lt;p&gt;That distinction is not pedantry. A search agent's job is to route a query to a known endpoint. A language model's job is to interpret the query, decide what data is relevant, decide which extensions or external tools to call, and produce a coherent answer. The decisions are made on the user's behalf, with privacy defaults set by the vendor, on data the user did not explicitly classify. None of this is in itself wrongdoing. All of it is in scope for the question "what is this thing doing with what I said".&lt;/p&gt;

&lt;p&gt;The duty of awareness that runs through this series applies in full: the user can see where this stands on their own phone, and decide.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to See It
&lt;/h2&gt;

&lt;p&gt;Three steps, under a minute, on a current Android device:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Settings → Apps → Default apps → Digital assistant app.&lt;/strong&gt; This is the OS role. The options listed there are the assistants the OS will route the long-press, the wake word and the assistant intent to. If classic Google Assistant is still installed, "Digital assistant from Google" appears; if it is not, the choice is Gemini or None.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;App management → Gemini → Disable + Force Stop.&lt;/strong&gt; Gemini ships as a system app on most devices, which means it cannot be uninstalled in the ordinary sense; it can, however, be disabled, and Force Stop ends any background work it was doing. If the assistant role is set to None and Gemini is disabled, the phone has no active assistant.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Samsung path:&lt;/strong&gt; Settings → Apps → Choose default apps → Assistant app. The same role, under Samsung's menu layout.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For the Pixel-only inline features (Magic Compose, Call Notes, Recorder Summaries, Scam Detection) the relevant settings are per-app: open Settings → Apps → choose Messages, Phone or Recorder, and review the AI-feature toggles and Permissions for each.&lt;/p&gt;

&lt;p&gt;Trade-off, stated plainly: if the role is None and Gemini is disabled, the long-press of the home button does nothing and "Hey Google" does nothing. The phone otherwise works normally. The inline system-app features remain available unless explicitly turned off in their respective apps.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Note on Architecture and Business
&lt;/h2&gt;

&lt;p&gt;A short note for the reader who reads filings.&lt;/p&gt;

&lt;p&gt;The two implementations of Gemini described above sit at the two ends of a published spectrum. On Apple's side, FY2025 revenue was approximately 74 per cent hardware and 26 per cent services; the device is the customer-facing product, and the privacy of the data on it is a marketing surface, a regulatory consideration and a business asset. On Google's side, 2025 revenue was approximately three quarters advertising; the placement of advertising is the customer-facing product, and the data the placement is targeted against is operational input.&lt;/p&gt;

&lt;p&gt;These observations are not accusations. They are descriptions of where each company makes its money, as reported to their respective regulators. The architectural choice of where to install a foundation model, in the slot or beside it, in the operating system or on top of it, is the kind of choice that follows from those numbers, not the kind of choice that contradicts them. That is the only point this section makes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Coda
&lt;/h2&gt;

&lt;p&gt;This episode is about an OS-level change of who is at the other end of the assistant role, made silently on a published schedule, on a population of devices the size of a continent. None of it is a breach. None of it is hidden. All of it can be checked in three taps and decided on the user's own terms.&lt;/p&gt;

&lt;p&gt;The OS used to ask before changing your browser. It did not ask before changing the model behind your microphone. The looking is not difficult. It just has to start.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://vivianvoss.net/blog/the-os-that-brought-its-own-ai" rel="noopener noreferrer"&gt;Read the full article on vivianvoss.net →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;By &lt;a href="https://vivianvoss.net" rel="noopener noreferrer"&gt;Vivian Voss&lt;/a&gt;, System Architect and Software Developer. Follow me on &lt;a href="https://www.linkedin.com/in/vvoss/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; for daily technical writing.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>android</category>
      <category>gemini</category>
      <category>privacy</category>
      <category>awareness</category>
    </item>
    <item>
      <title>The Licence You Did Not Count: How Oracle Java Went From Per-User to Per-Employee</title>
      <dc:creator>Vivian Voss</dc:creator>
      <pubDate>Fri, 29 May 2026 06:12:55 +0000</pubDate>
      <link>https://dev.to/vivian-voss/the-licence-you-did-not-count-how-oracle-java-went-from-per-user-to-per-employee-56ao</link>
      <guid>https://dev.to/vivian-voss/the-licence-you-did-not-count-how-oracle-java-went-from-per-user-to-per-employee-56ao</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fesc9jtkem2kz5ze3pqgz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fesc9jtkem2kz5ze3pqgz.png" alt="A modern, bright open-plan office at midday: glass partition walls, wooden panelling, plenty of daylight. On the left a few colleagues work at their desks; the right half of the room stands empty, with vacant chairs and dark, switched-off monitors. A small glowing euro symbol floats above every desk, including all the empty ones on the right. In the foreground Claudine, in a " width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;In the Net, Episode 05&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;You did not change how you use Java. In January 2023, Oracle changed how it counts you. The unit of measurement is no longer how many people run Java, nor how many processors it runs on. It is how many people you employ. This is the fifth episode of &lt;em&gt;In the Net&lt;/em&gt;, a series on the documented mechanics of vendor lock-in. The premise has not changed. Every platform tells you how to come in; the architecture, and increasingly the contract, tells you whether you can leave, and on whose terms.&lt;/p&gt;

&lt;p&gt;The previous four episodes found the lock-in built into a product, a platform, a cloud and an acquisition. This one is built into a definition. The Java runtime did not get worse, did not change hands, and did not move into the cloud. A single word in a price list did the work: "employee".&lt;/p&gt;

&lt;h2&gt;
  
  
  The Promise
&lt;/h2&gt;

&lt;p&gt;Java arrived in 1995 with a promise that turned out to be true: write once, run anywhere. Compile to bytecode once, and the same program runs on any machine with a Java Virtual Machine, regardless of operating system or chip. For roughly three decades, that promise made Java the language the enterprise quietly settled on: banking back-ends, logistics systems, payment processors, government services, and the vast layer of middleware that nobody markets and nobody thinks about until it stops.&lt;/p&gt;

&lt;p&gt;For most of that history the runtime was, in practice, simply there. You downloaded a Java Development Kit, you ran your software, and the question of a licence rarely surfaced. Sun Microsystems, Java's creator, open-sourced the platform as OpenJDK in 2006 and 2007. Oracle acquired Sun in 2010 and inherited both the open-source project and the commercial brand. The promise was real, and it was kept for a very long time. None of what follows is an argument that Java was a trap. It is an account of what happened to the licence under it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Hooks
&lt;/h2&gt;

&lt;p&gt;The licensing history is worth stating plainly, because the lock-in is in the sequence. With Java 11 in 2018, and for Java 8 updates from April 2019, Oracle moved its own branded JDK builds to a licence that required a paid subscription for commercial production use. With Java 17 in 2021, Oracle softened this with the No-Fee Terms and Conditions licence, free again for many uses. Then, on 23 January 2023, Oracle introduced the Java SE Universal Subscription and retired the older per-user (Named User Plus) and per-processor metrics for new subscriptions.&lt;/p&gt;

&lt;p&gt;The new metric is per employee. Three properties make it consequential.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"Employee" does not mean Java user.&lt;/strong&gt; Oracle's definition counts the entire organisation: full-time, part-time and temporary staff, plus contractors, consultants and agents who support internal operations. It is not the number of people who write Java, run Java, or have ever heard of Java. It is the headcount.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One install makes the whole headcount billable.&lt;/strong&gt; Because the subscription is organisation-wide, a single server legitimately running Oracle's Java is, under this metric, enough to make every employee a billable unit. There is no "we only use it here" tier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The price scales by headcount, not by use.&lt;/strong&gt; Oracle's published price list starts at around 15 US dollars per employee per month for the smallest band and steps down by volume to a few dollars at the largest. A small firm pays a small per-head figure across a small headcount; a large firm pays a smaller per-head figure across a very large one. Either way, the bill tracks the size of the company, not the size of its Java estate.&lt;/p&gt;

&lt;p&gt;The runtime did not change. The unit of measurement did, and the unit is now the one thing guaranteed to be large.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Standing
&lt;/h2&gt;

&lt;p&gt;Two dimensions beyond price matter here: market position and how the customer is treated.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Market position.&lt;/strong&gt; Java remains one of the most widely deployed runtimes in enterprise computing, and Oracle holds three things that matter: the brand, the official commercial builds, and the compatibility test suite (the TCK) that certifies a build as genuinely Java. The language is open. The trademark, the commercial binary, and the certification are Oracle's. That is enough standing to make a licence change a market event rather than a product note.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How the customer is treated.&lt;/strong&gt; The standing is now enforced by audit, and the audit volume is the part of this story that has moved most. In a survey conducted by Dimensional Research and commissioned by Azul, a commercial OpenJDK vendor and therefore an interested party, 73 per cent of Oracle Java users reported having been audited in the previous three years, and 81 per cent reported they had moved, were moving, or planned to move at least some Java to an open-source alternative. The same body of survey work reported concern about Java pricing rising year on year. Gartner, separately, estimated that the per-employee model could cost two to five times the previous model for the same software, and has been widely cited as predicting that a large share of Java-using organisations would face an Oracle audit approach.&lt;/p&gt;

&lt;p&gt;Read those numbers with the commissioning interest in mind: a competitor paid for the survey, and survey populations skew toward the aggrieved. Even discounted, the direction is unambiguous, and it is corroborated by independent trade reporting: audit activity around Java rose sharply after January 2023. When the audit becomes the most reliable customer touch-point, it has stopped being a risk of the relationship and become the shape of it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Exit That Isn't
&lt;/h2&gt;

&lt;p&gt;Here the exit is, unusually, genuinely open, and the catch is somewhere else.&lt;/p&gt;

&lt;p&gt;Oracle will tell you that you can simply stop using Oracle's JDK and switch to OpenJDK, and this is entirely true. OpenJDK is not a clone or a reimplementation; it is the official reference implementation of Java SE, and Oracle's own branded JDK is built from it. From Java 11 onward, the Oracle build and a current OpenJDK build are, for almost all purposes, the same binary with different labels and support terms. Switching is a reinstall, not a rewrite. No code changes, no recompilation in the normal case.&lt;/p&gt;

&lt;p&gt;The catch is the past. An Oracle audit does not only ask what you run today; it reads download history and installation records. A single Oracle JDK left on a forgotten server, or downloaded under the post-2019 commercial terms at some point in the last several years, can become the basis of a retroactive claim, and because the metric is per employee, the claim is not scoped to that one machine. It is scoped to the workforce. The exit is open for the future. The bill can still arrive for the years behind you, which is why the migration and the audit-exposure review are two separate pieces of work, and the second one is the one to take advice on.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Price
&lt;/h2&gt;

&lt;p&gt;The arithmetic is the argument, so here is the arithmetic. A firm of 12,000 staff, priced for illustration at the entry-band rate of roughly 15 euros per employee per month, pays 12,000 times 15 euros, or 180,000 euros, a month: a little over 2 million euros a year. Volume bands lower the per-head rate for a firm of that size, so read the figure as an upper bound; what does not change is the multiplier. The bill is the same whether twelve employees use Java or twelve hundred, because the metric never asked. Reported real-world increases on moving from the old per-processor or Named-User-Plus models to per-employee commonly run from three times to ten times, with some estates citing far more; Gartner's two-to-five-times estimate is the conservative end.&lt;/p&gt;
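
&lt;p&gt;For readers who want to vary the inputs, the calculation above can be sketched in a few lines of Python. This is an illustration only: the function names are invented here, and the headcount, rate and user counts are the figures from the paragraph above, not a quote from any price list.&lt;/p&gt;

```python
# Illustrative figures taken from the text; not a quote from any price list.
EMPLOYEES = 12_000
RATE_PER_EMPLOYEE_PER_MONTH = 15  # euros, the entry-band figure used in the text


def annual_cost(employees, monthly_rate):
    """Annual subscription cost under a per-employee metric."""
    return employees * monthly_rate * 12


def cost_per_actual_user(employees, monthly_rate, java_users):
    """The same annual bill, divided by the people who actually use Java."""
    return annual_cost(employees, monthly_rate) / java_users


total = annual_cost(EMPLOYEES, RATE_PER_EMPLOYEE_PER_MONTH)
print(f"Annual bill: {total:,} euros")

# The bill is identical in both cases; only the cost per real user moves.
for users in (12, 1_200):
    per_user = cost_per_actual_user(EMPLOYEES, RATE_PER_EMPLOYEE_PER_MONTH, users)
    print(f"{users} Java users: {per_user:,.0f} euros per actual user per year")
```

&lt;p&gt;The number of Java users never enters &lt;code&gt;annual_cost&lt;/code&gt;. It appears only in the second function, which is a figure the customer might compute and the metric does not.&lt;/p&gt;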

&lt;p&gt;The price of leaving is a migration, and unlike the AWS identity case in an earlier episode, it is bounded. Independent reports put OpenJDK migrations at large enterprises in the range of nine to fourteen months, most of which is testing and operational re-tooling rather than code change. It is real work. It is also one-time work, weighed against a recurring charge sized by a number, your headcount, that you grow on purpose. That asymmetry, a bounded exit against an unbounded stay, is the calculation that changes the moment the per-employee quote lands.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Escape Route
&lt;/h2&gt;

&lt;p&gt;The escape from this particular lock-in is one of the cleanest in the series, precisely because the format was never proprietary.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Switch to a free OpenJDK distribution.&lt;/strong&gt; Eclipse Temurin (the vendor-neutral build from the Adoptium project), Amazon Corretto, Azul Zulu Community, Microsoft Build of OpenJDK, BellSoft Liberica, Red Hat's build and IBM Semeru are all production-grade, all certified against the Java compatibility tests, and all licensed under GPLv2 with the Classpath Exception, which permits unrestricted commercial use. They are not lesser Java. From Java 11 on they are built from the same source tree as Oracle's binary. The choice between them is about support contracts and release cadence, not capability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Inventory before you migrate.&lt;/strong&gt; The download history is the audit's first exhibit, so the first task is not installing Temurin; it is finding every Oracle JDK across the estate and recording what is there before Oracle asks. This is the step that protects against the retroactive claim, and it is the one most often skipped.&lt;/p&gt;
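
&lt;p&gt;A first pass at that inventory can be scripted. The sketch below is a starting point under stated assumptions, not an audit tool: it assumes each JDK home contains a &lt;code&gt;release&lt;/code&gt; file with a &lt;code&gt;JAVA_VERSION&lt;/code&gt; line and, on recent builds, an &lt;code&gt;IMPLEMENTOR&lt;/code&gt; line; older installs may lack the latter and are reported as unknown. The function names are invented for illustration, and the directories to scan are passed on the command line because the right search roots differ per estate.&lt;/p&gt;

```python
import os
import sys


def read_release(path):
    """Parse the KEY="value" lines of a JDK `release` file into a dict."""
    fields = {}
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            key, sep, value = line.strip().partition("=")
            if sep:
                fields[key] = value.strip('"')
    return fields


def find_jdks(roots):
    """Return (home, implementor, version) for each directory that looks like a JDK home."""
    found = []
    for root in roots:
        for home, _dirs, files in os.walk(root):
            if "release" in files:
                fields = read_release(os.path.join(home, "release"))
                if "JAVA_VERSION" in fields:
                    implementor = fields.get("IMPLEMENTOR", "unknown")
                    found.append((home, implementor, fields["JAVA_VERSION"]))
    return found


def needs_review(implementor):
    """Flag Oracle-built or unidentified runtimes for a manual licence review."""
    return implementor == "unknown" or "oracle" in implementor.lower()


if __name__ == "__main__":
    # Search roots come from the command line; none are assumed here.
    for home, implementor, version in find_jdks(sys.argv[1:]):
        flag = "REVIEW" if needs_review(implementor) else "ok"
        print(f"{flag}\t{version}\t{implementor}\t{home}")
```

&lt;p&gt;A flag is a prompt to look, not a verdict: Oracle also publishes GPL-licensed OpenJDK builds that may report the same implementor, and a scan of the file system says nothing about download history. The output is the list to take to whoever advises on the exposure.&lt;/p&gt;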

&lt;p&gt;&lt;strong&gt;Address the same pattern elsewhere in the Oracle estate.&lt;/strong&gt; The per-metric, audit-enforced approach is not unique to Java. Oracle Database licensing carries its own well-documented version: under Oracle's audit position, running Oracle Database on a VMware cluster can require licensing every physical core in the cluster the software could move to, not only the host it runs on, because Oracle does not recognise VMware as a valid partitioning boundary. The structural defence is the same one that answers Java: where a component is replaceable, keep it replaceable. PostgreSQL and MariaDB carry no per-core, whole-cluster audit exposure, and for a large class of workloads they are a complete answer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Coda
&lt;/h2&gt;

&lt;p&gt;The lesson generalises past Oracle and past Java. A licence is a promise about cost, and a per-something licence is only as safe as the stability of the "something". When the unit was the processor, you could plan, because you controlled the processors. When the unit becomes the employee, the bill is pegged to the number you are trying hardest to grow, and the one you least want to suppress to save on a runtime.&lt;/p&gt;

&lt;p&gt;Java still runs anywhere; that promise held. What changed is that the licence now counts everyone, whether or not they ever asked for Java, ever used it, or ever knew it was there. The defence is not loyalty to a vendor or hostility to one. It is keeping the replaceable layer replaceable, so that a change in someone else's price list is an inconvenience rather than a summons.&lt;/p&gt;

&lt;p&gt;Write once, run anywhere. Licence once, pay for everyone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://vivianvoss.net/blog/the-licence-you-did-not-count" rel="noopener noreferrer"&gt;Read the full article on vivianvoss.net →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;By &lt;a href="https://vivianvoss.net" rel="noopener noreferrer"&gt;Vivian Voss&lt;/a&gt;, System Architect and Software Developer. Follow me on &lt;a href="https://www.linkedin.com/in/vvoss/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; for daily technical writing.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>oracle</category>
      <category>java</category>
      <category>vendorlock</category>
      <category>openjdk</category>
    </item>
    <item>
      <title>Your Neighbour Is Now Root: The cPanel LiteSpeed Plugin Flaw, May 2026</title>
      <dc:creator>Vivian Voss</dc:creator>
      <pubDate>Thu, 28 May 2026 06:40:31 +0000</pubDate>
      <link>https://dev.to/vivian-voss/your-neighbour-is-now-root-the-cpanel-litespeed-plugin-flaw-may-2026-5gk9</link>
      <guid>https://dev.to/vivian-voss/your-neighbour-is-now-root-the-cpanel-litespeed-plugin-flaw-may-2026-5gk9</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F07ebojbkh62h0zxi3o3l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F07ebojbkh62h0zxi3o3l.png" alt="A dark cyberpunk security operations centre at night, lit by cyan and magenta neon and the glow of monitors, a rain-streaked window with a neon city skyline beyond. A young Developer sits at the desk as the operator, one hand on the keyboard, pink cat-ear headset on, in a t-shirt reading $whoami. The large central monitor displays a live network topology: many small site nodes ringed around one central host node, the host and most of the nodes glowing alarm-red with red lines blazing between them, only a few still cyan-blue, the whole network turning red as the compromise spreads from one account to the entire server." width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Wire Fire — Episode 03&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;On a shared hosting server you share one machine with hundreds of strangers. You have never met them. You did not choose them. This week it emerged that, for a large slice of the world's shared hosting, any one of them could quietly become root, the all-powerful administrator account, and on a shared server root over one account is root over all of them. This is the situation, what it means, and what to do about it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Breach
&lt;/h2&gt;

&lt;p&gt;CVE-2026-48172 is a flaw in the LiteSpeed cPanel plugin: the component that wires the LiteSpeed web server into cPanel and WHM, the control panel that runs a large share of the world's commercial shared hosting. The plugin is the user-end piece that hundreds of customer accounts can reach through the ordinary cPanel interface.&lt;/p&gt;

&lt;p&gt;The timeline is short and worth stating precisely. On 19 May 2026, cPanel pulled the plugin from servers through a nightly update, stating plainly that the vulnerability "allowed unauthorized root access to the server". On 21 May, LiteSpeed shipped a fix, version 2.4.5, after which a broader security review produced version 2.4.7, bundled in the LiteSpeed WHM plugin 5.3.1.0. On 26 May, CISA added CVE-2026-48172 to its Known Exploited Vulnerabilities catalogue: confirmation that it had been used as a zero-day, in the wild, before the patch existed.&lt;/p&gt;

&lt;p&gt;The flaw is rated at the top of the scale: 10.0 on the CVSS v4.0 system, 9.8 on the older v3.1. The number matters less than the sentence cPanel used: unauthorized root access. On a shared server, that is the worst sentence there is.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Scope
&lt;/h2&gt;

&lt;p&gt;Every version of the plugin from 2.3 through 2.4.4 is affected.&lt;/p&gt;

&lt;p&gt;The exposed machines are shared hosting servers, and the word "shared" is the whole story. On such a server, hundreds of customer accounts, each a separate website, a separate business, a separate owner, live side by side on one operating system, under one kernel, with one root account reserved for the host. They are kept apart not by a hard wall but by the convention that each account stays in its own lane.&lt;/p&gt;

&lt;p&gt;This is not a niche stack. Exposure analysis of the affected estate shows it spread across industry, retail and media, with nearly half of all observed exposure falling outside the top sectors entirely: the long tail of small businesses, agencies and personal sites that is precisely what shared hosting exists to serve. cPanel is not an obscure tool. It is the default way a great deal of the web is hosted.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Mechanism
&lt;/h2&gt;

&lt;p&gt;Here is the part that should change behaviour, in plain terms.&lt;/p&gt;

&lt;p&gt;On a shared server, every customer is an ordinary, unprivileged user. Root, the account that can do anything to the whole machine, belongs to the host alone. The entire security model rests on customers never being able to reach it.&lt;/p&gt;

&lt;p&gt;The plugin exposed a function named lsws.redisAble, reachable through cPanel's standard JSON API by any logged-in customer. The function had one fatal omission: it did not check who was calling. It ran with root's authority regardless of which unprivileged user invoked it. This is a textbook case of what the security taxonomy calls incorrect privilege assignment (CWE-266): code that runs with more power than its caller was entitled to.&lt;/p&gt;

&lt;p&gt;The consequence is direct. Any customer on the server, or any attacker who had phished or bought a single customer's password, could ask that function to run a script of their choosing. The script ran as root. From there, the attacker held the entire machine: every other customer's files, every database, every credential, every site. Not their own slice. The whole server.&lt;/p&gt;

&lt;p&gt;Note what was not required. No firewall breach. No kernel exploit. No physical access. Just a normal account, a normal API call, and a function that forgot to ask for identification at the door.&lt;/p&gt;
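
&lt;p&gt;The shape of that omission fits in a few lines. The sketch below is a toy model in Python, not the plugin's code: every name in it is hypothetical, and &lt;code&gt;privileged_action&lt;/code&gt; merely stands in for work that would run with root's authority. It shows the difference between a handler that never consults its caller and one that does.&lt;/p&gt;

```python
class Caller:
    """A logged-in account; is_root marks the host's administrator (hypothetical model)."""

    def __init__(self, name, is_root):
        self.name = name
        self.is_root = is_root


def privileged_action(script):
    """Stand-in for work that executes with root authority."""
    return f"ran {script} as root"


def handle_unchecked(caller, script):
    """The flawed shape: the caller's identity is accepted but never consulted."""
    return privileged_action(script)


def handle_checked(caller, script):
    """The repaired shape: refuse unless the caller is entitled to root."""
    if not caller.is_root:
        raise PermissionError(f"{caller.name} may not run privileged actions")
    return privileged_action(script)


tenant = Caller("tenant", False)
print(handle_unchecked(tenant, "payload.sh"))  # the unprivileged caller gets root's work done
try:
    handle_checked(tenant, "payload.sh")
except PermissionError as error:
    print(f"refused: {error}")
```

&lt;p&gt;The two handlers differ by one conditional. That is the unsettling part of CWE-266: the missing code is small, and nothing about the function's ordinary behaviour reveals its absence.&lt;/p&gt;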

&lt;h2&gt;
  
  
  The Exposure
&lt;/h2&gt;

&lt;p&gt;If you operate cPanel with the LiteSpeed plugin in any version from 2.3 to 2.4.4, treat this as urgent. Upgrade to version 2.4.7 (delivered with WHM plugin 5.3.1.0 or later), or remove the plugin entirely if you cannot patch immediately. Then assume you may already have been visited: search your logs for the tell-tale call with &lt;code&gt;grep -rE "cpanel_jsonapi_func=redisAble"&lt;/code&gt; (given no path, the recursive search covers the current directory, so run it from wherever your logs live), investigate any matches, audit recently created accounts, and restrict cPanel and WHM access to trusted source addresses where you can.&lt;/p&gt;
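
&lt;p&gt;Where the search needs to be repeated across many hosts or fed into other tooling, the same check can be written as a short script. This is a minimal sketch: the marker string is the one quoted above, the function names are invented for illustration, and which log files to scan is left to the operator and passed on the command line.&lt;/p&gt;

```python
import sys

MARKER = "cpanel_jsonapi_func=redisAble"  # the pattern quoted in the text


def matching_lines(lines, marker=MARKER):
    """Return (line_number, text) for every line that contains the marker."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if marker in line
    ]


def scan_file(path):
    """Scan one log file, tolerating stray bytes that are not valid UTF-8."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return matching_lines(handle)


if __name__ == "__main__":
    # Log paths are supplied by the operator; none are assumed here.
    for path in sys.argv[1:]:
        for number, text in scan_file(path):
            print(f"{path}:{number}: {text}")
```

&lt;p&gt;A match is a reason to investigate, not proof of compromise, and an empty result is only as good as the retention period of the logs that were searched.&lt;/p&gt;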

&lt;p&gt;If you are a hosting customer rather than an operator, here is the uncomfortable truth: you cannot patch this. The vulnerable code is not in your account; it is in the machine your account sits on. Your only lever is to ask your provider whether they have patched, and to treat the answer as a real procurement question rather than a formality.&lt;/p&gt;

&lt;p&gt;On FreeBSD, the boundary this incident is missing has a name: the jail. A jail is kernel-enforced isolation. A process inside a jail cannot see the host's processes, cannot touch files outside its own root, and cannot escalate to the host's root, because from inside the jail the host's root does not exist to be reached. A shared hosting panel emulates separation between tenants inside a single shared system, using ownership rules and plugin logic. A jail enforces it one level down, in the kernel, where a forgotten privilege check in a plugin cannot reach. The same idea underpins Linux containers and illumos zones; the point is not the brand but the level at which the wall is built.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Pattern
&lt;/h2&gt;

&lt;p&gt;Shared hosting sells a boundary it does not enforce, and that is the structural news under this week's headline.&lt;/p&gt;

&lt;p&gt;Hundreds of tenants share one kernel, one filesystem and one root. The separation between them lives in user permissions, in panel logic and in plugin code, which is to say it lives in software written by people, all of it one mistake away from failing. When that mistake lands on a privilege check, the blast radius is not the one account that was compromised. It is every account on the machine. The model works perfectly right up until any single component fumbles, and then it fails completely, for everyone at once.&lt;/p&gt;

&lt;p&gt;For a decision-maker, the translation is direct and it belongs in a procurement conversation, not a post-incident one. On shared hosting, your security boundary is not your own account and your own good practices. It is the weakest account on the same server, managed by the least careful neighbour you have never met. If the data you host would genuinely hurt you to lose or to leak, the question is not whether your password is strong. It is whether you are sharing a kernel with strangers. Budget for real isolation (a VPS, kernel-enforced containers, jails or zones), or accept, consciously, that your neighbour is part of your threat model.&lt;/p&gt;

&lt;p&gt;You never chose the strangers on your server. This week, one of them could choose to be you.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://vivianvoss.net/blog/your-neighbour-is-now-root" rel="noopener noreferrer"&gt;Read the full article on vivianvoss.net →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;By &lt;a href="https://vivianvoss.net" rel="noopener noreferrer"&gt;Vivian Voss&lt;/a&gt;, System Architect and Software Developer. Follow me on &lt;a href="https://www.linkedin.com/in/vvoss/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; for daily technical writing.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>cpanel</category>
      <category>hosting</category>
      <category>freebsd</category>
    </item>
    <item>
      <title>The Fire That Reached the Backups: The OVHcloud Strasbourg Data-Centre Fire, 2021</title>
      <dc:creator>Vivian Voss</dc:creator>
      <pubDate>Wed, 27 May 2026 06:39:25 +0000</pubDate>
      <link>https://dev.to/vivian-voss/the-fire-that-reached-the-backups-the-ovhcloud-strasbourg-data-centre-fire-2021-1m8f</link>
      <guid>https://dev.to/vivian-voss/the-fire-that-reached-the-backups-the-ovhcloud-strasbourg-data-centre-fire-2021-1m8f</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F83b6jd8fkwuz78l02k26.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F83b6jd8fkwuz78l02k26.png" alt="Night scene outside a data centre. A tall, multi-storey server building stands fully ablaze, flames driving upward through its core and breaking from the upper floors, thick black smoke rolling into a dark sky. Fire engines with blue lights wait on wet ground at the base. In the foreground, a young developer with a pink cat-ear headset stands small against the scale of it, watching the building burn." width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Tales from the Bare Metal — Episode 05&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In the early hours of 10 March 2021, a fire began in a power room in Strasbourg. By morning an entire data centre had been destroyed and a second badly damaged. Around 3.6 million websites went offline. For a great many of those customers the sites came back within days. For some, they never came back at all, because the only copy of their data had been in the building that burned. The data loss is not the lesson of this episode. The lesson is that a backup can be complete, valid, restorable, and still worthless, if it shares a failure domain with the thing it is backing up.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Incident
&lt;/h2&gt;

&lt;p&gt;Shortly before 01:00 on 10 March 2021, fire broke out in a power room at OVHcloud's Strasbourg site, known as SBG. The site comprised several buildings. SBG2 was destroyed entirely. SBG1 was badly damaged, several of its rooms lost. SBG3 and SBG4 did not burn, but were powered down while the site was made safe, its power infrastructure gone.&lt;/p&gt;

&lt;p&gt;The scale of the dependent estate became clear within hours. According to figures cited in the official investigation, roughly 3.6 million websites, corresponding to around 464,000 domain names, were unavailable at the height of the crisis, close to 18 per cent of the active IP addresses OVH had assigned over the preceding fortnight. Game servers, government sites, e-commerce shops and countless small businesses went dark together. OVHcloud's founder communicated openly and frequently through the days that followed, and the company moved quickly to rebuild and to ship replacement capacity. But for customers whose only copy of their data lived on the SBG site, no amount of openness brought the data back. It was gone.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Diagnosis
&lt;/h2&gt;

&lt;p&gt;The fire started in the power room. The French Bureau of Investigation and Analysis on Industrial Risks (BEA-RI) published its report in June 2022. The report records high humidity readings near one of the power inverters in the hour before the fire began, and discusses the inverters as a likely origin, but it deliberately stops short of asserting a single definitive cause. That hedge is worth respecting: the precise ignition is not known with certainty, and inventing one would be dishonest.&lt;/p&gt;

&lt;p&gt;What is understood, and what matters more for the lesson, is why a fire in one power room became the loss of a building. Three design facts compounded.&lt;/p&gt;

&lt;p&gt;First, the cooling. SBG2 was built in 2011 using a tower design with free cooling, sometimes called auto-ventilation: rather than mechanical chillers, the building let the waste heat of the servers rise and vent at the top, drawing cooler outside air in at the bottom. As an energy strategy this is genuinely elegant and genuinely efficient. As a fire behaviour, a tall shaft with a strong natural updraught is, in the words that have followed the incident, rather like a chimney. The same airflow that cooled the servers fed and lifted the fire.&lt;/p&gt;

&lt;p&gt;Second, the construction. The floors were wooden, rated to resist fire for about an hour. An hour is a long time at a desk and a short time against a fed fire in a ventilated tower.&lt;/p&gt;

&lt;p&gt;Third, suppression. OVH had chosen not to fit any of the five buildings on the Strasbourg site with an automatic fire-extinguishing system. There was detection, there was human response, and there was the fire brigade, but no gas or water system that triggers on its own in the room of origin in the first minutes, which are the minutes that decide whether a fire stays in one rack or takes a building.&lt;/p&gt;

&lt;p&gt;None of these, on its own, is the villain. Together they meant that an event in one power room had very little standing between it and the whole structure.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Context
&lt;/h2&gt;

&lt;p&gt;The hard part of this story is not OVH's building. It is the customers' assumption, because that assumption is nearly universal and it is the part that travels.&lt;/p&gt;

&lt;p&gt;The first condition is a mental model. We say "it is in the data centre" or "it is in the cloud" and we hear "it is safe". The phrase abstracts away the physical fact: a specific building, in a specific town, with specific walls and a specific power room. Almost nobody, choosing where their backup lives, pictures the building. The abstraction that makes cloud convenient is the same abstraction that hides the failure domain.&lt;/p&gt;

&lt;p&gt;The second condition is the shape of the tools. A hosting panel offers a backup option, often a cheap one, and the nearest and cheapest option is frequently storage in the same data centre, sometimes the same building. The interface presents "backup" as a feature you switch on, not as a question about geography. So customers switched it on, in good faith, and their primary and their backup came to sit inside one failure domain, chosen by default rather than by decision. The word "backup" did all the reassuring; the location did all the risk.&lt;/p&gt;

&lt;p&gt;The third condition is ownership. The location of a backup is rarely anyone's explicit, written requirement. It is a setting, a default, a checkbox during provisioning, and checkboxes have no owner. Restore-testing, the subject of this series' first episode, at least tends to land on someone's plate eventually. "Is our backup in a different failure domain from our primary?" is a question that, in a great many organisations, no one has ever been assigned to answer.&lt;/p&gt;

&lt;p&gt;And all of it was reasonable at the time it was decided. The free-cooling tower was a real efficiency innovation that saved real energy for a decade. The single-site backup was a real saving that worked perfectly every day the building did not burn. These were not careless choices. They were ordinary trade-offs whose hidden assumption, that the failure domains do not overlap, was simply never tested until a fire tested it for everyone at once.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Principle
&lt;/h2&gt;

&lt;p&gt;The rule that answers this is older than the cloud, and it is three numbers: 3-2-1. Keep three copies of your data, on at least two different kinds of media, with at least one of them off-site. The number that does the work here is the one. Off-site does not mean a different rack or a different room; it means a different failure domain, far enough away that a fire, a flood or a power surge at the primary cannot reach it.&lt;/p&gt;

&lt;p&gt;The previous episode of this series gave you the first commandment of backups: thou shalt not trust a backup thou hast not restored. This episode gives you the second, and they are not the same: thou shalt not keep that backup in the building thou art backing up. A backup you have diligently restore-tested every week is still not a backup if it burns in the same fire as the original. Restorability and separation are two independent axes, and you need both. GitLab, in episode one, had the separation and lacked the restorability. OVH's unluckier customers had neither guaranteed.&lt;/p&gt;

&lt;p&gt;In the unixoid tradition the mechanics are unglamorous and well-proven. On FreeBSD, take a ZFS snapshot and &lt;code&gt;zfs send&lt;/code&gt; it over SSH to a pool in another building, another region, or another provider entirely; a cron job and a receiving pool are the whole apparatus, and the stream is incremental after the first run. With restic or borg, back up to object storage in a different region, encrypted, deduplicated, with the repository somewhere the primary's misfortune cannot follow. The tooling is not the hard part and never was. The hard part is the decision to put the second copy somewhere the first copy's bad day cannot reach, and then to verify, with a restore, that it is really there.&lt;/p&gt;
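
&lt;p&gt;A minimal sketch of that apparatus, under stated assumptions: the dataset &lt;code&gt;tank/data&lt;/code&gt;, the host &lt;code&gt;backup@offsite.example.net&lt;/code&gt; and the receiving dataset &lt;code&gt;vault/data&lt;/code&gt; are hypothetical placeholders, and the helper &lt;code&gt;plan_send&lt;/code&gt; is an illustration, not part of ZFS. It prints the commands for one replication run rather than executing them, so the plan can be read before anything touches a pool.&lt;/p&gt;

```shell
#!/bin/sh
# Sketch only: dataset, host and pool names are hypothetical placeholders.
# plan_send prints the commands for one replication run instead of running
# them, so the plan can be reviewed first and piped to sh once it looks right.

plan_send() {
    dataset=$1   # local dataset, for example tank/data
    prev=$2      # last snapshot already on the remote, or "" for the first run
    new=$3       # name for the snapshot taken by this run
    remote=$4    # SSH destination in a different failure domain
    target=$5    # receiving dataset on the remote pool

    printf 'zfs snapshot %s@%s\n' "$dataset" "$new"
    if [ -n "$prev" ]; then
        # Incremental stream: only the blocks changed since the previous snapshot.
        printf 'zfs send -i %s@%s %s@%s | ssh %s zfs receive -u %s\n' \
            "$dataset" "$prev" "$dataset" "$new" "$remote" "$target"
    else
        # First run: the full stream.
        printf 'zfs send %s@%s | ssh %s zfs receive -u %s\n' \
            "$dataset" "$new" "$remote" "$target"
    fi
}

# Example plan for a nightly incremental run.
plan_send tank/data 2026-06-01 2026-06-02 backup@offsite.example.net vault/data
```

&lt;p&gt;Run from cron with the previous and current snapshot names filled in, the printed plan can be piped to &lt;code&gt;sh&lt;/code&gt;. The sketch deliberately stops short of the restore test, which remains a separate and necessary step.&lt;/p&gt;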

&lt;h2&gt;
  
  
  Where It Travels
&lt;/h2&gt;

&lt;p&gt;The OVH customers were not unusually careless. The same failure domain hides in nearly every modern stack, wearing the local vocabulary.&lt;/p&gt;

&lt;p&gt;On AWS, the trap is the word "zone". Multi-AZ feels like redundancy, and against a single server or rack failure it is. But an Availability Zone is a cluster of buildings in one metropolitan area, and a region is the unit of geographic separation. A database replicated across AZs survives a host fault and may not survive a regional event; the off-site copy is cross-region replication (S3 CRR, cross-region snapshots), and it is a separate, deliberate setting.&lt;/p&gt;

&lt;p&gt;On Azure, the distinction is the storage redundancy tier: locally-redundant storage (LRS) keeps the copies in one data centre, while geo-redundant storage (GRS) places a copy in a paired region hundreds of kilometres away. The cheaper default is the one that shares the postcode.&lt;/p&gt;

&lt;p&gt;On Google Cloud, multi-region buckets and cross-region backups serve the same role, and the same default-versus-decision distinction applies.&lt;/p&gt;

&lt;p&gt;In Kubernetes, the cluster is the failure domain people forget. Velero backups and etcd snapshots that live on the same cluster, or in object storage in the same region, are a second copy in one place. Ship them off-cluster and off-region.&lt;/p&gt;

&lt;p&gt;On-premises, the rule is at its most physical and most ignored. The backup NAS in the same server room as the production servers is not a backup; it is a second copy awaiting the same flood, the same power surge, the same fire. The unfashionable tape, written weekly and carried to a drawer across town or a safe-deposit box, has quietly saved more organisations than any cloud panel's backup toggle.&lt;/p&gt;

&lt;p&gt;The shape is identical everywhere: a copy that shares a failure domain with the original is redundancy in name only. It survives the failures that do not matter much and dies in the one that does.&lt;/p&gt;

&lt;h2&gt;
  
  
  Coda
&lt;/h2&gt;

&lt;p&gt;OVHcloud rebuilt, changed its designs, and the industry spent a fortnight reading think-pieces about fire suppression and free cooling. Both are worth reading. But the durable lesson of 10 March 2021 is not about cooling towers or wooden floors, which are OVH's to fix. It is about a sentence every team can check this afternoon without a single phone call to a vendor: where, physically, is our backup, and could the thing that kills our primary kill it too?&lt;/p&gt;

&lt;p&gt;Redundancy that shares a postcode is decoration. The fire does not read your architecture diagram. It reads the floor plan.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://vivianvoss.net/blog/the-fire-that-reached-the-backups" rel="noopener noreferrer"&gt;Read the full article on vivianvoss.net →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;By &lt;a href="https://vivianvoss.net" rel="noopener noreferrer"&gt;Vivian Voss&lt;/a&gt;, System Architect and Software Developer. Follow me on &lt;a href="https://www.linkedin.com/in/vvoss/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; for daily technical writing.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>postmortem</category>
      <category>backup</category>
      <category>devops</category>
      <category>reliability</category>
    </item>
    <item>
      <title>find: The Little Language Pretending to Be a Command</title>
      <dc:creator>Vivian Voss</dc:creator>
      <pubDate>Tue, 26 May 2026 06:20:36 +0000</pubDate>
      <link>https://dev.to/vivian-voss/find-the-little-language-pretending-to-be-a-command-3nij</link>
      <guid>https://dev.to/vivian-voss/find-the-little-language-pretending-to-be-a-command-3nij</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz2rm86nqq3b4a7qw5i7e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz2rm86nqq3b4a7qw5i7e.png" alt="A European-comic style illustration on a deep-blue background. A large luminous tree fills the frame, its branches hung with small folder and file icons as leaves: a filesystem rendered as a tree. A young developer with a pink cat-ear headset and an RTFM t-shirt, stands at the base holding a small glowing lantern; a thread of warm light runs from it up into the branches. Only some of the leaves are lit, glowing warmly, while the rest stay dim and cool. The image is the find command made visible: a query walks the directory tree, and only the matching files light up." width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Technical Beauty — Episode 37&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A disk is filling up. Somewhere under /var are thousands of stale log files, scattered across directories nobody remembers creating. The task is dreary and familiar: locate the old ones and clear them. One line does it, and reads almost like an instruction to a colleague:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;find /var/log &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s1"&gt;'*.log'&lt;/span&gt; &lt;span class="nt"&gt;-mtime&lt;/span&gt; +30 &lt;span class="nt"&gt;-delete&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No loop, no temporary file, no manual descent into each directory. The tool that reads that line as a single coherent thought has been doing so since 1979, and the way it reads is the whole point of this episode.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Grammar
&lt;/h2&gt;

&lt;p&gt;Most Unix tools take flags: a verb and a handful of switches that modify it. find is different. find takes an expression.&lt;/p&gt;

&lt;p&gt;The arguments after the starting path are not flags in the usual sense; they are terms in a small query language. There are primaries, which are tests or actions: &lt;code&gt;-name&lt;/code&gt; matches a glob, &lt;code&gt;-type f&lt;/code&gt; selects regular files, &lt;code&gt;-mtime +30&lt;/code&gt; means "modified more than thirty days ago", &lt;code&gt;-size +100M&lt;/code&gt; means larger than a hundred megabytes, &lt;code&gt;-newer ref&lt;/code&gt; means changed more recently than a reference file. There are operators that combine them: terms written next to each other are joined by an implicit logical AND, &lt;code&gt;-o&lt;/code&gt; is OR, &lt;code&gt;!&lt;/code&gt; is NOT, and parentheses group sub-expressions (escaped from the shell as &lt;code&gt;\( ... \)&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;find walks the directory tree, and for each file it evaluates the expression from left to right, stopping as soon as the outcome is decided. An action fires only if evaluation reaches it, which in a plain AND chain means every test before it has passed. That is the entire model. "Find every regular file under here, larger than a hundred megabytes, not owned by root, and print it" is one expression, evaluated once per file, composed from parts you already know.&lt;/p&gt;
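
&lt;p&gt;The operators are easiest to see on a throwaway tree. The sketch below builds one in a temporary directory (the file names are invented for the illustration) and runs one expression per operator, ending with the sentence quoted above written out as a find expression.&lt;/p&gt;

```shell
#!/bin/sh
# Throwaway tree so the expressions have something to match.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir "$demo/etc" "$demo/log"
touch "$demo/etc/app.conf" "$demo/etc/app.conf.bak" \
      "$demo/log/app.log" "$demo/log/old.log"

# Implicit AND: regular files whose name ends in .log.
find "$demo" -type f -name '*.log' | sort

# OR needs grouping, because the implicit AND binds tighter than -o.
find "$demo" -type f \( -name '*.conf' -o -name '*.log' \) | sort

# NOT: every regular file except the .log ones.
find "$demo" -type f ! -name '*.log' | sort

# The sentence from the text as one expression. It prints nothing here,
# because the demo tree holds no file larger than a hundred megabytes.
find "$demo" -type f -size +100M ! -user root -print
```

&lt;p&gt;Without the parentheses, the second command would parse as "regular files named &lt;code&gt;*.conf&lt;/code&gt;, or anything at all named &lt;code&gt;*.log&lt;/code&gt;", which is a different query.&lt;/p&gt;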

&lt;p&gt;This is the reduction the series exists to celebrate. find does not ship a flag for every conceivable query. It ships a grammar, and the grammar composes every query from a small vocabulary of primaries and three operators. The surface you must learn is tiny; the space of things you can express is enormous. That ratio, small vocabulary to large expressivity, is what elegance looks like on a command line.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Surface
&lt;/h2&gt;

&lt;p&gt;In practice, most of what anyone types is a handful of shapes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;find &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;-type&lt;/span&gt; f &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s1"&gt;'*.conf'&lt;/span&gt;
find /var/log &lt;span class="nt"&gt;-mtime&lt;/span&gt; +30 &lt;span class="nt"&gt;-delete&lt;/span&gt;
find &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;-size&lt;/span&gt; +100M &lt;span class="nt"&gt;-exec&lt;/span&gt; &lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-lh&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt; +
find &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;-type&lt;/span&gt; d &lt;span class="nt"&gt;-empty&lt;/span&gt; &lt;span class="nt"&gt;-delete&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;-exec&lt;/code&gt; action deserves a note, because it has two forms and the difference matters. &lt;code&gt;-exec cmd {} \;&lt;/code&gt; runs the command once per matched file, substituting the filename for &lt;code&gt;{}&lt;/code&gt;. &lt;code&gt;-exec cmd {} +&lt;/code&gt; gathers as many matches as the command line allows and runs the command as few times as possible, which is dramatically faster for large match sets. The plus form is the one to reach for by default; the semicolon form is for when the command genuinely takes one argument at a time.&lt;/p&gt;
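
&lt;p&gt;The difference between the two forms can be made visible by letting the executed command report how many arguments it received. This is a small sketch on a temporary directory with three invented files; the inline &lt;code&gt;sh -c&lt;/code&gt; script stands in for whatever command you would really run.&lt;/p&gt;

```shell
#!/bin/sh
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
touch "$demo/a" "$demo/b" "$demo/c"

# Semicolon form: one sh process per file, each seeing exactly one argument.
find "$demo" -type f -exec sh -c 'echo "run with $# argument(s)"' sh {} \;

# Plus form: the matches are batched, so a single sh process sees all three.
find "$demo" -type f -exec sh -c 'echo "run with $# argument(s)"' sh {} +
```

&lt;p&gt;The first command reports one argument three times; the second reports three arguments once. With a hundred thousand matches, that is the difference between a hundred thousand process launches and a handful.&lt;/p&gt;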

&lt;p&gt;For everything else, find composes with the rest of the toolbox through the pipe, and it does so safely:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;find &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;-type&lt;/span&gt; f &lt;span class="nt"&gt;-print0&lt;/span&gt; | xargs &lt;span class="nt"&gt;-0&lt;/span&gt; sha256
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;-print0&lt;/code&gt; terminates each filename with a null byte instead of a newline, and &lt;code&gt;xargs -0&lt;/code&gt; reads them the same way. This is not a nicety. Filenames on Unix may contain spaces, newlines, and almost any other byte, and the naive idioms (&lt;code&gt;for f in $(ls)&lt;/code&gt;, or piping plain &lt;code&gt;find&lt;/code&gt; output into a tool that splits on whitespace) corrupt or skip such names, occasionally with destructive results. The null-separated pipeline is the correct way to move a list of arbitrary filenames between tools, and find has supported it for decades. Beauty, here, includes correctness: the elegant idiom is also the safe one.&lt;/p&gt;
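
&lt;p&gt;A short sketch of the failure and the fix, using two invented files, one of which has a space in its name. As before, the inline &lt;code&gt;sh -c&lt;/code&gt; script only counts what it is handed; it assumes the temporary directory path itself contains no whitespace.&lt;/p&gt;

```shell
#!/bin/sh
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
touch "$demo/plain.txt" "$demo/two words.txt"

# Naive pipeline: xargs splits on whitespace, so the name with a space
# arrives as two separate words.
find "$demo" -type f | xargs sh -c 'echo "naive: $# names"' sh

# Null-separated pipeline: each filename arrives whole.
find "$demo" -type f -print0 | xargs -0 sh -c 'echo "safe: $# names"' sh
```

&lt;p&gt;Two files go in. The naive pipeline hands over three names, two of which do not exist; the null-separated one hands over two. Swap the counting script for &lt;code&gt;rm&lt;/code&gt; and the reason this matters is plain.&lt;/p&gt;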

&lt;h2&gt;
  
  
  On FreeBSD
&lt;/h2&gt;

&lt;p&gt;FreeBSD ships BSD find in the base system, BSD-licensed, at &lt;code&gt;/usr/bin/find&lt;/code&gt;. It is the lean, POSIX-clean implementation, and on a freshly installed FreeBSD it is simply present, no package required. The same is true on OpenBSD, NetBSD and macOS, all of which carry a BSD-derived find.&lt;/p&gt;

&lt;p&gt;GNU find, part of the GNU findutils package and licensed under the GPL, grew a larger set of primaries over the years (&lt;code&gt;-printf&lt;/code&gt; with its own format language, several regex variants, and more) and accreted complexity accordingly. None of that is wrong, and some of the extensions are genuinely handy; it is simply a different point on the curve between "small and POSIX-clean" and "feature-rich". On FreeBSD it is a &lt;code&gt;pkg install findutils&lt;/code&gt; away, installed as &lt;code&gt;gfind&lt;/code&gt;, for the occasions when a script needs a specific GNU primary. For the daily load, the in-base BSD tool is the whole tool, and that is the version this episode is about: the one that fits in your head.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Lineage
&lt;/h2&gt;

&lt;p&gt;Dick Haight wrote find for Version 7 Unix, released in 1979, along with cpio and expr. He worked in what was then the Unix Support Group, the part of Bell Labs charged with turning the research system into something AT&amp;amp;T could support and ship, rather than in the research group where Unix itself was born.&lt;/p&gt;

&lt;p&gt;There is a well-aired anecdote, preserved in the Unix history archives, that the researchers were faintly put off by the syntax of the USG tools: find did not read like the other commands, with its prefix-expression notation and its little grammar. It was, by the aesthetic of the research room, slightly foreign. They kept it anyway, because it was useful, and because once you stop expecting it to look like grep and start reading it as a query language, it is not foreign at all; it is consistent with itself.&lt;/p&gt;

&lt;p&gt;Forty-seven years later, the expression grammar is essentially unchanged. A find one-liner from a 1980s manual runs today. The modern descendant fd (David Peter, written in Rust, 2017, MIT and Apache licensed) is faster, prettier in its output, and friendlier in its defaults (it ignores &lt;code&gt;.git&lt;/code&gt; and respects &lt;code&gt;.gitignore&lt;/code&gt;), and it reproduces the very same idea: a small set of predicates over a tree walk. The shape was right the first time.&lt;/p&gt;

&lt;p&gt;find is the rare Unix tool that is a little language pretending to be a command. The researchers were right that it reads oddly. They were also right to keep it, because a small grammar that composes every case from a few parts is worth a little oddness at first sight. Learn the vocabulary once, and you can ask the filesystem almost anything, in a sentence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://vivianvoss.net/blog/find" rel="noopener noreferrer"&gt;Read the full article on vivianvoss.net →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;By &lt;a href="https://vivianvoss.net" rel="noopener noreferrer"&gt;Vivian Voss&lt;/a&gt; — System Architect &amp;amp; Software Developer. Follow me on &lt;a href="https://www.linkedin.com/in/vvoss/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; for daily technical writing.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>unix</category>
      <category>freebsd</category>
      <category>cli</category>
      <category>sysadmin</category>
    </item>
    <item>
      <title>GELI vs LUKS: Full-Disk Encryption, Two Shapes</title>
      <dc:creator>Vivian Voss</dc:creator>
      <pubDate>Mon, 25 May 2026 06:31:49 +0000</pubDate>
      <link>https://dev.to/vivian-voss/geli-vs-luks-full-disk-encryption-two-shapes-2l03</link>
      <guid>https://dev.to/vivian-voss/geli-vs-luks-full-disk-encryption-two-shapes-2l03</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foetfq0s24terakuz0pgn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foetfq0s24terakuz0pgn.png" alt="A cyberpunk tech-noir scene in European-comic style: a cramped, neon-lit hacker den at night, rain on the window, magenta and cyan neon reflecting on wet surfaces, cables and monitors everywhere. A young developer with a pink cat-ear headset, holds up a small glowing key-chip toward a screen; where its light falls, a cascade of unreadable garbage characters resolves into a clean line reading " width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The Unix Way — Episode 18&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A laptop is left on a train. With full-disk encryption, the person who finds it has an expensive paperweight and a drive full of noise. Without it, they have your SSH keys, your mail, your password store and your customers' data. The stakes are not subtle. FreeBSD and Linux both solve this problem properly, with mature, audited tooling and the same underlying cipher. They arrive at the solution by rather different routes, and the routes are the interesting part.&lt;/p&gt;

&lt;h2&gt;
  
  
  FreeBSD: GELI
&lt;/h2&gt;

&lt;p&gt;GELI is FreeBSD's disk-encryption framework, and the first thing to understand is that it is not a standalone product bolted onto the system. It is a GEOM class. GEOM is FreeBSD's modular block-storage framework, in which every transformation of a disk (mirroring, striping, labelling, encryption) is a class that consumes one or more providers and presents a new provider. Encryption, in this model, is simply one more transform in the stack.&lt;/p&gt;

&lt;p&gt;You initialise a provider, choosing the cipher, key length and, optionally, a data-authentication algorithm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;geli init &lt;span class="nt"&gt;-e&lt;/span&gt; AES-XTS &lt;span class="nt"&gt;-l&lt;/span&gt; 256 &lt;span class="nt"&gt;-a&lt;/span&gt; HMAC/SHA256 /dev/ada1
geli attach /dev/ada1
newfs /dev/ada1.eli
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;geli init&lt;/code&gt; writes a small metadata block to the last sector of the provider and sets up the master key, encrypted under your passphrase (and, optionally, one or more key files). The default cipher is AES-XTS; the default key length for AES-XTS is 128 bits, so &lt;code&gt;-l 256&lt;/code&gt; is worth specifying for AES-256-XTS. Key strengthening uses PKCS#5v2 (PBKDF2), with the iteration count auto-tuned to the host.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;geli attach&lt;/code&gt; prompts for the passphrase and creates a new provider, &lt;code&gt;/dev/ada1.eli&lt;/code&gt;. Everything written to &lt;code&gt;.eli&lt;/code&gt; is encrypted on its way to &lt;code&gt;/dev/ada1&lt;/code&gt;; everything read is decrypted on the way back. You can put UFS on it with &lt;code&gt;newfs&lt;/code&gt;, or hand it to ZFS as a vdev (&lt;code&gt;zpool create tank /dev/ada1.eli&lt;/code&gt;), at which point you have an encrypted ZFS pool with all of ZFS's checksumming and snapshots intact.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;-a HMAC/SHA256&lt;/code&gt; flag is the quietly elegant part. Without it, GELI provides confidentiality only: an attacker who flips bits on the raw disk cannot read your data, but the corrupted ciphertext will decrypt into corrupted plaintext, silently. With it, GELI stores a keyed HMAC for each sector and verifies it on read. A tampered or degraded sector is detected and reported, not silently served as rubbish. This is authenticated encryption, and it lives in the same tool, behind one flag.&lt;/p&gt;

&lt;p&gt;GELI also handles encrypted swap cleanly with one-time keys: each boot, swap is encrypted under a fresh random key that is discarded at shutdown, so swapped-out secrets never persist. This is configured by appending &lt;code&gt;.eli&lt;/code&gt; to the swap device name in &lt;code&gt;/etc/fstab&lt;/code&gt;, with no separate subsystem involved.&lt;/p&gt;

&lt;h2&gt;
  
  
  Linux: LUKS
&lt;/h2&gt;

&lt;p&gt;LUKS (Linux Unified Key Setup) is the standard, and it is genuinely excellent. The user-facing tool is &lt;code&gt;cryptsetup&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cryptsetup luksFormat /dev/sdb
cryptsetup open /dev/sdb secret
mkfs.ext4 /dev/mapper/secret
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;cryptsetup luksFormat&lt;/code&gt; writes a LUKS2 header (LUKS2 has been the default format since cryptsetup 2.1, 2019) to the start of the device. The header is the part that distinguishes LUKS from raw dm-crypt: it is a documented metadata format holding the cipher parameters and up to thirty-two key slots, so multiple passphrases or key files can unlock the same master key, and any one can be revoked without re-encrypting.&lt;/p&gt;

&lt;p&gt;The defaults are well chosen and, in one respect, ahead of GELI. The default cipher is &lt;code&gt;aes-xts-plain64&lt;/code&gt;. Upstream cryptsetup defaults to a 256-bit key, which XTS splits into two 128-bit halves, so the out-of-the-box result is AES-128-XTS, the same strength as GELI's default; &lt;code&gt;--key-size 512&lt;/code&gt; gives AES-256-XTS. The default key-derivation function in LUKS2 is argon2id, a memory-hard function (by default up to one gibibyte of memory and a target of about two seconds per unlock on the formatting host). Memory-hardness is the property that makes large-scale brute-forcing expensive even on GPUs and custom hardware, because the attacker must provision memory per guess, not merely compute. GELI's PBKDF2 is sound but not memory-hard. On the key-derivation question, LUKS2 is the stronger default, and it is worth saying so plainly.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;cryptsetup open&lt;/code&gt; decrypts the master key with your passphrase and creates &lt;code&gt;/dev/mapper/secret&lt;/code&gt; through device-mapper, the kernel's generic framework for stacking virtual block devices. The underlying crypto target is dm-crypt. If you want authenticated encryption, the equivalent of GELI's HMAC flag, you enable it with &lt;code&gt;cryptsetup&lt;/code&gt;'s &lt;code&gt;--integrity&lt;/code&gt; option, which adds a second device-mapper target, dm-integrity, beneath dm-crypt. It works well; it is a second layer rather than a flag on the first.&lt;/p&gt;

&lt;p&gt;Persistence is via &lt;code&gt;/etc/crypttab&lt;/code&gt; and the initramfs, and LUKS composes with LVM in either order (LVM-on-LUKS or LUKS-on-LVM) depending on whether you want one passphrase for several logical volumes or separate encryption per volume.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Shape
&lt;/h2&gt;

&lt;p&gt;Here is the difference, and it is architectural rather than a matter of which is better.&lt;/p&gt;

&lt;p&gt;On FreeBSD, encryption is a GEOM class. So is mirroring (&lt;code&gt;gmirror&lt;/code&gt;), striping (&lt;code&gt;gstripe&lt;/code&gt;), labelling (&lt;code&gt;glabel&lt;/code&gt;), and journalling (&lt;code&gt;gjournal&lt;/code&gt;). Each consumes providers and presents providers, with a uniform interface, so they stack in any sensible order: a mirror of two encrypted disks, or an encrypted mirror, is the same two classes composed two ways, with the same command vocabulary. The block layer is a single framework of interchangeable transforms, and &lt;code&gt;geli&lt;/code&gt; is one of them. Integrity is not a separate subsystem; it is a flag on the encryption class.&lt;/p&gt;
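
&lt;p&gt;The two orderings can be sketched as follows. This is an illustration under assumptions, not a tested recipe: &lt;code&gt;/dev/ada2&lt;/code&gt; and the mirror name &lt;code&gt;gm0&lt;/code&gt; are hypothetical, the function names are invented, and the commands are wrapped in functions that are never called here, because every one of them destroys data on the named disks.&lt;/p&gt;

```shell
#!/bin/sh
# Sketch only. Device names and the mirror name gm0 are hypothetical.
# Both functions are destructive; nothing in this file calls them.
# Assumes the geom_mirror module is already loaded (for example via gmirror load).

# Ordering 1: encrypt each disk, then mirror the two encrypted providers.
mirror_of_encrypted() {
    geli init -e AES-XTS -l 256 /dev/ada1    # prompts for a passphrase
    geli init -e AES-XTS -l 256 /dev/ada2
    geli attach /dev/ada1                    # creates /dev/ada1.eli
    geli attach /dev/ada2                    # creates /dev/ada2.eli
    gmirror label -v gm0 /dev/ada1.eli /dev/ada2.eli
    newfs /dev/mirror/gm0
}

# Ordering 2: mirror the raw disks, then encrypt the mirror once.
encrypted_mirror() {
    gmirror label -v gm0 /dev/ada1 /dev/ada2
    geli init -e AES-XTS -l 256 /dev/mirror/gm0
    geli attach /dev/mirror/gm0              # creates /dev/mirror/gm0.eli
    newfs /dev/mirror/gm0.eli
}
```

&lt;p&gt;The same two commands appear in both functions; only the order and the provider names change. The first ordering means two passphrase prompts and two encryption passes per write, the second means one of each.&lt;/p&gt;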

&lt;p&gt;On Linux, the same capabilities exist and are at least as powerful, but they are assembled from separate subsystems with separate tooling: the LUKS header format, dm-crypt for encryption, dm-integrity for authentication, dm-verity for read-only integrity, LVM for volume management, mdadm for RAID. Each is a different device-mapper target or a different tool, with its own configuration surface, composed into a stack. The power is there; the uniformity is not. You compose specialised parts rather than uniform ones.&lt;/p&gt;

&lt;p&gt;Neither approach is wrong, and the trade-off is real. The Linux ecosystem's modularity is part of why LUKS2 could adopt argon2id ahead of GELI: key derivation lives in cryptsetup, one focused project with its own maintainers and its own release cadence. FreeBSD's coherence is part of why the answer to "encrypt this, mirror it, and detect tampering" is three flags and one mental model rather than three subsystems.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Point
&lt;/h2&gt;

&lt;p&gt;The honest summary for a working engineer in 2026 is this. This series prizes composability, and on that measure GELI is the more Unix-aligned answer: encryption, per-sector authentication and one-time-keyed swap are one GEOM class with one vocabulary, composing uniformly with the rest of the FreeBSD block layer, so "encrypt it, mirror it, detect tampering" is three flags and one mental model. LUKS reaches the same secure drive that turns a stolen disk into noise, and on key derivation it is genuinely ahead: LUKS2's argon2id is memory-hard where GELI's PBKDF2 is not, a real reason to prefer it where passphrase brute-force resistance dominates the threat model. But LUKS assembles its result from separate specialised subsystems, a LUKS header over dm-crypt over dm-integrity beside LVM, each a different device-mapper target with its own tooling.&lt;/p&gt;

&lt;p&gt;One framework of uniform parts, or several specialised parts composed. Both arrive at the same place: a drive that is, to anyone without the key, honestly just noise. The Unix way prefers the version you can hold in one hand.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://vivianvoss.net/blog/geli-vs-luks" rel="noopener noreferrer"&gt;Read the full article on vivianvoss.net →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;By &lt;a href="https://vivianvoss.net" rel="noopener noreferrer"&gt;Vivian Voss&lt;/a&gt; — System Architect &amp;amp; Software Developer. Follow me on &lt;a href="https://www.linkedin.com/in/vvoss/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; for daily technical writing.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>freebsd</category>
      <category>linux</category>
      <category>encryption</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
