Joichiro Mitaka

50% of Cloud Hard Drives are Now SMR: Why Hyperscalers Love What Consumers Hate

If you hang around r/DataHoarder or home lab communities, you’ve likely heard the golden rule of buying hard drives: "Avoid SMR at all costs."

The general consensus is that Shingled Magnetic Recording (SMR) drives have abysmal write speeds, fail early, and will crash your RAID arrays. Yet, if you look at the enterprise storage market today, a massive paradox emerges.


(Caption: Data from Western Digital’s Quarterly Earnings showing UltraSMR crossing 50% of nearline exabyte shipments in 2023)

The numbers don't lie. During recent earnings calls, Western Digital confirmed a massive milestone: their UltraSMR technology now makes up over 50% of their total nearline exabyte shipments. Cloud giants like Dropbox rely on SMR for over 90% of their entire storage fleet. In fact, production of high-capacity SMR drives is essentially sold out as AI companies and data center operators buy them by the truckload.

So, SMR offers higher capacity and a lower price per terabyte. Why does everyone in the consumer and small enterprise space avoid them, and more importantly, how can you start using them like the cloud giants do?

Let's demystify SMR technology, look at how the hyperscalers bypassed its limitations, and explore how you can leverage it in your own home lab or small enterprise using modern open-source tools.


The Root of the Stigma: Drive-Managed SMR and the Physics of the Write Head

To understand the hate, you have to understand the physics of the technology.

Hard drive manufacturers eventually hit a physical wall: the magnetic write head of a hard drive is physically wider than the read head. You need a certain amount of magnetic energy to flip bits on the platter, meaning the write head can only be shrunk so much before it fails to work. However, the read head can be made incredibly tiny.

Conventional Magnetic Recording (CMR) accommodates this bulky write head by writing wide data tracks and leaving tiny gaps between them so they don't overlap.

SMR, on the other hand, exploits the tiny read head. It writes a wide track, and then writes the next track partially overlapping the first—like shingles on a roof. This leaves just enough of the previous track exposed for the smaller read head to read, allowing manufacturers to squeeze up to 25% more data onto the same physical platters.

The catch? You cannot randomly overwrite data. Because the tracks overlap, modifying a single block of data requires the drive to read a massive "zone" of data into a cache, modify it, and rewrite the entire zone sequentially.
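To put a rough number on that penalty, here is a back-of-the-envelope sketch. The 256 MiB zone size and the 4 KiB update size are assumptions chosen for illustration (drive-managed drives do not publish their internal band sizes), and the helper name is hypothetical.

```python
def rmw_write_amplification(zone_bytes: int, modified_bytes: int) -> float:
    """Bytes physically rewritten per byte the host asked to change,
    assuming the whole zone is rewritten for every in-place update."""
    if modified_bytes <= 0 or modified_bytes > zone_bytes:
        raise ValueError("modified_bytes must be in the range (0, zone_bytes]")
    return zone_bytes / modified_bytes


ZONE_BYTES = 256 * 1024 * 1024  # assumed zone size: 256 MiB
BLOCK_BYTES = 4 * 1024          # assumed update size: 4 KiB

print(rmw_write_amplification(ZONE_BYTES, BLOCK_BYTES))  # 65536.0
```

Under those assumptions, one small in-place update costs tens of thousands of times more physical writing than the change itself. That is why random overwrites are the workload to avoid.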

The stigma started around 2020 when manufacturers silently slipped Drive-Managed SMR into consumer NAS drives. These drives pretended to be normal CMR drives to the host operating system, attempting to manage all that complex "read-modify-write" garbage collection internally. Under heavy, random-write NAS workloads, their caches would fill up, performance would tank to dial-up speeds, and RAID controllers would assume the drives were dead and drop them from arrays.

Consumers rightly felt burned. But the cloud providers took a completely different approach.


The Cloud's Secret Weapon: Host-Managed SMR, Aggressive Tiering, and Pin 3

Hyperscalers like Dropbox and Google didn't try to force SMR drives to act like traditional CMR drives. Instead, they completely rebuilt their storage infrastructure around three core concepts.

1. Host-Managed Zoned Storage

In a Host-Managed setup, the hard drive doesn't try to hide its shingled nature. It exposes its architecture to the operating system as a Zoned Block Device (ZBD). The drive essentially says to the host: "Here are my zones. You can only write to them sequentially. If you want to delete or overwrite something, you must reset the entire zone and start from the beginning." By shifting the responsibility from the drive's underpowered internal processor to the host machine's software, the read-modify-write penalty disappears, as long as that software keeps its writes sequential.
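The contract the drive describes can be modeled in a few lines. This is a toy in-memory sketch, not a driver: the `Zone` class and its method names are hypothetical, and a real zoned device enforces these rules in firmware and reports violations as I/O errors.

```python
class Zone:
    """Toy model of one sequential-write-required zone."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.write_pointer = 0          # next offset the host may write to
        self.data = bytearray(capacity)

    def write(self, offset: int, payload: bytes) -> None:
        # Writes are only accepted exactly at the write pointer.
        if offset != self.write_pointer:
            raise OSError(
                f"non-sequential write at {offset}, write pointer is {self.write_pointer}"
            )
        if offset + len(payload) > self.capacity:
            raise OSError("write exceeds zone capacity")
        self.data[offset:offset + len(payload)] = payload
        self.write_pointer += len(payload)

    def read(self, offset: int, length: int) -> bytes:
        # Reads may be random, but only over data that has been written.
        if offset < 0 or offset + length > self.write_pointer:
            raise OSError("read beyond write pointer")
        return bytes(self.data[offset:offset + length])

    def reset(self) -> None:
        # "Deleting" means discarding the whole zone and starting over.
        self.write_pointer = 0


zone = Zone(capacity=16)
zone.write(0, b"cold")
zone.write(4, b"data")
print(zone.read(0, 8))  # b'colddata'
```

Trying `zone.write(0, b"oops")` at this point raises `OSError`, because the write pointer sits at offset 8. The only way to reuse offset 0 is to call `zone.reset()` first.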

2. Aggressive Telemetry and Tiering

Rewriting the file system is only half the magic. Cloud services have immense intelligence about your files and access patterns. When you upload a file, it rarely lands directly on a spinning disk; it hits a blazing-fast NVMe cache (the "hot" tier).

However, the moment their algorithms determine that a file is no longer actively being used—say, a cloud backup from three months ago, or an old photo album—they aggressively tier that data. It is migrated off the expensive flash and funneled sequentially, block by tightly-packed block, into the zones of a high-capacity SMR drive (the "cold" tier).
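A minimal sketch of that tiering decision, assuming a simple "idle for longer than N seconds" rule. Real hyperscaler policies are not public, so `FileRecord`, `select_cold`, `pack_sequentially`, and the 90-day threshold are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    name: str
    size: int           # bytes
    last_access: float  # seconds since epoch


def select_cold(files, now, max_idle_seconds):
    """Return files idle for longer than the threshold, oldest first."""
    cold = [f for f in files if now - f.last_access > max_idle_seconds]
    return sorted(cold, key=lambda f: f.last_access)


def pack_sequentially(files, zone_size):
    """Assign each file a (zone, offset), filling zones front to back."""
    layout = {}
    zone, offset = 0, 0
    for f in files:
        if f.size > zone_size:
            raise ValueError(f"{f.name} does not fit in a single zone")
        if offset + f.size > zone_size:
            zone, offset = zone + 1, 0  # current zone is full, open the next
        layout[f.name] = (zone, offset)
        offset += f.size
    return layout


DAY = 86_400
files = [
    FileRecord("backup.tar", 60, last_access=0),
    FileRecord("album.zip", 50, last_access=5 * DAY),
    FileRecord("draft.txt", 10, last_access=95 * DAY),
]
cold = select_cold(files, now=100 * DAY, max_idle_seconds=90 * DAY)
print(pack_sequentially(cold, zone_size=100))
# {'backup.tar': (0, 0), 'album.zip': (1, 0)}
```

The recently touched `draft.txt` stays on the hot tier. The two cold files are laid out in write order, so the drive never has to revisit a zone it has already filled.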

3. Pin 3 Power Management (Letting the Disks Go Cold)

Once that data is packed into an SMR drive, cloud providers take it a step further: they literally pull the plug.

If you've ever "shucked" an external hard drive for your home lab, you might have run into the infamous "Pin 3 Power Disable" (PWDIS) issue, where you had to put Kapton tape over the third pin of the SATA power connector just to get the drive to spin up. Older power supplies feed 3.3 V to that pin, which a PWDIS-capable drive reads as an instruction to stay powered down. While consumers were taping over this pin out of frustration, hyperscalers were demanding this feature by design.

The PWDIS feature allows data centers to remotely hard-reset or cut power to individual drives. When an SMR drive is full of cold data, it doesn't just sit there spinning, consuming electricity, and generating heat. It is commanded to spin down. It sits entirely dormant, acting as a high-density, rapidly accessible tape replacement. It only wakes up when an API call requests a specific dormant file—spinning up, serving the file with a few seconds of acceptable latency, and going back to sleep.

The Power of Zoned.io

If you want to dive deep into how the foundation of this works on a technical level, the ultimate resource is Zoned.io. Backed by major storage players, it documents the Linux kernel's implementation of Zoned Block Devices. Zoned Storage is becoming critical not just for SMR hard drives, but for next-generation NVMe SSDs (ZNS) that use the exact same sequential-write paradigms to reduce write amplification and extend flash memory life.
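On a Linux machine you can check how the kernel classifies a drive by reading a few sysfs attributes. The sketch below assumes the standard block-queue attributes `zoned`, `chunk_sectors`, and `nr_zones` under `/sys/block/<device>/queue/`. The function name and the `sysfs_root` parameter are hypothetical conveniences, the latter so the function can be pointed at a test directory.

```python
from pathlib import Path


def describe_zoned_device(name: str, sysfs_root: str = "/sys/block") -> dict:
    """Report the zoned model of a block device as the kernel sees it."""
    queue = Path(sysfs_root) / name / "queue"
    # "none" for conventional drives (including drive-managed SMR),
    # "host-managed" or "host-aware" for zoned devices.
    model = (queue / "zoned").read_text().strip()
    info = {"device": name, "model": model}
    if model != "none":
        # For zoned devices, chunk_sectors is the zone size in 512-byte sectors.
        sectors = int((queue / "chunk_sectors").read_text())
        info["zone_size_bytes"] = sectors * 512
        info["nr_zones"] = int((queue / "nr_zones").read_text())
    return info
```

Calling `describe_zoned_device("sda")` on a conventional or drive-managed disk should report `"model": "none"`. Drive-managed SMR is invisible to the host, which is exactly the problem described earlier.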


Bringing SMR to the Home Lab and Small Enterprise

You don't need to be AWS or Dropbox to take advantage of SMR pricing and density. You just need to pay attention to your workload requirements and use software designed to treat storage sequentially.

SMR is an absolute powerhouse for WORM (Write Once, Read Many) workloads:

  • Media archiving and Plex servers
  • Immutable backup targets
  • Log aggregation and long-term surveillance video storage
  • Data hoarding and cold storage

If you are using ZFS, you can utilize sequential-friendly configurations, but to truly unlock the potential of Host-Managed SMR in a small enterprise or home lab, you need purpose-built archiving tools.


The Do's and Don'ts of SMR Archiving

SMR rewards write-once workloads, but only if you follow the rules.

The Don'ts: RAID, ZFS, and Standard File Systems

The biggest mistake you can make is blindly handing an SMR drive over to your operating system or RAID controller.

  • Don't use hardware RAID or standard ZFS. RAID arrays (and ZFS RAIDZ) rely on parity calculations and constant metadata updates. If an array needs to rebuild or "resilver", it blasts the drives with random, fragmented writes. The SMR cache will choke, write speeds will drop to practically zero, and the controller will drop the drive entirely.
  • Don't just format it to ext4 or NTFS. Traditional file systems are designed for random I/O. Even if you are just sequentially copying a large video file, the file system keeps sending the write head back and forth across the platter to update the journal, timestamps, and file allocation tables. This micro-random I/O ruins SMR performance.

The Do's: Buffer, Batch, and Stream

Instead of fighting the drive with random I/O, treat it exactly like a modern tape drive.

  • Do use an SSD staging tier. Borrow the cloud provider's playbook. Buffer your incoming files on an NVMe or standard SSD first. Once you have accumulated a large batch of cold data, flush it to the SMR drive in one massive, continuous stream.
  • Do use Zoned-aware storage layers. The Linux kernel has made massive strides here. For example, btrfs now has a dedicated "Zoned Mode" explicitly designed to work with Zoned Block Devices. It forces all writes to be strictly sequential, keeping metadata organized and eliminating the read-modify-write penalty.
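The buffer-and-batch idea can be sketched with nothing but the Python standard library. This is an illustration under assumptions, not a finished archiver: `flush_if_ready`, the directory layout, and the threshold are hypothetical, and it assumes `archive_path` lives on storage that handles sequential streams well (for example a zoned-aware file system on the SMR drive).

```python
import tarfile
from pathlib import Path


def flush_if_ready(staging_dir: Path, archive_path: Path, threshold_bytes: int) -> bool:
    """Write everything in staging_dir to one tar stream once enough has piled up.

    Returns True if a flush happened, False if the batch is still too small.
    """
    files = sorted(p for p in staging_dir.iterdir() if p.is_file())
    if sum(p.stat().st_size for p in files) < threshold_bytes:
        return False  # keep buffering on the SSD

    # Mode "w|" opens the archive as a forward-only stream, so tarfile
    # appends member after member and never seeks backwards in the output.
    with tarfile.open(str(archive_path), mode="w|") as tar:
        for p in files:
            tar.add(str(p), arcname=p.name)

    # Only clear the staging tier after the archive has been closed.
    for p in files:
        p.unlink()
    return True
```

A cron job or a small daemon could call this every few minutes with something like `flush_if_ready(Path("/mnt/ssd/staging"), Path("/mnt/smr/batch-0001.tar"), 50 * 1024**3)`. Both paths and the 50 GiB threshold are placeholders. Each call that returns `True` produces one long sequential write instead of thousands of small ones.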

If you want to go deeper into SMR, "Stop Murdering your SMR Drives" is a good resource. But if you want to take advantage of SMR pricing and density today, you must completely rethink your storage architecture.

Conclusion

The storage industry is moving toward Zoned Storage, whether consumers like it or not. With the massive shift toward UltraSMR in the enterprise space, the era of ignoring the technology is over.

It's time we stop universally hating SMR drives and start treating them like the specialized, high-capacity archiving tools they are. If you stop handing them directly to standard file systems and instead provide them with the buffered, sequential workloads they crave, they will reward you with massive capacity, incredible cost savings, and a much lower power bill.

Check out the documentation at Zoned.io to understand the kernel-level magic, and if you are looking for an out-of-the-box archiving solution that loves SMR drives, check out AGPL archiving at Huskhord.com.
