I needed to migrate VMs off an ESXi host. Standard approach: mount an NFS datastore, use vmkfstools to copy VMDKs across.
Expected speed: ~100 MB/s.
Actual speed: 3-5 MB/s.
My conclusion: "The HDD is dying."
So I did what any reasonable person would do: ordered a 4 TB SSD for $350, cloned the entire VMFS datastore onto it, and prepared to write a guide about how SSDs are essential for ESXi migration.
Spoiler: I'm an idiot.
Interactive benchmark charts: full matrix, sync penalty, IOPS data
Source on GitHub (raw data coming later)
## The Aha Moment (That Took Too Long)
After getting the SSD and seeing 98 MB/s, I benchmarked the original HDD one more time for comparison.
Result: 82 MB/s.
Same HDD. Same VM. Same everything. Except now it was within 20% of the brand-new $350 SSD.
I hadn't fixed the hardware. I'd accidentally fixed the configuration, and had no idea what I'd changed.
## The Real Culprit: NFS Sync Mode
After two days of bisecting changes, I found it.
```
# /etc/exports - the line that cost me $350
/mnt/raid/esxi-import 192.168.0.0/24(rw,sync,no_root_squash,no_subtree_check)
#                                       ^^^^
#                                       THIS
```
sync mode forces the NFS server to confirm every write is physically on disk before acknowledging it to the client. vmkfstools issues sequential writes and waits for each acknowledgement before sending the next chunk. On spinning rust, each flush takes ~10 ms, which caps you at ~100 writes/sec. Do the math: 100 writes/sec × ~50 KB = ~5 MB/s. Exactly what I was seeing.
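That back-of-envelope number can be sanity-checked in a line of shell. The 100 writes/sec and 50 KB figures are the estimates from above, not measured constants:

```shell
# Estimated throughput ceiling when every write must be flushed to disk
# before the next one is sent (~10 ms per flush on a spinning disk).
writes_per_sec=100   # one flush per ~10 ms
kb_per_write=50      # approximate chunk size (assumption)
echo "$(( writes_per_sec * kb_per_write / 1000 )) MB/s"   # prints "5 MB/s"
```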
The fix:
```
# /etc/exports
/mnt/raid/esxi-import 192.168.0.0/24(rw,async,no_root_squash,no_subtree_check)

# Apply - DON'T FORGET THIS STEP
exportfs -ra

# Verify it actually loaded
exportfs -v | grep async
```
That last `exportfs -ra` is important. I've seen people edit the file and then wonder why nothing changed. One honest caveat: async trades durability for speed - a server crash mid-copy can lose acknowledged writes. For a migration you can simply re-run, that's a fine trade.
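If you'd rather see the flush cost directly than infer it, a quick probe on the NFS server's backing disk is enough. The path and sizes here are illustrative; `oflag=dsync` forces the per-write flush that NFS sync mode imposes on every client write:

```shell
# Write 100 x 50 KB chunks with a flush after each one -- roughly what the
# ESXi client experiences against a sync-mode export on this disk.
probe=/tmp/syncprobe.bin   # point this at the real export path for a real test
dd if=/dev/zero of="$probe" bs=50K count=100 oflag=dsync 2>&1 | tail -n1
rm -f "$probe"
```

Compare the reported rate with a plain `dd` (no `oflag=dsync`) on the same path; the gap is your sync penalty.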
## The Numbers (16-Run Matrix)
Once I understood the real variable, I ran a proper benchmark: 16 combinations of source × destination × NIC speed × MTU. Same VM throughout: 32 GB provisioned, ~19 GB allocated (thin/sparse).
| Source | Dest | NIC | MTU | MB/s |
|---|---|---|---|---|
| HDD | HDD | 1G | 1500 | 82.4 |
| HDD | HDD | 1G | 9000 | 85.7 |
| HDD | NVMe | 1G | 1500 | 82.9 |
| HDD | NVMe | 1G | 9000 | 85.8 |
| HDD | HDD | 25G | 1500 | 110.2 |
| HDD | HDD | 25G | 9000 | 114.5 |
| HDD | NVMe | 25G | 1500 | 122.9 |
| HDD | NVMe | 25G | 9000 | 127.4 |
| SSD | HDD | 1G | 1500 | 98.1 |
| SSD | HDD | 1G | 9000 | 102.6 |
| SSD | NVMe | 1G | 1500 | 98.6 |
| SSD | NVMe | 1G | 9000 | 103.0 |
| SSD | HDD | 25G | 1500 | 155.1 |
| SSD | HDD | 25G | 9000 | 157.4 |
| SSD | NVMe | 25G | 1500 | 285.6 |
| SSD | NVMe | 25G | 9000 | 289.7 |
All runs with NFS async. Sync numbers below.
## The Sync Penalty, Measured Properly
| Setup | Async | Sync | Multiplier |
|---|---|---|---|
| HDD→HDD 1G | 82.4 MB/s | 6.5 MB/s | 12.7× |
| HDD→HDD 25G | 110.2 MB/s | 6.5 MB/s | 16.9× |
| SSD→NVMe 25G | 289.7 MB/s | 12.3 MB/s | 23.5× |
The faster the rest of the stack, the more sync destroys it. On 25G with NVMe, sync costs you 23.5× in throughput. This is not a minor tuning knob.
## The Bottleneck Hierarchy
Each layer proven by isolating variables:
1. NFS async mode: 13-24× improvement, always, no exceptions
2. Source drive: 1.2× on 1G, up to 2.3× on 25G+NVMe
3. Network speed: 1.3× with HDD, up to 2.9× with SSD+NVMe
4. Destination drive: ~0 on 1G, +84% on 25G+SSD
5. MTU 9000: a consistent +3-4 MB/s absolute, everywhere
The hierarchy is strict. Fix layer N before upgrading layer N+1.
NVMe destination on 1G = +0.4 MB/s.
NVMe destination on 25G with SSD source = +130 MB/s.
Same hardware. Completely different result.
## Step-by-Step: The Correct Way

### NFS Server Setup
```
# /etc/exports
/mnt/raid/esxi-import 192.168.0.0/24(rw,async,no_root_squash,no_subtree_check)

exportfs -ra
exportfs -v | grep async   # verify
systemctl enable --now nfs-server
```
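If you want the MTU 9000 bump from the matrix, jumbo frames have to be enabled end-to-end: server NIC, switch, and the ESXi vmkernel port. The interface and vSwitch names below are illustrative; substitute your own:

```shell
# Linux NFS server side (replace eth0 with your interface)
ip link set dev eth0 mtu 9000

# ESXi side (run in the ESXi shell) -- vSwitch and vmkernel port must both match
esxcli network vswitch standard set -v vSwitch0 -m 9000
esxcli network ip interface set -i vmk0 -m 9000

# Verify the path end-to-end from ESXi: a 8972-byte ping with DF set
# fails immediately if any hop lacks jumbo frames
vmkping -d -s 8972 192.168.0.105
```

A mismatched MTU anywhere on the path silently costs throughput, so verify before benchmarking.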
### ESXi NFS Mount
```
esxcli storage nfs add \
  -H 192.168.0.105 \
  -s /mnt/raid/esxi-import \
  -v migration_target

esxcli storage nfs list   # verify
```
### Migrate
```
vmkfstools -i /vmfs/volumes/source/vm/disk.vmdk \
  -d thin \
  /vmfs/volumes/migration_target/vm/disk.vmdk
```
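For more than a couple of VMs, the per-disk command gets tedious. A hypothetical batch helper, sketched under the assumption that each VM lives in its own directory and that only descriptor `.vmdk` files should be passed to vmkfstools (`-flat`/`-delta` extents are copied via their descriptor):

```shell
# Hypothetical helper: clone every VMDK descriptor under SRC to DST as thin.
migrate_all() {
  SRC=$1; DST=$2
  for vmdir in "$SRC"/*/; do
    vm=$(basename "$vmdir")
    mkdir -p "$DST/$vm"
    for vmdk in "$vmdir"*.vmdk; do
      case $vmdk in *-flat.vmdk|*-delta.vmdk) continue ;; esac   # extents: skip
      [ -e "$vmdk" ] || continue                                 # empty glob: skip
      vmkfstools -i "$vmdk" -d thin "$DST/$vm/$(basename "$vmdk")"
    done
  done
}

# Usage (on the ESXi host):
#   migrate_all /vmfs/volumes/source /vmfs/volumes/migration_target
```

Power off each VM (or snapshot and copy the base disk) before cloning; vmkfstools on a disk that's being written to gives you a corrupt copy.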
### Validate
```
# Hash on ESXi while the source is still mounted
md5sum /vmfs/volumes/source/vm/disk-flat.vmdk > /vmfs/volumes/migration_target/disk.md5

# Unmount the source, then verify cold on the Linux side.
# Note: md5sum -c replays the path stored in the file, so run it somewhere
# that path resolves -- or hash the copy directly and compare digests by eye.
md5sum -c disk.md5
```
## The Economics
| Option | Cost | Speed gain |
|---|---|---|
| Fix NFS async | $0 | 13-24× |
| MTU 9000 | $0 | +3-4 MB/s |
| Upgrade to 25G NIC | ~$150 used | +30-150% (stack dependent) |
| SSD staging | $80-350 | +20% on 1G, +84% on 25G |
Fix the config before buying hardware. Every time.
## Lessons Learned
- Configuration mistakes cost more than hardware: a $350 SSD, fixed by editing one word in a config file
- Measure before you buy: 5 minutes of benchmarking would have saved the SSD purchase entirely
- The checklist exists for a reason: `exportfs -v | grep async` takes 2 seconds
- Upgrades multiply, they don't add: an NVMe destination means nothing without the network to feed it
## Conclusion
If you take away one thing:
NFS sync mode will ruin your day, your week, and your migration project.
Enable async. Reload the config. Verify it loaded. Test it works. Then worry about whether you need an SSD.
## Want the Full Data?
This article covers the fundamentals. The Advanced Deep Dive ($9) goes further:
- Multi-target NFS fan-out (8 simultaneous copies, throughput plateau analysis)
- tmpfs as a migration target: eliminating destination storage from the critical path entirely
- 2026 hardware update: dual 25G + iSCSI over NVMe RAID0
- Parallelism scaling data and where it stops helping
- Raw NVMe baseline (concurrent dd readers, queue depth analysis)
- Interactive benchmark charts: full matrix, sync penalty, IOPS (free)
- Source: raw data coming later
The destination storage is almost never the real constraint. The advanced article explains what is, and what to do about it.
Get in touch: vlad [at] cipeople [dot] com. This is a solved problem.
All numbers from real hardware, real workloads. NFS async throughout unless explicitly marked sync. February 2026.