<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: vlad</title>
    <description>The latest articles on DEV Community by vlad (@vlad73).</description>
    <link>https://dev.to/vlad73</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3788004%2F45899419-7bc2-4720-8a9d-86da12af1f2d.png</url>
      <title>DEV Community: vlad</title>
      <link>https://dev.to/vlad73</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vlad73"/>
    <language>en</language>
    <item>
      <title>I Spent $350 on an SSD to Fix a Configuration Problem (ESXi VM Migration Deep Dive)</title>
      <dc:creator>vlad</dc:creator>
      <pubDate>Tue, 24 Feb 2026 12:28:54 +0000</pubDate>
      <link>https://dev.to/vlad73/i-spent-350-on-an-ssd-to-fix-a-configuration-problem-esxi-vm-migration-deep-dive-40nc</link>
      <guid>https://dev.to/vlad73/i-spent-350-on-an-ssd-to-fix-a-configuration-problem-esxi-vm-migration-deep-dive-40nc</guid>
      <description>&lt;p&gt;I needed to migrate VMs off an ESXi host. Standard approach: mount an NFS datastore, use &lt;code&gt;vmkfstools&lt;/code&gt; to copy VMDKs across.&lt;/p&gt;

&lt;p&gt;Expected speed: ~100 MB/s.&lt;br&gt;&lt;br&gt;
Actual speed: 3–5 MB/s.&lt;br&gt;&lt;br&gt;
My conclusion: "The HDD is dying."&lt;/p&gt;

&lt;p&gt;So I did what any reasonable person would do: ordered a 4 TB SSD for $350, cloned the entire VMFS datastore onto it, and prepared to write a guide about how SSDs are essential for ESXi migration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Spoiler: I'm an idiot.&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📊 &lt;strong&gt;&lt;a href="https://cipeople.github.io/esx-migration/benchmark-results.html" rel="noopener noreferrer"&gt;Interactive benchmark charts&lt;/a&gt;&lt;/strong&gt; — full matrix, sync penalty, IOPS data&lt;br&gt;
🐙 &lt;strong&gt;&lt;a href="https://github.com/cipeople/esx-migration" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/strong&gt; — GitHub (raw data coming later)&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h2&gt;The Aha Moment (That Took Too Long)&lt;/h2&gt;

&lt;p&gt;After getting the SSD and seeing 98 MB/s, I benchmarked the original HDD one more time for comparison.&lt;/p&gt;

&lt;p&gt;Result: &lt;strong&gt;82 MB/s.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Same HDD. Same VM. Same everything. Except now it was only about 16% slower than the brand-new $350 SSD.&lt;/p&gt;

&lt;p&gt;I hadn't fixed the hardware. I'd accidentally fixed the configuration — and had no idea what I'd changed.&lt;/p&gt;


&lt;h2&gt;The Real Culprit: NFS Sync Mode&lt;/h2&gt;

&lt;p&gt;After two days of bisecting changes, I found it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# /etc/exports — the line that cost me $350&lt;/span&gt;
/mnt/raid/esxi-import 192.168.0.0/24&lt;span class="o"&gt;(&lt;/span&gt;rw,sync,no_root_squash,no_subtree_check&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="c"&gt;#                                        ^^^^&lt;/span&gt;
&lt;span class="c"&gt;#                                        THIS&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;sync&lt;/code&gt; mode forces the NFS server to confirm every write is physically on disk before acknowledging it to the client. &lt;code&gt;vmkfstools&lt;/code&gt; issues sequential writes and waits for each acknowledgement before sending the next chunk. On spinning rust, each fsync takes ~10ms, which caps you at ~100 acknowledged writes per second. At ~50KB per write, that's 100 × 50KB = 5 MB/s. Exactly what I was seeing.&lt;/p&gt;
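&lt;p&gt;You can see the same cliff locally with &lt;code&gt;dd&lt;/code&gt; — a rough sketch, run on the NFS server itself. &lt;code&gt;TESTDIR&lt;/code&gt; is a placeholder; point it at the exported disk:&lt;/p&gt;

```shell
# Compare buffered writes with per-block-flushed writes. The dsync run
# forces one flush per 50 KB block, roughly what the sync export imposes
# on every vmkfstools chunk.
TESTDIR="${TESTDIR:-/tmp}"

# Buffered: close to what the client sees with an async export
dd if=/dev/zero of="$TESTDIR/ddtest" bs=50K count=200

# One flush per block: close to what sync mode costs on spinning rust
dd if=/dev/zero of="$TESTDIR/ddtest" bs=50K count=200 oflag=dsync

rm -f "$TESTDIR/ddtest"
```

&lt;p&gt;Expect the second run to land far lower on an HDD; the exact ratio depends on the disk's flush latency.&lt;/p&gt;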

&lt;p&gt;The fix:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# /etc/exports&lt;/span&gt;
/mnt/raid/esxi-import 192.168.0.0/24&lt;span class="o"&gt;(&lt;/span&gt;rw,async,no_root_squash,no_subtree_check&lt;span class="o"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Apply — DON'T FORGET THIS STEP&lt;/span&gt;
exportfs &lt;span class="nt"&gt;-ra&lt;/span&gt;

&lt;span class="c"&gt;# Verify it actually loaded&lt;/span&gt;
exportfs &lt;span class="nt"&gt;-v&lt;/span&gt; | &lt;span class="nb"&gt;grep &lt;/span&gt;async
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;exportfs -ra&lt;/code&gt; step is important. I've seen people edit the file and wonder why nothing changed.&lt;/p&gt;




&lt;h2&gt;The Numbers (16-Run Matrix)&lt;/h2&gt;

&lt;p&gt;Once I understood the real variable, I ran a proper benchmark — 16 combinations of source × destination × NIC speed × MTU. Same VM throughout: 32 GB provisioned, ~19 GB allocated (thin/sparse).&lt;/p&gt;
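&lt;p&gt;Each cell is wall-clock throughput over the ~19 GB of allocated data. A tiny helper turns a size and an elapsed time into the MB/s figures below (the 236-second run is an illustrative number, not a logged one; MB is used loosely here, matching the table's units):&lt;/p&gt;

```shell
# Convert GiB copied and elapsed seconds into the table's MB/s figure.
# Illustrative: ~19 GiB in 236 s comes out around the HDD-to-HDD 1G cell.
rate() {
  awk -v gib="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", gib * 1024 / secs }'
}
rate 19 236   # prints 82.4
```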

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Dest&lt;/th&gt;
&lt;th&gt;NIC&lt;/th&gt;
&lt;th&gt;MTU&lt;/th&gt;
&lt;th&gt;MB/s&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;HDD&lt;/td&gt;
&lt;td&gt;HDD&lt;/td&gt;
&lt;td&gt;1G&lt;/td&gt;
&lt;td&gt;1500&lt;/td&gt;
&lt;td&gt;82.4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HDD&lt;/td&gt;
&lt;td&gt;HDD&lt;/td&gt;
&lt;td&gt;1G&lt;/td&gt;
&lt;td&gt;9000&lt;/td&gt;
&lt;td&gt;85.7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HDD&lt;/td&gt;
&lt;td&gt;NVMe&lt;/td&gt;
&lt;td&gt;1G&lt;/td&gt;
&lt;td&gt;1500&lt;/td&gt;
&lt;td&gt;82.9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HDD&lt;/td&gt;
&lt;td&gt;NVMe&lt;/td&gt;
&lt;td&gt;1G&lt;/td&gt;
&lt;td&gt;9000&lt;/td&gt;
&lt;td&gt;85.8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HDD&lt;/td&gt;
&lt;td&gt;HDD&lt;/td&gt;
&lt;td&gt;25G&lt;/td&gt;
&lt;td&gt;1500&lt;/td&gt;
&lt;td&gt;110.2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HDD&lt;/td&gt;
&lt;td&gt;HDD&lt;/td&gt;
&lt;td&gt;25G&lt;/td&gt;
&lt;td&gt;9000&lt;/td&gt;
&lt;td&gt;114.5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HDD&lt;/td&gt;
&lt;td&gt;NVMe&lt;/td&gt;
&lt;td&gt;25G&lt;/td&gt;
&lt;td&gt;1500&lt;/td&gt;
&lt;td&gt;122.9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HDD&lt;/td&gt;
&lt;td&gt;NVMe&lt;/td&gt;
&lt;td&gt;25G&lt;/td&gt;
&lt;td&gt;9000&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;127.4&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SSD&lt;/td&gt;
&lt;td&gt;HDD&lt;/td&gt;
&lt;td&gt;1G&lt;/td&gt;
&lt;td&gt;1500&lt;/td&gt;
&lt;td&gt;98.1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SSD&lt;/td&gt;
&lt;td&gt;HDD&lt;/td&gt;
&lt;td&gt;1G&lt;/td&gt;
&lt;td&gt;9000&lt;/td&gt;
&lt;td&gt;102.6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SSD&lt;/td&gt;
&lt;td&gt;NVMe&lt;/td&gt;
&lt;td&gt;1G&lt;/td&gt;
&lt;td&gt;1500&lt;/td&gt;
&lt;td&gt;98.6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SSD&lt;/td&gt;
&lt;td&gt;NVMe&lt;/td&gt;
&lt;td&gt;1G&lt;/td&gt;
&lt;td&gt;9000&lt;/td&gt;
&lt;td&gt;103.0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SSD&lt;/td&gt;
&lt;td&gt;HDD&lt;/td&gt;
&lt;td&gt;25G&lt;/td&gt;
&lt;td&gt;1500&lt;/td&gt;
&lt;td&gt;155.1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SSD&lt;/td&gt;
&lt;td&gt;HDD&lt;/td&gt;
&lt;td&gt;25G&lt;/td&gt;
&lt;td&gt;9000&lt;/td&gt;
&lt;td&gt;157.4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SSD&lt;/td&gt;
&lt;td&gt;NVMe&lt;/td&gt;
&lt;td&gt;25G&lt;/td&gt;
&lt;td&gt;1500&lt;/td&gt;
&lt;td&gt;285.6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SSD&lt;/td&gt;
&lt;td&gt;NVMe&lt;/td&gt;
&lt;td&gt;25G&lt;/td&gt;
&lt;td&gt;9000&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;289.7&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All runs with NFS &lt;strong&gt;async&lt;/strong&gt;. Sync numbers below.&lt;/p&gt;

&lt;h3&gt;The Sync Penalty — Measured Properly&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Setup&lt;/th&gt;
&lt;th&gt;Async&lt;/th&gt;
&lt;th&gt;Sync&lt;/th&gt;
&lt;th&gt;Multiplier&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;HDD→HDD 1G&lt;/td&gt;
&lt;td&gt;82.4 MB/s&lt;/td&gt;
&lt;td&gt;6.5 MB/s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;12.7×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HDD→HDD 25G&lt;/td&gt;
&lt;td&gt;110.2 MB/s&lt;/td&gt;
&lt;td&gt;6.5 MB/s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;16.9×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SSD→NVMe 25G&lt;/td&gt;
&lt;td&gt;289.7 MB/s&lt;/td&gt;
&lt;td&gt;12.3 MB/s&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;23.5×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The faster the stack, the harder sync hits it. On 25G with NVMe, sync costs you &lt;strong&gt;23.5×&lt;/strong&gt; in throughput. This is not a minor tuning knob.&lt;/p&gt;
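&lt;p&gt;The multiplier column is just the async figure divided by the sync figure, e.g. for the HDD→HDD 1G row:&lt;/p&gt;

```shell
# Async-to-sync throughput ratio: 82.4 MB/s vs 6.5 MB/s
awk 'BEGIN { printf "%.1fx\n", 82.4 / 6.5 }'   # prints 12.7x
```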




&lt;h2&gt;The Bottleneck Hierarchy&lt;/h2&gt;

&lt;p&gt;Each layer proven by isolating variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. NFS async mode     — 13-24× improvement, always, no exceptions
2. Source drive       — 1.2× on 1G, up to 2.3× on 25G+NVMe  
3. Network speed      — 1.3× with HDD, up to 2.9× with SSD+NVMe
4. Destination drive  — ~0 on 1G, +84% on 25G+SSD
5. MTU 9000           — consistent +3-4 MB/s absolute, everywhere
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The hierarchy is strict. &lt;strong&gt;Fix layer N before upgrading layer N+1.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;NVMe destination on 1G = +0.4 MB/s.&lt;br&gt;&lt;br&gt;
NVMe destination on 25G with SSD source = +130 MB/s.&lt;br&gt;&lt;br&gt;
Same hardware. Completely different result.&lt;/p&gt;




&lt;h2&gt;Step-by-Step: The Correct Way&lt;/h2&gt;

&lt;h3&gt;NFS Server Setup&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# /etc/exports&lt;/span&gt;
/mnt/raid/esxi-import 192.168.0.0/24&lt;span class="o"&gt;(&lt;/span&gt;rw,async,no_root_squash,no_subtree_check&lt;span class="o"&gt;)&lt;/span&gt;

exportfs &lt;span class="nt"&gt;-ra&lt;/span&gt;
exportfs &lt;span class="nt"&gt;-v&lt;/span&gt; | &lt;span class="nb"&gt;grep &lt;/span&gt;async   &lt;span class="c"&gt;# verify&lt;/span&gt;
systemctl &lt;span class="nb"&gt;enable&lt;/span&gt; &lt;span class="nt"&gt;--now&lt;/span&gt; nfs-server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
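&lt;p&gt;MTU 9000 shows up in the matrix but isn't configured above. A sketch — the interface and vSwitch names are placeholders, and every hop (server NIC, switch, ESXi vSwitch, vmkernel port) has to agree:&lt;/p&gt;

```shell
# Linux NFS server side (eth0 is a placeholder)
ip link set dev eth0 mtu 9000
ip link show eth0   # verify

# ESXi side: the vSwitch and the vmkernel interface both need it
esxcli network vswitch standard set -v vSwitch0 -m 9000
esxcli network ip interface set -i vmk0 -m 9000

# Verify jumbo frames end to end (8972 = 9000 minus IP/ICMP headers)
vmkping -d -s 8972 192.168.0.105
```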



&lt;h3&gt;ESXi NFS Mount&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;esxcli storage nfs add &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; 192.168.0.105 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-s&lt;/span&gt; /mnt/raid/esxi-import &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-v&lt;/span&gt; migration_target

esxcli storage nfs list   &lt;span class="c"&gt;# verify&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Migrate&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vmkfstools &lt;span class="nt"&gt;-i&lt;/span&gt; /vmfs/volumes/source/vm/disk.vmdk &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; thin &lt;span class="se"&gt;\&lt;/span&gt;
  /vmfs/volumes/migration_target/vm/disk.vmdk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
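&lt;p&gt;For a VM with several disks, the same copy can be looped — a hypothetical sketch that skips the &lt;code&gt;-flat&lt;/code&gt;/&lt;code&gt;-delta&lt;/code&gt;/&lt;code&gt;-ctk&lt;/code&gt; extent files, since &lt;code&gt;vmkfstools&lt;/code&gt; takes the descriptor VMDK and handles its extents itself:&lt;/p&gt;

```shell
# Clone every descriptor VMDK in one VM folder (paths are illustrative)
SRC=/vmfs/volumes/source/vm
DST=/vmfs/volumes/migration_target/vm
mkdir -p "$DST"

for d in "$SRC"/*.vmdk; do
  # skip extent and change-tracking files; only descriptors go to vmkfstools
  case "$d" in *-flat.vmdk|*-delta.vmdk|*-ctk.vmdk) continue ;; esac
  vmkfstools -i "$d" -d thin "$DST/$(basename "$d")"
done
```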



&lt;h3&gt;Validate&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Hash on ESXi while source is still mounted&lt;/span&gt;
&lt;span class="nb"&gt;md5sum&lt;/span&gt; /vmfs/volumes/source/vm/disk-flat.vmdk &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /vmfs/volumes/migration_target/disk.md5

&lt;span class="c"&gt;# Unmount source, then verify cold on Linux&lt;/span&gt;
&lt;span class="nb"&gt;md5sum&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; disk.md5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;The Economics&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Option&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Speed gain&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Fix NFS async&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;td&gt;13-24×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MTU 9000&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;td&gt;+3-4 MB/s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Upgrade to 25G NIC&lt;/td&gt;
&lt;td&gt;~$150 used&lt;/td&gt;
&lt;td&gt;+30-190% (stack dependent)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SSD staging&lt;/td&gt;
&lt;td&gt;$80-350&lt;/td&gt;
&lt;td&gt;+20% on 1G, +41-127% on 25G&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Fix the config before buying hardware. Every time.&lt;/p&gt;




&lt;h2&gt;Lessons Learned&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Configuration mistakes cost more than hardware&lt;/strong&gt; — $350 SSD, fixed by editing one word in a config file&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Measure before you buy&lt;/strong&gt; — 5 minutes of benchmarking would have saved the SSD purchase entirely&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The checklist exists for a reason&lt;/strong&gt; — &lt;code&gt;exportfs -v | grep async&lt;/code&gt; takes 2 seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Upgrades multiply, they don't add&lt;/strong&gt; — NVMe destination means nothing without the network to feed it&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;If you take away one thing:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NFS sync mode will ruin your day, your week, and your migration project.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Enable async. Reload the config. Verify it loaded. Test it works. &lt;em&gt;Then&lt;/em&gt; worry about whether you need an SSD.&lt;/p&gt;




&lt;h2&gt;Want the Full Data?&lt;/h2&gt;

&lt;p&gt;This article covers the fundamentals. The &lt;a href="https://vladster601.gumroad.com/l/selpp" rel="noopener noreferrer"&gt;Advanced Deep Dive&lt;/a&gt; ($9) goes further:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multi-target NFS fan-out (8 simultaneous copies, throughput plateau analysis)&lt;/li&gt;
&lt;li&gt;tmpfs as migration target — eliminating destination storage from the critical path entirely
&lt;/li&gt;
&lt;li&gt;2026 hardware update: dual 25G + iSCSI over NVMe RAID0&lt;/li&gt;
&lt;li&gt;Parallelism scaling data and where it stops helping&lt;/li&gt;
&lt;li&gt;Raw NVMe baseline (concurrent dd readers, queue depth analysis)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://cipeople.github.io/esx-migration/benchmark-results.html" rel="noopener noreferrer"&gt;Interactive benchmark charts&lt;/a&gt; — full matrix, sync penalty, IOPS (free)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/cipeople/esx-migration" rel="noopener noreferrer"&gt;Source&lt;/a&gt; — raw data coming later&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;The destination storage is almost never the real constraint. The advanced article explains what is — and what to do about it.&lt;/p&gt;

&lt;p&gt;get in touch: vlad [at] cipeople [dot] com — this is a solved problem.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;em&gt;All numbers from real hardware, real workloads. NFS async throughout unless explicitly marked sync. February 2026.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>vmware</category>
      <category>esxi</category>
      <category>homelab</category>
      <category>sysadmin</category>
    </item>
  </channel>
</rss>
