DEV Community

mohideen sahib


VMware Snapshots Explained: Internals, Pitfalls, and Deep Dive into Base + Delta Mechanics

🧠 VMware Snapshots — The Complete Deep Dive

Snapshots are one of VMware’s most powerful yet misunderstood features.
They let you capture a VM’s exact state (disk, memory, and config) and return to it later.
But they also impact performance and datastore health if used carelessly.

This post explains — in detail — how snapshots work, what happens during revert, OS impact, and cluster-level risks.


⚙️ 1. What Is a VMware Snapshot?

A snapshot preserves a VM’s disk, memory, and power state at a point in time.
After it’s taken:

The base disk becomes read-only.

All new writes go to a delta disk.

Optionally, memory and CPU state are saved too.


🧩 2. Files Created During a Snapshot

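| File | Purpose |
| --- | --- |
| Base disk (.vmdk) | Original virtual disk; treated as read-only while a snapshot exists. |
| Delta disk (-delta.vmdk) | Stores blocks written after the snapshot. |
| Memory file (.vmem) | Holds RAM contents if "snapshot memory" is enabled. |
| Snapshot state (.vmsn) | Holds the VM's configuration and run state at snapshot time. |
| Snapshot database (.vmsd) | Lists the VM's snapshots and how they relate to each other. |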
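To make the file roles concrete, here is a minimal sketch that labels files in a VM folder by suffix. It only covers the suffixes named in this post; the `classify` helper and the `web01` file names are made up for illustration, and other delta naming schemes are not handled.

```python
from pathlib import PurePosixPath

# Suffix-to-role map, checked in order so that the more specific
# "-delta.vmdk" is matched before the plain ".vmdk" suffix.
ROLES = [
    ("-delta.vmdk", "delta disk (post-snapshot writes)"),
    (".vmdk", "virtual disk or descriptor"),
    (".vmem", "saved memory"),
    (".vmsn", "snapshot state"),
]


def classify(filename: str) -> str:
    """Return the snapshot-related role of a VM file, or 'other'."""
    name = PurePosixPath(filename).name.lower()
    for suffix, role in ROLES:
        if name.endswith(suffix):
            return role
    return "other"


if __name__ == "__main__":
    # Hypothetical file names for a VM called web01.
    for f in ["web01.vmdk", "web01-000001-delta.vmdk",
              "web01-Snapshot1.vmsn", "web01-Snapshot1.vmem", "web01.nvram"]:
        print(f"{f:28} {classify(f)}")
```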


🔍 3. How Snapshot Works

  1. Creation

VMware freezes disk I/O briefly.

A new delta file (vmname-000001-delta.vmdk) is created.

Writes now go to the delta file, keeping the base disk intact.

  2. Retention

Each snapshot adds another delta file, forming a chain.

Reads span across all deltas and the base disk.

  3. Deletion (Commit)

Changes in delta files are merged back into the base disk.

Deletion can trigger heavy I/O depending on delta size.
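The create, retain, and commit steps can be sketched as a toy copy-on-write chain. This is a simplified model for intuition, not how ESXi stores blocks on disk: each layer is a plain dictionary, and `DiskChain` and its method names are invented for this example.

```python
from __future__ import annotations


class DiskChain:
    """Toy copy-on-write chain: one dict (block number -> data) per layer."""

    def __init__(self, base: dict[int, str]):
        self.layers = [dict(base)]  # layers[0] plays the role of the base disk

    def take_snapshot(self) -> None:
        # Freeze the current top layer and start a new, empty delta.
        self.layers.append({})

    def write(self, block: int, data: str) -> None:
        # Writes always land in the topmost (active) layer.
        self.layers[-1][block] = data

    def read(self, block: int) -> str | None:
        # Search newest to oldest; a deeper chain means more lookups.
        for layer in reversed(self.layers):
            if block in layer:
                return layer[block]
        return None

    def delete_oldest_snapshot(self) -> None:
        # Commit: merge the oldest delta into the base, then drop it.
        if len(self.layers) > 1:
            self.layers[0].update(self.layers.pop(1))


if __name__ == "__main__":
    chain = DiskChain({0: "boot", 1: "data-v1"})
    chain.take_snapshot()
    chain.write(1, "data-v2")
    print(chain.layers[0][1])   # base still holds data-v1
    print(chain.read(1))        # data-v2, served from the delta
    chain.delete_oldest_snapshot()
    print(len(chain.layers))    # back to a single layer
```

The merge in `delete_oldest_snapshot` touches every block in the delta, which is why commit cost grows with delta size.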


🔄 4. What Happens During Snapshot Revert

  1. Disk State

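VMware discards the current delta (everything written after the snapshot) and creates a fresh, empty delta on top of the disk state the snapshot froze. Nothing is copied back into the base disk.

The VM now reads from that frozen state, and new writes go to the fresh delta.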

  2. Memory State

If memory was captured, .vmem restores RAM and CPU registers.

Processes resume exactly as they were — no reboot.

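  3. OS Behavior

If memory is restored, the OS is not rebooted and the guest's uptime counter returns to its value at snapshot time. Without a memory snapshot, the VM comes back powered off and boots normally.

Sessions that were open at revert time may drop, but the VM becomes reachable again shortly after.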
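A revert can be sketched with a single-snapshot toy model. It assumes the usual behavior that writes made after the snapshot are thrown away, and that a snapshot taken without memory leaves the VM powered off after revert. `ToyVM` and its fields are invented for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyVM:
    base: dict = field(default_factory=dict)    # frozen once a snapshot exists
    delta: dict = field(default_factory=dict)   # active delta for new writes
    memory: dict = field(default_factory=dict)  # stand-in for guest RAM
    saved_memory: Optional[dict] = None         # stand-in for the .vmem file
    has_snapshot: bool = False
    powered_on: bool = True

    def take_snapshot(self, with_memory: bool) -> None:
        self.has_snapshot = True
        self.delta = {}
        self.saved_memory = dict(self.memory) if with_memory else None

    def write(self, block: int, data: str) -> None:
        target = self.delta if self.has_snapshot else self.base
        target[block] = data

    def read(self, block: int):
        return self.delta.get(block, self.base.get(block))

    def revert(self) -> None:
        # Disk: drop post-snapshot writes and start a fresh delta.
        self.delta = {}
        if self.saved_memory is not None:
            self.memory = dict(self.saved_memory)  # resume, no reboot
            self.powered_on = True
        else:
            self.memory = {}
            self.powered_on = False                # needs a normal boot


if __name__ == "__main__":
    vm = ToyVM(base={0: "config-v1"}, memory={"uptime_s": 100})
    vm.take_snapshot(with_memory=True)
    vm.write(0, "config-v2")
    vm.memory["uptime_s"] = 500
    vm.revert()
    print(vm.read(0), vm.memory["uptime_s"], vm.powered_on)
```

Note that `revert` never touches `base`: the snapshot state was already there, and only the delta is replaced.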


⚠️ 5. Why OS Takes Time to Stabilize After Revert

Even if the vSphere task shows “Revert completed”, the guest OS may need minutes to recover.
That’s because:

Disk caches, journaled filesystems (ext4/NTFS), and swap files revalidate.

ESXi may still be generating background I/O, for example while it loads saved memory and reopens the disk chain.

This causes temporary CPU and I/O spikes until the OS stabilizes.


🧠 6. Why Delta Files Are Needed

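A snapshot point is not a separate copy of the disk. It is the base disk plus every delta that existed when the snapshot was taken:

The first snapshot's state is the base disk alone, which has not changed since it was frozen.

A later snapshot's state is the base plus the deltas created before it, so those files must stay intact.

VMware does not apply deltas backward on revert. It discards the delta written after the chosen snapshot and starts a new one.

Hence, deltas remain essential for every snapshot point above the base.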


📁 7. .vmsn and .vmem Explained

| File | Description |
| --- | --- |
| .vmsn | Snapshot state file holding the VM's configuration and run state at snapshot time. |
| .vmem | Memory dump used to resume the VM's running state without a reboot. |


🧱 8. What You Can’t Do While Snapshots Exist

Snapshots freeze certain VM operations. You can’t:

Change hardware version.

Expand disks or modify RDM mappings.

Add or remove virtual disks.

Convert the VM to a template (in some cases).

These are blocked to maintain snapshot integrity.


🧮 9. Uptime, Reachability & Performance

Uptime Reset: If memory was saved, uptime returns to its value at snapshot time.

Reachability: Minor drop during revert; VM becomes accessible soon after.

Performance: Expect short-lived I/O spikes post-revert.

Duration: Snapshot and revert times grow with the memory captured and the number of disks; deletion time grows with delta size and snapshot depth.


⚡ 10. Speeding Up Snapshots and Reverts

By default, a snapshot covers every virtual disk attached to the VM (independent-mode disks are excluded), which slows operations on VMs with many large disks.

✅ Pro Tip

If a VM has large, static disks (e.g., archive or bulk-data volumes):

Temporarily detach those before taking or reverting a snapshot.

Only attached disks are processed, reducing time drastically.

⚠️ Caution

Never detach disks with OS mounts or active apps.

Always reattach using the same SCSI IDs after the operation.


🧩 11. Managing Snapshots in vSphere

A. Take a Snapshot

  1. Right-click VM → Snapshots → Take Snapshot

  2. Name it (e.g., PrePatch_2025-10-15).

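  3. Choose one capture option (the client does not allow both together):

✅ Snapshot the VM’s memory, to resume the running state on revert

✅ Quiesce guest file system (requires VMware Tools), for a file-system-consistent disk without memory

  4. Click OK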

B. View Snapshots

Right-click VM → Snapshots → Manage Snapshots

C. Revert

  1. Select snapshot → Revert to Snapshot

  2. Wait for the OS to settle before use.
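The same "take a snapshot" step can be scripted. Below is a hedged sketch of a small wrapper around `CreateSnapshot_Task`, the vSphere API method that pyVmomi exposes on a virtual machine object. The `take_snapshot` helper is my own, and it assumes you already have a connected `vm` object from elsewhere; connection and task-waiting code is left out.

```python
def take_snapshot(vm, name: str, description: str = "",
                  memory: bool = False, quiesce: bool = False):
    """Start a snapshot task on a pyVmomi-style VM object and return the task.

    `vm` only needs a CreateSnapshot_Task(name, description, memory, quiesce)
    method, so a stub object works for dry runs.
    """
    if memory and quiesce:
        # Mirror the client UI, which does not offer both options together.
        raise ValueError("choose either memory or quiesce, not both")
    return vm.CreateSnapshot_Task(name=name, description=description,
                                  memory=memory, quiesce=quiesce)


if __name__ == "__main__":
    class FakeVM:
        """Stand-in for a real VM object, used only to show the call shape."""

        def CreateSnapshot_Task(self, name, description, memory, quiesce):
            return {"name": name, "memory": memory, "quiesce": quiesce}

    print(take_snapshot(FakeVM(), "PrePatch_2025-10-15", memory=True))
```

Rejecting the memory-plus-quiesce combination up front is a design choice of this sketch, so a script fails loudly instead of silently picking one option.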


🧹 12. Best Practices

  1. Avoid keeping snapshots >72 hours.

  2. Consolidate or delete snapshots regularly.

  3. Monitor datastore space — deltas grow fast.

  4. Verify app health after revert.

  5. Never use snapshots as backups.
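The 72-hour rule is easy to check in a script. This is a minimal sketch that assumes you have already collected snapshot names and creation times from your inventory; `stale_snapshots` and the sample data are illustrative only.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(hours=72)  # threshold from the best-practice list above


def stale_snapshots(snapshots, now=None, max_age=MAX_AGE):
    """Return the names of snapshots older than max_age.

    `snapshots` is an iterable of (name, created_at) pairs using
    timezone-aware datetimes.
    """
    now = now or datetime.now(timezone.utc)
    return [name for name, created in snapshots if now - created > max_age]


if __name__ == "__main__":
    now = datetime(2025, 10, 20, 12, 0, tzinfo=timezone.utc)
    snaps = [
        ("PrePatch_2025-10-15", datetime(2025, 10, 15, 9, 0, tzinfo=timezone.utc)),
        ("PreUpgrade", datetime(2025, 10, 19, 18, 0, tzinfo=timezone.utc)),
    ]
    print(stale_snapshots(snaps, now=now))  # ['PrePatch_2025-10-15']
```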


⚠️ 13. When Snapshots Grow Too Large — Cluster-Wide Impact

  1. What Happens When They Accumulate

Each snapshot creates a delta that grows with every write.

VMware must traverse all deltas to read a block — adding latency.

Long snapshot chains cause severe disk I/O degradation.

  2. VM-Level Impact

Slower I/O and degraded performance.

Long consolidation (merge) times.

Backup jobs slow down or fail.

Datastore fill-up can pause or crash VMs.

  3. Cluster-Level Impact

Datastore Pressure: Deltas consume vast space.

Migration Failures: Large chains lengthen Storage vMotion transfers, which can then time out or fail.

I/O Spikes: Snapshot consolidations trigger datastore storms.

vSAN Issues: More objects and resync operations, slowing cluster balance.
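To size the datastore pressure, a rough upper bound helps. The sketch below assumes each delta can grow to about the provisioned size of its disk, and that each snapshot taken with memory also stores roughly the VM's RAM size. The function name and the numbers are illustrative.

```python
def worst_case_snapshot_gb(disk_gb, snapshots, memory_gb=0):
    """Upper bound on the extra datastore space a VM's snapshots can use.

    disk_gb:   provisioned sizes of the VM's virtual disks, in GB
    snapshots: number of snapshots in the chain
    memory_gb: RAM saved per snapshot (0 if memory is not captured)
    """
    return snapshots * (sum(disk_gb) + memory_gb)


if __name__ == "__main__":
    # Two disks (100 GB and 500 GB), three snapshots, 16 GB of RAM each.
    print(worst_case_snapshot_gb([100, 500], snapshots=3, memory_gb=16))  # 1848
```

Real usage is normally far below this bound, but the bound is what matters when deciding how much free space a datastore needs to stay safe.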

  4. Prevention

Automate snapshot cleanup with vCenter alarms or scripts.

Monitor datastore usage.

Keep chain depth ≤ 2–3.

Schedule consolidations during off-hours.

Use backup tools that auto-remove snapshots.
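A cleanup script needs a way to measure chain depth. Here is a small sketch that walks a snapshot tree shaped like the one pyVmomi returns from `vm.snapshot.rootSnapshotList`, where each node has a `childSnapshotList`. The `chain_depth` and `needs_cleanup` helpers and the default limit of 3 are my own choices, taken from the guidance above.

```python
from types import SimpleNamespace


def chain_depth(snapshot_nodes) -> int:
    """Length of the longest path through a snapshot tree (0 if empty)."""
    return max(
        (1 + chain_depth(getattr(node, "childSnapshotList", None) or [])
         for node in snapshot_nodes),
        default=0,
    )


def needs_cleanup(snapshot_nodes, max_depth: int = 3) -> bool:
    """True when the deepest chain exceeds max_depth."""
    return chain_depth(snapshot_nodes) > max_depth


if __name__ == "__main__":
    # Stand-in tree: snap1 -> snap2 -> snap3 -> snap4
    leaf = SimpleNamespace(name="snap4", childSnapshotList=[])
    tree = [SimpleNamespace(name="snap1", childSnapshotList=[
        SimpleNamespace(name="snap2", childSnapshotList=[
            SimpleNamespace(name="snap3", childSnapshotList=[leaf])])])]
    print(chain_depth(tree), needs_cleanup(tree))  # 4 True
```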

  5. In Short

Large snapshots are silent datastore killers.
The more deltas you keep, the slower your VMs — and the riskier your cluster.
Consolidate early, consolidate often.


🧾 14. Summary

| Area | Key Point |
| --- | --- |
| Snapshot Role | Point-in-time rollback for quick recovery or testing |
| Delta Files | Hold all post-snapshot changes |
| Revert | Restores disk/memory state without reboot |
| OS Impact | May pause briefly as background I/O completes |
| Performance Tip | Detach static disks for faster snapshot ops |
| Cluster Risk | Large deltas impact datastore and vMotion |
| Best Practice | Keep snapshots short-lived and managed |


✍️ In Short

VMware snapshots are like time machines — powerful but costly.
Every revert, merge, and delta read adds I/O overhead.
Use them wisely, monitor size, and let the OS stabilize before declaring success.
