By Alison / Last Updated April 25, 2022

What is VM data deduplication

If you have limited storage space or just want to save on VM storage costs, you should get to know VMware data deduplication.

Data deduplication is a simple and practical space-efficiency feature. In VMware environments, it is often used in combination with data compression. Deduplication removes redundant data blocks, while compression removes additional redundant data within each block. Both can effectively reduce the amount of physical storage required to store the data.
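To make the distinction concrete, here is a minimal Python sketch. It is not vSAN's actual implementation. It splits a buffer into fixed-size blocks, keeps one copy of each distinct block (deduplication), and then compresses each remaining block on its own (compression). The SHA-256 fingerprint, the use of zlib, and all function names are assumptions chosen purely for illustration.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # illustrative block size in bytes


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Split data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def dedupe(blocks: list[bytes]) -> dict[str, bytes]:
    """Keep one copy of each distinct block, keyed by its SHA-256 digest."""
    unique: dict[str, bytes] = {}
    for block in blocks:
        unique.setdefault(hashlib.sha256(block).hexdigest(), block)
    return unique


def compress(unique: dict[str, bytes]) -> dict[str, bytes]:
    """Compress each remaining block individually."""
    return {digest: zlib.compress(block) for digest, block in unique.items()}


# Three blocks, two of them identical; each block is internally repetitive.
data = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"A" * BLOCK_SIZE
blocks = split_blocks(data)
unique = dedupe(blocks)
stored = compress(unique)

logical_bytes = len(data)
after_dedupe = sum(len(block) for block in unique.values())
after_compress = sum(len(block) for block in stored.values())
print(logical_bytes, after_dedupe, after_compress)
```

In this toy run, deduplication drops the repeated block, and compression then shrinks the two blocks that are left, so the two techniques stack rather than compete.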

In this article, I will introduce how VMware data deduplication works and the detailed steps to enable it.

How to achieve VMware data deduplication

VMware data deduplication is not a function of ESXi itself; it is provided by vSAN, a software-defined storage component that is fully integrated with vSphere. This means that you must have a valid vSAN license to enable it on your cluster.

vSAN is a two-tier distributed storage system made up of a cache tier and a capacity tier. To provide a higher level of storage performance, active VM data is first written to the write buffer in the cache tier, and write acknowledgements are sent immediately to the guest. When the data is no longer active, the cold data is destaged to the capacity tier at a time and frequency determined by vSAN.
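The write path described above can be pictured with a toy Python model. This is only a sketch under simplifying assumptions: the class name `TwoTierStore`, its methods, and the idea of passing in a list of "cold" addresses are all hypothetical, and real vSAN decides on its own what to destage and when.

```python
from dataclasses import dataclass, field


@dataclass
class TwoTierStore:
    """Toy model: a write buffer in front of a capacity tier."""
    write_buffer: dict[int, bytes] = field(default_factory=dict)
    capacity_tier: dict[int, bytes] = field(default_factory=dict)

    def write(self, address: int, block: bytes) -> str:
        # The write lands in the buffer and is acknowledged right away.
        self.write_buffer[address] = block
        return "ack"

    def destage(self, cold_addresses: list[int]) -> int:
        # Later, blocks considered cold are moved down to the capacity tier.
        moved = 0
        for address in cold_addresses:
            if address in self.write_buffer:
                self.capacity_tier[address] = self.write_buffer.pop(address)
                moved += 1
        return moved

    def read(self, address: int) -> bytes:
        # Prefer the newest copy, which lives in the buffer if present.
        if address in self.write_buffer:
            return self.write_buffer[address]
        return self.capacity_tier[address]


store = TwoTierStore()
status = store.write(7, b"payload")          # acknowledged immediately
in_buffer_before = 7 in store.write_buffer   # still in the write buffer
moved = store.destage([7])                   # later moved to capacity tier
print(status, in_buffer_before, moved, store.read(7))
```

The point of the model is the ordering: the guest gets its acknowledgement at `write`, while any work done during `destage` happens afterwards and does not delay that acknowledgement.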

VMware data deduplication occurs when cold data is destaged to the capacity tier (after the write acknowledgements are sent to the VM). It is only available on all-flash disk groups with on-disk format version 3.0 or later. When VMware data deduplication is enabled on a vSAN cluster, the algorithm uses a fixed 4K block size to detect and remove redundant copies of data blocks within each disk group. However, redundant blocks across multiple disk groups are not deduplicated.

How VMware vSAN data deduplication works
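The per-disk-group scope is easy to miss, so here is a small Python sketch of it. It assumes each disk group keeps its own private table of block fingerprints. The `DiskGroup` class, the SHA-256 fingerprint, and the reference counter are hypothetical names and choices for illustration, not vSAN internals.

```python
import hashlib

BLOCK_SIZE = 4096  # fixed 4K block size, as described above


class DiskGroup:
    """Toy disk group with its own, private deduplication table."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: dict[str, bytes] = {}   # digest -> single stored copy
        self.refcount: dict[str, int] = {}   # digest -> logical references

    def destage(self, block: bytes) -> bool:
        """Store a block; return True if it was a duplicate in this group."""
        if len(block) != BLOCK_SIZE:
            raise ValueError("this sketch only handles full 4K blocks")
        digest = hashlib.sha256(block).hexdigest()
        duplicate = digest in self.blocks
        if not duplicate:
            self.blocks[digest] = block
        self.refcount[digest] = self.refcount.get(digest, 0) + 1
        return duplicate


group_a = DiskGroup("dg-a")
group_b = DiskGroup("dg-b")
block = b"\x00" * BLOCK_SIZE

first = group_a.destage(block)    # new to dg-a
second = group_a.destage(block)   # duplicate within dg-a, not stored again
third = group_b.destage(block)    # new to dg-b: groups do not share tables

physical_copies = len(group_a.blocks) + len(group_b.blocks)
print(first, second, third, physical_copies)
```

The same block written three times ends up stored twice, once per disk group, which is why identical data spread over many disk groups yields smaller savings than the same data concentrated in one.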

Important: Back up VMs before enabling data deduplication

Before you start, you should know that enabling VMware data deduplication requires a rolling reformat of all disks in the vSAN cluster. Therefore, it is best to enable it when the cluster is first set up, since it takes time to migrate the data, reformat the disks, and move the data back.
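As a rough illustration of why a rolling reformat takes time and needs spare capacity, the following Python sketch walks through disk groups one at a time: it evacuates the data of one group into the free space of the others, marks the group as reformatted, and moves on. This is a simplified toy model, not the actual vSAN procedure. The function name, the uniform `capacity` per group, and the greedy placement are all assumptions.

```python
def rolling_reformat(used: dict[str, int],
                     capacity: int) -> tuple[list[str], dict[str, int]]:
    """Reformat disk groups one by one in a toy model.

    used maps each disk group name to the amount of data it holds, and
    capacity is the usable size of every group (same units).
    """
    used = dict(used)  # work on a copy
    log: list[str] = []
    for name in list(used):
        to_move = used[name]
        # Evacuate this group's data into free space on the other groups.
        for other in used:
            if other == name or to_move == 0:
                continue
            moved = min(capacity - used[other], to_move)
            used[other] += moved
            to_move -= moved
        if to_move > 0:
            raise RuntimeError(f"not enough free space to evacuate {name}")
        used[name] = 0  # the group is now empty and can be reformatted
        log.append(f"{name}: evacuated and reformatted")
    return log, used


log, final = rolling_reformat({"dg1": 40, "dg2": 40, "dg3": 40}, capacity=100)
print(log)
print(final)
```

Every group's data has to be copied at least once, and the run fails outright if the remaining groups cannot absorb it, for example with two groups that are each 80 percent full.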

If you need to enable this feature on a cluster that already holds live data, be sure to back up your VMs in advance to guard against data loss. With the native VMware backup solutions, you can only back up one VM on the cluster at a time. To quickly back up all VMs on a cluster before enabling VMware data deduplication, a dedicated backup tool may serve you better.

I would like to recommend a professional backup tool, AOMEI Cyber Backup, which can back up all VMs on a cluster in 3 easy steps. In addition, it has the following benefits:

  • Multiple VMs Backup: in 3 easy steps you can create a complete automatic backup task for multiple, or even all, VMs on the cluster.
  • Auto Deletion Scheme: old backup files that exceed the specified retention period are cleaned up automatically.
  • Offsite Restore: backups can be restored to new VMs on the original or another datastore, host, or cluster.
  • Clear Log Monitor: all operations made to the VMs are clearly recorded, with separate error logs for easy reference and troubleshooting.
  • Affordable Pricing: reasonable charges are based only on the number of bound devices, regardless of how many VMs are on the host/cluster.

Next, I will demonstrate how to create a backup task in 3 steps. A free trial is available for VMware ESXi and Hyper-V.

3 quick steps to back up all ESXi VMs on a cluster

1. Bind Devices: Access the AOMEI Cyber Backup web client, and navigate to Source Device > VMware ESXi > + Add VMware ESXi to add an ESXi host/cluster. Then click Bind Device.

Add VMware ESXi host

2. Create Backup Task: Navigate to Backup Task > + Create New Task, and then set Task Name, Backup Type, Device, Target, Schedule, and Cleanup.

Create a VMware ESXi backup task

  • Device: with AOMEI Cyber Backup you can back up multiple, or even all, VMs on the cluster at once.
  • Target: choose to back up to a local path or to a network path. Used paths are saved in Favorite Storage for handy selection.
  • Schedule: choose to perform full, differential, or incremental backups, and automate execution daily, weekly, or monthly at the frequency you specify.

Backup schedule

  • Cleanup: specify a retention period, and old backup files that exceed it will be deleted automatically.

Backup Cleanup

3. Start Backup: You can select Add the schedule and start backup now, or Add the schedule only.

Start Backup

Created backup tasks are listed and monitored separately, so you can check progress and change schedules.
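For readers who want to see what a retention-based Cleanup rule amounts to, here is a generic Python sketch. It is not AOMEI Cyber Backup's implementation or API. The function name, the backup names, and the dates are hypothetical, and the sketch treats every backup as independent.

```python
from datetime import datetime, timedelta


def expired_backups(backups: dict[str, datetime], retention_days: int,
                    now: datetime) -> list[str]:
    """Return the names of backups created before the retention cutoff."""
    cutoff = now - timedelta(days=retention_days)
    return sorted(name for name, created in backups.items() if created < cutoff)


# Hypothetical, independent backups of one VM.
now = datetime(2022, 4, 25)
backups = {
    "vm01-0301": datetime(2022, 3, 1),
    "vm01-0410": datetime(2022, 4, 10),
    "vm01-0424": datetime(2022, 4, 24),
}
to_delete = expired_backups(backups, retention_days=30, now=now)
print(to_delete)
```

A real backup tool has more to consider than this sketch does: an incremental or differential backup is only usable while the full backup it depends on still exists, so a chain has to be retired as a unit.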

How to enable VMware data deduplication on vSAN

Note: This change requires a rolling reformat of all disks in the vSAN cluster, so please back up the VMs in advance.

1. Launch the vSphere Web Client, and navigate to the vSAN cluster.

2. Go to the Configure page and click General in the left pane.

3. Click Edit next to vSAN is Turned On.

Edit vSAN cluster

4. Check the Deduplication and Compression option under Services.

5. Click OK to save the changes. You can check the process in Recent Tasks.

Enable vSAN deduplication and compression

Tip: After enabling VMware data deduplication, you can navigate to vSAN Cluster > Monitor > Capacity > Deduplication and Compression Overview to check the vSAN Deduplication Ratio and Savings status.

how to check vSAN deduplication ratio
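To interpret the numbers on that page, the short Python sketch below computes a ratio and a savings figure from two capacity values. It assumes the ratio is the logical space used before deduplication and compression divided by the physical space used afterwards, and that savings is the difference between the two. The function and parameter names are hypothetical.

```python
def space_efficiency(used_before: float, used_after: float) -> tuple[float, float]:
    """Return (ratio, savings) for capacity figures in the same unit."""
    if used_before < 0 or used_after <= 0:
        raise ValueError("used_before must be >= 0 and used_after must be > 0")
    ratio = used_before / used_after
    savings = used_before - used_after
    return ratio, savings


# Example: 10 TB of logical data stored in 4 TB of physical capacity.
ratio, savings = space_efficiency(10.0, 4.0)
print(f"ratio {ratio:.2f}x, savings {savings:.1f} TB")
```

A ratio close to 1x means the data has few duplicate or compressible blocks, so the feature is saving little space for that workload.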

Summary

The best-known benefit of VMware data deduplication is that it saves storage space efficiently without deleting important VM data.

VMware data deduplication is a vSAN feature that works in combination with data compression. In this article, I introduced how VMware data deduplication works and how to enable it on a vSAN cluster.

However, enabling VMware data deduplication and compression requires a rolling reformat of all disks in the vSAN cluster. Therefore, it is better to back up the VMware ESXi VMs on the cluster in advance to protect your VM data from accidents.