By Alison / Last Updated September 5, 2022

What is VM data deduplication

When backing up VMware ESXi VMs, if you have limited storage space or simply want to cut VM storage costs, VMware data deduplication is a technique worth knowing.

Data deduplication is a simple and practical space efficiency feature. In VMware environments, it is often used in combination with data compression. Deduplication removes redundant data blocks, while compression removes additional redundant data from each block. Both of them can effectively reduce the amount of physical storage required to store the data.


In this article I will explain how VMware data deduplication works and walk through the steps to enable deduplication and compression on vSAN.

How to achieve VMware data deduplication

VMware data deduplication is not a function of ESXi itself; it is provided by vSAN, a software-defined storage component that is fully integrated with vSphere. This means that you must have a valid vSAN license to enable deduplication and compression on your cluster.

vSAN is a two-tier distributed storage system made up of a cache tier and a capacity tier. To provide higher storage performance, active VM data is first written to the write buffer in the cache tier, and write acknowledgements are sent to the guest immediately. When the data is no longer active, the cold data is destaged to the capacity tier at a time and frequency determined by vSAN.

VMware vSAN deduplication occurs when cold data is destaged to the capacity tier, after the write acknowledgements have been sent to the VM. It is only available on all-flash disk groups with on-disk format version 3.0 or later. When deduplication and compression are enabled on a vSAN cluster, the algorithm uses a fixed 4K block size to detect and remove redundant copies of data blocks within each disk group. Redundant blocks across different disk groups, however, are not deduplicated.

How VMware vSAN data deduplication works
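
To make the per-disk-group behaviour more concrete, here is a minimal Python sketch of fixed-size block deduplication. It is only a toy model under stated assumptions: the `DiskGroup` class, its `destage` method and the use of SHA-256 as the block fingerprint are all hypothetical choices for illustration, not vSAN's actual implementation or API.

```python
import hashlib

BLOCK_SIZE = 4096  # fixed 4K block size, as described in the article


class DiskGroup:
    """Toy model of one disk group's capacity tier (hypothetical, not a vSAN API)."""

    def __init__(self):
        self.blocks = {}         # fingerprint -> block bytes, each stored once
        self.logical_blocks = 0  # number of blocks written by the VMs

    def destage(self, data: bytes) -> list:
        """Write cold data to the capacity tier, keeping one copy per unique block."""
        refs = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            # Assumption: SHA-256 stands in for whatever fingerprint is really used.
            fingerprint = hashlib.sha256(block).hexdigest()
            # Store the block only if this disk group has not seen it before.
            self.blocks.setdefault(fingerprint, block)
            self.logical_blocks += 1
            refs.append(fingerprint)
        return refs

    @property
    def physical_blocks(self) -> int:
        return len(self.blocks)


group_a = DiskGroup()
group_b = DiskGroup()

# Three identical "A" blocks and one "B" block land in disk group A.
group_a.destage(b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE)
# The same "A" block lands in disk group B; it is stored again there,
# because each group only deduplicates against its own blocks.
group_b.destage(b"A" * BLOCK_SIZE)

print(group_a.logical_blocks, group_a.physical_blocks)
print(group_b.logical_blocks, group_b.physical_blocks)
```

In this sketch, disk group A holds four logical blocks but only two physical ones, while disk group B keeps its own copy of the "A" block because each group has a separate fingerprint table.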

Important: Backup VMs before enabling data deduplication

Before you start, be aware that enabling VMware data deduplication requires a rolling reformat of all disks in the vSAN cluster. It is therefore best to enable it when the cluster is first set up, since migrating the data, reformatting the disks and moving the data back takes time.
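
As a rough illustration of why a rolling reformat takes time, the following Python sketch walks through the migrate, format and move-back cycle one disk group at a time. It is a conceptual model only: the `DiskGroup` dataclass, the `rolling_reformat` function, the capacity figures and the free-space check are assumptions made for the example, and vSAN handles the real process internally.

```python
from dataclasses import dataclass


@dataclass
class DiskGroup:
    """Hypothetical disk group with a capacity and an amount of used space."""
    name: str
    capacity_gb: int
    used_gb: int
    dedup_enabled: bool = False

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - self.used_gb


def rolling_reformat(groups: list) -> list:
    """Reformat one disk group at a time; return the order in which they were done."""
    order = []
    for target in groups:
        others = [g for g in groups if g is not target]
        # Assumption: the other groups must have room for the evacuated data.
        if sum(g.free_gb for g in others) < target.used_gb:
            raise RuntimeError(f"not enough free space to evacuate {target.name}")

        # 1. Migrate: spread the target's data over the other disk groups.
        moved = {}
        remaining = target.used_gb
        for g in others:
            amount = min(remaining, g.free_gb)
            g.used_gb += amount
            moved[g.name] = amount
            remaining -= amount
        target.used_gb = 0

        # 2. Format: the emptied group comes back with deduplication enabled.
        target.dedup_enabled = True

        # 3. Move back: return the evacuated data to the reformatted group.
        for g in others:
            g.used_gb -= moved[g.name]
            target.used_gb += moved[g.name]

        order.append(target.name)
    return order


cluster = [
    DiskGroup("dg-1", 1000, 400),
    DiskGroup("dg-2", 1000, 300),
    DiskGroup("dg-3", 1000, 500),
]
order = rolling_reformat(cluster)
print(order, [g.used_gb for g in cluster])
```

Every disk group's data is copied twice in this model, which is why an empty cluster is the cheapest time to enable the feature.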

However, if you want to enable vSAN deduplication on a cluster with live data, be sure to back up your VMs in advance to guard against data loss. With the native VMware backup solutions, you can only back up one VM on the cluster at a time. To quickly back up all VMs on a cluster before enabling VMware data deduplication, a dedicated backup tool may serve you better.

Here I will introduce a free VMware backup tool, AOMEI Cyber Backup. It offers the following benefits.

Agentless Backup: create complete and independent image-level backups of VMware ESXi VMs.
Multiple Storage Destinations: back up to local or network share destinations.
Automated Execution: create backup schedules to automate virtual machine protection.
Perpetual Free: you can use AOMEI Cyber Backup Free Edition with no time limit.

AOMEI Cyber Backup supports VMware ESXi 6.0 and later versions. Next, I will demonstrate how to create a backup task in 3 steps.

*You can install this VM backup software on either a Windows or a Linux system.

3 quick steps to back up all ESXi VMs on a cluster

1. Bind Devices: Open the AOMEI Cyber Backup web client and navigate to Source Device > VMware ESXi > + Add VMware ESXi to add an ESXi host or cluster. Then choose Bind Device.

Add VMware ESXi host

2. Create Backup Task: Navigate to Backup Task > + Create New Task, and then set Task Name, Backup Type, Device, Target, and Schedule.

Create a VMware ESXi backup task

  • Device: select multiple VMs on the host (up to 10 in the Free Edition) to cover in one backup task.
  • Target: back up to a local path or a network path. Used paths are saved in Favorite Storage for quick selection.
  • Schedule: choose full, differential or incremental backup, and automate execution daily, weekly or monthly at the frequency you specify.

Backup schedule

3. Start Backup: You can select Add the schedule and start backup now, or Add the schedule only.

Start Backup

Created backup tasks will be listed and monitored separately, for progress checking and schedule changing.

While the Free Edition covers most VM backup needs, you can also upgrade to the Premium Edition to enjoy:
Backup cleanup: configure a retention policy to automatically delete old backup files and save storage space.
Restore to new location: create a new VM in the same or another datastore/host directly from the backup, which saves the trouble of reconfiguring the new VM.

Backup Cleanup

How to enable VMware data deduplication on vSAN

Note: This change requires a rolling reformat of all disks in the vSAN cluster, so please back up the VMs in advance.

1. Launch the vSphere Web Client and navigate to the vSAN cluster.

2. Go to the Configure page and click General in the left pane.

3. Click Edit next to vSAN is Turned On.

Edit vSAN cluster

4. Check the Deduplication and Compression option under Services.

5. Click OK to save the changes. You can check the process in Recent Tasks.

Enable vSAN deduplication and compression

Tip: After enabling VMware data deduplication, you can navigate to vSAN Cluster > Monitor > Capacity > Deduplication and Compression Overview to check the vSAN Deduplication Ratio and Savings status.

how to check vSAN deduplication ratio
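
If you want to sanity-check the figures shown in that overview, the arithmetic can be sketched in a few lines of Python. This assumes the ratio is the space used before deduplication and compression divided by the space used after, and that savings is the difference between the two. The function name `dedup_stats`, its parameters and the sample values are illustrative only, not part of any VMware tool.

```python
def dedup_stats(used_before_gb: float, used_after_gb: float):
    """Return (ratio, savings_gb) for the given before/after capacity figures."""
    if used_after_gb <= 0:
        raise ValueError("used_after_gb must be positive")
    ratio = used_before_gb / used_after_gb        # e.g. 2.5 means 2.5x reduction
    savings_gb = used_before_gb - used_after_gb   # space no longer consumed
    return ratio, savings_gb


# Made-up example: 1000 GB of VM data that occupies 400 GB after dedup/compression.
ratio, savings = dedup_stats(1000.0, 400.0)
print(f"{ratio:.2f}x, {savings:.0f} GB saved")  # prints "2.50x, 600 GB saved"
```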

Summary

The best-known benefit of VMware data deduplication is that it saves storage space efficiently without deleting important VM data.

VMware data deduplication is a vSAN feature that works in combination with data compression. In this article I introduced how VMware data deduplication works and how to enable it on a vSAN cluster.

However, enabling deduplication and compression on vSAN requires a rolling reformat of all disks in the cluster. It is therefore wise to back up the VMs on the cluster in advance, following the golden 3-2-1 backup rule, to protect your VM data from accidents.