Wednesday, October 2, 2013

Increasing Deduplication Rates in Windows Server 2012

While configuring our VM backup infrastructure, I ran into a major issue with the native deduplication feature in Windows Server 2012. In short, the deduplication jobs that run automatically could not keep up with the data ingress rate.

According to Microsoft's documentation, Windows Server 2012 can dedupe at a maximum rate of 100 GB per hour per volume, and the recommendation for higher data ingress rates is to use additional volumes. In my experience, however, adding volumes alone was not enough.

Even though I had 4 different volumes, each with a data ingress rate of less than 100 GB per hour, deduplication was not keeping up, and the volumes were filling up with Veeam backup data. Normally, a large amount of this data can be deduped because the backup files have many blocks in common.

Investigating further, running Get-DedupJob in PowerShell confirmed that the dedupe jobs were not keeping up. In my case, only 2 of the 4 volumes were being deduped concurrently, even with all the throughput settings configured.

In the Get-DedupJob output, out of the 3 manual optimization jobs, only two were running.
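To put rough numbers on why limited concurrency matters, here is a back-of-the-envelope sketch in Python. It is only a simplified model, not how the dedup engine actually schedules work: it assumes each running job processes at most 100 GB per hour (the documented per-volume figure) and that jobs can be shared freely across volumes. The function names and the 80 GB per hour ingress figure are hypothetical; my real volumes were simply "under 100".

```python
def backlog_growth_gb_per_hour(ingress_rates, concurrent_jobs, per_job_rate=100.0):
    """Estimate how fast un-deduped data piles up, in GB per hour.

    ingress_rates   -- GB/hour written to each volume (hypothetical inputs)
    concurrent_jobs -- number of optimization jobs actually running at once
    per_job_rate    -- assumed ceiling per job/volume (100 GB/hour per the docs)
    """
    active = min(concurrent_jobs, len(ingress_rates))
    # Shortfall because too few jobs run at the same time.
    aggregate_shortfall = sum(ingress_rates) - active * per_job_rate
    # Shortfall on any single volume that exceeds the per-volume ceiling.
    per_volume_shortfall = sum(max(0, r - per_job_rate) for r in ingress_rates)
    return max(aggregate_shortfall, per_volume_shortfall)


def hours_until_full(free_gb, growth_gb_per_hour):
    """Hours before the free space is consumed; infinite if dedupe keeps up."""
    if growth_gb_per_hour <= 0:
        return float("inf")
    return free_gb / growth_gb_per_hour


if __name__ == "__main__":
    rates = [80, 80, 80, 80]  # hypothetical: four volumes, each under 100 GB/hour
    print(backlog_growth_gb_per_hour(rates, concurrent_jobs=2))
    print(backlog_growth_gb_per_hour(rates, concurrent_jobs=4))
```

Under these assumptions, four volumes that are each below the per-volume limit still fall behind when only two jobs run at a time, which matches what I was seeing.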

To change the resource utilization by the dedupe jobs, go to Task Scheduler and find the deduplication tasks.

Edit the throughput optimization job. Since I had more than sufficient CPU and memory in this server, I changed the job schedule to run 24 hours a day. Under the Actions tab, I changed the arguments so that the job dedupes at a priority of "high" with a memory limit of "75" percent.
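The same priority and memory values can also be passed to a manually started job through the Start-DedupJob cmdlet. This is a different route from editing the scheduled task's arguments, and not what I did above. The Python sketch below only assembles that command line. The helper name, the "E:" volume letter, and the 1 to 100 sanity check on the memory value are my own assumptions for illustration.

```python
VALID_PRIORITIES = ("Low", "Normal", "High")


def build_start_dedup_command(volume, priority="High", memory_percent=75):
    """Build an argument list that asks PowerShell to start an optimization job.

    Hypothetical helper: it only assembles the command and does not run it.
    """
    if priority not in VALID_PRIORITIES:
        raise ValueError(f"priority must be one of {VALID_PRIORITIES}")
    if not 1 <= memory_percent <= 100:
        raise ValueError("memory_percent must be between 1 and 100")
    ps_command = (
        f'Start-DedupJob -Volume "{volume}" -Type Optimization '
        f"-Priority {priority} -Memory {memory_percent}"
    )
    return ["powershell.exe", "-NoProfile", "-Command", ps_command]


if __name__ == "__main__":
    cmd = build_start_dedup_command("E:")  # "E:" is a placeholder volume
    print(cmd[-1])
    # On a server with the deduplication feature installed, the list could be
    # handed to subprocess.run(cmd, check=True) from an elevated session.
```

Running Get-DedupJob afterwards should show whether the new job was queued or is running.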

This increased resource utilization, but it resolved the low deduplication rate. Checking the dedupe jobs afterwards, I also saw that all four volumes were deduping concurrently.
