Deduplication is the process of minimizing the storage space that data occupies by detecting repeated data and storing identical data only once.
For example, if a managed vault where deduplication is enabled contains two copies of the same file—whether in the same archive or in different archives—the file is stored only once, and a link to that file is stored instead of the second file.
Deduplication may also reduce network load: if, during a backup, a file or a disk block is found to be a duplicate of an already stored one, its content is not transferred over the network.
Deduplication is performed on disk blocks (block-level deduplication) and on files (file-level deduplication), for disk-level and file-level backups respectively.
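The idea of storing identical data once and keeping links to it can be illustrated with a minimal sketch. This is not the product's implementation: the class name DedupVault, the 4096-byte block size, and the choice of SHA-256 are assumptions made only for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


class DedupVault:
    """Toy content-addressed store: each unique block is kept once."""

    def __init__(self):
        self.blocks = {}    # hash -> block content, stored only once
        self.archives = {}  # archive name -> list of block hashes (the "links")

    def backup(self, name, data):
        refs = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            # Keep the content only if this block has not been stored before.
            self.blocks.setdefault(digest, block)
            refs.append(digest)
        self.archives[name] = refs

    def restore(self, name):
        # Follow the links to reassemble the original data.
        return b"".join(self.blocks[h] for h in self.archives[name])

    def stored_bytes(self):
        return sum(len(b) for b in self.blocks.values())


vault = DedupVault()
data = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
vault.backup("archive1", data)
vault.backup("archive2", data)  # identical content in a different archive
```

In this sketch both archives restore to the same data, while the vault holds only two blocks in total, because the second archive consists purely of links to blocks that are already stored.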
In Acronis Backup & Recovery 10, deduplication consists of two steps:
Deduplication at source
Performed on a managed machine during backup. Acronis Backup & Recovery 10 Agent uses the storage node to determine what data can be deduplicated, and does not transfer the data whose duplicates are already present in the vault.
Deduplication at target
Performed in the vault after a backup is completed. The storage node analyses the vault’s archives and deduplicates data in the vault.
When creating a backup plan, you have the option to turn off deduplication at source for that plan. This may lead to faster backups but a greater load on the network and storage node.
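The effect of deduplication at source on network traffic can be sketched as follows. The StorageNode class, its missing and receive methods, and the dedup_at_source flag are hypothetical names used for illustration, and SHA-256 is an assumed hash. The actual protocol between the agent and the storage node is not described here.

```python
import hashlib


class StorageNode:
    """Hypothetical stand-in for the storage node's hash lookup."""

    def __init__(self):
        self.known = {}  # hash -> content already present in the vault

    def missing(self, hashes):
        # Report which of the given hashes are not yet in the vault.
        return {h for h in hashes if h not in self.known}

    def receive(self, digest, content):
        self.known[digest] = content


def backup(node, items, dedup_at_source=True):
    """Send items to the node and return the number of content bytes sent."""
    hashed = [(hashlib.sha256(item).hexdigest(), item) for item in items]
    # With deduplication at source, ask the node first and send only new data.
    wanted = node.missing(h for h, _ in hashed) if dedup_at_source else None
    sent = 0
    for digest, item in hashed:
        if wanted is None or digest in wanted:
            node.receive(digest, item)
            sent += len(item)
            if wanted is not None:
                wanted.discard(digest)  # do not resend a repeat within this backup
    return sent


node = StorageNode()
first = backup(node, [b"x" * 100, b"y" * 50])               # nothing stored yet
second = backup(node, [b"x" * 100, b"z" * 10])              # only the new item is sent
third = backup(node, [b"x" * 100], dedup_at_source=False)   # everything is sent
```

The sketch models only the transferred content. It does not model the cost of the hash lookups themselves, which is the part of the work that a plan skips when deduplication at source is turned off.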
A managed centralized vault where deduplication is enabled is called a deduplicating vault. When you create a managed centralized vault, you can specify whether to enable deduplication in it. A deduplicating vault cannot be created on a tape device.
The Acronis Backup & Recovery 10 Storage Node that manages a deduplicating vault maintains the deduplication database, which contains the hash values of all items stored in the vault, except for those that cannot be deduplicated, such as encrypted files.
The deduplication database is stored in the folder specified by Database path in the Create centralized vault view when you create the vault. The deduplication database can be created only in a local folder.
The size of the deduplication database is about one percent of the total size of archives in the vault. In other words, each terabyte of new (non-duplicate) data adds about 10 GB to the database.
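The sizing rule can be turned into a quick estimate. The function name and the use of decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes) are assumptions for illustration, and the one-percent ratio is approximate, so the result is a rough planning figure rather than an exact size.

```python
TB = 10**12  # decimal terabyte, assumed for this estimate
GB = 10**9   # decimal gigabyte


def estimate_dedup_db_bytes(unique_data_bytes, ratio_percent=1):
    """Estimate the deduplication database size as a percentage of unique data."""
    return unique_data_bytes * ratio_percent // 100


one_tb_estimate = estimate_dedup_db_bytes(1 * TB)   # 10 GB
five_tb_estimate = estimate_dedup_db_bytes(5 * TB)  # 50 GB
```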
If the database is corrupted or the storage node is lost, but the vault still retains its archives and the service folder containing metadata, the new storage node rescans the vault and re-creates the database.
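Re-creation is possible because the database holds only values derived from the vault's contents. A minimal sketch of such a rescan, assuming SHA-256 hashes and a hypothetical mapping from storage locations to item contents (neither is confirmed as the product's actual format):

```python
import hashlib


def rebuild_database(vault_items):
    """Recompute a hash -> location index from the items stored in a vault.

    vault_items maps a hypothetical storage location to the item's bytes.
    """
    database = {}
    for location, content in vault_items.items():
        digest = hashlib.sha256(content).hexdigest()
        # Keep the first location seen for each distinct content hash.
        database.setdefault(digest, location)
    return database


vault_items = {"item-0": b"alpha", "item-1": b"beta"}
original = rebuild_database(vault_items)
recreated = rebuild_database(vault_items)  # e.g. a rescan by a new storage node
```

Because the index depends only on the stored items, a rescan over the same vault yields the same database.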