Sunday, March 19, 2017

Tarsnap vs Snapshots for daily VirtualBox VM backup





After recently wiping out the files on my laptop, I have learned the importance of backing up my work regularly. (If you are curious, I used Time Machine, so I lost only 2 days' worth of work.) I also do most of my development work in VirtualBox VMs.

The most convenient way to back up a VM is to create VirtualBox snapshots, which can then be copied to an external hard disk or uploaded to a cloud backup service. The alternative is to archive the VM folder with tarsnap. Let's compare the two.

Note: this is a simplistic approach to backing up VirtualBox VMs. A better approach is to back up the files and configuration within each VM, using tarsnap or another utility. I am looking for the simplest solution, so...


VirtualBox Snapshots
I have created a snapshot to track a day's work. This is the Snapshots tab. 



Within the guest Linux VM, I have checked out some code using Git, installed a few IntelliJ plugins, changed the settings, and of course written some code as well. Besides that, I rebooted a few times while trying to access an external USB drive. There were no additions or deletions of large files, and no updates were installed.

How large are the snapshot files?




1.4 GB for the 2 VDI files. 


Tarsnap
Next, I am going to merge the snapshot into the VM and see how large tarsnap's new archive is.

                                       Total size  Compressed size
All archives                               131 GB            56 GB
  (unique data)                             17 GB           6.9 GB
This archive                                19 GB           8.2 GB
New data                                   297 MB           108 MB


A little explanation if you are not familiar with tarsnap: the rightmost column is the size of the data after compression. The first row shows that all my archives add up to 131 GB, or 56 GB compressed. But tarsnap is smart enough to deduplicate my data, so it actually stores only 6.9 GB. When I run tarsnap over my VirtualBox VM folders, the total size of all the files therein is 19 GB, but the new data is only 297 MB, or 108 MB after compression.
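If you want to track these numbers from a script rather than read them by eye, here is a minimal sketch that pulls the rows out of the stats text. The regular expression and the parse_stats helper are my own illustration, not part of tarsnap, and they assume the humanized "number unit" layout shown above.

```python
import re

# Stats text copied from this post (humanized numbers).
STATS = """\
                                       Total size  Compressed size
All archives                               131 GB            56 GB
  (unique data)                             17 GB           6.9 GB
This archive                                19 GB           8.2 GB
New data                                   297 MB           108 MB
"""

# Assumed row layout: a label, then two "<number> <unit>" columns.
ROW = re.compile(
    r"^\s*(.+?)\s{2,}([\d.]+) ([kMGT]?B)\s+([\d.]+) ([kMGT]?B)\s*$"
)


def parse_stats(text):
    """Return {row label: ((total, unit), (compressed, unit))}."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:  # the header line has no numbers, so it is skipped
            label, t_val, t_unit, c_val, c_unit = m.groups()
            rows[label] = ((float(t_val), t_unit), (float(c_val), c_unit))
    return rows


stats = parse_stats(STATS)
total, compressed = stats["New data"]
print("New data:", total, "->", compressed)
```

The "New data" row is the one that matters for a daily backup, since it is what this run actually adds to storage.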

The difference between the VirtualBox snapshot and the tarsnap archive is staggering: 1.4 GB versus 108 MB. That is a 13.3x difference. Even against tarsnap's uncompressed new data (297 MB), the difference is still 4.8x.
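For anyone who wants to check the arithmetic, here is a quick sketch. It assumes 1 GB = 1024 MB; with 1000 MB per GB the ratios come out slightly lower (about 13.0x and 4.7x).

```python
# Assumption: binary units, 1 GB = 1024 MB.
MB_PER_GB = 1024

snapshot_mb = 1.4 * MB_PER_GB   # the 2 VDI snapshot files
new_compressed_mb = 108         # tarsnap "New data", compressed
new_uncompressed_mb = 297       # tarsnap "New data", total size

compressed_ratio = snapshot_mb / new_compressed_mb
uncompressed_ratio = snapshot_mb / new_uncompressed_mb

print(f"snapshot vs compressed new data:   {compressed_ratio:.1f}x")
print(f"snapshot vs uncompressed new data: {uncompressed_ratio:.1f}x")
```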



