File Backup Terminology, Part 2

Delta or Block-Level Backups

The term “delta” is used rather flexibly across backup technologies, but when paired with other terms, as in “Delta Backup,” “Delta Block Backup,” and “Delta-Style Backup,” it generally refers to the same basic backup method. Deltas are best described as a block-level technology, whereas incrementals and differentials are file-level technologies. It is important to note that delta block techniques are applied only to modified files, not new files. New files are simply backed up in full in the normal fashion.

File-level backups will back up a changed file in its entirety, even if it has only changed slightly. While this may not be much of a problem for small text documents, it can quickly become a problem with very large files like databases. Take for example email clients like Outlook, which save all received email and attachments in single-file databases. Even if only one email has been received, the entire database file has changed, and it is backed up again. Since these databases can easily grow to several hundred megabytes in size, you once again end up with a lot of data redundancy.

Delta backups deal with this problem by backing up only the parts of files that have changed instead of the whole file. Each changed file is broken down into fixed-size blocks, and those blocks are compared with the corresponding blocks of the original file. (The block size depends on the particular program, or in some cases on a user-chosen setting. Block sizes generally range between 1 and 32 kilobytes.) Only those blocks that contain differences are extracted and backed up. Deltas can be confusing because they can be applied in a couple of different ways: there are differential deltas and incremental deltas. These work on the same principle as the differential and incremental file backups explained earlier, but at a much more granular level. Similarly, each type of delta inherits the same kinds of advantages and disadvantages.
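
As a rough illustration of the block comparison described above, here is a minimal Python sketch. The function names, the 4 KB block size, and the use of SHA-256 hashes to compare blocks are all assumptions made for the example; real backup programs may compare and store blocks quite differently.

```python
import hashlib

# Assumed block size for the example; the text above cites 1 to 32 KB as typical.
BLOCK_SIZE = 4096


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Cut data into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_delta(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Return {block_index: block_bytes} for each block of `new` that differs from `old`."""
    old_hashes = [hashlib.sha256(b).digest() for b in split_blocks(old, block_size)]
    delta = {}
    for index, block in enumerate(split_blocks(new, block_size)):
        is_new_block = index >= len(old_hashes)
        if is_new_block or hashlib.sha256(block).digest() != old_hashes[index]:
            delta[index] = block
    return delta


def apply_delta(old: bytes, delta: dict, new_length: int,
                block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the new version from the old version plus the changed blocks."""
    blocks = split_blocks(old, block_size)
    for index, block in sorted(delta.items()):
        if index < len(blocks):
            blocks[index] = block   # overwrite a changed block
        else:
            blocks.append(block)    # the file grew; append the extra block
    # Truncate in case the new version is shorter than the old one.
    return b"".join(blocks)[:new_length]
```

For a 100,000-byte file in which a single byte changes, `block_delta` returns one 4 KB block rather than the whole file. Note that `apply_delta` needs both the original data and the delta, which is why restores from deltas depend on the program that produced them.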

Deltas are especially advantageous in technologies where files are backed up immediately after they are created or modified. This is known as real-time backup or continuous data protection. Deltas are also very beneficial when backing up files over networks with limited bandwidth or to remote servers such as online storage.

Benefits and Disadvantages of Delta Style Backups

  • Delta Backups are extremely fast because of the small amount of data being transferred.
  • Deltas produce much less redundancy, and backups are a fraction of the size of those produced by file-level incremental or differential backups. This dramatically reduces the demands on storage and bandwidth.
  • Deltas of modified files do not produce whole files in the backup, and thus restores absolutely depend on the program that created them to do the restoration.
  • Deltas are slower to restore because the individual files must be reconstructed from their various parts.

Binary Patch Backups (FastBit)

Binary patch technology was originally developed as a way for software developers to easily update their programs on customers' machines over the Internet by sending “patches” that replace only the parts of files that need modification. More recently it has been adapted for backup technologies as well. The most relevant example is a backup technology called FastBit™, which is employed by a number of online storage vendors.

Binary Patch Backups work very similarly to Deltas, the primary difference being that they are even more granular. Deltas work at the block level, while binary patches work at the level of individual bytes. Because Deltas back up the modified parts of files in fixed-size blocks, part of each block may contain unchanged data. Binary patches avoid this by copying only the actual bytes that have changed.
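
To make the difference in granularity concrete, the sketch below builds a byte-level patch using Python's standard `difflib.SequenceMatcher`. This is only an illustration of the general idea: the `make_patch` and `apply_patch` names and the patch format are assumptions made for the example, and it says nothing about how FastBit or any other product actually works. The matcher is also too slow for large files, so treat it as a toy.

```python
from difflib import SequenceMatcher


def make_patch(old: bytes, new: bytes) -> list:
    """Return a list of (start, end, replacement) edits that turn `old` into `new`.

    Each edit means: replace old[start:end] with `replacement`.
    Unchanged byte ranges are not stored at all.
    """
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_patch(old: bytes, patch: list) -> bytes:
    """Rebuild the new version from the old version plus the patch."""
    out = bytearray()
    cursor = 0
    for start, end, replacement in patch:
        out += old[cursor:start]   # copy the unchanged bytes before this edit
        out += replacement         # insert the changed bytes
        cursor = end               # skip the bytes that were replaced or deleted
    out += old[cursor:]            # copy whatever is left after the last edit
    return bytes(out)
```

If three bytes in the middle of a file change, the patch carries only those three bytes plus their position, whereas a block-level delta would carry the entire block that contains them.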

Benefits and Disadvantages of Binary Patch Backups

  • Virtually eliminates all data redundancy, and produces the smallest backups possible with current technologies.
  • It is even less bandwidth intensive than deltas.
  • The production of the actual patch may be more demanding on system resources and more time consuming than deltas, although the loss may be regained in bandwidth and transfer costs.
  • Little public information is available about how file reconstruction is handled or how efficient it is.

Mirror Backups

Most backup programs list mirror backups as an alternative to full, differential, or incremental backups. Some programs use an alternate term for mirrors, such as “simple copy.” Mirror backups are the simplest type of backup. No real backup technology is employed when making a mirror-style backup, only copy technology. If you copy and paste a folder from one drive to another, you have created a mirror backup of that folder. The mirrored files generally exist in the same state as they did in the source, rather than being compressed into archives as with a full backup. (Some programs do, however, support compressing or encrypting each file individually.)

When to Use Mirrored Backups

Mirror-style backups without compression are a good choice when you are backing up a lot of files that already have compression applied to them. For example, music files in mp3 or wma format, images in jpg or png format, videos in DivX, mov, or flv format, and most program install or setup files are already compressed. If you include these files in a normal backup that applies compression, you will often notice that it is very slow and that you gain very little extra compression. It is best to set up separate backup jobs for compressed and uncompressed files. If your backup program supports include and exclude filters, they can be used to automatically select or deselect the compressed files.
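
The idea of splitting a file set into two jobs can be sketched in a few lines of Python. The extension list below is taken from the formats named above, and the function name and the exact file extensions are assumptions for illustration; a real backup program would expose this through its own include and exclude filter settings.

```python
from pathlib import PurePath

# Extensions assumed to indicate already-compressed content,
# based on the formats mentioned in the paragraph above.
COMPRESSED_EXTENSIONS = {".mp3", ".wma", ".jpg", ".png", ".divx", ".mov", ".flv"}


def split_by_compression(paths):
    """Split paths into (already_compressed, not_compressed) lists by extension."""
    compressed, uncompressed = [], []
    for path in paths:
        # Compare extensions case-insensitively so "PHOTO.JPG" is matched too.
        if PurePath(path).suffix.lower() in COMPRESSED_EXTENSIONS:
            compressed.append(path)
        else:
            uncompressed.append(path)
    return compressed, uncompressed
```

The first list would go to a mirror job with compression turned off, and the second to a normal compressed backup job.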

Benefits and Disadvantages of Mirror Backups

  • Mirror backups are much faster when working with compressed files.
  • Because mirrored files are not placed in single archive files there is less concern about corruption.
  • Since mirror backups generally don’t use compression, they can require large amounts of storage space unless other techniques such as hard linking are also employed.

Synthetic Full Backups

Synthetic Full Backup is a term you will see from time to time. It is not a backup method like those above, but rather a technology that may be applied to one of those methods to make full restores more efficient and reduce downtime.

Synthetics are generally only applied in client-server backup systems. A client computer may perform a backup by any method (incremental, delta, etc.) and then transfer that backup to a server. At some point the server combines several of the individual backup archives to form a synthetic full backup. Because of this, after the initial full backup the client machine only needs to back up new or modified files; it never needs to perform another full backup.
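
The server-side merge can be pictured with a small Python sketch. Here each backup is modeled as a dictionary mapping file paths to contents, and a value of `None` marks a file that was deleted on the client. That representation and the `synthesize_full` name are assumptions made for the example, not a description of any particular product's archive format.

```python
def synthesize_full(full: dict, incrementals: list) -> dict:
    """Merge a full backup with later incrementals into a synthetic full backup.

    `incrementals` must be ordered oldest first, so that newer versions
    of a file overwrite older ones.
    """
    synthetic = dict(full)  # start from a copy of the last full backup
    for incremental in incrementals:
        for path, content in incremental.items():
            if content is None:
                synthetic.pop(path, None)   # file was deleted on the client
            else:
                synthetic[path] = content   # file was added or modified
    return synthetic
```

The result holds exactly one current copy of every file, so a full restore can be served from it directly without replaying the chain of incrementals on the client.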

The benefits of this approach are twofold. First, the backup speed of technologies like differentials won’t degrade over time due to the growing size of cumulative archives, since a synthetic full will be made on a regular basis. Second, when a full restore needs to be made on a client machine, no reconstruction of files or file parts needs to be done. The reconstruction has already been performed by the server, allowing the client machine the fastest possible recovery time.

Hard Linked Backups (also Hardlink)

Some backup software has the ability to employ multiple hard links to preserve space when you wish to save multiple full mirror style backups of the same set of files.

To understand what a hard link is, consider how files are stored on a hard drive. When you save a document file, the physical data can be written anywhere on the disk. The file system then creates a reference, or hard link, to that physical data under the file name you specify. With some file systems it is possible to create more than one reference to the same physical data. Using multiple hard links, any number of file names in different folders can point to the same physical data.

When a backup program that supports hard links makes several backups of the same files, it builds hard links for all the files that have not changed. For example, if you create two copies of a folder that contains 100MB of data, they would normally use 200MB of space. With hard links they would use only 100MB. If you changed one 2MB file before making the second copy using hard links, the two folders would consume 102MB of space. The first folder would contain the original 2MB file while the second would contain the modified one.

It should be mentioned that deleting one of the backups containing hard links is not a problem, as all the other hard links will be unaffected. The physical file on the disk is only deleted when all the hard links to it are removed. Also, hard links can only exist within the same volume (i.e., they cannot span different partitions or drives). On Windows-based file systems, NTFS supports hard links, while FAT does not.
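
The behavior described in this section can be tried out with a short Python sketch built on the standard `os.link` call. The `snapshot` function and its arguments are hypothetical names invented for the example, and comparing file contents byte for byte is just one simple way to decide that a file is unchanged; actual backup tools may use timestamps, sizes, or checksums instead. The snapshot folders must be on the same volume, on a file system that supports hard links.

```python
import filecmp
import os
import shutil
from pathlib import Path
from typing import Optional


def snapshot(source: Path, previous: Optional[Path], target: Path) -> None:
    """Mirror `source` into `target`, hard-linking files unchanged since `previous`.

    `previous` is the folder of the prior snapshot, or None for the first one.
    """
    target.mkdir(parents=True, exist_ok=True)
    for src in source.rglob("*"):
        relative = src.relative_to(source)
        dst = target / relative
        if src.is_dir():
            dst.mkdir(parents=True, exist_ok=True)
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        old = previous / relative if previous is not None else None
        if old is not None and old.is_file() and filecmp.cmp(src, old, shallow=False):
            # Unchanged: add a second name for the data already on disk.
            os.link(old, dst)
        else:
            # New or modified: store a real copy.
            shutil.copy2(src, dst)
```

After two snapshots, an unchanged file appears in both snapshot folders but occupies disk space only once. Removing the first snapshot folder (for example with `shutil.rmtree`) leaves the second one fully intact, because the data is released only when its last hard link is removed.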

About SCB Enterprises
System Solutions and Integration
