Digital storage basics, Part 3: Backup vs. redundancy
CNET editor Dong Ngo goes over the basics of digital-storage devices for home users.
Editors' note: This post is part of an ongoing series; for the other parts, check out the related stories. This post was updated on March 29, 2013, to add more information on backing up.
One of a storage device's most important roles, if not the most important, is to keep the information stored on it safe, especially from hardware failure. Redundancy and backup are the two popular types of data protection. They are not the same, however, and it's important to understand the differences between the two.
In a nutshell, redundancy in consumer-grade digital storage means using more internal drives than necessary to store the information, or in other words, storing the same data in more than one place. There are many ways to do this, but the most popular is the use of a type of RAID (see Part 2 of this series for more information), which can be set up on storage devices with two or more internal drives. That said, the first and foremost thing you should remember is that redundancy is not a form of backup, but just a fail-safe measure. The most popular RAID configurations that offer redundancy are RAID 1 and RAID 5.
Again: Redundancy is not a form of backup, but just a fail-safe measure in case of failure of the storage device's internal drive or drives.
RAID 1 (which requires at least two drives) uses double the number of drives necessary to store the information. The two drives mirror each other. Thus, only half of the total storage space is available to the user, while the other half is used for redundancy. RAID 5 (which requires at least three drives) gives up one drive's worth of space to redundancy, which is at most a third of the total. In RAID 5, what's available to the user is the combined storage space of all drives used in the array minus one. This way, if one drive dies, the remaining drives keep serving the data immediately and nothing is lost.
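To make the arithmetic concrete, here is a minimal Python sketch of the capacity rules described above. The function name `usable_capacity_tb` is made up for illustration, and the sketch assumes all drives in the array are the same size and models RAID 1 only as a simple two-drive mirror.

```python
def usable_capacity_tb(level, drive_count, drive_size_tb):
    """Return the space (in TB) left for your data in an array of identical drives.

    Hypothetical helper for illustration; it only covers the two setups
    discussed in this article.
    """
    if level == 1:
        # Assumption: a plain two-drive mirror, so half the total is usable.
        if drive_count != 2:
            raise ValueError("this sketch only models a two-drive RAID 1 mirror")
        return drive_size_tb
    if level == 5:
        if drive_count < 3:
            raise ValueError("RAID 5 needs at least three drives")
        # One drive's worth of space goes to redundancy.
        return (drive_count - 1) * drive_size_tb
    raise ValueError("only RAID 1 and RAID 5 are covered here")


print(usable_capacity_tb(1, 2, 2.0))  # two 2TB drives in RAID 1 -> 2.0
print(usable_capacity_tb(5, 4, 2.0))  # four 2TB drives in RAID 5 -> 6.0
```

Note how the share lost to redundancy shrinks in RAID 5 as you add drives: a third with three drives, a quarter with four.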
Note: While RAID generally is available in storage devices with more than one internal drive, for Thunderbolt storage you can daisy-chain multiple single-volume storage devices, such as the LaCie Little Big Disk Thunderbolt, and create a RAID that way. The storage devices involved need to have two Thunderbolt ports, and once a RAID is created, they need to be used at the same time with the same computer. Most of the time, however, it's more economical to buy a RAID-capable multiple-volume storage device, known as a RAID system, a RAID box, or a RAID array.
You can think of redundancy as using two plastic bags, one inside the other, to carry groceries home from the market. This way, if one of the bags is broken or punctured along the way, food, especially broken eggs, won't spill out.
Redundancy is not perfect, however, and here are its pros and cons.
Pros of redundancy: The biggest and most obvious virtue of redundancy is that it protects data against drive failure in real time. This means if you are working on a file and one of the internal drives in a RAID fails, the storage device can continue to work normally. (Some RAID setups can survive when two drives fail.) It will just indicate that one of the internal drives has failed, offering you the chance to back up important data and replace the failed drive with another. After that, the device itself will fold the replacement drive into the RAID, taking over the role of the drive it's replacing, in a process called RAID rebuild. During this time, the storage device is still available to use.
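If you're curious how an array can carry on after losing a drive, the idea behind RAID 5 is parity: alongside the data, the array stores the XOR of the data blocks, and any one missing block can be recomputed from the others. The Python sketch below is a heavily simplified illustration with two made-up data "drives" and one parity "drive"; real RAID 5 spreads parity across all drives and works on much larger blocks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two data "drives" (illustrative contents) and the parity computed from them.
drive_a = b"eggs"
drive_b = b"milk"
parity = xor_blocks([drive_a, drive_b])

# Simulate drive A failing: rebuild its contents from what is left.
rebuilt_a = xor_blocks([drive_b, parity])
print(rebuilt_a == drive_a)  # True
```

This is also why a rebuild is hard work: to recreate the lost drive, the array has to read every surviving drive in full.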
In short, redundancy offers an immediate type of data protection. And since internal drive failures can happen at any time, it's important to have redundancy for storage devices that host critical information or that provide a service that must not be interrupted.
Cons of redundancy: The first drawback of redundancy is cost; you have to spend money on multiple drives and this could be expensive. A RAID 1 setup, for example, basically requires double the spending on internal drives.
The second downside is that redundancy doesn't provide protection against physical disaster, such as fire or flood, or the failure of the storage device itself. Redundancy also doesn't offer versioning, in which data is saved in different versions (see discussion of backup below).
And lastly, a RAID rebuild can be a very long process that could take days, depending on the amount of existing data stored on the storage device. During the rebuild, the RAID is generally vulnerable, and if a second drive fails before the process is finished, the entire array will crash and you will lose all of the data. In fact, during a rebuild, a RAID storage system is more vulnerable than a single-volume storage device, since rebuilding an array puts a lot of stress on all drives involved, especially when the array still has to provide data to users.
Note: In addition to standard RAID (such as RAID 1 or RAID 5), there are also proprietary RAID setups that, apart from offering redundancy, also permit scaling up data. A typical example of this is the HybridRAID provided as an option in Synology NAS servers. HybridRAID automatically configures the type of redundancy based on the number of internal drives being used. On top of that you can also replace existing drives, one at a time, with drives of larger capacities to increase the total capacity without having to rebuild the RAID from scratch, or even turn the storage device off.
Last thoughts on redundancy: No matter what type of redundancy RAID you use, remember that it's just like insurance, something you need to have just in case, and hope that you will never have to resort to. The option to hot-replace a drive should be used only when absolutely necessary and not viewed as a "fun" or "cool" feature. The more often you use this feature, the more likely you are to lose all of your data stored in the storage device. For this reason, when you get a RAID-capable storage device, it's best to get one that offers lots of storage space to avoid having to replace its internal drives to increase the storage space.
Finally, let me say this one more time: redundancy is not backup. And you should never put all of your data on a single storage device, even one that offers redundancy.
Remember: A redundancy RAID's option to hot-replace a drive should be used only when absolutely necessary and not just for fun. Due to the stress of the RAID rebuild process, the more often you use this feature, the more likely it becomes that you could lose all of the data stored in the storage device. You should never put all of your data on a single storage device, even one that offers redundancy.
Home users might not need redundancy but they definitely need backup, which basically means keeping separate copies of data in multiple places so that if something happens to one place you can turn to another. The more copies of data you have, the safer it is.
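For readers comfortable with a little scripting, the idea of "separate copies in multiple places" can be sketched in a few lines of Python. This is only an illustration, not a replacement for proper backup software: the helper names and the folder names `external_drive` and `nas_share` are made up, and the demo uses a temporary folder so it can run anywhere. Each copy is checked against the original with a SHA-256 checksum so you know it really matches.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file as a hex string."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def back_up_to(source, destinations):
    """Copy source into each destination folder and verify every copy."""
    expected = sha256_of(source)
    copies = []
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        copy = Path(shutil.copy2(source, folder))
        if sha256_of(copy) != expected:
            raise IOError(f"copy in {folder} does not match the original")
        copies.append(copy)
    return copies


# Demo in a throwaway folder; the two subfolders stand in for real devices.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    original = root / "report.txt"
    original.write_text("quarterly numbers")
    copies = back_up_to(original, [root / "external_drive", root / "nas_share"])
    print(len(copies))  # 2
```

In real use the destinations would sit on physically different devices; two folders on the same drive protect you from very little.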
In the same grocery shopping analogy, backup is like getting two (or more) separate bags of exactly the same groceries. In this case, if the eggs stored in one bag break, or even in the rare case that you dropped one of the bags and it was run over by a car, you would still be able to make breakfast the next morning thanks to the contents of the other bag.
Backing up is much easier and might happen more often than you might think. For example, e-mailing a Word document to someone (or to yourself) is in itself a form of backup, because now there are at least two copies of the file, one on your computer and one on the recipient's. If you use a Web-based e-mail service, such as Gmail, a copy is stored on one of Google's servers, too. This goes for photos and other types of lightweight (in terms of storage size) data as well.
Obviously, you can't use e-mailing as the main backup method; that'd take too long and you would run out of creative energy very fast. Ideally, you should use more-robust approaches. Following are the most popular ways to back up your data, and who they are good for.
Online backup (also known as cloud backup)
An online backup service allows you to store your data at an off-site location by uploading it via the Internet to a remote computer or computers. Generally, you don't know where the computer that hosts your backup is. In reality, your data is likely hosted by multiple servers in multiple data centers around the world. There are many online backup services, and they typically sync local content with the remote server automatically, either in real time or based on a schedule that you set. Most if not all of them offer about 5GB of online storage for free and you can purchase more if need be. Google also offers Google Docs, a Web-based alternative to Microsoft Office that hosts the documents in the cloud (meaning on Google servers) at all times.
The advantages of online backup are that it's convenient and generally keeps data safe from disasters. It also doesn't require you to purchase extra equipment, or use more energy than you already do. Generally you can restore data from anywhere, to any computer, as long as it has a connection to the Internet. This works best for those with a limited amount of data to back up (preferably 5GB or less) and a fast Internet connection.
The main drawback of online backup is that it requires and depends heavily on your connection to the Internet, especially the upload speed. For example, say you have a connection that offers 12Mbps upload speed. It would take about an hour to upload 5GB of data. Most existing residential broadband Internet connections, however, offer somewhere between 1Mbps and 3Mbps upload speeds, at which the same 5GB would take roughly 4 to 11 hours. The same goes for the reverse process, data recovery, which can also take a long time and is only available with a live Internet connection. Online backups also mean that you have to rely on a third party for the safety and privacy of your personal data, and paid services can be expensive over time.
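If you want to work out how long your own upload would take, the math is simple enough to sketch in Python. The function name `upload_hours` is just for illustration, and the estimate is a best case: it assumes decimal units (1GB = 1,000MB) and ignores connection overhead and slowdowns.

```python
def upload_hours(size_gb, upload_mbps):
    """Estimate the hours needed to upload size_gb at upload_mbps."""
    megabits = size_gb * 1000 * 8   # 1GB = 1,000MB; 1 byte = 8 bits
    seconds = megabits / upload_mbps
    return seconds / 3600


for mbps in (12, 3, 1):
    print(f"{mbps} Mbps: {upload_hours(5, mbps):.1f} hours")
```

For 5GB this prints about 0.9 hours at 12Mbps, 3.7 hours at 3Mbps, and 11.1 hours at 1Mbps. Swap in the size of your own photo or video collection to see why a slow upload connection quickly becomes the bottleneck.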
So, if you have lots of photos, songs, and especially homemade movies that you need to keep safe, it's better to find an additional means of backup.
Local or direct-attached backup
This means backing up data on external storage devices or media, such as external hard drives, USB thumbdrives, or optical discs. The good news is that these devices are getting progressively larger in capacity and cheaper in price. There are three main types of direct-attached backups: portable storage devices, desktop external storage devices, and optical media (CDs, DVDs, or Blu-ray discs).
Portable storage devices are generally compact and bus-powered, meaning you don't need a separate power adapter since the drive uses the data cable, most likely USB or Thunderbolt, for both power and data purposes. These devices are generally based on the 2.5-inch (laptop) internal drive. They are affordable but almost always come in a single-volume design and currently can be found offering 2TB of storage space at most. Still, they make great backup options for those with a limited amount of data (2TB or less) and especially for people who want a backup they can carry around with them. Many portable storage devices come with backup software that runs by itself each time you plug the drive into a computer, making them convenient for home users. Personally, I find using one or two portable drives is the best way to back up my data, all things considered. If you haven't backed up your data at all, get one of these budget drives as soon as you get a chance.
Desktop storage devices are generally based on the 3.5-inch (desktop) internal drive. They come in both single-volume and multiple-volume options. Some multiple-volume devices use the 2.5-inch design to be more physically compact. All desktop storage devices require a separate power adapter but they offer more storage space than portable devices do, up to 4TB for single-volume devices and much more for multiple-volume, depending on the number of drives being used. Desktop storage devices can also offer redundancy, making them ideal for those who need both backup and redundancy or need to back up a huge amount of data. Some desktop storage devices can even withstand disasters such as fire or flood.
Optical media, which includes CDs, DVDs, and Blu-ray discs, has been in decline over the last few years, even though it's a good way to back things up, especially for permanent backups. This is because most optical media is designed to be written on once -- there's very low or no chance that information will be overwritten. (Even RW optical discs -- those that allow for rewriting -- can't be overwritten by accident, since you will need to erase the entire disc first.) Most computers come with a DVD/CD optical writer (or burner), some even come with a Blu-ray burner, and the media itself is relatively cheap. Generally optical discs offer 700MB (CDs), up to 8GB (dual-layer DVDs), and up to 50GB (dual-layer Blu-ray discs) of backup storage space. Note that writing to an optical disc takes time and during the writing, the computer needs to be left alone to do its job, without performing other tasks, to ensure successful backups.
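Those capacities add up quickly once you have a sizable collection. As a rough guide, this short Python sketch works out how many discs a backup would need, using the capacities quoted above. The names are made up for illustration, and the sketch assumes every disc can be filled completely, which in practice it rarely can.

```python
import math

# Capacities in GB, as quoted in this article.
DISC_CAPACITY_GB = {
    "CD": 0.7,
    "dual-layer DVD": 8,
    "dual-layer Blu-ray": 50,
}


def discs_needed(data_gb, disc_type):
    """Return how many discs of the given type are needed for data_gb."""
    return math.ceil(data_gb / DISC_CAPACITY_GB[disc_type])


print(discs_needed(100, "dual-layer Blu-ray"))  # 2
print(discs_needed(100, "dual-layer DVD"))      # 13
print(discs_needed(100, "CD"))                  # 143
```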
The advantage of local backup is that it's fast, especially using drives that use the new Thunderbolt connection, and can handle lots of data. Depending on the backup software, it can also offer versioning, and in many cases the fact that data is stored on a separate device, in addition to the original location, means that you can always go back to the previous version if something happens to the file you're working on.
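Versioning sounds fancy, but the basic idea is just "never overwrite the old copy." Here's a minimal Python sketch of that idea, assuming a hypothetical helper called `save_version` that stamps each backup copy with the date and time. Real backup software is far more efficient about it, and the `backup_drive` folder here is a stand-in for an actual external drive.

```python
import shutil
import tempfile
import time
from pathlib import Path


def save_version(source, backup_dir):
    """Copy source into backup_dir under a timestamped name, keeping older copies."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{source.stem}.{stamp}{source.suffix}"
    counter = 1
    # If two versions are saved within the same second, add a counter
    # rather than overwrite the earlier one.
    while target.exists():
        target = backup_dir / f"{source.stem}.{stamp}-{counter}{source.suffix}"
        counter += 1
    shutil.copy2(source, target)
    return target


# Demo in a throwaway folder.
with tempfile.TemporaryDirectory() as tmp:
    doc = Path(tmp) / "notes.txt"
    backups = Path(tmp) / "backup_drive"
    doc.write_text("first draft")
    first = save_version(doc, backups)
    doc.write_text("second draft")
    save_version(doc, backups)
    print(first.read_text())               # first draft
    print(len(list(backups.iterdir())))    # 2
```

Even after the file has been changed, the earlier version is still sitting on the backup drive, which is exactly what redundancy alone can't give you.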
The main drawback of local backup is that you can only back up one computer or device at a time, and may need to manually plug the drive into a computer before a backup can be made. However easy this task is, a lot of people actually forget to do it. Local backup also means you will have more devices cluttering your desk.
If you have multiple computers at home and want to manage backups in one place, it's a good idea to think about network backup.
Network backup
As the name suggests, network backup means you have one computer as a backup destination for all other computers and devices. The best way to do this is with a network-attached storage (NAS) server. Backing up is just one of the many functions of a NAS server. And a NAS server can offer all kinds of backups.
However, some NAS servers, such as the Time Capsule from Apple, are designed to work only as a backup device. This type of NAS server keeps copies of data that originally reside on different computers in the network. In the case of the Time Capsule, for example, you can use Time Machine to back up many Macs on the Capsule's internal drive.
Most NAS servers, however, also work as a file server, hosting shared data for the entire network. In this case you need to also back up the server's data to another NAS server or to an external storage device connected to the server's peripheral port (such as by USB or eSATA).
Advanced NAS servers, such as those from Synology, also offer a cloud backup feature, so you can back up remote computers on it over the Internet. It basically offers a personal online backup service similar to the kind mentioned above, but where you are in control of your own server.
Network backup has many advantages since it's similar to online backup minus the need for an Internet connection. It's also similar to local backup without the need to plug the drive in each time you want to do a backup. Once it's set up, it will run by itself and you don't have to do anything else. Additionally, multiple computers can back up to one device at the same time and there's generally no limit in terms of storage space. A NAS server can also be hidden away in a corner, reducing clutter.
Drawbacks of network backup include that it generally costs more and that it is more complicated to set up than the other two types of backups. For most home use, network backup is also overkill. Also, compared with local backup, and especially compared with Thunderbolt drives, a network backup is generally slower, since it's limited by the speed of the local network, which for now tops out at 1Gbps on most home networks.
The best practice when it comes to protecting your data is using both redundancy and backup when possible. Most multiple-volume storage devices will give you the redundancy option. However, when you have to choose between the two, remember that backup is generally more important, especially for home users.
And as far as backup goes, use any method that's available to you at the time. For example, if you have a mobile device such as a smartphone, make sure you regularly sync it with your computer or an online server, or make it a habit to e-mail your important documents to yourself or a close friend.
Note that though you might have lots of data, the amount of really important, irreplaceable data could actually be very small. For example, purchased digital content such as music and movies can always be downloaded again or even repurchased, so you might choose not to back it up if you run short on storage. However, if you're working on an important project, or are storing important financial data, it's a good idea to make a backup daily or even after each time you have made major changes.
Most importantly, don't ever keep only a single copy of your important data on a single storage device.
That's it for now. If you have questions, use the comment section below or feel free to send them my way via Twitter or my Facebook page.