The basic idea of RAID (Redundant Array of Inexpensive Disks) storage is combining multiple small, cheap disk drives into an array of disk drives (appearing to the computer as a single logical storage unit) that yields performance exceeding that of a SLED (Single Large Expensive Drive).
RAID offers many advantages over the use of single hard disks, including higher data security, fault tolerance, improved availability, and integrated capacity.
That said, RAID was invented more than 30 years ago and simply wasn't designed for the terabyte-scale systems that are commonplace today. At petabyte scale, RAID is clearly beyond its design limits.
I discussed via e-mail the limits of RAID with Cleversafe CEO Chris Gladwin, and here's the problem as he sees it: RAID is mathematically reaching a breaking point for data reliability based on one-terabyte drives. RAID 6, based on parity, cannot recover from more than two simultaneous failures, or two non-simultaneous failures plus a bit rate error. It also doesn't automatically protect data, which remains exposed to software, hardware and user error.
Typical SATA drives have a published bit rate error (BRE) of 1 in 10^14, meaning that once every 100,000,000,000,000 bits read, there will be a bit that is unrecoverable. Although this failure rate seems insignificant, 10^14 bits is only about 12.5 terabytes. When reading 100 terabytes (note: 100 terabytes is 8 x 10^14 bits, or eight times that interval), it is nearly certain there will be an unreadable bit, and if this read happens to be during a rebuild, data will be lost.
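To make that arithmetic concrete, here is a rough sketch of the calculation. It assumes bit errors are independent and occur at exactly the published rate of 1 in 10^14 bits, and that a terabyte is 10^12 bytes; real drives may behave differently, and the function name is only illustrative.

```python
import math

TB = 1e12  # bytes per terabyte (decimal, as drive vendors count)


def p_unrecoverable_read(bytes_read: float, bits_per_error: float = 1e14) -> float:
    """Chance of hitting at least one unrecoverable bit while reading bytes_read bytes.

    Assumes independent bit errors at a rate of 1 per bits_per_error bits.
    """
    bits = bytes_read * 8
    # P(at least one error) = 1 - (1 - 1/bits_per_error) ** bits.
    # log1p/expm1 keep precision when the per-bit error rate is tiny.
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_error))


if __name__ == "__main__":
    for tb in (1, 12.5, 100):
        print(f"{tb:>6} TB read: {p_unrecoverable_read(tb * TB):.4f}")
```

Under these assumptions, a 1-terabyte read has roughly an 8 percent chance of hitting an unreadable bit, a 12.5-terabyte read (10^14 bits) about 63 percent, and a 100-terabyte read more than 99.9 percent.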
There are still applications that can utilize RAID for increased I/O performance. For example, using RAID for a high I/O transactional system would be a good fit. Also, smaller storage applications, for example a terabyte or below, could still use RAID effectively.
Data continues to grow exponentially. Market researcher IDC estimated that the digital universe exceeded 281 exabytes in 2007 and will grow tenfold by 2011. Enterprises in a number of industries, including media/entertainment, health care, and video surveillance, have already exceeded 100 terabytes of storage in use. Determining the appropriate long-term storage strategies for these industries will be a challenge as they realize the limitations of RAID.
The good news in addressing these data growth issues is the availability of low-cost processors and high-capacity drives. Combined, they provide great opportunities for disruptive innovations that will displace RAID.