Tech Planet: Redundant Array of Inexpensive Disks?

With the cost of hard disks falling, it is not uncommon for home desktop PCs to have RAID. In fact, most PC motherboards have a built-in RAID controller, even if it is not in use. RAID is short for "redundant array of inexpensive disks" and was developed to combine smaller disks into an array for better performance, scalability, and data reliability and recovery.

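RAID pursues these aims through several configurations, called levels. The most common nowadays are RAID 0, RAID 1, RAID 0+1 (or the closely related RAID 10) and RAID 5. Each level makes its own trade-off between performance and redundancy; RAID 0, as described next, provides no redundancy at all.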

RAID 0 simply stripes the data across multiple disks. The data is divided into smaller pieces of a fixed size, here called the "stripe width" (other references call this the stripe size or chunk size), and the pieces are written across the disks in turn. If the file to be written is 5KB in size, the stripe width is 1KB, and there are 4 disks in the RAID set, the first stripe is written to the first disk, the second to the second disk, and so on up to the fourth stripe; the fifth stripe is then written to the first disk again. This is a fast way of writing large files, because the disks work in parallel, but if one disk fails, the whole set fails.
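The 5KB example above can be traced with a few lines of Python. This is only a sketch of round-robin placement; the function name `stripe_layout` and its parameters are made up for illustration, and real controllers work on disk blocks rather than whole files.

```python
def stripe_layout(file_size_kb, stripe_width_kb, num_disks):
    """Return (stripe_number, disk_number) pairs, both counted from 1,
    for a simple round-robin RAID 0 layout (illustrative only)."""
    # Ceiling division: a partly filled last stripe still needs a slot.
    num_stripes = -(-file_size_kb // stripe_width_kb)
    return [(i + 1, i % num_disks + 1) for i in range(num_stripes)]


# The example from the text: 5KB file, 1KB stripe width, 4 disks.
for stripe, disk in stripe_layout(5, 1, 4):
    print(f"stripe {stripe} -> disk {disk}")
```

The fifth stripe wraps around to disk 1, which is the behaviour described in the paragraph above.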

RAID 1 is also called mirroring, because the same data is written to two disks at the same time, so that one disk is the mirror of the other. Reads can be served by either drive. If one disk fails, the other disk still holds the data and read-write operations continue. This is an effective solution, at the cost of having two disks do the job of one.
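A toy model may make the mirroring idea concrete. The `Mirror` class below is hypothetical: it models the two disks as in-memory dictionaries and only shows that every write lands on both disks and that reads survive the loss of one of them.

```python
class Mirror:
    """Toy RAID 1 set: two 'disks' modelled as dicts keyed by block number."""

    def __init__(self):
        self.disks = [{}, {}]
        self.failed = [False, False]

    def write(self, block, data):
        # Every write goes to each disk that is still healthy.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block] = data

    def read(self, block):
        # Serve the read from the first healthy disk.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[block]
        raise IOError("both mirrors have failed")

    def fail(self, index):
        # Simulate a dead disk: its contents are gone.
        self.failed[index] = True
        self.disks[index].clear()


m = Mirror()
m.write(0, b"payroll")
m.fail(0)            # first disk dies
print(m.read(0))     # still readable from the second disk
```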

RAID 0+1 is a combination of RAID 0 and RAID 1. The data is striped across a set of disks first, and that striped set is then mirrored onto a second set with the same number of disks. If you have four disks, the data is striped across the first two, and that pair is mirrored onto the remaining two disks. This is a robust solution but an expensive one, since only half of the total disk capacity is usable.
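The four-disk case can be sketched as follows. The helper names `raid01_layout` and `survives` are invented for this illustration, disks are numbered from 0, and the model assumes a strict mirror-of-stripes arrangement in which a striped set is unusable as soon as any one of its disks fails.

```python
def raid01_layout(num_stripes, disks_per_set=2):
    """Map each stripe to (disk in first set, disk in mirror set)."""
    layout = []
    for stripe in range(num_stripes):
        primary = stripe % disks_per_set      # striping within the first set
        mirror = primary + disks_per_set      # same position in the mirror set
        layout.append((primary, mirror))
    return layout


def survives(failed_disks, disks_per_set=2):
    """True if at least one striped set has no failed disk."""
    set_a = set(range(disks_per_set))
    set_b = set(range(disks_per_set, 2 * disks_per_set))
    failed = set(failed_disks)
    return not (failed & set_a) or not (failed & set_b)


print(raid01_layout(4))      # where each of four stripes is stored
print(survives([0, 1]))      # both failures in the same striped set
print(survives([0, 2]))      # one failure in each striped set
```

Under this assumption, two failures are tolerated only when they fall in the same striped set.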

RAID 5 is like RAID 0, but with an added parity write. Parity is error-correction information computed from the data in the other pieces of the same stripe, typically with an exclusive-or (XOR). The parity is also rotated across the disks rather than kept on any single disk. Because every write must also update parity, RAID 5 writes are generally slower than those of RAID 1 or RAID 0+1.
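As a rough illustration, parity can be computed as a byte-wise XOR of the data pieces in a stripe. The function names below are hypothetical, and the rotation rule in `parity_disk` is just one possible scheme; actual controllers use various layouts.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(row, num_disks):
    """Disk (counted from 0) holding parity for a given stripe row.
    Rotating the position keeps parity off any single disk.
    This particular rotation is an assumption made for the example."""
    return (num_disks - 1 - row) % num_disks


# Three data pieces on a four-disk set; the fourth piece is parity.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)
print(parity.hex())
print([parity_disk(row, 4) for row in range(5)])
```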

In case of a disk failure, the data can be recovered by recalculating the contents of the failed disk from the data and parity on the other disks. In hot-swap installations, the failed disk is taken out and replaced while the system is running, and the data is rebuilt on the fly. There is noticeable performance degradation while the data is being rebuilt.
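Assuming the parity is a plain XOR of the data pieces (a common choice, though not stated in the text above), the rebuild step can be sketched like this. The names and sample data are made up for the example.

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# One stripe: three data pieces plus their parity.
d0, d1, d2 = b"RAI", b"D5!", b"abc"
parity = xor_blocks([d0, d1, d2])

# Suppose the disk holding d1 fails. XOR-ing everything that survives
# (the other data pieces and the parity) yields the missing piece.
recovered = xor_blocks([d0, d2, parity])
print(recovered == d1)
```

The same calculation has to be repeated for every stripe on the replaced disk, which is one reason a rebuild slows the array down.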

RAID was designed to be both scalable and robust. But with each additional hard disk, the chance that some disk in the array fails increases. The redundant RAID levels (RAID 1, RAID 0+1 and RAID 5) were therefore designed to continue running even in the event of a single disk failure. Depending on the RAID 0+1 configuration and on which disks fail, data can still be written and read correctly even after more than one failure, up to a certain point, but with noticeable performance degradation.
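The effect of adding disks can be estimated with a simple model. The sketch below assumes that disks fail independently and with the same probability over some period; the 3% figure is a made-up value for illustration, not a measured failure rate.

```python
def p_any_failure(p_single, num_disks):
    """Probability that at least one of num_disks fails, assuming
    independent failures with probability p_single each."""
    return 1.0 - (1.0 - p_single) ** num_disks


# Hypothetical per-disk failure probability of 3% over the period.
for n in (1, 2, 4, 8):
    print(n, "disks:", round(p_any_failure(0.03, n), 4))
```

In practice failures are often correlated (same batch, same enclosure, same power supply), so this independent model is best read as a lower-bound style estimate.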

However, once multiple disks have failed, a RAID recovery usually becomes necessary. RAID recovery of important data is specialized work that depends on the type of disks and the RAID implementation. In some instances, RAID data recovery needs a laboratory clean room to study the failed disks before the data can be recovered.

Most data centers rely on further redundancy of data storage and restore the data from it after the RAID set has been repaired. In other instances, the data is recovered from tape backup after repairing the RAID set. In the worst case, for very important data, RAID data recovery is done by companies that specialize in dealing with RAID setups every day.

Thursday, July 3, 2008
