This article provides a detailed explanation of RAID (Redundant Array of Independent Disks), a storage technology that merges multiple disk drives into a single unit to enhance data redundancy and performance. It covers various RAID levels, from RAID 0 to RAID 10, explaining their unique configurations, advantages, and ideal use cases based on speed, capacity, cost, and fault tolerance.
Redundant Array of Independent Disks, or Redundant Array of Inexpensive Disks (RAID), is a storage technology that combines multiple physical disk drives into a single logical unit recognized by the operating system (Stallings, 2018).
RAID storage can also be defined as a storage technology that creates a fail-safe against data loss by merging two or more hard disk drives (HDDs) or solid-state drives (SSDs) into one cohesive storage unit, or array (Daniel, 2023). The main goal of RAID storage is to protect against the total loss of a disk drive’s data by repeating or recreating that data and storing it on an additional drive or drives, a process known as data redundancy. Furthermore, because RAID distributes the data across multiple drives, the data can be accessed from several drives simultaneously, which improves I/O performance. The data is distributed across the drives by striping. Striping is the process of dividing data into units called strips and spreading those strips across multiple drives. Strips may be physical blocks, sectors, or some other unit.
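The striping described above can be sketched in a few lines of Python. This is a minimal illustration, assuming simple round-robin placement; the function names `locate_strip` and `stripe` and the in-memory lists standing in for drives are hypothetical, not part of any RAID implementation.

```python
def locate_strip(strip_index: int, num_drives: int) -> tuple[int, int]:
    """Map a logical strip number to (drive, row) under round-robin striping."""
    if num_drives < 1:
        raise ValueError("need at least one drive")
    drive = strip_index % num_drives   # strips are dealt to drives in turn
    row = strip_index // num_drives    # position of the strip on that drive
    return drive, row


def stripe(data: bytes, num_drives: int, strip_size: int) -> list[list[bytes]]:
    """Split data into strips and deal them across the drives."""
    drives: list[list[bytes]] = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), strip_size):
        drive, _ = locate_strip(offset // strip_size, num_drives)
        drives[drive].append(data[offset:offset + strip_size])
    return drives


if __name__ == "__main__":
    for number, contents in enumerate(stripe(b"ABCDEFGHIJ", 3, 2)):
        print(f"drive {number}: {contents}")
```

Because consecutive strips land on different drives, a request that spans several strips can be served by several drives at once.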
RAID is organized into levels: RAID 0, RAID 1, RAID 2, RAID 3, RAID 4, RAID 5, RAID 6, and RAID 10. Note: every RAID level except RAID 0 provides redundancy.
NOTE: From Stallings, 2018, Figure 11.8
RAID level 0
In RAID 0, the data is striped across multiple drives. As a result, it provides high read/write performance. However, it does not incorporate redundancy. Because it offers no protection against total data loss, RAID level 0 is not commonly used.
RAID level 1
In RAID 1, the data is striped and each strip is duplicated onto two separate drives. In other words, every drive in the array has a mirror drive that contains the same data. This technique is referred to as mirroring. It provides redundancy and increases read speeds, since a read can be served by either copy. On the other hand, compared to RAID 0, it lowers write speeds, because every write must be performed on both drives.
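As a rough illustration of mirroring, the sketch below models one mirrored pair with two in-memory dictionaries. The class `MirroredPair` and its methods are hypothetical names chosen for this example; a real controller works on physical blocks and also resynchronizes a replaced drive.

```python
class MirroredPair:
    """Toy model of one RAID 1 pair: every strip is written to both drives."""

    def __init__(self) -> None:
        self.drives = [{}, {}]          # strip number -> data, one dict per drive
        self.failed = [False, False]

    def write(self, strip: int, data: bytes) -> None:
        # A write must update every healthy copy, which is why writes
        # are no faster than the slower of the two drives.
        for index, drive in enumerate(self.drives):
            if not self.failed[index]:
                drive[strip] = data

    def fail(self, index: int) -> None:
        """Simulate the total loss of one drive."""
        self.failed[index] = True
        self.drives[index] = {}

    def read(self, strip: int) -> bytes:
        # Either healthy copy can serve the read.
        for index, drive in enumerate(self.drives):
            if not self.failed[index]:
                return drive[strip]
        raise OSError("both drives in the mirror have failed")


if __name__ == "__main__":
    pair = MirroredPair()
    pair.write(0, b"payload")
    pair.fail(0)
    print(pair.read(0))  # still readable from the surviving mirror
```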
RAID level 2
RAID 2 uses an error-correcting code stored on dedicated parity drives. The data is striped, but the strips are very small, often a single byte or word. Instead of striping blocks across the drives, the bits are stored in parallel across all the drives (Natarajan, 2016). Two groups of drives are used: one to read and write the data, and another to read and write the error correction codes (the redundancy group). In other words, the bits of the code are stored in the corresponding bit positions on multiple parity drives, and on a single read, all disks are accessed simultaneously. RAID 2 usually uses a Hamming error correction code (ECC) and stores it on the redundancy drives. If an error occurs, it uses the correction code to rebuild the data. It requires fewer drives than RAID 1, but it is still costly: the number of redundant drives is proportional to the log of the number of data drives. RAID 2 can be beneficial in an environment in which many disk errors occur. However, due to the high reliability of modern drives, “RAID 2 is overkill and is not implemented” (Stallings, 2018, p. 502).
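The logarithmic growth in redundant drives can be checked with a short calculation. The sketch below assumes a plain single-error-correcting Hamming code, for which the number of check bits r must satisfy 2^r >= m + r + 1 for m data bits; the function name `hamming_check_drives` is hypothetical, and real RAID 2 designs may add a further bit for double-error detection.

```python
def hamming_check_drives(data_drives: int) -> int:
    """Smallest r with 2**r >= data_drives + r + 1 (single-error-correcting
    Hamming code, one bit per drive)."""
    if data_drives < 1:
        raise ValueError("need at least one data drive")
    r = 0
    while 2 ** r < data_drives + r + 1:
        r += 1
    return r


if __name__ == "__main__":
    for m in (4, 8, 16, 32):
        print(f"{m} data drives need {hamming_check_drives(m)} redundant drives")
```

Under this assumption, doubling the number of data drives adds roughly one redundant drive, whereas RAID 1 would double the number of mirror drives.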
RAID level 3
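Like RAID 2, RAID 3 uses very small strips and parallel access. The difference is that RAID 3 requires only a single redundant drive, which stores a simple parity bit computed across the bits in the same position on all the data drives. Therefore, it is less costly than RAID 2. RAID 3 allows very high data transfer rates, but because every drive takes part in every request, only one I/O request can be executed at a time.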
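A single parity drive is enough to rebuild one failed drive because parity is the exclusive-OR (XOR) of the data, and XOR is its own inverse. The following is a minimal sketch under that assumption; the helper `xor_strips` and the sample byte values are illustrative only.

```python
from functools import reduce


def xor_strips(strips: list[bytes]) -> bytes:
    """XOR equal-length strips together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


if __name__ == "__main__":
    data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]   # three data drives
    parity = xor_strips(data)                         # the single parity drive

    # Drive 1 fails: XOR the surviving data drives with the parity drive.
    rebuilt = xor_strips([data[0], data[2], parity])
    print(rebuilt == data[1])
```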
RAID level 4
RAID 4 also uses parity. The difference is that it uses an independent access technique, in which the strips are relatively large and the drives operate independently, allowing several I/O requests to be serviced in parallel. However, “every write operation must involve the parity disk, which therefore can become a bottleneck” (Stallings, 2018, p. 504).
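The bottleneck follows from how a small write updates parity: the new parity strip equals the old parity XOR the old data XOR the new data, so the parity drive is read and written on every update. The sketch below assumes XOR parity; `xor_bytes` and `small_write` are hypothetical helper names.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(strips: list[bytes], parity: bytes, index: int,
                new_data: bytes) -> tuple[list[bytes], bytes]:
    """Update one data strip and return (new strips, new parity).

    Only the target data drive and the parity drive are touched:
    new_parity = old_parity XOR old_data XOR new_data.
    """
    old_data = strips[index]
    new_parity = xor_bytes(xor_bytes(parity, old_data), new_data)
    updated = list(strips)
    updated[index] = new_data
    return updated, new_parity


if __name__ == "__main__":
    strips = [b"\x01\x02", b"\x10\x20", b"\x0a\x0b"]
    parity = xor_bytes(xor_bytes(strips[0], strips[1]), strips[2])
    strips, parity = small_write(strips, parity, 1, b"\xff\x00")
    # The incremental result matches a full recomputation over all drives.
    print(parity == xor_bytes(xor_bytes(strips[0], strips[1]), strips[2]))
```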
RAID level 5
The RAID 5 scheme is similar to RAID 4. The difference is that it distributes the parity strips across all the drives instead of keeping them on one dedicated drive, which avoids the potential I/O bottleneck of the single parity disk found in RAID 4. It is the most common RAID configuration. Whereas RAID 0 and RAID 1 require a minimum of two drives, RAID 5 requires a minimum of three drives to function. Like RAID 0, RAID 5 offers fast read speeds, but its write speed is lower because parity strips must be computed and written. Furthermore, the loss of an entire drive will not result in any data loss.
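One way to picture distributed parity is to print which drive holds the parity strip in each stripe row. The sketch below assumes a simple rotation that starts on the last drive and moves one drive to the left per row; actual layouts vary between implementations, and the names `parity_drive` and `layout` are hypothetical.

```python
def parity_drive(row: int, num_drives: int) -> int:
    """Drive holding the parity strip for a stripe row (assumed rotation)."""
    return (num_drives - 1 - row) % num_drives


def layout(num_rows: int, num_drives: int) -> list[list[str]]:
    """Label each cell as a data strip (D0, D1, ...) or parity strip (P0, ...)."""
    if num_drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    rows = []
    strip = 0
    for row in range(num_rows):
        cells = []
        for drive in range(num_drives):
            if drive == parity_drive(row, num_drives):
                cells.append(f"P{row}")
            else:
                cells.append(f"D{strip}")
                strip += 1
        rows.append(cells)
    return rows


if __name__ == "__main__":
    for cells in layout(3, 3):
        print("  ".join(cells))
```

Because the parity strip sits on a different drive in each row, parity updates are spread over all drives rather than queued on one.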
RAID level 6
The RAID 6 scheme is similar to RAID 5. The difference is that instead of one parity strip per stripe, it performs two different parity calculations and stores the results in separate strips on different drives, see Figure 1. This “provides extremely high data availability. Three disks would have to fail within the MTTR (mean time to repair) interval to cause data to be lost” (Stallings, 2018, p. 504), but on the other hand, it severely affects write performance.
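The article does not say which two parity calculations are used. One widely used approach, assumed here purely for illustration, is a P strip (plain XOR) plus a Q strip (a Reed-Solomon style sum in the finite field GF(2^8)). The sketch below shows how two lost data strips can be recovered from P and Q; the function names, the generator 2, and the polynomial 0x11D are all assumptions of this example.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a: int, n: int) -> int:
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a: int) -> int:
    """Inverse of a nonzero element: a**254, since a**255 == 1."""
    return gf_pow(a, 254)


def compute_pq(data: list[bytes]) -> tuple[bytes, bytes]:
    """P is the XOR of all strips; Q weights strip i by 2**i in GF(2^8)."""
    p = bytearray(len(data[0]))
    q = bytearray(len(data[0]))
    for i, strip in enumerate(data):
        coeff = gf_pow(2, i)
        for k, byte in enumerate(strip):
            p[k] ^= byte
            q[k] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(data: list, x: int, y: int, p: bytes, q: bytes) -> tuple[bytes, bytes]:
    """Rebuild the strips at positions x and y (given as None in data)."""
    zeros = bytes(len(p))
    partial_p, partial_q = compute_pq([s if s is not None else zeros for s in data])
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(len(p)), bytearray(len(p))
    for k in range(len(p)):
        a = p[k] ^ partial_p[k]      # equals Dx ^ Dy
        b = q[k] ^ partial_q[k]      # equals gx*Dx ^ gy*Dy
        dx[k] = gf_mul(gf_mul(gy, a) ^ b, inv)
        dy[k] = a ^ dx[k]
    return bytes(dx), bytes(dy)


if __name__ == "__main__":
    data = [b"RAID", b"six!", b"demo", b"test"]
    p, q = compute_pq(data)
    damaged = [data[0], None, data[2], None]   # two data drives lost
    print(recover_two(damaged, 1, 3, p, q) == (data[1], data[3]))
```

The extra field arithmetic needed for Q on every write is one reason the second parity calculation costs write performance.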
RAID level 10
RAID 10 combines data striping (RAID 0) and disk mirroring (RAID 1). It achieves redundancy by duplicating the data and performance by striping it. A minimum of four drives is necessary to implement it. In other words, RAID 10 combines the strengths of RAID 0 and RAID 1, with fast read and write speeds and fault tolerance (Natarajan, 2016).
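The combination can be sketched as striping across mirrored pairs. The example assumes that drives 0 and 1 form the first pair, drives 2 and 3 the second, and so on; that numbering and the names `raid10_drives` and `survives` are hypothetical.

```python
def raid10_drives(strip_index: int, num_drives: int) -> tuple[int, int]:
    """Return the two drives that hold a given logical strip."""
    if num_drives < 4 or num_drives % 2:
        raise ValueError("RAID 10 needs an even number of drives, at least four")
    pairs = num_drives // 2
    pair = strip_index % pairs          # RAID 0: stripe across the pairs
    return 2 * pair, 2 * pair + 1       # RAID 1: both drives of the pair


def survives(failed: set[int], num_drives: int) -> bool:
    """Data survives as long as no mirrored pair has lost both drives."""
    pairs = num_drives // 2
    return all(not {2 * p, 2 * p + 1} <= failed for p in range(pairs))


if __name__ == "__main__":
    print(raid10_drives(0, 4), raid10_drives(1, 4))
    print(survives({0, 2}, 4))   # one drive lost from each pair
    print(survives({0, 1}, 4))   # both drives of one pair lost
```

Under this layout the array always tolerates one drive failure, and tolerates more only when the failures fall in different pairs.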
In conclusion, the best RAID configuration depends on the application’s requirements, which may be based on speed, capacity, cost, data redundancy, or a combination of these. For example, a supercomputer application may prefer RAID 0, where performance, capacity, and low cost are more important than reliability. In comparison, applications where data integrity is crucial, such as healthcare, banking, and defense, may prefer RAID 6. Although RAID 6 comes at a higher cost and offers less performance than RAID 0, particularly when writing data, it still provides high data availability and protects against data loss even if two drives fail simultaneously.
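To make the capacity and fault-tolerance trade-off concrete, the following sketch compares the commonly implemented levels. It assumes equal-size drives and the standard overheads (mirroring halves capacity, RAID 5 gives up one drive to parity, RAID 6 gives up two); the function name `usable_capacity` and the lookup tables are hypothetical helpers for this comparison, not a sizing tool.

```python
MIN_DRIVES = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}
# Drive failures each level is guaranteed to survive without data loss.
GUARANTEED_FAILURES = {0: 0, 1: 1, 5: 1, 6: 2, 10: 1}


def usable_capacity(level: int, num_drives: int, drive_tb: float) -> float:
    """Usable capacity in TB for an array of equal-size drives."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level in (1, 10):
        if num_drives % 2:
            raise ValueError("mirrored levels need an even number of drives")
        return num_drives * drive_tb / 2          # every drive has a mirror
    parity_drives = {0: 0, 5: 1, 6: 2}[level]
    return (num_drives - parity_drives) * drive_tb


if __name__ == "__main__":
    for level in (0, 5, 6, 10):
        capacity = usable_capacity(level, 4, 2.0)
        print(f"RAID {level}: {capacity} TB usable, "
              f"survives {GUARANTEED_FAILURES[level]} drive failure(s)")
```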
References:
Daniel, B. (2023, March 7). RAID levels 0, 1, 5, 6 and 10 & RAID types (software vs. hardware). Trusted Computing Innovator. https://www.trentonsystems.com/blog/raid-levels-0-1-5-6-10-raid-types#:~:text=The%20best%20RAID%20configuration%20for,RAID%206%20and%20RAID%2010
Natarajan, R. (2016, March 25). RAID 2, RAID 3, RAID 4, RAID 6 explained with diagram. The Geek Stuff. https://www.thegeekstuff.com/2011/11/raid2-raid3-raid4-raid6/
Stallings, W. (2018). Operating systems: Internals and design principles. Pearson.
Originally published at Alex.omegapy - Medium on September 18, 2024.