RAID 5 vs. RAID 6

RAID-6 carries a performance cost (each write updates two parity blocks instead of just one), but Red Hat recommends it because current drive capacities increase the probability that a second fault will occur while a failed drive is being rebuilt.

For example, consider the following scenario with a RAID-5 set that contains 4 data drives + 1 parity drive:

Time 1: Drive 1 fails
Time 2: RAID rebuild starts; every block on every surviving drive must be read
Time 3: Drive 3 returns a read error on block 7
Time 4: Drive 4 returns a read error on block 100,000
Time 5: Drive 2 returns a read error on block 200,000

Latent sector errors are extremely common on today’s drives; hitting at least one during a full rebuild is almost a certainty. With RAID-5 in the above scenario, three RAID-5 stripes of user data are lost. With RAID-6, you can rebuild the failed sectors from the second parity (assuming it does not have bad sectors in the same stripes as the other drives).
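To make the scenario concrete, here is a minimal sketch of single-parity (XOR) reconstruction for one stripe of a 4+1 set. The block contents and the helper names `xor_blocks` and `rebuild` are made up for illustration; real RAID implementations work on much larger blocks and rotate parity across drives.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe of a hypothetical 4+1 RAID-5 set: four data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)
stripe = data + [parity]

def rebuild(stripe, missing):
    """Rebuild the one block listed in `missing` from the surviving blocks."""
    if len(missing) != 1:
        # Single parity gives one equation per stripe, so only one
        # unknown block can be solved for.
        raise ValueError("single parity can recover exactly one block per stripe")
    survivors = [blk for i, blk in enumerate(stripe) if i not in missing]
    return xor_blocks(survivors)

# Drive 1 (index 0) has failed: its block is recovered from the other four.
assert rebuild(stripe, {0}) == data[0]
```

If a surviving drive also returns a read error in the same stripe (for example `rebuild(stripe, {0, 2})`), the sketch raises an error, which mirrors the lost stripes in the timeline above. RAID-6 stores a second, independent parity per stripe and can therefore solve for two missing blocks.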

Red Hat strongly recommends that Red Hat Storage customers keep all of their data as safe as possible. Data safety takes priority over performance.

The performance impact of RAID-6 is most acute for random-write workloads with small transfer sizes, because each write request pays the penalty of multiple parity block updates. I don’t think that’s a typical workload for RHS. But if you have a workload that fits this description and that you think is important for RHS, please let us know. Possible examples are transaction processing and virtualization (a filesystem embedded within a file).
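The small-write penalty is often estimated with a rule of thumb: a small random write costs 2 disk I/Os on a mirror, 4 on RAID-5 (read old data and parity, write new data and parity) and 6 on RAID-6 (the same, with two parity blocks). The sketch below applies that rule; the disk count and per-disk IOPS are hypothetical values, and controller caching or full-stripe writes can change the picture considerably.

```python
# Disk I/Os needed per small random write (common rule of thumb).
WRITE_PENALTY = {"RAID10": 2, "RAID5": 4, "RAID6": 6}

def random_write_iops(level, disks, iops_per_disk):
    """Rough upper bound on small random-write IOPS for an array."""
    return disks * iops_per_disk / WRITE_PENALTY[level]

# Hypothetical array: 12 disks at 150 random IOPS each.
for level in ("RAID10", "RAID5", "RAID6"):
    print(level, random_write_iops(level, disks=12, iops_per_disk=150))
```

Under these assumptions RAID-6 delivers about two thirds of the RAID-5 figure for this worst-case workload. Large sequential writes that fill whole stripes avoid the read-modify-write cycle and are much less affected.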

Some Gluster users run large numbers of mirrored (RAID-1) bricks rather than one RAID-6 brick per server, and they do get better performance for certain workloads (reads of files whose size is comparable to the RAID-6 stripe size). This, however, is a disadvantage not only of RAID-6 but also of RAID-5 and even RAID-10. The key with reads is to make sure that the disks don’t spend all their time seeking, which can be controlled through the RAID-6 stripe size. For workloads with high request concurrency, greater throughput can sometimes be achieved by limiting the number of disks per brick.


Jan D.


One comment

  1. Be mindful of hard errors, which are the reason RAID no longer lives up to its original promise.
    If one drive fails and you then hit a read error during the rebuild, the entire array can fail.

    http://www.zdnet.com/article/why-raid-5-stops-working-in-2009/

    SATA drives are commonly specified with an unrecoverable read error (URE) rate of one error per 10^14 bits read, which means that roughly once every 12.5 terabytes the disk will not be able to read a sector back to you.

    http://www.lucidti.com/zfs-checksums-add-reliability-to-nas-storage
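The 12.5 terabyte figure in the comment above, and the chance of hitting an error during a rebuild, can be checked with a few lines of arithmetic. This is a rough sketch that treats read errors as independent events occurring at exactly the specified rate, which is a simplification; the 2 TB drive size is a hypothetical value.

```python
import math

URE_RATE_PER_BIT = 1e-14   # one unrecoverable error per 10^14 bits read

def mean_tb_between_errors(rate_per_bit=URE_RATE_PER_BIT):
    """Average amount of data read (decimal TB) per unrecoverable error."""
    return 1 / rate_per_bit / 8 / 1e12

def p_read_error(tb_read, rate_per_bit=URE_RATE_PER_BIT):
    """Probability of at least one URE while reading `tb_read` terabytes."""
    bits = tb_read * 1e12 * 8
    # 1 - (1 - rate)^bits, written with log1p/expm1 for numerical accuracy
    return -math.expm1(bits * math.log1p(-rate_per_bit))

print(mean_tb_between_errors())        # about 12.5 TB
# Rebuilding a 4+1 RAID-5 set of hypothetical 2 TB drives reads the
# four surviving drives in full: 8 TB.
print(p_read_error(8))
```

Under these assumptions, the 8 TB rebuild has a little under a 50% chance of encountering at least one unrecoverable read error, which is the situation a second parity is meant to cover.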
