Saturday, May 8, 2010

Some thoughts about software RAID

Last week I upgraded from Ubuntu 9.10 to 10.04 LTS. I will leave it to you to decide whether this was such a good idea, but now it's done.

Somewhere in the process the system stalled (X-windows froze and there was no way to ssh in from outside). The hard restart caused mdadm to see both my RAID10 and RAID5 arrays as missing two of their four drives, and the arrays wouldn't assemble. Then I may have done something stupid. In the past I have been able to recreate an array using mdadm --create. It's almost certainly not the right way to deal with the problem, but experience, as March notes, can be a bad teacher, and it has worked on several occasions.

When the arrays restarted, the ext3 file systems on both /dev/md0 (RAID10) and /dev/md1 (RAID5) were completely destroyed.

I have three takeaways from this episode:
1) I should probably have tried to learn about mdadm recovery procedures, but there's not enough time in the world to do everything one ought, and I doubt that I will do so even now.
2) What is the weakest link in the system: the hardware (specifically the disks, which haven't failed in 3 years) or the software, of which mdadm is a part, which has failed so many times I've lost count? If the goal is not just the illusion of security that RAID offers but actual security, one has to wonder whether it's really such a panacea for somewhat naive end users like me.
3) When choosing a backup service, ask not how fast it can back stuff up but rather how quickly and easily you can restore your data.
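
On the first point, the sequence usually suggested before reaching for mdadm --create is to look at each member's RAID superblock with mdadm --examine and then try mdadm --assemble --force, which reuses the existing metadata instead of writing new superblocks. What follows is only a minimal sketch in Python that builds those commands and prints them; it runs nothing, the helper name recovery_plan is made up for illustration, and the member device names are hypothetical placeholders rather than the layout of the arrays described above.

```python
from typing import List


def recovery_plan(array: str, members: List[str]) -> List[List[str]]:
    """Build (but do not run) a cautious mdadm recovery sequence.

    `array` and `members` are hypothetical device paths supplied by the caller.
    """
    plan: List[List[str]] = []

    # 1. Read-only look at each member's RAID superblock
    #    (event counts, array UUID, device role).
    for dev in members:
        plan.append(["mdadm", "--examine", dev])

    # 2. Stop any half-assembled array so the members are free.
    plan.append(["mdadm", "--stop", array])

    # 3. Assemble from the existing superblocks, accepting members whose
    #    metadata looks slightly out of date. No --create anywhere.
    plan.append(["mdadm", "--assemble", "--force", array] + list(members))

    return plan


if __name__ == "__main__":
    # Placeholder device names, for illustration only.
    devices = ["/dev/sda1", "/dev/sdb1", "/dev/sdc1", "/dev/sdd1"]
    for cmd in recovery_plan("/dev/md0", devices):
        print(" ".join(cmd))
```

Printing the plan rather than executing it is deliberate: the output of the --examine step should be read by a human before anything that changes the array is run, and whether a forced assemble is appropriate depends on what that output shows.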
