Saturday, May 16, 2009

Virtual Elegance

The machine on which I do my simulation work was a Fedora 9 based system with four 320GB drives configured in three RAID arrays. The OS itself was in a RAID10 partition spanning all four disks. The swap partition was a RAID0 array, again across all the disks. Finally, a RAID5 partition held my data and three virtual machines (two Fedora guests and one Windows XP).
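For anyone wondering what such a layout yields in usable space, here is a rough sketch of the arithmetic. The per-disk partition sizes (20, 2 and 298 GB) are made up for illustration, since I haven't listed the real ones, and the RAID10 figure assumes the usual two copies of every block.

```python
def usable_capacity_gb(level, members, member_gb):
    """Usable size of an array built from equal-sized member partitions."""
    if level == 0:
        # Striping only: every member contributes its full size.
        return members * member_gb
    if level == 5:
        if members < 3:
            raise ValueError("RAID5 needs at least three members")
        # One member's worth of space is consumed by parity.
        return (members - 1) * member_gb
    if level == 10:
        if members < 2:
            raise ValueError("RAID10 needs at least two members")
        # Assumption: two copies of every block, so half the raw space.
        return members * member_gb / 2
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    # Hypothetical split of each 320GB drive into three partitions.
    layout = [("OS", 10, 20), ("swap", 0, 2), ("data", 5, 298)]
    for name, level, size_gb in layout:
        total = usable_capacity_gb(level, 4, size_gb)
        print(f"{name}: RAID{level} over 4 x {size_gb} GB gives {total} GB usable")
```

With those assumed sizes, the RAID5 data array comes to 894 GB usable, since one of the four partitions' worth of space goes to parity.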

The system was crashing several times a day - the mouse and keyboard would freeze in the guest and the only way out was a hard reset. Since Fedora is not one of VMware's supported OSs, I decided to try Ubuntu, which is. The question was: could I pull the Fedora host OS out from under all this without disturbing the RAID array holding my data and my VM disks? The process was complicated by the fact that the Ubuntu installer does not have RAID10 support, so an intervention is needed during the installation to download mdadm and configure RAID10 from the command line.
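To give an idea of what that command-line intervention involves, here is a small sketch that only prints the kind of mdadm commands needed; it runs nothing. The disk names (sda to sdd), the partition numbering and the md device names are assumptions for illustration, not a record of my actual setup, and the exact commands will differ from system to system.

```python
import shlex

# Assumption: four disks, with partition 1 for the OS and partition 3 for data.
DISKS = ["sda", "sdb", "sdc", "sdd"]


def member_partitions(partition_number, disks=DISKS):
    """Return the device path of the same partition on every disk."""
    return [f"/dev/{disk}{partition_number}" for disk in disks]


def create_array_cmd(md_device, level, partition_number, disks=DISKS):
    """Build an mdadm command that creates a NEW array (destroys old contents)."""
    members = member_partitions(partition_number, disks)
    return ["mdadm", "--create", md_device, f"--level={level}",
            f"--raid-devices={len(members)}", *members]


def assemble_array_cmd(md_device, partition_number, disks=DISKS):
    """Build an mdadm command that reassembles an EXISTING array, data intact."""
    return ["mdadm", "--assemble", md_device,
            *member_partitions(partition_number, disks)]


if __name__ == "__main__":
    steps = [
        ["apt-get", "install", "mdadm"],         # fetch mdadm in the installer shell
        create_array_cmd("/dev/md0", 10, 1),     # fresh RAID10 for the new OS
        assemble_array_cmd("/dev/md2", 3),       # existing RAID5, left untouched
    ]
    for step in steps:
        print(shlex.join(step))
```

The distinction that matters is between creating and assembling: only the OS array is created afresh, while the data array is merely assembled from its existing members.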

Many years ago, when I worked for IBM as an SE, I used to extol to all my S/360 customers the virtues of VM for its ability to allow testing before going live; 25 years later this advice was still valid.

I set up a guest with four virtual disks on which I created RAID partitions (10, 0 and 5) to match the host configuration. I installed Fedora on the RAID10 array and mounted the RAID5 partition as /data, exactly matching the host OS configuration. Next I made a copy of the disks so I could repeat the migration process several times. Then I started the guest using the Ubuntu installer ISO image and went through the installation process. When I finished I restored the Fedora disks and started again. After running through the installation three times, to a point where I felt fairly comfortable, I did the same thing on the bare iron - and to my surprise and delight it worked like a charm. The RAID10 array was reformatted with the new OS while the RAID5 array was left untouched and simply reassembled.

This kind of major change close to the hardware would normally have been the cause of a huge amount of grief - for example, discovering that there is no RAID10 support in the Ubuntu installer might have thrown me for a loop if I had been working directly on the host. It's also a one-shot game, as backing out of an installation is impossible once any partition changes have been made. Testing the procedure in advance with VMware was invaluable. The feeling I had was that of the magician who's just pulled the tablecloth out from under a fully set table of dishes, glasses and cutlery.

It turns out that Fedora was the cause of the problem. With Ubuntu 8.04 LTS, the system hasn't frozen in 6 weeks. I've used Fedora since the first release of Core 1. It feels odd not to be using it, but at the end of the day I have other work to do.
