Moar RAID Win

Had to upgrade my storage over the past week and decided to see just how good Linux software RAID (mdadm) and SATA hot plugging are these days. The existing setup was a five-disk array with two 1TB and three 750GB drives attached to an ASUS M3A76-CM motherboard in AHCI mode. My first issue was that all five drives were mounted internally, so replacing the three disks would have meant mucking about in the case. I dug around and ended up purchasing a Supermicro CSE-M35T-1B so I could swap the disks out easily.

After getting the original drives shifted into the new rack, it was as easy as 1-2-3(ish) to replace the drives (with the filesystem live):

  1. mdadm /dev/md1 --fail $PARTITION --remove $PARTITION
  2. Replace disk, create partition
  3. mdadm /dev/md1 --add $PARTITION, resync
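
For reference, the steps above can be wrapped up roughly like this. It's only a sketch: the run helper and the DRY_RUN switch are my own additions for illustration (not mdadm features), and /dev/md1 and the partition name are placeholders for whatever your setup uses. With DRY_RUN=1 it only prints the commands.

```shell
#!/bin/sh
# Sketch of the disk swap. MD is a placeholder for your array device.
# DRY_RUN is a hypothetical guard for this example: 1 = print only, 0 = execute.
MD=/dev/md1
DRY_RUN=1

run() {
    # Show the command, and only execute it when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = 0 ]; then "$@"; fi
}

retire_member() {
    # Step 1: mark the old partition as failed and remove it from the array.
    run mdadm "$MD" --fail "$1" --remove "$1"
}

add_member() {
    # Step 3: add the new partition; the resync starts on its own.
    run mdadm "$MD" --add "$1"
}

# Typical order (as root, with DRY_RUN=0):
#   retire_member /dev/sdc1
#   ...swap the disk, create the partition...
#   add_member /dev/sdc1
#   cat /proc/mdstat        # watch the resync
```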

Each resync after the replacement took 3-4 hrs and the fan in the rack kept the disks no warmer than 32C. After the final disk was done and synced, I expanded the array to include the new space:
mdadm --grow /dev/md1 --size=max
Then let lvm know the new space was usable:
pvresize /dev/md1
And ran vgdisplay to see my extra ~940GB.
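
If you'd rather script the "wait until the last resync is done, then grow" part, something like the following should do. It's a rough sketch: sync_in_progress is a helper name I made up, and it just looks for a resync, recovery or reshape line in /proc/mdstat (or in whatever file you hand it).

```shell
#!/bin/sh
# Returns 0 while the given mdstat file shows a resync, recovery or reshape.
# Defaults to /proc/mdstat; the file argument is only there to make testing easy.
sync_in_progress() {
    grep -Eq '(resync|recovery|reshape) *=' "${1:-/proc/mdstat}"
}

# Usage sketch (as root), once the last replacement disk has been added:
#   while sync_in_progress; do sleep 60; done
#   mdadm --grow /dev/md1 --size=max
#   pvresize /dev/md1
#   vgdisplay
```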

Gawd it’s nice when stuff Just Works.

RAID 5 via mdadm

Finally got the SATA PCI cards (SIIG SC-SA4011) in at work today so I was able to go ahead and build the new RAID array.

The first hurdle was to figure out what chipset the SATA cards use, but after removing the SIIG sticker I found a nice SiI 3114 chip, which is well supported by the sata_sil module in 2.6.x kernels. After enabling it in the kernel (CONFIG_SCSI_SATA_SIL) along with SCSI disk support and rebooting, I got a nice trio of disk devices (/dev/sd[abc]) and was able to move on to partitioning.

As I noted in my previous RAID 1 post, it’s pretty essential to type the RAID partitions as Linux raid autodetect (type fd in cfdisk) so udev sets up the device nodes correctly. Other than that, just make sure you have at least three disks with partitions of approximately the same size and you should be OK. In my case I just created one big partition on each of my three drives since this is their only purpose in life…
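
The reason the partitions should be about the same size: RAID 5 only uses as much of each member as the smallest one offers, and one member's worth of space goes to parity, so usable space is (members - 1) x smallest. A quick sketch of that arithmetic (raid5_usable is just an illustrative helper name, and the sizes are whole numbers in whatever unit you like):

```shell
#!/bin/sh
# Prints the usable capacity of a RAID 5 array given its member sizes.
# Usable space = (number of members - 1) * smallest member.
raid5_usable() {
    n=$#
    if [ "$n" -lt 3 ]; then
        echo "RAID 5 needs at least three members" >&2
        return 1
    fi
    min=$1
    for size in "$@"; do
        if [ "$size" -lt "$min" ]; then min=$size; fi
    done
    echo $(( (n - 1) * min ))
}

# Example: two 750GB partitions plus one 1000GB partition
#   raid5_usable 750 750 1000
# The extra 250GB on the bigger member is simply left unused.
```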

All that needs to be done to create the array is run ‘mdadm --create /dev/md3 -l 5 -a --raid-devices=3 /dev/sd[abc]1’ and you should see the new array being built in /proc/mdstat…

mdadm RAID 1 Migration

I’m migrating our work server over to a complete RAID setup (software) and needed to find a way to migrate the system partitions over to RAID 1 with a minimum of hassle. I found this how-to and was able to follow most of it verbatim for my test run, but I thought I’d post the actual steps I followed here for future reference.

I had a spare drive that exactly matched the drive currently in my office machine (Seagate ST3200822A), so I just did a ‘parted /dev/hda print’ and copied the exact start/end of each partition over to the new drive. Edit: Make sure you set the ‘Linux Raid’ flag on the partitions being used for the arrays. At boot (or module insertion), the kernel will automagically set up the arrays and needed devices, making the Gentoo init patch below completely unneeded. Doh!

I then created a degraded RAID 1 array by running ‘mdadm --create /dev/md0 --level 1 --raid-devices=2 missing /dev/hdb5’ and verified it had been set up using ‘cat /proc/mdstat’.
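
What you're looking for in /proc/mdstat at this point is a status like [2/1] [_U]: two slots, one active, with the underscore marking the missing member. A small sketch of that check, if you want it scripted (is_degraded is a name I made up; it reads /proc/mdstat unless you pass another file):

```shell
#!/bin/sh
# Returns 0 if the named array shows a missing member ("_") in its status
# brackets, e.g. "[2/1] [_U]". Returns 1 if it is healthy or not listed.
# $1 = array name (e.g. md0), $2 = mdstat file (defaults to /proc/mdstat)
is_degraded() {
    awk -v md="$1" '
        $1 == md && $2 == ":" { found = 1; next }     # the "md0 : active ..." line
        found && /blocks/ { if ($NF ~ /_/) deg = 1; exit }
        END { exit (deg ? 0 : 1) }
    ' "${2:-/proc/mdstat}"
}

# Usage sketch:
#   is_degraded md0 && echo "md0 is running with a member missing"
```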

I created my xfs filesystem on the new array and mounted it on /mnt/tmp so I could copy over my data from the live partition using ‘cd /home && tar cf - . | tar -C /mnt/tmp -xf -’ and grabbed a smoke while I waited for that to finish.
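
Spelled out, the copy step looks roughly like this. The copy_tree wrapper is only for illustration (my naming, same tar pipe as above), and the mkfs/mount lines in the comment assume the array is /dev/md0 as in this post.

```shell
#!/bin/sh
# Copies everything under $1 into $2 using a tar pipe.
# $2 must already exist (e.g. the mounted new array).
copy_tree() {
    (cd "$1" && tar cf - .) | tar -C "$2" -xf -
}

# Usage sketch (as root):
#   mkfs.xfs /dev/md0
#   mkdir -p /mnt/tmp && mount /dev/md0 /mnt/tmp
#   copy_tree /home /mnt/tmp
```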

The final step in creating the array was to add the old partition to the new array and rebuild it using ‘mdadm /dev/md0 -a /dev/hda5’. The rebuilding process took about 15 minutes for a 38G partition and a successful mount verified all had gone well.

Edit:
Just make sure you have your array partitions marked as above and the kernel md drivers do the rest, so the workaround below is not needed.
I decided to reboot to test that things had indeed gone as smoothly as I thought and had a rude awakening. I had forgotten to edit /etc/mdadm.conf and add my arrays, so after a quick edit I tried rebooting again and watched in horror as md0 was activated, but md1 and md2 failed. I did a bit of digging on the system and discovered that udev was creating the device nodes as /dev/md/x with a symlink from /dev/mdx at activation. mdadm currently has two issues with this: first, it won’t activate an array from a scan if the device node doesn’t exist, and second, the ‘--auto’ param pukes (instead of assuming it’s set up already) if it sees a symlink. I started to edit the checkfs init script to correct the issue manually, but found a patch on the Gentoo bugzilla that fixed everything…
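
Since forgetting the /etc/mdadm.conf entries is what bit me first, here is a rough sketch of a pre-reboot check: list the arrays that are running according to /proc/mdstat but have no ARRAY line in the config. The unlisted_arrays name is made up for this example, and it assumes the config refers to arrays as /dev/mdN.

```shell
#!/bin/sh
# Prints every array that is active in the mdstat file but has no matching
# "ARRAY /dev/mdN ..." line in the config file.
# $1 = mdstat file (normally /proc/mdstat), $2 = config (normally /etc/mdadm.conf)
unlisted_arrays() {
    awk '$2 == ":" && $1 ~ /^md[0-9]+$/ { print $1 }' "$1" |
    while read -r md; do
        grep -Eq "^ARRAY[[:space:]]+/dev/$md([[:space:]]|\$)" "$2" || echo "/dev/$md"
    done
}

# Usage sketch:
#   unlisted_arrays /proc/mdstat /etc/mdadm.conf
# 'mdadm --detail --scan' prints ARRAY lines for the running arrays, which is
# a handy starting point for the missing entries.
```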