Hivearchive Downtime

October 5, 2007

Had a bit of hardware trouble today: a hard drive failed. However, thanks to my sysadmin ninja skills, no data was lost and the RAID 1 array is rebuilding.

[root@nexus ~]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hdc1[2] hda1[0]
      79360960 blocks [2/1] [U_]
      [>....................]  recovery =  2.2% (1809408/79360960) finish=25.9min speed=49725K/sec

md0 : active raid1 hdc2[1] hda2[0]
      1052160 blocks [2/2] [UU]

unused devices: <none>
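As an aside, if you want more detail than /proc/mdstat gives, mdadm can report per-array status directly. A quick sketch, using the same array names as above:

# mdadm --detail /dev/md1      # state, rebuild progress and per-disk health
# watch -n 5 cat /proc/mdstat  # re-run the progress check every 5 seconds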

If you need to do this:

# dd if=/dev/hda of=/dev/hdc bs=512 count=1
# mdadm --manage /dev/md0 --add /dev/hdc2
# mdadm --manage /dev/md1 --add /dev/hdc1

In other words, copy the master boot record from the good drive to the new drive so you have the same partitions, then hot-add the new partitions to your array. WARNING DANGER DANGER WARNING. Back up all your data first, and test that your backups actually work. Change the partitions and drives to match your own situation; failure to do so will cause you to hose your system… That is all. Now if only Linux could do all this automatically like other sane operating systems.

Update: I’m getting a little suspicious that just copying the MBR from one hard drive to another messed up something with Linux’s software RAID. Sigh. This is exactly how I would do it in Solaris, but Linux has no great documentation on how to do it easily. Lazyweb?

Update, 2 Aug 2008: I did this again and this time realized that copying the MBR with dd works fine, but Linux needs to be explicitly told to rescan the partition table. I simply opened up the device with fdisk, checked that the partitions looked how I wanted, and then rewrote the partition table. fdisk then issues the ioctl that tells the kernel to rescan the partitions. Problem solved. :)
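For the record, here is the whole sequence with the missing rescan step added, this time non-interactively. A sketch rather than gospel: blockdev ships with util-linux, and the device names are still from my setup, so adjust them for yours.

# dd if=/dev/hda of=/dev/hdc bs=512 count=1   # copy the MBR, partition table included
# blockdev --rereadpt /dev/hdc                # ask the kernel to reread the partition table (same ioctl fdisk issues on write)
# mdadm --manage /dev/md0 --add /dev/hdc2     # hot-add the new partitions to the arrays
# mdadm --manage /dev/md1 --add /dev/hdc1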
