[ale] Disaster! (Raid)

David Corbin dcorbin at machturtle.com
Sat Apr 8 11:01:49 EDT 2006


Sigh.  Bad things happen.

I have a linux server at home that is not 'doing well'.  Let me describe how 
it was configured, and what I've seen.

Two drives, hda, and hdc.  Both were configured exactly the same.
hda1 => /boot (ext3)
hda2 => / (ext3)
hda3 => swap
hda4 => part of software-raid device (/dev/md0)
md0 => /data (reiserfs)
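
For a software-raid device like md0, the kernel reports the state of each member in /proc/mdstat, where a status field such as [UU] means all members are up and an underscore marks a missing one. The following is a minimal sketch of checking that from Python. The raid1 level, the block count, and the sample text are assumptions made for illustration (the post does not state the RAID level), and degraded_arrays is a hypothetical helper name.

```python
import re

def degraded_arrays(mdstat_text):
    """Return names of md arrays whose status field shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        # A line such as "md0 : active raid1 hda4[0] hdc4[1]" starts an array entry.
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status field looks like [UU] (healthy) or [U_] (one member missing).
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
            current = None
    return degraded

# Illustrative sample only; on a live system read the text from /proc/mdstat.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 hda4[0]
      39061952 blocks [2/1] [U_]

unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On a running box the same function could be fed open("/proc/mdstat").read() instead of the sample string.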

Normally, hdc1 and hdc2 are not mounted, and I would periodically do "block 
copies" from hdaX to hdcX.
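
A "block copy" here means a raw byte-for-byte copy of one partition onto another, the sort of thing dd does. The sketch below shows the idea in Python under some assumptions: block_copy is a hypothetical helper, not the command actually used on this server, the destination must already exist and be at least as large as the source, and neither partition should be mounted read-write while it runs.

```python
import os

def block_copy(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src to dst byte-for-byte in fixed-size chunks; return bytes copied."""
    copied = 0
    # "r+b" writes into the existing destination without creating a new file,
    # which is what is needed when the destination is a device node.
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            copied += len(chunk)
        # Push the data out to the disk before reporting success.
        dst.flush()
        os.fsync(dst.fileno())
    return copied
```

Hypothetical usage, run as root: block_copy("/dev/hda1", "/dev/hdc1") for /boot, then the same for hda2 and hdc2.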

I have a third drive, just like them, put away in the safe deposit box, and I 
swap one out every now and then.

Kernel 2.4.19, Debian.

The intent was for this server to be reasonably safe in holding my data without 
any manual backup step.

This setup has worked great since Nov 2002.  The only oddity that I've 
seen is that one time I had a failure where it would not boot off of 
my /dev/hda:  If I remember right, the BIOS wouldn't recognize that drive on 
that controller, but I swapped the two drives, and it worked fine, including 
the raid stuff.  That was at least 2 years ago.

============
This morning (after the storm went through), I noticed the system had rebooted 
(I thought it was on a UPS, but I suppose it might not be).  It was stuck at 
the BIOS - my CMOS battery is dead.  No problem.  Press F1, and I get "LI" 
from the LILO prompt.  

I try a few things, and now it looks like hdc is dead.  It spins, does not go 
"ker-clunk", but is not recognized by the BIOS.  I tried it in another box too.

OK.  The boot block is messed up.  I pull hda, put it in another system, boot 
from CDROM (I can't boot from a CDROM in this system because of the dead CMOS 
battery).  I rewrite the boot block, and put the drive back.  
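
One quick sanity check after rewriting a boot block is to read back the first 512-byte sector and confirm it ends with the standard PC boot signature bytes 0x55 0xAA. The sketch below assumes a classic MBR-style disk. Both function names are hypothetical, and a passing check only shows that a boot sector is present, not that LILO's second stage or map file is correct.

```python
SECTOR_SIZE = 512

def read_first_sector(path):
    """Read the first sector of a disk device or image file."""
    with open(path, "rb") as f:
        return f.read(SECTOR_SIZE)

def has_boot_signature(sector):
    """Return True if the sector ends with the 0x55 0xAA boot signature."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("expected at least 512 bytes")
    # The signature occupies the last two bytes of the first sector.
    return sector[510:512] == b"\x55\xaa"
```

Hypothetical usage, run as root: has_boot_signature(read_first_sector("/dev/hda")).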


