[ILUG] Quick question

Conor Wynne weeboy at conorwynne.com
Thu Mar 3 19:47:24 GMT 2005


> Hi folks,
>
> I've run into a slight problem here, namely a dead Linux cluster!

What type of cluster?

> Basically, on start-up I am being informed that one of the disks in the
> RAID is dead or dying.
> That's fair enough.. however, the system then goes on to boot and hits a
> kernel panic and stops booting.

What's the panic say then?

> I've a sneaking feeling that we've got some corruption in the root file
> system.
> Obviously this isn't exactly great news. However... at the moment I have
> one quick and simple question.
> Should I try and get in as a single user or using the rescue environment
> to see what I can do now (FSCK?)
> or would it be more prudent to wait until I replace the dodgy disk in the
> RAID?

Think of a degraded RAID array as simply a LUN which has lost its redundancy.
Doing an fsck will simply add more I/O (during a reconstruction), so the
question should be:

Can you afford to wait for the new disk? Does this have to go immediately
back into production? Do you have any (verified) backups in case it goes
tits-up?

I recently dealt with such an issue - but the entire LUN was gone due to
silliness, and when they went to restore from backup, they discovered the
backups were giving I/O errors. Luckily they had a secondary backup which
worked grand.
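
A minimal sketch of what "verified" could mean at the most basic level,
assuming the backup is an ordinary file on a mounted filesystem: read it
end to end so that any I/O error shows up before you need the restore.
The function name verify_readable and the chunk size are made up for
illustration. A clean read only shows the media is readable, not that the
contents will restore correctly.

```python
import hashlib


def verify_readable(path, chunk_size=1024 * 1024):
    """Read the whole file once.

    Returns (ok, bytes_read, detail), where detail is the SHA-256 hex
    digest on success or the error message on failure.
    """
    digest = hashlib.sha256()
    total = 0
    try:
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                digest.update(chunk)
                total += len(chunk)
    except OSError as exc:
        # Read errors from failing media surface here as OSError,
        # as does a missing or unreadable file.
        return False, total, str(exc)
    return True, total, digest.hexdigest()
```

If a digest was recorded when the backup was taken, comparing it against
the one returned here would also catch silent corruption, not just read
errors.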

> Cheers
> Austin

-- 
Conor Wynne
http://www.conorwynne.com/
YZF-R6 http://yzf-r6.kicks-ass.net


