[ale] 9.10 smart errors

Jeff Lightner jlightner at water.com
Mon Nov 2 09:59:10 EST 2009


What's really funny is when a major DBMS maker tells you the reason for
your DB failure was that they could predict a failure on a disk before
both the hardware and the OS could.  They did that to a previous employer
of mine.  I ridiculed what I called their psychic software claim, then
asked: even if it were true, in what conceivable scenario would it be
appropriate to "proactively" crash and corrupt the DB based on such a
prediction, as had happened to us?  They quickly changed the discussion
to more mundane causes of crash and corruption.  It always made me
wonder if they'd actually gotten away with that BS at other places.

As you might imagine, that event was the final straw that made management
there decide the vendor simply wasn't scalable enough and move to a
completely different product.

-----Original Message-----
From: ale-bounces at ale.org [mailto:ale-bounces at ale.org] On Behalf Of Mike
Harrison
Sent: Monday, November 02, 2009 9:26 AM
To: Atlanta Linux Enthusiasts - Yes! We run Linux!
Subject: Re: [ale] 9.10 smart errors

> SMART may not be as smart as everyone thinks.

In the old days, I used to be good at predicting drive failure.

You could hear them.. as the bearings started to fail
or heads seeked a lot trying to get data off/on.
I could walk the racks of the colo room and
hear the whines and clicks of imminent failure.

Sometimes you'd get days or months of errors
in the log files, seek errors and more.

I've had more than one drive that'd be fine
if you helped spin it up from a cold start with a
pencil eraser on the spindle (back when they
were exposed).

Luckily, they seem to fail a lot less often than they used to.
I haven't had a production machine (< 3 years old) drive fail
in a loooong time. Except for one 2.5" server drive in a strange place
after multiple power failures, lack of AC and other issues.

But when they do fail now, they seem to instantly transmute
into small bricks. Fewer warnings, less notice and no chance of
recovery. They no longer spin up or work after a cool-off cycle.

I miss the whine of a failing hard drive bearing, but not much..
not very much at all.

At least I don't park/lock the heads with a little lever anymore.
(Data General Nova III w/ a 5MB HD and others...)




 


