[ale] Bad SATA interactions

Michael Trausch mike at trausch.us
Sun Nov 4 12:38:31 EST 2012


So I have had an interesting few days... Aside from being sick, I have
also had an odd problem appear.

I changed motherboards recently to test out UEFI and so forth. When I did,
I started having the kind of problems that traditionally scream "memory
errors", except my RAM was just fine.

I hadn't immediately thought to check the drive's SMART log because I am
used to distributions signaling via the UI when such events happen. Well,
it turns out that Fedora doesn't do SMART monitoring by default!

Apparently I had a bad SATA cable (I am running tests now to see if the new
cable actually fixes it). The symptom was a UDMA CRC error count through
the roof: the drive detected each CRC error and aborted the corresponding
command.
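
For anyone who wants to check for the same symptom, here is a rough sketch
of pulling that counter out of `smartctl -A` output. It assumes
smartmontools is installed, that the drive reports the attribute under the
usual name UDMA_CRC_Error_Count (ID 199 on most ATA drives), and that the
raw value is a plain integer. The function names are made up for
illustration.

```python
import subprocess

# Assumption: the drive reports this attribute name (ID 199 on most ATA drives).
ATTRIBUTE_NAME = "UDMA_CRC_Error_Count"


def parse_crc_error_count(smartctl_output):
    """Return the raw UDMA CRC error count from `smartctl -A` text, or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have ten columns:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[1] == ATTRIBUTE_NAME:
            return int(fields[9])
    return None


def read_crc_error_count(device):
    """Run `smartctl -A` on the device (usually needs root) and parse the result."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return parse_crc_error_count(result.stdout)
```

Something like `read_crc_error_count("/dev/sda")` run before and after a
cable swap should show whether the count is still climbing.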

I mention this as we recently had a thread on silent corruption.

So, to the question part: even with smartctl and friends not installed and
running, shouldn't modern file systems be storing checksums to catch this
sort of thing and report it clearly, rather than through obscure errors? I
thought that ext4 had such support, but I appear to be incorrect there.
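
To make it concrete what I mean by "storing checksums", here is a toy
sketch of the idea: keep a CRC32 next to each block when it is written and
verify it on every read. This is only an illustration and not how ext4 or
any real file system lays out its checksums. The function names and the
in-memory "block" are hypothetical.

```python
import zlib


def write_block(data):
    """Return the block together with the CRC32 to store alongside it."""
    return data, zlib.crc32(data)


def read_block(data, stored_crc):
    """Return the block if its CRC32 still matches, otherwise raise."""
    if zlib.crc32(data) != stored_crc:
        raise IOError("checksum mismatch: block is corrupt")
    return data


def flip_bit(data, byte_index, bit):
    """Simulate corruption in transit by flipping one bit of the block."""
    corrupted = bytearray(data)
    corrupted[byte_index] ^= 1 << bit
    return bytes(corrupted)
```

With this, reading back a block that went through `flip_bit` raises a
clear error instead of quietly handing back bad data.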