[ale] read after write verify?, data scrubbing procedures

mike at trausch.us
Thu Oct 25 23:25:06 EDT 2012


On 10/25/2012 04:43 PM, Ron Frazier (ALE) wrote:
> Then, there is the question of the data scrubbing on the source
> drive.  In this case, once I've completed a backup, I will have read
> each sector on the source drive.  Assuming there are no read errors
> (if there were, I'd have to get out the big guns), then this has
> accomplished half of what my scrubbing does: the read half.

This is only true if you always back up every single block on your 
device.  The Linux RAID self-scrubbing process, however, will read every 
single block on every single drive in the RAID unit, whether the block 
is in use or not.
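
For reference, you can kick one of these scrubs off by hand on a
typical md setup; /dev/md0 here is just a placeholder for whatever
your array is called:

	# Ask the md driver for a read-only consistency check;
	# progress shows up in /proc/mdstat.
	echo check > /sys/block/md0/md/sync_action

	# When it finishes, this counts the inconsistencies it found:
	cat /sys/block/md0/md/mismatch_cnt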

If you want to implement the same scrubbing process, but without the
use of RAID, you can simply have a script that runs once per week,
dd'ing all of the bits of your HDDs to /dev/null; if any of the dd
processes returns with an error code, you know that it encountered a
Bad Thing™ and can investigate then.
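
A minimal sketch of such a script, with made-up device names (list
your own drives instead):

	#!/bin/sh
	# Weekly read-scrub: read every block of each drive and
	# complain if any read fails.  dd exits nonzero on a read
	# error, which is what we key off of here.
	for dev in /dev/sda /dev/sdb; do
	    if ! dd if="$dev" of=/dev/null bs=1M; then
	        echo "read error on $dev -- time to investigate" >&2
	    fi
	done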

You don't get the assurances that come with the RAID self-scrub, since 
the RAID self-scrub also does other higher-level things, like verifying 
that mirrors are actually mirrored and that parity is correct.  This, 
too, is read-only; however, it detects errors ranging from "corrupted 
data came through the cable and landed on the disk" to "there are bad 
sectors, meaning this drive is out of remapping space or is lagging in 
remapping and will therefore soon run out of remapping space".

Without RAID, you'll only get protection from the latter.

Even then, you can only *detect* the latter without RAID, not correct 
for it, both at the block level (for more expensive block devices) and 
at the filesystem level.  Or you could even do so by taking md5sums of 
all your files and comparing the md5sums when you read the files back.  
However, then you have to reconcile the list of changed files with... 
you know, it just becomes a lot easier to use RAID.  :-)
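
If you do go the checksum route, GNU md5sum can both generate and
verify a manifest; a rough sketch, where /data and the manifest path
are just placeholders:

	# Build a manifest of checksums for every file under /data:
	find /data -type f -exec md5sum {} + > /root/data.md5

	# Later, re-read each file and compare against the manifest;
	# a nonzero exit status means something changed or failed to
	# read:
	md5sum --quiet -c /root/data.md5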

Interestingly enough, the Wikipedia article on mdadm has a little 
tutorial that shows you how to create a RAID 5 device on Linux:

	http://en.wikipedia.org/wiki/Mdadm

All you need are three drives and one command, and you get a single 
logical drive with (slightly) increased robustness.
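
Something like this, where the device names are made up (use your own
spare drives or partitions):

	# Create a 3-drive RAID 5 array as /dev/md0:
	mdadm --create /dev/md0 --level=5 --raid-devices=3 \
		/dev/sdb /dev/sdc /dev/sdd

After that, /dev/md0 can be formatted and mounted like any single
drive.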

Oh, and in a RAID, keep the members down to less than 1 TB.  Unless 
things have drastically changed in HDD manufacturing, the odds of 
catastrophic failure go way up with size; the last I remember hearing, 
720-1000 GB was about the far end of "maybe" in terms of RAID use.  
I've had good luck with 750 GB array members (5 of them, in a RAID 6) 
myself.

	--- Mike

-- 
A man who reasons deliberately, manages it better after studying Logic
than he could before, if he is sincere about it and has common sense.
                                    --- Carveth Read, “Logic”

