[ale] read after write verify?, data scrubbing procedures

Ron Frazier (ALE) atllinuxenthinfo at techstarship.com
Fri Oct 26 10:20:49 EDT 2012


see below

"mike at trausch.us" <mike at trausch.us> wrote:

>On 10/25/2012 04:43 PM, Ron Frazier (ALE) wrote:
>> Then, there is the question of the data scrubbing on the source
>> drive.  In this case, once I've completed a backup, I will have read
>> each sector on the source drive.  Assuming there are no read errors,
>> (If there were, I have to get out the big guns.)  then, this has
>> accomplished 1/2 of what my scrubbing does, the read half.
>
>This is only true if you always back up every single block on your
>device.  The Linux RAID self-scrubbing process, however, will read
>every single block on every single drive in the RAID unit, whether the
>block is in use or not.
>
>If you want to implement the same scrubbing process, but without the
>use of RAID, you can simply have a script that runs once per week,
>dd'ing all of the bits of your HDDs to /dev/null and if any of the dd
>processes return with an error code, you know that it encountered a
>Bad Thing™ and you investigate then.
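
For reference, here is a minimal sketch of that kind of read-only scrub, written in Python rather than as a dd loop.  The function names and the device paths in the usage note are my own placeholders, not anything from Mike's setup.  It reads every byte of each device or file, throws the data away, and reports any path where the OS returned a read error.

```python
def scrub(path, chunk_size=1024 * 1024):
    """Read every byte of `path` and discard it.

    Returns (bytes_read, error), where error is None on a clean pass
    or the OSError raised by the failed open/read.
    """
    total = 0
    try:
        # buffering=0 gives a raw, unbuffered file object.
        with open(path, "rb", buffering=0) as dev:
            while True:
                chunk = dev.read(chunk_size)
                if not chunk:  # empty read means end of device/file
                    break
                total += len(chunk)
    except OSError as exc:
        return total, exc
    return total, None


def scrub_all(paths):
    """Scrub each path; return a list of (path, bytes_read, error) failures."""
    failures = []
    for path in paths:
        total, err = scrub(path)
        if err is not None:
            failures.append((path, total, err))
    return failures
```

Something like scrub_all(["/dev/sda", "/dev/sdb"]) (hypothetical device names, and it would need root) could be called from a weekly cron job, with any non-empty result treated as the cue to investigate.  Like the dd approach, this only proves the blocks were readable; it says nothing about whether the data is what was originally written.
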
>
>You don't get the assurances that come with the RAID self-scrub, since
>the RAID self-scrub also does other higher-level things like verify
>that mirrors are actually mirrored and that parity is correct.  This,
>too, is read-only: however, it detects errors ranging from "corrupted
>data came through the cable and landed on the disk" to "there are bad
>sectors, meaning this drive is out of remapping space or is lagging in
>remapping and will therefore soon run out of remapping space".
>
>You'll only get protection from the latter without RAID.
>
>You can *detect* the latter without RAID, but not correct for it, both
>at the block level (for more expensive block devices) and at the
>filesystem level.  Or you could even do so by taking md5sums of all
>your files and when you read the files back, comparing the md5sums.
>However, then you have to reconcile the list of changed files with...
>you know, it just becomes a lot easier to use RAID.  :-)
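
The md5sum idea above could be sketched roughly as follows.  This is only an illustration under my own assumptions: the function names are made up, and the manifest is simply a Python dict mapping relative file paths to MD5 hex digests.  Build one manifest when the files are written, build another when they are read back, and diff the two.

```python
import hashlib
import os


def file_md5(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its MD5 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_md5(full)
    return manifest


def diff_manifests(old, new):
    """Return (missing, added, changed) lists of relative paths."""
    missing = sorted(set(old) - set(new))
    added = sorted(set(new) - set(old))
    changed = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return missing, added, changed
```

The "changed" list is exactly the reconciliation problem Mike mentions: the digests cannot tell a deliberate edit from silent corruption, so each entry still has to be checked by hand or against modification times.
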
>
>Interestingly enough, the Wikipedia article has a little tutorial that 
>shows you how to create a RAID 5 device on Linux.
>
>	http://en.wikipedia.org/wiki/Mdadm
>
>All you need are 3 drives, and one command, and you get a single
>logical drive with (slightly) increased robustness.
>
>Oh, and in a RAID, keep the members down to less than 1 TB, unless
>things have drastically changed in HDD manufacturing, the odds of
>catastrophic failure go way up with size, and the last I remember
>hearing, 720-1000 GB was about the far end of "maybe" in terms of RAID
>use.  I've had good luck with 750 GB array members (5 of them, in a
>RAID 6) myself.
>
>	--- Mike
>
>-- 
>A man who reasons deliberately, manages it better after studying Logic
>than he could before, if he is sincere about it and has common sense.
>                                    --- Carveth Read, “Logic”

Hi Mike T,

Thanks for all this info and the link.  I'll take a look at that.  That last part about 1 TB max drives is good to know too.  Had these drives I'm replacing not been under warranty, I would certainly have been tempted to get 2 or 3 TB drives.  I understand the BIOS may have a problem with anything over 2 TB, and you have to do funky things to make it work, but I haven't studied the details.  I hope Seagate doesn't give me grief because one of the drives doesn't show many bad (reallocated) sectors according to SMART.

Let's say that I have two drives which I've mirrored, not in the sense of running live with RAID 1, but just in the sense of having the same stuff on both drives.  This is actually the case with the 1 TB drives I've been discussing.  I have duplicated everything on drive A to drive B with MS SyncToy.  I'm sure there are tools which would do the same thing in Linux.  So, all the files should be the same on each, except for a few hidden files, the SyncToy tracking database, etc.  In this case, they're data-only drives, so I don't care about the MBR.  As we've discussed, there was most likely (I don't know for sure) no read-after-write verification when the duplicating process was done.

So, the next question this brings up is: how do I compare every file on drive A with those on drive B, making sure that each one exists and that each one is the same?  In this case, both drives are NTFS, one big partition, and nothing but data, i.e. no OS.

However, the drives which I run on DO have 2 OS's, as well as data, on both NTFS and EXT4 partitions.  So, when I clone one of those for backup purposes, whether with Acronis, dd, or whatever, how can I verify that the clone really is a clone?  Essentially, by running Spinrite or badblocks on the backup drive before using it (which I don't do every time, but periodically), I have verified that the drive CAN be written successfully, but I have nothing to REALLY show that the clone DID write successfully.  In this case, I would be interested in all the contents: MBR, hidden and system files, etc.

When I use Acronis TrueImage to clone, it copies NTFS partitions file by file, the MBR (probably) sector by sector, though I'm not sure, and the EXT4 partition sector by sector.  So, the NTFS partitions should have exactly the same files, but may not be identical sector copies.  That's OK, as it will still work if I have to put the backup drive into use.

But, back to the original question: how do I verify the clone?  I guess I could put it in and boot it up, but that's somewhat of a pain, and it still wouldn't completely verify it.  I have both Windows and Linux set up to use page files instead of a swap partition.  Windows does that anyway.  Windows also has a hibernate file that comes into play if you hibernate the computer.  So, I guess those would need to be verified too.
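
To make the question concrete, here is a rough Python sketch of the kind of check I have in mind.  It is an assumption on my part about how this could be done, not something any of the tools above provide.  All the names are made up.  It walks two directory trees, reports files that exist on only one side, and byte-compares the files that exist on both.

```python
import os


def same_bytes(path_a, path_b, chunk_size=1024 * 1024):
    """Return True if both paths yield identical byte streams."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:  # also catches differing lengths
                return False
            if not chunk_a:  # both streams ended together
                return True


def list_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def compare_trees(root_a, root_b):
    """Return (only_in_a, only_in_b, different) lists of relative paths."""
    files_a, files_b = list_files(root_a), list_files(root_b)
    only_a = sorted(files_a - files_b)
    only_b = sorted(files_b - files_a)
    different = sorted(
        p for p in files_a & files_b
        if not same_bytes(os.path.join(root_a, p), os.path.join(root_b, p))
    )
    return only_a, only_b, different
```

With both data drives mounted, compare_trees("/mnt/a", "/mnt/b") (hypothetical mount points) would cover the file-by-file case.  For a true sector-by-sector clone, same_bytes could in principle be pointed at the two unmounted device nodes instead, but it will report a mismatch if the devices differ in size or if the clone was made file by file, as with the NTFS partitions under Acronis.
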

Thanks for all the help.

Sincerely,

Ron




--

Sent from my Android Acer A500 tablet with bluetooth keyboard and K-9 Mail.
Please excuse my potential brevity.

(To whom it may concern.  My email address has changed.  Replying to former
messages prior to 03/31/12 with my personal address will go to the wrong
address.  Please send all personal correspondence to the new address.)

(PS - If you email me and don't get a quick response, you might want to
call on the phone.  I get about 300 emails per day from alternate energy
mailing lists and such.  I don't always see new email messages very quickly.)

Ron Frazier
770-205-9422 (O)   Leave a message.
linuxdude AT techstarship.com
