[ale] rsync comparisons

Björn Gustafsson bg-ale at bjorng.net
Fri Apr 2 07:09:59 EDT 2010


The speedup comes from taking the "nfs mount on a server" out of the
equation.  Doing a stat() over NFS is a surprisingly expensive
operation, and you have eliminated over a million of them.
Presumably your SAN handles this much better than a traditional NFS
server does.

The distribution of work may also be a factor, since you are now
splitting work across two servers.  But most likely the majority of
the improvement is from removing the NFS overhead.
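To get a feel for the metadata cost, you can count how many entries rsync
has to walk; a quick sketch (it uses the current directory as a stand-in
for your /path/a):

```shell
# rsync lstat()s every entry while building its file list, so the number
# of entries in the tree is a lower bound on the metadata calls per pass.
# Over NFS each of those lstat()s is a synchronous network round trip.
# "." here is only a stand-in for the thread's /path/a.
entries=$(find . 2>/dev/null | wc -l)
echo "one rsync pass over this tree issues at least $entries lstat() calls"
```

With a million-plus files, that count is why "building file list..."
dominated the hour-long runs.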

On Fri, Apr 2, 2010 at 12:53 AM, Robert Coggins <ale at cogginsnet.com> wrote:
>
> Well, I have the sync running within a respectable amount of time.  The
> first way I was running the rsync was taking over an hour:
>
> rsync -a --delete --exclude=afolder /path/a /path/b
> where /path/a is an nfs mount on a SAN and /path/b is an nfs mount on a
> server
>
> But now with the following I am down to about 5 minutes:
>
> rsync -a -e ssh --delete --exclude=afolder /path/a server:/real/path/b
>
> Any ideas why this is going so much faster?
>
> Thanks!
> Rob
>
> On 4/1/10 4:50 PM, Björn Gustafsson wrote:
>> Oh, worst-case scenario: millions of files and NFS.  The overhead from
>> NFS stat() calls is probably 95% of your time.  If you can run where
>> *one* of the directories is on a local drive, that'll probably nearly
>> double the speed.
>>
>> On Thu, Apr 1, 2010 at 3:56 PM, Robert Coggins <ale at cogginsnet.com> wrote:
>>> Looks like it, but they are 2 NFS mounts on one box.
>>>
>>> On 04/01/2010 03:55 PM, Jeff Hubbs wrote:
>>>> Is this a box-to-same-box rsync?
>>>>
>>>> On 4/1/10 3:29 PM, Robert Coggins wrote:
>>>>> I am using -a --delete and --exclude.  Pretty basic.
>>>>>
>>>>> To be honest I think my problem is with the source.  Even an ls is
>>>>> very slow.  There are probably more than 1 million files in this 20GB.
>>>>>
>>>>> On 04/01/2010 02:49 PM, Björn Gustafsson wrote:
>>>>>
>>>>>> Which switches are you using now?  It doesn't sound like adding the
>>>>>> ones discussed so far will help.
>>>>>>
>>>>>> On Thu, Apr 1, 2010 at 11:41 AM, Robert Coggins <ale at cogginsnet.com> wrote:
>>>>>>
>>>>>>> Well, what I am seeing is the syncing of roughly 20GB taking over an hour
>>>>>>> for just a few megs of differences.  It stays in the "building file
>>>>>>> list..." for almost all of this time.  I am trying to find a way to
>>>>>>> speed that up.
>>>>>>>
>>>>>>> Rob
>>>>>>>
>>>>>>> On 04/01/2010 11:37 AM, scott wrote:
>>>>>>>
>>>>>>>> rsync compares at the file level, BUT in "quick mode" (the default)
>>>>>>>> it only compares time/date stamps and sizes.  If you want it to
>>>>>>>> compare file contents, use the "-c or --checksum" option.  This puts
>>>>>>>> a heavier load on both systems, since it computes an MD5 checksum of
>>>>>>>> every file that has the same time/date stamp and size on both sides
>>>>>>>> of the sync.  If you want to force copying of the whole file instead
>>>>>>>> of just the changed blocks, use the --whole-file option with it.
>>>>>>>>
>>>>>>>> I would use these (-c & --whole-file) sparingly.  They are going to
>>>>>>>> slow down the copies, put heavier loads on both ends, and transfer
>>>>>>>> more data (control data) back and forth.  I don't know your
>>>>>>>> situation, so I can't say whether to use them or not.
>>>>>>>>
>>>>>>>> scott
>>>>>>>>
>>>>>>>> On Thu, Apr 1, 2010 at 11:00 AM, Robert Coggins <ale at cogginsnet.com> wrote:
>>>>>>>>
>>>>>>>>> Is there a way to do file level comparisons and not block level
>>>>>>>>> comparisons using rsync?
>>>>>>>>>
>>>>>>>>> Rob

-- 
Björn Gustafsson


