[ale] very intermittent weird SCP failure?
Neal Rhodes
neal at mnopltd.com
Fri Aug 11 14:04:58 EDT 2017
Thanks for the reply. No, no chance of running out of space.
Yes, if I could find the original scp invocation, I could save the
return code, assuming scp exits with a different status when it doesn't
succeed. And as you note, I could run sum or md5sum and send that
along. I've got ssh host equivalence, so I could do some comparison.
Although as Jim noted, you can only chase squirrels so hard and so
long.
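
For what it's worth, here is a rough sketch of the "save the return code and compare md5sums" idea in Python. It is only an illustration under assumptions: the function names, the host name and the destination path are made up, key-based ssh login is assumed to work, and md5sum is assumed to be installed on the backend server. The `run` parameter exists only so the commands can be swapped out when trying the logic without a real server.

```python
import hashlib
import shlex
import subprocess


def local_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a local file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, host, dest, run=subprocess.run):
    """Copy src to host:dest with scp, then compare MD5 sums.

    Returns (ok, detail). `host` and `dest` are hypothetical placeholders
    for the backend server and the target path.
    """
    expected = local_md5(src)

    # scp exits with 0 on success and a non-zero status on error.
    scp = run(["scp", "-q", src, f"{host}:{dest}"])
    if scp.returncode != 0:
        return False, f"scp exited with status {scp.returncode}"

    # md5sum prints "<digest>  <filename>"; keep only the digest.
    remote = run(
        ["ssh", host, "md5sum " + shlex.quote(dest)],
        capture_output=True,
        text=True,
    )
    if remote.returncode != 0:
        return False, f"remote md5sum exited with status {remote.returncode}"
    fields = remote.stdout.split()
    actual = fields[0] if fields else ""
    if actual != expected:
        return False, f"checksum mismatch: local {expected}, remote {actual}"
    return True, "checksums match"
```

Something like `ok, detail = copy_and_verify("heartbeat.log", "backend", "/tmp/heartbeat.log")` would then give the cron job a status to log. The checksum step matters here because a file that arrives with the right size but a garbled front may not produce a failing scp status at all.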
On Mon, 2017-08-07 at 16:48 +0000, Lightner, Jeffrey wrote:
> Any chance the filesystem on the backend server has gone temporarily
> full around the time you do the scp’s that have the issue? You’d see
> it in /var/log/messages if it happened.
>
> One thing you could do to avoid finding this later and manually
> resending is to put your scp inside a script. Before the scp, run
> md5sum on the file being sent; after the scp, run md5sum (via ssh)
> on the backend server and compare the values. If they’re not the
> same, have the script resend and check the md5sum again. You could
> try it multiple times (e.g. 5, with appropriate pauses between
> attempts), then have it send email to you on the last failed attempt.
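
The retry part of that suggestion could be sketched roughly as follows, in Python. This is a hedged illustration, not a tested production script: `retry`, `action` and `notify` are hypothetical names, the 60-second pause is an arbitrary choice, and `notify` stands in for whatever actually sends the email (it defaults to `print` here). The `action` callable is assumed to do one transfer plus checksum comparison and return True on success.

```python
import time


def retry(action, attempts=5, pause_seconds=60, notify=print, sleep=time.sleep):
    """Call action() until it returns True, at most `attempts` times.

    Sleeps between attempts and calls notify() once if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        if action():
            return True
        # No point pausing after the final attempt.
        if attempt < attempts:
            sleep(pause_seconds)
    notify(f"transfer still failing after {attempts} attempts")
    return False
```

Keeping the notification behind a plain callable means the same loop works whether the alert goes out by mail, syslog, or just the cron job's output.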
>
> We use sftp/scp fairly heavily here on RHEL5/RHEL6, and I have to
> say I’ve never run into much trouble with the actual file transfers.
> Having said that, I usually prefer sftp, and it may have its own
> built-in retransmit of packets like the old ftp did. I’m not sure
> scp does that.
>
> From: ale-bounces at ale.org [mailto:ale-bounces at ale.org] On Behalf Of
> Neal Rhodes
> Sent: Monday, August 07, 2017 12:08 PM
> To: Atlanta Linux Enthusiasts
> Subject: [ale] very intermittent weird SCP failure?
>
> We have a client running a ----------------------------- application
> on three linux servers running
>
> Linux HDISATBE3 2.6.32-696.1.1.el6.x86_64 #1 SMP Tue Apr 11 17:13:24
> UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>
> The two primary event servers accumulate a file of
> connection/heartbeat activity, and once a week, a crontab job does an
> SCP of this file to the backend server, which imports these two files
> to calculate uptime. The respective user has ssh host equivalence,
> so this proceeds without password challenge.
>
> This has worked for about 10 years.
>
> Very occasionally, like, once in 3 months, we will find the copied
> file on the backend server is garbled, to wit:
>
> - total size is correct; matches source
> - the first XXX bytes of the file are NUL characters.
>
> Which hoses up everything. I usually figure out which file is
> boogered, re-do the scp by hand, and re-do the import. Then all is
> well.
>
> I am just bumfuzzled as to what would cause this. It's always on the
> front of the file.
>
> I should check and see exactly how many NULs, but usually when it
> happens my hair is on fire. I'm guessing about 512 or 1K.
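
Counting the leading NULs on a garbled copy is quick to do before re-sending it. A minimal Python sketch, assuming the bad file is still on disk; `count_leading_nuls` is a made-up helper name:

```python
def count_leading_nuls(path, chunk_size=65536):
    """Return how many bytes at the start of the file are NUL (0x00)."""
    count = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # Reached end of file: every byte so far was NUL.
                return count
            stripped = chunk.lstrip(b"\x00")
            count += len(chunk) - len(stripped)
            if stripped:
                # Hit the first non-NUL byte; stop counting.
                return count
```

If the count keeps coming out as a round block size such as 512, 1024 or 4096, that would at least narrow down where to look.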
>
> Neal Rhodes
> MNOP Ltd