[ale] Odd network issue

Lightner, Jeff JLightner at dsservices.com
Fri Jul 10 13:53:27 EDT 2015


We've now run more than 24 hours since changing the gateway to what it should be (the VLAN's default gateway) rather than the server's own IP, and haven't seen the issue come back. Given that it had come back every day this week (and more than once on one day), this may be resolved.

The only question now is, given that this server was installed in 2011, why did this suddenly become a problem this week? This past weekend they updated our core network switches and I suspect something in that update triggered this new behavior.

This was one of our original RHEL6 installations so I suspect the gateway configuration was a holdover from the way the hated NetworkManager originally configured things. We now turn that off on RHEL6 in favor of the older network service (and had at some point turned it off on this one).

From: ale-bounces at ale.org [mailto:ale-bounces at ale.org] On Behalf Of Lightner, Jeff
Sent: Thursday, July 09, 2015 11:03 AM
To: Atlanta Linux Enthusiasts
Subject: Re: [ale] Odd network issue

If by "machines" you mean the servers we are testing FROM, where one is able to ping the printer but the affected server can not, the answer is yes, they're in the same subnet/VLAN.

If you mean the affected IPs that the server can't ping, no. As I noted in the original post, they're in our WAN at various locations in the US, so they are not in the same subnet as the servers.

On checking the gateway just now I saw it was using the interface's IP as the default gateway instead of the default gateway we normally use for the VLAN. I don't really think that is the issue because, as noted, we can reach things after bouncing the interface, so it was using that gateway both before and after the bounce. I went ahead and changed it just now to rule it out.
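
A minimal sketch of one way to spot a default gateway that points at the host's own address, assuming a Linux host that exposes its routing table in /proc/net/route. The sample table, the addresses in it, and the function names are hypothetical and made up for illustration; they are not taken from the affected server.

```python
import socket
import struct

RTF_GATEWAY = 0x0002  # kernel route flag: destination is reached via a gateway

# Hypothetical sample in /proc/net/route format. Addresses are little-endian
# hex: 0A0A000A is 10.0.10.10, 000A000A is 10.0.10.0, 00FFFFFF is 255.255.255.0.
SAMPLE_ROUTES = (
    "Iface\tDestination\tGateway\tFlags\tRefCnt\tUse\tMetric\tMask\tMTU\tWindow\tIRTT\n"
    "eth0\t000A000A\t00000000\t0001\t0\t0\t0\t00FFFFFF\t0\t0\t0\n"
    "eth0\t00000000\t0A0A000A\t0003\t0\t0\t0\t00000000\t0\t0\t0\n"
)


def hex_to_ip(hex_addr):
    """Convert a little-endian hex address from /proc/net/route to dotted quad."""
    return socket.inet_ntoa(struct.pack("<L", int(hex_addr, 16)))


def default_gateways(route_text):
    """Return a list of (interface, gateway_ip) for every default route."""
    found = []
    for line in route_text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 8:
            continue
        iface, dest, gateway, mask = fields[0], fields[1], fields[2], fields[7]
        flags = int(fields[3], 16)
        if dest == "00000000" and mask == "00000000" and flags & RTF_GATEWAY:
            found.append((iface, hex_to_ip(gateway)))
    return found


def gateway_is_own_address(route_text, local_ips):
    """Return the default routes whose gateway is one of the host's own IPs."""
    return [(iface, gw) for iface, gw in default_gateways(route_text)
            if gw in local_ips]
```

On a real host the text could come from `open("/proc/net/route").read()`, with `local_ips` filled in from the addresses configured on the interfaces. A non-empty result from `gateway_is_own_address` would flag the kind of setting described above.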


From: ale-bounces at ale.org [mailto:ale-bounces at ale.org] On Behalf Of Jeremy T. Bouse
Sent: Thursday, July 09, 2015 9:34 AM
To: ale at ale.org
Subject: Re: [ale] Odd network issue

Are both machines on the same subnet? If not, are their default routes correct, and do the default gateways know how to get to each subnet? This sounds like behavior I've seen before when the default gateway route was incorrect and a host was trying to communicate with a server not on the same subnet.
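
The subnet question can be sketched with Python's standard `ipaddress` module. This is only an illustration under assumed values: the addresses, the /24 prefix, and the function names are hypothetical, not taken from the network being discussed.

```python
import ipaddress


def needs_gateway(source_with_prefix, destination):
    """True if destination is outside the source's subnet, so traffic to it
    has to go through a gateway rather than straight onto the local segment."""
    iface = ipaddress.ip_interface(source_with_prefix)
    return ipaddress.ip_address(destination) not in iface.network


def gateway_looks_sane(source_with_prefix, gateway):
    """Basic sanity check: the gateway should sit inside the source's subnet
    and should not be the source's own address."""
    iface = ipaddress.ip_interface(source_with_prefix)
    gw = ipaddress.ip_address(gateway)
    return gw in iface.network and gw != iface.ip
```

For example, with a hypothetical server at `10.0.10.10/24`, `needs_gateway("10.0.10.10/24", "172.16.5.9")` is true because the destination is in another subnet, while `gateway_looks_sane("10.0.10.10/24", "10.0.10.10")` is false because the gateway is the server's own address.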
On 7/9/2015 9:17 AM, Lightner, Jeff wrote:
We have a server that appears to be unable to connect to various IPs (mostly printers) in our WAN at times.

The odd thing is that the interface is not down on either the server or the printer. The server can reach multiple other IPs when this occurs but for some reason can't reach a few. Similarly, when this occurs we can reach the affected IPs from other servers in the same VLAN.

There are no errors shown in statistics on the interface (eth0) on the server nor on the port it is attached to on the switch. There are no errors shown in /var/log/messages or dmesg.
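
As a rough sketch of the interface check described above, the per-interface error counters on a Linux host can be read from /proc/net/dev. The sample text and its numbers below are invented for illustration, and the function names are hypothetical.

```python
# Column names as they appear in the /proc/net/dev header.
RX_FIELDS = ("bytes", "packets", "errs", "drop", "fifo", "frame",
             "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop", "fifo", "colls",
             "carrier", "compressed")

# Hypothetical sample in /proc/net/dev format (two header lines, then one
# line per interface with 8 receive and 8 transmit counters).
SAMPLE_NET_DEV = (
    "Inter-|   Receive                                                |  Transmit\n"
    " face |bytes    packets errs drop fifo frame compressed multicast|"
    "bytes    packets errs drop fifo colls carrier compressed\n"
    "    lo:    1000      10    0    0    0     0          0         0"
    "     1000      10    0    0    0     0       0          0\n"
    "  eth0:  987654    4321    0    0    0     0          0        12"
    "   123456    2345    0    0    0     0       0          0\n"
)


def parse_net_dev(text):
    """Parse /proc/net/dev text into {iface: {"rx": {...}, "tx": {...}}}."""
    stats = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, counters = line.split(":", 1)
        values = [int(v) for v in counters.split()]
        if len(values) < 16:
            continue
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, values[:8])),
            "tx": dict(zip(TX_FIELDS, values[8:16])),
        }
    return stats


def has_errors(stats, iface):
    """True if the interface shows any rx/tx errors, drops, or carrier losses."""
    rx, tx = stats[iface]["rx"], stats[iface]["tx"]
    return any((rx["errs"], rx["drop"], tx["errs"], tx["drop"], tx["carrier"]))
```

On a real host the text could come from `open("/proc/net/dev").read()`. A clean result here only says the counters are zero; it does not rule out a routing or ARP problem.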

The IPs are not always the same ones. In fact, when we saw the issue yesterday, I tested the IPs that the server couldn't reach the day before (which I had since resolved) and they were still working.


We've tried:
Killing the lp process that is hung at the start of this issue on server side.
Clearing arp on the server.
Clearing arp on the switch.
Bouncing cups (note issue is NOT just cups - when this occurs we can not ping the affected IPs nor can we telnet in on port 9100 as we would normally be able to do).
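
The port 9100 test mentioned in the list above can be approximated with a short script instead of telnet. This is a sketch only: the function names and the 3 second timeout are assumptions, and the addresses in the usage note are documentation placeholders, not the printers in question.

```python
import socket


def tcp_port_open(host, port=9100, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and no route
        return False


def unreachable(hosts, port=9100, timeout=3.0):
    """Return the subset of hosts that do not accept a TCP connection."""
    return [h for h in hosts if not tcp_port_open(h, port, timeout)]
```

Running something like `unreachable(["192.0.2.10", "192.0.2.11"])` from both the affected server and a working server in the same VLAN would give a quick side-by-side list of which printers each one can reach.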

The only thing that seems to resolve the issue (and does each time) is having the interface bounce on the server.
We've done that by:
Rebooting the server
Ifup/ifdown on the server interface
Removing and reseating the cable.
Resetting the switch port.
In each of those cases we see the port go down then come back up, and after that the previously unreachable IPs are again reachable.

I suspect a bad interface port on either the server or the switch, but in the absence of actual errors I can't prove it one way or the other.

This is different than any other network issue I've seen.   I'm wondering if anyone has run into this and can shed any light on it?

