[ale] Network issue that makes absolutely NO sense

Ryan Fish FishR at bellsouth.net
Thu May 11 23:54:56 EDT 2006


The first thing I did when I got the switch was update the firmware to
4.0.3.15.  However, it seems this is still an issue within the firmware.

In this case all three servers are in the same VLAN.  Two have gigabit NICs
and one (the Oracle server) does not.  Setting everything to 100BaseTx-FD
and forcing the ports on the switch to 100FD has allowed the jobs to run
simultaneously; however, I repeatedly lose connectivity to the NAS from both
boxes that are writing to it at this time.  Since the NFS file systems are
hard mounted, they keep going once the NAS can be contacted again, so at least
I am not losing data; it is just painfully slow to back up.
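
To pin down when the NAS drops out, the outages could be logged with
timestamps and lined up against the backup schedule.  The following is only a
minimal sketch.  It assumes the NAS accepts TCP connections on the standard
NFS port 2049 (an NFS-over-UDP-only setup would need a different probe), and
the function names are made up for illustration.

```python
import socket
import time
from datetime import datetime


def nas_reachable(host, port=2049, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def log_outages(host, port=2049, interval=5.0, checks=None):
    """Probe the NAS repeatedly and print a line on every state change.

    checks limits the number of probes (None means run until interrupted).
    Returns the list of (timestamp, reachable) transitions that were seen.
    """
    transitions = []
    last_state = None
    count = 0
    while checks is None or count < checks:
        state = nas_reachable(host, port)
        if state != last_state:
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} NAS {'reachable' if state else 'UNREACHABLE'}")
            transitions.append((stamp, state))
            last_state = state
        count += 1
        if checks is None or count < checks:
            time.sleep(interval)
    return transitions
```

Running log_outages("nas") on both boxes during a backup window (where "nas"
is a placeholder host name, not one from this thread) would show whether the
two machines lose the NAS at the same moments.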

-Ryan


-----Original Message-----
From: ale-bounces at ale.org [mailto:ale-bounces at ale.org] On Behalf Of Jerry Yu
To: ale at ale.org
Sent: Thursday, May 11, 2006 11:35 PM
To: Atlanta Linux Enthusiasts
Subject: Re: [ale] Network issue that makes absolutely NO sense

I wonder what version of firmware you have on that switch.  One of the
bug fixes in a newer firmware indicates that, as of the initial version
4.0.3.7, the switch itself doesn't cope well when the speed varies across
its ports.

Info can be found at http://kbserver.netgear.com/products/GSM7248.asp

Firmware Version 4.0.3.15       Published Feb. 10, 2006
Fixes
1. Fixed: Switch may crash and reboot if packets are forwarded from
lower 24 ports (1 to 24) to higher 24 ports (25-48) where the port
speeds are different, e.g., if one port is using 10M half duplex, and
the other is using 1000M bps full duplex.
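
The condition described in fix 1 can be checked against a list of which
server is plugged into which port.  This is a rough sketch only: the function
name and the port numbers in the usage note are hypothetical, and it tests
nothing more than the wording of the release note (one port in 1-24, the
other in 25-48, different speeds).

```python
def cross_bank_speed_mismatches(port_speeds):
    """List (low_port, high_port) pairs that match the release-note condition.

    port_speeds maps a switch port number (1-48) to its link speed in Mbit/s.
    A pair is reported when one port is in 1-24, the other is in 25-48,
    and the two speeds differ.
    """
    low = {p: s for p, s in port_speeds.items() if 1 <= p <= 24}
    high = {p: s for p, s in port_speeds.items() if 25 <= p <= 48}
    return [
        (low_port, high_port)
        for low_port, low_speed in sorted(low.items())
        for high_port, high_speed in sorted(high.items())
        if low_speed != high_speed
    ]
```

For example, with made-up ports {3: 1000, 7: 100, 30: 1000} the only pair
reported is (7, 30), since port 3 and port 30 run at the same speed.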



On 5/11/06, Jerry Yu <jjj863 at gmail.com> wrote:
>
> I am not familiar with that model. A rule of thumb is to set both switch and
> NIC to 'auto_negotiation' so that the best common speed/duplex combo can be
> automatically negotiated when the link comes up between the NIC and the
> switch port. Short of that, one would force a speed/duplex combo on both
> ends. The bottom line is that both ends should be in sync in terms of
> auto_negotiation (or manual/forced), speed (1000 or 100 or 10), and duplex
> (full or half).
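
The "both ends in sync" rule can be checked mechanically.  The sketch below
assumes the NIC side is read from the text printed by "ethtool <iface>" on
Linux and that the switch-port side is typed in by hand from the switch's
management interface; the function names and the dictionary keys are
illustrative choices, not part of any tool.

```python
def parse_ethtool(text):
    """Extract speed, duplex and auto-negotiation from ethtool output text."""
    wanted = {"Speed": "speed", "Duplex": "duplex", "Auto-negotiation": "autoneg"}
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Exact key match, so "Advertised auto-negotiation" is ignored.
        if sep and key.strip() in wanted:
            settings[wanted[key.strip()]] = value.strip().lower()
    return settings


def mismatches(nic, switch_port):
    """Return the names of the settings that differ between the two link ends."""
    return sorted(
        name
        for name in ("speed", "duplex", "autoneg")
        if nic.get(name) != switch_port.get(name)
    )
```

A NIC forced to 100Mb/s full duplex with auto-negotiation off, compared with
a switch port entered as {"speed": "100mb/s", "duplex": "full", "autoneg":
"on"}, would report ["autoneg"]: one end forced, the other negotiating.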
>
>
>
> On 5/11/06, Ryan Fish <FishR at bellsouth.net> wrote:
> >
> >
> >
> >
> >
> > The switch is a Netgear 7248.  It shows me that everything is in FD on
> > all used ports (although I had to force some of them to FD even though the
> > NICs are set at FD).
> >
> >
> >
> > My next step is going to be matching the NICs and switch ports at 100FD
> > because the NIC in the Oracle box can only go that fast.  I just have to
> > wait for a job to complete.
> >
> >
> >
> > Thank you.
> >
> > -Ryan
> >
> >
> >
> >   ________________________________

> >
> > From: ale-bounces at ale.org [mailto: ale-bounces at ale.org] On Behalf Of Jerry Yu
> >  Sent: Thursday, May 11, 2006 9:31 PM
> >  To: Atlanta Linux Enthusiasts
> >  Subject: Re: [ale] Network issue that makes absolutely NO sense
> >
> >
> >
> >
> > to have a nice smooth communication,
> >  1) nic and the switch port it connects to should have matching
> >     speed/duplex/auto_neg|manual
> >  2) two-end points of a switched communication should have matching
> >     speed and duplex.
> >
> >  what's the model of the new switch?  You may find speed/duplex/auto_neg
> >  settings per port on the switch itself.
> >
> >
> >
> >
> > On 5/11/06, Ryan Fish < FishR at bellsouth.net> wrote:
> >
> >
> >
> >
> > A bit more info that may be helpful:
> >
> >
> >
> > - The Oracle server only fails because it is unable to read from the
> > NAS.  This causes the IOWait on the processors to hit the high 90% range and
> > stay there until the box eventually is too busy to respond to requests from
> > the application that uses it.

> >
> >
> >
> > Is there some way to test if a switch is truly using Full Duplex on a
> > port?
> >
> > Does it make any difference if the NIC in the Oracle server is set to
> > 100FD (the highest it can go) and the NIC on the server running the other
> > backup scripts is set to 1000FD?  The NAS is set to 1000FD.  Is there
> > something in the way 100FD and 1000FD work that keeps them from being able
> > to truly work together properly?
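
One indirect way to approach the full-duplex question from the Linux side is
to watch the interface error counters.  On a link that really is full duplex
the collision counter should stay at zero, and a duplex mismatch typically
shows up as growing collision or frame/CRC error counts.  The sketch below
assumes the usual 16-column layout of /proc/net/dev; the function names are
illustrative, and it does not query the switch itself.

```python
RX_NAMES = ["bytes", "packets", "errs", "drop", "fifo", "frame",
            "compressed", "multicast"]
TX_NAMES = ["bytes", "packets", "errs", "drop", "fifo", "colls",
            "carrier", "compressed"]


def parse_proc_net_dev(text):
    """Parse /proc/net/dev text into {iface: {error_counter: int}}."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, _, rest = line.partition(":")
        fields = rest.split()
        if len(fields) < 16:
            continue
        values = [int(f) for f in fields[:16]]
        rx = dict(zip(RX_NAMES, values[:8]))
        tx = dict(zip(TX_NAMES, values[8:]))
        stats[name.strip()] = {
            "rx_errs": rx["errs"],
            "rx_frame": rx["frame"],
            "tx_errs": tx["errs"],
            "tx_colls": tx["colls"],
            "tx_carrier": tx["carrier"],
        }
    return stats


def counter_growth(before, after, iface):
    """Return how much each error counter grew for iface between two samples."""
    return {name: after[iface][name] - before[iface][name]
            for name in before[iface]}
```

Reading /proc/net/dev before and after a backup run and passing both parsed
samples to counter_growth gives the change per counter; a nonzero tx_colls on
a port that is supposed to be full duplex would be worth a closer look.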
> >
> >
> >
> > Thank you again.
> >
> > -Ryan
> >
> >
> >
> >   ________________________________

> >
> > From:  ale-bounces at ale.org [mailto: ale-bounces at ale.org] On Behalf Of Ryan Fish
> >  Sent: Thursday, May 11, 2006 8:53 PM
> >  To: 'Atlanta Linux Enthusiasts'
> >  Subject: [ale] Network issue that makes absolutely NO sense
> >
> >
> >
> >
> > I have found the following issue with two different backup processes
> > after putting a new switch in place within the network:
> >
> >
> >
> > 1) RHEL3 AS/Oracle 9i server using RMAN and Export for backups.
> >
> >     - As long as the NIC on the NAS device to where all backup
> >       information is written is set to 100FD the backup processes will run
> >       as per normal and all is well.  Once the NIC on the NAS is set to
> >       1000FD the backups fail because the Oracle server is unable to
> >       connect to the NAS device over the NFS mount.
> >
> >
> >
> > 2) RHEL3 ES server running multiple bash scripts to back up portions of
> > almost every other box in the same network.  The backup scripts run fine
> > when the NIC on the NAS is set to 1000FD but fail when I set it to 100FD.
> >
> >
> >
> > Prior to replacing the failed switch this was never an issue, as all
> > backups ran fine every night with the exception of one that ran fine most
> > times.  Only after the switch was swapped out did this network strangeness
> > occur.
> >
> >
> >
> > What could/would cause this?
> >
> > Why would it matter what speed the NIC on the NAS is set to for
> > particular backup processes to function properly?
> >
> > Is there anywhere within the RMAN and/or Export processes that the NIC
> > speed on the receiving end could or would be hard coded to only accept
> > 100FD?  If so, why?
> >
> >
> >
> > I am at a complete loss here and have been fighting this for two weeks
> > already so any help will be greatly appreciated.
> >
> >
> >
> > Thank you.
> >
> > -Ryan
> >
> >
> >  _______________________________________________
> >  Ale mailing list
> >  Ale at ale.org
> >  http://www.ale.org/mailman/listinfo/ale
> >
> >
> >




