[ale] New worm destabilized Internet

Sun Jan 26 16:07:04 EST 2003

At 02:35 PM 1/26/03 -0500, you wrote:
>On Sat, Jan 25, 2003 at 09:10:40PM -0500, James S. Cochrane wrote:
> > I just spent four hours at work. None of our Unix servers were DIRECTLY
> > impacted, but the amount of broadcast traffic did impact our networks and
> > convinced several of our HA systems that they were experiencing network
> > outages (this mainly seemed to impact HP boxes; I'm going to have to look
> > into whether there are problems with HP-UX 11.0's network stack).
>
>That sounds like a bug or design flaw in your HP boxen.  I would bet that
>the problem is not in the network stack but rather in some high level
>non-kernel code that was not designed well.

Wouldn't surprise me... HP-UX seems to be severely lacking in several areas
compared to other Unix OSes. (It's still better than the SCO I was dealing
with the first time I ever met you, though, when you consulted for my first
IT employer back in '95 or '96. Would you believe that Consolidated Traffic
Management Systems managed to hang on until 2000 or 2001?)  I noticed the
problems on an MC/ServiceGuard cluster and on a cluster running Veritas
Cluster Server (VCS). The Sun clusters running VCS didn't have much of a
problem, although one did spit out a few warnings.  Fortunately, this may
finally be the impetus to get upper management to let us make some network
changes we've been advocating for a while, as well as accelerate some
changes that were already in the pipeline...

>I've done a lot of work with high availability and it should not fail
>this way from this worm.  It should take, at least, an amount of traffic
>exceeding the network bandwidth of your boxes by a factor of 2-10 before
>failure occurred.  This is unlikely unless you have a T3 feed.

Allegedly we were seeing enough traffic to give our routers problems...  Of 
course, we're in the process of redesigning our internal network to get 
away from some poor configurations done by the network admins at our parent 
corporation, which probably contributed to our overall problems.  The fact 
that we're on shared network segments with related companies means we're 
exposed to their network insecurities, which is why we were already 
migrating to our own ring to connect datacenters...  But this isn't the
first problem I've seen with the HP boxes and networking. I've had some
issues with Veritas volume replication where connections were dropping due
to lost heartbeats or somesuch... Veritas hasn't given me a clear answer
yet, but I've only got 9 volume groups (relatively low rates of change)
being replicated from the HP boxes on a dedicated pair of OC-3s, while I've
got two larger volume groups being replicated on Suns with no issues...

James

> > So it might
> > not be impacting the ATM network directly, but could be impacting the
> > back-end networks where their servers are, preventing the ATMs from
> > connecting to verify account balances and funds available, etc...
>
>The only scenario I could see (that did not involve stupidity on B of A's
>part, like being vulnerable to the worm) was if their ATMs are connected
>to the servers over the Internet via a VPN and the 1434 noise flooded the
>bandwidth.
>
> > James
>Bob
>_______________________________________________
>Ale mailing list
>Ale at ale.org
>http://www.ale.org/mailman/listinfo/ale
