[ale] Re: indecipherable crash

joshy joshy at joshy.org
Wed Aug 7 09:47:22 EDT 2002


 thanks for the help. it appears to be some bad memory.
 i ran memtest and it hacked on the same stick of memory
 in the same place several times.  the other stick seems
 to work by itself but it really doesn't like to be in there
 with anything else.  looks like i just got some cheap memory.
 i'm going to take it all back and order some good stuff from
 micron.
 
 thanks for all the help.
 
 now that i have a stable server i hope to launch my
 opensource webmail client next week.  more coming soon. :)
 
 - joshy
 
> On Thursday, August 1, 2002, at 09:39  PM, Master Wizard wrote:
> 
> > What you describe sounds like a lock-up, which I consider somewhat 
> > different from a crash. It is possible for java to grab enough 
> > resources to bring down even a hefty Sun box. I know cause I did it ;-)
> >
> > My problem was a thread creation process that entered into an infinite 
> > loop and spawned threads until it used up 100% of the memory and CPU 
> > time, at which point it froze into a state of non-responsiveness. As it 
> > was also writing to a log for each thread, it also ate all of the hard 
> > disk space.
> >
> > OOOps. I had a max on the loop during development which I removed prior 
> > to deployment to production.
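
The runaway thread loop described above is usually avoided by keeping a hard cap on worker threads in production, rather than only during development. Below is a minimal sketch of that idea, not the code from the incident: it assumes a JDK with java.util.concurrent and lambdas (Java 8 or later), and the names BoundedSpawner, MAX_THREADS and runJobs are hypothetical, chosen only for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedSpawner {
    // Hypothetical cap; pick a value that suits the machine.
    static final int MAX_THREADS = 8;

    // Runs `jobs` tasks on at most MAX_THREADS worker threads and returns
    // the highest number of tasks that were ever running at the same time.
    static int runJobs(int jobs) {
        ExecutorService pool = Executors.newFixedThreadPool(MAX_THREADS);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < jobs; i++) {
            // Extra tasks wait in the pool's queue instead of getting a new thread.
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(5); // stand-in for real work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            // Give queued tasks time to finish; force a stop if they do not.
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrent tasks: " + runJobs(1000));
    }
}
```

Even if the submitting loop misbehaves, the thread count stays at or below MAX_THREADS. Note that the pool's queue is unbounded in this sketch, so a truly infinite submit loop would still eat memory eventually; a bounded queue or a limit on submissions would be the next step.
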
> >
> > You might want to try starting tomcat first and then feeding in the 
> > servlets one at a time to see if any one in particular is causing the 
> > problem.
> >
> > Mike Panetta wrote:
> >
> >> On Wed, 2002-07-31 at 07:23, joshy wrote:
> >>> Anyone know if it's possible for a JVM to hard crash a linux box?
> >>>
> >> Anything is possible...
> >>> The story:
> >>>
> >> [snip of explanation]
> >>> At this point I can only assume that either the memory is bad
> >>> or misconfigured somehow. Still, I would have expected to see
> >>> more errors printed somewhere.
> >>>
> >> If the memory is bad, and you have no form of error
> >> correction/detection, you may not get any messages at all.  I suggest
> >> you download a copy of memtest86 and try to see if it's a memory error.
> >> You may need to run it for several hours (possibly over half a day) for
> >> that much memory, but it would be worth it to solve such a frustrating
> >> problem.  Maybe an even quicker way to find a problem (assuming only
> >> one of the dimms is bad) is to swap the memory dimms.  If the problem
> >> happens after loading fewer programs into memory, or even immediately
> >> after boot, then you know in all probability that the dimm in the
> >> first dimm slot is bad.  Another thing to check would be to make sure
> >> both dimms are the same speed, and put the slowest in the first slot.
> >> I am not sure, but some motherboards may only read the SPD EEPROM on
> >> the first dimm to get the timing info, and apply that to both dimms.
> >> This would obviously be bad if you had a PC133 dimm in the first slot,
> >> and a PC100 dimm in the second (for example).
> >> Do you know if the hardware supports ECC ram?  If it does, did you buy
> >> ECC ram? (I assume not, since you did not get any error messages; I
> >> think an ECC error would give you a bluesmoke error on a 2.4 kernel.)
> >>> Any ideas?
> >>> Thanks,
> >>>
> >>> - Joshua
> >>>
> >>>
> >> HTH,
> >> Mike
> >> ---
> >> This message has been sent through the ALE general discussion list.
> >> See http://www.ale.org/mailing-lists.shtml for more info. Problems 
> >> should be sent to listmaster at ale dot org.
> >
> >
> - Joshua
> 
>     ... but I still haven't found what I'm looking for.






