[ale] indecipherable hard crash

Master Wizard mainwizard at vei.net
Thu Aug 1 21:39:16 EDT 2002


What you describe sounds like a lock-up, which I consider somewhat 
different from a crash. It is possible for Java to grab enough resources 
to bring down even a hefty Sun box. I know because I did it ;-)

My problem was a thread creation process that entered an infinite 
loop and spawned threads until it had used up 100% of the memory and CPU 
time, at which point the box froze and stopped responding. Because it 
was writing to a log for each thread, it ate all of the hard disk 
space as well.

Oops. I had a max on the loop during development, which I removed prior 
to deployment to production.
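
For what it's worth, here is a minimal sketch of keeping that kind of cap 
in place permanently instead of relying on a loop counter. It assumes a 
modern JDK (java.util.concurrent and lambdas, neither of which my 
original code had), and the class and method names are made up for 
illustration. The idea is to hand work to a fixed-size pool so the number 
of live threads can never exceed the limit, however many tasks the loop 
submits.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class, not the code from the incident.
public class BoundedWorkers {

    // Runs taskCount short tasks on at most maxThreads threads and
    // returns the highest number of tasks seen running at the same time.
    static int runBounded(int taskCount, int maxThreads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        for (int i = 0; i < taskCount; i++) {
            pool.submit(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(5); // stand-in for real work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }

        // Stop accepting work and wait for the queued tasks to finish.
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            pool.shutdownNow();
        }
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("peak concurrent tasks: " + runBounded(200, 8));
    }
}
```

Note that this only caps the threads. The pool's queue is unbounded, so a 
truly infinite submit loop would still eat memory eventually, just far 
more slowly than one thread per iteration.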

You might want to try starting Tomcat first and then feeding in the 
servlets one at a time to see whether any one in particular is causing 
the problem.

Mike Panetta wrote:

> On Wed, 2002-07-31 at 07:23, joshy wrote:
> 
>>Anyone know if it's possible for a JVM to hard crash a linux box?
>>
> 
> Anything is possible...
> 
> 
>>The story:
>>
> 
> [snip of explanation]
> 
> 
>>At this point I can only assume that either the memory is bad
>>or misconfigured some how. Still, I would have expected to see
>>more errors printed somewhere.
>>
> 
> If the memory is bad, and you have no form of error
> correction/detection, you may not get any messages at all.  I suggest
> you download a copy of memtest86 and try to see if it's a memory error. 
> You may need to run it for several hours (possibly over half a day) for
> that much memory, but it would be worth it to solve such a frustrating
> problem.  
> 
> Maybe an even quicker way to find a problem (assuming only one of the
> dimms is bad) is to swap the memory dimms.  If the problem then happens
> after loading fewer programs into memory, or even immediately after
> boot, then you know in all probability that the dimm now in the first
> dimm slot is bad.  
> 
> Another thing to check would be to make sure both dimms are the same
> speed, and put the slowest in the first slot.  I am not sure, but some
> motherboards may only read the SPD EEPROM on the first dimm to get the
> timing info, and apply that to both dimms.  This would obviously be bad
> if you had a PC133 dimm in the first slot, and a PC100 dimm in the
> second (for example).
> 
> Do you know if the hardware supports ECC RAM?  If it does, did you buy
> ECC RAM? (I assume not, since you did not get any error messages.  I
> think an ECC error would give you a bluesmoke error on a 2.4 kernel.)
> 
> 
> 
>>Any ideas?
>>Thanks,
>>
>>- Joshua
>>
>>
> 
> HTH,
> Mike
> 
> 
> ---
> This message has been sent through the ALE general discussion list.
> See http://www.ale.org/mailing-lists.shtml for more info. Problems should be 
> sent to listmaster at ale dot org.
> 
> 
> 



