[ale] Dazed and confused, but trying to continue

Jim Popovitch jimpop at yahoo.com
Sun Jul 16 14:10:07 EDT 2006


:-) In the time it took to get a response from someone (thank you, Dow), I 
had migrated things to other places.  Interestingly enough, there is 
nothing but ssh and syslog running on that host now and the messages 
have stopped.  A few more days and she'll be put down for good.
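
In case anyone wants to confirm the same thing on their own box, here is 
a rough sketch for counting these reports in a log.  The regex is based 
only on the three lines quoted further down; the function name and the 
log path mentioned afterwards are assumptions, not anything taken from 
this host.

```python
import re
from collections import Counter

# Matches the first of the three kernel lines quoted in this thread.
# The reason is captured as two hex digits, the CPU as a decimal number.
NMI_LINE = re.compile(
    r"kernel: Uhhuh\. NMI received for unknown reason ([0-9a-fA-F]{2}) on CPU (\d+)\."
)


def count_nmi_reports(lines):
    """Count unknown-NMI reports per (reason, cpu) pair (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = NMI_LINE.search(line)
        if match:
            counts[(match.group(1).lower(), int(match.group(2)))] += 1
    return counts


# The sample is the message text from the original question.
sample = [
    "kernel: Uhhuh. NMI received for unknown reason 31 on CPU 0.",
    "kernel: Dazed and confused, but trying to continue",
    "kernel: Do you have a strange power saving mode enabled?",
]
print(count_nmi_reports(sample))
```

Something like count_nmi_reports(open("/var/log/messages")) should do for 
a real file, assuming that is where your syslog writes kernel messages.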

R.I.P. Bugs, you served me well.  ;-)

-Jim P.

Dow Hurst wrote:
> Jim,
> NMIs are typically (others correct me if I'm wrong) generated by hardware 
> problems, which the kernel then reports.  You may see this again or never 
> again.  I'd be sure to visit the machine and check on hardware contacts, 
> dust buildup, stray neutrinos or gamma rays.  It is a tough business 
> blocking those gamma rays from flipping bits, but those issues must be 
> prevented by the lead-lined cases on high-availability hardware.  ;)
> Dow
> 
> 
> Jim Popovitch wrote:
>> Thanks Dow,
>>
>> I had seen one of those posts earlier.  It's a strange error/situation, 
>> especially in my case, as there are no power-saving features enabled.  That 
>> system is isolated in a datacenter downtown.  It hasn't been rebooted in 
>> two weeks, and the case hasn't been opened in 3 years.  Apps on the box 
>> remained working, and there were no cores or other memory-related issues. 
>> Seems to me an error like that could include a bit more hint as to what 
>> it saw/felt that caused it to print that text.
>>
>> -Jim P.
>>
>> Dow Hurst wrote:
>>   
>>> Jim,
>>> You're dealing with a hardware problem.  Seems obvious from the posts I 
>>> saw after googling the error.  The guy who put the message and code into 
>>> the kernel is:
>>> Gareth Hughes <gareth at valinux.com>, May 2000
>>>
>>> Here are the posts:
>>> http://forum.myriadnetwork.com/showthread.php?t=186
>>> http://forums.digium.com/viewtopic.php?p=25442&sid=dece16abaef600e4274b1703699af9a7
>>> http://www.redhat.com/archives/k12osn/2005-January/msg00570.html
>>>
>>> Hope this helps!  Sounds like reseating the cards and the CPU, or just 
>>> checking the hardware connections, might fix it.  If not, different 
>>> hardware was the general solution.
>>> Dow
>>>
>>>
>>> Jim Popovitch wrote:
>>>     
>>>> What could possibly cause these log messages:
>>>>
>>>>    kernel: Uhhuh. NMI received for unknown reason 31 on CPU 0.
>>>>    kernel: Dazed and confused, but trying to continue
>>>>    kernel: Do you have a strange power saving mode enabled?
>>>>
>>>> The system's been up for 10 days, but those messages just started to appear.
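
For what it's worth, the "reason" in that first line appears to be a hex 
status byte rather than an error code.  A minimal sketch, assuming (as in 
the i386 kernels of that era, as far as I know) that the byte is read from 
I/O port 0x61 and that only bits 0x80 (memory parity/SERR) and 0x40 (I/O 
check) count as known NMI sources.  The function name is made up for 
illustration.

```python
# Assumed meaning of the two high bits of the port 0x61 status byte.
MEM_PARITY_OR_SERR = 0x80
IO_CHECK = 0x40


def classify_nmi_reason(hex_text):
    """Parse the hex reason and list which known NMI sources are flagged."""
    reason = int(hex_text, 16)
    causes = []
    if reason & MEM_PARITY_OR_SERR:
        causes.append("memory parity / SERR")
    if reason & IO_CHECK:
        causes.append("I/O check")
    return reason, causes


# "31" is the reason value from the log line quoted above.
reason, causes = classify_nmi_reason("31")
print(f"reason=0x{reason:02x} bits={reason:08b} causes={causes or 'none flagged'}")
```

With reason 31 neither of those two bits is set, which would be why the 
kernel calls it "unknown".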
>>>>
>>>> TIA,
>>>>
>>>> -Jim P.
>>>> _______________________________________________
>>>> Ale mailing list
>>>> Ale at ale.org
>>>> http://www.ale.org/mailman/listinfo/ale


