[ale] Dazed and confused, but trying to continue
Jim Popovitch
jimpop at yahoo.com
Sun Jul 16 14:10:07 EDT 2006
:-) In the time it took to get a response from someone (thank you Dow), I
had migrated things to other places. Interestingly enough, there is
nothing but ssh and syslog running on that host now and the messages
have stopped. A few more days and she'll be put down for good.
R.I.P. Bugs, you served me well. ;-)
-Jim P.
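
[Editor's sketch] For anyone landing here with the same log line: the "reason" in that message appears to be a hexadecimal byte, which on classic PC hardware the kernel reads from I/O port 0x61. A minimal sketch of decoding it follows, assuming the conventional PC/AT bit layout for that port. The helper names and the bit descriptions are illustrative assumptions, not taken from the kernel source or from this thread.

```python
# Assumption: the NMI "reason" is the byte read from I/O port 0x61
# (PC/AT "System Control Port B"), printed by the kernel in hex.
# The bit descriptions below follow the conventional PC/AT layout.
PORT_B_BITS = {
    0: "timer 2 gate",
    1: "speaker data enable",
    2: "parity check mask",
    3: "I/O channel check mask",
    4: "refresh toggle",
    5: "timer 2 output",
    6: "I/O channel check (IOCHK)",
    7: "memory parity / system error (SERR)",
}


def decode_nmi_reason(reason_hex):
    """Return the names of the bits set in a hex reason string such as '31'."""
    value = int(reason_hex, 16)
    if not 0 <= value <= 0xFF:
        raise ValueError("reason must fit in one byte")
    return [name for bit, name in sorted(PORT_B_BITS.items())
            if value & (1 << bit)]


def is_unknown_reason(reason_hex):
    """True when neither error bit (0x80 or 0x40) is set.

    With both bits clear, the port gives no indication of which
    device raised the NMI.
    """
    return (int(reason_hex, 16) & 0xC0) == 0


if __name__ == "__main__":
    print(decode_nmi_reason("31"))
    print(is_unknown_reason("31"))
```

Under those assumptions, reason 31 has neither the parity nor the I/O check bit set, which would explain why the kernel could not say more about what it saw.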
Dow Hurst wrote:
> Jim,
> NMIs are typically (others correct me if I'm wrong) generated by kernel
> crashes caused by hardware problems. You may see this again or never
> again. I'd be sure to visit the machine and check on hardware contacts,
> dust buildup, stray neutrinos or gamma rays. It is a tough business
> blocking those gamma rays from flipping bits but those issues must be
> prevented by the lead lined cases on high availability hardware. ;)
> Dow
>
>
> Jim Popovitch wrote:
>> Thanks Dow,
>>
>> I had seen one of those posts earlier. It's a strange error/situation,
>> esp. in my case, as there are no power-saving features enabled. That
>> system is isolated in a datacenter downtown. It hasn't been rebooted in
>> two weeks, and the case hasn't been opened in 3 years. Apps on the box
>> remained working; there were no cores or other memory-related issues.
>> Seems to me an error like that could include a bit more hint as to what
>> it saw/felt that caused it to print that text.
>>
>> -Jim P.
>>
>> Dow Hurst wrote:
>>
>>> Jim,
>>> You're dealing with a hardware problem. Seems obvious from the posts I
>>> saw after googling the error. The guy who put the message and code into
>>> the kernel is:
>>> Gareth Hughes <gareth at valinux.com>, May 2000
>>>
>>> Here are the posts:
>>> http://forum.myriadnetwork.com/showthread.php?t=186
>>> http://forums.digium.com/viewtopic.php?p=25442&sid=dece16abaef600e4274b1703699af9a7
>>> http://www.redhat.com/archives/k12osn/2005-January/msg00570.html
>>>
>>> Hope this helps! Sounds like reseating cards and the CPU, or just
>>> checking hardware connections, might fix it. If not, then different
>>> hardware was the general solution.
>>> Dow
>>>
>>>
>>> Jim Popovitch wrote:
>>>
>>>> What could possibly cause these log messages:
>>>>
>>>> kernel: Uhhuh. NMI received for unknown reason 31 on CPU 0.
>>>> kernel: Dazed and confused, but trying to continue
>>>> kernel: Do you have a strange power saving mode enabled?
>>>>
>>>> The system's been up for 10 days, but those messages just started to appear.
>>>>
>>>> Tia,
>>>>
>>>> -Jim P.
>>>> _______________________________________________
>>>> Ale mailing list
>>>> Ale at ale.org
>>>> http://www.ale.org/mailman/listinfo/ale