[ale] Server crashing - Help!!

Keith Hopkins hne at inetnow.net
Fri Nov 30 09:40:52 EST 2001


Charles Marcus wrote:

> Help!!
> 
> I have been having some trouble with a server crashing, and was wondering if
> someone could tell me where to look for the problem based on this snippet
> from the message log...
> 
> Everything seems to be running fine, except for these seemingly innocuous
> messages which happen every 5 minutes (anybody know what these mean or how I
> can get rid of them?)
> 
> Nov 29 18:14:42 sfla kdm[15117]: Can't lock pid file /var/run/xdm-pid,
> another kdm is running (pid 1556)


Looks like your lock file is stale (or kdm/xdm really is already running).  Check the output of "ps -ef" to see whether the process with that pid (1556 in your log) really is running, and if it is NOT, then just delete the lock file.
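
If you would rather script that check than eyeball the "ps -ef" output, here is a minimal Python sketch. The helper names pid_is_running and pid_file_is_stale are made up for illustration, and it assumes the pid file holds nothing but the pid as a decimal number. It only tells you whether *some* process has that pid, not whether that process is kdm, so a reused pid can fool it.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this pid currently exists."""
    try:
        # Signal 0 delivers nothing; the call only checks that the pid exists.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def pid_file_is_stale(path):
    """Return True if the pid recorded in the file is not a running process."""
    # Assumption: the file contains only the pid, optionally followed by a newline.
    with open(path) as f:
        pid = int(f.read().strip())
    return not pid_is_running(pid)
```

Calling pid_file_is_stale("/var/run/xdm-pid") as root would return True only if nothing is running under the recorded pid, and that is the case where deleting the file should be safe.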


> 
> Then, here's what happens when the server dies:
> 
> Nov 29 18:17:06 sfla kernel: fh_verify: ltsroot/dev permission failure,
> acc=3, error=30
> Nov 29 18:17:06 sfla kernel: fh_verify: ltsroot/dev permission failure,
> acc=3, error=30
> Nov 29 18:17:06 sfla su(pam_unix)[15378]: session closed for user nobody
> Nov 29 18:18:54 sfla kdm[1556]: Unknown session exit code 253 from process
> 13341
> Nov 29 18:18:54 sfla su(pam_unix)[15397]: session opened for user nobody by
> (uid=0)
> Nov 29 18:18:54 sfla kernel: fh_verify: ltsroot/dev permission failure,
> acc=3, error=30
> Nov 29 18:18:54 sfla kernel: fh_verify: ltsroot/dev permission failure,
> acc=3, error=30
> Nov 29 18:18:54 sfla su(pam_unix)[15397]: session closed for user nobody
> Nov 29 18:18:56 sfla kdm[1556]: Unknown session exit code 253 from process
> 15408
> Nov 29 18:18:56 sfla su(pam_unix)[15416]: session opened for user nobody by
> (uid=0)
> Nov 29 18:18:56 sfla kernel: fh_verify: ltsroot/dev permission failure,
> acc=3, error=30
> Nov 29 18:18:56 sfla kernel: fh_verify: ltsroot/dev permission failure,
> acc=3, error=30
> Nov 29 18:18:56 sfla kernel: Unable to handle kernel paging request at
> virtual address 0001000c
> Nov 29 18:18:56 sfla kernel:  printing eip:
> Nov 29 18:18:56 sfla kernel: c0113a82
> Nov 29 18:18:56 sfla kernel: pgd entry dbb4a000: 0000000000000000
> Nov 29 18:18:56 sfla kernel: pmd entry dbb4a000: 0000000000000000
> Nov 29 18:18:56 sfla kernel: ... pmd not present!
> Nov 29 18:18:56 sfla kernel: Oops: 0002
> Nov 29 18:18:56 sfla kernel: CPU:    0
> Nov 29 18:18:56 sfla kernel: EIP:    0010:[schedule+194/944]
> Nov 29 18:18:56 sfla kernel: EIP:    0010:[<c0113a82>]
> Nov 29 18:18:56 sfla kernel: EFLAGS: 00010096
> Nov 29 18:18:56 sfla kernel: eax: 00000008   ebx: dbbe0000   ecx: dbbe0000
> edx: 00000009
> Nov 29 18:18:56 sfla kernel: esi: 00000000   edi: 0000000d   ebp: dbbe1fbc
> esp: dbbe1f9c
> Nov 29 18:18:56 sfla kernel: ds: 0018   es: 0018   ss: 0018
> Nov 29 18:18:56 sfla kernel: Process sort (pid: 15424, stackpage=dbbe1000)
> Nov 29 18:18:56 sfla kernel: Stack: 40017000 dbbe0000 00000006 dbbe0000
> c02ad600 dbbe0000 40016734 bffffd8c
> Nov 29 18:18:56 sfla kernel:        bfffda48 c01090f5 4015d700 00000000
> 400e4654 40016734 bffffd8c bfffda48
> Nov 29 18:18:56 sfla kernel:        0000e325 0000002b 0000002b ffffffff
> 0804f0e2 00000023 00010286 bfffda2c
> Nov 29 18:18:56 sfla kernel: Call Trace: [reschedule+5/12]
> Nov 29 18:18:56 sfla kernel: Call Trace: [<c01090f5>]
> Nov 29 18:18:56 sfla kernel:
> Nov 29 18:18:56 sfla kernel: Code: 89 50 04 89 02 c7 43 3c 00 00 00 00 8b 55
> e4 c7 42 14 00 00
> Nov 29 18:18:56 sfla su(pam_unix)[15416]: session closed for user nobody
> 
> Anyone??  I'd rather not have to fly down to Miami unless I have to.
> 


What is running on the server when this happens?  Is it a web, mail, file, print, general-purpose, or some other kind of server?  Is this during a shutdown?  What kind of hardware are you running?  Are you using LVM or MD, and which filesystem?

Lost in Tokyo (which is much farther than Miami for you...)
   Keith



---
This message has been sent through the ALE general discussion list.
See http://www.ale.org/mailing-lists.shtml for more info. Problems should be 
sent to listmaster at ale dot org.





