[ale] Server crashing - Help!!

Dow Hurst dhurst at kennesaw.edu
Fri Nov 30 17:54:11 EST 2001


You're not showing your ignorance so much as asking the right questions!
Kdm does not have to be running if the X-terminals are contacting the
server to run an application whose graphical display is managed and
physically displayed on the X-terminal by the X-terminal's own
X-server.  This is similar to a PC running Xwin32 and using
rsh for application startup.  The only reasons for Kdm or xdm to run on
the machine, AFAICROH (as far as I can remember off hand), are for a
directly connected monitor to have a graphical display or for the server
to advertise XDMCP availability remotely.  Using XDMCP over a network
advertises the presence of the server for other X-servers to allow
remote logins.  XDMCP is what many use in your situation.  The local
X-terminal's X-server is pointed to the server in the startup scripts
and the server's Kdm or xdm manages the login but the local X-terminal's
X-server actually displays the graphical output.  This is what "GUI login
mode" represents, and most Windows-based PC X-server programs operate
this way.

As for your crashing, I don't know much.  What I could
suggest is a possible Reiserfs problem if that is your file system
type.  If you have files that are reported as available yet even root
can't delete, touch, or cat them, then you might have a bad inode or an
incorrect tree.  I just had a system that had been powered off and on
repeatedly due to lockups caused by unsupported hardware.  This system
ended up with a damaged reiserfs root partition, which was fixed by
booting to a miniroot (linuxrc) and running "reiserfsck --rebuild-tree".
That fixed the file system just fine.  But until the file system was
fixed, any time you ran "ls" the system would crash.  That is probably
not your problem since
you at least are getting error messages in your logs which I didn't get.
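
If you want to look for files like that, here is a minimal sketch in
Python.  The helper name find_unreadable_entries is made up for this
example, and it assumes you run it as root so that ordinary permission
errors are not mistaken for damage.  It only reports entries that are
listed in a directory but cannot be stat'ed or read; it repairs nothing.
On a badly damaged file system even listing the directory may trigger
the crash, as "ls" did for me, so treat it as a rough probe only.

```python
import os
import stat


def find_unreadable_entries(directory):
    """Return (name, reason) pairs for entries that are listed in
    `directory` but cannot be stat'ed or, for regular files, read."""
    problems = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            # lstat does not follow symlinks, so a dangling link is
            # not reported as a damaged file.
            info = os.lstat(path)
        except OSError as exc:
            problems.append((name, "lstat failed: %s" % exc))
            continue
        if stat.S_ISREG(info.st_mode):
            try:
                # Reading one byte is enough to force the kernel to
                # fetch the file's data from the file system.
                with open(path, "rb") as handle:
                    handle.read(1)
            except OSError as exc:
                problems.append((name, "read failed: %s" % exc))
    return problems
```

Calling find_unreadable_entries("/var/run"), for example, returns an
empty list when every entry in that directory can be stat'ed and read.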
Dow

Charles Marcus wrote:
> 
> > -----Original Message-----
> > From: Keith Hopkins [mailto:hne at inetnow.net]
> > Sent: Friday, November 30, 2001 9:41 AM
> > To: Charles Marcus
> > Cc: Ale (E-mail)
> > Subject: Re: [ale] Server crashing - Help!!
> 
> > > Nov 29 18:14:42 sfla kdm[15117]: Can't lock pid file \
> > > /var/run/xdm-pid, another kdm is running (pid 1556)
> 
> > Looks like your lock file is stale (or kdm/xdm really
> > is already running).  Check the output of "ps -ef" to see if
> > it really is running, and if it is NOT, then just delete the
> > lock file.
> 
> Hmmmm.  This is an LTS (Linux Terminal Services) server, which serves about
> 15 Terminals (all using GUI login mode).  Could this be because the server
> itself is booting into run-level 5, so when LTS starts up the X-servers for
> the clients, kdm IS already running?
> 
> I just went and looked, and process ID 1529 IS being run by root.  I'm not
> sure that's the problem though - wouldn't it do the same thing when the
> second person started up kdm (simply by turning on their workstation)?  Or
> is the process run by root different?
> 
> If so, I guess the answer would be to set the server to boot to run-level 3,
> but would that affect the workstations' ability to boot to run-level 5?
> 
> I'm sure I'm showing my ignorance here...
> 
> > > Then, here's what happens when the server dies:
> > >
> > > Nov 29 18:17:06 sfla kernel: fh_verify: ltsroot/dev permission failure, acc=3, error=30
> > > Nov 29 18:17:06 sfla kernel: fh_verify: ltsroot/dev permission failure, acc=3, error=30
> > > Nov 29 18:17:06 sfla su(pam_unix)[15378]: session closed for user nobody
> > > Nov 29 18:18:54 sfla kdm[1556]: Unknown session exit code 253 from process 13341
> > > Nov 29 18:18:54 sfla su(pam_unix)[15397]: session opened for user nobody by (uid=0)
> > > Nov 29 18:18:54 sfla kernel: fh_verify: ltsroot/dev permission failure, acc=3, error=30
> > > Nov 29 18:18:54 sfla kernel: fh_verify: ltsroot/dev permission failure, acc=3, error=30
> > > Nov 29 18:18:54 sfla su(pam_unix)[15397]: session closed for user nobody
> > > Nov 29 18:18:56 sfla kdm[1556]: Unknown session exit code 253 from process 15408
> > > Nov 29 18:18:56 sfla su(pam_unix)[15416]: session opened for user nobody by (uid=0)
> > > Nov 29 18:18:56 sfla kernel: fh_verify: ltsroot/dev permission failure, acc=3, error=30
> > > Nov 29 18:18:56 sfla kernel: fh_verify: ltsroot/dev permission failure, acc=3, error=30
> > > Nov 29 18:18:56 sfla kernel: Unable to handle kernel paging request at virtual address 0001000c
> > > Nov 29 18:18:56 sfla kernel:  printing eip:
> > > Nov 29 18:18:56 sfla kernel: c0113a82
> > > Nov 29 18:18:56 sfla kernel: pgd entry dbb4a000: 0000000000000000
> > > Nov 29 18:18:56 sfla kernel: pmd entry dbb4a000: 0000000000000000
> > > Nov 29 18:18:56 sfla kernel: ... pmd not present!
> > > Nov 29 18:18:56 sfla kernel: Oops: 0002
> > > Nov 29 18:18:56 sfla kernel: CPU:    0
> > > Nov 29 18:18:56 sfla kernel: EIP:    0010:[schedule+194/944]
> > > Nov 29 18:18:56 sfla kernel: EIP:    0010:[<c0113a82>]
> > > Nov 29 18:18:56 sfla kernel: EFLAGS: 00010096
> > > Nov 29 18:18:56 sfla kernel: eax: 00000008   ebx: dbbe0000   ecx: dbbe0000   edx: 00000009
> > > Nov 29 18:18:56 sfla kernel: esi: 00000000   edi: 0000000d   ebp: dbbe1fbc   esp: dbbe1f9c
> > > Nov 29 18:18:56 sfla kernel: ds: 0018   es: 0018   ss: 0018
> > > Nov 29 18:18:56 sfla kernel: Process sort (pid: 15424, stackpage=dbbe1000)
> > > Nov 29 18:18:56 sfla kernel: Stack: 40017000 dbbe0000 00000006 dbbe0000 c02ad600 dbbe0000 40016734 bffffd8c
> > > Nov 29 18:18:56 sfla kernel:        bfffda48 c01090f5 4015d700 00000000 400e4654 40016734 bffffd8c bfffda48
> > > Nov 29 18:18:56 sfla kernel:        0000e325 0000002b 0000002b ffffffff 0804f0e2 00000023 00010286 bfffda2c
> > > Nov 29 18:18:56 sfla kernel: Call Trace: [reschedule+5/12]
> > > Nov 29 18:18:56 sfla kernel: Call Trace: [<c01090f5>]
> > > Nov 29 18:18:56 sfla kernel:
> > > Nov 29 18:18:56 sfla kernel: Code: 89 50 04 89 02 c7 43 3c 00 00 00 00 8b 55 e4 c7 42 14 00 00
> > > Nov 29 18:18:56 sfla su(pam_unix)[15416]: session closed for user nobody
> > >
> > > Anyone??  I'd rather not have to fly down to Miami unless I have to.
> > >
> >
> >
> > What is running on the server when this happens?  Is it a
> > web, mail, file, print, general or whatever kind of server?
> > Is this during a shutdown?  What kind of hardware are you
> > running?  LVM/MD/*fs ?
> 
> The users could be running anything from Netscape, Mozilla or StarOffice
> 5.2, and all use KDE 2.1.2 for their Desktop GUI.  Server services are
> limited to X, LDAP, and LTS (www.ltsp.org).
> 
> Thanks for any suggestions...
> 
> Charles
> 
> ---
> This message has been sent through the ALE general discussion list.
> See http://www.ale.org/mailing-lists.shtml for more info. Problems should be
> sent to listmaster at ale dot org.

-- 
__________________________________________________________
Dow Hurst                   Office: 770-499-3428
Systems Support Specialist  Fax:    770-423-6744
1000 Chastain Rd.
Chemistry Department SC428  Email:dhurst at kennesaw.edu
Kennesaw State University         Dow.Hurst at mindspring.com
Kennesaw, GA 30144
*********************************
*Computational Chemistry is fun!*
*********************************






