[ale] checking for process in uninterruptable sleep state

Scott Plante splante at insightsys.com
Tue May 10 11:21:22 EDT 2016


How about running lsusb in the background and then checking whether its PID is still around (i.e. still running or stuck), like:


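lsusb >/dev/null 2>&1 &
usbpid=$!
sleep 4   # or however long exceeds the maximum lsusb run time
if ps -p "$usbpid" >/dev/null 2>&1
then
    # lsusb is hung: do your stuff here (reboot, etc.)
    # A "then" branch needs at least one command, so log something.
    echo "lsusb (PID $usbpid) is still running; probably hung" >&2
fi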
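
A variation on the same idea, as a rough sketch only: instead of merely checking that the PID still exists, look at the process state and report a hang only when it is "D" (uninterruptible sleep). Because the command is started in the background and never waited on, the checking script itself should not block. The function names proc_state and probe are made up for illustration, and reading /proc/PID/status assumes a Linux /proc filesystem.

```shell
# Print the one-letter state of a PID (R, S, D, Z, ...) as reported by
# the kernel in /proc/PID/status; print nothing if the PID is gone.
# Assumption: Linux /proc is mounted.
proc_state() {
    sed -n 's/^State:[[:space:]]*\([A-Za-z]\).*/\1/p' "/proc/$1/status" 2>/dev/null
}

# probe SECONDS COMMAND [ARGS...]
# Start COMMAND in the background, give it SECONDS to finish, then
# classify it without ever waiting on it:
#   hung     still present and in uninterruptible sleep (state D)
#   done     already exited (no /proc entry, or a zombie not yet reaped)
#   running  still present but in some other state (just slow?)
probe() {
    wait_secs=$1
    shift
    "$@" >/dev/null 2>&1 &
    pid=$!
    sleep "$wait_secs"
    case "$(proc_state "$pid")" in
        D)    echo hung ;;
        ""|Z) echo done ;;
        *)    echo running ;;
    esac
}
```

For example, `probe 4 lsusb` would be expected to print "hung" on an affected machine, and the caller could reboot only in that case. A process can sit in state D briefly during ordinary I/O, so the delay should be comfortably longer than a normal lsusb run.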
----- Original Message -----

From: "Todor Fassl" <fassl.tod at gmail.com> 
To: "Atlanta Linux Enthusiasts" <ale at ale.org> 
Sent: Tuesday, May 10, 2016 10:39:26 AM 
Subject: [ale] checking for process in uninterruptable sleep state 

Okay, so my latest problem with these lab workstations is that accessing
the USB subsystem puts the calling process into an uninterruptible
sleep. I'd like to write a script to check for that so at least I'd know
that I have to go over and reboot the machine.

Details: I have 15 Dell workstations running Ubuntu 15.10 (2 are running
16.04 -- that did not help). Occasionally, the keyboard and mouse
freeze. Logging in remotely and running lsusb hangs such that you can't
even control-c out, and it cannot be killed even with a -9. The process
goes into an uninterruptible sleep during a system call to open the file
/sys/bus/usb/devices/usb1/descriptors. That file is part of the kernel's
control files for the USB controller itself. So you can see why the
keyboard and mouse are dead: the driver for the USB controller itself is
hung.

We've upgraded the kernel and installed Dell's latest BIOS upgrades. No
joy. I am thinking the only remaining thing to do is to file a bug
report. However, I could alleviate the problem a little if I could
easily detect it and reboot.

The problem is that I can't figure out how to write a script to detect a
process in an uninterruptible sleep state. No matter what I do, it seems
to hang. I've tried something like 'bash -c "lsusb"' and 'timeout 5
lsusb'. They both hang. The only thing I've been able to do is to have
2 different scripts: one running lsusb and another checking for blocked
lsusb procs. But that is way ugly.
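
The "checking for blocked procs" half of that two-script setup could look something like the sketch below. It is only an illustration: the function names filter_blocked and list_blocked are invented here, and the ps invocation assumes the procps version of ps that Ubuntu ships.

```shell
# Read "PID STAT COMMAND" lines on stdin and print "PID COMMAND" for
# every process whose state starts with D (uninterruptible sleep).
filter_blocked() {
    awk '$2 ~ /^D/ { print $1, $3 }'
}

# List blocked processes on the running system. The trailing "="
# after each column name suppresses the ps header line.
list_blocked() {
    ps -eo pid=,stat=,comm= | filter_blocked
}
```

Piping the result through something like `grep -w lsusb` would narrow it to stuck lsusb processes. Processes can show up in state D for a moment during normal disk I/O, so a single hit is weaker evidence than the same PID appearing on two checks a few seconds apart.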

PS: I wouldn't mind ideas wrt the original problem either. Not that I 
hold out any hope for that. 
-- 
Todd 
