[ale] sometimes you just need to reboot NOW!

Michael H. Warfield mhw at WittsEnd.com
Wed Jul 30 13:50:09 EDT 2014


On Wed, 2014-07-30 at 13:41 -0400, Jim Kinney wrote:
> On Wed, Jul 30, 2014 at 10:28 AM, Michael H. Warfield
> <mhw at wittsend.com> wrote:
>         On Wed, 2014-07-30 at 09:28 -0400, Jim Kinney wrote:
>         > Stupid power strip can only monitor power usage. Addressable
>         > PDU was seen as a waste of money. Sort of like cooling that
>         > shuts off and requires a manual reset when there's a power
>         > hiccup.
>
>         Ah, it's not the monitoring.  It's the ability to shoot the
>         node in the head and force him to reboot.  When all else
>         fails, it's hard for a server to resist the removal of
>         primary power to the power supplies.
> 
> 
> I am crafting a stonith script now to include in /root for
> emergencies. I have many machines with no remote power control.
> 
>  
>         In the past, I've always used X10 controllers as a poor
>         man's PDU for controlling power to my colo units.  Now, some
>         relatively cheap web-controlled power strips have become
>         available that, when combined with a Raspberry Pi for access
>         control and security, make for a very powerful management
>         solution.
> 
> 
> Adding a RasPi with a thermistor as a management point for the next
> cooling system failure. Not much more and I can add stonith for the
> entire stack. Will need to figure out the VMware CLI method of
> suspending VMs.
> 
>         
>         Corollary to that:  Have a maintenance / forensic CD or USB
>         drive in the system with serial consoles so you can intercept
>         the boot process to an assured safe media to repair possible
>         damage.
>
>         Also...  Before doing a forced reboot, if you have ANY control
>         at all, sync your disks and remount them read-only.  SysRq - S
>         and U
> 
> 
> Will add that ability plus a "just shoot me" version with no safety as
> a last ditch fallback. 
> 
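
For reference, a minimal sketch of what the SysRq part of such a script
could look like, in Python.  It assumes a Linux box with
/proc/sysrq-trigger available and root privileges.  The function names
and the 2 second pause are made up for illustration and are not taken
from anybody's actual script.

```python
import time

# Magic SysRq command keys as documented for the Linux kernel:
#   s = sync all mounted filesystems
#   u = remount all filesystems read-only
#   b = reboot immediately, without syncing or unmounting
SAFE_SEQUENCE = ["s", "u", "b"]
JUST_SHOOT_ME = ["b"]


def write_sysrq(key, trigger_path="/proc/sysrq-trigger"):
    """Send one SysRq command by writing its key to the trigger file (needs root)."""
    with open(trigger_path, "w") as trigger:
        trigger.write(key)


def emergency_reboot(safe=True, delay=2.0, send=write_sysrq):
    """Run the sync / remount / reboot sequence, or reboot with no safety at all.

    `send` is injectable so the sequence can be dry-run without touching
    the kernel.  Returns the list of keys that were sent.
    """
    sequence = SAFE_SEQUENCE if safe else JUST_SHOOT_ME
    sent = []
    for key in sequence:
        send(key)
        sent.append(key)
        if key != "b":
            time.sleep(delay)  # give the sync / remount a moment to finish
    return sent
```

Calling emergency_reboot(send=print, delay=0) only prints the keys, which
is a safe way to check the ordering before trusting it on a live box.
Passing safe=False gives the "just shoot me" variant.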

I had a "just shoot me" macro built into my X10 interface.  Any time it
sensed an OFF command (like M15 OFF) it would initiate a delay macro
that would issue a corresponding ON command (like M15 ON) after 5
minutes.  I included a corresponding macro for the entire house code, so
M ALL OFF would result in an M1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16 ON
sequence.  That was my "shoot myself in the head" recovery macro set
that had me covered even in the case of a whole house code off.
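
The logic of that macro set can be sketched roughly like this in Python.
This is only an illustration of the rule (OFF seen, matching ON queued
5 minutes later), not the macro language of any X10 controller.  The
command string format ("M15 OFF", "M ALL OFF") and the function name are
assumptions made for the example.

```python
UNITS = range(1, 17)      # X10 unit codes 1..16 within one house code
RECOVERY_DELAY = 5 * 60   # seconds before the matching ON is issued


def recovery_commands(command, now=0.0, delay=RECOVERY_DELAY):
    """Map an observed OFF command to the ON commands that undo it later.

    `command` looks like "M15 OFF" or "M ALL OFF".  Returns a list of
    (fire_time, command) tuples; anything that is not an OFF returns [].
    """
    parts = command.upper().split()
    if len(parts) < 2 or parts[-1] != "OFF":
        return []
    fire_at = now + delay
    if parts[1:] == ["ALL", "OFF"]:
        # Whole house code went off: queue an ON for every unit.
        house = parts[0]
        return [(fire_at, f"{house}{unit} ON") for unit in UNITS]
    if len(parts) == 2:
        # Single unit went off: queue the matching ON.
        return [(fire_at, f"{parts[0]} ON")]
    return []
```

A real setup would hand the returned tuples to whatever scheduler sits in
front of the X10 transmitter.  Keeping the mapping as a pure function
makes it easy to check without touching any hardware.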

>         > On Jul 30, 2014 9:18 AM, "Chris Fowler"
>         > <cfowler at outpostsentinel.com> wrote:
>
>         > > Intelligent power strip.
>         > >
>         > > Power off.  Power on.

-- 
Michael H. Warfield (AI4NB) | (770) 978-7061 |  mhw at WittsEnd.com
   /\/\|=mhw=|\/\/          | (678) 463-0932 |  http://www.wittsend.com/mhw/
   NIC whois: MHW9          | An optimist believes we live in the best of all
 PGP Key: 0x674627FF        | possible worlds.  A pessimist is sure of it!
