[ale] RedHat HA Assistance

Scott McBrien smcbrien at gmail.com
Wed Sep 12 13:23:38 EDT 2012


It's because your cluster is split-brained.  Best option would be to add a third node.  If you can't do that, you could add a qdisk *retches*.  Seriously, third node.
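
For illustration, a rough sketch of what the third-node change might look like in cluster.conf. The hostname "peach" and the fence device name "Peach_RSA" are made up for this example, and the new node would also need its own fencedevice entry and a failoverdomainnode line if it should run the service. Untested, so treat it as a starting point only:

```xml
<clusternodes>
        <!-- existing mario and luigi entries stay as they are -->
        <!-- hypothetical third node; name and fence device are placeholders -->
        <clusternode name="peach.REDACTED.REDACTED" nodeid="3">
                <fence>
                        <method name="RSA">
                                <device name="Peach_RSA"/>
                        </method>
                </fence>
        </clusternode>
</clusternodes>
<!-- two_node="1" and expected_votes="1" are the special two-node settings;
     with three voting nodes they would be dropped so normal majority quorum applies -->
<cman/>
```

With three one-vote nodes, quorum is two votes, so a single isolated node can't keep running the service on its own.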

Also, if the machines aren't constantly rebooting each other, I'd guess your fencing isn't working either.
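
On the fencing point: in the cluster.conf quoted below, mario has a fence block, but luigi's clusternode entry is self-closed with no fence method at all, even though a Luigi_RSA fencedevice is defined. A minimal sketch of luigi's entry, simply mirroring mario's block (untested, and it assumes the Luigi_RSA device settings are themselves correct):

```xml
<!-- luigi's node entry with a fence method added, modeled on mario's -->
<clusternode name="luigi.REDACTED.REDACTED" nodeid="2">
        <fence>
                <method name="RSA">
                        <!-- references the Luigi_RSA fencedevice already defined in the config -->
                        <device name="Luigi_RSA"/>
                </method>
        </fence>
</clusternode>
```

Remember to increment config_version on the cluster element whenever you edit the file, so the change is picked up by both nodes.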

-Scott

On Sep 12, 2012, at 1:15 PM, Sam Davis <aracthabar at gmail.com> wrote:

> Hello All,
> 
>    I'm wondering if anyone has experience with Red Hat's HA solution.  We're trying to set up a 2-node failover cluster for a Drupal install using multipath disks attached via HBAs.   It seems to work fine at the start, but after a while neither node can take ownership of the clustered volume group.  At that point, even trying to stop the clvmd service just hangs.  Any hints or pointers would be appreciated.  Below is the cluster.conf.
> 
> 
> Thank you,
> Sam Davis
> 
> 
> 
> <?xml version="1.0"?>
> <cluster config_version="25" name="moodle_cluster">
>         <clusternodes>
>                 <clusternode name="mario.REDACTED.REDACTED" nodeid="1">
>                         <fence>
>                                 <method name="RSA">
>                                         <device name="Mario_RSA"/>
>                                 </method>
>                         </fence>
>                 </clusternode>
>                 <clusternode name="luigi.REDACTED.REDACTED" nodeid="2"/>
>         </clusternodes>
>         <fencedevices>
>                 <fencedevice agent="fence_ipmilan" auth="password" ipaddr="REDACTED" login="admin" name="Mario_RSA" passwd="REDACTED"/>
>                 <fencedevice agent="fence_ipmilan" auth="password" ipaddr="REDACTED" login="admin" name="Luigi_RSA" passwd="REDACTED"/>
>         </fencedevices>
>         <rm>
>                 <failoverdomains>
>                         <failoverdomain name="Mario_Preferred" nofailback="1" ordered="1">
>                                 <failoverdomainnode name="mario.REDACTED.REDACTED" priority="1"/>
>                                 <failoverdomainnode name="luigi.REDACTED.REDACTED"/>
>                         </failoverdomain>
>                 </failoverdomains>
>                 <resources>
>                         <ip address="REDACTED" sleeptime="10"/>
>                         <clusterfs device="/dev/vg_moodle/moodle_lv" force_unmount="on" fsid="27336" fstype="gfs2" mountpoint="/srv" name="moodle_gfs" options="rw,noatime,_netdev"/>
>                         <script file="/etc/init.d/httpd" name="Httpd"/>
>                 </resources>
>                 <service domain="Mario_Preferred" name="moodle_service" recovery="relocate">
>                         <clusterfs ref="moodle_gfs"/>
>                         <ip ref="REDACTED"/>
>                         <script ref="Httpd"/>
>                 </service>
>         </rm>
>         <cman expected_votes="1" two_node="1"/>
> </cluster>
> 


