Configure Multiple Fencing Devices Using pcs

From ClusterLabs

This page describes how to configure multiple fencing devices with the pcs tool. It follows the example from the Configure Multiple Fencing Devices page: IPMI first, followed by two switched PDUs.

Starting Point

For a frame of reference, the cluster starts with this configuration:

Cluster Name: an-cluster-03
Corosync Nodes:
 pcmk-1 pcmk-2 
Pacemaker Nodes:
 pcmk-1 pcmk-2 

Resources: 

Stonith Devices: 
Fencing Levels: 

Location Constraints:
Ordering Constraints:
Colocation Constraints:

Cluster Properties:
 cluster-infrastructure: corosync
 dc-version: 1.1.14-70404b0
 no-quorum-policy: ignore
 stonith-enabled: false

We will need to make a few assumptions about our example cluster:

  • It is a two-node cluster with the node names "pcmk-1" and "pcmk-2".
  • The two PDUs are accessible at the network addresses "pdu-1" and "pdu-2" and will be accessed using the "fence_apc_snmp" fence agent.
  • The fencing details for "pcmk-1" are:
    • IPMI device address is "pcmk-1.ipmi", the login name is "admin" and the password is "secret".
    • Its power supplies are connected to "pdu-1" on port 1 and "pdu-2" on port 1.
  • The fencing details for "pcmk-2" are:
    • IPMI device address is "pcmk-2.ipmi", the login name is "admin" and the password is "secret".
    • Its power supplies are connected to "pdu-1" on port 2 and "pdu-2" on port 2.

Please adapt the example below to the names, addresses, credentials and fence agents you are using in your cluster.

Configure IPMI fencing

Configure the IPMI fence device for "pcmk-1":

pcs stonith create fence_pcmk1_ipmi fence_ipmilan \
    pcmk_host_list="pcmk-1" ipaddr="pcmk-1.ipmi" \
    action="reboot" login="admin" passwd="secret" delay=15 \
    op monitor interval=60s

Configure the IPMI fence device for "pcmk-2":

pcs stonith create fence_pcmk2_ipmi fence_ipmilan \
    pcmk_host_list="pcmk-2" ipaddr="pcmk-2.ipmi" \
    action="reboot" login="admin" passwd="secret" delay=15 \
    op monitor interval=60s


Configure PDU fencing

If using Pacemaker 1.1.14 or newer

Configure the PDU fence devices for "pcmk-1":

pcs stonith create fence_pcmk1_psu1 fence_apc_snmp \
    pcmk_host_list="pcmk-1" ipaddr="pdu-1" \
    port="1" op monitor interval="60s"
pcs stonith create fence_pcmk1_psu2 fence_apc_snmp \
    pcmk_host_list="pcmk-1" ipaddr="pdu-2" \
    port="1" power_wait="5" op monitor interval="60s"

Configure the PDU fence devices for "pcmk-2":

pcs stonith create fence_pcmk2_psu1 fence_apc_snmp \
    pcmk_host_list="pcmk-2" ipaddr="pdu-1" \
    port="2" op monitor interval="60s"
pcs stonith create fence_pcmk2_psu2 fence_apc_snmp \
    pcmk_host_list="pcmk-2" ipaddr="pdu-2" \
    port="2" power_wait="5" op monitor interval="60s"
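
The four commands above follow a regular pattern, which can help when adapting them to other node names or outlet numbers. The following is a minimal sketch that only prints the commands instead of running them. The function name "pdu_create_cmds" is hypothetical, and the sketch assumes the naming convention used on this page (node "pcmk-N", devices "fence_pcmkN_psu1" and "fence_pcmkN_psu2", PDUs "pdu-1" and "pdu-2").

```shell
# Print (do not run) the two PDU stonith commands for one node.
# Arguments: node name, outlet port used by that node on both PDUs.
pdu_create_cmds() {
    node="$1"
    port="$2"
    short=$(printf '%s' "$node" | sed 's/-//g')   # pcmk-1 -> pcmk1
    for n in 1 2; do
        extra=""
        # As in the commands above, only the second PDU gets power_wait.
        if [ "$n" -eq 2 ]; then
            extra=' power_wait="5"'
        fi
        printf 'pcs stonith create fence_%s_psu%s fence_apc_snmp pcmk_host_list="%s" ipaddr="pdu-%s" port="%s"%s op monitor interval="60s"\n' \
            "$short" "$n" "$node" "$n" "$port" "$extra"
    done
}

pdu_create_cmds pcmk-1 1
pdu_create_cmds pcmk-2 2
```

Review the printed commands before running any of them against a real cluster.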

If using an older Pacemaker version

PDU fencing is more complicated with versions of Pacemaker older than 1.1.14.
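
To decide which of the two PDU configurations applies, the Pacemaker version can be compared against 1.1.14. The sketch below is one possible way to do that in shell. It takes the "dc-version" value from the starting configuration as example input, and the helper name "version_ge" is hypothetical. It assumes a "sort" that supports version sorting ("-V"), as GNU coreutils does.

```shell
# version_ge A B: succeeds when version A is the same as or newer than B.
# Relies on "sort -V" (version sort), available in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example input: the dc-version value from the starting configuration.
dc_version="1.1.14-70404b0"
pcmk_version="${dc_version%%-*}"   # drop the build hash -> 1.1.14

if version_ge "$pcmk_version" "1.1.14"; then
    echo "use the simple PDU configuration"
else
    echo "use the off/on PDU configuration"
fi
```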

In order for fencing to work when two separate PDUs are used, we must ensure that there is a period of time during which both PDUs have their ports powered off at the same time. To do this, we need to set up four primitives: one for each device set to an "off" action and another for each device set to an "on" action. This will allow us to call "pdu1:x off -> pdu2:x off -> pdu1:x on -> pdu2:x on".

Note: Prior to version 1.1.10, 'action="..."' was ignored. If you have a version of pacemaker below this (including 1.1.10 rc5 and older), you will need to replace 'action="..."' with 'pcmk_reboot_action="..."'.

Now configure the four PDU fence methods for "pcmk-1". Note that we've added 'power_wait="5"' to the second PDU's "off" action. Later, we will stitch these actions together and this argument will tell pacemaker to wait 5 seconds after turning off the second PDU before restoring power. This gives plenty of time for the node's power supplies to completely drain, ensuring that the node loses power. You will also note that the "monitor" operation is only set on the "off" actions. There is no need to monitor the status of the "on" actions as it would be redundant.

# Node 1 - Off
pcs stonith create fence_pcmk1_psu1_off fence_apc_snmp \
    pcmk_host_list="pcmk-1" ipaddr="pdu-1" action="off" \
    port="1" op monitor interval="60s"
pcs stonith create fence_pcmk1_psu2_off fence_apc_snmp \
    pcmk_host_list="pcmk-1" ipaddr="pdu-2" action="off" \
    port="1" power_wait="5" \
    op monitor interval="60s"

# Node 1 - On
pcs stonith create fence_pcmk1_psu1_on fence_apc_snmp \
    pcmk_host_list="pcmk-1" ipaddr="pdu-1" action="on" \
    port="1"
pcs stonith create fence_pcmk1_psu2_on fence_apc_snmp \
    pcmk_host_list="pcmk-1" ipaddr="pdu-2" action="on" \
    port="1"

Finally, configure the four PDU fence methods for "pcmk-2":

# Node 2 - Off
pcs stonith create fence_pcmk2_psu1_off fence_apc_snmp \
    pcmk_host_list="pcmk-2" ipaddr="pdu-1" action="off" \
    port="2" op monitor interval="60s"
pcs stonith create fence_pcmk2_psu2_off fence_apc_snmp \
    pcmk_host_list="pcmk-2" ipaddr="pdu-2" action="off" \
    port="2" power_wait="5" \
    op monitor interval="60s"

# Node 2 - On
pcs stonith create fence_pcmk2_psu1_on fence_apc_snmp \
    pcmk_host_list="pcmk-2" ipaddr="pdu-1" action="on" \
    port="2"
pcs stonith create fence_pcmk2_psu2_on fence_apc_snmp \
    pcmk_host_list="pcmk-2" ipaddr="pdu-2" action="on" \
    port="2"

Enable fencing

Now that the fence devices are configured, we can turn fencing on by setting the "stonith-enabled" property to true.

pcs property set stonith-enabled=true

Use IPMI as first fencing level

The next step is to tell Pacemaker the order in which we want the fencing methods to run, using fencing levels.

Each fencing level may have one or more fence devices. When fencing is required, Pacemaker will try each level in sequence, stopping at the first level that succeeds. Therefore, separate levels function as a "fallback" mechanism (logical "or"). At any given level, all the devices in that level will be tried in succession, and all must succeed for the level to succeed (logical "and").
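
The "or" between levels and the "and" within a level can be illustrated with a small toy model. This is not Pacemaker code, only a sketch of the rule described above. The function names "try_device" and "run_levels" and the "FAILED" variable are hypothetical. Each argument to "run_levels" is one level, written as a comma-separated device list.

```shell
# Toy model of fencing-level evaluation; not Pacemaker code.
# try_device is a stand-in for a real fence attempt: a device
# "succeeds" unless its name is listed in $FAILED.
try_device() {
    case " $FAILED " in
        *" $1 "*) return 1 ;;
        *) return 0 ;;
    esac
}

run_levels() {
    level=0
    for devices in "$@"; do
        level=$((level + 1))
        ok=1
        # Logical "and": every device in the level must succeed.
        for dev in $(printf '%s' "$devices" | tr ',' ' '); do
            try_device "$dev" || { ok=0; break; }
        done
        # Logical "or": stop at the first level that succeeds.
        if [ "$ok" -eq 1 ]; then
            echo "fenced at level $level"
            return 0
        fi
    done
    echo "fencing failed"
    return 1
}

# Simulate an unreachable IPMI interface: level 1 fails, level 2 succeeds.
FAILED="fence_pcmk1_ipmi"
run_levels fence_pcmk1_ipmi fence_pcmk1_psu1,fence_pcmk1_psu2
```

With the IPMI device marked as failed, the model falls through to the second level, which succeeds only because both PDU devices succeed.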

In this example, tell pacemaker that the IPMI-based fence devices are the first methods to use:

pcs stonith level add 1 pcmk-1 fence_pcmk1_ipmi
pcs stonith level add 1 pcmk-2 fence_pcmk2_ipmi

Use PDUs as second fencing level

If using Pacemaker 1.1.14 or newer

Next, we tell pacemaker to use the switched PDUs as the second method:

pcs stonith level add 2 pcmk-1 fence_pcmk1_psu1,fence_pcmk1_psu2
pcs stonith level add 2 pcmk-2 fence_pcmk2_psu1,fence_pcmk2_psu2

Pacemaker will automatically remap sequential reboots in a fencing level to all-off-then-all-on.
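
The effect of that remapping on the order of actions can be sketched as follows. This only illustrates the ordering and says nothing about how Pacemaker implements it. The function name "remapped_order" is hypothetical.

```shell
# Illustration only: print the order of actions for one fencing level
# once sequential reboots are remapped to all-off-then-all-on.
remapped_order() {
    for dev in "$@"; do
        echo "$dev off"
    done
    for dev in "$@"; do
        echo "$dev on"
    done
}

remapped_order fence_pcmk1_psu1 fence_pcmk1_psu2
```

For "pcmk-1" this prints both "off" actions before either "on" action, so there is a window in which both outlets are without power.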

If using an older Pacemaker version

Next, we tell pacemaker to use the switched PDUs as the second method:

pcs stonith level add 2 pcmk-1 fence_pcmk1_psu1_off,fence_pcmk1_psu2_off,fence_pcmk1_psu1_on,fence_pcmk1_psu2_on
pcs stonith level add 2 pcmk-2 fence_pcmk2_psu1_off,fence_pcmk2_psu2_off,fence_pcmk2_psu1_on,fence_pcmk2_psu2_on

Test

You can (and should!) test this by unplugging the IPMI interface for "pcmk-1" and then crashing the node, which triggers "pcmk-2" to fence it. After the IPMI fence attempt times out, you should see PDU 1 port 1 turn off, then PDU 2 port 1 turn off, at which point the crashed node loses power. PDU 1 port 1 should then turn back on, followed by PDU 2 port 1. If you configured your server's BIOS to power on after power loss or to return to its last state after power loss, your server should start to power back on.
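
Before pulling cables, it can help to review the configured levels, and fencing can also be requested by hand rather than by crashing the node. The sketch below wraps the commands in a hypothetical "run" helper that only prints them unless "DRY_RUN=0" is set. The pcs subcommands shown are assumed to be available in your pcs release; check "pcs stonith --help" if they differ.

```shell
# DRY_RUN=1 (the default here) only prints each command.
# Set DRY_RUN=0 to actually run them on a cluster node.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run pcs stonith level            # list the configured fencing levels
run pcs stonith level verify     # check that devices and nodes in the levels exist
run pcs stonith fence pcmk-1     # on pcmk-2: ask the cluster to fence pcmk-1
```

A manual fence request with the IPMI interface unplugged should exercise the same fallback from level 1 to level 2 as a crash would.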