DRBD HowTo 1.0

= Using DRBD with Pacemaker 1.0 =

You basically have two options to integrate DRBD with Pacemaker 1.0 (which will be called just Pacemaker from now on in this document).


 * 1) Use the legacy Heartbeat v1 style drbddisk resource agent to move the Primary role. In this case, you have to let /etc/init.d/drbd load and configure DRBD.
 * 2) Use the DRBD OCF resource agent. In this case, you must not let init load and configure DRBD, because this resource agent does that itself.

This document describes the second option.

Note: as of 2008-02-15, the DRBD developers recommend using the drbddisk RA, although some users have reported that the DRBD OCF RA works for them (decide on your own!)

Advantages of using the OCF RA
Pacemaker natively supports multi-state resources: resources that can be in one of three states instead of the usual two. Besides stopped and started there is promoted (slave is equivalent to started). See http://wiki.linux-ha.org/v2/Concepts/MultiState for more details.

The first resource agent to make full use of this functionality is the DRBD one. DRBD's primary and secondary roles map directly onto the master and slave states; in fact, they were used to design the model.

While configuring DRBD this way - as opposed to just letting drbddisk move the primary role, as in v1-legacy style configurations - is slightly more complex, it does provide advantages:


 * The slave/secondary side is also monitored for health.
 * The secondary can be relocated and moved as well in response to failures (see below: Floating peers).
 * Smarter placement of the master/primary, because Pacemaker gets feedback about which side is preferable to promote.
 * The complete status is reflected in monitoring tools.

Prerequisites

 * DRBD must not be started by init. Prevent DRBD from being started by your init system (chkconfig drbd off, insserv -r drbd); the DRBD RA takes care of loading the DRBD module and all other start-up requirements.
 * The DRBD RA is tested and known to work with DRBD v7 and v8.
 * The XML bits described here are for Pacemaker versions >= 1.0. If you want information on using it in Heartbeat installations with pacemaker 0.6x or the even older built-in crm, please follow http://wiki.linux-ha.org/DRBD/HowTov2
 * A basic familiarity with DRBD, Pacemaker and the XML-style configuration is assumed.
 * The XML snippets below will need to be somewhat customized to fit your setup and be loaded into the CIB using cibadmin. You should be familiar with how to use this tool (see the example after this list).
 * This cannot be correctly configured using the GUI. You will need to use the command-line tools.
 * DRBD version 8's Primary/Primary mode is not supported (yet)
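
For example, to load the XML snippets shown later into the CIB, you could save each one to a file and feed it to cibadmin. The file names here are just placeholders; use whatever you like:

cibadmin -C -o resources -x ms-drbd0.xml
cibadmin -C -o constraints -x ms-drbd0-placement.xml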

Basic configuration
The most common way to use DRBD is to replicate a volume between two fixed nodes, using IP addresses statically assigned on each.

Setting up DRBD
Your /etc/drbd.conf will look similar to this; of course, you must adjust it to fit your environment. This example configures /dev/drbd0 to be replicated between xen-1 and xen-2; the instance is called drbd0:

resource drbd0 {
  protocol C;
  incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";

  startup {
    degr-wfc-timeout 120;    # 2 minutes.
  }

  disk {
    on-io-error  pass_on;
  }

  net {
    # TODO: Should these timeouts be relative to some heartbeat settings?
    # timeout      60;    #  6 seconds  (unit = 0.1 seconds)
    # connect-int  10;    # 10 seconds  (unit = 1 second)
    # ping-int     10;    # 10 seconds  (unit = 1 second)
    on-disconnect reconnect;
  }

  syncer {
    rate 100M;
    group 1;
    al-extents 257;
  }

  on xen-1 {
    device    /dev/drbd0;
    disk      /dev/hdd1;
    address   192.168.200.1:7788;
    meta-disk internal;
  }

  on xen-2 {
    device    /dev/drbd0;
    disk      /dev/hdc1;
    address   192.168.200.2:7788;
    meta-disk internal;
  }
}
In this /etc/drbd.conf, 'drbd0' is the identifier of this DRBD instance; you will need it to configure the resource in the CIB correctly. The name is arbitrary, but I chose to name it after the device node.

From now on, we will assume that you have set up DRBD and that it is working (test it with the DRBD init script, outside Pacemaker's control, as sketched below). If not, debug this first. The DRBD User's Guide is quite helpful for setting DRBD up; see http://www.drbd.org/users-guide/index.html
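
A minimal manual test could look like this (drbd0 is the resource name from the configuration above; run the init script by hand on both nodes, it must still stay disabled at boot as described in the prerequisites):

/etc/init.d/drbd start     # on both nodes, for this test only
cat /proc/drbd             # wait until both sides report cs:Connected
drbdadm primary drbd0      # on one node: promotion must succeed
drbdadm secondary drbd0    # demote again before handing control to Pacemaker
/etc/init.d/drbd stop      # on both nodes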

Configuring the resource in the CIB
As explained above, the resource is configured differently from a drbddisk resource in the legacy setup. It makes use of some more advanced CIB features (explained below) and goes into the resources section:

<master id="ms-drbd0">
  <meta_attributes id="ma-ms-drbd0">
    <nvpair id="ma-ms-drbd0-1" name="clone-max" value="2"/>
    <nvpair id="ma-ms-drbd0-2" name="clone-node-max" value="1"/>
    <nvpair id="ma-ms-drbd0-3" name="master-max" value="1"/>
    <nvpair id="ma-ms-drbd0-4" name="master-node-max" value="1"/>
    <nvpair id="ma-ms-drbd0-5" name="notify" value="yes"/>
    <nvpair id="ma-ms-drbd0-6" name="globally-unique" value="false"/>
    <nvpair id="ma-ms-drbd0-7" name="target-role" value="stopped"/>
  </meta_attributes>
  <primitive id="drbd0" class="ocf" provider="heartbeat" type="drbd">
    <instance_attributes id="ia-ms-drbd0">
      <nvpair id="ia-ms-drbd0-1" name="drbd_resource" value="drbd0"/>
    </instance_attributes>
    <operations>
      <op name="monitor" id="op-drbd0-1" interval="59s" timeout="30s" role="Master"/>
      <op name="monitor" id="op-drbd0-2" interval="60s" timeout="30s" role="Slave"/>
    </operations>
  </primitive>
</master>

The primitive DRBD resource, similar to what you would have used to configure drbddisk, is now embedded in a complex object master. This specifies the abilities and limitations of DRBD: there can be only two instances (clone-max), one per node (clone-node-max), and only one master ever (master-max). The notify attribute specifies that DRBD needs to be told about what happens to its peer; globally-unique set to false lets Pacemaker know that the instances cannot be told apart on a single node.

Note that we're creating the resource in stopped state first, so that we can finish configuring its constraints and dependencies before activating it.
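
If you have loaded the snippet above with cibadmin, a one-shot status check should already show the master/slave set, but in stopped state (crm_mon -1 is a standard Pacemaker tool; the exact output format varies between versions):

crm_mon -1    # ms-drbd0 should appear, with both instances stopped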

Specifying the nodes where the DRBD RA can be run
If you have a two-node cluster, you could skip this step, because obviously the resource can only run on those two. If your cluster has more than two nodes and you want drbd0 to run on only two of them, you will have to tell the cluster about this constraint:

<rsc_location id="ms-drbd0-placement" rsc="ms-drbd0">
  <rule id="ms-drbd0-placement-rule-1" score="-INFINITY" boolean-op="and">
    <expression id="ms-drbd0-placement-exp-1" attribute="#uname" operation="ne" value="xen-1"/>
    <expression id="ms-drbd0-placement-exp-2" attribute="#uname" operation="ne" value="xen-2"/>
  </rule>
</rsc_location>

This constraint tells the Policy Engine that drbd0 cannot run anywhere other than xen-1 or xen-2: the rule matches every node whose name is neither xen-1 nor xen-2 and assigns it a score of -INFINITY.

Note: This assumes a symmetric cluster. If your cluster is asymmetric, you will have to invert the rules, as sketched below (don't worry: if you do not specifically configure it otherwise, your cluster is symmetric by default).
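
For an asymmetric (opt-in) cluster, a sketch of the inverted rule could look like the following: instead of banning all other nodes, the two DRBD nodes are given a positive score. The ids and the score value are only suggestions; adjust them to your setup:

<rsc_location id="ms-drbd0-placement" rsc="ms-drbd0">
  <rule id="ms-drbd0-placement-rule-1" score="INFINITY" boolean-op="or">
    <expression id="ms-drbd0-placement-exp-1" attribute="#uname" operation="eq" value="xen-1"/>
    <expression id="ms-drbd0-placement-exp-2" attribute="#uname" operation="eq" value="xen-2"/>
  </rule>
</rsc_location>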

Preferring a node to run the master role
With the configuration so far, the cluster would pick a node to promote DRBD on. If you want to prefer a node to run the master role (xen-1 in this example), you can express that like this:

<rsc_location id="ms-drbd0-master-placement" rsc="ms-drbd0">
  <rule id="ms-drbd0-master-on-xen-1" role="master" score="100">
    <expression id="ms-drbd0-master-on-xen-1-exp" attribute="#uname" operation="eq" value="xen-1"/>
  </rule>
</rsc_location>

First success!
You can now activate the DRBD resource by changing its target-role:

crm_resource -r ms-drbd0 -v '#default' --meta -p target-role

It should be started and promoted on one of the two nodes - or, if you specified a constraint as shown above, on the node you preferred.

Note: Until Bug 1995 is fixed, you will need to delete the target-role attribute now to get a promoted DRBD device; setting it to #default as above is not sufficient on its own. Consider this an inconvenient workaround until the bug is fixed. I don't have a better one.

crm_resource -r ms-drbd0 --meta -d target-role
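
To check that the promotion actually happened, a quick look at the cluster status and at DRBD itself should suffice:

crm_mon -1       # ms-drbd0 should list one node under Masters and one under Slaves
cat /proc/drbd   # on the promoted node, the resource should report ro:Primary/Secondary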

Referencing the master or slave resource in constraints
DRBD is rarely useful by itself; you will probably want to run a service on top of it. Or, very likely, you want to mount the filesystem on the master side.

Let us assume that you have created an ext3 filesystem on /dev/drbd0, which you now want managed by the cluster as well. The filesystem resource object is straightforward and, if you have any experience with configuring Pacemaker at all, will look rather familiar:

<primitive class="ocf" provider="heartbeat" type="Filesystem" id="fs0">
  <meta_attributes id="ma-fs0">
    <nvpair name="target-role" id="ma-fs0-1" value="stopped"/>
  </meta_attributes>
  <instance_attributes id="ia-fs0">
    <nvpair id="ia-fs0-1" name="fstype" value="ext3"/>
    <nvpair id="ia-fs0-2" name="directory" value="/mnt/share1"/>
    <nvpair id="ia-fs0-3" name="device" value="/dev/drbd0"/>
  </instance_attributes>
</primitive>

Make sure that the various settings match your setup. Again, this object has been created as stopped first.

Now the interesting bits. Obviously, the filesystem should only be mounted on the same node where drbd0 is in primary state, and only after drbd0 has been promoted, which is expressed in these two constraints:

<rsc_order id="ms-drbd0-before-fs0" first="ms-drbd0" then="fs0" first-action="promote" then-action="start"/>
<rsc_colocation id="fs0-on-ms-drbd0" rsc="fs0" with-rsc="ms-drbd0" with-rsc-role="Master" score="INFINITY"/>

Et voilà! You can now activate the filesystem resource, and it will be mounted at the proper time in the proper place.
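
Activation works the same way as for the DRBD resource. Assuming you created fs0 with target-role=stopped as shown above, deleting the attribute lets it start:

crm_resource -r fs0 --meta -d target-role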

Just as this was done with a single filesystem resource, it can be done with a group: in many cases, you will not just want a filesystem, but also an IP address and some sort of daemon running on top of the DRBD master. Put those resources in a group, use the constraints above and replace fs0 with the name of your group. The following example includes an Apache webserver.

<resources>
  <master id="ms-drbd0">
    <meta_attributes id="ma-ms-drbd0">
      <nvpair id="ma-ms-drbd0-1" name="clone-max" value="2"/>
      <nvpair id="ma-ms-drbd0-2" name="clone-node-max" value="1"/>
      <nvpair id="ma-ms-drbd0-3" name="master-max" value="1"/>
      <nvpair id="ma-ms-drbd0-4" name="master-node-max" value="1"/>
      <nvpair id="ma-ms-drbd0-5" name="notify" value="yes"/>
      <nvpair id="ma-ms-drbd0-6" name="globally-unique" value="false"/>
      <nvpair id="ma-ms-drbd0-7" name="target-role" value="stopped"/>
    </meta_attributes>
    <primitive id="drbd0" class="ocf" provider="heartbeat" type="drbd">
      <instance_attributes id="ia-drbd0">
        <nvpair id="ia-drbd0-1" name="drbd_resource" value="drbd0"/>
      </instance_attributes>
      <operations>
        <op id="op-drbd0-1" name="monitor" interval="59s" timeout="10s" role="Master"/>
        <op id="op-drbd0-2" name="monitor" interval="60s" timeout="10s" role="Slave"/>
      </operations>
    </primitive>
  </master>
  <group id="apache-group">
    <meta_attributes id="ma-apache">
      <nvpair id="ma-apache-1" name="target-role" value="started"/>
    </meta_attributes>
    <primitive class="ocf" provider="heartbeat" type="Filesystem" id="fs0">
      <instance_attributes id="ia-fs0">
        <nvpair id="ia-fs0-1" name="fstype" value="ext3"/>
        <nvpair id="ia-fs0-2" name="directory" value="/usr/local/apache/htdocs"/>
        <nvpair id="ia-fs0-3" name="device" value="/dev/drbd0"/>
      </instance_attributes>
    </primitive>
    <primitive class="ocf" provider="heartbeat" type="apache" id="webserver">
      <instance_attributes id="ia-webserver">
        <nvpair id="ia-webserver-1" name="configfile" value="/usr/local/apache/conf/httpd.conf"/>
        <nvpair id="ia-webserver-2" name="httpd" value="/usr/local/apache/bin/httpd"/>
        <nvpair id="ia-webserver-3" name="port" value="80"/>
      </instance_attributes>
      <operations>
        <op id="op-webserver-1" name="monitor" interval="30s" timeout="30s"/>
      </operations>
    </primitive>
    <primitive id="virtual-ip" class="ocf" type="IPaddr2" provider="heartbeat">
      <instance_attributes id="ia-virtual-ip">
        <nvpair id="ia-virtual-ip-1" name="ip" value="10.0.0.1"/>
        <nvpair id="ia-virtual-ip-2" name="broadcast" value="10.0.0.255"/>
        <nvpair id="ia-virtual-ip-3" name="nic" value="eth0"/>
        <nvpair id="ia-virtual-ip-4" name="cidr_netmask" value="24"/>
      </instance_attributes>
      <operations>
        <op id="op-virtual-ip-1" name="monitor" interval="21s" timeout="5s"/>
      </operations>
    </primitive>
  </group>
</resources>
<constraints>
  <rsc_order id="ms-drbd0-before-apache-group" first="ms-drbd0" then="apache-group" first-action="promote" then-action="start"/>
  <rsc_colocation id="apache-group-on-ms-drbd0" rsc="apache-group" with-rsc="ms-drbd0" with-rsc-role="Master" score="INFINITY"/>
  <rsc_location id="ms-drbd0-master-placement" rsc="ms-drbd0">
    <rule id="ms-drbd0-master-on-xen-1" role="master" score="100">
      <expression id="ms-drbd0-master-on-xen-1-exp" attribute="#uname" operation="eq" value="xen-1"/>
    </rule>
  </rsc_location>
</constraints>

This will load the drbd module on both nodes and promote the instance on xen-1. After successful promotion, it will first mount /dev/drbd0 to /usr/local/apache/htdocs, then start the Apache webserver, and finally configure the service IP address 10.0.0.1/24 on network card eth0.
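
If you want to double-check the result, something along these lines should do; run the last two commands on the node holding the master role (xen-1 in this example):

crm_mon -1           # apache-group should be started, ms-drbd0 master on xen-1
mount | grep drbd0   # /dev/drbd0 should be mounted on /usr/local/apache/htdocs
ip addr show eth0    # 10.0.0.1/24 should be configured on eth0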

Moving the master role to a different node
If you want to move the DRBD master role to the other node, you should not attempt to move the master role by itself. On top of DRBD, you will probably have a Filesystem resource or a resource group with your application, filesystem, IP address and so on (remember, DRBD is rarely useful by itself). If you want to move the master role, you can accomplish that by moving the resource that is colocated with the DRBD master (and properly ordered). This can be done with crm_resource. Given the group example from above, you would use

crm_resource -M -r apache-group [-H hostname]

This will stop all resources in the group, demote the current master, promote the other DRBD instance and start the group after successful promotion.
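
Note that crm_resource -M works by inserting a location constraint that pins the group to the target node. Once the move is done, you will probably want to remove that constraint again so the cluster (and with it the master role) can be placed freely in case of later failures. Assuming the stock crm_resource from Pacemaker 1.0, -U (un-migrate) removes the constraint created by -M:

crm_resource -U -r apache-group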