DRBD HowTo 1.0

From ClusterLabs

Credits

This HowTo is based on the excellent HowTo for DRBD in older Heartbeat CRM/Pacemaker versions by Lars Marowsky-Bree.

Using DRBD with Pacemaker 1.0

You have two basic options for integrating DRBD with Pacemaker 1.0 (referred to simply as Pacemaker for the rest of this document).

  1. Use the legacy Heartbeat v1 style drbddisk resource agent to move the Primary role. In this case, you have to let /etc/init.d/drbd load and configure DRBD.
  2. Use the DRBD OCF resource agent. In this case, you must not let init load and configure DRBD, because this resource agent does that itself.

This document describes the second option.

Note: as of 2008-02-15, the DRBD developers recommend using the drbddisk RA, although the DRBD OCF RA has been reported to work by some users (decide for yourself!).

Note: added 2009-07-26, the DRBD developers, LinBit, have a very good HowTo on their own site, which now uses the DRBD OCF RA. You can find it here:

Advantages of using the OCF RA

Pacemaker natively supports multi-state resources: resources that can be in one of three states instead of the usual two. In addition to stopped and started, there is a promoted (master) state; slave is equivalent to started. See http://wiki.linux-ha.org/v2/Concepts/MultiState for more details.

The first resource agent to make full use of this functionality is the DRBD one. DRBD's primary and secondary roles map directly onto this concept; in fact, they were used to design the model.

While configuring DRBD this way - as opposed to just letting drbddisk move the primary role, as in v1-legacy style configurations - is slightly more complex, it provides several advantages:

  • The slave/secondary side is also monitored for health.
  • Smarter placement of the master/primary, because Pacemaker gets feedback on which side is preferable to promote.
  • The complete status is reflected in monitoring tools.

Prerequisites

  • DRBD must not be started by init. Prevent DRBD from being started by your init system (e.g. chkconfig drbd off or insserv -r drbd; see the sketch after this list). The DRBD RA takes care of loading the DRBD module and all other start-up requirements.
  • The DRBD RA is tested and known to work with DRBD v7; it is reported to work with v8, but is not 100% tested there.
  • This HowTo makes use of the crm shell. If you want information on using the DRBD OCF RA in Heartbeat installations with Pacemaker 0.6.x or the even older built-in crm, please follow http://wiki.linux-ha.org/DRBD/HowTov2
  • A basic familiarity with DRBD, Pacemaker and the crm-shell-style configuration is assumed.
  • The crm commands below will need to be somewhat customized to fit your setup. You should be familiar with how to use this tool.
  • This cannot be correctly configured using the GUI. You will need to use the command-line tools.
  • DRBD version 8's Primary/Primary mode is not supported (yet)
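How exactly you disable the init script depends on your distribution; as a minimal sketch (the update-rc.d line is an assumption for Debian/Ubuntu systems, which are not mentioned above):

 chkconfig drbd off           # chkconfig-based distributions
 insserv -r drbd              # insserv-based distributions (e.g. SUSE)
 update-rc.d -f drbd remove   # assumption: Debian/Ubuntu equivalent

Afterwards, verify that the drbd init script is no longer enabled in any runlevel before handing control over to Pacemaker.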

Basic configuration

The most common setup is to configure DRBD to replicate a volume between two fixed nodes, using IP addresses statically assigned on each.

Setting up DRBD

Please refer to the DRBD docs on how to install it and set it up.

From now on, we will assume that you've set up DRBD and that it is working (test it with the DRBD init script outside Pacemaker's control). If not, debug this first.
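For orientation, a minimal v8-style /etc/drbd.conf resource section might look roughly like the sketch below. The node names match the examples in this HowTo; the backing device /dev/sdb1 and the replication addresses 192.168.1.1/192.168.1.2 are assumptions that you must replace with your own values.

resource drbd0 {
  protocol C;
  on xen-1 {
    device    /dev/drbd0;
    disk      /dev/sdb1;         # assumption: backing device on xen-1
    address   192.168.1.1:7788;  # assumption: replication IP of xen-1
    meta-disk internal;
  }
  on xen-2 {
    device    /dev/drbd0;
    disk      /dev/sdb1;         # assumption: backing device on xen-2
    address   192.168.1.2:7788;  # assumption: replication IP of xen-2
    meta-disk internal;
  }
}

A quick manual test outside Pacemaker could then be:

 /etc/init.d/drbd start
 cat /proc/drbd      # both nodes should report cs:Connected
 /etc/init.d/drbd stop

Stop DRBD again before putting it under Pacemaker's control.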

Configuring the resource in the CIB

In the crm shell, you first have to create the primitive resource and then embed that into the master resource.

crm commands
configure

primitive drbd0 ocf:heartbeat:drbd \
 params drbd_resource=drbd0 \
 op monitor role=Master interval=59s timeout=30s \
 op monitor role=Slave interval=60s timeout=30s

ms ms-drbd0 drbd0 \
 meta clone-max=2 notify=true globally-unique=false target-role=stopped

commit

quit

The primitive DRBD resource, similar to what you would have used to configure drbddisk, is now embedded in a complex master object. This specifies the abilities and limitations of DRBD: there can be only two instances (clone-max), at most one per node (clone-node-max, which defaults to 1), and only one master ever (master-max, which also defaults to 1). The notify attribute specifies that DRBD needs to be told about what happens to its peer; globally-unique set to false lets Pacemaker know that the instances cannot be told apart on a single node.
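For clarity, the defaults mentioned above can also be written out explicitly; the following is equivalent to the ms definition given earlier:

ms ms-drbd0 drbd0 \
 meta clone-max=2 clone-node-max=1 master-max=1 master-node-max=1 \
 notify=true globally-unique=false target-role=stopped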

Note that we're creating the resource in the stopped state first, so that we can finish configuring its constraints and dependencies before activating it.

Specifying the nodes where the DRBD RA can be run

If you have a two-node cluster, you can skip this step, because obviously the resource can only run on those two nodes. If you want to run drbd0 on only two nodes out of a larger cluster, you will have to tell the cluster about this constraint:

crm configure location ms-drbd0-placement ms-drbd0 rule -inf: \#uname ne xen-1 and \#uname ne xen-2

This tells the Policy Engine that, first, ms-drbd0 cannot run anywhere except on xen-1 or xen-2, and second, that it can run on those two nodes.

Note: This assumes a symmetric cluster. If your cluster is asymmetric, you will have to invert the rules (don't worry: if you did not specifically configure an asymmetric cluster, your cluster is symmetric by default).
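For an asymmetric (opt-in) cluster, a hedged sketch of the inverted rule could look like the following, where the two nodes get a positive score instead of all other nodes being banned (the score of 100 is an arbitrary example):

crm configure location ms-drbd0-placement ms-drbd0 rule 100: \#uname eq xen-1 or \#uname eq xen-2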

Preferring a node to run the master role

With the configuration so far, the cluster would pick a node to promote DRBD on. If you want to prefer a node to run the master role (xen-1 in this example), you can express that like this:

crm configure location ms-drbd0-master-on-xen-1 ms-drbd0 rule role=master 100: \#uname eq xen-1

First success!

You can now activate the DRBD resource:

crm resource start ms-drbd0

It should be started and promoted on one of the two nodes - or, if you specified a constraint as shown above, on the node you preferred.
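To verify, you can check the cluster and DRBD status on either node; a quick check might look like this:

crm_mon -1      # one-shot cluster status; ms-drbd0 should show one Master and one Slave
cat /proc/drbd  # the promoted node should report ro:Primary/Secondary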

Referencing the master or slave resource in constraints

DRBD is rarely useful by itself; you will probably want to run a service on top of it. Or, very likely, you will want to mount the filesystem on the master side.

Let us assume that you've created an ext3 filesystem on /dev/drbd0, which you now want managed by the cluster as well. The filesystem resource object is straightforward, and if you have any experience with configuring Pacemaker at all, it will look rather familiar:

crm configure primitive fs0 ocf:heartbeat:Filesystem params fstype=ext3 directory=/mnt/share1 \
 device=/dev/drbd0 meta target-role=stopped

Make sure that the various settings match your setup. Again, this object has been created as stopped first.

Now the interesting bits. Obviously, the filesystem should only be mounted on the same node where drbd0 is in primary state, and only after drbd0 has been promoted, which is expressed in these two constraints:

crm commands
configure

order ms-drbd0-before-fs0 mandatory: ms-drbd0:promote fs0:start

colocation fs0-on-ms-drbd0 inf: fs0 ms-drbd0:Master

commit

quit

Et voila! You can now activate the filesystem resource, and it will be mounted at the proper time in the proper place.

crm resource start fs0
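To confirm that the filesystem followed the master role, a quick check could be:

crm_mon -1 | grep fs0     # fs0 should be Started on the current DRBD master
mount | grep /dev/drbd0   # run on the master node itself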

Just as this was done with a single filesystem resource, it can also be done with a group. In many cases, you will not just want a filesystem, but also an IP address and some sort of daemon running on top of the DRBD master. Put those resources into a group, use the constraints above, and replace fs0 with the name of your group. The following example includes an Apache webserver.

crm commands
configure

primitive drbd0 ocf:heartbeat:drbd \
 params drbd_resource=drbd0 \
 op monitor role=Master interval=59s timeout=30s \
 op monitor role=Slave interval=60s timeout=30s

ms ms-drbd0 drbd0 \
 meta clone-max=2 notify=true globally-unique=false target-role=stopped 

primitive fs0 ocf:heartbeat:Filesystem \
 params fstype=ext3 directory=/usr/local/apache/htdocs device=/dev/drbd0

primitive webserver ocf:heartbeat:apache \
 params configfile=/usr/local/apache/conf/httpd.conf httpd=/usr/local/apache/bin/httpd port=80 \
 op monitor interval=30s timeout=30s

primitive virtual-ip ocf:heartbeat:IPaddr2 \
 params ip=10.0.0.1 broadcast=10.0.0.255 nic=eth0 cidr_netmask=24 \
 op monitor interval=21s timeout=5s

group apache-group fs0 webserver virtual-ip

order ms-drbd0-before-apache-group mandatory: ms-drbd0:promote apache-group:start

colocation apache-group-on-ms-drbd0 inf: apache-group ms-drbd0:Master

location ms-drbd0-master-on-xen-1 ms-drbd0 rule role=master 100: #uname eq xen-1

commit

end

resource start ms-drbd0

quit

This will load the drbd module on both nodes and promote the instance on xen-1. After successful promotion, it will first mount /dev/drbd0 to /usr/local/apache/htdocs, then start the Apache webserver, and finally configure the service IP address 10.0.0.1/24 on network interface eth0.

Moving the master role to a different node

If you want to move the DRBD master role to the other node, you should not attempt to move the master role by itself. On top of DRBD, you will probably have a Filesystem resource or a resource group containing your application, filesystem, IP address and so on (remember, DRBD isn't usually useful by itself). To move the master role, move the resource that is colocated with the DRBD master (and properly ordered after it). This can be done with the crm shell or crm_resource. Given the group example from above, you would use

crm resource migrate apache-group [hostname] 

This will stop all resources in the group, demote the current master, promote the other DRBD instance and start the group after successful promotion.
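Note that crm resource migrate works by inserting a location constraint for the group; once the move has completed, you will probably want to remove that constraint again so the cluster is free to place the group in the future:

crm resource unmigrate apache-group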

Keeping the master role on a network connected node

It is most likely desirable to keep the master role on a node with a working network connection. This section assumes you are familiar with pingd. If you have configured pingd, all you need is an rsc_location constraint for the master role that looks at the pingd attribute of the node.
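If pingd is not configured yet, a minimal hedged sketch of a cloned pingd resource might look like this (the ping host 192.168.1.254, the multiplier and the dampen value are assumptions to adapt to your network):

crm configure primitive pingd ocf:pacemaker:pingd \
 params host_list=192.168.1.254 multiplier=100 dampen=5s \
 op monitor interval=15s timeout=20s

crm configure clone pingd-clone pingd meta globally-unique=false

With that in place, the constraint below can reference the pingd node attribute.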

crm configure location ms-drbd-0_master_on_connected_node ms-drbd0 \
 rule role=master -inf: not_defined pingd or pingd lte 0

This will force the master role off of any node with a pingd attribute value less than or equal to 0, or without a pingd attribute at all.

Note: This will prevent the master role and all its colocated resources from running at all if all your nodes lose network connection to the ping nodes.

If you don't want that, you can also configure a score other than -INFINITY, but that requires cluster-specific score calculations that depend on your number of resources, stickiness values and constraint scores.
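As a hedged illustration, such a finite score could be used instead of the -INFINITY rule above; the value -100 is an arbitrary example and has to be weighed against your resource stickiness and other constraint scores:

crm configure location ms-drbd-0_master_on_connected_node ms-drbd0 \
 rule role=master -100: not_defined pingd or pingd lte 0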