SystemHealth

System health is a feature that lets resources in a cluster implicitly take into account a score indicating the health of each node.

This feature is implemented in two parts. The first part is a change in Pacemaker. The second part consists of health daemons that set health attributes.

Changes in Pacemaker
Pacemaker's policy engine will gain several configuration entries. The first is node-health-strategy. The possible values for this key are:


 * none
 * migrate-on-red
 * only-green
 * progressive
 * custom

none is the default value. This setting will have no effect on weight calculations within Pacemaker.

The next three values (migrate-on-red, only-green, and progressive) change weight calculations within Pacemaker as follows. Every resource defined within Pacemaker will now search each node for attributes whose names start with #health. Examples include #health, #health-ipmi, #health-smart, #health-foo-bar, and so on. An attribute can have the following values:


 * red
 * yellow
 * green
 * integer value

The scores of all attributes on a node that start with #health will be added to any other weights defined for resources in the system. The resulting weights determine on which node a resource will run.

The differences between migrate-on-red, only-green, and progressive are as follows:


 * migrate-on-red - red will have a value of -INF, yellow and green will have values of 0.
 * only-green - red and yellow will have values of -INF, green will have a value of 0.
 * progressive - red, yellow, and green will take their values from the corresponding policy engine settings:
   * pe-node-health-score-red (Note: the default is -INF)
   * pe-node-health-score-yellow (Note: the default is 0)
   * pe-node-health-score-green (Note: the default is 0)
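
The strategies above can be illustrated with a small sketch. It is not Pacemaker's implementation: the function names are hypothetical, -INF is modeled with Python's floating-point negative infinity, and the "none" and "custom" strategies are simply treated as contributing no implicit score.

```python
import math

NEG_INF = -math.inf  # stand-in for Pacemaker's -INF (an assumption of this sketch)


def color_scores(strategy, red=NEG_INF, yellow=0, green=0):
    """Return the score of each color under the given strategy.

    The red/yellow/green arguments model the three progressive
    settings and are only consulted for the "progressive" strategy.
    """
    if strategy == "migrate-on-red":
        return {"red": NEG_INF, "yellow": 0, "green": 0}
    if strategy == "only-green":
        return {"red": NEG_INF, "yellow": NEG_INF, "green": 0}
    if strategy == "progressive":
        return {"red": red, "yellow": yellow, "green": green}
    raise ValueError(f"no color scores for strategy {strategy!r}")


def node_health_score(attributes, strategy, **progressive_scores):
    """Sum the scores of all node attributes whose names start with #health."""
    if strategy in ("none", "custom"):
        # No implicit health scoring is applied in this sketch.
        return 0
    table = color_scores(strategy, **progressive_scores)
    total = 0
    for name, value in attributes.items():
        if not name.startswith("#health"):
            continue  # only health attributes take part
        # A value is either a color name or an integer score.
        total += table[value] if value in table else int(value)
    return total
```

For example, a node with `{"#health-ipmi": "yellow", "#health-smart": "green"}` scores 0 under migrate-on-red but -INF under only-green, so under only-green no resource would be placed there.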

custom indicates to Pacemaker that the system administrator will define rules that include whichever health attributes they deem appropriate for their setup.

Health Daemons
A health daemon is a program that periodically queries, or listens for events about, the health status of a system. When it detects a change in health, it notifies Pacemaker via the attrd_updater command.
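
A minimal sketch of such a daemon's polling loop follows. It assumes the -n (attribute name) and -U (update value) options of attrd_updater. The function names, the `check` callback, and the attribute name `#health-example` are hypothetical and chosen only for illustration.

```python
import subprocess
import time


def build_attrd_command(name, value):
    """Build the attrd_updater command line that sets one health attribute."""
    if not name.startswith("#health"):
        raise ValueError("health attribute names must start with #health")
    return ["attrd_updater", "-n", name, "-U", str(value)]


def report_health(name, value):
    """Notify Pacemaker by running attrd_updater (must run on a cluster node)."""
    subprocess.run(build_attrd_command(name, value), check=True)


def watch(check, name, report=report_health, interval=60.0, max_polls=None):
    """Poll check() and report the attribute only when its value changes.

    check is a caller-supplied function returning "red", "yellow",
    "green", or an integer. max_polls bounds the loop (None = forever).
    """
    last = None
    polls = 0
    while max_polls is None or polls < max_polls:
        status = check()
        if status != last:
            report(name, status)
            last = status
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval)
```

A real daemon would replace `check` with a query against one of the mechanisms listed below, for instance `watch(read_sensor_status, "#health-example")`, where `read_sensor_status` is a placeholder for that query.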

Some mechanisms that report on the health of a system include:


 * IPMI (Intelligent Platform Management Interface) http://www.intel.com/design/servers/ipmi/ipmi.htm
 * iBMC (Integrated Baseboard Management Controller)
 * /var/log/mcelog
 * /var/log/messages
 * RSA2 (Remote Supervisor Adapter 2)
 * sysfs (Linux kernel filesystem)
 * SMART (Self-Monitoring, Analysis, and Reporting Technology)