Update Resource Agent for OCF 1.1

The OCF 1.1 Resource Agent API standard, released in 2021, offers new features for resource agents. This document describes how to update an existing OCF 1.0-compliant resource agent to be compatible with OCF 1.1.

Version
The only required step for OCF 1.1 support is to update the <version> element in the top level of the meta-data:
 <version>1.1</version>

That's it! Everything else is optional.

Note that <version> is the version of the OCF standard that the agent supports; the version attribute of the <resource-agent> element is the version of the agent itself (and can be any value desired).
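
As a minimal sketch, a shell agent's meta-data output might look like the following. The function name example_meta_data, the agent name example-agent, its own version 0.1, and the timeouts are hypothetical placeholders; only the <version> element matters for OCF 1.1 support.

```shell
# Hypothetical meta-data function for an agent named "example-agent".
example_meta_data() {
    cat <<'EOF'
<?xml version="1.0"?>
<resource-agent name="example-agent" version="0.1">
  <!-- Version of the OCF standard that the agent supports -->
  <version>1.1</version>
  <parameters/>
  <actions>
    <action name="start" timeout="20s"/>
    <action name="stop" timeout="20s"/>
    <action name="monitor" timeout="20s" interval="10s"/>
    <action name="meta-data" timeout="5s"/>
  </actions>
</resource-agent>
EOF
}
```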

Description
In the OCF 1.0 standard, there was no place for a description of the agent itself. OCF 1.1 adopted the already-common practice of using <longdesc> and <shortdesc> elements in the top level of the meta-data for this purpose. If you don't already use them, add them if desired. Example:
 <longdesc lang="en">
 This is a long description of a theoretical resource agent that doesn't really exist.
 You could say whatever you want about its purpose here. The short description below
 is, well, a short description.
 </longdesc>
 <shortdesc lang="en">Super-duper resource agent that does everything</shortdesc>

Unique parameters
The unique attribute in parameters is now deprecated. You can keep it if you want to remain compatible with older software that looks for it, but removing it is recommended.

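Instead, add unique-group attributes for every set of parameters that should be unique for each instance of the resource agent. Here is an example (just the relevant portions) of an agent that requires an IP address and port combination that must be unique (the parameter names and the value "address" are arbitrary):

 <parameter name="ip" unique-group="address">
   ...
 </parameter>
 <parameter name="port" unique-group="address">
   ...
 </parameter>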

Required parameters
Mark any required parameters (those the user must specify) with the new required attribute.
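
The following sketch pairs a meta-data fragment with a matching check in the agent. The parameter name config, both function names, and the choice of exit status 6 (OCF_ERR_CONFIGURED) for a missing value are assumptions for illustration.

```shell
# Hypothetical fragment printer: a parameter named "config" marked as required.
example_required_param() {
    cat <<'EOF'
<parameter name="config" required="1">
  <longdesc lang="en">Path to the example daemon's configuration file</longdesc>
  <shortdesc lang="en">Configuration file</shortdesc>
  <content type="string"/>
</parameter>
EOF
}

# Reject a missing value. Parameters reach the agent as OCF_RESKEY_<name>.
example_check_required() {
    if [ -z "${OCF_RESKEY_config:-}" ]; then
        echo "parameter 'config' is required" >&2
        return 6    # OCF_ERR_CONFIGURED
    fi
    return 0
}
```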

Deprecated parameters
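Mark any deprecated parameters with the new <deprecated> child element, which may optionally contain <replaced-with> child elements indicating parameters that should be used instead, and <desc> child elements explaining the deprecation for users (potentially with multiple translations). Example:
 <parameter name="foo">
   <deprecated>
     <desc lang="en">Don't use foo, it's bad.</desc>
     <desc lang="cs">Nepoužívej foo, sic to schytáš.</desc>
   </deprecated>
   <longdesc lang="en">Whether the example daemon should operate with foo factor</longdesc>
   <shortdesc lang="en">Foo factor</shortdesc>
 </parameter>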

Enumerated parameter values
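If you have any parameters that take specific values, you can now enumerate those values instead of allowing free-form text. Example:
 <parameter name="mode">
   <longdesc lang="en">
   The mode the example daemon should operate in. Allowed values are "dry-run" and "live".
   </longdesc>
   <shortdesc lang="en">Run mode</shortdesc>
   <content type="select">
     <option value="dry-run"/>
     <option value="live"/>
   </content>
 </parameter>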

Reloadable parameters
OCF 1.1 supports the concept of reloadable parameters, which is the same as how Pacemaker used the now-deprecated unique attribute.

If a parameter value can be changed without requiring a full stop and start of the service itself, mark the parameter with the new reloadable attribute. This is not related to reloading the service itself, just the agent parameter values.

An example might be a web server agent that can use one of several clients to check the server status. The parameter that specifies the client can be changed without restarting the web server itself, so it could be marked as reloadable. The user can change the value of that parameter with no downtime for the web server.

If you mark any parameters as reloadable, you also have to implement a reload-agent action as described below, and advertise the action in the meta-data.
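
For the web server example above, the relevant meta-data might be sketched as follows. The parameter name status_client, its default, the timeout, and the function name are hypothetical.

```shell
# Hypothetical fragment: "status_client" can change without restarting the service,
# so it is marked reloadable and the reload-agent action is advertised.
example_reloadable_meta() {
    cat <<'EOF'
<parameters>
  <parameter name="status_client" reloadable="1">
    <longdesc lang="en">Client program used to check the server status</longdesc>
    <shortdesc lang="en">Status client</shortdesc>
    <content type="string" default="curl"/>
  </parameter>
</parameters>
<actions>
  <action name="reload-agent" timeout="20s"/>
</actions>
EOF
}
```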

notify
Pacemaker implemented an extension to OCF 1.0 for clone resources (which can run on multiple cluster nodes at the same time). These resource agents could optionally receive notifications before and after resource actions on any instance, via the notify action.

OCF 1.1 has adopted the notify action, but left its behavior undescribed. Continue using it or not as desired.

promote and demote
Another Pacemaker extension was promotable resources (clones whose instances can run in one of two modes). The start and demote actions bring an instance to the default mode, and the promote action brings the instance to the special mode. OCF 1.1 adopts these actions.

A major difference from the older Pacemaker implementation is that the role names are now Promoted and Unpromoted rather than Master and Slave. Newer versions of Pacemaker support both sets of names.

If your agent already implements promotable clones, update any mentions of the role names. The agent won't be able to support both the old and new names, because only one set can be advertised in monitor action meta-data. If you advertise the old names, advertise OCF 1.0 support; if you advertise the new names, advertise OCF 1.1 support.
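
A sketch of how the new role names might be advertised in the actions section of the meta-data. The function name, timeouts, and intervals are placeholders, not recommended values.

```shell
# Hypothetical fragment: monitor actions advertised per role, using OCF 1.1 role names.
example_promotable_actions() {
    cat <<'EOF'
<actions>
  <action name="promote" timeout="20s"/>
  <action name="demote" timeout="20s"/>
  <action name="monitor" timeout="20s" interval="10s" role="Promoted"/>
  <action name="monitor" timeout="20s" interval="11s" role="Unpromoted"/>
</actions>
EOF
}
```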

reload and reload-agent
The reload action previously had conflicting uses; most resource agents used it to reload the service itself, while Pacemaker used it to reload agent parameters.

In OCF 1.1, the reload action is now reserved for reloading the service itself. For example, if the service can re-read its configuration file after receiving a signal, the reload action can send that signal. This is equivalent to how init scripts and systemd unit files use reload.

The new reload-agent action is for making effective any changes in parameters marked reloadable. Many times this will be a no-op: in the earlier example of a web server agent that has a reloadable parameter for which client to use to contact the web server, nothing special needs to be done if that parameter is changed (the agent will simply use the new value the next time it needs to contact the web server). A different example might be a database agent with a reloadable parameter for whether the database is in read-only or read/write mode; the agent might contact the database server with a client to change the mode, which would be much quicker (and have no downtime) compared to a full database restart.
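
The split between the two actions could be sketched as below. The function names, the pidfile parameter and its default path, and the use of SIGHUP are assumptions; a real service may need a different reload mechanism.

```shell
# Hypothetical reload: ask the running service to re-read its own configuration.
example_reload() {
    pidfile="${OCF_RESKEY_pidfile:-/var/run/example.pid}"
    [ -r "$pidfile" ] || return 1                # OCF_ERR_GENERIC
    kill -HUP "$(cat "$pidfile")" || return 1    # OCF_ERR_GENERIC
    return 0
}

# Hypothetical reload-agent: a no-op, because the agent reads its parameters
# from the environment on every invocation and picks up new values next time.
example_reload_agent() {
    return 0
}

# Route the requested action to its handler.
example_dispatch() {
    case "$1" in
        reload)       example_reload ;;
        reload-agent) example_reload_agent ;;
        *)            return 3 ;;                # OCF_ERR_UNIMPLEMENTED
    esac
}
```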

OCF_OUTPUT_FORMAT
In OCF 1.1, agents may optionally support displaying output in multiple formats. The desired format will be passed via the OCF_OUTPUT_FORMAT environment variable. The specific formats supported are left to the agent, as are the values used to identify them (it is recommended to use "text" for human-readable text and "xml" for XML, if supported).

Following existing practice, the meta-data action must default to using XML output, and all other actions must default to text. Whether to support anything else is entirely up to you.

Mainly this is expected to be used for the validate-all action, to be able to return XML for better machine parsing. However, the XML schema has not been standardized, so this will be an area of experimentation in the near future.
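
One way an agent might honor the variable for an action other than meta-data is sketched below. The helper name example_report and the <message> element are made up, since no XML schema has been standardized; falling back to text for unknown values is a design choice, and the message is assumed to contain no characters that need XML escaping.

```shell
# Hypothetical reporter: prints a message in the requested output format.
example_report() {
    message="$1"
    case "${OCF_OUTPUT_FORMAT:-text}" in
        xml) printf '<message>%s</message>\n' "$message" ;;
        *)   printf '%s\n' "$message" ;;    # "text" and anything unrecognized
    esac
}
```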

OCF_CHECK_LEVEL
OCF 1.0 and 1.1 both support the OCF_CHECK_LEVEL environment variable for the monitor action, to determine the depth (and service impact) of the check performed.

OCF 1.1 extends this to the validate-all action as well. If not specified or 0, only syntax and consistency checks should be done (for example, verifying that a parameter value is an integer if that's appropriate). If 10, the agent may additionally verify the suitability of the local host (for example, that a necessary directory exists).
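
The two check levels might be handled along these lines. The parameter names port and datadir and the function name are hypothetical; the exit statuses follow the clarified meanings of OCF_ERR_CONFIGURED (6) and OCF_ERR_ARGS (2).

```shell
# Hypothetical validation routine honoring OCF_CHECK_LEVEL.
example_validate_all() {
    # Level 0 (or unset): syntax and consistency checks only.
    case "${OCF_RESKEY_port:-}" in
        ''|*[!0-9]*)
            echo "port must be an integer" >&2
            return 6    # OCF_ERR_CONFIGURED
            ;;
    esac

    # Level 10: additionally check that the local host is suitable.
    if [ "${OCF_CHECK_LEVEL:-0}" = "10" ]; then
        if [ ! -d "${OCF_RESKEY_datadir:-}" ]; then
            echo "data directory does not exist on this host" >&2
            return 2    # OCF_ERR_ARGS
        fi
    fi
    return 0
}
```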

Exit statuses
The meaning of a couple of exit statuses has been clarified:


 * OCF_ERR_ARGS (2): parameters are invalid in the context of the local host (such as a nonexistent configuration file)
 * OCF_ERR_CONFIGURED (6): parameters are internally invalid (such as a string given where only an integer is allowed)

In addition, new exit statuses that were Pacemaker extensions have been adopted:


 * OCF_RUNNING_PROMOTED (8): properly running in the promoted role
 * OCF_FAILED_PROMOTED (9): failed in the promoted role
 * OCF_RUNNING_DEGRADED (190): properly running but failure is more likely in the near term
 * OCF_PROMOTED_DEGRADED (191): properly running in the promoted role but degraded

The symbolic names for these new statuses might or might not be defined by shell include files, so be aware of what includes you are using. If you want to maintain compatibility with older includes, you can define each symbol you need if it's not already defined, like:
 : ${OCF_RUNNING_PROMOTED:=8}
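
Extended to all four adopted statuses, the same idiom might look like this sketch. The values are the ones listed above; the quoting is a defensive habit rather than a requirement.

```shell
# Define each symbol only if an include file has not already set it.
: "${OCF_RUNNING_PROMOTED:=8}"
: "${OCF_FAILED_PROMOTED:=9}"
: "${OCF_RUNNING_DEGRADED:=190}"
: "${OCF_PROMOTED_DEGRADED:=191}"
```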