Error Handling

Error handling in Pacemaker is closely tied to logging, which is done via libqb.

Log levels

Guidelines for when to use each log level:

  • critical: unrecoverable error
  • error: something (e.g. a resource, a daemon, or a connection) failed
  • warning: something bad might have happened or be happening, but we don't have enough context to know right now
  • notice: important events (e.g. node join/leave, resource actions, config changes)
  • info: things that would be interesting after a failure of some kind (e.g. status section updates, crmd elections, "I'm about to run resource action X"); by default, these are logged to files but not syslog
  • debug: like info but for noisier messages; these are not logged by default
  • trace: like info but for very noisy or low-level stuff (e.g. tracing in libraries); these are not logged by default
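
As a rough illustration of the defaults described above, the sketch below maps each level to its name and to its default destinations. It is not Pacemaker code: DEMO_LOG_TRACE and the demo_* helpers are made up for this example, and it assumes that trace sits one step below syslog's LOG_DEBUG and that notice and more severe messages reach both syslog and the log file.

```c
#include <stdbool.h>
#include <syslog.h>

/* Assumption: trace is one step noisier than syslog's LOG_DEBUG.
 * DEMO_LOG_TRACE and the helpers below are hypothetical names. */
#define DEMO_LOG_TRACE (LOG_DEBUG + 1)

/* Return the name used in the guidelines for a numeric level. */
static const char *
demo_level_name(int level)
{
    switch (level) {
        case LOG_CRIT:       return "critical";
        case LOG_ERR:        return "error";
        case LOG_WARNING:    return "warning";
        case LOG_NOTICE:     return "notice";
        case LOG_INFO:       return "info";
        case LOG_DEBUG:      return "debug";
        case DEMO_LOG_TRACE: return "trace";
        default:             return "unknown";
    }
}

/* Assumed default: notice and more severe messages go to syslog. */
static bool
demo_goes_to_syslog_by_default(int level)
{
    return level <= LOG_NOTICE;
}

/* Default per the guidelines: info and more severe messages go to the
 * log file, while debug and trace are not logged at all. */
static bool
demo_goes_to_file_by_default(int level)
{
    return level <= LOG_INFO;
}
```

With these helpers, an info message is reported as going to the file but not to syslog, and a debug message goes to neither.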

Logging macros

crm_crit(fmt, args...)
crm_err(fmt, args...)
crm_warn(fmt, args...)
crm_notice(fmt, args...)
crm_info(fmt, args...)
crm_debug(fmt, args...)
crm_trace(fmt, args...)
These will log a message via libqb at the specified logging level. They can be used independently for logging purposes or via the higher-level interfaces described below.
crm_perror(severity, fmt, args...)
This is intended primarily for use by command-line tools. It simultaneously logs a message at the given severity (LOG_CRIT, etc.) and prints the message to stderr, with the system error message corresponding to the current value of errno at the end. It is likely to be deprecated and/or replaced in the future.
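
The following sketch shows how calls to these macros typically look. So that it compiles on its own, it defines simplified stand-ins with the same call shape; demo_log(), demo_perror() and demo_open_config() are hypothetical, the stand-ins write only to stderr rather than going through libqb, and the exact message layout is an assumption. When building against Pacemaker, its own definitions would be used instead.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <syslog.h>

/* Stand-in for the libqb-backed logger: writes one line to stderr and
 * returns the number of characters written (hypothetical helper). */
static int
demo_log(int level, const char *fmt, ...)
{
    va_list ap;
    int n = fprintf(stderr, "[%d] ", level);

    va_start(ap, fmt);
    n += vfprintf(stderr, fmt, ap);
    va_end(ap);
    n += fprintf(stderr, "\n");
    return n;
}

/* Stand-in for crm_perror(): formats the message, then appends the system
 * error text for the errno value seen on entry. The real macro both logs
 * and prints to stderr; this sketch only prints to stderr. */
static int
demo_perror(int severity, const char *fmt, ...)
{
    int saved_errno = errno; /* capture before any call can change it */
    char buf[256];
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(buf, sizeof(buf), fmt, ap);
    va_end(ap);
    return demo_log(severity, "%s: %s (%d)", buf, strerror(saved_errno),
                    saved_errno);
}

/* Stand-ins with the same call shape as the macros listed above */
#define crm_warn(...)   demo_log(LOG_WARNING, __VA_ARGS__)
#define crm_notice(...) demo_log(LOG_NOTICE, __VA_ARGS__)
#define crm_debug(...)  demo_log(LOG_DEBUG, __VA_ARGS__)
#define crm_perror(...) demo_perror(__VA_ARGS__)

/* Hypothetical caller: check that a file can be opened, choosing log
 * levels per the guidelines. Returns 0 on success, -1 on failure. */
static int
demo_open_config(const char *path)
{
    FILE *fp = fopen(path, "r");

    if (fp == NULL) {
        /* A failure: log at error severity, with the errno text appended */
        crm_perror(LOG_ERR, "Could not open %s", path);
        return -1;
    }
    /* Routine detail: only worth a debug message */
    crm_debug("Opened %s", path);
    fclose(fp);
    return 0;
}
```

Note that the stand-in for crm_perror() saves errno before doing anything else, since any intervening library call may overwrite it.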

Error handling functions and macros

CRM_ASSERT(expr)
If expr is false, this will call crm_err() with a "Triggered fatal assert" message (with details), then abort execution. This should be used for errors that shouldn't happen and can't be handled gracefully (for example, memory allocation failures).
CRM_LOG_ASSERT(expr)
If expr is false, this will generally log a message without aborting. If the log level is below trace, it just calls crm_err() with a "Triggered assert" message (with details). If the log level is trace, and the caller is a daemon, then it will fork a child process in which to dump core, as well as logging the message. If the log level is trace, and the caller is not a daemon, then it will behave like CRM_ASSERT() (i.e. log and abort). This should be used for serious errors that nonetheless require no special handling (for example, an unexpected request to close something that isn't open).
CRM_CHECK(expr, failed_action)
If expr is false, behave like CRM_LOG_ASSERT(expr) (that is, log a message and dump core if requested), then perform failed_action (which must not contain continue, break or errno). This should be used for serious errors that can be handled, usually by returning an error status (for example, returning NULL or -1 if a NULL pointer is passed as the object to operate on).
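
To make the difference between aborting and recovering concrete, here is a minimal sketch using simplified stand-ins for CRM_ASSERT() and CRM_CHECK(). The stand-ins only print to stderr and omit the core-dump behaviour described above, their message details are assumptions, and demo_counter_t with its functions is hypothetical. They are defined here only so the example compiles without Pacemaker's headers.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-in for CRM_ASSERT(): report, then abort. */
#define CRM_ASSERT(expr) do {                                           \
        if (!(expr)) {                                                  \
            fprintf(stderr, "Triggered fatal assert at %s:%d : %s\n",   \
                    __FILE__, __LINE__, #expr);                         \
            abort();                                                    \
        }                                                               \
    } while (0)

/* Simplified stand-in for CRM_CHECK(): report, then run failed_action.
 * In this stand-in, a break or continue in failed_action would bind to
 * the do/while wrapper instead of the caller's loop. */
#define CRM_CHECK(expr, failed_action) do {                             \
        if (!(expr)) {                                                  \
            fprintf(stderr, "Triggered assert at %s:%d : %s\n",         \
                    __FILE__, __LINE__, #expr);                         \
            failed_action;                                              \
        }                                                               \
    } while (0)

/* Hypothetical object used only for this example */
typedef struct {
    int count;
} demo_counter_t;

/* Allocation failure can't be handled gracefully here, so assert. */
static demo_counter_t *
demo_counter_new(void)
{
    demo_counter_t *counter = calloc(1, sizeof(demo_counter_t));

    CRM_ASSERT(counter != NULL);
    return counter;
}

/* A NULL argument is a serious caller error, but one we can recover
 * from by returning an error status. Returns the new count, or -1. */
static int
demo_counter_increment(demo_counter_t *counter)
{
    CRM_CHECK(counter != NULL, return -1);
    return ++counter->count;
}
```

Calling demo_counter_increment(NULL) reports the failed check and returns -1 to the caller, whereas a failed allocation in demo_counter_new() ends the program.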