Check Aggregation: check_many

Thomas Guyot-Sionnest, June 9, 2009

Overview

This proposal is for a simple plugin wrapper allowing aggregation and serialization of multiple checks.

Problem

There is no easy way to configure a single Nagios service that aggregates multiple results. Take, for example, a standard check_nagios between servers: how can such a check be extended to cover additional components? This usually involves writing a custom shell wrapper around the checks, or configuring all the checks separately and using check_cluster to aggregate them. There ought to be a better way …

Proposal

Written in C, check_many would be a fairly simple and fast solution to this issue. The idea is a plugin that takes check commands from STDIN, one command per line. It would run them and aggregate the results according to the processing preferences given in the plugin arguments.
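As a rough illustration of the "run them" step, the sketch below runs a single check command and maps its exit status onto the four Nagios plugin states. It is only a sketch under stated assumptions: run_check is a hypothetical name, the command always goes through the shell via system() (the equivalent of --shell=always), and the plugin's output is not captured.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Nagios plugin return codes. */
enum { STATE_OK = 0, STATE_WARNING = 1, STATE_CRITICAL = 2, STATE_UNKNOWN = 3 };

/* Hypothetical helper: run one check command through the shell and map
 * its exit status to a Nagios state.  Abnormal termination, or an exit
 * code outside 0..3, is reported as UNKNOWN (an assumption; the proposal
 * does not say how such cases should be handled). */
int run_check(const char *command)
{
    int status = system(command);

    if (status == -1 || !WIFEXITED(status))
        return STATE_UNKNOWN;

    int code = WEXITSTATUS(status);
    return (code >= STATE_OK && code <= STATE_UNKNOWN) ? code : STATE_UNKNOWN;
}
```

A real implementation would more likely fork and exec each command with a pipe, so that the check's output can be collected along with its state.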

Check Options

The following options can be used to control plugin processing (grouped by category):

Command Parsing

-s, --shell=<always|never|auto>
   Specify when a shell should be invoked for executing commands.  "always"
   invokes the shell for every command, "never" forces commands to be executed
   directly, and "auto" (default) invokes the shell only if shell
   metacharacters are present in the check command.  Unless -d (--delimiter) is
   specified, any whitespace is used for separating arguments.

-d, --delimiter=CHARACTER
   Delimiter to use for separating command arguments when shell is not used.
   Implies --shell=never and is mutually exclusive with any other shell option.
   Standard backslash escapes are allowed, except "\n".

Note: Should we allow strings as delimiters?
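The two parsing decisions above could be sketched as follows. Both function names are hypothetical, and the metacharacter set is an assumption, since the proposal does not list which characters should trigger the shell in "auto" mode. split_args handles single-character delimiters only and keeps empty fields.

```c
#include <stddef.h>
#include <string.h>

/* Assumed metacharacter set for --shell=auto (not defined by the proposal). */
static const char SHELL_META[] = "|&;<>()$`\\\"'*?[]#~={}!";

/* Return 1 if the command contains a character from SHELL_META, else 0. */
int needs_shell(const char *command)
{
    return strpbrk(command, SHELL_META) != NULL;
}

/* Split `line` in place on `delim`.  Stores at most max_args - 1 pointers
 * in argv, followed by a terminating NULL, and returns the number of
 * arguments stored.  Fields beyond the limit are dropped. */
size_t split_args(char *line, char delim, char **argv, size_t max_args)
{
    size_t argc = 0;
    char *start = line;

    if (max_args == 0)
        return 0;

    while (argc + 1 < max_args) {
        argv[argc++] = start;
        char *next = strchr(start, delim);
        if (next == NULL)
            break;
        *next = '\0';       /* terminate the current field */
        start = next + 1;   /* the next field begins after the delimiter */
    }
    argv[argc] = NULL;
    return argc;
}
```

The resulting argv is NULL-terminated, so it could be handed directly to an exec-family call when no shell is involved.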

Processing Options

-P, --process=<all|first-fail|first-ok>
   By default, all commands are processed and the worst state is returned
   ("all").  "first-fail" stops at the first non-ok check and returns it, while
   "first-ok" stops at the first successful check and returns it.  The latter
   two override --status and --output and return the plugin's own status and
   output instead.

-f, --file=FILE
   Read checks from FILE instead of STDIN.
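The three --process modes might be expressed as in the sketch below, which works on an array of already-known states to stay self-contained. The names are hypothetical, and two behaviours are assumptions the proposal leaves open: "worst" is taken to be the numerically highest return code, and when "first-fail" or "first-ok" finds no matching check, the worst state seen is returned.

```c
#include <stddef.h>

enum { STATE_OK = 0, STATE_WARNING = 1, STATE_CRITICAL = 2, STATE_UNKNOWN = 3 };
enum process_mode { PROCESS_ALL, PROCESS_FIRST_FAIL, PROCESS_FIRST_OK };

/* Aggregate `count` check states according to `mode`.  If `processed` is
 * not NULL, it receives the number of checks looked at before stopping. */
int aggregate(const int *states, size_t count, enum process_mode mode,
              size_t *processed)
{
    int result = STATE_OK;
    size_t i = 0;

    while (i < count) {
        int state = states[i++];
        int stop = (mode == PROCESS_FIRST_FAIL && state != STATE_OK) ||
                   (mode == PROCESS_FIRST_OK && state == STATE_OK);
        if (stop) {
            result = state;   /* return this check's state as-is */
            break;
        }
        if (state > result)   /* assumed ordering: higher code is worse */
            result = state;
    }
    if (processed != NULL)
        *processed = i;
    return result;
}
```

In the plugin itself the states would be produced lazily, one command at a time, so that the early-exit modes avoid running the remaining checks.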

Output Options

--output=<normal|oneline|status>
   "normal" outputs a Nagios v3+ multi-line result, the first line being a
   summary of the checks performed; "oneline" squeezes everything into a single
   line; and "status" returns only a status line.  This option has no effect
   with --process=first-fail|first-ok.

Note: How about allowing nth result?
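One possible shape for the three output modes is sketched below. The function names, the "; " separator used for "oneline", and the idea of passing in a ready-made summary line are all assumptions made for illustration. Performance data is ignored.

```c
#include <stdio.h>
#include <stddef.h>

enum output_mode { OUTPUT_NORMAL, OUTPUT_ONELINE, OUTPUT_STATUS };

/* Append `text` to buf at offset `used`, truncating if needed.
 * Returns the new offset. */
static size_t append(char *buf, size_t size, size_t used, const char *text)
{
    if (used + 1 >= size)
        return used;                      /* no room left */
    int n = snprintf(buf + used, size - used, "%s", text);
    if (n < 0)
        return used;
    size_t room = size - used - 1;
    return used + ((size_t)n < room ? (size_t)n : room);
}

/* Build the plugin output: the summary first, then one entry per check,
 * joined by newlines ("normal") or "; " ("oneline").  "status" emits the
 * summary only.  Returns the number of characters written. */
size_t format_output(char *buf, size_t size, enum output_mode mode,
                     const char *summary, const char *const *lines,
                     size_t count)
{
    size_t used = 0;

    if (size == 0)
        return 0;
    buf[0] = '\0';
    used = append(buf, size, used, summary);

    if (mode != OUTPUT_STATUS) {
        const char *sep = (mode == OUTPUT_ONELINE) ? "; " : "\n";
        for (size_t i = 0; i < count; i++) {
            used = append(buf, size, used, sep);
            used = append(buf, size, used, lines[i]);
        }
    }
    return used;
}
```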

Examples

Aggregate multiple checks together:

$ echo '/path/check_http -H www.example.com
        /path/check_http -H www.example.com -p 443' | /path/check_many

Get list of checks from a file:

$ /path/check_many <~nagios/multiple_checks.txt
$ /path/check_many -f /home/nagios/multiple_checks.txt

Using a delimiter:

$ printf '%s\n' '/path/check_foo:-H:example.com' \
    '/path/check_bar:-H:example.com:-s:$string with special chars;' \
  | /path/check_many -d: