[Nagiosplug-devel] New threshold syntax - changes

Richard Edward Horner rich at richhorner.com
Tue Oct 13 18:31:02 CEST 2009


I don't have any real concerns with 1, 2 or 4.

I think 3 is an important clarification.

One thing that has bugged me with thresholds is that the spec is only
useful if it's followed but most plugins don't follow it and just
implement what's quick. I'll give the pertinent example.

On Sabayon 5 I type equo install nagios-plugins as root to pull
1.4.13-r4. I cd to /usr/lib64/nagios/plugins and do:

./check_users --help
./check_disk --help

check_users says:
 -c, --critical=INTEGER
    Set CRITICAL status if more than INTEGER users are logged in

check_disk says:
 -c, --critical=INTEGER
    Exit with CRITICAL status if less than INTEGER units of disk are free

According to the spec, neither of these is correct. check_users
behaves as the spec defines when you specify --critical=10, but
check_disk --critical=10 behaves as if you had given the range "10:"
(that is, --critical=10:) and alerts when the value drops below 10.

Really, they shouldn't have any qualifiers; they should say "this is
where you specify the alert range" and there should be a --range_help
option that prints something like:

The format for ranges in Nagios can be confusing and it isn't always followed.

[@]start[:[end]]

Here are some example ranges:

Range   |  Generate an alert if value is    |  In English
--------+-----------------------------------+---------------------------------
10      |  outside the range of {0 .. 10}   |  Less than 0 or greater than 10
@10     |  inside the range of {0 .. 10}    |  Anything from 0 to 10
10:     |  outside the range of {10 .. ∞}   |  Less than 10
~:10    |  outside the range of {-∞ .. 10}  |  Greater than 10
10:20   |  outside the range of {10 .. 20}  |  Less than 10 or greater than 20
@10:20  |  inside the range of {10 .. 20}   |  Anything from 10 to 20

Formal Rules:
1. start ≤ end
2. start and ":" are not required if start=0
3. if range is of format "start:" and end is not specified, end is infinity
4. to specify negative infinity, use "~"
5. alert is raised if metric is outside start and end range (inclusive)
6. if range starts with "@", then alert if inside this range (inclusive)

Examples (an alert is raised when the value is):
    10      < 0 or > 10, (outside the range of {0 .. 10})
    10:     < 10, (outside the range of {10 .. ∞})
    ~:10    > 10, (outside the range of {-∞ .. 10})
    10:20   < 10 or > 20, (outside the range of {10 .. 20})
    @10:20  ≥ 10 and ≤ 20, (inside the range of {10 .. 20})

More help at http://nagiosplug.sourceforge.net/developer-guidelines.html
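
To make the range format concrete, here is a minimal Python sketch of
how I read [@]start[:[end]]. The names Range, parse_range and alerts are
my own and are not taken from any plugin, and treating an empty start
(as in ":10") as 0 is an assumption on my part.

```python
import math
from typing import NamedTuple


class Range(NamedTuple):
    start: float
    end: float
    inside: bool  # True when the spec was prefixed with "@"


def parse_range(spec: str) -> Range:
    """Parse a threshold range written as [@]start[:[end]]."""
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]
    if ":" in spec:
        start_s, end_s = spec.split(":", 1)
        # "~" means negative infinity; an empty start is assumed to be 0.
        start = -math.inf if start_s == "~" else float(start_s or 0)
        # A missing end means positive infinity.
        end = float(end_s) if end_s else math.inf
    else:
        # A bare number N is shorthand for 0:N.
        start, end = 0.0, float(spec)
    if start > end:
        raise ValueError(f"start must not exceed end: {spec!r}")
    return Range(start, end, inside)


def alerts(spec: str, value: float) -> bool:
    """Return True when value should raise an alert for this range."""
    r = parse_range(spec)
    in_range = r.start <= value <= r.end  # both ends are inclusive
    return in_range if r.inside else not in_range
```

With this reading, alerts("10", 11) and alerts("10:", 9) are both true,
while alerts("~:10", -5) and alerts("10:20", 15) are both false.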



That's according to spec. Now, this raises the question though: is this
what we really want? If developer patterns are to be believed, it
would seem this is not the case. People just want what's easiest and
most intuitive to use.

I think if you're going to have a spec, it needs to be followed. If
you have a spec and it's not followed, it's confusing and worse than a
waste of time as it is probably costing people time.

Perhaps the section on thresholds in the developer guidelines should
say something like "left to the developer's discretion", with a
subsection that explains how to accept ranges of values if you want
to accept ranges. A lot of plugins don't need ranges and would only
ever want to check whether some number is more than some other
number. I realize that can be represented as a range using infinity
or negative infinity as one end, but this is overkill and I've never
seen a production system using it.

So, yeah, I vote for being all UNIX-y and to KISS. Give guidelines for
how to do ranges but make ranges optional and leave thresholds to
"what makes sense". After all, it's open source. If someone doesn't like
it, they can change it but I think it's counterproductive to have a
spec that appears to be meaningless because it's not followed.

Thanks, Rich(ard)

On Tue, Oct 13, 2009 at 6:35 AM, Thomas Guyot-Sionnest <dermoth at aei.ca> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> I talked about all these changes in the threads a while ago but I
> received no comments on them. If you have anything against them please
> speak now!
>
> 1. --th=metric={metric} instead of --{metric}:
>
> Ok, this makes it a bit longer, but I believe it's a much clearer
> separation between thresholds and other command-line arguments. Both
> "--threshold" and "--th" would be equally valid. The list of metrics
> could be printed in a table after the argument list, and could include
> dynamic metrics for some plugins (i.e. based on the data being checked)
> using a common prefix. Additionally it allows setting default parameters
> by omitting the metric.
>
> 2. Making the "end" optional in range?
>
> At one point it was suggested making the end parameter optional. The
> current spec says the range must have both start and end, which means
> most threshold ranges will have to be like this (ex. for a 10 second
> delay warning/critical):
>
> 10..inf
>
> instead of just "10".
>
>
> 3. Clarification of the ok= level
>
> It is unclear how levels interact regarding the OK level. Here's what I
> proposed (by "check" I mean to also return the value if within range):
>
> 1. Without OK range:
>  a. check for critical range if specified
>  b. check for warning range if specified
>  c. return OK
>
> 2. With OK range:
>  a. check for OK range
>  b. check for critical range if specified
>  c. check for warning range if specified
>  d. return CRITICAL.
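
The evaluation order above could be sketched in Python roughly like
this. The function name evaluate and the idea of passing each range as
a predicate that returns True when the value falls inside it are my
assumptions for illustration, not part of the proposal.

```python
from typing import Callable, Optional

# A range is modelled as a predicate: True when the value is within it.
InRange = Callable[[float], bool]


def evaluate(
    value: float,
    ok: Optional[InRange] = None,
    warning: Optional[InRange] = None,
    critical: Optional[InRange] = None,
) -> str:
    """Apply the proposed ordering: OK first (if given), then CRITICAL, then WARNING."""
    if ok is not None and ok(value):
        return "OK"
    if critical is not None and critical(value):
        return "CRITICAL"
    if warning is not None and warning(value):
        return "WARNING"
    # Nothing matched: fall back to CRITICAL when an OK range was given,
    # otherwise fall back to OK.
    return "CRITICAL" if ok is not None else "OK"
```

The only difference between the two cases is the fallback: once an OK
range is given, a value that matches no range at all is treated as
CRITICAL rather than OK.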
>
>
>
> 4. Separation of uom and prefix
>
> In the RFC the "uom" parameter specifies both the unit and its prefix.
> This parameter has to be separated to prevent a parsing nightmare.
> Therefore there should be:
>
>  a. unit: unit of the data, useful (and valid) only when the plugin
> doesn't know what it is (ex. check_snmp)
>
>  b. prefix: this is the SI prefix for using the range values and
> usually printing on the normal output as well (performance data should
> be a double value and therefore shouldn't be affected by this as
> precision is retained anyway). Should be valid everywhere with a default
> value provided by the plugin where applicable. Ex.: M, K, m, KiB, etc..
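
As a rough illustration of why a separate prefix is easy to handle,
here is a small Python sketch that scales a threshold value by its
prefix before comparing it with the measured value. The MULTIPLIERS
table and the name to_base_units are hypothetical; the thread does not
fix which prefixes would be accepted or how they would be spelled.

```python
# Hypothetical multiplier table, for illustration only.
MULTIPLIERS = {
    "": 1.0,
    "m": 1e-3,
    "K": 1e3,
    "M": 1e6,
    "G": 1e9,
    "Ki": 1024.0,
    "Mi": 1024.0 ** 2,
}


def to_base_units(value: float, prefix: str = "") -> float:
    """Scale a threshold value given with a prefix into base units."""
    try:
        return value * MULTIPLIERS[prefix]
    except KeyError:
        raise ValueError(f"unknown prefix: {prefix!r}") from None
```

A threshold of 10 with prefix "M" would then be compared against the
raw value as 10000000.0, while the performance data stays untouched.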
>
>
> Please let me know if you see any issue here...
>
> - --
> Thomas
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.6 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iD8DBQFK1B+g6dZ+Kt5BchYRAtK5AJkBKhWWohyT7yc+Z+n910W39v8SvACg1D8r
> O5IQn1S2h14yHpCAAFPD994=
> =krRz
> -----END PGP SIGNATURE-----
>
> _______________________________________________________
> Nagios Plugin Development Mailing List Nagiosplug-devel at lists.sourceforge.net
> Unsubscribe at https://lists.sourceforge.net/lists/listinfo/nagiosplug-devel
> ::: Please include plugins version (-v) and OS when reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
>



-- 
Richard Edward Horner
Engineer / Composer / Electric Guitar Virtuoso
richhorner.com | rhosts.net | sabayonlinux.org



