[Nagiosplug-devel] Re: Improved check_oracle (TS check with autoextend)

Paul L. Allen pla at softflare.com
Sun Mar 28 16:31:03 CEST 2004


Andreas Ericsson writes: 

> It appears as though a few new return codes might be needed after all.

Perhaps. 

> Suggestion for the new list;
> OK = 0
> WARNING = 1
> CRITICAL = 2
> USER_ERROR = 3 (bad arguments to plugin)

Re-using an existing code for another reason is a Bad Idea[tm].  Yes,
bad arguments do generate status 3 at present, but so do other events.
Better to assign a new status code and retain 3 as UNKNOWN.  That way
old plugins running under a new Nagios still return something sensible,
even if new plugins running under an old Nagios do not.

> FAILURE = 4 (plugin failed to fetch threshold data (snmp, nrpe, nsclient,
> nwstat, check_by_ssh, check_by_rsh))

Status 4 is already assigned to DEPENDENT. 

> The 'UNKNOWN' state has quite deliberately been removed since it seems to 
> cause confusion.

It causes confusion only because the documentation is not quite as
explicit as it could be.  Even so, it is not hard to deduce that UNKNOWN
means "I don't know what state the service is in because of some sort
of error that is not to do with the service itself."  If the plugin fails
because of bad arguments, the state of the service is UNKNOWN.  If the
plugin fails because of a transport error, the state of the service is
UNKNOWN.  If the plugin cannot find resources it needs to determine the
state of the service, the state of the service is UNKNOWN. 
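
To make that concrete, here is a minimal sketch of a plugin's exit logic.
It assumes only the conventional statuses 0 to 3 quoted above; the
function name check_usage, its parameters and the message strings are
hypothetical and come from no real plugin.

```python
import sys

# Conventional plugin exit statuses, as listed earlier in this thread.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_usage(percent_used, warn, crit):
    """Hypothetical check: map a usage percentage to (status, message).

    percent_used is None when the data could not be fetched at all
    (transport error, missing helper utility, and so on).
    """
    # Bad arguments: nothing can be said about the service itself.
    if warn is None or crit is None or warn > crit:
        return UNKNOWN, "UNKNOWN - bad threshold arguments"
    # Fetch failure: again, the state of the service is simply not known.
    if percent_used is None:
        return UNKNOWN, "UNKNOWN - could not fetch usage data"
    if percent_used >= crit:
        return CRITICAL, f"CRITICAL - {percent_used}% used"
    if percent_used >= warn:
        return WARNING, f"WARNING - {percent_used}% used"
    return OK, f"OK - {percent_used}% used"


if __name__ == "__main__":
    # Illustrative values only; a real plugin would parse its arguments.
    status, message = check_usage(92, 80, 90)
    print(message)
    sys.exit(status)
```

All three failure causes land on the same status; only the message text
tells the operator which one occurred.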

More gradations of UNKNOWN may or may not be useful, but they are hardly
necessary.  When Nagios reports something as UNKNOWN, it means that it
doesn't know what state the service is in. Whether that be because of
bad arguments, missing utilities or a transport failure, it still means
that you do not know what state the service is in. 

I would argue that when passive check results go stale, Nagios should
flag them as UNKNOWN rather than CRITICAL, just as it would if a direct
check_by_ssh timed out.  Either way, you do not know what state the
service is in.
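
As a rough sketch of that proposal, and not of how Nagios itself handles
freshness, the rule could look like the following.  The function
effective_state and the parameter freshness_threshold are hypothetical
names chosen for illustration.

```python
UNKNOWN = 3  # conventional plugin status for "state not known"


def effective_state(last_state, age_seconds, freshness_threshold):
    """Return the state to report for a passive check result.

    last_state is the status carried by the most recent passive result,
    age_seconds is how long ago it arrived, and freshness_threshold is
    the maximum age (in seconds) before the result counts as stale.
    """
    # A stale result says nothing about the service now, so report
    # UNKNOWN instead of assuming the worst (CRITICAL).
    if age_seconds > freshness_threshold:
        return UNKNOWN
    return last_state
```

For example, an OK result that is 301 seconds old against a 300-second
threshold would be reported as UNKNOWN, not CRITICAL.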

-- 
Paul Allen
Softflare Support