[Nagiosplug-devel] Working on testcases

Ethan Galstad nagios at nagios.org
Wed Nov 9 11:12:11 CET 2005


I'm a bit late into this thread, but here are some of my thoughts...

At least one person should be getting notifications for UNKNOWN 
states, as they can be important.  The UNKNOWN state doesn't really 
have a clear definition, but here's what I think it should be used to 
signify...  

1. Invalid command line args passed to the plugin (i.e. the plugin 
doesn't know what to do).

2. Internal failures in the plugin itself which prevent it from 
performing a check (e.g. malloc() failures, unexpected system call 
failures, or anything else that needs to be done - but fails - before 
a check can be performed).  As an example, the check_dhcp plugin 
returns an UNKNOWN state if it can't determine the local hardware 
address or bind to port 68.

3. Nagios will also assign an UNKNOWN state to any 
plugin/script/whatever that either doesn't exist on the filesystem or 
returns a code that is outside the range allowed by the plugin 
specs.
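
As a rough illustration of point 3, the mapping could look something 
like the sketch below.  The 0-3 values are the standard plugin return 
codes; the helper name normalize_state is made up for this example and 
is not taken from the Nagios source.

```c
/* Standard plugin return codes. */
enum {
    STATE_OK       = 0,
    STATE_WARNING  = 1,
    STATE_CRITICAL = 2,
    STATE_UNKNOWN  = 3
};

/* Hypothetical helper (not from the Nagios source): any return code
 * outside the 0..3 range is treated as UNKNOWN. */
int normalize_state(int return_code)
{
    if (return_code < STATE_OK || return_code > STATE_UNKNOWN)
        return STATE_UNKNOWN;
    return return_code;
}
```

With this sketch, a plugin exiting with 42 would be recorded as 
UNKNOWN, while 0 through 3 pass through unchanged.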

I think that DNS errors and the like should either result in a 
WARNING or (preferably) a CRITICAL state.  By specifying a host/DNS 
name in the config, the admin has implicitly indicated to Nagios/the 
plugins that the name should resolve.  If for whatever reason it 
doesn't, it should be treated as a serious error and the admin should 
be notified appropriately.
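
A minimal sketch of that policy, assuming a POSIX system with 
getaddrinfo().  The helper name resolution_state is hypothetical and 
not part of any existing plugin; it only shows a failed lookup being 
reported as CRITICAL instead of UNKNOWN.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

#define STATE_OK       0
#define STATE_CRITICAL 2

/* Hypothetical helper: the admin configured this name, so a failed
 * lookup is reported as CRITICAL rather than UNKNOWN. */
int resolution_state(const char *host)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;

    /* Zeroed hints: any address family, no special flags. */
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;

    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return STATE_CRITICAL;

    freeaddrinfo(res);
    return STATE_OK;
}
```

A plugin that prefers the softer option would return WARNING (1) in 
the failure branch instead.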

I don't think there's a real need to add another state type to Nagios 
for indicating transport or resolution errors.  The existing states 
should provide enough to indicate various problems, although I'm 
sure there are different opinions on this. :-)


On 9 Nov 2005 at 13:27, Andreas Ericsson wrote:

> Ton Voon wrote:
> > 
> > On 7 Nov 2005, at 12:20, Andreas Ericsson wrote:
> > 
> >> I'd just like to point out that this is in no way incompatible with
> >> the "transport error" service-status, since nagios by default sets
> >> all out-of-bounds return codes to UNKNOWN.
> >>
> > 
> > Andreas,
> > 
> > Just trying to make your suggestion clear: are you proposing "transport 
> > error" as another status from the plugins, but one that Nagios will 
> > (currently) support because it will map it onto UNKNOWN?
> > 
> 
> Yes.
> 
> > I'm against adding another state unless absolutely necessary. I don't
> > think there is sufficient difference between UNKNOWN and TRANSPORT_ERROR.
> > 
> > And while I agree with the idea of "transport errors", I'm not sure if 
> > we agree on the concept. In your previous email, you said "UNKNOWN can 
> > be used for user-error only", but then you contradicted yourself by 
> > saying when "a dns fails, the service *is* UNKNOWN".
> > 
> 
> What I meant was that the service isn't in a determined state (i.e. 
> "unknown" as humans read the word which isn't necessarily how Nagios 
> understands it).
> 
> > I wonder if the word "UNKNOWN" is causing problems. Maybe "OTHER" makes 
> > more sense meaning "any other failure to stop analysis of the service".
> > 
> 
> Apparently it does. I don't think a change would be very welcome though.
> 
> > Any other opinions? This feels like it could be a big change that could 
> > impact how Nagios works so I want to make sure this is the right route 
> > to go.
> > 
> 
> It needs lots and lots of testing. It would probably be better if Nagios 
> did something along the lines of
> 
> 	service_status = return_code & 0x3;
> 	plugin_flags = return_code & ~0x3;
> 
> This would mean that plugins would still only have 4 valid exit-codes, 
> but could choose to pass on additional information to Nagios which would 
> help users determine what went wrong (if the GUI supports it, of course).
> 
> I'm not sure if this would break anything for now though, but if it does 
> it'll have to wait until there's a new major release (i.e. 3.0). I don't 
> think it's worth fiddling with right now.
> 
> -- 
> Andreas Ericsson                   andreas.ericsson at op5.se
> OP5 AB                             www.op5.se
> Tel: +46 8-230225                  Fax: +46 8-230231
> 
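
The masking idea in the quoted message can be sketched in C as 
follows.  This is only an illustration of the proposal, not existing 
Nagios behaviour; the struct and function names are invented here, and 
the thread does not define what any flag bit would mean.

```c
/* Hypothetical result of splitting a plugin return code. */
struct decoded_code {
    int service_status;  /* low two bits: 0..3 */
    int plugin_flags;    /* everything above the low two bits */
};

/* Split a return code as proposed in the quoted message: the low two
 * bits carry the state, the remaining bits carry optional flags. */
struct decoded_code decode_return_code(int return_code)
{
    struct decoded_code d;

    d.service_status = return_code & 0x3;
    d.plugin_flags   = return_code & ~0x3;
    return d;
}
```

For example, a return code of 0x12 would decode to service_status 2 
(CRITICAL) with plugin_flags 0x10, where 0x10 is an arbitrary flag 
value chosen for the example.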



Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org




