[Nagiosplug-devel] Improved check_oracle (TS check with autoextend)

joerg.helmert at aracomp.de joerg.helmert at aracomp.de
Wed Mar 24 04:21:14 CET 2004


> -----Original Message-----
> From: nagiosplug-devel-admin at lists.sourceforge.net 
> [mailto:nagiosplug-devel-admin at lists.sourceforge.net] On 
> Behalf Of Stanley Hopcroft
> Sent: Wednesday, March 24, 2004 11:26 AM
> To: Andreas Ericsson
> Cc: Poitschke Kai; nagiosplug-devel at lists.sourceforge.net
> Subject: Re: [Nagiosplug-devel] Improved check_oracle (TS 
> check with autoextend)
> 
> > > The second change is a minor change. I set the status code to
> > > UNKNOWN if the tablespace could not be examined. I personally don't
> > > want to get CRITICAL notifications when I check a tablespace and get
> > > an Oracle error, because of a broken network connection or a database
> > > that is in shutdown/startup state etc.
> > > 
> > Bad Thing.
> > There's been a discussion on this on the nagios-users list. UNKNOWN
> > should only be reported when there is a user error (too few arguments,
> > etc. etc.). If it fails to fetch information from the network, that is
> > considered to be a critical error.
> 
> Eeek !
> 
> The (I think) newly committed infrastructure for embedded Perl
> Nagios (ePN) support does precisely this, i.e. if the plugin
> bombs out because of
> 
>  . compile time errors (probably because of the ePN environment)
> 
>  . run time errors
> 
> then UNKNOWN is returned (along with a dump depending on log 
> level of the ePN).
> 
> I share the former writer's concern about spurious alerts.
> 
> I canvassed this proposal (for new behaviour for ePN) with an 
> RFC to both Nag-users and Nag-devel and possibly plugindevel 
> as well, and got _no_ comments.
> 
> Personally, I have been running this way for some months now 
> and much prefer it to the former nightmare of committing a 
> new plugin only to find it notifies people unnecessarily 
> (yes, I test; use the epn simulator etc but still things go wrong).
> 
> Yours sincerely.
> 
> Stanley Hopcroft
Hmm, maybe I haven't been listening to the list long enough,
or didn't understand,
or felt I was too new to Nagios to answer properly...

I think that is right for compile-time errors.
If I understand correctly, such a plugin would never run successfully; it
would bail out with a compile error instead.
The same goes for missing command-line options.

But think of the following:
A plugin runs successfully and returns OK.
You start to rely on it.
Now something occurs that causes a runtime error
(someone deletes a needed file or changes permissions, or the filesystem
gets corrupted, or whatever).
It is true that the status of that check is, in reality, unknown.
But for me the overall picture is more important.
Something is going wrong after it was OK.
I want to KNOW a status but only find out that the status is unknown.
That is critical for me.

I reread the development guidelines and found that I missed something:

3 | Unknown | <snipped> or the plugin was unable to check the status of the
given hosts/service

That clearly describes what you implemented.

My opinion is still different.

Implementing it as UNKNOWN is more polite and keeps operators sleeping...
...but if knowing what is going on is most important, let's wake them up. ;-)
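
To make the two positions concrete, here is a minimal sketch of a plugin wrapper in which the status reported for a runtime failure is a parameter. The exit codes are the usual plugin ones; the names run_check and on_runtime_error, and the check callable itself, are made up for illustration and are not part of check_oracle or ePN.

```python
# Standard plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def run_check(check, args, on_runtime_error=CRITICAL):
    """Run a check and map failures to a (status, message) pair.

    check is a hypothetical callable returning (status, message).
    on_runtime_error is the status reported when the check itself fails
    at run time: CRITICAL wakes the operators, UNKNOWN lets them sleep.
    """
    if not args:
        # User error (missing arguments): UNKNOWN in either policy.
        return UNKNOWN, "UNKNOWN - missing arguments"
    try:
        return check(*args)
    except Exception as exc:
        # Runtime error: the real status is not known, so the policy decides.
        return on_runtime_error, f"{NAMES[on_runtime_error]} - check failed: {exc}"
```

A real plugin would print the message and pass the status to sys.exit(); the only point here is that the runtime-error policy is a single switch, so either behaviour is cheap to support.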


Regarding Kai's argument:
<citation>
I personally don't want to get CRITICAL notifications when I check a
tablespace and get an Oracle error, because of a broken network connection
or a database that is in shutdown/startup state etc.
</citation>

Couldn't that be done with dependencies?
I haven't played with dependencies yet, but I thought that's what they are for:
Tablespace check dependent on database-up check
All databases dependent on network check
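
As a rough model of that idea (not how Nagios itself implements dependencies, and not its configuration syntax), the sketch below suppresses a notification whenever something further up the chain is already failing. The service names, the DEPENDS_ON map and should_notify are all hypothetical, and walking the whole chain is a simplifying assumption.

```python
# Hypothetical dependency map: each service points at the one it depends on.
DEPENDS_ON = {
    "tablespace": "database_up",
    "database_up": "network",
}


def should_notify(service, states, depends_on=DEPENDS_ON):
    """Return True if a non-OK service should send a notification.

    states maps service name to plugin exit code (0 means OK).
    The notification is suppressed when any service up the
    dependency chain is itself not OK.
    """
    if states[service] == 0:
        return False  # nothing to report
    parent = depends_on.get(service)
    while parent is not None:
        if states[parent] != 0:
            return False  # the root cause is upstream
        parent = depends_on.get(parent)
    return True
```

With the database down and the network fine, only the database-up check would notify; the tablespace check stays quiet even if it returns CRITICAL.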

(I would use the tablespace check to check whether the db is up. No need for
an additional check. I'm interested in the db being up. So again, critical
for me...)


Bye,

Joerg




