[Nagiosplug-devel] RFC on proposed ePN changes/Regression testing of Perl plugins.

Stanley Hopcroft Stanley.Hopcroft at IPAustralia.Gov.AU
Tue Jan 13 19:01:02 CET 2004


Dear Gentlemen,

Is anyone interested in checking the latest Perl plugins with
'usability' modifications to ePN (to wit, the patches to p1.pl
described below)?

It really needs someone with :-

. an interest in Perl plugins

. a test or production configuration against which the plugin set is
_known_ to work.

. hopefully an interest in ePN or the willingness to bring one up
briefly.

If anything, this is a good regression test, since any plugin that
generates warnings or errors at either compile or run time gets booted
with a message like

tsitc> tail -1500 nagios.log | grep -i ePN
[1073856915] SERVICE ALERT: firewall;Mainframe access via outsourced
firewall;UNKNOWN;SOFT;1;**ePN 'check_mf' problem connecting to
"202.14.186.30", port 23: connection timed-out at (eval 40) line 65

[1073862005] SERVICE ALERT: asterix;COMS ad-hoc
check;UNKNOWN;SOFT;1;**ePN 'check_coms' Use of uninitialized value at
(eval 54) line 219

[1073862065] SERVICE ALERT: asterix;COMS ad-hoc
check;UNKNOWN;SOFT;2;**ePN 'check_coms' Use of uninitialized value at
(eval 54) line 219

[1073862125] SERVICE ALERT: asterix;COMS ad-hoc
check;UNKNOWN;HARD;3;**ePN 'check_coms' Use of uninitialized value at
(eval 54) line 219

[1073934009] SERVICE ALERT: sapintranet;Web interface to SAP
R3;UNKNOWN;SOFT;1;**ePN 'check_sapintranet' Use of uninitialized value
at /usr/local/lib/perl5/site_perl/5.005/Dfa/MainframeSession.pm line
174, <F> chunk -7

[1073934069] SERVICE ALERT: sapintranet;Web interface to SAP
R3;UNKNOWN;HARD;2;**ePN 'check_sapintranet' Use of uninitialized value
at /usr/local/lib/perl5/site_perl/5.005/Dfa/MainframeSession.pm line
174, <F> chunk -7
tsitc>

In all these cases, these errors represent shortcomings or faults in my
own plugins.

Yours sincerely.

-- 
------------------------------------------------------------------------
Stanley Hopcroft
------------------------------------------------------------------------

'...No man is an island, entire of itself; every man is a piece of the
continent, a part of the main. If a clod be washed away by the sea,
Europe is the less, as well as if a promontory were, as well as if a
manor of thy friend's or of thine own were. Any man's death diminishes
me, because I am involved in mankind; and therefore never send to know
for whom the bell tolls; it tolls for thee...'

from Meditation 17, J Donne.

More details/original RFC for testers.

I am writing to invite those with embedded Perl Nagios (ePN)
installations to consider, comment on, and/or test patches to the
Perl component (p1.pl) intended to increase the usability of ePN by

1 (minor correction) returning UNKNOWN for plugins that do not conform
  to the plugin guidelines by failing to call exit (either abend or by
  design) - formerly Nag would report CRITICAL

2 Logging run-time errors in the Nagios Log.  eg

tsitc> grep -i epn nagios.log 
[1072960109] SERVICE ALERT: asterix;J2EE
Scheduler;UNKNOWN;SOFT;1;**ePN plugin 'check_scheduler' has syntax
errors. Check ePN log. 

[1073100771] SERVICE ALERT: external;AUB search;UNKNOWN;SOFT;1;**ePN
'check_aub' syntax errors. Check ePN log.

[1073100851] SERVICE ALERT: asterix;J2EE
Scheduler;UNKNOWN;SOFT;1;**ePN 'check_scheduler' runtime
error: Illegal division by zero at (eval 52) line 35.

[1073281491] SERVICE ALERT: external;ATMOSS
search;UNKNOWN;SOFT;1;**ePN 'check_atmoss' syntax errors. Check ePN
log.

[1073732647] SERVICE ALERT: external;ATMOSS
search;UNKNOWN;SOFT;1;**ePN 'check_atmoss' Bareword found where
operator expected at (eval 49) line 66, near
"'http://external/atmoss/falcon.application_start :
tsitc> 

3 Logging the text that ePN actually runs on detection of a plugin
syntax error eg

[1073730787] **ePN 'ap5' error 'Global symbol "$ppp" requires explicit
package name at (eval 23) line 15.
' in text "
         0  
         1                      package main;
         2                      use subs 'CORE::GLOBAL::exit';
         3                      sub CORE::GLOBAL::exit { die
"ExitTrap: $_[0] (Embed::ap5)"; }
         4                      package Embed::ap5; sub hndlr { 
         5  shift(@_);
         6  @ARGV=@_;
         7  local $^W=1;
         8  #!/usr/bin/perl -w
         9  
        10  use strict ;
        11  
        12  # use diagnostics ;
        13  
        14  $ppp = 0 ;                          # var name that is
_unlikely_ to be used by other plugins.
        15  
        16  while ($_ = shift @ARGV) {
        17    # print "\$ARGV\[$ppp\]: $_  ;    # NB embedded Perl only
reads __1__ (one) line of output !
        18    print "\$ARGV\[$ppp\]: $_ " ;             # NB embedded
Perl only reads __1__ (one) line of output !
        19    $ppp++ ;
        20  }
        21  
        22  exit 0;                                     # Like all
plugins should do (otherwise, unknown [3] return status).
        23   }
        24                      "

As you can see, the text run by ePN is turned into a subroutine. This
has consequences for plugin authors since there are other subtleties to
be concerned about (eg closure retention of values)

Plugin syntax errors are logged to a new log file, tentatively named
/usr/local/nagios/var/epn.log
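
To illustrate the closure point above, here is a minimal sketch (not
taken from p1.pl) of what can happen when plugin text containing a named
sub is wrapped in a handler subroutine. The package name Embed::demo,
the plugin text and the host names alpha and beta are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical plugin text: as a standalone script, report() sees the
# file-level lexical $host and everything works as the author expects.
my $plugin_text = <<'PLUGIN';
my $host = shift @ARGV;
sub report { return "checking $host" }
my $msg = report();
print "$msg\n";
PLUGIN

# Wrap the text in a subroutine, roughly as in the logged ePN text above.
my $wrapped = "package Embed::demo; sub hndlr { \@ARGV = \@_;\n$plugin_text}\n1;";
{
    # Perl would warn 'Variable "$host" will not stay shared' here;
    # silenced only so that the demo runs quietly.
    no warnings 'closure';
    eval $wrapped or die "compile failed: $@";
}

# Call the handler twice with different arguments, capturing what it prints.
my @lines;
for my $h (qw(alpha beta)) {
    my $buf = '';
    open(my $fh, '>', \$buf) or die "cannot open in-memory handle: $!";
    my $old = select($fh);
    Embed::demo::hndlr($h);
    select($old);
    close $fh;
    chomp $buf;
    push @lines, $buf;
}
print "$_\n" for @lines;
```

The named sub report() is compiled once and keeps referring to the first
$host, so the second call should still report the first host. Passing
values as arguments to the inner sub, rather than relying on outer
lexicals, avoids the problem.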

4 Plugins that generate warnings are _not_ run.

This is a good idea since the basic ePN function is to retain namespaces
(stashes) containing globals and packages (modules). If the plugin is
allowed to live, it may fail the first time but succeed the next time
it's run (after another plugin loads a module the first one needs, or
worse, sets a global value that the first plugin should have set).

Obviously this is confusing if not infuriating and dangerous.
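
A minimal sketch of the order dependence described above, assuming two
hypothetical handlers (Embed::check_a and Embed::check_b) and a
hypothetical global $main::TIMEOUT sharing one long-running interpreter:

```perl
use strict;
use warnings;

package Embed::check_a;

sub hndlr {
    # Bug: relies on a global that this plugin never sets itself.
    return defined $main::TIMEOUT
        ? "OK - timeout is $main::TIMEOUT"
        : "UNKNOWN - timeout not set";
}

package Embed::check_b;

sub hndlr {
    $main::TIMEOUT = 15;    # side effect that outlives this call
    return "OK - check_b done";
}

package main;

my @history = (
    Embed::check_a::hndlr(),    # first run: the global is still undefined
    Embed::check_b::hndlr(),    # sets the global for every later caller
    Embed::check_a::hndlr(),    # second run: "works" only by accident
);
print "$_\n" for @history;
```

The result of check_a depends on whether check_b happened to run first,
which is the sort of behaviour that refusing to run the plugin is meant
to prevent.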

Apart from these changes 

1 The operation, intent and style of p1.pl is largely retained. In
particular the API is unchanged.

There are _no_ changes to the Nag C code (at this stage).

2 Sites that are vigilant about the plugin standards are unlikely to
notice any difference in behaviour. Be aware however that ePN covers not
only plugins but _also_ Perl event handlers.

3 There is some docco and maybe some test code. A Perl version of
mini_epn is also planned - about time I can hear many say.

The modified p1.pl has been tested informally by me with 

1 my production ePN (200 hosts/420 services) with FreeBSD 4.9/Perl
5.005

2 the mini_epn ePN simulator (/contrib)

3 a test Nag (hacked plugins/few checks/no alerts) with FreeBSD
4.9/Perl 5.8.0

Please note that these changes are designed to prevent 

- weird ePN error messages being logged

- anti-social ePN behaviour (alerts from a new buggy plugin)

and to provide sysadmins and Perl plugin developers feedback about
plugin execution by ePN.
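
The rules described earlier (trap exit, UNKNOWN when exit is never
called, UNKNOWN on warnings or errors) can be sketched roughly as below.
This is an illustration only and not the patched p1.pl: run_plugin, the
sample plugin texts and the package-local exit override (via 'use subs',
where the logged ePN text overrides CORE::GLOBAL::exit) are all
assumptions made for the demo.

```perl
use strict;
use warnings;

# Compile and run hypothetical plugin text once; return (status, reason).
# Status codes: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
sub run_plugin {
    my ($name, $text) = @_;
    my $pkg = "Embed::$name";

    # Assumption: a package-local exit override, so the demo does not
    # change exit() anywhere else in the process.
    my $wrapped = <<"CODE";
package $pkg;
use subs 'exit';
sub exit { die "ExitTrap: \$_[0] ($pkg)\\n" }
sub hndlr {
$text
}
1;
CODE

    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    # Compile step: reject on syntax errors or compile-time warnings.
    my $compiled = eval $wrapped;
    return (3, "syntax error: $@") unless $compiled;
    return (3, "compile-time warning: $warnings[0]") if @warnings;

    # Run step: exit() surfaces as an ExitTrap exception carrying the code.
    eval { $pkg->can('hndlr')->(); };
    my $err = $@;
    return (3, "run-time warning: $warnings[0]") if @warnings;
    return ($1 + 0, '')                if $err =~ /^ExitTrap: (\d+)/;
    return (3, "run-time error: $err") if $err;
    return (3, 'plugin did not call exit');
}

# Hypothetical plugin bodies covering each case.
my @demo = (
    [ good     => 'exit 0;' ],
    [ critical => 'exit 2;' ],
    [ no_exit  => 'my $n = 1;' ],
    [ div_zero => 'my $n = 0; my $r = 1 / $n; exit 0;' ],
    [ uninit   => 'my $v; my $s = "value: $v"; exit 0;' ],
    [ broken   => 'my $x = ;' ],
);

my %status;
for my $case (@demo) {
    my ($name, $text) = @$case;
    my ($code, $why) = run_plugin($name, $text);
    $status{$name} = $code;
    chomp $why;
    print "$name => $code", ($why ? " ($why)" : ''), "\n";
}
```

Only the plugins that call exit cleanly, with no warnings, keep their
own status; everything else comes back as UNKNOWN with a reason that
could be logged.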

The patches do _not_ address the performance of ePN. They do _not_

1 Deal with ePN memory leaks - my _informal_ testing shows that leak
rate seems unchanged.

2 Make ePN faster (or slower - probably)

Future changes, however, may increase the scope to include returning
plugin output via the calls to Perl (dispensing with the file system
method). This should speed things up a little and save Nag some sys IO
calls.

If there is someone out there waiting to write mod_nagios, I will shut
up real quick but in the meantime you are stuck with someone that is
learning - real slowly.

If you are interested in receiving the patches, please write me
privately.

Note that although the changes are likely to be applied to Nag 2.0, this
is a strictly private initiative that has nothing to do with the Nag
core or Mr Galstad.  As usual, it is on your own head, and don't blame
Ethan.






