[Nagiosplug-devel] RFC: Nagios 3 and Embedded Perl Plugins

Andreas Ericsson ae at op5.se
Thu Jan 4 11:06:52 CET 2007


Florian Gleixner wrote:
> Andreas Ericsson wrote:
>> Florian Gleixner wrote:
>>> True, leaks and crashes could make nagios more unstable. dl-plugins
>>> should be used with care. "Worker threads" could isolate some of the risk.
>>>
>>> The performance gain is simply the time a C plugin needs to create a
>>> process. You could say that this is not very much time, but some nagios
>>> setups run thousands of checks per minute. Here is a very simple test:
>>> Bash has the echo command built in. On most Linux systems you will
>>> also find a /bin/echo program with the same functionality. So compare:
>>>
>>> time for ((i=0 ; i< 10000 ; i++)) ; do echo bla ; done
>>> real    0m1.536s
>>> user    0m0.172s
>>> sys     0m0.020s
>>>
>>> time for ((i=0 ; i< 10000 ; i++)) ; do /bin/echo bla ; done
>>> real    0m34.047s
>>> user    0m8.761s
>>> sys     0m15.365s
>>>
>>> I think some default plugins like ping or tcp-check could be made into dl
>>> modules; the more complicated plugins, or those that are usually executed
>>> on the monitored nodes, should stay "normal" plugins.
>>>
>>> I have never looked at the nagios code; it was just an idea popping up.
>>>
>> A lower hanging apple is to make Nagios use fork() / execve() instead of 
>> using popen(), which does a double fork() / exec() thing.
>>
> 
> or use the popen() call from popen.{h,c} from the nagios plugins.


That doesn't leave room for passing the environment though, which will 
break a very valuable feature in Nagios atm. Btw, popen.[hc] have been 
replaced by runcmd.[hc]. How old a version are you running?
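
A minimal sketch of the plain fork()/execve() approach with an explicit
environment, written in Python for brevity since os.fork() and
os.execve() map directly onto the system calls. The name run_plugin and
the variable CHECK_HOST are made up for illustration; this is not how
Nagios or runcmd.[hc] actually do it.

```python
import os

def run_plugin(argv, env):
    """Run argv[0] directly via fork()/execve(), with no shell in between.

    `env` is the complete environment the child will see. Returns
    (exit_status, output_bytes); exit_status is -1 if the child was
    killed by a signal.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: point stdout at the pipe, then become the plugin.
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)
            os.close(write_fd)
            os.execve(argv[0], argv, env)
        finally:
            # Only reached if the setup or execve() itself failed.
            os._exit(127)
    # Parent: read until the child closes its end, then reap it.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return code, b"".join(chunks)
```

Called as run_plugin(["/usr/bin/env"], {"CHECK_HOST": "web01"}), the
child sees exactly the environment it was handed, and only one process
is created per check.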


> The nagios plugins also call external programs via this call. So at the
> moment one plugin check usually creates a shell process and the plugin
> executable process, and if the plugin itself creates a process we have
> three processes created for one simple ping.

No, there is the fork()/execve() in nagios (done through popen(3)), which
spawns a shell. Then there's the fork()/execve() in the shell, and
finally the plugin is run, so it's always three processes per plugin
invocation. If the plugin spawns e.g. /bin/ps or /bin/df, we have four
processes for one plugin.

> Ideally a dynamically loaded plugin that does not call external
> programs, but has the code of for example "ping" compiled in, does not
> create a single process.
> 

This is a Bad Idea because the core program can't block on read() calls,
which means all plugins that work over the network will have their
timing values skewed unless you run each check in a separate thread or
fork() a new nagios daemon for each dynamically loaded check. In that
case you've already lost 90% of the gain and ended up with a wicked
maintenance burden. That's without considering the initial cost
(in developer time) of rewriting all plugins to never use signals[1] (or
alarm(3)), which will be huge.

Also, for PING checks you're opening a new can of worms, since
implementing the ICMP protocol generally requires access to raw sockets,
which is, on almost all systems, restricted to the super-user. It's
possible to work around this by obtaining one[2] raw socket prior to
dropping the root privileges at startup, but then you'd be in for a
fairly complex ping program that needs to keep track of all the hosts
that currently have echo requests pending and assign each response to
the right check.
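
To give an idea of the bookkeeping involved, here is a rough sketch (in
Python, for brevity) of just the matching part: every outstanding echo
request is recorded under its ICMP identifier and sequence number, and
each reply read from the shared socket is looked up there. PendingPings
and its methods are hypothetical names. Socket handling, checksums, and
sequence-number wraparound collisions are left out on purpose.

```python
import struct
import time

ICMP_ECHO_REPLY = 0  # ICMP type of an echo reply

class PendingPings:
    """Track echo requests sent through one shared raw socket."""

    def __init__(self):
        self._pending = {}  # (ident, seq) -> (check_name, send_time)
        self._next_seq = 0

    def register(self, check_name, ident, now=None):
        """Record a request about to be sent; return its sequence number."""
        now = time.monotonic() if now is None else now
        seq = self._next_seq
        self._next_seq = (self._next_seq + 1) & 0xFFFF
        self._pending[(ident, seq)] = (check_name, now)
        return seq

    def match_reply(self, icmp_packet, now=None):
        """Map the ICMP part of a packet to (check_name, rtt), or None."""
        if len(icmp_packet) < 8:
            return None
        # Echo header: type, code, checksum, identifier, sequence number.
        icmp_type, _code, _cksum, ident, seq = struct.unpack(
            "!BBHHH", icmp_packet[:8])
        if icmp_type != ICMP_ECHO_REPLY:
            return None
        entry = self._pending.pop((ident, seq), None)
        if entry is None:
            return None  # late, duplicate, or not one of ours
        now = time.monotonic() if now is None else now
        check_name, sent = entry
        return check_name, now - sent

    def expire(self, timeout, now=None):
        """Remove and return the checks whose requests are older than timeout."""
        now = time.monotonic() if now is None else now
        late = [key for key, (_, sent) in self._pending.items()
                if now - sent > timeout]
        return [self._pending.pop(key)[0] for key in late]
```

Even this much has to live inside the core's event loop, and it still
says nothing about retransmits or unreachable-messages.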


[1] All module-based checks would want to catch the same signals, so the 
signal-handlers would be overwritten. alarm(3) is sometimes implemented 
through signals, so that's not usable.
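
The overwriting described in [1] can be shown in a few lines. This is
only an illustration in Python (whose signal module wraps the same
process-wide handler table and alarm timer); plugin_a_timeout,
plugin_b_timeout and demo are made-up names standing in for two checks
loaded into the same process.

```python
import signal

def plugin_a_timeout(signum, frame):
    raise TimeoutError("plugin A timed out")

def plugin_b_timeout(signum, frame):
    raise TimeoutError("plugin B timed out")

def demo():
    """Show that a second in-process check clobbers the first one's timeout."""
    original = signal.getsignal(signal.SIGALRM)
    try:
        # "Plugin A" installs its handler and arms a 30 second alarm.
        signal.signal(signal.SIGALRM, plugin_a_timeout)
        signal.alarm(30)
        # "Plugin B" does the same. There is one handler slot and one
        # alarm timer per process, so A's settings are simply replaced.
        replaced = signal.signal(signal.SIGALRM, plugin_b_timeout)
        a_remaining = signal.alarm(10)  # cancels A's alarm, returns its remainder
        active = signal.getsignal(signal.SIGALRM)
        return replaced, active, a_remaining
    finally:
        # Clean up so the demo leaves no alarm or handler behind.
        signal.alarm(0)
        signal.signal(signal.SIGALRM,
                      original if original is not None else signal.SIG_DFL)
```

After the second registration, plugin A has neither a handler nor a
pending alarm, and nothing told it so.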

[2] Obtaining one socket per ping-check at start-up and keeping them is
not feasible, since most systems normally only allow 1024
file descriptors per process.
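
The actual limit on a given system can be checked with getrlimit(2). A
small sketch in Python; sockets_available and the reserved value of 64
(descriptors set aside for logs, pipes and so on) are assumptions made
up for the example.

```python
import resource

def sockets_available(reserved=64):
    """Return how many descriptors are left for per-check sockets.

    Returns None if the soft limit is unlimited.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return None
    return max(soft - reserved, 0)
```

With the common soft limit of 1024, that leaves fewer than a thousand
sockets, which a large installation would exceed with host checks alone.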

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231



