[Nagiosplug-devel] Ping: I would expect Nagios to return -1, or 3 for "system14;UP;HARD;1;(No output!)"

Jason Crawford jasonrcrawford at gmail.com
Wed Feb 22 10:41:00 CET 2006


On 2/22/06, Lee Fitz <leefitzg at aol.com> wrote:
> I have a most perplexing couple of issues:
> ---- Nagios (ping plugin) appears to be returning '1' and this is causing an
> erroneous system UP notification
> - seems like it should be returning -1, or 3 = UNKNOWN, which I
> would assume would not translate to system UP
>
> date
> mmdd hhmm ss
> 0203 0332 43 [1138966363] HOST ALERT: system14;DOWN;SOFT;1;CRITICAL -
> Plugin timed out after 20 seconds
> 0203 0333 03 [1138966383] HOST ALERT: system14;DOWN;SOFT;2;CRITICAL -
> Plugin timed out after 20 seconds
> 0203 0333 23 [1138966403] HOST ALERT: system14;DOWN;HARD;3;CRITICAL -
> Plugin timed out after 20 seconds
> 0203 0333 23 [1138966403] HOST NOTIFICATION:
> mtools;system14;DOWN;host-notify-by-mknotify;CRITICAL - Plugin timed
> out after 20 seconds
>
> 0203 0426 05 [1138969565] HOST ALERT: system14;UP;HARD;1;(No output!)      <===this is the problem
> 0203 0426 05 [1138969565] HOST NOTIFICATION:mtools;system14;UP;host-notify-by-mknotify;(No output!)
>
> NOTE: I am using
> define host {
> check_interval 1
> ...
> check_command check-host-alive
> }
>
> define command {
> command_name check-host-alive
> command_line ....libexec/check_ping -H $HOSTADDRESS$ -w
> 6000.0,80% -c 15000.0,100% -p 1 -t 20
> }
>
> Any suggestions would be appreciated
>

This is the exact error I got, so I think it is related to the bug I
found in nagios-plugins 1.4.2, where the alarm signal handler tried to
use child_process without checking whether it was NULL. A small patch
I sent was merged into the current version of nagios-plugins, but as
far as I know it has not been back-ported to 1.4.2. For me, the cause
was the IPv6 host check: it calls getaddrinfo() to determine whether
the host is an IPv6 address, and that call could take long enough for
SIGALRM to fire while child_process was still NULL. The problem went
away when I built with --without-ipv6. Even with my fix the timeout
still occurs, but the output message is better (and nagios doesn't
segfault).

The thing I don't like about check_icmp is that it has to be run as
root or setuid root, whereas check_ping does not (it uses the system's
setuid ping binary).
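
To illustrate the NULL check, here is a rough sketch of the pattern in
Python. It is not the C code from nagios-plugins: the names on_timeout
and install_timeout are made up for this example, and child_process
only mimics the plugin's global of the same name. The point is that
the alarm can fire before any child has been spawned (for example
during a slow address lookup), so the handler has to tolerate that.

```python
import signal
import sys

STATE_CRITICAL = 2  # standard Nagios plugin exit code for CRITICAL

# Mimics the plugin's global: stays None until the ping child is spawned.
child_process = None


def on_timeout(child, timeout_s):
    """Kill the child if one exists, then build the plugin result."""
    if child is not None:  # the guard: the alarm may fire before any spawn
        child.kill()
        child.wait()
    message = f"CRITICAL - Plugin timed out after {timeout_s} seconds"
    return STATE_CRITICAL, message


def install_timeout(timeout_s):
    """Arm SIGALRM so a slow lookup or a hung ping ends the plugin cleanly."""
    def handler(signum, frame):
        # child_process is read when the alarm fires, not when it is armed.
        code, message = on_timeout(child_process, timeout_s)
        print(message)
        sys.exit(code)

    signal.signal(signal.SIGALRM, handler)
    signal.alarm(timeout_s)
```

Without the "is not None" test, a timeout that fires during the lookup
would fail inside the handler instead of reporting a clean CRITICAL
result, which is roughly what the unpatched handler did with a NULL
pointer.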

Jason



