[Nagiosplug-devel] Check_ping question

Andreas Ericsson ae at op5.se
Wed Nov 10 16:42:01 CET 2004


Robert Nelson wrote:
>>Nonsensical. The timeout for iterative plugins should always be 
>>calculated based on the sum of per-iteration timeouts (in this case, 
>>5 seconds * 3 packets).
> 
> 
> Yes, but what I'm really looking for is a real "check-host-alive". I
> don't want a status report, I just want a PASS/FAIL result. Since Nagios
> seems configured to use check_ping for it, I am not looking for it to be
> an iterative plugin in this case.
> 

I'm working on writing a check_host_alive plugin, which will be able to 
do just this.

> 
>>>I end up with an *effective* timeout
>>>value of (5000.0 / 1000.0 * 3 + 3 =) 18 seconds. This seems "broken" to
>>>me.
>>>
>>
>>It isn't. If you specify a per-packet timeout value of 5 seconds and 
>>send 3 packets that means the complete timeout must be at least equal to 
>>15 seconds (I don't know where those extra three come from), otherwise 
>>the per-packet timeout wouldn't make sense (should you count packets as 
>>lost if they're not even sent?).
> 
> 
> I guess I also am confused on this one, as /bin/ping on most OS's will
> send packet 2 at 1000ms, and packet 3 at 2000ms, with a 5000 ms timeout,
> that's 7.0000000001 seconds total. Why are we going all the way up to
> 18, or even 15 seconds then?
> 

It's just the maximum. check_ping doesn't have a backoff factor (which 
is what you describe), since it forks system ping to do its dirty work.
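
To put the numbers from this thread side by side, here is a rough sketch
of the arithmetic. It is not code from check_ping.c; the function names,
the extra_s parameter and the 1-second send interval are assumptions made
only for illustration.

```python
def worst_case_seconds(per_packet_timeout_ms, packets, extra_s=0.0):
    """Upper bound: every packet is given its full per-packet timeout.

    extra_s models the unexplained "+ 3" from the quoted formula
    (an assumption; the thread does not say where it comes from).
    """
    return per_packet_timeout_ms / 1000.0 * packets + extra_s


def ping_schedule_seconds(per_packet_timeout_ms, packets, interval_s=1.0):
    """Time until the last packet times out when packets are sent at a
    fixed interval (assumed 1 second here), as described for /bin/ping."""
    return (packets - 1) * interval_s + per_packet_timeout_ms / 1000.0


print(worst_case_seconds(5000, 3))             # 15.0
print(worst_case_seconds(5000, 3, extra_s=3))  # 18.0
print(ping_schedule_seconds(5000, 3))          # 7.0
```

The first two figures are the upper bounds discussed above (15 and 18
seconds); the third is the roughly 7 seconds expected if the packets
overlap at a fixed send interval.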

> 
> 
>>>I ended up commenting out the last three lines quoted from check_ping.c
>>>and recompiling it. I'm just curious whether this is behavior by design
>>>or by error, and whether I need to make notes about it for when
>>>plugins-1.3.2 comes out. Thanks!
>>>
>>
>>It's obviously by design, and 1.3.2 won't come out. They're at 1.4.0 
>>now. What would be good would be to remove the timeout value, but that 
>>would make a LOT of configurations out there return UNKNOWN instead.
> 
> 
> I'll note that. I don't mind the way it calculates how long it *should*
> take. To me it appears that a specified "timeout" value should not be
> overridden, though.
> 

It isn't overridden. It's just prioritized lower than the timeout value 
derived from the threshold values, for programmatic reasons.
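
One way to picture that prioritization is the sketch below. It is a
guess at the behaviour based only on the formula quoted in this thread,
not a copy of check_ping.c; effective_timeout and its parameters are
hypothetical names, and treating the "+ 3" as the packet count is an
assumption.

```python
def effective_timeout(requested_timeout_s, per_packet_timeout_ms, packets):
    """Return the timeout that would actually apply.

    The value derived from the threshold wins whenever it is larger
    than the explicitly requested timeout.
    """
    # Assumption: the "+ 3" in the quoted formula is the packet count.
    derived_s = per_packet_timeout_ms / 1000.0 * packets + packets
    return max(requested_timeout_s, int(derived_s))


print(effective_timeout(10, 5000, 3))  # 18, the derived value wins
print(effective_timeout(30, 5000, 3))  # 30, the requested value is kept
```

Under this reading a requested timeout is never shortened, only raised
when the thresholds imply the check could legitimately run longer.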

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer



