[Nagiosplug-devel] check_ntp (Was: Flight 1.4.8, ready for boarding)

sean finney seanius at seanius.net
Wed Apr 4 22:03:41 CEST 2007


hey thomas,

On Wed, 2007-04-04 at 13:37 -0400, Thomas Guyot-Sionnest wrote:
> The problem with xntpd is that it doesn't have a jitter value (ntp v3).
> I'll work up a patch a bit different than what I sent but basically
> it'll do the same: take dispersion in place of jitter.

okay.  i *think* that's the same thing with just a different name,
right?

> While we were working on this check, me and Holger raised up a few
> issues. I worked up a todo list and I'd like to share it with you. Only
> the older ntp client support would go in before the 1.4.8 release.
> 
> 1. Older ntp server support (I'm working on it)
> 
> 2. The offset and jitter don't change on every call, so there's no
> reason to poll 4 times and compute the average. I'd like to remove all
> code related to that.

one of the design goals when i first wrote the plugin was to mimic as
closely as possible the behaviour of the previous perl-based check_ntp
plugin.  after analyzing that plugin, reading the ntp source, and packet
dumping what the ntp cmdline programs did, this was the behaviour i
found.  that doesn't necessarily mean it is the ideal behaviour, but
just so you know where it came from :)

but anyway, as far as sampling/averaging goes, the offset/delay can vary
a bit more if the network is less than reliable iirc, hence the multiple
requests.  this is what the ntp cmdline client does as well.
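
to make the sampling/averaging point concrete, here is a minimal sketch (not the plugin's actual code) of computing the offset and delay for one request/response exchange with the standard ntp formulas, then averaging the offset over several exchanges.  the struct and function names are made up for illustration, and timestamps are assumed to already be converted to seconds as doubles.

```c
#include <stddef.h>

/* One request/response exchange, all timestamps in seconds.
 * t1: client transmit, t2: server receive,
 * t3: server transmit, t4: client receive.
 * (Hypothetical type, for illustration only.) */
typedef struct {
    double t1, t2, t3, t4;
} ntp_sample;

/* Standard NTP clock offset for a single exchange. */
static double sample_offset(const ntp_sample *s)
{
    return ((s->t2 - s->t1) + (s->t3 - s->t4)) / 2.0;
}

/* Standard NTP round-trip delay for a single exchange. */
static double sample_delay(const ntp_sample *s)
{
    return (s->t4 - s->t1) - (s->t3 - s->t2);
}

/* Mean offset over n exchanges; returns 0.0 when n is 0. */
static double average_offset(const ntp_sample *samples, size_t n)
{
    double sum = 0.0;
    size_t i;

    if (n == 0)
        return 0.0;
    for (i = 0; i < n; i++)
        sum += sample_offset(&samples[i]);
    return sum / (double)n;
}
```

on a lossy or congested path the per-exchange delay moves around, and the computed offset moves with it, which is why averaging a handful of samples can still be worth it even if the server's own numbers don't change between calls.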

> 3. Allow -H to be used multiple times

seems reasonable, though i'd probably have a separate nagios check for
each host.  see comments below for (4) though.

> 3a. Do one lookup for the servers and store an array of IPs for the
> various functions. (Is it worth it? Will avoid code duplication
> implementing #4)

isn't that what's already done with the getaddrinfo() call and the
array-of-sockets allocation?
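
for reference, the general pattern i mean looks roughly like the sketch below: resolve the name once with getaddrinfo(), then open one connected udp socket per returned address and keep them in an array.  this is an illustration of the technique, not a copy of what check_ntp does; the function name open_peer_sockets is hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Resolve host once and open one connected UDP socket per address.
 * On success returns the number of sockets and stores a malloc'd
 * array in *socks_out (caller closes the sockets and frees the array).
 * Returns -1 if resolution fails or no socket could be opened.
 * (Hypothetical helper, for illustration only.) */
static int open_peer_sockets(const char *host, const char *port,
                             int **socks_out)
{
    struct addrinfo hints, *res, *ai;
    int *socks;
    int total = 0, opened = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;     /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_DGRAM;  /* NTP runs over UDP */

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    /* Count the addresses so the array can be allocated up front. */
    for (ai = res; ai != NULL; ai = ai->ai_next)
        total++;

    socks = malloc(sizeof(*socks) * (size_t)total);
    if (socks == NULL) {
        freeaddrinfo(res);
        return -1;
    }

    for (ai = res; ai != NULL; ai = ai->ai_next) {
        int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) != 0) {
            close(fd);
            continue;
        }
        socks[opened++] = fd;
    }
    freeaddrinfo(res);

    if (opened == 0) {
        free(socks);
        return -1;
    }
    *socks_out = socks;
    return opened;
}
```

supporting several -H arguments would then mostly be a matter of calling this once per hostname and concatenating the resulting arrays.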

> 4. When multiple servers are specified (either multiple IPs per hostname
> or multiple -H arguments), check the jitter for all servers.

i think this falls back into the mimicking-behaviour design again.
previously i believe we only checked the jitter on the remote clock
declared as the sync source, but i could be wrong.  i don't really think
this is the *right* behaviour, but before i went fixing it the idea was
to get something that was compatible with the current version.
 
actually, istr someone pointing out several months ago that we were
doing the wrong thing wrt jitter checking, and that we ought to be
checking the local jitter and not the jitter of remote systems, or
something like that.  i'm going from some hazy memory here, but i think
ultimately the problem is that there are two use cases for check_ntp,
and the code has never differentiated between the two.

first you have the case of checking the status of the local system, by
connecting to peers specified on the cmdline and verifying the offset.
in such cases we really want to see the local jitter and not the remote
jitter.

the second case is when you're actually interested in the status of the
remote system.  here you're comparing the state of its clock with that
of yours (or others), so you're interested in the jitter on the remote
system.

if i'm remembering all of this correctly, i think it would be best to
provide a flag for which form of check we're doing and then have the
plugin behave appropriately based on that.
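
as a rough sketch of what such a flag could drive internally (every name here is hypothetical, nothing like this exists in the plugin today): an enum for the two forms of check, a helper that picks which jitter value to evaluate, and a threshold comparison that maps to the usual nagios return codes (0 ok, 1 warning, 2 critical).

```c
/* The two use cases described above. (Hypothetical names.) */
typedef enum {
    CHECK_LOCAL_CLOCK,   /* verify the local system against its peers */
    CHECK_REMOTE_SERVER  /* verify the state of a remote server */
} check_mode;

/* Jitter values gathered during one run, in milliseconds. */
typedef struct {
    double local_jitter;
    double remote_jitter;
} jitter_readings;

/* Standard Nagios plugin return codes. */
enum { STATE_OK = 0, STATE_WARNING = 1, STATE_CRITICAL = 2 };

/* Pick the jitter value that matters for the selected mode. */
static double jitter_for_mode(check_mode mode, const jitter_readings *r)
{
    return mode == CHECK_LOCAL_CLOCK ? r->local_jitter : r->remote_jitter;
}

/* Compare the selected jitter against warning/critical thresholds. */
static int jitter_state(check_mode mode, const jitter_readings *r,
                        double warn, double crit)
{
    double j = jitter_for_mode(mode, r);

    if (j >= crit)
        return STATE_CRITICAL;
    if (j >= warn)
        return STATE_WARNING;
    return STATE_OK;
}
```

the point is only that the mode is decided once, from the cmdline, and everything downstream asks for "the jitter that matters" rather than hardcoding one or the other.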

> 5. Look into the possibility of storing some of the sent header in a
> linked list on write and then match them on reads. That will allow to
> send all packets as fast as possible (ex. when checking the jitter of
> all sync candidates) and also to easily drop odd packets. If put in a
> separate routine that would also allow to easily loop for additional
> packets and append the data. (Any other suggestion?)

i'm not quite sure i follow here.  how is this different from poll() on
an array of sockets...?  currently afaik the data *is* sent as fast as
possible, and we read the data as fast as it comes in.  if we need more
per-host information, we know ahead of time how many hosts/sockets/etc
are needed, so i don't think there's any need for a linked list instead
of a pre-allocated array for whatever extra data we need to track.
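
to illustrate what i mean by poll() plus a pre-allocated array, here's a minimal sketch (again not the plugin's code; peer_state and read_ready_peers are made-up names).  the per-host bookkeeping lives in a plain array indexed the same way as the sockets, so a reply is matched to its host by the socket it arrives on.

```c
#include <poll.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Per-host bookkeeping, one entry per socket. (Hypothetical type.) */
typedef struct {
    int fd;       /* connected UDP socket for this host */
    int replies;  /* datagrams received from this host so far */
} peer_state;

/* Wait up to timeout_ms for any peer to answer, then read one datagram
 * from every socket that is readable. Returns the number of datagrams
 * read, 0 on timeout, or -1 on error. */
static int read_ready_peers(peer_state *peers, size_t n, int timeout_ms)
{
    struct pollfd *pfds;
    char buf[512];
    size_t i;
    int got = 0;

    if (n == 0)
        return 0;

    pfds = calloc(n, sizeof(*pfds));
    if (pfds == NULL)
        return -1;

    for (i = 0; i < n; i++) {
        pfds[i].fd = peers[i].fd;
        pfds[i].events = POLLIN;
    }

    if (poll(pfds, (nfds_t)n, timeout_ms) < 0) {
        free(pfds);
        return -1;
    }

    for (i = 0; i < n; i++) {
        if ((pfds[i].revents & POLLIN) &&
            recv(pfds[i].fd, buf, sizeof(buf), 0) >= 0) {
            peers[i].replies++;  /* same index ties the reply to its host */
            got++;
        }
    }

    free(pfds);
    return got;
}
```

since the number of hosts is known before the first packet goes out, the array can be sized once and any extra per-host fields just become more members of the struct.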


	sean