[Nagiosplug-devel] [ nagiosplug-Feature Requests-1703823 ] [check_ntp] add option to verify stratum

John P. Rouillard rouilj at cs.umb.edu
Fri Apr 20 06:00:59 CEST 2007

In message <E1HejAy-0007Dt-B4 at sc8-sf-web21.sourceforge.net>,
"SourceForge.net" writes:
>Submitted By: Enrico Scholz (ensc)
>>Assigned to: Thomas Guyot (dermoth)
>Initial Comment:
>Please add an option which checks whether stratum of server exceeds a
>certain value.
>>Comment By: Thomas Guyot (dermoth)
>Date: 2007-04-19 22:48
>Logged In: YES 
>Originator: NO
>That would be fairly easy now that i know that much about check_ntp and
>can be interesting in many scenarios :)

I enhanced the original check_ntp script quite a while ago because it
was monitoring things incorrectly according to my understanding of

My stratum check accepts a range that the synchronization peer must be
at. So -s 2:3 means that the sync peer host must be at stratum 2 or 3.
It returns a warning if the host is lower or higher than it should be.
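
As a rough illustration only (this is not the actual script), the range
test for -s could be sketched in Python as below. The function names are
made up for the example, and mapping the warning to exit status 1
follows the usual Nagios plugin convention (0 = OK, 1 = WARNING).

```python
def parse_range(spec):
    """Split a 'min:max' string such as '2:3' into two integers."""
    low, sep, high = spec.partition(":")
    if not sep:
        raise ValueError("expected min:max, got %r" % spec)
    return int(low), int(high)


def check_stratum(stratum, spec):
    """Return (exit_status, message) for the sync peer's stratum.

    Assumption: a stratum outside the range is a WARNING (status 1),
    as described in the message; anything inside is OK (status 0).
    """
    low, high = parse_range(spec)
    if low <= stratum <= high:
        return 0, "OK: sync peer stratum %d is within %d:%d" % (stratum, low, high)
    return 1, "WARNING: sync peer stratum %d is outside %d:%d" % (stratum, low, high)


if __name__ == "__main__":
    # Example: -s 2:3 with a sync peer at stratum 4
    print(check_stratum(4, "2:3")[1])
```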

Also I think you already test warning and critical levels for the
jitter of the sys.peer (or pps.peer), but I added 

  -C (--candidates)
     Check number of backup servers that are capable of becoming
     peers. Range (min:max), solo number is minimum number.

If you look at the output of ntpq -p, this counts the number of * and +
hosts that are available. E.g., in this case:

     remote           refid      st t when poll reach   delay   offset  jitter
*ntp1       2 u  937 1024  377    0.393   -1.660   0.444
+ntp2       2 u  470 1024  377    9.619    4.526   1.559
+ntp3      2 u  645 1024  377   47.782   -1.140   0.382
 ntp4       2 u  470 1024  377    9.619   18.526  25.559
 LOCAL(0)        LOCAL(0)        10 l   61   64  377    0.000    0.000   0.004

would compare 3 to the range specified by -C. This is useful for
detecting peers that have lost synchronization leaving you vulnerable
to a timeshift due to lack of other peers to stabilize the algorithm.
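
A minimal Python sketch of that count, assuming the plugin already has
the text of "ntpq -p" in hand. The sample lines are the ones shown
above; count_candidates and in_range are hypothetical names, and how
the real script parses the output is not stated here.

```python
SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
*ntp1       2 u  937 1024  377    0.393   -1.660   0.444
+ntp2       2 u  470 1024  377    9.619    4.526   1.559
+ntp3      2 u  645 1024  377   47.782   -1.140   0.382
 ntp4       2 u  470 1024  377    9.619   18.526  25.559
 LOCAL(0)        LOCAL(0)        10 l   61   64  377    0.000    0.000   0.004
"""


def count_candidates(ntpq_output):
    """Count lines whose first character (the tally code) is '*' or '+'."""
    return sum(1 for line in ntpq_output.splitlines() if line[:1] in ("*", "+"))


def in_range(count, spec):
    """Check count against 'min:max'; a solo number is the minimum only."""
    low, _, high = spec.partition(":")
    if count < int(low):
        return False
    return not high or count <= int(high)


if __name__ == "__main__":
    n = count_candidates(SAMPLE)
    print(n, in_range(n, "2"))
```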


  -F (--falseticker)
   Check the number of falsetickers reported. Range (min:max).

so -F 0:2 means I will accept up to 2 falsetickers connected to my ntp
server. Good for detecting broken time sync sources.
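
One way this could be sketched in Python, assuming falsetickers are
recognized by the "x" tally code that ntpq prints in the first column
of "ntpq -p" (whether the script does it this way is not stated). The
host lines below are invented for the example, and the function names
are hypothetical.

```python
# Invented sample: only the leading tally character matters here.
LINES = [
    "*hostA   2 u  100 1024  377    0.400   -1.000   0.400",
    "xhostB   2 u  200 1024  377    9.600   80.000   1.500",
    "+hostC   2 u  300 1024  377   40.000   -1.100   0.300",
    "xhostD   2 u  400 1024  377    9.000  -95.000  20.000",
]


def count_falsetickers(lines):
    """Count peers that ntpq has flagged with the 'x' tally code."""
    return sum(1 for line in lines if line.startswith("x"))


def falsetickers_ok(count, spec):
    """True if count lies inside the 'min:max' range given to -F."""
    low, high = (int(part) for part in spec.split(":"))
    return low <= count <= high


if __name__ == "__main__":
    n = count_falsetickers(LINES)
    print(n, falsetickers_ok(n, "0:2"))
```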

Also I implemented:

   -l (--last) Maximum time (seconds) between packets received
      from polled machines.

This is compared against the "when" column in the ntpq output to look
for ntp servers that are unresponsive, down, or unreachable due to
network issues.
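
A hedged Python sketch of the comparison, taking the "when" field as an
already extracted string. It assumes ntpq's usual display, where large
values get an m, h or d suffix and "-" means no packet has been
received yet; treating "-" as stale is a choice made for this example.
The function names are hypothetical.

```python
UNITS = {"m": 60, "h": 3600, "d": 86400}


def when_to_seconds(field):
    """Convert a 'when' field such as '937', '35m' or '-' to seconds.

    Returns None for '-' (nothing received from that peer so far).
    """
    if field == "-":
        return None
    if field[-1] in UNITS:
        return int(field[:-1]) * UNITS[field[-1]]
    return int(field)


def is_stale(field, max_seconds):
    """True if the peer was last heard from more than max_seconds ago."""
    seconds = when_to_seconds(field)
    return seconds is None or seconds > max_seconds


if __name__ == "__main__":
    # Example: -l 1024 against three different 'when' values
    for when in ("937", "35m", "-"):
        print(when, is_stale(when, 1024))
```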

If you want, you can use some/all of these ideas in your check_ntp.

				-- rouilj
John Rouillard
My employers don't acknowledge my existence much less my opinions.
