[Nagiosplug-devel] [Nagiosplug-help] check_ntp

Andreas Ericsson ae at op5.se
Sun Nov 11 18:33:43 CET 2007


Thomas Guyot-Sionnest wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 11/11/07 09:58 AM, Andreas Ericsson wrote:
>> sdepThomas Guyot-Sionnest wrote:
>>> check_ntp's description in --help is:
>>>
>>> "This plugin checks the selected ntp server"
>>>
>>> To me that clearly doesn't mean we're checking the local time against
>>> the given ntp server.
>>>
>> So amend the message then, perhaps? Changing 400 lines of code (introducing
> 
> Isn't that as bad? A plugin doesn't do what it is advertised to do, so
> change the description and get out with it? What about people that
> actually relied on this description? Shouldn't they get what they think
> they're getting?
> 

Well, the plugin *does* check the selected ntp server. The description is
just ambiguous, not actually wrong.

>> at least one segfault crash in the process) because a message is wrong
>> doesn't seem like a stellar plan, tbh.
> 
> I didn't reviewed my code yet; I also planed some more thoughtful tests
> and even putting check_ntpd in a full production system before release.
> 
> If you found any bug I'll be glad to fix them.
> 

Here's one.

diff --git a/plugins/check_ntpd.c b/plugins/check_ntpd.c
index 2ac2a55..84ebe34 100644
--- a/plugins/check_ntpd.c
+++ b/plugins/check_ntpd.c
@@ -583,75 +346,100 @@ double jitter_request(const char *host, int *status){
        if(verbose) printf("%d candiate peers available\n", num_candidates);
        if(verbose && syncsource_found) printf("synchronization source found\n")
        if(! syncsource_found){
-               *status = STATE_UNKNOWN;
+               status = STATE_WARNING;
                if(verbose) printf("warning: no synchronization source found\n")
        }


 


>>> If we go that way, it will stay with its current semantics, but it'll
>>> say that it's deprecated and will be removed much later (Maybe next
>>> major release).
>>>
>> Still really bad. There's no real reason for it! Simply renaming it is
>> NOT a valid reason, although I see you've also robbed check_ntp of some
>> of its functionality in the check_time_ntp incantation.
> 
> Because checking jitter just make no sense when you want to check if
> your local host is sync. This can be confusing too; some people may
> thing it's the jitter calculated during the check and not reported my
> the ntp server, while others may assume since there's a jitter option
> that the offset is for the peer as well.
> 
> There's no point in checking two different things in the same plugin and
> this is why I split them.
> 
>>> I'm not sure if you follow me well there. By "deprecating" check_ntp I
>>> just mean we'll yell it all over the place without touching it (apart
>>> from normal bugfixes). However people will have the option to use two
>>> new checks that do what they should do.
>>>
>> I understand perfectly well what "deprecating" means, thanks. What I
>> mean is that there's no real reason for it. None. Zip. Zero. So what
>> you'll end up doing is forcing people who are using a perfectly fine
>> plugin to pick another one (that does less btw, as check_time_ntp can't
>> check for jitter) in order to retain exactly the same setup as they
>> already have. Can't you see how unforgivably ridiculous this is?
> 
> I totally disagree there. First check_ntp IS broken. "This plugin checks
> the selected ntp server". Ok, I run ntpd on all my server, so I add a
> service check for all my servers that do "check_ntp -H $HOSTADDRESS$
> - --some-thresholds-here", right? WRONG.
> 

That depends. If all your servers are supposed to be in sync you'll get
a warning when one of them goes out of sync with that. Granted, if all
servers *but* the nagios server stay in sync you'll get a bucket-load
of notifications, but that's beside the point.

> So I tell them if you want to monitor your ntp servers

But there you go re-introducing the same ambiguity again. Be explicit
in your new plugin.

"If you want to check if your ntp server is in sync with its peers,
use check_ntpd"

>, use check_ntpd.
> if you want to check the time diff between a server and a remote clock,

Ambiguous again. "If you want to check if a server is in sync with a
particular ntp server, use check_ntp_time" (check_time_ntp and check_ntpd
wouldn't sort next to each other in directory listings, so check_ntp_time
would be better, imo).

> run check_time_ntp on them. If you do it that way you don't have to
> check the jitter.
> 
> Second, they can keep their setup by using check_ntpd. the other option
> is broken by design. Checking local offset and remote jitter make no sense.
> 

No, they CAN'T keep their setup by using check_ntpd, since that will check
something completely different!

> I also think we could start a release notes document that would cover
> more in-depth the changes in NEWS and current issues in BUGS.
> 

Sensible idea, although I have a problem finding a target audience. Those
who just want to know if a particular release of the plugins package will
solve a problem for them will just look at the news. Those interested in
technicalia will have no problems reading the info straight out of the
source, since the plugins are so simple.

But we're digressing.

>>> IMHO for many people it's already broken and they just don't realize it.
>>> When I do a NTP check I check the $HOSTADDRESS$, which mean I want to
>>> know about the health of the NTP server, not the offset between Nagios
>>> and the server. Do you know many people that run check_ntp over nrpe and
>>> specify a time server?
>>>
>> Yes. I also know many people that want to be notified if their nagios
>> server's clock starts drifting, and I know that everyone will simply
>> hate whoever makes them sit down to fix some breakage in their nagios
>> installation simply because a plugin was renamed. They'll be equally
> 
> Not renamed. I talking about creating NEW ones.
> 
>> pissed if a plugin that formerly did something now does something
>> entirely different (although that particular disagreement seems to
>> have been settled, since you named it check_ntpd instead of replacing
>> check_ntp).
> 
> Yes and I mentioned creating two checks and not touching check_ntp two
> posts ago, so I don't understand why it's still going on in the thread.
> 

Because you want to deprecate it, which means "get ready to remove it in
the future", while you're saying "oh, but lookee here you can use this
new plugin to do the exact same thing".

>>> The newer check specifically warn if the server isn't sync (I may
>>> add an option to make that configurable - ok/warn/crit/unknown).
>>>
>> Which server? The one check_ntp runs on, or the one it's checking?
> 
> I mean it checks if the remote server has a sync.peer (noted with a star
> in "ntpq -p" output. This might not be working well yet, but as I said
> it's still work in progress...
> 

So changing from check_ntp to check_ntpd would mean one gets no warning
if the nagios server starts drifting then? That's a clear change of
semantics right there, and that's why a LOT of people will think the
new plugin is just plain broken.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231




More information about the Devel mailing list