[Nagiosplug-devel] Re: Rewrite of check_log

Paul L. Allen pla at softflare.com
Thu Feb 26 09:39:01 CET 2004


Jason Martin writes:
> I believe check_log2 uses a state file that has the byte position where 
> it last read. If the log rotates, the seek to that location will fail and it 
> will start from the beginning.

You're right.  I had a brief look at check_log2 many, many months ago and
remembered only that there was something I didn't like about it.  It is
doing the scanning relatively sensibly.  However, I don't think sufficient
protection has been given for the case where checks overlap for some
reason, although that is unlikely to happen unless the check interval
is absurdly low or the log file is absurdly big. 
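
A minimal sketch of the seek-file approach described above, in Python rather
than check_log2's own code. The function name, the state-file format (a single
decimal byte offset) and the rule "file shorter than saved offset means
rotation" are all assumptions made for illustration:

```python
import os

def read_new_lines(log_path, state_path):
    """Return lines appended to log_path since the offset saved in state_path."""
    try:
        with open(state_path) as f:
            offset = int(f.read().strip() or 0)
    except (FileNotFoundError, ValueError):
        offset = 0  # no usable state yet: scan from the start

    # Assumption: a log shorter than the saved offset was rotated or truncated.
    if offset > os.path.getsize(log_path):
        offset = 0

    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()
        new_offset = log.tell()

    # Write the new offset to a temporary file and rename it into place, so a
    # concurrent run never reads a half-written state file.
    tmp_path = state_path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write(str(new_offset))
    os.replace(tmp_path, state_path)

    return data.decode("utf-8", errors="replace").splitlines()
```

The atomic rename only keeps the state file from being corrupted. Two
overlapping runs could still both report the same lines, so real protection
against overlap would need a lock as well.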

The status code usage doesn't seem right to me.  If it gets a match it
reports a warning when, for most things, you'd want it to report critical.
If there is an error (like file not found) it reports critical when it
should, I believe, report unknown according to the plugin guidelines. 
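
As a rough illustration of the status mapping argued for above, here is a
sketch using the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL,
3 UNKNOWN). The function name and message wording are hypothetical, and this
is not how check_log2 itself is written:

```python
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

def check_log(path, pattern):
    """Return (exit_code, message) for a single scan of the file at path."""
    try:
        with open(path, errors="replace") as f:
            matches = [line for line in f if re.search(pattern, line)]
    except OSError as exc:
        # The check itself failed, so the state of the service is not known.
        return UNKNOWN, f"UNKNOWN - cannot read {path}: {exc.strerror}"
    if matches:
        last = matches[-1].strip()
        return CRITICAL, f"CRITICAL - {len(matches)} match(es), last: {last}"
    return OK, "OK - no matches"
```

A wrapper script would print the message and pass the code to sys.exit().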

Ideally, of course, it would be nice to specify an optional number of
matches for warning and critical levels so that you can do crude rate
detection of events (assuming that scheduling is regular, which in
practice it isn't).  Even better would be to use the timestamp on the
seek file to figure out the elapsed time and get a true rate.  I'm
thinking here of IDS logs - one intrusion attempt between check intervals
is probably nothing to worry about but a couple of hundred definitely
would be. 
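
The rate idea could look something like the sketch below. It assumes the seek
file's modification time marks the previous check, and that thresholds are
given in matches per minute; every name and unit here is invented for the
example:

```python
import os
import time

OK, WARNING, CRITICAL = 0, 1, 2

def seconds_since_last_check(seek_file, now=None):
    """Elapsed time since the seek file was last written (its mtime)."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(seek_file)

def matches_per_minute(match_count, elapsed_seconds):
    """Convert a raw match count into a rate over the elapsed interval."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return match_count * 60.0 / elapsed_seconds

def classify(rate, warn_per_min, crit_per_min):
    """Map a rate to a status, checking the critical threshold first."""
    if rate >= crit_per_min:
        return CRITICAL
    if rate >= warn_per_min:
        return WARNING
    return OK
```

With a five-minute gap between checks, one match is 0.2 per minute and two
hundred matches are 40 per minute, so thresholds of 1 and 10 per minute
would separate the two IDS cases mentioned above.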

Hmmm, and there is another mode of operation that would be useful.
Being able to latch an alert and specify another pattern that
cancels the alert.  So if it spots item X in the log it returns
critical on that and subsequent checks and only returns to OK if
it later spots item Y in the log.  Some things do log when there is
a problem and also when there is a recovery, although the ones that
come to mind (like pptp tunnels) are better monitored in other ways. 
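
A small sketch of the latching mode, assuming the latched flag is stored
between runs (for instance alongside the seek offset) and that only the lines
added since the last check are passed in. The function and parameter names are
made up for the example:

```python
import re

def update_latch(latched, new_lines, trigger_pattern, cancel_pattern):
    """Return the new latch state after scanning new_lines in log order."""
    for line in new_lines:
        if re.search(trigger_pattern, line):
            latched = True   # item X seen: raise (or keep) the alert
        elif re.search(cancel_pattern, line):
            latched = False  # item Y seen: clear the alert
    return latched
```

Because lines are processed in order, the most recent event wins: a recovery
followed by a new failure within one interval leaves the alert latched. The
check would return critical while the flag is set and OK otherwise.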

> You are correct in that check_log's method of using diff to read only the 
> new events is ...interesting. 

Tell me about it.  Ever since my boss learned shell he's been doing
"interesting" things with it.  I.e., crazy, inefficient and resource-
intensive things.  Things that are perfectly sensible for one-off
tasks, or stuff that runs once a week, but very silly when you're
running them every 5 minutes from cron, which is what he does with
them. 

-- 
Paul Allen
Softflare Support 




