[Nagiosplug-devel] Re: Rewrite of check_log

Stanley Hopcroft Stanley.Hopcroft at IPAustralia.Gov.AU
Wed Feb 25 19:46:01 CET 2004


Dear Gentlemen,

I am writing to thank you for your letter and say,

On Thu, Feb 26, 2004 at 02:39:56AM +0000, Paul L. Allen wrote:
> Flo Gleixner writes: 
> 
> > I did a half-rewrite of the check_log script.
> 
> Good.  Both check_log and check_log2 are seriously flawed (in my
> opinion).  They do NOT scale well for large logs.

If you are 'serious about logging' or events then

1 Log centrally

2 Monitor with an event correlator like Sec or RuleCore (Logsurfer or 
Swatch maybe)

3 Inform Nagios with passive service check results when the event 
correlator detects that an event corresponding to a Nagios 
service has fired
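
As a rough sketch of step 3 (not taken from Sec or from any existing plugin): the event correlator's action could hand a result to Nagios by writing a PROCESS_SERVICE_CHECK_RESULT line to the external command file. The host name, the service description "su_failures" and the command file path are assumptions for illustration; the path in particular is site-specific.

```python
import time


def format_passive_result(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line.

    return_code follows the usual plugin convention:
    0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
    """
    timestamp = int(time.time() if now is None else now)
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, output)


def submit_passive_result(command_file, line):
    """Write the command line to the Nagios external command file.

    command_file is normally a named pipe; its location is an
    assumption here and depends on the local installation.
    """
    with open(command_file, "w") as pipe:
        pipe.write(line)
```

The correlator would call format_passive_result() when a rule fires and pass the result to submit_passive_result(); the service must be defined in Nagios with passive checks enabled for the result to be accepted.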

(think of events as rates or time-ordered sequences of messages, eg 'Feb 7
15:36:47 pc09011 su: BAD SU anwsmh to root on /dev/ttyp1' occurring above
or below a given rate threshold: you simply can't do this with plain
pattern matching)
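
To make the rate idea concrete, here is a minimal sketch of a sliding-window rate check over syslog lines. It is only an illustration of the principle, not how Sec works internally; the names RateDetector and find_bursts, the threshold and the window are all hypothetical, and the year is passed in because classic syslog timestamps omit it.

```python
import re
from collections import deque
from datetime import datetime, timedelta

# Pattern for the failed-su message quoted above.
BAD_SU = re.compile(r"su: BAD SU (\S+) to (\S+) on (\S+)")


class RateDetector:
    """Fires when `threshold` events fall within `window_seconds`."""

    def __init__(self, threshold, window_seconds):
        self.threshold = threshold
        self.window = timedelta(seconds=window_seconds)
        self.times = deque()

    def observe(self, when):
        """Record one event time; return True if the rate is exceeded."""
        self.times.append(when)
        # Forget events that have dropped out of the window.
        while when - self.times[0] > self.window:
            self.times.popleft()
        return len(self.times) >= self.threshold


def parse_syslog_time(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' of a syslog line."""
    month, day, clock = line.split(None, 3)[:3]
    return datetime.strptime(
        "%d %s %s %s" % (year, month, day, clock), "%Y %b %d %H:%M:%S")


def find_bursts(lines, detector, year):
    """Return the timestamps at which the BAD SU rate was exceeded."""
    alerts = []
    for line in lines:
        if BAD_SU.search(line):
            when = parse_syslog_time(line, year)
            if detector.observe(when):
                alerts.append(when)
    return alerts
```

A single pattern match sees each line in isolation; the detector above keeps state between lines, which is the part a plain grep cannot do.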

This approach can't be beaten for scalability and flexibility. Sec handles
serious message rates (100s of messages per second IIRC, with lots of event
rules) and supports multi-line matching and event correlation.

> 
> > - No more diff. It calculates the difference with the difference of the
> >   filesizes and only greps the added lines. This implies:
> > - greatly increased performance for large log files.
> 
> Good.  So what happens when logrotate kicks in?

Sec doesn't care; it treats the rotation as a normal event and keeps reading
from the end of the file (when run in tail mode).
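
For anyone staying with the plugin approach, the rotation case can be handled along these lines. This is a minimal sketch, neither Sec's implementation nor the rewritten check_log: it remembers a byte offset and the file's inode, reads only what was appended, and starts again from the beginning when the inode changes or the file shrinks. The function name and the layout of the state dict are hypothetical, and a real plugin would have to persist the state between runs.

```python
import os


def read_new_lines(path, state):
    """Return complete lines appended to `path` since the last call.

    `state` is a dict holding 'inode' and 'offset' between calls.
    On the first call (empty state) the whole file is read.
    """
    st = os.stat(path)
    rotated = (state.get("inode") != st.st_ino
               or st.st_size < state.get("offset", 0))
    if rotated:
        # New file (or truncated file): start from the beginning.
        state["offset"] = 0

    with open(path, "rb") as fh:
        fh.seek(state["offset"])
        data = fh.read()

    # Only consume up to the last newline, so a line that is still
    # being written is picked up whole on the next call.
    end = data.rfind(b"\n") + 1
    state["inode"] = st.st_ino
    state["offset"] += end
    return data[:end].decode("utf-8", "replace").splitlines()
```

Lines written to the old file between the last read and the rotation are not seen by this sketch; catching them would mean reading the rotated file as well.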

  .. snip ...

If you actually see logs as containing events, or rather see the log as
nothing more than the record of an event stream, all the problems you
mention go away.

Events by definition hold state.

On the other hand, I can imagine that for ad-hoc checks of distributed 
logs the check_log.* plugins do an admirable job.

There are at least three conditions where these plugins provide the only 
solution:

1 distributed logs

2 event source heartbeat - eg application is still alive

3 low-rate but critically important logged messages

All of these things can be done with event correlation, but you may not
have the means or the willingness to do so (eg no logging policy,
different responsibilities and views [eg NT admins in some cases see no
need for central logging]).

> 
> -- 
> Paul Allen
> Softflare Support 
> 

Yours sincerely.

-- 
------------------------------------------------------------------------
Stanley Hopcroft
------------------------------------------------------------------------

'...No man is an island, entire of itself; every man is a piece of the
continent, a part of the main. If a clod be washed away by the sea,
Europe is the less, as well as if a promontory were, as well as if a
manor of thy friend's or of thine own were. Any man's death diminishes
me, because I am involved in mankind; and therefore never send to know
for whom the bell tolls; it tolls for thee...'

from Meditation 17, J Donne.



