[Nagiosplug-devel] Re: Rewrite of check_log

Flo Gleixner flo at bier.homeip.net
Mon Mar 1 20:10:26 CET 2004



On Thu, 26 Feb 2004, Stanley Hopcroft wrote:

> Dear Gentlemen,
>
> I am writing to thank you for your letter and say,
>
> On Thu, Feb 26, 2004 at 02:39:56AM +0000, Paul L. Allen wrote:
> > Flo Gleixner writes:
> >
> > > I did a half-rewrite of the check_log script.
> >
> > Good.  Both check_log and check_log2 are seriously flawed (in my
> > opinion).  They do NOT scale well for large logs.
>
> If you are 'serious about logging' or events then
>
> 1 Log centrally
>

How do you do that? With syslog? That is not bad, but it is neither reliable
nor secure. With nsca I can do some basic encryption if I want, and my syslog
does not have to log spoofed messages :-) I know, Nagios is not a central
logger ...


> 2 Monitor with an event correlator like Sec or RuleCore (Logsurfer or
> Swatch maybe)
>

Yes, that is surely the better solution. Can you point me to the homepage of
Sec?

> 3 Inform Nag with passive service check results when the event
> correlator detects that such and such an event corresponding to a Nag
> service has fired
>
> (think events as rates or time ordered sequences of messages eg 'Feb 7
> 15:36:47 pc09011 su: BAD SU anwsmh to root on /dev/ttyp1' occurring above
> and below a given rate threshold: you simply can't do this with plain
> pattern matching)
>
> Can't be beaten for scalability and flexibility. Sec handles serious
> message rates (100s of messages per sec IIRC; lots of event rules) with
> multi line matching and event correlation.
>
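
To make the rate idea concrete, here is a minimal sketch of a sliding-window
rate threshold. It is not how Sec implements it; the class name RateThreshold
and its parameters are made up for illustration, and timestamps are assumed to
be plain seconds that arrive in increasing order.

```python
from collections import deque


class RateThreshold:
    """Hypothetical helper: flags when too many events fall in a time window."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window = window_seconds
        self.times = deque()

    def feed(self, ts):
        """Record one event at time ts (seconds, non-decreasing).

        Returns True when more than max_events events lie within the
        last window_seconds, counting this one.
        """
        self.times.append(ts)
        # Forget events that have fallen out of the window.
        while self.times and self.times[0] <= ts - self.window:
            self.times.popleft()
        return len(self.times) > self.max_events
```

A check would call feed() once per matching line (for example each "BAD SU"
message) and raise an alert when it returns True. Plain pattern matching only
sees the single line, not how many came before it.
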
> >
> > > - No more diff. It calculates the difference with the difference of the
> > >   filesizes and only greps the added lines. This implies:
> > > - greatly increased performance for large log files.
> >
> > Good.  So what happens when logrotate kicks in?
>
> Sec doesn't care; it handles it as a normal event and keeps reading from
> the end of the file (run in tail mode).
>

So, does it read from the inode or from the file with that name? If it
reads from the inode, you have to kick it after logrotate. If it reads
from the filename (like my check_log does), it has to make sure that it has
read all lines of the old file that were added between the last check
and the logrotate. My script is NOT aware of that, and it sometimes simply
does not work. Some Linux distributions rotate and then compress the old
file, and some move the old file somewhere else. How do I know?

A simple way could be to record the inode and, if the inode has changed, try
to find the old inode and read the last few entries. But that is not reliable.
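
Roughly, the offset-plus-inode idea looks like the sketch below. It is not the
actual check_log code, only an illustration in Python; the function name
read_new_lines, the state dictionary and the rotated_path argument are all
assumptions. It also assumes the caller already knows where the rotated file
ended up, which is exactly the part that cannot be known in general.

```python
import os


def read_new_lines(path, state, rotated_path=None):
    """Return (new_lines, new_state) for the log file at path.

    state is None on the first run, otherwise a dict with the 'inode'
    and byte 'offset' saved by the previous run. rotated_path is a
    hypothetical guess at where logrotate moved the old file.
    """
    st = os.stat(path)
    lines = []
    offset = 0
    if state is not None:
        if state["inode"] == st.st_ino and st.st_size >= state["offset"]:
            # Same file and it only grew: continue where we stopped.
            offset = state["offset"]
        elif rotated_path and os.path.exists(rotated_path):
            # The file was replaced. If the guess really is the old file
            # (same inode), pick up the lines we had not seen yet.
            if os.stat(rotated_path).st_ino == state["inode"]:
                with open(rotated_path, "rb") as old:
                    old.seek(state["offset"])
                    tail = old.read().decode(errors="replace")
                    lines.extend(tail.splitlines())
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
        new_offset = f.tell()
    lines.extend(data.decode(errors="replace").splitlines())
    return lines, {"inode": st.st_ino, "offset": new_offset}
```

If the old file has been compressed or moved somewhere unexpected, the inode
comparison fails and the lines written just before the rotation are lost. A
line that is only half written at check time is also returned as it is.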

>   .. snip ...
>
> If you actually see logs as containing events, or rather as the log as
> nothing more than the record of an event stream, all the problems you
> mention go away.
>
> Events by definition hold state.
>
> On the other hand, I can imagine that for ad-hoc checks of distributed
> logs then the check_log.* plugins do an admirable job.
>
> There are at least three conditions where these plugins provide the only
> solution
>
> 1 distributed logs
>
> 2 event source heartbeat - eg application is still alive
>
> 3 low rate but critical import logged messages
>
> All of these things can be done with event correlation but you may not
> have the means or the willingness to do so (eg no logging policy,
> different responsibilities and views [eg NT admins in some cases see no
> need for central logging]).
>
> >
> > --
> > Paul Allen
> > Softflare Support
> >
>
> Yours sincerely.
>
> --
> ------------------------------------------------------------------------
> Stanley Hopcroft
> ------------------------------------------------------------------------

Florian Gleixner




