[Nagiosplug-devel] Best Way to Monitor Cruisecontrol

Stanley Hopcroft Stanley.Hopcroft at IPAustralia.Gov.AU
Thu Feb 19 14:14:02 CET 2004


Dear Gentlemen,

I am writing to thank you for this interesting thread and to say that
event correlation, in its Sec incarnation, is well suited to this class of
checks: determining if something _did not_ happen.

On Wed, Feb 18, 2004 at 08:30:02PM -0500, Subhendu Ghosh wrote:
> On Wed, 18 Feb 2004, Robert Pearse wrote:
> 
> > I don't want to change the Cruisecontrol installation. There's already
> > too much involved in configuring it, as it is.
> > 
> > Maybe I should have asked for the best way to monitor a remote file.
> > 
> > Robert
> > 

  ... snip ...


> > > > I'm trying to monitor the Cruisecontrol process on a remote server
> > > > using Nagios. I think the easiest way to do that is to monitor
> > > > cruisecontrol.log and see if the file is changing every half hour.
> > > > If not, then it's stuck.
> > > >
> > > > What's the best way to monitor a file for changes on a remote
> > > > server?
> > > >
> 
> if you have ssh access you could:
> - check the timestamp (newness?)
> - diff file versus older copy 
>

Likewise you could export the file using one of several remote file systems
(NFS, SMB) and check the last modification time of the file with the stat
system call (another way of saying 'newness').
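
As a rough illustration of the timestamp check, here is a minimal Python
sketch. It assumes the log is visible as a local path (for example over an
NFS mount) and uses st_mtime, the modification time, since that is the
timestamp that moves when the log is appended to. The function name
check_file_age and the 1800 second default (the half hour mentioned
earlier in the thread) are illustrative choices; the return codes follow
the usual Nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN).

```python
import os
import time

# Nagios plugin exit codes (standard plugin convention).
OK, CRITICAL, UNKNOWN = 0, 2, 3


def check_file_age(path, max_age_seconds=1800, now=None):
    """Return (exit_code, message) based on the file's modification time.

    check_file_age is a hypothetical helper, not part of Nagios.
    """
    now = time.time() if now is None else now
    try:
        mtime = os.stat(path).st_mtime
    except OSError as exc:
        # Missing file, stale mount, permissions, and so on.
        return UNKNOWN, f"UNKNOWN - cannot stat {path}: {exc}"
    age = now - mtime
    if age > max_age_seconds:
        return CRITICAL, f"CRITICAL - {path} unchanged for {age:.0f}s"
    return OK, f"OK - {path} modified {age:.0f}s ago"
```

A plugin wrapper would print the message and exit with the returned code.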

Another alternative is event correlation.

Event correlation lets you detect that no message matching a pattern
(given in a rule describing part of that event) has been written to
that file within a given time.

Sec tails the file and matches each set of log lines (it can read more
than one at a time) against a pattern, and for matches performs some 
actions that include generating a Nagios PROCESS_SERVICE_CHECK_RESULT 
and starting an event correlation operation.
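
To make the tail-and-match idea concrete, here is a small Python sketch.
It is not how Sec is implemented; it only illustrates polling a file for
newly appended lines and testing each against a regular expression. The
function name read_new_matches is hypothetical, and log rotation or
truncation is deliberately not handled.

```python
import re


def read_new_matches(path, offset, pattern):
    """Return (matching_lines, new_offset) for lines appended since offset.

    Hypothetical helper: reads in binary mode so byte offsets stay exact.
    """
    regex = re.compile(pattern)
    matches = []
    with open(path, "rb") as handle:
        handle.seek(offset)
        for raw in handle:
            if not raw.endswith(b"\n"):
                break  # partial line; pick it up on the next poll
            offset += len(raw)
            line = raw.decode("utf-8", errors="replace").rstrip("\n")
            if regex.search(line):
                matches.append(line)
    return matches, offset
```

Calling it repeatedly with the offset returned by the previous call gives
a simple polling tail.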

(You will have to set up a passive service corresponding to the
CruiseControl service and let the check be done by the process appending
the PROCESS_SERVICE_CHECK_RESULT to the Nagios command queue. Making the
check 'volatile' may be useful in this case also.)
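
For reference, a passive result is a single line in the Nagios external
command format: a bracketed Unix timestamp followed by
PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output. The Python
sketch below builds and submits such a line. The function names, the host
name "buildhost", and the service name "CruiseControl" are assumptions
for illustration; the command file location depends on your installation.

```python
import time


def format_passive_result(host, service, return_code, output, timestamp=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    ts = int(time.time() if timestamp is None else timestamp)
    return (
        f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}\n"
    )


def submit_passive_result(command_file, line):
    """Write the command line to the Nagios external command file.

    command_file is normally a named pipe created by Nagios; its path
    is installation specific and is an assumption of the caller.
    """
    with open(command_file, "w") as pipe:
        pipe.write(line)


# Hypothetical host and service names, return code 2 (CRITICAL):
# format_passive_result("buildhost", "CruiseControl", 2, "no log activity")
```

The host and service names in the line must match the passive service
definition exactly, or Nagios will not associate the result with it.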

In this case, the event correlation operation generates the Nagios
PROCESS_SERVICE_CHECK_RESULT if no matching messages arrive within your
given time window.
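
The "did not happen" logic can be sketched in a few lines of Python.
This is only an illustration of the idea, not Sec's rule language or its
implementation; the class name AbsenceDetector and its methods are
hypothetical. It remembers when a matching message was last seen and
reports when the gap exceeds the window.

```python
class AbsenceDetector:
    """Report when no matching event has been seen within window_seconds."""

    def __init__(self, window_seconds, start_time):
        self.window_seconds = window_seconds
        # Treat the start as the last sighting so the first window
        # is measured from when monitoring began.
        self.last_seen = start_time

    def record_match(self, when):
        """Call whenever a log line matches the pattern."""
        self.last_seen = when

    def is_overdue(self, now):
        """True if the window has elapsed with no matching message."""
        return now - self.last_seen > self.window_seconds
```

A polling loop would call record_match for each matching line and, when
is_overdue returns true, submit a CRITICAL passive result.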

Setting up Sec in this case would require:

1 exporting the CruiseControl log to the Sec/Nag host

(use NFS if possible; I have seen SMBFS implementations that don't
properly replicate local file system semantics)

2 running Sec in tail mode on the exported file

3 setting up a Sec rule to monitor the file

You have to do a bit of setup, and the Sec learning curve is a bit
steep (you probably want to start with James Brown's tutorial 'Getting
started with Sec'), but this is a piece of infrastructure that has
numerous applications.

Alternatives to Sec include

RuleCore

LogSurfer

Swatch

> 
> -- 
> 
> -sg
> 



-- 
------------------------------------------------------------------------
Stanley Hopcroft
------------------------------------------------------------------------

'...No man is an island, entire of itself; every man is a piece of the
continent, a part of the main. If a clod be washed away by the sea,
Europe is the less, as well as if a promontory were, as well as if a
manor of thy friend's or of thine own were. Any man's death diminishes
me, because I am involved in mankind; and therefore never send to know
for whom the bell tolls; it tolls for thee...'

from Meditation 17, J Donne.



