[Nagiosplug-help] check_nagios -C problem

Franz, Jay Jay.Franz at ssa.gov
Wed Jun 27 01:01:10 CEST 2012


As a wise admin once said, necessity is both a mother and motivator.  The plug-in fails to identify the Nagios command by its full pathname because of the options it passes to the 'ps' command (revealed by running the plug-in in "verbose" mode).  The 'ps' command and options appear to be hard-coded in the 'configure' script for AIX and HP-UX.  Specifically, the resulting 'ps' options, '-el', do not display the full pathname of the executing process.  Instead, they show only the command name itself.

For example:

# /usr/bin/ps -el | egrep "[n]agios"
2401 R        108 11743     1  0 152 20 e000000174822280  535                - ?           00:00 nagios

Versus:

# /usr/bin/ps -ef | egrep "[n]agios"
nagios   11743     1  0 18:11:24 ?           00:00 /opt/iexpress/nagios/bin/nagios -d /opt/iexpress/nagios/etc/nagios.cfg

As a result, passing any part of the pathname to the plug-in will generate a CRITICAL result.
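
To make the difference concrete, here is a minimal sketch that runs the two sample lines above through a plain substring match.  It assumes the plug-in does something roughly equivalent to a fixed-string search on each line of 'ps' output, which is a simplification and not a claim about its actual source.  The 'count_matches' helper is hypothetical.

```shell
#!/bin/sh
# Sample lines copied from the 'ps' output shown above.
PS_EL='2401 R        108 11743     1  0 152 20 e000000174822280  535                - ?           00:00 nagios'
PS_EF='nagios   11743     1  0 18:11:24 ?           00:00 /opt/iexpress/nagios/bin/nagios -d /opt/iexpress/nagios/etc/nagios.cfg'

# count_matches PATTERN TEXT
# Prints the number of lines in TEXT that contain PATTERN as a fixed string.
# Hypothetical helper for illustration; not part of the plug-in.
count_matches() {
    printf '%s\n' "$2" | grep -F -c -- "$1"
}

FULL=/opt/iexpress/nagios/bin/nagios

echo "full path vs ps -el: $(count_matches "$FULL" "$PS_EL")"
echo "full path vs ps -ef: $(count_matches "$FULL" "$PS_EF")"
echo "basename  vs ps -el: $(count_matches nagios "$PS_EL")"
```

Under that assumption, only the bare command name can ever match the '-el' format, which is consistent with the CRITICAL result we saw.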

So, our solution is to pass only the command name, and to rename the plug-in so that it no longer matches itself.  A bit of a kludge, perhaps, but like most kludges, it does the trick.
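
A rough sketch of why the rename helps, using a simulated command-name column in place of real 'ps -el' output with the Nagios daemon stopped.  The new plug-in name 'check_ngd' and the 'self_matches' helper are hypothetical, and the process lists are made up for illustration only.

```shell
#!/bin/sh
# Simulated command names as 'ps -el' might list them while the plug-in
# runs and the Nagios daemon is stopped.  These lists are illustrative,
# and 'check_ngd' is a hypothetical replacement name for the plug-in.
BEFORE_RENAME='sh
check_nagios
ps'
AFTER_RENAME='sh
check_ngd
ps'

# self_matches PATTERN NAMES
# Prints how many command names contain PATTERN as a fixed string.
self_matches() {
    printf '%s\n' "$2" | grep -F -c -- "$1"
}

echo "before rename: $(self_matches nagios "$BEFORE_RENAME")"
echo "after rename:  $(self_matches nagios "$AFTER_RENAME")"
```

Any new name works as long as it does not contain the string passed with '-C'.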

-----Original Message-----
From: Franz, Jay [mailto:Jay.Franz at ssa.gov] 
Sent: Tuesday, June 26, 2012 13:41
To: 'nagiosplug-help at lists.sourceforge.net'
Subject: [Nagiosplug-help] check_nagios -C problem

We are in the process of setting up failover monitoring for our existing Nagios server and are experiencing some problems with the 'check_nagios' plug-in.  Specifically, it does not appear to recognize our full path command string.  Instead, we are only able to make it work by stripping the command path down to its basename (i.e., 'nagios' instead of '/opt/iexpress/nagios/bin/nagios').  Our OS, Nagios core, and plug-in versions follow, as well as the process status output of our Nagios command and the execution results from the 'check_nagios' plug-in.  Any advice would be appreciated.  Thanks.

--------------------

OS:
# uname -sr
HP-UX B.11.23

Nagios Core:
# /opt/iexpress/nagios/bin/nagios -v /opt/iexpress/nagios/etc/nagios.cfg | egrep "Nagios Core"
Nagios Core 3.2.3

Plugin:
# /usr/local/nagios/libexec/check_nagios --version
check_nagios v1.4.15 (nagios-plugins 1.4.15)

--------------------

# ps -ef | egrep "[/]opt/iexpress/nagios/bin/nagios"
nagios    9817     1  0  Jun 22  ?           05:34 /opt/iexpress/nagios/bin/nagios -d /opt/iexpress/nagios/etc/nagios.cfg

# /usr/local/nagios/libexec/check_nagios -e 60 -F /opt/iexpress/nagios/var/nagios.log -C /opt/iexpress/nagios/bin/nagios
NAGIOS CRITICAL: Could not locate a running Nagios process!

# /usr/local/nagios/libexec/check_nagios -e 60 -F /opt/iexpress/nagios/var/nagios.log -C nagios                         
NAGIOS OK: 2 processes, status log updated 1822 seconds ago

While the second invocation works, more or less, it will never return a CRITICAL status because it always matches against itself.  That is, the 'check_nagios' plug-in shows up in the list of processes when it executes.

For example, if we stop the Nagios server, the 'check_nagios' plug-in still returns an OK status:

# /sbin/init.d/nagios stop
Stopping nagios: 
done.

# ps -ef | egrep "[/]opt/iexpress/nagios/bin/nagios"
<NO OUTPUT>

# ps -ef | egrep "[n]agios"
<NO OUTPUT>

# /usr/local/nagios/libexec/check_nagios -e 60 -F /opt/iexpress/nagios/var/nagios.log -C nagios
NAGIOS OK: 1 process, status log updated 15 seconds ago

Even if we reduce the expiration window ('-e') to 1, the result is never worse than a WARNING.

# /usr/local/nagios/libexec/check_nagios -e 60 -F /opt/iexpress/nagios/var/nagios.log -C nagios
NAGIOS OK: 1 process, status log updated 268 seconds ago

# /usr/local/nagios/libexec/check_nagios -e 1 -F /opt/iexpress/nagios/var/nagios.log -C nagios 
NAGIOS WARNING: 1 process, status log updated 272 seconds ago
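
For reference, the 'egrep "[n]agios"' form used in the 'ps' checks above avoids the same self-match problem.  The sketch below illustrates why, using simulated command lines instead of live 'ps -ef' output; the two process lists are assumptions made up for the example.

```shell
#!/bin/sh
# Simulated 'ps -ef' command columns: the daemon plus the egrep process
# itself, once for each way of writing the pattern.
PLAIN_LIST='/opt/iexpress/nagios/bin/nagios -d /opt/iexpress/nagios/etc/nagios.cfg
egrep nagios'
BRACKET_LIST='/opt/iexpress/nagios/bin/nagios -d /opt/iexpress/nagios/etc/nagios.cfg
egrep [n]agios'

# A plain pattern also matches the egrep process's own command line.
plain=$(printf '%s\n' "$PLAIN_LIST" | grep -E -c 'nagios')

# '[n]agios' matches the text "nagios" but not the literal text "[n]agios",
# so the egrep process does not count itself.
bracket=$(printf '%s\n' "$BRACKET_LIST" | grep -E -c '[n]agios')

echo "plain pattern:   $plain"
echo "bracket pattern: $bracket"
```

The plug-in's '-C' argument offers no equivalent trick when it is given a bare command name, which is why it keeps counting itself.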
