[Nagiosplug-help] Inexplicable pattern match failure of check_http since update

Thomas Guyot-Sionnest dermoth at aei.ca
Wed Apr 1 13:14:49 CEST 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 01/04/09 05:43 AM, Ralph.Grothe at itdz-berlin.de wrote:
> Hello Nagios Plug-in List Members,
> 
> I recently updated my Nagios server from release 2.9 to 3.0.6, along with the latest stable release of the Nagios Plug-ins.
> 
> After a day of only minor adaptations to my host/service/check command definitions, the vast majority of them are now running fine,
> and it was a rather seamless update.
> 
> Yet I still encounter an issue, inexplicable to me, with one of my checks of a Tomcat server.
> 
> When the check is run by the Nagios scheduler, I always get a HARD CRITICAL error owing to a pattern mismatch
> of my glob/regex, as defined by the check command and service definition according to what I expect to receive as the HTTP response
> from the checked Tomcat Manager container.
> 
> [nagios at nagsaz:~]
> $ grep aDISWeb\ Tomcat /var/log/nagios/nagios.log 
> [1238536800] CURRENT SERVICE STATE: uranus;aDISWeb Tomcat;CRITICAL;HARD;5;HTTP CRITICAL - pattern not found
> [nagios at nagsaz:~]
> $ perl -le 'print scalar localtime 1238536800'
> Wed Apr  1 00:00:00 2009
> 
> Whereas when I run the check from the shell as user nagios on my Nagios server, with exactly the same set of arguments and macros
> as passed to the service, all is OK.
> 
> [nagios at nagsaz:~]
> $ /opt/nagios/plugins/libexec/check_http -H uranus -p 8081 -u /manager/sessions?path=/aDISWeb -a bogususer:boguspasswd -l -r '^OK.*:\s*[0-9]*\sSitzungen'
> HTTP OK HTTP/1.1 200 OK - 0,007 second response time |time=0,007391s;;;0,000000 size=398B;;;0
> 
> 
> As said, the arguments passed to the above manual check are (apart from the obfuscated credentials here) exactly the same as in the 
> definitions of the service and the check command.
> 
> Another thing that changed together with my Nagios and Plug-ins update for this service was that I was forced to set the global i18n on the host
> that runs this service to de_DE.utf-8 (from en_US.iso885915 before).
> The effect that it had on this check was that the HTTP response now contains German text in the HTML markup, so the only difference I had to cater for
> was to replace "sessions" by "Sitzungen" within my pattern.

Are you sure the proper environment is set when Nagios runs the check? When
you run the plugin by hand, try running it with "env -i".
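
As a minimal sketch of the "env -i" suggestion: the helper below
(run_with_locale is a hypothetical name, not part of Nagios or the plugins)
starts a command with an empty environment and sets only LC_ALL, so the same
plugin call can be compared under different locales. Which environment Nagios
actually hands to the plugin is an open question here and has to be checked
separately.

```shell
#!/bin/sh
# Hypothetical helper: run a command with an empty environment,
# setting only LC_ALL to the value given as the first argument.
run_with_locale() {
    locale_name=$1
    shift
    env -i LC_ALL="$locale_name" "$@"
}

# Demonstration with a harmless command: a variable exported in this
# shell is not visible to the child started through env -i.
DEMO_MARKER=visible
export DEMO_MARKER
run_with_locale C /bin/sh -c 'echo "DEMO_MARKER=${DEMO_MARKER:-unset} LC_ALL=$LC_ALL"'
```

For the check in question, the call would be run_with_locale followed by a
locale name, /opt/nagios/plugins/libexec/check_http, and the same arguments as
in the manual run quoted above.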

Other things you may try:
- Try running the plugin with -v in Nagios; it should capture the plugin
execution details (if not, redirect the output to a file).
- Try adding "echo" in Nagios before the plugin; it will print the exact
command it's running (without the quotes, though). If Nagios strips some
characters, you can redirect that to a file too.
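
A rough sketch of the "echo" idea that keeps argument boundaries visible: the
function below (log_and_run is a hypothetical name) appends every argument it
receives to a log file, one bracketed line per argument, and then runs the
command unchanged. The demo uses printf as a stand-in for the plugin; how the
wrapper would be wired into a command definition is left open.

```shell
#!/bin/sh
# Hypothetical helper: record each argument on its own bracketed line,
# then execute the command exactly as it was passed in.
log_and_run() {
    log_file=$1
    shift
    {
        echo "--- argv ---"
        for arg in "$@"; do
            printf '[%s]\n' "$arg"
        done
    } >>"$log_file"
    "$@"
}

# Demonstration with a stand-in command instead of check_http.
demo_log=$(mktemp)
log_and_run "$demo_log" printf '%s\n' 'two words' '^OK.*:\s*[0-9]*'
cat "$demo_log"
rm -f "$demo_log"
```

If the pattern arrives split into several bracketed lines, or with characters
missing, the quoting in the command definition is the first place to look.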

> The version of the used check_http plug-in is:
> 
> [nagios at nagsaz:~]
> $ /opt/nagios/plugins/libexec/check_http -V
> check_http v2053 (nagios-plugins 1.4.13)
> 
> 
> Beyond this I noticed that somehow external commands related to this service check don't seem to be executed by the scheduler.
> 
> 
> [nagios at nagsaz:~]
> $ printf "[%lu] SCHEDULE_SVC_CHECK;uranus;aDISWeb Tomcat;%lu\n" $(date +%s) $(($(date +%s)+120))>/opt/nagios/var/rw/nagios.cmd 
> 
> [nagios at nagsaz:~]
> $ grep aDISWeb\ Tomcat /var/log/nagios/nagios.log 
> [1238536800] CURRENT SERVICE STATE: uranus;aDISWeb Tomcat;CRITICAL;HARD;5;HTTP CRITICAL - pattern not found
> [1238578049] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;uranus;aDISWeb Tomcat;1238578169
> [nagios at nagsaz:~]
> $ perl -le 'print scalar localtime 1238578169'
> Wed Apr  1 11:29:29 2009
> [nagios at nagsaz:~]
> $ date
> Wed Apr  1 11:27:57 CEST 2009
> 
> 
> ...two minutes later (or even hours later, makes no difference)
> 
> [nagios at nagsaz:~]
> $ date
> Wed Apr  1 11:30:39 CEST 2009
> [nagios at nagsaz:~]
> $ grep aDISWeb\ Tomcat /var/log/nagios/nagios.log 
> [1238536800] CURRENT SERVICE STATE: uranus;aDISWeb Tomcat;CRITICAL;HARD;5;HTTP CRITICAL - pattern not found
> [1238578049] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;uranus;aDISWeb Tomcat;1238578169
> 
> 
> It makes no difference whether I reschedule the check via the web interface or pipe it into the FIFO as done above.
> 
> So why doesn't an externally forced rescheduling of the check get executed?
> Is this due to some clever scheduling algorithm that I have missed so far, or to some neglected config setting?

It probably does get executed, but unless you enable state stalking, Nagios
only logs state changes.

The last run time is in the status.dat file, and can be seen from the
web interface.
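
For reading that value from the shell, here is a small sketch. It assumes the
usual Nagios 3.x layout of status.dat, that is "servicestatus {" blocks
containing key=value lines such as host_name, service_description and
last_check; last_check_of is a hypothetical helper name, and the location of
status.dat depends on the status_file setting in nagios.cfg.

```shell
#!/bin/sh
# Hypothetical helper: print last_check (epoch seconds) for one service.
# Assumes "servicestatus {" ... "}" blocks with key=value lines.
last_check_of() {
    status_file=$1
    host=$2
    service=$3
    awk -v host="$host" -v svc="$service" '
        # Start of a service block: reset the collected fields.
        $1 == "servicestatus" && $2 == "{" { in_block = 1; h = s = t = ""; next }
        # End of a block: print the timestamp if host and service match.
        in_block && $1 == "}" {
            if (h == host && s == svc && t != "") print t
            in_block = 0
            next
        }
        # Inside a block: split each line at the first "=".
        in_block {
            line = $0
            sub(/^[ \t]+/, "", line)
            eq = index(line, "=")
            if (eq == 0) next
            key = substr(line, 1, eq - 1)
            val = substr(line, eq + 1)
            if (key == "host_name") h = val
            else if (key == "service_description") s = val
            else if (key == "last_check") t = val
        }
    ' "$status_file"
}
```

Called with the path to status.dat, the host name "uranus" and the service
description "aDISWeb Tomcat", it prints an epoch value that can be converted
with the same perl one-liner used in the original post.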


- --
Thomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFJ00yp6dZ+Kt5BchYRAp9SAJ0bsV2TXLqrgbHn/kZjpyiyBycvbQCeLrhy
8nY3SAvGkq4SuVdz+mgCTts=
=MXki
-----END PGP SIGNATURE-----



