[Nagiosplug-help] check_by_ssh

Earl C. Ruby III earl at switchmanagement.com
Mon Jun 30 13:11:08 CEST 2003


I need to check several servers that do not (and will not) have inetd or 
xinetd installed, and where I don't want to run nrpe as a stand-alone daemon 
for security reasons.

Since we do have ssh, I've set up a limited-rights account on these servers 
that I allow nagios to ssh to. Using check_by_ssh, I can shell to the remote 
host and run any of the check commands I need to:

define command{
        command_name    check_zombie_procs
        command_line    $USER1$/check_by_ssh -t 15 -H username@$HOSTADDRESS$ -C '~/bin/check_procs -w 5 -c 10 -s Z'
}

define command{
        command_name    check_total_procs
        command_line    $USER1$/check_by_ssh -t 15 -H username@$HOSTADDRESS$ -C '~/bin/check_procs -w 150 -c 200'
}

Note that both of the above examples use check_procs within check_by_ssh.

This works great on almost every machine I run it on. The exception is three 
servers, where I often (as in 4 out of 5 tries) get an "unknown" response for 
check_zombie_procs and check_total_procs, even though all other check_by_ssh 
commands work just fine on them. I also get "unknown" errors on the other 
servers I use these commands on, but not as often as on my three "problem" 
servers.

When I run the same command by hand (log in on the Nagios server, su - nagios, 
type the check_by_ssh command on the command line), I get an instant OK 
response back from the remote server. It's only when Nagios runs the plugins 
that I get "unknown" responses, and only when I'm executing check_procs 
inside of check_by_ssh. None of the other check_??? plugins have this problem 
when run inside of check_by_ssh.

I've checked /var/log/messages, and it shows ssh connecting with no errors. 
The "Status Information" column in Nagios consistently says something like 
"OK - 110 processes running" even though the status is "Unknown".

Has anyone had a similar experience? I'm about to stop using check_procs 
altogether unless I can figure out why I keep getting "Unknown" errors.
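
Since Nagios takes the service state from the plugin's exit code rather than 
from the text it prints, one thing that may be worth comparing is the exit 
code against the output. Below is a minimal sketch, assuming a POSIX shell. 
The helper names state_name and run_and_report are hypothetical (they are not 
part of the Nagios plugins), and the last line is only a stand-in for a real 
check that prints an OK message but exits with 3.

```shell
#!/bin/sh
# Map a plugin exit code to the state name Nagios assigns to it.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "OUT-OF-RANGE" ;;
    esac
}

# Hypothetical helper: run a command, capture its output and exit code,
# and print both on one line so they can be compared.
run_and_report() {
    output=$("$@" 2>&1)
    code=$?
    echo "exit=$code state=$(state_name "$code") output=$output"
}

# Stand-in for a real check: prints an OK message but exits with 3.
run_and_report sh -c 'echo "OK - 110 processes running"; exit 3'
```

To use it against a real host, the stand-in on the last line would be replaced 
by the full check_by_ssh command from the definitions above, run as the nagios 
user. If the reported exit code is not 0 while the text says OK, the mismatch 
would be between check_procs' output and the exit status that reaches Nagios.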




