[Nagiosplug-devel] [ nagiosplug-Bugs-1993363 ] check_procs times out on Solaris 10

SourceForge.net noreply at sourceforge.net
Mon Sep 8 21:52:03 CEST 2008


Bugs item #1993363, was opened at 2008-06-13 21:58
Message generated for change (Comment added) made by tonvoon
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=397597&aid=1993363&group_id=29880

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General plugin execution
Group: snapshot tarball
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: maemigh (maemigh)
Assigned to: Ton Voon (tonvoon)
Summary: check_procs times out on Solaris 10

Initial Comment:
I'm having problems with the latest snapshot of check_procs timing out.  The version is v1991 (nagios-plugins 1.4.12). This is using pst3 included with the plugins.

./check_procs
CRITICAL - Plugin timed out after 10 seconds

./check_procs -w 2:2 -c 2:2 -C nagios
CRITICAL - Plugin timed out after 10 seconds

I've tried SPARC and x86 builds; both time out:
Solaris 5.10 Generic_118833-36 sparc
Solaris Generic_127112-11 i386



----------------------------------------------------------------------

>Comment By: Ton Voon (tonvoon)
Date: 2008-09-08 20:52

Message:
maemigh,

Can you try this with the latest snapshot? pst3 has been updated to avoid
using /dev/kmem, so it should now work in zones.

Ton

----------------------------------------------------------------------

Comment By: maemigh (maemigh)
Date: 2008-06-20 15:40

Message:
Logged In: YES 
user_id=1520524
Originator: YES

SVN from 6-19 is causing a segfault on one of our Solaris 9 servers:
Here is some output from truss:
stat("/usr/platform/SUNW,Sun-Fire-V240/lib/sparcv9/libkvm_psr.so.1",
0xFFFFFFFF7FFFE8D0) Err#2 ENOENT
brk(0x100102520)                                = 0
brk(0x100106520)                                = 0
stat("/dev/kmem", 0xFFFFFFFF7FFFF2D0)           = 0
stat("/dev/mem", 0xFFFFFFFF7FFFF250)            = 0
stat("/dev/kmem", 0xFFFFFFFF7FFFF1D0)           = 0
stat("/dev/allkmem", 0xFFFFFFFF7FFFF150)        = 0
open("/dev/kmem", O_RDONLY)                     = 3
open("/dev/mem", O_RDONLY)                      = 4
open("/dev/ksyms", O_RDONLY)                    = 5
read(5, "7F E L F020201\0\0\0\0\0".., 16)       = 16
lseek(5, 0, SEEK_SET)                           = 0
lseek(5, 0, SEEK_END)                           = 895494
mmap(0x00000000, 895494, PROT_READ, MAP_PRIVATE, 5, 0) =
0xFFFFFFFF7E300000
munmap(0xFFFFFFFF7E300000, 895494)              = 0
close(5)                                        = 0
pread(3, "\0\0030E92 " u p", 8, 0x0142C300)     = 8
pread(3, "\0\0030E92 " u p", 8, 0x0142C300)     = 8
ioctl(1, TCGETA, 0xFFFFFFFF7FFFE14C)            = 0
fstat(1, 0xFFFFFFFF7FFFE0E0)                    = 0
S   UID   PID  PPID    VSZ    RSS %CPU COMMAND ARGS
write(1, " S       U I D       P I".., 52)      = 52
pread(3, "\0\003\005F7 5 0\0\003\0".., 2584, 0x30E92227570) = 2584
pread(3, "\0\0 Z ?", 4, 0x30003070C4C)          = 4
pread(3, "\0\0\0 R\0\0 Z ?\0\0\0\0".., 32, 0x30003070C48) = 32
open("/proc/23103/as", O_RDONLY)                = 5
pread(5, "FFFFFFFF7FFFFD\0\0\0\0\0".., 1240, 0xFFFFFFFF7FFFFB28) = 1240
close(5)                                        = 0
open("/proc/23103/psinfo", O_RDONLY)            = 5
read(5, "\b\0C4 H\0\0\001\0\0 Z ?".., 416)      = 416
close(5)                                        = 0
O     0 23103 23102   2096   1160  0.1 pst3  ./pst3
write(1, " O           0   2 3 1 0".., 52)      = 52
pread(3, "\0\003\01CE6D6 8\0\003\0".., 2584, 0x3000513A120) = 2584
pread(3, "\0\0 Z >", 4, 0x3000316EA84)          = 4
pread(3, "\0\0\0 u\0\0 Z >\0\0\0\0".., 32, 0x3000316EA80) = 32
open("/proc/23102/as", O_RDONLY)                = 5
pread(5, "FFFFFFFF7FFFFCE8FFFFFFFF".., 1272, 0xFFFFFFFF7FFFFB08) = 1272
close(5)                                        = 0
open("/proc/23102/psinfo", O_RDONLY)            = 5
read(5, "\b02 @\b\0\0\002\0\0 Z >".., 416)      = 416
close(5)                                        = 0
S     0 23102 22954   2800   1784  0.1 truss  truss ./pst3
write(1, " S           0   2 3 1 0".., 59)      = 59
pread(3, "\0\003\001 -A8 p\0\003\0".., 2584, 0x3000253B450) = 2584
pread(3, "\0\0 YAA", 4, 0x30002D0DF6C)          = 4
pread(3, "\0\0\019\0\0 YAA\0\0030E".., 32, 0x30002D0DF68) = 32
open("/proc/22954/as", O_RDONLY)                = 5
pread(5, "FFBFFF p\0\0\0\0FFBFFF t".., 188, 0xFFBFFF44) = 188
close(5)                                        = 0
brk(0x100106520)                                = 0
brk(0x10091E520)                                = 0
    Incurred fault #6, FLTBOUNDS  %pc = 0xFFFFFFFF7F400648
      siginfo: SIGSEGV SEGV_MAPERR addr=0x101131598
    Received signal #11, SIGSEGV [default]
      siginfo: SIGSEGV SEGV_MAPERR addr=0x101131598


----------------------------------------------------------------------

Comment By: maemigh (maemigh)
Date: 2008-06-19 19:31

Message:
Logged In: YES 
user_id=1520524
Originator: YES

PS_COMMAND was set, but it was set to a location that didn't exist.  I was
running the check_procs command without first doing a make install.  I'm
thinking that the plugin should probably report that
/usr/local/nagios/libexec/pst3 doesn't exist rather than timing out after
10 seconds.   I ran into another problem with pst3 in that it will not run
inside a Solaris zone (as /dev/kmem does not exist) -- are there plans to
make changes to allow for use in zones?
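
A minimal sketch of the kind of pre-flight check suggested above, so that a
missing helper is reported by name instead of surfacing as a 10 second
timeout. This is not code from nagios-plugins: check_helper_exists is a
hypothetical function, and it assumes the configured command is a bare
executable path with no arguments. STATE_UNKNOWN = 3 follows the usual
Nagios plugin return code convention.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define STATE_OK 0
#define STATE_UNKNOWN 3

/* Hypothetical helper (assumption, not part of nagios-plugins):
 * verify that `path` can be executed before trying to spawn it.
 * Writes a one-line status message into msg and returns a plugin state. */
int check_helper_exists(const char *path, char *msg, size_t msglen)
{
    if (access(path, X_OK) == 0) {
        snprintf(msg, msglen, "OK - %s is executable", path);
        return STATE_OK;
    }

    /* Save errno first: later library calls are allowed to change it. */
    int saved_errno = errno;
    snprintf(msg, msglen, "UNKNOWN - cannot execute %s: %s",
             path, strerror(saved_errno));
    return STATE_UNKNOWN;
}
```

Called with /usr/local/nagios/libexec/pst3 before the helper is spawned, a
check like this would let the plugin name the missing file in its output. It
is only a convenience check: the file can still vanish between access() and
the exec, so the exec result needs handling as well.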

----------------------------------------------------------------------

Comment By: Ton Voon (tonvoon)
Date: 2008-06-17 10:23

Message:
Logged In: YES 
user_id=664364
Originator: NO

Sorry, misread your report. So are you saying that PS_COMMAND is not set?

----------------------------------------------------------------------

Comment By: Ton Voon (tonvoon)
Date: 2008-06-17 10:22

Message:
Logged In: YES 
user_id=664364
Originator: NO

Hi maemigh,

The issue is that pst3 times out because it is taking too long to query.
We've found this on a master host with multiple zones. Please try the
snapshot at http://nagiosplug.sf.net/snapshot as pst3 has been optimised to
make fewer kvm calls.

It would be useful if you could give us timings before and after the
snapshot.

Beware, we've recently found an issue where pst3 can coredump if a process
disappears while pst3 is trying to access it - a fix is due soon.

Ton

----------------------------------------------------------------------

Comment By: maemigh (maemigh)
Date: 2008-06-16 21:32

Message:
Logged In: YES 
user_id=1520524
Originator: YES

Had time to do a little more digging.  This happens if the file pointed to
by PS_COMMAND does not exist.  There do not appear to be any checks within
spopen to handle a return code from execve in the event of an error.
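
One common way to let a parent process learn that execve failed is a
close-on-exec pipe: the child writes errno to the pipe only if execve
returns. The sketch below illustrates that general technique under stated
assumptions. It is not the spopen code from nagios-plugins, and
spawn_report_exec_error is a hypothetical name.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (assumption, not part of nagios-plugins):
 * fork and execve `path`, wait for the child, and return 0 if the exec
 * succeeded or the errno value reported by the failed fork/execve. */
int spawn_report_exec_error(const char *path, char *const argv[],
                            char *const envp[])
{
    int fds[2];
    if (pipe(fds) == -1)
        return errno;

    /* The write end closes automatically when execve succeeds. */
    if (fcntl(fds[1], F_SETFD, FD_CLOEXEC) == -1) {
        int e = errno;
        close(fds[0]);
        close(fds[1]);
        return e;
    }

    pid_t pid = fork();
    if (pid == -1) {
        int e = errno;
        close(fds[0]);
        close(fds[1]);
        return e;
    }

    if (pid == 0) {
        /* Child: execve only returns on failure. */
        close(fds[0]);
        execve(path, argv, envp);
        int err = errno;
        ssize_t ignored = write(fds[1], &err, sizeof err);
        (void)ignored;
        _exit(127);
    }

    /* Parent: end-of-file means the exec worked; data means it failed. */
    close(fds[1]);
    int child_errno = 0;
    ssize_t n;
    do {
        n = read(fds[0], &child_errno, sizeof child_errno);
    } while (n == -1 && errno == EINTR);
    close(fds[0]);

    int status;
    while (waitpid(pid, &status, 0) == -1 && errno == EINTR)
        ;

    return n == (ssize_t)sizeof child_errno ? child_errno : 0;
}
```

With something like this, a PS_COMMAND that points at a missing file would
come back as ENOENT straight away, and the caller could print the path and
strerror() text rather than waiting for the plugin timeout.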

----------------------------------------------------------------------
