[Nagiosplug-devel] [ nagiosplug-Bugs-1864225 ] check_procs under SunOS 5.10 broken

SourceForge.net noreply at sourceforge.net
Wed Jan 9 17:57:58 CET 2008


Bugs item #1864225, was opened at 2008-01-04 22:37
Message generated for change (Comment added) made by valloo99
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=397597&aid=1864225&group_id=29880

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Compilation
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: gerhard lausser (lausser)
Assigned to: Nobody/Anonymous (nobody)
Summary: check_procs under SunOS 5.10 broken

Initial Comment:
Hi,

I just compiled the 1.4.11 plugins on a 
$ uname -a
SunOS spnb51 5.10 Generic_118833-24 sun4v sparc SUNW,Sun-Fire-T200

and found that check_procs does not work correctly.

$ ps -ef | grep cron
    root 15100     1   0   Nov 08 ?           1:06 /usr/sbin/cron

$ /usr/ucb/ps -alxwwn | grep cron
 0     0 15100     1  0  59 20 2784 1048 60015774482 S ?         1:06 /usr/sbin/cron

Cron is running, but....

check_procs -w :10 -c 1: -C cron
PROCS CRITICAL: 0 processes with command name '/usr/sbin/cron'

Running check_procs with -vvv shows that it uses 
/usr/ucb/ps -alxwwn 
and actually parses the right line
proc#=0 uid=0 vsz=2784 rss=1048 pid=15100 ppid=1 pcpu=0.00 stat=S etime= prog= args=/usr/sbin/cron

but as you can see, prog is empty, and what should be prog is treated as the argument list.

Please change the configure.in (the buggy lines in configure are 24967 ff) so that it reads:

ac_cv_ps_varlist="&procuid,&procpid,&procppid,&procpcpu,&procvsz,&procrss,procstat,procprog,&pos"
ac_cv_ps_command="/usr/ucb/ps -alxwwn"
ac_cv_ps_format="%*s %d %d %d %d %*d %*d %d %d%*[ 0123456789abcdef]%[OSRZT]%*s %*s %s %n"
ac_cv_ps_cols=9

(add procprog to the varlist, add a %s at the end of format and increase cols)

Then it looks good:
$ check_procs -w :10 -c 1: -C cron
PROCS OK: 1 process with command name 'cron'


Greetings from Munich,
Gerhard

----------------------------------------------------------------------

Comment By: Vincent Alloo (valloo99)
Date: 2008-01-09 17:57

Message:
Logged In: YES 
user_id=1977333
Originator: NO

For info, this config works on both Solaris 8 and Solaris 10:

./configure --with-ps-command="/usr/bin/ps -eo 's uid pid ppid vsz rss pcpu etime comm args'" \
            --with-ps-format='%s %d %d %d %d %d %f %s %s %n' \
            --with-ps-cols=10 \
            --with-ps-varlist='procstat,&procuid,&procpid,&procppid,&procvsz,&procrss,&procpcpu,procetime,procprog,&pos'

Hope it helps.

----------------------------------------------------------------------

Comment By: gerhard lausser (lausser)
Date: 2008-01-09 16:47

Message:
Logged In: YES 
user_id=613416
Originator: YES

You are right, when SZ and/or RSS grow too large, there is no space left
between NI and SZ or between SZ and RSS:
Not parseable:  0     0 16901     1  0  59 205028812184 30004e2e1a6 S ?       278:36 /opt/VRTSsmf/bin/vxsmf.bin -p RootSMF -B

I replaced
ac_cv_ps_format="%*s %d %d %d %d %*d %*d %d %d%*[......
with
ac_cv_ps_format="%*s %d %d %d %d %*d %*2d %5d %d%*[....
and it successfully scanned these lines. (Nice can have a maximum of 2
digits, and size can have a maximum of 5 digits, at least as long as the
values still fit into their columns. Maybe there are situations where even
the columns don't fit any more; then I think we have no chance.)

But then i found another one:
Not parseable:  0     0 16101   351  0  54 20 3104 2664        ? S ?       0:00 /usr/openv/netbackup/bin/bpcd

Here we have a WCHAN value of "?". Adding a ? to the conversion should
solve this. So I propose to change line 521 of configure.in to:
 
ac_cv_ps_format="%*s %d %d %d %d %*d %*2d %5d %d%*[ 0123456789abcdef?]%[OSRZT]%*s %*s %s %n"

Gerhard

----------------------------------------------------------------------

Comment By: Vincent Alloo (valloo99)
Date: 2008-01-09 15:22

Message:
Logged In: YES 
user_id=1977333
Originator: NO

Be careful, the "/usr/ucb/ps -alxwwn" output can be unparseable on some
systems:

 0 18046 20095 20093  0  59 20 3056 1996 fffffe8d3efb8776 S pts/9     0:00 -tcsh

 0     0   911  3293  0  59 20837008 1188 ffffffff85c0af42 S ?        0:00 /usr/local/bin/rsync -aWz -v --stats --progress --rsync-path=/apps/free/rsync/2

I still don't have a working check_procs for sol10/x86.


----------------------------------------------------------------------

Comment By: gerhard lausser (lausser)
Date: 2008-01-05 02:01

Message:
Logged In: YES 
user_id=613416
Originator: YES

Hi Matthias,
the last release i compiled on this machine was 1.4.2 and at that time
configure decided to use /usr/bin/ps. I can try a 1.4.10 tomorrow but i am
sure, it is also broken.
I currently look at configure.in line 519. Particularly noticeable is the
missing procprog variable in the ac_cv_ps_format. 
As one can see in the egrep line above, there _is_ a COMMAND field in the
output, but the sscanf will not catch it.

Gerhard

p.s. this ps_format is not used for SunOS in general, only for SunOS whose
/usr/ucb/ps -alxwwn shows the expected output format.

----------------------------------------------------------------------

Comment By: Matthias Eble (psychotrahe)
Date: 2008-01-05 01:13

Message:
Logged In: YES 
user_id=1694341
Originator: NO

Hi Gerhard,

Are you sure this is valid for all operating systems where uname -s
returns SunOS?
I guess this (mis-)behaviour is not new in 1.4.11, right?

Matthias

----------------------------------------------------------------------
