[Nagiosplug-help] Broken check_procs or malusage?

Ralph.Grothe at itdz-berlin.de Ralph.Grothe at itdz-berlin.de
Wed Feb 20 10:46:35 CET 2008


Hi Mike,

yes, I did compile nagios-plugins-1.4 from source on HP-UX B.11.11
with the HP ANSI C compiler.

These are the ps-related lines from my config.h:

<code_snippet>

/* Number of columns in ps command */
#define PS_COLS 5

/* Verbatim command to execute for ps in check_procs */
#define PS_COMMAND "/usr/bin/ps -el (AIX 4.1 and HP-UX)"

/* Format string for scanning ps output in check_procs */
#define PS_FORMAT "%*s %s %d %*s %d %*s %*s %*s %*s %*s %*s %*s %*s %n%s"

/* Variable list for sscanf of 'ps' output */
#define PS_VARLIST procstat,&procuid,&procppid,&pos,procprog

</code_snippet>


What puzzles me is that PS_COLS is only 5,
whereas the HP-UX ps -l outputs 14 fields:

# ps -l|head -1
  F S        UID   PID  PPID  C PRI NI             ADDR   SZ WCHAN TTY       TIME COMD

# ps -l|head -1|wc -w
14

On the other hand, PS_FORMAT does have 14 field specifiers (plus a %n),
but PS_VARLIST again has only 5 arguments.

According to the HP-UX 11.11 ps man page, these are the fields that -l shows:

<man_snippet>

           -l             Show columns flags, state, uid, pid, ppid, cpu,
                          intpri, nice, addr, sz, wchan, tty, time, and
                          comm, in that order.
</man_snippet>

Please, see also

http://nixdoc.net/man-pages/HP-UX/ps.1.html


On the other hand, I wonder why the check_procs plugin doesn't call the
pstat_getproc() syscall; see
http://nixdoc.net/man-pages/HP-UX/pstat_getcommandline.2.html
Admittedly, it is much easier for the nagios-plugins developers to parse
ps output than to provide different libc calls for every conceivable
Unix variant.



So, do you think I should manually extend PS_VARLIST to cater for
all 14 entries?

I don't need to recompile all the plugins, just check_procs.c
(at least for now).
So I assume that after deleting its executable and object file,
a mere "make check_procs" would be all that is required, right?

Many thanks for your help.
I hope you don't mind that I Cc'ed the list.


Ralph


> -----Original Message-----
> From: Mike Hamrick [mailto:mikeh at bluegecko.net]
> Sent: Tuesday, February 19, 2008 6:59 PM
> To: Grothe, Ralph
> Subject: Re: [Nagiosplug-help] Broken check_procs or malusage?
> 
> 
> 
> On Feb 19, 2008, at 9:33 AM, <Ralph.Grothe at itdz-berlin.de> wrote:
> > Hm, I wonder how on earth the plugin should deduce the cpu usage
> > from a ps -el alone?
> 
> It's not, which is probably why it's not working ;)
> 
> Did you build the plugins from source?  If so, take a look at the
> config.h that ./configure generated.  Here is what mine on linux
> looks like:
> 
> #define PS_COLS 9
> #define PS_COMMAND "/bin/ps axwo 'stat uid pid ppid vsz rss pcpu comm args'"
> #define PS_FORMAT "%s %d %d %d %d %d %f %s %n"
> #define PS_VARLIST procstat,&procuid,&procpid,&procppid,&procvsz,&procrss,&procpcpu,procprog,&pos
> 
> Those #defines configure how it parses the ps(1) output into
> variables.  You could hand manipulate this header file and rebuild
> check_nagios.
>
> The check_nagios.c code later uses these constants in a sscanf call
> when it parses the output.
> 
> Mike
> >
> >
> >
> > $ uname -srvm && ps -el|awk 'NR==1||$NF=="java"'
> > HP-UX B.11.11 U 9000/800
> >  F S        UID   PID  PPID  C PRI NI             ADDR   SZ WCHAN TTY       TIME COMD
> > 401 R      56199  2710  2709  0 152 20         9e985800 16651 - ?        745:03 java
> > 401 R      52399  5523  5522  0 152 20         9fab8d80 16651 - ?        772:01 java
> > 401 R      54900  5536  5535  0 152 20         8828ad00 17675 - ?        689:51 java
> > 401 R        250  3927  3926  0 152 20         9f1e3580 14603 - ?        316:39 java
> > 401 R      29499  3905  3904  0 152 20         8828abc0 14603 - ?        284:38 java
> > 401 R      58699  5553  5552  0 152 20         88260d00 16651 - ?        716:01 java
> > 401 R      29999  2858  2857  0 152 20         9f5fdd80 15627 - pts/6     5:50 java
> >
> >
> > Comparing the fields from ps of different Unices there at least
> > seems to be some common denominator
> > (maybe some would say thanks to POSIX?)
> >
> > $ uname -srv && ps -el|head -1
> > AIX 3 4
> >       F S      UID   PID  PPID   C PRI NI ADDR    SZ    WCHAN TTY  TIME CMD
> >
> >
> > $ uname -srv && ps -el|head -1
> > Linux 2.6.18-8.el5 #1 SMP Fri Jan 26 14:15:21 EST 2007
> > F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY TIME CMD
> >
> >
> > $ uname -srvm && ps -el|head -1
> > SunOS 5.8 Generic_117350-39 sun4us
> > F S   UID   PID  PPID  C PRI NI     ADDR     SZ    WCHAN TTY TIME CMD
> >
> >
> >
> >> -----Original Message-----
> >> From: Mike Hamrick [mailto:mikeh at bluegecko.net]
> >> Sent: Tuesday, February 19, 2008 6:06 PM
> >> To: Grothe, Ralph
> >> Subject: Re: [Nagiosplug-help] Broken check_procs or malusage?
> >>
> >>
> >> On Feb 19, 2008, at 8:21 AM, <Ralph.Grothe at itdz-berlin.de> wrote:
> >>> I wonder why this simple check for cpu usage percentage of Java
> >>> master threads fails?
> >>
> >> I found that using check_procs with -vvv gives you a bunch of
> >> interesting debugging info, including the ps(1) command line. It
> >> also outputs what it parses out of the results.
> >>
> >> In my case, I found a problem with -m ELAPSED not working due to
> >> ./configure not generating the right command line for ps(1).
> >>
> >>> Is my invocation wrong?
> >>> $ check_procs -w 3 -c 10 -m CPU -C java
> >>> CPU OK: 7 processes with command name 'java'
> >>
> >> Looks right to me.  It works for me:
> >>
> >> [nagios at xenu local]$ /usr/lib/nagios/plugins/check_procs -C nagios -m CPU -w 6 -c 50
> >> CPU WARNING: 1 warn out of 18 processes with command name 'nagios'
> >> [nagios at xenu local]$ ps ax -opcpu,cmd | sort -n | tail -1
> >>  7.3 /home/nagios/bin/nagios -d /home/nagios/etc/nagios.cfg
> >>
> >> Mike
> >>
> >>
> 
> 



