[Nagiosplug-help] RE: [Nagios-users] *** RESOLVED***check_snmp CPU Load strange result

Pascal Wessel pascal.wessel at media-online.ch
Tue Dec 3 23:36:03 CET 2002


Dear Subhendu,

Indeed the new beta2 plugin check_snmp is working as expected :-)

Now I have OK, WARNING and CRITICAL status, depending on the check I
perform.

Thanks a lot for your kind help, time, and advice.
Warm regards,
Pascal Wessel


-----Original Message-----
From: Subhendu Ghosh [mailto:sghosh at sghosh.org] 
Sent: Tuesday, 3 December 2002 17:55
To: Pascal Wessel
Subject: RE: [Nagios-users] check_snmp CPU Load strange result


could you try the plugins from beta2 - a few changes were made to
check_snmp. (it will probably get rewritten for beta3)

-sg


On Tue, 3 Dec 2002, Pascal Wessel wrote:

> You asked: Please post the version of the plugin/os/net-snmp
> 
> Hmm... OK, I'll say it again (I copy-pasted the needed info from my
> first mail; I agree it's at the very end of this thread):
> 
> ---snip---
> > My Nagios system installation is as follows:
> > 
> > System Intel i686, Mandrake 9.0, Kernel 2.4.19-16
> > NAGIOS: Nagios 1.0b6
> > Plugins: nagios-plugins-200211131100
> > Check_snmp: Revision: 1.17
> > SNMP:
> > 	libsnmp0-4.2.3-4mdk
> > 	ucd-snmp-4.2.3-4mdk
> > 	ucd-snmp-utils-4.2.3-4mdk
> ---snip---
> I had this:
> [libexec]# ldd check_snmp
>         libutil.so.1 => /lib/libutil.so.1 (0x40023000)
>         libc.so.6 => /lib/i686/libc.so.6 (0x40026000)
>         /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
> 
> check_snmp treats bare numbers in warning and critical as upper
> bounds, so -w'60,69'  is interpreted as warn on 0-60 for the first and
> 0-69 for the second...
> 
> Very interesting... I didn't catch it at first.
> 
> I made some tests with different values for the warning and/or critical
> thresholds... Very strange, I only get WARNING status back from
> these. To clarify a bit, I ran tests with only one OID (the 1 min
> CPU load), with the same result: WARNING is always returned. By reading
> check_snmp --help I saw that I can use absolute values (like -w
> '0:30') if colons are used.
>   Ranges are inclusive and are indicated with colons. When specified as
>   'min:max' a STATE_OK will be returned if the result is within the
>   indicated range or is equal to the upper or lower bound. A non-OK
>   state will be returned if the result is outside the specified range.
> 
> Then....some tests again....
> Eg (with the 1 min CPU load OID):
> 
> ./check_snmp -v -t 10 -H 192.168.1.1 -o .1.3.6.1.4.1.9.2.1.57.0 -C 
> publicro -w '50:69' -c'70:99'
>              enterprises.9.2.1.57.0 = 4
> 
>              SNMP WARNING - *4*
> That's ok... 4% is not in the range 50:69... Then it's a "non-OK" 
> result, thus the WARNING for the -w switch.
> 
> ./check_snmp -v -t 10 -H 192.168.1.1 -o .1.3.6.1.4.1.9.2.1.57.0 -C 
> publicro -w '1:9' -c'10:99'
>              enterprises.9.2.1.57.0 = 7
> 
>              SNMP WARNING - *7*
> 7% is within the range 1:9... I should have a STATE_OK. 
> GrrrRRRrrrRRRRRR %#*$& !!! Why do I have WARNING ?????
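
Read literally, the --help text quoted above says that each range names
the values that count as OK for that threshold. The Python sketch below
applies that reading to the two runs above. It only illustrates the
documented wording and is not the plugin's actual code; the function names
in_range and evaluate are made up for this example, only the full
'min:max' form is handled, and checking critical before warning is an
assumption.

```python
def in_range(value, spec):
    """Return True if value lies inside the inclusive 'min:max' range."""
    low, high = (float(part) for part in spec.split(":"))
    return low <= value <= high


def evaluate(value, warn, crit):
    """Literal reading of the help text: outside a range means non-OK.

    Assumption: the critical range is checked first, so the most severe
    state wins.
    """
    if not in_range(value, crit):
        return "CRITICAL"
    if not in_range(value, warn):
        return "WARNING"
    return "OK"


# The two runs from this message.
first_run = evaluate(4, "50:69", "70:99")
second_run = evaluate(7, "1:9", "10:99")
print(first_run)
print(second_run)
```

Under this reading both calls print CRITICAL, because neither value lies
inside its -c range. The ranges describe OK values, so -w and -c ranges
that do not overlap can never yield OK.
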
> 
> 
> Well, any help appreciated. Thanks in advance.
> Warm regards,
> Pascal
> 
> 
> 
> 
> -sg
> 
> On Mon, 2 Dec 2002, Pascal Wessel wrote:
> 
> > Nagios gives me a WARNING when checking the CPU load of a Cisco 3640
> > with check_snmp (IOS is C3640-IK9O3S-M, Version 12.2(10a)), but the
> > CPU load is below my warning threshold.
> > 
> > When launched from the command-line with verbose output:
> > 
> > [libexec]# ./check_snmp -v -t 10 -H 192.168.1.1 -o
> > .1.3.6.1.4.1.9.2.1.57.0,.1.3.6.1.4.1.9.2.1.58.0 -C publicro -w 
> > '60,69', -c  '70,80' -l 'CPU usage 1min/5min' -D ' / '
> > /usr/bin/snmpget -m ALL -v 1 -c publicro 192.168.1.1:161
> > .1.3.6.1.4.1.9.2.1.57.0 .1.3.6.1.4.1.9.2.1.58.0
> > enterprises.9.2.1.57.0 = 4
> > enterprises.9.2.1.58.0 = 3
> > 
> > CPU usage 1min/5min WARNING - *4* / *3*
> > 
> > As you can see (and if I understood the syntax), WARNING status
> > should be triggered when the CPU load is between 60 and 69%, and
> > CRITICAL status when the router CPU is between 70 and 80%.
> > 
> > #----
> > My question is: why does this check report WARNING when my router CPU
> > load (4% last minute and 3% last 5 min) is below the WARNING threshold?
> > #----
> > 
> > My Nagios system installation is as follows:
> > 
> > System Intel i686, Mandrake 9.0, Kernel 2.4.19-16
> > NAGIOS: Nagios 1.0b6
> > Plugins: nagios-plugins-200211131100
> > Check_snmp: Revision: 1.17
> > SNMP:
> > 	libsnmp0-4.2.3-4mdk
> > 	ucd-snmp-4.2.3-4mdk
> > 	ucd-snmp-utils-4.2.3-4mdk
> > 
> > Below is a snippet of my cfg files:
> > 
> > #--- hosts.cfg for myrouter
> > 
> > define host {
> > name                           generic-host
> > notifications_enabled          1    ; Host notifications are enabled
> > event_handler_enabled          1    ; Host event handler is enabled
> > flap_detection_enabled         1    ; Flap detection is enabled
> > process_perf_data              1    ; Process performance data
> > retain_status_information      1    ; Retain status information across program restarts
> > retain_nonstatus_information   1    ; Retain non-status information across program restarts
> > max_check_attempts             10
> > register                       0    ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
> > }
> > 
> > define host {
> > use                            generic-host    ; Name of host template to use
> > host_name                      myrouter
> > alias                          Router Gva Coulou -6
> > address                        192.168.1.1
> > check_command                  check-host-alive
> > notification_interval          60
> > notification_period            24x7
> > notification_options           d,u,r
> > }
> > 
> > #--- services.cfg
> > define service {
> > name                           generic-service
> > active_checks_enabled          1    ; Active service checks are enabled
> > passive_checks_enabled         1    ; Passive service checks are enabled/accepted
> > parallelize_check              1    ; Active service checks should be parallelized
> > obsess_over_service            1    ; We should obsess over this service (if necessary)
> > check_freshness                0    ; Default is to NOT check service 'freshness'
> > notifications_enabled          1    ; Service notifications are enabled
> > event_handler_enabled          1    ; Service event handler is enabled
> > flap_detection_enabled         1    ; Flap detection is enabled
> > process_perf_data              1    ; Process performance data
> > retain_status_information      1    ; Retain status information across program restarts
> > retain_nonstatus_information   1    ; Retain non-status information across program restarts
> > normal_check_interval          5
> > retry_check_interval           2
> > notification_period            24x7
> > notification_options           u,c,r
> > register                       0    ; DONT REGISTER THIS DEFINITION
> > }
> > 
> > define service{
> > use                            generic-service
> > host_name                      myrouter
> > service_description            CPU
> > is_volatile                    0
> > check_period                   24x7
> > max_check_attempts             3
> > retry_check_interval           1
> > contact_groups                 router-admins
> > notification_interval          120
> > notification_period            24x7
> > check_command                  check_cisco_cpu!publicro!60!69!70!80
> > }
> > 
> > 
> > 
> > #--- checkcommands.cfg
> > # 'check_snmp' generic command definition
> > define command{
> > command_name    check_snmp
> > command_line    $USER1$/check_snmp -t 10 -H $HOSTADDRESS$ -C $ARG1$ $ARG2$ $ARG3$ $ARG4$ $ARG5$ $ARG6$ $ARG7$ $ARG8$ $ARG9$
> > }
> > # check_cisco_cpu: checks router CPU-usage
> > # Syntax: !Hostname!Community!WARN-1min-%!WARN-5min-%!CRIT-1min-%!CRIT-5min-%
> > define command{
> > command_name    check_cisco_cpu
> > command_line    $USER1$/check_snmp -t 10 -H $HOSTADDRESS$ -o.1.3.6.1.4.1.9.2.1.57.0,.1.3.6.1.4.1.9.2.1.58.0 -C $ARG1$ -w :$ARG2$,:$ARG3$ -c : $ARG4$,:$ARG5$ -l 'CPU usage 1min/5min' -D ' / '
> > }
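
To see which arguments the plugin ends up receiving, it can help to
expand the macros by hand. The sketch below mimics the Nagios convention
that the '!'-separated fields of check_command become $ARG1$, $ARG2$, and
so on. It is an illustration, not Nagios code: expand_command is a
made-up helper, the $USER1$ path is a hypothetical value, and the
template is an abbreviated copy of check_cisco_cpu with the -l and -D
options dropped and the space after '-c :' closed up (on the assumption
that the space is a typo).

```python
import re


def expand_command(check_command, command_line, macros):
    """Substitute $ARGn$ and named macros into a command_line template."""
    name, *args = check_command.split("!")
    values = dict(macros)
    for index, arg in enumerate(args, start=1):
        values[f"ARG{index}"] = arg
    # Replace every $NAME$ token; in this sketch unknown macros become "".
    expanded = re.sub(
        r"\$(\w+)\$", lambda m: values.get(m.group(1), ""), command_line
    )
    return name, expanded


template = (
    "$USER1$/check_snmp -t 10 -H $HOSTADDRESS$ "
    "-o .1.3.6.1.4.1.9.2.1.57.0,.1.3.6.1.4.1.9.2.1.58.0 -C $ARG1$ "
    "-w :$ARG2$,:$ARG3$ -c :$ARG4$,:$ARG5$"
)
name, line = expand_command(
    "check_cisco_cpu!publicro!60!69!70!80",
    template,
    # Hypothetical values for the two non-argument macros.
    {"USER1": "/usr/local/nagios/libexec", "HOSTADDRESS": "192.168.1.1"},
)
print(name)
print(line)
```

With the service definition above, the expanded line ends in
"-C publicro -w :60,:69 -c :70,:80", so the thresholds reach check_snmp
in the ':max' form rather than as the bare '60,69' used on the command
line earlier.
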
> > 
> > 
> > 
> > Btw, by looking at the code in check_snmp.c I'm wondering: is there a
> > problem with #define mark(a) ((a)!=0?"*":"") in check_snmp.c ??? Or
> > are my parms so bad ? :-o
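
For reference, the macro itself is easy to reason about in isolation. The
Python sketch below reproduces it, together with a made-up helper show()
that wraps a value on both sides. That wrapping is an assumption, chosen
because it matches the '*4* / *3*' output shown earlier; the real call
sites in check_snmp.c are not quoted in this thread.

```python
STATE_OK = 0
STATE_WARNING = 1


def mark(state):
    """Python equivalent of: #define mark(a) ((a)!=0?"*":"")"""
    return "*" if state != 0 else ""


def show(value, state):
    # Assumption: the marker is printed on both sides of the value.
    return f"{mark(state)}{value}{mark(state)}"


print(show(4, STATE_OK))
print(show(4, STATE_WARNING))
```

The first call prints 4 and the second prints *4*. The macro only
decorates a value whose state is already non-zero, so the asterisks
reflect the threshold decision made elsewhere and are unlikely to be
its cause.
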
> > 
> > Thanks for your kind help.
> > Warm regards,
> > Pascal
> > 
> > 
> > 
> > 
> 
> 
