[Nagiosplug-devel] a plugin to check interface errors

atonns at mail.ivillage.com atonns at mail.ivillage.com
Tue Aug 26 08:17:14 CEST 2003


I have been able to solve my problems.

Basically when the data is a COUNTER type (as ifInErrors and ifOutErrors
are), rrdtool only stores rate information. In my case it was the average
per-second error rate over 10 minutes (as this was my step value). Thus, to
pseudo-calculate the number of errors, just multiply the rate by the step
(ie: # of err/sec * sec = # of err). Of course, it's an approximation based
on the average - but it should be accurate enough to determine if there's a
problem with the interface.
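
A minimal sketch of that arithmetic, in Python. The function and variable
names are made up for illustration and are not taken from the plugin; it
assumes the fetched values are per-second averages, one per step, with
unknown samples represented as None.

```python
# Sketch only: names here are illustrative, not from check_remote_interfaces.

def approx_error_count(rates_per_sec, step_seconds):
    """Approximate total errors from per-second average rates, one per step.

    Unknown samples (None) are skipped, so the result is a lower bound
    when data is missing.
    """
    return sum(rate * step_seconds for rate in rates_per_sec if rate is not None)


# Three 10-minute steps (step = 600 s); the middle one averaged 0.005 err/sec.
rates = [0.0, 0.005, 0.0]
total = approx_error_count(rates, 600)
print(round(total))  # 0.005 err/sec * 600 sec = 3 errors
```

The result is only as good as the averages: a burst of errors inside one
step is indistinguishable from the same number spread evenly across it.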

So finally, my plugin runs as such:

Usage: check_remote_interfaces -H hostname  -w=count,seconds -c=rate,minutes
Usage: check_remote_interfaces --hostname=hostname  --warning=count,seconds
--critical=rate,minutes
                               [-v|--verbose -V|--version -h|--help]

The plugin will return warning if there are at least "count" errors in the
last "seconds" (ie: I'm using -w5,1800 - so it's a total of 5 errors within
the last 30 minutes) or return critical if the error rate is at least "rate"
per minute over the last "minutes" (ie: I'm using -c 3,180 - so it's an
error rate of 3 per minute over the last 3 hours).
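
The threshold logic described above could be sketched roughly as follows, in
Python. This is not the plugin's code: parse_threshold and check_interface
are hypothetical names, and the sketch assumes the samples are per-second
average rates, oldest first, one per rrd step.

```python
# Sketch only: a hypothetical re-statement of the warning/critical rules.

def parse_threshold(text):
    """Split a 'value,window' string such as '5,1800' into two numbers."""
    value, window = text.split(",")
    return float(value), int(window)


def check_interface(samples, step_seconds, warning, critical):
    """Return 'OK', 'WARNING' or 'CRITICAL'.

    samples:  per-second average error rates, oldest first, one per step.
    warning:  'count,seconds' - total errors within the last N seconds.
    critical: 'rate,minutes'  - errors per minute over the last N minutes.
    """
    warn_count, warn_seconds = parse_threshold(warning)
    crit_rate, crit_minutes = parse_threshold(critical)

    def errors_in_last(seconds):
        # Number of whole steps that cover the requested window.
        steps = max(1, seconds // step_seconds)
        recent = samples[-steps:]
        return sum(r * step_seconds for r in recent if r is not None)

    if errors_in_last(crit_minutes * 60) / crit_minutes >= crit_rate:
        return "CRITICAL"
    if errors_in_last(warn_seconds) >= warn_count:
        return "WARNING"
    return "OK"


# 3 hours of 10-minute steps; only the latest step saw errors (about 6).
print(check_interface([0.0] * 17 + [0.01], 600, "5,1800", "3,180"))  # WARNING
```

Critical is tested first so that a sustained high rate is not reported as a
mere warning.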

The only value I've 'hard coded' (and could easily make a cmdline option) is
the expected interval of how often data is collected - ie: the rrd step
value. This would normally be the value in "normal_check_interval", but
since that isn't available as a macro it has to be user-input.

Finally, my weird float/int problem was due to the fact that I had invalid
rrd data. I was trying to create fake data to simulate interface errors, and
since my script updates the data as well as analyzing it, it was updating
with bad data. Basically, since interface errors are a counter type, once X
errors have occurred, the counter will always be X or greater (if there are
more errors) until the counter rolls over the 32bit/64bit integer. Since I
was simulating errors, and then making a real reading (ie: currently 0
errors), rrdtool was assuming the counter had rolled over, hence my wacky
value.
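
To illustrate why a drop in a counter produces such a large value, here is a
simplified Python model of wrap handling for a COUNTER data source. The
exact arithmetic is an assumption about how a decrease gets interpreted (a
32-bit wrap first, then a 64-bit wrap); counter_delta is a made-up name and
this is not rrdtool's code.

```python
# Sketch only: a simplified model of "a counter never decreases" handling.

def counter_delta(previous, current):
    """Difference between two COUNTER readings, treating a drop as a wrap."""
    delta = current - previous
    if delta < 0:
        delta += 2**32          # assume a 32-bit counter wrapped
    if delta < 0:
        delta += 2**64 - 2**32  # otherwise assume a 64-bit counter wrapped
    return delta


print(counter_delta(100, 105))  # 5: normal increase
print(counter_delta(500, 0))    # 4294966796: the drop is read as a 32-bit wrap
```

Divided by a 600 second step, that second delta becomes a rate of several
million errors per second, which is the kind of value a simulated reading
followed by a real reading of 0 would produce.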

This raises the problem of how to handle "normal" counter reset events - ie:
the rebooting of a machine or running the "clear counters" command on a
router. I've yet to work this out.

Tony

P.S. It looks like check_rrdtool-0.3.pl (I assume from this page:
http://www.nachtwache.org/projects/netsaint/plugins/) only checks data of
type GAUGE (ie: CPU) and not of type COUNTER (ie: ifInErrors, ifOutErrors,
etc). You'd need to make a call to "rrdtool info filename.rrd" and look at
the value 'ds[a].type = "GAUGE"', and then modify the calculations
appropriately.
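
As a rough sketch of that lookup, the following Python picks the data source
types out of "rrdtool info" style output. The sample text is hand-written
for illustration (the filename and values are hypothetical), and
data_source_types is a made-up helper, not part of check_rrdtool.

```python
# Sketch only: parses lines of the form  ds[name].type = "TYPE"
import re

DS_TYPE = re.compile(r'^ds\[(?P<name>[^\]]+)\]\.type = "(?P<type>[A-Z]+)"$')


def data_source_types(info_text):
    """Map each data source name to its type (GAUGE, COUNTER, ...)."""
    types = {}
    for line in info_text.splitlines():
        match = DS_TYPE.match(line.strip())
        if match:
            types[match.group("name")] = match.group("type")
    return types


# Hand-written sample in the style of "rrdtool info filename.rrd" output.
sample = '''filename = "router1-eth0.rrd"
step = 600
ds[ifInErrors].type = "COUNTER"
ds[ifInErrors].minimal_heartbeat = 600
ds[ifOutErrors].type = "COUNTER"
'''
print(data_source_types(sample))
# {'ifInErrors': 'COUNTER', 'ifOutErrors': 'COUNTER'}
```

A check script could then multiply fetched values by the step for COUNTER
sources and compare GAUGE sources against the thresholds directly.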

--
"Computer science is as much about computers as
        astronomy is about telescopes" -- Edsger Dijkstra
---------------------------------------------------------
Anthony Tonns, UNIX Administrator - atonns at mail.ivillage.com


> -----Original Message-----
> From: Michael Markstaller [mailto:mm at elabnet.de]
> Sent: Thursday, August 14, 2003 11:57 AM
> To: nagiosplug-devel at lists.sourceforge.net
> Subject: RE: [Nagiosplug-devel] a plugin to check interface errors
> 
> 
> I already tried the same with some similar approaches but failed.
> The data collection into the rrd's is done by mrtg, and I then use
> the only "barely working" check_rrdtool-0.3.pl to check the
> threshold. It works for my CPU rrds but somehow not for iferrors.
> 
> I'd really appreciate it if there were a _working_ check_rrd in
> future, as with such a plugin several of these checks could be
> performed very easily.
> 
> Michael
> 
> -----Original Message-----
> From: atonns at mail.ivillage.com [mailto:atonns at mail.ivillage.com]
> Sent: Thursday, August 14, 2003 4:40 PM
> To: nagiosplug-devel at lists.sourceforge.net
> Subject: [Nagiosplug-devel] a plugin to check interface errors
> 
> 
> I'm wondering if someone has written a plugin to check for interface
> errors. I have already looked at check_ifoperstatus and check_ifstatus -
> and I've already got my own versions of them for SNMPv3.
> 
> What I'm looking for is a way to detect interface errors. I've got some
> perl code I'm still working on, but the data storage is killing me. To
> explain:
> 
> Basically, I'm walking the MIBs 1.3.6.1.2.1.2.2.1.14 (ifInErrors) and
> 1.3.6.1.2.1.2.2.1.20 (ifOutErrors) every 10 minutes and storing them each
> in a different rrd (a file named $hostname-$ifDescr.rrd). Then, I'm rrd
> fetching the last 60 minutes worth of data - if there's at least
> $warning_cnt errors in that time period, return warning. Finally, I'm rrd
> fetching the last 180 minutes (3 hrs) worth of data - if there's an error
> rate of at least $critical_rate per minute, return critical.
> 
> I'm having problems with storing the rrd data with the RRDs perl module -
> I'm trying to simulate the data by pre-populating a rrd file with errors,
> and I'm getting floating point results (instead of int). I'm using RRDtool
> 1.0.45. FWIW, my RRD create looks like this:
> 
>         RRDs::create ( $filename,
>             "--start", $now - 60,
>             "--step", "600",
>             "DS:ifInErrors:COUNTER:600:U:U",
>             "DS:ifOutErrors:COUNTER:600:U:U",
>             "RRA:MAX:0.5:1:600",
>             "RRA:MAX:0.5:1:150",
>         );
> 
> Summary:
> 
> 1) Has anyone written a plugin to check interface errors?
> 2) Has anyone had strange problems with RRDs and float/int problems?
> 
> Thanks
> Tony
> 