[Nagiosplug-devel] more info on network management plugins

Subhendu Ghosh sghosh at sghosh.org
Tue Sep 24 10:38:03 CEST 2002


On 12 Sep 2002, Guy Van Den Bergh wrote:

> On Wed, 2002-09-11 at 21:03, Subhendu Ghosh wrote:
> > Sorry - don't mean to be nit-picking...
> > 
> Don't worry, discussion is good :)
> 
> > > 
> > > *) check_xinterface considers also the administrative status of the
> > > interface: when the interface is admin down, only a WARNING status is
> > > returned (CRITICAL when admin up & operational down).
> > 
> > Is this logical? - often we have interfaces installed on devices that are
> > not in use and therefore admin down.  If monitoring a specific interface 
> > (therefore assumption is that it should be up), why would the interface go 
> > admin down - (if maintenance then scheduling downtime ...)
> > 
> > Inadvertent change of admin to down should get flagged on your
> > configuration diffs (You are doing those, right :)
> 
> You have a point, but anyway you can choose in the nagios config whether
> you want a notification on admin down or not. If an interface is admin
> down, it is (probably) done through manual intervention, maybe even for
> a good reason. I still think it is good to treat admin down and
> operational down differently.
> 
> > 
> > By requiring both ifIndex and ifName in the config file you are forcing a 
> > nms reconfig because of a router reboot with snmp reindexing. Seems to me 
> > to be at cross-purposes in trying to maintain nms uptime.  
> 
> That's true, but I tried to find a balance between nms uptime, snmp
> traffic and monitoring the correct interface (read on!).
> > 
> > The current version of check_ifoperstatus supports either ifIndex or 
> > ifDescr for the interface.  It could be improved to use ifName instead of 
> > ifDescr if the -I (support for ifXTable) is provided.
> > 
> It is indeed possible to provide only ifName or ifDescr, and get the
> correct index from the router with snmp. But I think this might be a
> problem on routers with a large number of interfaces. Excessive traffic
> can be solved by storing a mapping ifName->ifIndex locally and let the
> plugin use this data when available. But I'm not sure whether this is a
> good idea for plugin performance. Any ideas?
> Another possibility is to get the index when only ifName or ifDescr is
> provided, use the index directly when provided, and ultimately make the
> comparison as I am doing now when both are provided.
> 

There was an earlier discussion about supporting a cache/tmp directory
for this kind of data - similar to the metadata that Cricket stores
on snmp indexes.  I'll look into the perl-cache and see if that helps. One
of the issues with caching this info is where to cache it.  My preference is
to cache in memory or on a memory filesystem, so the plugin does not add to
disk I/O bottlenecks.
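
To make the caching idea concrete, here is a rough sketch (in Python rather
than Perl, just for illustration). It is not code from any existing plugin.
The function name cached_if_index, the walk_if_names callback (standing in
for an SNMP walk of ifName) and the cache_dir argument are all made up for
this example; cache_dir would point at a memory filesystem.

```python
import json
import os
import tempfile
import time
from typing import Callable, Dict, Optional


def cached_if_index(
    host: str,
    if_name: str,
    walk_if_names: Callable[[str], Dict[str, int]],  # hypothetical SNMP walk
    cache_dir: str,                                   # e.g. a tmpfs mount
    ttl: float = 3600.0,                              # assumed cache lifetime
) -> Optional[int]:
    """Return the ifIndex for if_name, walking the device only when needed."""
    # Simplification: host is assumed to be safe to use in a file name.
    path = os.path.join(cache_dir, f"ifindex-{host}.json")
    table = None
    try:
        # Use the cached table only while it is younger than ttl seconds.
        if time.time() - os.path.getmtime(path) <= ttl:
            with open(path) as fh:
                table = json.load(fh)
    except (OSError, ValueError):
        table = None  # missing, unreadable or corrupt cache: re-walk below

    if table is None or if_name not in table:
        table = walk_if_names(host)
        # Write to a temp file and rename, so that a concurrent plugin run
        # never reads a half-written cache file.
        fd, tmp = tempfile.mkstemp(dir=cache_dir)
        with os.fdopen(fd, "w") as fh:
            json.dump(table, fh)
        os.replace(tmp, path)
    return table.get(if_name)
```

A cached index can still go stale after a reboot with snmp reindexing, so
the plugin would still want to compare ifName at the cached index and
re-walk on a mismatch, as in the comparison Guy describes.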



> > > *) check_xinterface returns UNKNOWN when the snmp poll fails
> > > (check_ifoperstatus returns critical). SNMP implementations on
> > > commercial routers are pretty stable. Most of the time an SNMP poll
> > > failure means the router is down, and this is detected with the
> > > check-host-alive command.
> > 
> > True - but having a critical here allows for faster notification rather 
> > than waiting for the host-check max-attempts to expire.
> 
> I'd rather wait for that to be really sure the host is down.
> You can have a condition where a link is lost, which makes a router
> unreachable. I will receive 2 notifications: host down and interface
> down (on the router still reachable). You will get at least 3
> notifications with ifoperstatus.pl (one interface down and at least one
> unreachable and one host down). xinterface can return all states:
> unknown, ok, warning and critical; ifoperstatus only returns ok and
> critical, and unknown on a condition which will not be met very often
> (plugin timeout instead of snmp session timeout).
> 


OK, I'll accept this :)

Here's what I would propose - merge the changes from check_xinterface and 
check_ifoperstatus into a new check_snmp_interface.

The new name would make it clear that the behaviour has changed.
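
As a sketch of the state mapping such a merged plugin could use, based on
the behaviour described for check_xinterface above: the function name
interface_state is made up, None stands for "the snmp poll failed", and
the handling of oper states other than up/down is my assumption, since the
thread does not cover it. The numeric values are the standard plugin
return codes and the IF-MIB up(1)/down(2) values.

```python
from typing import Optional

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# IF-MIB values shared by ifAdminStatus and ifOperStatus.
IF_UP, IF_DOWN = 1, 2


def interface_state(admin_status: Optional[int],
                    oper_status: Optional[int]) -> int:
    """Map ifAdminStatus/ifOperStatus to a plugin return code."""
    if admin_status is None or oper_status is None:
        return UNKNOWN      # snmp poll failed: leave it to the host check
    if admin_status == IF_DOWN:
        return WARNING      # admin down: probably a deliberate change
    if oper_status == IF_UP:
        return OK
    if oper_status == IF_DOWN:
        return CRITICAL     # admin up but operationally down
    return UNKNOWN          # testing, dormant, ...: assumption, not from the thread
```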


> > > 2. The router environment plugins.
> > > It is indeed a good idea to fold both check_cisco_env and
> > > check_juniper_env into a single plugin, but I will need some more time
> > > to implement this. This way it should be easily extensible with other
> > > proprietary environment MIBs (Foundry, Unisphere, Alcatel, ...)
> > > 
> > I can give you a hand on this, if you'd like.
> > 
> Welcome aboard ;) I was thinking of transforming the two plugins into
> functions, and make the check_router_env decide whether it should call
> the cisco_env function or the juniper_env function. Any suggestions?
> 

Functions would be the easiest way to go.  We may want to flag
cases where the device cannot return all of the information.
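
Roughly what I have in mind, again as an illustrative Python sketch and
not actual plugin code: check_router_env picks a per-vendor function from
a table, and anything the device did not report is flagged in the output.
The sensor names in EXPECTED, the True/False/None convention for readings
and the choice not to change the state for missing readings are all
assumptions.

```python
from typing import Callable, Dict, Optional, Tuple

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Hypothetical set of readings every vendor function should try to return.
EXPECTED = ("temperature", "fan", "power_supply")

# True = healthy, False = fault, None or absent = not reported by the device.
Readings = Dict[str, Optional[bool]]
Handler = Callable[[str], Readings]


def check_router_env(vendor: str, host: str,
                     handlers: Dict[str, Handler]) -> Tuple[int, str]:
    """Dispatch to the vendor function and summarise its readings."""
    handler = handlers.get(vendor)
    if handler is None:
        return UNKNOWN, f"no environment handler for vendor {vendor!r}"
    readings = handler(host)
    faults = [k for k in EXPECTED if readings.get(k) is False]
    missing = [k for k in EXPECTED if readings.get(k) is None]
    state = CRITICAL if faults else OK
    parts = []
    if faults:
        parts.append("fault: " + ", ".join(faults))
    if missing:
        # Flag what the device could not tell us, without changing the state.
        parts.append("not reported: " + ", ".join(missing))
    return state, "; ".join(parts) or "all sensors healthy"
```

Adding another vendor (Foundry, Unisphere, Alcatel, ...) would then just
mean writing one more function and adding it to the handlers table, e.g.
handlers = {"cisco": cisco_env, "juniper": juniper_env}.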

-sg