[Nagiosplug-devel] Kickoff for 1.5

Subhendu Ghosh sghosh at sghosh.org
Wed Mar 9 18:22:21 CET 2005


On Wed, 9 Mar 2005, Harper Mann wrote:

> Hi Everyone,
>
> There are several items in an SNMP plugin discussion we're interested in and
> are working on.  What I can remember off the top of my head is:
>
> 1) How to manage and alarm on counter data like interface traffic, etc.  We
> use check_rrd, which was mentioned earlier in this thread, and perhaps
> that's sufficient since we customarily store and graph, but standardizing
> this would be good.  We're not sure RRDTool will scale to sufficient size
> installations.
>
> 2) We've had a request to collect 3-4 SNMP values (in, out, errors) from
> more than 10,000 interfaces every 15 minutes so we're looking into how to
> scale to such a large installation.  Aside from how to get plugins to keep
> up with collecting, what's the best way to store so much performance data?
>
> 3) Fix the performance data so it conforms to the project standards and
> manages OIDs and Symbolic names well for multiple requests.
>

Separate out the functionality: Nagios is primarily a fault management
tool. For performance data on 10k interfaces, choose a performance
management (monitoring) tool.

I've been partial to Cricket for SNMP data collection. Its SNMP engine is
pretty well designed, so each device is contacted only once and all
the different OIDs are requested together (cricket.sf.net).
I've seen it scale quite well so long as you can stagger the host
groups (i.e. not everything runs at the same moment within the 15 min
interval) and you can use SNMP v2 and get-bulk.
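
As a rough illustration of the staggering idea, here is a minimal Python
sketch that spreads hosts over start offsets inside one polling interval.
It is not how Cricket does it. The slot count, the use of a CRC of the
host name, and the example host names are all assumptions made for the
sketch.

```python
import zlib

INTERVAL_S = 15 * 60   # polling interval from the thread: 15 minutes
SLOTS = 15             # assumption: one start slot per minute


def slot_for(host, slots=SLOTS):
    """Map a host name to a stable slot number in [0, slots)."""
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(host.encode("utf-8")) % slots


def start_offset(host, interval_s=INTERVAL_S, slots=SLOTS):
    """Seconds after the start of each interval at which this host is polled."""
    return slot_for(host, slots) * (interval_s // slots)


def build_schedule(hosts):
    """Group hosts by start offset so each group can be polled together."""
    schedule = {}
    for host in hosts:
        schedule.setdefault(start_offset(host), []).append(host)
    return schedule


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    hosts = ["router%03d.example.net" % i for i in range(200)]
    for offset, group in sorted(build_schedule(hosts).items()):
        print("t+%3ds: %d hosts" % (offset, len(group)))
```

Because the offset depends only on the host name, a host keeps the same
slot from run to run, so its samples stay one full interval apart.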

For alarms - either check_rrd or snmptraps from Cricket (and possibly
Cacti in the near future).
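
To make the alarm side concrete, here is a minimal Python sketch of the
kind of check involved: turn two stored counter samples into a rate and
map it to a Nagios status. It is not the check_rrd implementation. The
function names, thresholds and sample values are hypothetical, and it
assumes a 32-bit counter that wraps at most once between samples.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def counter_rate(prev, curr, seconds, bits=32):
    """Per-second rate between two counter samples, allowing one wrap."""
    if seconds <= 0:
        raise ValueError("samples must be taken at different times")
    delta = curr - prev
    if delta < 0:
        # Assume the counter wrapped past 2**bits between the samples.
        # A device reboot also resets the counter and would be misread here.
        delta += 2 ** bits
    return delta / seconds


def status_for(rate, warn, crit):
    """Map a rate to a Nagios status using simple upper thresholds."""
    if rate >= crit:
        return CRITICAL
    if rate >= warn:
        return WARNING
    return OK


if __name__ == "__main__":
    # Hypothetical octet counter samples taken 900 seconds apart.
    rate = counter_rate(prev=1200000, curr=91200000, seconds=900)
    names = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}
    state = status_for(rate, warn=80000, crit=120000)
    print("%s - %.1f octets/s" % (names[state], rate))
```

A real plugin would exit with the status code instead of only printing it,
and would return UNKNOWN when the stored data is missing or stale.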

Forcing Nagios to do traffic measurements over SNMP will not scale,
because of the plugin architecture.  You need something else
to do the active monitoring and check the results.  For small installs that
don't want multiple tools, it would work, but large installs like yours
should definitely use separate tools.

I used to monitor about the same number of interfaces with MRTG around
'98-'00.  Disk i/o was the biggest issue (RAM disk to the rescue).

RRDtool scales as well as the underlying hardware (disk i/o) and file 
layout.
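
A back-of-the-envelope Python sketch of why file layout matters, using the
figures from the quoted request (10,000 interfaces, 15 minute interval).
The layouts compared, one file per interface versus one file per value
with four values each, are assumptions for illustration, and the result is
an average that presumes polls are spread evenly over the interval.

```python
def updates_per_second(interfaces, files_per_interface, interval_s):
    """Average file updates per second if polls are spread evenly."""
    return interfaces * files_per_interface / interval_s


if __name__ == "__main__":
    interval_s = 15 * 60
    # Assumed layouts: all values of an interface in one file, or one file each.
    for label, files in (("one file per interface", 1), ("one file per value", 4)):
        rate = updates_per_second(10000, files, interval_s)
        print("%s: %.1f updates/s" % (label, rate))
```

This works out to roughly 11 versus 44 file updates per second on average.
If every host is polled at the same moment, the same writes arrive as one
burst, which is where disk i/o becomes the limit.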

-- 
-sg



