[Nagiosplug-help] monitoring F5 bigIP Load Balancers

Thomas Guyot-Sionnest dermoth at aei.ca
Thu Jun 19 03:34:22 CEST 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 18/06/08 09:27 AM, Heiko wrote:
> On Wed, Jun 18, 2008 at 2:33 PM, Thomas Guyot-Sionnest <dermoth at aei.ca> wrote:
>>>>> this looks to work much better, but I still get notifications about
>>>>> pools that have no users,
> I'm not sure what you mean here...
> 
>> It did report a wrong status, saying that nodes were offline when they weren't.
>> For example, it can't be that one machine sees only 1 node while
>> the other sees both.
>> In this case the second BigIP reports a timeout:
> 
>> [root at monitoring-1:/usr/local/nagios/libexec]# date
>> Wed Jun 18 13:18:22 UTC 2008
>> [root at monitoring-1:/usr/local/nagios/libexec]#
>> /usr/local/nagios/libexec/check_bigip_pool -H 172.17.1.12 -C public -S
>> 9 -vw 51 -c 26 -P pool1_PRODUCTION_www_v2 -t 180
>> Getting 'MemberQty' trough SNMP
>> Matching F5-BIGIP-LOCAL-MIB::ltmPoolMemberPoolName against
>> '\.1\.3\.6\.1\.4\.1\.3375\.2\.2\.5\.3\.2\.1\.1\.27\.67\.66\.105\.100\.101\.97\.115\.116\.118\.95\.80\.82\.79\.68\.85\.67\.84\.73\.79\.78\.95\.119\.'
>> Getting 'ActiveMemberCount' trough SNMP
>> CHECK_BIGIP_POOL WARNING - pool1_PRODUCTION_www_v2 1/2 nodes online
>> [root at monitoring-1:/usr/local/nagios/libexec]# date
>> Wed Jun 18 13:18:26 UTC 2008
>> [root at monitoring-1:/usr/local/nagios/libexec]#
>> /usr/local/nagios/libexec/check_bigip_pool -H 172.17.1.11 -C public -S
>> 9 -vw 51 -c 26 -P pool1_PRODUCTION_www_v2 -t 180
>> Getting 'MemberQty' trough SNMP
>> Matching F5-BIGIP-LOCAL-MIB::ltmPoolMemberPoolName against
>> '\.1\.3\.6\.1\.4\.1\.3375\.2\.2\.5\.3\.2\.1\.1\.27\.67\.66\.105\.100\.101\.97\.115\.116\.118\.95\.80\.82\.79\.68\.85\.67\.84\.73\.79\.78\.95\.119\.'
>> Getting 'ActiveMemberCount' trough SNMP
>> CHECK_BIGIP_POOL OK - pool1_PRODUCTION_www_v2 all 2 nodes online
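
A note on the long escaped OID in the verbose output above: it appears to be the ltmPoolMemberPoolName column OID followed by the pool name as a string index. The sketch below assumes the usual SNMP encoding for such an index (the string length, then one ASCII code per character), which the leading 27 in the output is consistent with. The pool name "web_pool" and the function name pool_index are made up for illustration; this is not taken from the plugin's source.

```shell
#!/bin/sh
# Sketch only: encode a pool name as a length-prefixed SNMP string index.
# Assumption: the table index is <length>.<ascii>.<ascii>...
pool_index() (
    name=$1
    out=${#name}                      # index starts with the string length
    while [ -n "$name" ]; do
        rest=${name#?}                # everything after the first character
        ch=${name%"$rest"}            # the first character itself
        out="$out.$(printf '%d' "'$ch")"   # append its ASCII code
        name=$rest
    done
    printf '%s\n' "$out"
)

# Column OID as shown in the verbose output; "web_pool" is a hypothetical name.
base=".1.3.6.1.4.1.3375.2.2.5.3.2.1.1"
echo "${base}.$(pool_index web_pool)"
# prints .1.3.6.1.4.1.3375.2.2.5.3.2.1.1.8.119.101.98.95.112.111.111.108
```

The resulting OID prefix could then be passed to snmpwalk to check by hand which members the BigIP returns for that pool.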
> 
> 
>>>>> aren't reachable and messages like this:
> You mean the BigIP isn't reachable? Have you tried snmpwalk'ing it by
> hand? Raising the timeout?
> 
>> My timeout is set to 180 seconds, so I set the normal_check_interval to 5 minutes.
>> But we still get a lot of timeouts. They often recover on the next
>> check, but in this situation we
>> can't use it in a production environment.
>> We think it is a BigIP problem: maybe it gives SNMP queries a low
>> priority, so they get processed too late or never.
>> We have some heavy load on these machines, but even on the standby
>> unit we have some timeouts on pools.
>> The strange thing is that the vservers are always reported as they should be.

Thanks, although you added only one -v flag. I'd like all three so I can
see in full detail what's happening. Can you reproduce the "1/0 node
online" case in full debug (-vvv)?
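
Something along these lines should capture a full debug run with timestamps. It is only a sketch: the options are copied from your earlier command with -v replaced by -vvv, and the log path is a placeholder I made up, so adjust as needed.

```shell
#!/bin/sh
# Sketch only: rerun the check in full debug and keep the output.
PLUGIN=/usr/local/nagios/libexec/check_bigip_pool
HOST=172.17.1.12
POOL=pool1_PRODUCTION_www_v2
LOG=${LOG:-/tmp/check_bigip_pool_debug.log}   # hypothetical log location

# Same options as the earlier run, but with three -v flags.
set -- -H "$HOST" -C public -S 9 -vvv -w 51 -c 26 -P "$POOL" -t 180

if [ -x "$PLUGIN" ]; then
    {
        date -u                      # timestamp before the check
        "$PLUGIN" "$@" 2>&1          # full debug output, stderr included
        echo "exit status: $?"       # Nagios state returned by the plugin
        date -u                      # timestamp after the check
    } | tee -a "$LOG"
else
    # Plugin not installed here: just show what would be executed.
    echo "would run: $PLUGIN $*"
fi
```

Running it repeatedly until the bad result shows up, then sending the matching part of the log, would be enough.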

Also, are you trying to check both the passive and active BigIPs in an HA
pair? This check was designed to run against the floating IP so that you
always reach the active BigIP.

Thomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFIWY7u6dZ+Kt5BchYRAkBzAJ9JQh5fzMX9+lhqGdEXej0ox5/REACfenue
6LFSkSGib0q3qV9mgPzWOGM=
=0nAr
-----END PGP SIGNATURE-----



