[Nagiosplug-help] monitoring F5 bigIP Load Balancers

Heiko rupertt at gmail.com
Thu Jun 19 08:42:23 CEST 2008


On Thu, Jun 19, 2008 at 3:34 AM, Thomas Guyot-Sionnest <dermoth at aei.ca> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 18/06/08 09:27 AM, Heiko wrote:
>> On Wed, Jun 18, 2008 at 2:33 PM, Thomas Guyot-Sionnest <dermoth at aei.ca> wrote:
>>>>>> this seems to work much better, but I still get notifications about
>>>>>> pools that have no users,
>> I'm not sure what you mean here...
>>
>>> It reported a wrong status: it said nodes were offline when they weren't.
>>> For example, it can't be that one machine sees only 1 node and the
>>> other sees both.
>>> In this case the second BigIP reports a timeout:
>>
>>> [root at monitoring-1:/usr/local/nagios/libexec]# date
>>> Wed Jun 18 13:18:22 UTC 2008
>>> [root at monitoring-1:/usr/local/nagios/libexec]#
>>> /usr/local/nagios/libexec/check_bigip_pool -H 172.17.1.12 -C public -S
>>> 9 -vw 51 -c 26 -P pool1_PRODUCTION_www_v2 -t 180
>>> Getting 'MemberQty' trough SNMP
>>> Matching F5-BIGIP-LOCAL-MIB::ltmPoolMemberPoolName against
>>> '\.1\.3\.6\.1\.4\.1\.3375\.2\.2\.5\.3\.2\.1\.1\.27\.67\.66\.105\.100\.101\.97\.115\.116\.118\.95\.80\.82\.79\.68\.85\.67\.84\.73\.79\.78\.95\.119\.'
>>> Getting 'ActiveMemberCount' trough SNMP
>>> CHECK_BIGIP_POOL WARNING - pool1_PRODUCTION_www_v2 1/2 nodes online
>>> [root at monitoring-1:/usr/local/nagios/libexec]# date
>>> Wed Jun 18 13:18:26 UTC 2008
>>> [root at monitoring-1:/usr/local/nagios/libexec]#
>>> /usr/local/nagios/libexec/check_bigip_pool -H 172.17.1.11 -C public -S
>>> 9 -vw 51 -c 26 -P pool1_PRODUCTION_www_v2 -t 180
>>> Getting 'MemberQty' trough SNMP
>>> Matching F5-BIGIP-LOCAL-MIB::ltmPoolMemberPoolName against
>>> '\.1\.3\.6\.1\.4\.1\.3375\.2\.2\.5\.3\.2\.1\.1\.27\.67\.66\.105\.100\.101\.97\.115\.116\.118\.95\.80\.82\.79\.68\.85\.67\.84\.73\.79\.78\.95\.119\.'
>>> Getting 'ActiveMemberCount' trough SNMP
>>> CHECK_BIGIP_POOL OK - pool1_PRODUCTION_www_v2 all 2 nodes online
>>
>>
>>>>>> aren't reachable, and messages like this:
>> You mean the BigIP isn't reachable? Have you tried snmpwalk'ing it by
>> hand? Raising the timeout?
>>
>>> My timeout is 180 seconds (-t 180), so I set the normal_check_interval to 5 minutes.
>>> But we still get a lot of timeouts. They often recover on the next
>>> check, but in this situation we
>>> can't use it in a production environment.
>>> We think it is a BigIP problem; maybe it gives SNMP queries a low
>>> priority, so they get processed too late or never.
>>> We have some heavy load on these machines. But even on the standby
>>> unit we have some timeouts on pools.
>>> The strange thing is that the vservers are always reported as they should be.
>
> Thanks, although you added only one -v flag. I'd like all three, to see
> in full detail what's happening. Can you reproduce the "1/0 node
> online" one in full debug (-vvv)?
>
Hello Thomas,
we didn't get that exact message again.
When I run your pool check from bash, the result comes back very fast:

 /usr/local/nagios/libexec/check_bigip_pool -H 172.17.1.10 -C public
-S 9 -vw 51 -c 26 -P cbf-bla_mysql -t 180
Getting 'MemberQty' trough SNMP
Matching F5-BIGIP-LOCAL-MIB::ltmPoolMemberPoolName against
'\.1\.3\.6\.1\.4\.1\.3375\.2\.2\.5\.3\.2\.1\.1\.17\.99\.98\.119\.45\.105\.100\.101\.97\.115\.116\.118\.95\.109\.121\.115\.113\.108'
Getting 'ActiveMemberCount' trough SNMP
CHECK_BIGIP_POOL WARNING - cbf-bla_mysql 1/2 nodes online
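
As far as I can tell, the escaped pattern in the output above is the ltmPoolMemberPoolName OID followed by the pool name encoded as an SNMP string index: first the length of the name, then one decimal ASCII code per character. A minimal Python sketch of that encoding, assuming this is how the pattern is built (the function name and the pool name "example_pool" are made up for illustration and are not taken from the plugin):

```python
# Base OID copied from the plugin output above (ltmPoolMemberPoolName).
BASE_OID = ".1.3.6.1.4.1.3375.2.2.5.3.2.1.1"


def pool_index_pattern(pool_name: str) -> str:
    """Return an escaped OID prefix for rows indexed by pool_name."""
    # SNMP string index: the length first, then one number per character.
    codes = [len(pool_name)] + [ord(ch) for ch in pool_name]
    oid = BASE_OID + "." + ".".join(str(c) for c in codes)
    # Escape the dots so they match literally, as in the debug output.
    return oid.replace(".", r"\.")


if __name__ == "__main__":
    # "example_pool" is a hypothetical name, not a pool from this thread.
    print(pool_index_pattern("example_pool"))
```

Comparing the length number in the pattern with the length of the pool name passed to -P is a quick way to check that the plugin is matching the pool I expect.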

Could the problem be inside Nagios itself?

> Also are you trying to check both the passive and active BigIPs in a HA
> pair? This check was designed to run on the floating IP so you always
> get to the Active BigIP.
>
I have changed that now, so we only check the floating IP, but it didn't
change anything; I still get timeouts.
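
To separate slow SNMP answers from something on the Nagios side, one option is to run the same check repeatedly from the shell and record how long each run takes. A rough Python sketch, assuming the check command is passed as arguments (the helper name time_runs and the defaults are my own, not part of the plugin or of Nagios):

```python
import subprocess
import sys
import time


def time_runs(cmd, runs=5, timeout=180):
    """Run cmd several times; return a list of (seconds, exit_code).

    exit_code is None when a run took longer than `timeout` seconds.
    """
    results = []
    for _ in range(runs):
        start = time.monotonic()
        try:
            proc = subprocess.run(cmd, capture_output=True, timeout=timeout)
            code = proc.returncode
        except subprocess.TimeoutExpired:
            code = None
        results.append((time.monotonic() - start, code))
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is the check command to time,
    # for example the check_bigip_pool call shown above.
    for elapsed, code in time_runs(sys.argv[1:]):
        print(f"{elapsed:6.2f}s exit={code}")
```

If every run from the shell finishes within a few seconds while Nagios still reports timeouts, that would point away from the BigIP. It may also be worth comparing against the service_check_timeout value in nagios.cfg, since Nagios enforces that limit itself regardless of the -t value given to the plugin.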


cheers

Heiko

> Thomas
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.6 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iD8DBQFIWY7u6dZ+Kt5BchYRAkBzAJ9JQh5fzMX9+lhqGdEXej0ox5/REACfenue
> 6LFSkSGib0q3qV9mgPzWOGM=
> =0nAr
> -----END PGP SIGNATURE-----
>



