[Nagiosplug-devel] Problem: check_icmp incorrectly reporting packet loss?
srunschke at abit.de
Tue Mar 22 02:41:21 CET 2005
I have been using Nagios together with check_icmp for quite some time now,
and it mostly went very smoothly. I recently made a few changes, and now
check_icmp seems to be causing problems.
Since I am running a highly parallelized Nagios environment, I moved to
a new server to get rid of the load spikes, switched from an older
beta to 2.0b2, and updated my old nagiosplug CVS snapshot to 1.4.
The old server was running RH9, the new one is running RH Enterprise
4. SELinux is disabled, since I couldn't get it to work with the
being accessed by the http- and usr-content.
After a few days of smooth operation, check_icmp seems to have started choking,
reporting packet loss and high latency where there is actually none.
HP Compaq DL 380 G4 (Xeon 3.4 GHz), 3 GB RAM
RedHat Enterprise Server 4 (SELinux disabled)
check_icmp 2005_03_15 (tried that after having problems with the one from
I'm running around 300 service checks per minute, quite some of them icmp
After a few days check_icmp suddenly started to report high latency and
packet loss, but the strange thing is that it only reported it for 2
(I am monitoring about 10 remote sites in addition to our HQ)
So my first conclusion was problems with the internet connection for those
sites. But to my surprise there were absolutely no problems. Neither the
there noticed _anything_, nor did my ping tests bring anything up. Even
from the Nagios machine worked absolutely fine - but using check_icmp to
brought up the same strange behaviour - high latency (500-1000ms) and high
loss (60-80%) with half of the checks made - the others were fine.
Using /bin/ping at the same time brings up:
[root at nagios check_icmp-2005-03-15]# ping 184.108.40.206
PING 220.127.116.11 (18.104.22.168) 56(84) bytes of data.
--- 22.214.171.124 ping statistics ---
21 packets transmitted, 21 received, 0% packet loss, time 20027ms
rtt min/avg/max/mdev = 37.718/109.663/307.297/98.933 ms, pipe 2
The quite high rtt is normal, since that site has a continuous bandwidth
but it seldom spikes - and it doesn't spike to 800ms with 80% loss for 44
in a row like check_icmp reported.
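
For anyone who wants to make this comparison repeatable, here is a minimal sketch in Python that parses the /bin/ping summary shown above and flags a plugin result that disagrees with it. The function names (parse_ping_summary, disagrees) and the thresholds (loss_margin, rta_factor) are hypothetical and chosen only for illustration. The check_icmp figures are passed in by hand, since the plugin's output format is not shown here.

```python
import re

# Summary lines copied from the /bin/ping run above.
PING_SUMMARY = """\
21 packets transmitted, 21 received, 0% packet loss, time 20027ms
rtt min/avg/max/mdev = 37.718/109.663/307.297/98.933 ms, pipe 2
"""


def parse_ping_summary(text):
    """Extract packet counts, loss and rtt figures from a ping summary."""
    loss = re.search(
        r"(\d+) packets transmitted, (\d+) received, "
        r"(\d+(?:\.\d+)?)% packet loss",
        text,
    )
    rtt = re.search(
        r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms",
        text,
    )
    if loss is None or rtt is None:
        raise ValueError("not a recognizable ping summary")
    return {
        "sent": int(loss.group(1)),
        "received": int(loss.group(2)),
        "loss_pct": float(loss.group(3)),
        "rtt_min_ms": float(rtt.group(1)),
        "rtt_avg_ms": float(rtt.group(2)),
        "rtt_max_ms": float(rtt.group(3)),
    }


def disagrees(ping_stats, plugin_loss_pct, plugin_rta_ms,
              loss_margin=20.0, rta_factor=3.0):
    """Return True if the plugin's figures are far off the ping figures.

    loss_margin (percentage points) and rta_factor (multiple of ping's
    average rtt) are assumed thresholds, not values taken from Nagios.
    """
    loss_off = abs(plugin_loss_pct - ping_stats["loss_pct"]) > loss_margin
    rta_off = plugin_rta_ms > rta_factor * ping_stats["rtt_avg_ms"]
    return loss_off or rta_off


if __name__ == "__main__":
    stats = parse_ping_summary(PING_SUMMARY)
    # 80% loss and 800 ms are the kind of values check_icmp reported.
    print(stats)
    print("mismatch:", disagrees(stats, 80.0, 800.0))
```

Running both tools from cron against the same host and logging only the mismatches would show whether the false reports cluster at particular times or under particular load.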
A -v -v -v log output from check_icmp at the same time when the ping was
I'll see if I get to dig into check_icmp.c myself, but I am not too sure
it's gonna happen since I'm loaded with work :/
Any ideas or hints to the problem?
Tel.:+49 (0) 2150.9153.226
Mobil:+49 (0) 173.5419665
mailto:SRunschke at abit.de