[Nagiosplug-help] Nagios crashes badly and takes out both machines!
jon.soong at imvs.sa.gov.au
Thu Apr 29 00:12:02 CEST 2004
I'm looking for some help.
Over the last week my mail server and the machine monitoring it with
Nagios have crashed 3 times, both at the same time.
I'm not sure if it is the Nagios machine crashing and taking my mail
server with it somehow or the other way around.
In both situations I have seen increased load on my mail server, to the
point of nrpe sending me a socket timeout warning. Shortly after this
the machines become unusable and a hard reboot is the only way to fix it.
When both machines crash (mailserver=Red Hat 9, nagios=Fedora), I've gone
to the console on both machines and they are both filled with messages
saying "status=0". This is on BOTH machines.
I'm running nrpe on the mailserver checking load, number of processes,
disk space etc. The only anomalous thing is that I run my own plugin,
called check_ps, which scans the 'ps' output for a given process (just so I
know postfix is actually running!).
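
For illustration only, a check_ps-style plugin might look roughly like the
sketch below. The actual plugin is not shown in this message, so everything
here is an assumption: the function names (count_matches, evaluate, main), the
use of `ps -eo comm=`, the exact-name matching, and the 10 second timeout are
all hypothetical. The only fixed convention it relies on is the standard
Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
#!/usr/bin/env python3
"""Hypothetical check_ps-style plugin: is a named process running?"""
import subprocess
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def count_matches(ps_output, name):
    """Count lines of `ps -eo comm=` output whose command name equals `name`."""
    return sum(1 for line in ps_output.splitlines() if line.strip() == name)


def evaluate(ps_output, name):
    """Return (exit_code, status_line) for the given ps output."""
    found = count_matches(ps_output, name)
    if found > 0:
        return OK, f"PS OK - {found} '{name}' process(es) running"
    return CRITICAL, f"PS CRITICAL - no '{name}' process found"


def main(argv):
    """Run ps once, print one status line, and return the exit code."""
    if len(argv) != 2:
        print("PS UNKNOWN - usage: check_ps <process-name>")
        return UNKNOWN
    try:
        # Assumption: a timeout is used so that a hung `ps` on a heavily
        # loaded machine does not leave check processes piling up.
        result = subprocess.run(
            ["ps", "-eo", "comm="],
            capture_output=True, text=True, timeout=10, check=True,
        )
    except (OSError, subprocess.SubprocessError) as exc:
        print(f"PS UNKNOWN - could not run ps: {exc}")
        return UNKNOWN
    code, message = evaluate(result.stdout, argv[1])
    print(message)
    return code
```

To use this as a plugin, the file would end with `sys.exit(main(sys.argv))`
so that nrpe sees the exit code. Matching the exact command name avoids the
check matching its own command line. Note that on many installs the
long-running Postfix daemon shows up in 'ps' as `master` rather than
`postfix`, so the name to pass should be verified on the machine in question.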
I was wondering if anyone could confirm whether or not it is Nagios that
is crashing my machines?
Institute of Medical and Veterinary Science (IMVS)
Email: jon.soong at imvs.sa.gov.au
Web : http://www.imvs.sa.gov.au
Tel : +61 8 82223095
Fax : +61 8 82223147