[Nagiosplug-help] nrpe spams my logs

Andreas Ericsson ae at op5.se
Sun Nov 13 06:24:48 CET 2005


Jeroen Demeyer wrote:
> Hello list,
> 
> We installed nagios+nrpe on a cluster to monitor the health of our
> nodes.  Nagios runs on a server, and the diskless nodes run nrpe.
> This works fine, however, sometimes nrpe starts spamming the syslog:
> 
> Nov 13 13:46:51 node6 nrpe[10671]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10673]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10675]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10677]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10679]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10681]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10683]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10685]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10687]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10689]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10691]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10693]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10695]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10697]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10699]: Network server accept failure (22: Invalid argument)
> (this goes on for a long time, >100 times per second)
> 
> I suppose this is a bug in nrpe?

Yes and no. nrpe shouldn't retry the accept(2) syscall this often when 
it keeps failing. The root problem is elsewhere, though: somewhere it 
fails to obtain a socket (or has its socket destroyed by the kernel 
somehow), so that by the time it calls accept(2) the descriptor is no 
longer a usable listening socket.

Hope that made sense. It did in my head, but it doesn't look that way 
now that it's down in writing.
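
To make the point about retrying more concrete, here is a rough sketch 
of an accept loop that backs off instead of spinning. It is written in 
Python for brevity and is not nrpe's actual code; the names 
accept_with_backoff and FATAL_ERRNOS, and all the default values, are 
made up for illustration. The assumption is that errors such as EINVAL, 
EBADF and ENOTSOCK mean the listening descriptor itself is unusable, so 
retrying on the same descriptor is pointless, while other errors may be 
transient and are worth a delayed retry.

```python
import errno
import time

# Assumption: these errno values mean the listening descriptor itself is
# unusable, so calling accept() on it again cannot succeed.
FATAL_ERRNOS = {errno.EBADF, errno.EINVAL, errno.ENOTSOCK}


def accept_with_backoff(accept_fn, max_failures=5, base_delay=0.1,
                        max_delay=5.0, sleep=time.sleep, log=print):
    """Call accept_fn() until it returns, backing off between failures.

    accept_fn is any callable that behaves like socket.accept(): it
    returns a connection or raises OSError. The error is re-raised
    immediately for fatal errno values, or after max_failures
    consecutive failures, so the caller can recreate the socket or exit
    instead of logging the same message forever.
    """
    delay = base_delay
    for attempt in range(1, max_failures + 1):
        try:
            return accept_fn()
        except OSError as exc:
            log("accept failure (%s: %s), attempt %d"
                % (exc.errno, exc.strerror, attempt))
            if exc.errno in FATAL_ERRNOS or attempt == max_failures:
                raise
            sleep(delay)
            # Double the wait each time, up to max_delay.
            delay = min(delay * 2, max_delay)
```

With a real listening socket this would be called as 
accept_with_backoff(listener.accept). An error 22 like the one in your 
log would then be logged once and passed up to the caller rather than 
repeated more than 100 times per second.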

>  Also, I shut down nagios on the server
> but nrpe keeps giving these error messages, even hours after nagios was
> stopped.  Any ideas how to fix this problem?
> 

Run nrpe under strace and tee the output to a file. If you manage to 
reproduce the error while tracing, send me the output (gzipped, 
preferably) and I'll see what's going on.
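
If the trace gets large, a small script can summarise which syscalls 
fail before you send it. The sketch below is only an illustration: the 
function name count_failed_syscalls is made up, and it assumes the 
usual strace line layout, "name(args) = -1 ERRNAME (description)", 
optionally prefixed with "[pid N]" or a bare PID when child processes 
are followed.

```python
import re
from collections import Counter

# Assumed strace line layout, e.g.:
#   accept(4, 0xbffff6a0, [16]) = -1 EINVAL (Invalid argument)
# An optional "[pid 1234] " or "1234 " prefix is allowed in front.
FAILED_CALL = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\(.*\)\s+=\s+-1\s+(E[A-Z0-9]+)"
)


def count_failed_syscalls(lines):
    """Count failing syscalls in strace output, keyed by (name, errno name)."""
    counts = Counter()
    for line in lines:
        match = FAILED_CALL.match(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

Passing it an open trace file, for example 
count_failed_syscalls(open(path)) with path pointing at the saved 
output, returns a Counter. A large count for ("accept", "EINVAL") 
would match the messages in your syslog, and the lines just before the 
first such failure are the interesting ones.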

> This is on a recent Gentoo system with nagios-1.2 and nagios-nrpe-2.0-r1,
> 2.4.26 openMosix kernel.
> 

You could try installing nrpe 2.2. It's available at 
http://oss.op5.se/nagios/nrpe-2.2.tar.gz

Let me know if it fixes this particular problem for you.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231



