[Nagiosplug-help] check_http and missing DNS fallback

Steffen Poulsen sp at tdchosting.dk
Tue Mar 31 10:25:01 CEST 2009


Hi again,

I have now added the following to /etc/resolv.conf, which helps a bit:

options timeout:1 attempts:1
nameserver <ns1>
nameserver <ns2>

With these settings the plugin spends only about 3 seconds on the DNS fallback when the primary name server is down. In our case this will typically make the checks raise a WARNING, but it avoids CRITICAL alerts.
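
To confirm which resolver settings are actually in effect on a host, the file can be parsed with a few lines of Python. This is only a sketch: `parse_resolv_conf` is a hypothetical helper, not part of nagios-plugins, the addresses are documentation placeholders, and the defaults of 5 seconds and 2 attempts are the ones documented in resolv.conf(5), assuming the glibc resolver. It reports the configured values only and does not try to predict the exact fallback delay.

```python
def parse_resolv_conf(text):
    """Return nameservers, timeout and attempts from resolv.conf content."""
    # Defaults as documented in resolv.conf(5); assumes the glibc resolver.
    settings = {"nameservers": [], "timeout": 5, "attempts": 2}
    for raw in text.splitlines():
        # Drop anything after '#' or ';'. The man page only describes
        # comments starting in the first column; this sketch is more lenient.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        keyword, args = fields[0], fields[1:]
        if keyword == "nameserver" and args:
            settings["nameservers"].append(args[0])
        elif keyword == "options":
            for opt in args:
                name, _, value = opt.partition(":")
                if name in ("timeout", "attempts") and value.isdigit():
                    settings[name] = int(value)
    return settings


# Placeholder addresses (192.0.2.0/24 is reserved for documentation).
example = """\
options timeout:1 attempts:1
nameserver 192.0.2.1
nameserver 192.0.2.2
"""
print(parse_resolv_conf(example))
```

On a real host the same function could be fed the content of /etc/resolv.conf instead of the example string.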

Best regards,
Steffen

-----Original Message-----
From: Steffen Poulsen [mailto:sp at tdchosting.dk] 
Sent: 31. marts 2009 10:09
To: Nagios Plugin Help List
Subject: Re: [Nagiosplug-help] check_http and missing DNS fallback

Hi Andy,

Thanks for your ideas and feedback.

DROPping both nameservers unfortunately still gives me the exact same error: "CRITICAL - Socket timeout after 10 seconds".

If instead I flush /etc/resolv.conf, I get this:

[root at nag-test nagios-plugins-1.4.13]# ./plugins/check_http -H <hostname> -u / -f warning -p 80 -w 3 -c 5 -t 10
Temporary failure in name resolution
Unable to open TCP socket

In your example below, you actually get a reply from the DNS server (a failed lookup of blah.blah.blah.com). In my example I don't even see traffic to the secondary DNS server (checked with tcpdump). A standard telnet <hostname> 80 sends the DNS query as expected, after the timeout.

Whoops .... but only after 20 seconds.

And raising the warning and critical thresholds (-w and -c) in the plugin actually gives the same picture:

[root at nag-test nagios-plugins-1.4.13]# ./plugins/check_http -H <hostname> -u / -f warning -p 80 -w 40 -c 50 -t 10
HTTP OK HTTP/1.1 200 OK - 356 bytes in 20.026 seconds |time=20.026142s;40.000000;50.000000;0.000000 size=356B;;;0

Of course it is somewhat problematic that the time spent on the DNS lookup part of the check cannot be given its own thresholds in the plugin/check setup, but that is probably not easy to do anything about(?).
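
One way to put a separate threshold on the lookup is to time name resolution on its own, outside check_http. The following is a minimal sketch, not an existing plugin: `check_dns_time` and its `resolve` parameter are hypothetical names, the thresholds are in seconds, and the return values follow the usual Nagios convention (0 = OK, 1 = WARNING, 2 = CRITICAL). It goes through the system resolver via `socket.getaddrinfo`, so it is subject to the same /etc/resolv.conf behaviour as the plugin.

```python
import socket
import time

OK, WARNING, CRITICAL = 0, 1, 2  # usual Nagios plugin return codes


def check_dns_time(hostname, warn, crit, resolve=socket.getaddrinfo):
    """Time name resolution alone and map the result to a Nagios status.

    `resolve` is injectable so the logic can be exercised without a network.
    """
    start = time.monotonic()
    try:
        resolve(hostname, 80)
    except socket.gaierror as exc:
        # The lookup failed outright (for example no name server answered).
        return CRITICAL, "DNS CRITICAL - %s" % exc
    elapsed = time.monotonic() - start
    if elapsed >= crit:
        status, label = CRITICAL, "CRITICAL"
    elif elapsed >= warn:
        status, label = WARNING, "WARNING"
    else:
        status, label = OK, "OK"
    message = "DNS %s - %s resolved in %.3f seconds" % (label, hostname, elapsed)
    return status, message
```

A wrapper script could call `check_dns_time("<hostname>", 1.0, 3.0)`, print the message and exit with the status, so that a slow DNS fallback shows up as its own service rather than inside the HTTP response time.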

So the second-best option is probably to lower the DNS fallback timeout, either system-wide or for the nagios user only.

For the default warning/critical thresholds to still make sense, it would be great if the DNS failover delay could be reduced from the default 20 seconds to perhaps 500 ms or 1 s.

Would anybody happen to know how to achieve this on a RHEL installation?

Best regards,
Steffen Poulsen


-----Original Message-----
From: Andy Shellam [mailto:andy-lists at networkmail.eu] 
Sent: 30. marts 2009 19:23
To: Nagios Plugin Help List
Subject: Re: [Nagiosplug-help] check_http and missing DNS fallback

Steffen,

What's the configuration of that machine's /etc/resolv.conf? Plugin 
maintainers correct me if I'm wrong, but don't they just use the 
machine's gethostbyname or getservbyname functions? In which case it'd 
be down to your machine's resolver library.

However, by the look of the error message, it does appear that check_http is 
finding a valid IP address for <hostname> but is then unable to connect to 
it. If you give it an invalid hostname (i.e. one that cannot be resolved 
via DNS), the error message is different:

# /opt/nmail/nagios/libexec/check_http -H blah.blah.blah.com -u / -f warning -p 80 -w 3 -c 5 -t 10
Name or service not known
HTTP CRITICAL - Unable to open TCP socket
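
The two resolver messages seen in this thread ("Name or service not known" and "Temporary failure in name resolution") correspond to different getaddrinfo error codes, and they can be told apart without running the plugin. A rough sketch, assuming Python's `socket` module on a Unix-like system; `describe_lookup` and its `resolve` parameter are hypothetical names used only for illustration:

```python
import socket


def describe_lookup(hostname, resolve=socket.getaddrinfo):
    """Resolve hostname and report which kind of resolver result occurred."""
    try:
        results = resolve(hostname, 80, 0, socket.SOCK_STREAM)
    except socket.gaierror as exc:
        if exc.errno == socket.EAI_AGAIN:
            # Reported as "Temporary failure in name resolution".
            return "temporary failure: %s" % exc
        if exc.errno == socket.EAI_NONAME:
            # Reported as "Name or service not known".
            return "name not known: %s" % exc
        return "other resolver error: %s" % exc
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    addresses = sorted({info[4][0] for info in results})
    return "resolved to " + ", ".join(addresses)
```

Calling `describe_lookup("<hostname>")` while a name server is blocked shows whether the resolver gives up with a temporary failure or eventually returns an address.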

Try also adding a DROP rule to iptables at the same time for your 
secondary DNS, and see if the error message you get back is different.

Andy

Steffen Poulsen wrote:
>
> Hi,
>
> Is there any way to make check_http fall back to the secondary DNS 
> server (like other services on the machine do)?
>
> A little test reveals that apparently only the primary DNS server is 
> in use, the second and following name servers are ignored.
>
> --
>
> [root at nag-test nagios-plugins-1.4.13]# ./plugins/check_http -H <hostname> -u / -f warning -p 80 -w 3 -c 5 -t 10
>
> HTTP OK HTTP/1.1 200 OK - 356 bytes in 0.013 seconds |time=0.012519s;3.000000;5.000000;0.000000 size=356B;;;0
>
> [root at nag-test nagios-plugins-1.4.13]# iptables -A INPUT -s <primary dns> -j DROP
>
> [root at nag-test nagios-plugins-1.4.13]# ./plugins/check_http -H <hostname> -u / -f warning -p 80 -w 3 -c 5 -t 10
>
> CRITICAL - Socket timeout after 10 seconds
>
> --
>
> I know there is a "-I" parameter for passing in the ip directly to the 
> check, but this is not the behavior I'm after - I still want the check 
> to be dynamic with regards to the IP address for the service.
>
> I hope I'm just missing the obvious?
>
> Best regards,
>
> Steffen
>
> _______________________________________________
> Nagiosplug-help mailing list
> Nagiosplug-help at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagiosplug-help
> ::: Please include plugins version (-v) and OS when reporting any issue. 
> ::: Messages without supporting info will risk being sent to /dev/null
>   