[Nagiosplug-help] How must I define this in checkcommands.cfg and services.cfg?

Ralph.Grothe at itdz-berlin.de Ralph.Grothe at itdz-berlin.de
Wed Aug 3 02:39:37 CEST 2005


Hello Ben,

First, many thanks for your quick help!

Maybe things are a bit convoluted?

I wish to set up a kind of generic service for all Service Groups
(SG) of a Veritas Cluster Server (VCS).
One such cluster alone has over 20 SGs!
I simply wanted to save myself the repetitive copying and pasting of
definition blocks that differ from one another only in their
SG name.

Every SG also has its own IP Addr. 

At first I thought I could manage them by putting them under the
hood of a servicegroup definition.
But then I found no way to link an SG name to its IP.
Thus I ended up defining each SG as a host in its own
right and grouping them into hostgroups, one per VCS node.
Each hostgroup contains those hosts (i.e. SGs)
that run on the respective node during normal cluster operation
(i.e. when no failover or switchover has happened).

This concept works so far, at least for the lower-level check_icmp
of each SG's IP, which of course is a prerequisite for the whole
SG to be ONLINE (I think I ought to define service
dependencies later as well).

Because every later addition of a new SG already requires a new
host definition,
I wanted at least to avoid having to define a new check
command for it as well
(the VCS and VxVM maintenance alone is already involved enough).

I hope you got the idea behind my contrived setup.

Here are the (I hope all) relevant parts of my current definitions.


From hostgroups.cfg, the definition of the HG for node nemesis
containing its SGs



# template for EVO SGs
#
define hostgroup { 
    hostgroup_name      nemesis_vcs_sgs
    alias               nemesis
    members             evo1,evo2,evo3,evo10,evo11,evo13,evo15,evo17,evo19,evo812
} 



From hosts.cfg one such member
(n.b. IPs bogus here for daft paranoia reasons)


define host {
    use                                 generic-host
    host_name                           evo1
    alias                               EVO VCS SG EVO1
    address                             10.22.120.13
    hostgroups                          evo_cluster,non_fwalled_hosts,nemesis_vcs_sgs
    contact_groups                      evoadmin
}



From the same cfg file the host definition of cluster node
nemesis


define host {
    use                                 generic-host
    host_name                           nemesis
    alias                               EVO VCS Node NEMESIS
    address                             10.22.120.11
    hostgroups                          evo_cluster,non_fwalled_hosts,nemesis_vcs_sgs
    contact_groups                      evoadmin
}



From checkcommands.cfg my check-nrpe definition
(n.b. it already works for other NRPE checks on other hosts, e.g.
Weblogic servers)



define command {
    command_name        check-nrpe
    command_line        $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}



Here is the critical command definition from the same file, the one that
doesn't work



define command {
    command_name        check-evo-sg-online
    command_line        check-nrpe!check_vcs_sg\!$HOSTNAME$\!$HOSTGROUPALIAS$
}



As you can see I'm relying on the correct substitutions for
$HOSTNAME$ and $HOSTGROUPALIAS$

Also, following a suggestion from Sharon (see my other
response), I backslash-escaped the bang,
as he mentioned that it is also part of the illegal metacharacter set
(a caveat for shell expansion?).
But this brought no change.



Finally from my services.cfg here's the critical "generic"
service definition


# EVO Services

define service {
    use                         generic-service
    service_description         nemesis-sg-online
    hostgroup_name              nemesis_vcs_sgs
    check_command               check-evo-sg-online
    contact_groups              evoadmin
}



Ah, not to forget the generic-service template, which only
contains default settings to be inherited, without override, by most
services



define service {
    name                                generic-service
    service_description                 Service_Class_Definition
    is_volatile                         0
    max_check_attempts                  5
    normal_check_interval               5
    retry_check_interval                3
    check_period                        24x7
    active_checks_enabled               1
    passive_checks_enabled              0
    parallelize_check                   1
    obsess_over_service                 0
    check_freshness                     0
    event_handler                       notify-by-email
    flap_detection_enabled              0
    process_perf_data                   0
    retain_status_information           1
    retain_nonstatus_information        1
    notification_interval               60
    notification_period                 24x7
    notification_options                w,u,c,r
    notifications_enabled               1
    contact_groups                      unixadmin
    register                            0
}





When I run Nagios with those settings, I now get this kind of
error

[nagios@daisy:~/etc]
$ tail -1 /var/opt/nagios/log/nagios.log 
[1123061435] Warning: Return code of 127 for check of service
'nemesis-sg-online' on host 'evo3' was out of bounds. Make sure
the plugin you're trying to run actually exists.
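
For what it's worth, return code 127 is what a shell reports when it
cannot find the command it was asked to run. Nagios hands command_line
to the shell as-is, and the "command!arg" syntax is interpreted in
directives such as check_command, not inside command_line, so
"check-nrpe!check_vcs_sg..." is probably being looked up as a program
name. An untested sketch of a definition that calls check_nrpe directly
instead follows. It assumes that the NRPE daemon on the node was built
with command-argument support and has dont_blame_nrpe=1 set, that
check_vcs_sg takes the SG name and the node name in that order, and that
$HOSTGROUPALIAS$ expands to "nemesis" for these hosts. The plugin path is
a placeholder.

# checkcommands.cfg: call check_nrpe directly, passing arguments with -a
define command {
    command_name        check-evo-sg-online
    command_line        $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_vcs_sg -a $HOSTNAME$ $HOSTGROUPALIAS$
}

# nrpe.cfg on the cluster node (placeholder path, assumed argument order)
dont_blame_nrpe=1
command[check_vcs_sg]=/path/to/check_vcs_sg $ARG1$ $ARG2$

With this variant the service definition could keep
"check_command check-evo-sg-online" unchanged, since the macros are
expanded per host at check time.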



I hope I haven't bothered you too much with my configuration
settings.

Many thanks for your patience!

Ralph




> -----Original Message-----
> From: Ben O'Hara [mailto:bohara at gmail.com]
> Sent: Wednesday, August 03, 2005 10:10 AM
> To: Ralph.Grothe at itdz-berlin.de
> Cc: nagiosplug-help at lists.sourceforge.net;
> nagios-users at lists.sourceforge.net
> Subject: Re: [Nagiosplug-help] How must I define this in
> checkcommands.cfg and services.cfg?
> 
> 
> On 8/3/05, Ben O'Hara <bohara at gmail.com> wrote:
> > 
> > and then add the service into etc/services.cfg
> > 
> > define service{
> >         use                             generic-service
> >         host_name                       nemesis
> >         service_description             EVO
> >         is_volatile                     0
> >         check_period                    24x7
> >         max_check_attempts              5
> >         normal_check_interval           3
> >         retry_check_interval            1
> >         contact_groups                  admins
> >         notification_interval           120
> >         notification_period             24x7
> >         notification_options            w,u,c,r
> >         check_command                   check_nrpe!check_vcs_sg\!evo1\!nemesis
> >         }
> >
> 
> In fact, the check_command is probably wrong; $ARG1$ is currently being
> set to "check_vcs_sg".
> 
> I'd suggest adding multiple checks to nrpe.cfg if you want to check
> evo1, evo2 etc.? (I'm not sure what you're checking!)
> 
> You can then just use
> 
>  check_command                   check_nrpe!check_vcs_sg_evo1
>  check_command                   check_nrpe!check_vcs_sg_evo2
> 
> as your check_commands in services.cfg
> 
> Hope that makes sense!
> 
> Ben
> 
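
A sketch of what the per-SG entries suggested in the quoted reply might
look like in nrpe.cfg on the cluster node. The plugin path is a
placeholder, and the argument order (SG name, then node name) is an
assumption taken from the check_command quoted above.

# nrpe.cfg: one fixed command per Service Group (hypothetical path)
command[check_vcs_sg_evo1]=/path/to/check_vcs_sg evo1 nemesis
command[check_vcs_sg_evo2]=/path/to/check_vcs_sg evo2 nemesis

This needs no argument passing over NRPE, at the cost of one nrpe.cfg
line per SG.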



