[Nagiosplug-help] Why isn't Nagios notifying about a trap

Ralph.Grothe at itdz-berlin.de Ralph.Grothe at itdz-berlin.de
Fri Jan 5 12:14:35 CET 2007


Hello again,

please excuse my cross-posting to two lists simultaneously,
but I'm about to despair.
I cannot find out on my own why Nagios isn't notifying about this
"trap service" that I defined
like this:

# Service for testing SMS notifications

define service {
    use                         generic-service
    service_description         sms_pkg_state
    servicegroups               mcsg_cluster
    host_name                   sms
    notification_options        c,r,u
    contact_groups              nagiosadmin,admin_mobile
    max_check_attempts          1
    is_volatile                 1
    active_checks_enabled       0
    passive_checks_enabled      1
    check_freshness             0
    check_period                never
    check_command               passive-check-pad
}


Because check_period and check_command appear to be mandatory in
a service definition,
but don't make any sense to me in this context,
I provided these two placeholders for this service just to
satisfy Nagios:


define command {
    command_name        passive-check-pad
    command_line        $USER1$/check_dummy 3 "won't do active checks"
}


define timeperiod {
    timeperiod_name     never
    alias               Void Timeperiod Definition
}



When I manually fail the monitored cluster package "sms" by
running "cmhaltpkg sms" on the cluster,
the trap is sent via send_nsca, which is called in the
customer_defined_halt_commands function
(in case you happen to know HP MC/ServiceGuard) within the
package's control script.
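
For reference, send_nsca reads one check result per line on standard
input, with the fields host name, service description, return code and
plugin output separated by tabs (its default delimiter). The control
script itself is not shown in this post, so the following Python sketch
only illustrates how such a line could be built; the function name
format_nsca_line is made up for illustration.

```python
def format_nsca_line(host, service, return_code, output):
    """Build one send_nsca input line for a passive service check result.

    Fields are tab-separated and the line is newline-terminated, which is
    the default input format that send_nsca expects on stdin.
    """
    return "\t".join([host, service, str(return_code), output]) + "\n"


# Values taken from the check result shown in the log excerpt of this post.
line = format_nsca_line(
    "sms",
    "sms_pkg_state",
    2,
    "CRITICAL - MC/SG Package sms halting on samoa",
)
print(repr(line))
```

Such a line would be piped into send_nsca together with the options that
name the NSCA server and the client configuration file.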

I can verify that the external command for the state change,
delivered by send_nsca, does indeed make it into
Nagios's command pipe, because it appears in nagios.log as
PROCESS_SERVICE_CHECK_RESULT
and looks to me to have the required semicolon-delimited
parameters.
(Between tests I also had a cat running on the Nagios command FIFO
and could verify that the
PROCESS_SERVICE_CHECK_RESULT appeared instantly after the sms
cluster package had gone down.
So as not to drain the FIFO that way, I only did that once or
twice, just to verify that
the external command would be caught.)
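
The semicolon-delimited parameters of PROCESS_SERVICE_CHECK_RESULT are
host name, service description, return code and plugin output, in that
order. As a rough way to double-check a logged command against that
layout, here is a small Python sketch; the names ServiceCheckResult and
parse_service_check_result are hypothetical helpers, not part of Nagios.

```python
from typing import NamedTuple


class ServiceCheckResult(NamedTuple):
    host_name: str
    service_description: str
    return_code: int
    plugin_output: str


def parse_service_check_result(command):
    """Split a PROCESS_SERVICE_CHECK_RESULT command into its four fields."""
    # Split at most 4 times so semicolons inside the plugin output survive.
    parts = command.split(";", 4)
    if len(parts) != 5 or parts[0] != "PROCESS_SERVICE_CHECK_RESULT":
        raise ValueError("not a well-formed PROCESS_SERVICE_CHECK_RESULT")
    _, host, service, code, output = parts
    return ServiceCheckResult(host, service, int(code), output)


# The command exactly as it appears in the nagios.log excerpt of this post.
logged = ("PROCESS_SERVICE_CHECK_RESULT;sms;sms_pkg_state;2;"
          "CRITICAL - MC/SG Package sms halting on samoa")
result = parse_service_check_result(logged)
print(result)
```

The host name and service description have to match the names in the
service definition exactly; here they are "sms" and "sms_pkg_state".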


$ tail -9 /opt/sw/nagios/var/nagios.log
[1167992561] Finished daemonizing... (New PID=95396)
[1167992627] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;sms;sms_pkg_state;2;CRITICAL - MC/SG Package sms halting on samoa
[1167992642] HOST ALERT: sms;DOWN;SOFT;1;123.123.123.123 is DOWN - rta: nan, lost 100%
[1167992653] HOST ALERT: sms;DOWN;SOFT;2;123.123.123.123 is DOWN - rta: nan, lost 100%
[1167992664] HOST ALERT: sms;DOWN;SOFT;3;123.123.123.123 is DOWN - rta: nan, lost 100%
[1167992675] HOST ALERT: sms;DOWN;SOFT;4;123.123.123.123 is DOWN - rta: nan, lost 100%
[1167992686] HOST ALERT: sms;DOWN;HARD;5;123.123.123.123 is DOWN - rta: nan, lost 100%
[1167992686] SERVICE ALERT: sms;sms_pkg_state;CRITICAL;HARD;1;CRITICAL - MC/SG Package sms halting on samoa
[1167992946] SERVICE ALERT: sms;icmp-host-alive;CRITICAL;HARD;1;CRITICAL - 123.123.123.123: rta nan, lost 100%


But why isn't the notification happening?
I have no clue.

The imported service template looks like this:

define service {
    name                                generic-service
    is_volatile                         0
    max_check_attempts                  5
    normal_check_interval               5
    retry_check_interval                3
    check_period                        24x7
    active_checks_enabled               1
    passive_checks_enabled              0
    parallelize_check                   1
    obsess_over_service                 0
    check_freshness                     0
    event_handler                       notify-by-email
    event_handler_enabled               0
    flap_detection_enabled              0
    process_perf_data                   0
    retain_status_information           1
    retain_nonstatus_information        1
    notification_interval               30
    notification_period                 24x7
    notification_options                w,u,c,r
    notifications_enabled               1
    contact_groups                      nagiosadmin
    register                            0
}
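
To make it easier to see which values actually apply to sms_pkg_state,
the following Python sketch merges the template with the service
definition, assuming plain Nagios inheritance in which directives set in
the service itself override those from the template. Only a subset of
the template's directives is copied here, and the dictionaries are an
illustration, not how Nagios stores objects.

```python
# Subset of the generic-service template (values copied from this post).
template = {
    "is_volatile": "0",
    "max_check_attempts": "5",
    "check_period": "24x7",
    "active_checks_enabled": "1",
    "passive_checks_enabled": "0",
    "notification_interval": "30",
    "notification_period": "24x7",
    "notification_options": "w,u,c,r",
    "notifications_enabled": "1",
    "contact_groups": "nagiosadmin",
}

# Directives set directly in the sms_pkg_state service definition.
service = {
    "service_description": "sms_pkg_state",
    "host_name": "sms",
    "notification_options": "c,r,u",
    "contact_groups": "nagiosadmin,admin_mobile",
    "max_check_attempts": "1",
    "is_volatile": "1",
    "active_checks_enabled": "0",
    "passive_checks_enabled": "1",
    "check_period": "never",
    "check_command": "passive-check-pad",
}

# Local values win over inherited ones.
effective = {**template, **service}

for key in sorted(effective):
    print(f"{key:28}{effective[key]}")
```

Under that assumption the service keeps notifications_enabled 1,
notification_period 24x7 and notification_interval 30 from the template,
while notification_options and contact_groups come from the service.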


My contact template, which is used by the contacts in the
contact groups to be notified
(nagiosadmin and admin_mobile), enables all
notifications by default.

define contact {
    name                                generic-contact
    register                            0
    contact_name                        grothe_ralph
    alias                               Must be overridden
    contactgroups                       unix_admins 
    host_notification_period            24x7
    service_notification_period         24x7
    host_notification_options           d,u,r,f
    service_notification_options        w,u,c,r,f
    host_notification_commands          host-notify-by-email
    service_notification_commands       notify-by-email
    email                               nagios
}



Thank you for your patience

Ralph



