Thursday 22 January 2009

Allowing for a timeout on check_nrpe

check_nrpe lets you set a timeout with the -t option. The default is 10 seconds, which is often not enough for slow checks. The usual check_nrpe command definition passes only the remote command name, so there is no way to specify the timeout when configuring a service.

example
=======

define service{
        use                     generic-service
        # Hostname of remote system
        host_name               mynode.mydomain.com
        service_description     Load
        is_volatile             0
        check_period            24x7
        max_check_attempts      3
        normal_check_interval   5
        retry_check_interval    1
        # Change to your contact group
        contact_groups          admins
        notification_options    w,u,c,r
        notification_interval   10
        notification_period     24x7
        check_command           check_nrpe!check_load
        }

To get round this problem, simply add a new command definition to commands.cfg, below the existing check_nrpe definition:

define command{
        command_name    mycheck_nrpe
        command_line    /usr/local/nagios/libexec/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -t $ARG2$
        }
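
If your installation defines the $USER1$ macro in resource.cfg, the same command can be written without the hard-coded plugin path. This is only a sketch: it assumes $USER1$ resolves to /usr/local/nagios/libexec on your system, which you should check before using it in place of the definition above.

```
define command{
        command_name    mycheck_nrpe
        # Assumption: $USER1$ is set in resource.cfg to the plugin directory,
        # e.g. /usr/local/nagios/libexec
        # $ARG1$ = remote NRPE command name, $ARG2$ = timeout in seconds
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -t $ARG2$
        }
```

Use one definition or the other, not both, since each command_name should be defined only once.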

mycheck_nrpe accepts a second argument in the service definition's check_command, which is passed to check_nrpe as the timeout in seconds.

example
=======

define service{
        use                     generic-service
        # Hostname of remote system
        host_name               mynode.mydomain.com
        service_description     Load
        is_volatile             0
        check_period            24x7
        max_check_attempts      3
        normal_check_interval   5
        retry_check_interval    1
        # Change to your contact group
        contact_groups          admins
        notification_options    w,u,c,r
        notification_interval   10
        notification_period     24x7
        check_command           mycheck_nrpe!check_load!30
        }

The above example specifies a 30-second timeout.

Problem solved!