Nagios Service for Hostgroup without any Nodes


#1

Hello,

I’m a bit hung up on a problem with the Nagios cookbook when using host groups without any nodes. Here’s the situation:

I have a loadbalancer-snowflake role. This is so I can monitor the loadbalancers without rebuilding them (which I will, but not for a few months.)

In the production environment, I have a few of these nodes. However, in the staging environment, I don’t have any of them.

The effect of this is it creates the host group block:

    define hostgroup {
        hostgroup_name  loadbalancer-snowflake
        alias           loadbalancer-snowflake
    }

And the command block (this is on the server, since it is performed remotely):

    define command {
        command_name  check_loadbalancer-snowflake-http
        command_line  $USER1$/check_http -I $HOSTADDRESS$ -e "HTTP/1.1 200" -w 3 -c 5
    }

However, when it adds the service block:

    define service {
        service_description  loadbalancer-snowflake-http
        hostgroup_name       loadbalancer-snowflake
        check_command        check_loadbalancer-snowflake-http
        use                  default-service
    }

Nagios complains: “Could not expand hostgroups and/or hosts specified in service”. This is because the host group is empty.

Has anyone experienced this before? Is there a good solution to it? I considered altering the template to not output the service if no hosts exist for it, but then found this monstrosity: https://github.com/opscode-cookbooks/nagios/blob/master/templates/default/services.cfg.erb#L17

I’d be willing to submit a PR to alter this behavior, but it looks like this might require a semi-substantial change. Any feedback to that point?
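To make the idea concrete, here is a minimal sketch of the filtering I have in mind, written as plain Ruby rather than against the cookbook's actual template. The data structures, the `webserver` group, and the `services_with_hosts` helper are all made up for illustration; the real cookbook builds its host groups and services differently.

```ruby
# Hypothetical data: host group name => member host names, roughly what a
# node search might return in the staging environment.
hostgroups = {
  'loadbalancer-snowflake' => [],            # no nodes in staging
  'webserver'              => %w[web1 web2]  # invented for the example
}

# Hypothetical service definitions, each tied to a single host group.
services = [
  { 'description' => 'loadbalancer-snowflake-http', 'hostgroup_name' => 'loadbalancer-snowflake' },
  { 'description' => 'webserver-http',              'hostgroup_name' => 'webserver' }
]

# Keep only the services whose host group has at least one member.
# Unknown host groups are treated as empty.
def services_with_hosts(services, hostgroups)
  services.reject do |svc|
    hostgroups.fetch(svc['hostgroup_name'], []).empty?
  end
end

renderable = services_with_hosts(services, hostgroups)
renderable.each { |svc| puts svc['description'] }
```

With something like this, only the services in `renderable` would be handed to the template, so the service for the empty host group would never be written out.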

This cookbook could use refactoring :)

Graham


#2

I’m sure there is a better solution, but couldn’t you just not apply the loadbalancer role in the “staging” environment? When you are ready to implement a load balancer in staging, you could always do something like knife node edit XXXX, add that role to the node, and run chef-client (unless you have a ton of these servers).

Thanks,

Nikhil Shah

In reply to Graham Christensen (graham@grahamc.com), Sep 19, 2013, 6:13 PM.


#3

There are two solutions for this problem:

First off, Nagios has a nice bug regarding empty host groups in older versions. If you run Nagios prior to version 3.4.0, the config will fail to load with an empty host group. If you’re compiling from source or using a distribution that ships 3.4.0 or later, you can enable a flag that was added in that release: allow_empty_hostgroup_assignment.

A while back I added a check to the cookbook to see if the user was installing via source and, if so, to enable the flag automatically. That was before any distros were shipping version 3.4.0+ in packages. So if you install via source, it’ll just work. If not, for now you can enable it manually by editing the nagios.cfg.erb template and taking out the check for the source install method. I’ll file a bug to allow turning that on via an attribute.
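As a rough illustration of what an attribute-driven toggle could look like, here is a sketch using Ruby's standalone ERB library rather than Chef's template resource. The `allow_empty` variable stands in for a cookbook attribute that does not exist yet, so treat the name as hypothetical; only the `allow_empty_hostgroup_assignment=1` line itself is real Nagios configuration.

```ruby
require 'erb'

# Template fragment for nagios.cfg. The `allow_empty` variable is a
# hypothetical stand-in for a future cookbook attribute.
TEMPLATE = <<~ERB_SRC
  <% if allow_empty -%>
  allow_empty_hostgroup_assignment=1
  <% end -%>
ERB_SRC

# Render the fragment; emits the flag only when allow_empty is truthy.
def render_flag(allow_empty)
  ERB.new(TEMPLATE, trim_mode: '-').result_with_hash(allow_empty: allow_empty)
end

puts render_flag(true)
```

Rendering with `false` produces an empty string, which keeps the flag out of the config on Nagios versions that would not recognize it.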

My second recommendation would be looking at adding hosts via the nagios_unmanagedhosts data bag. I added that so that I could manage hosts that were not yet migrated into Chef. You create a data bag item per host and assign it to one or more host groups. You could add your old load balancers using this method. There’s an example in the readme.
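For a rough idea of the shape of such an item, the sketch below builds one in Ruby and prints it as JSON. The host name and address are invented, and the field names (`id`, `address`, `hostgroups`) are my recollection of the readme example, so check them against the readme for your cookbook version before relying on them.

```ruby
require 'json'

# One data bag item per unmanaged host. All values are placeholders, and the
# field names are assumed from the readme example rather than verified.
item = {
  'id'         => 'loadbalancer1',             # hypothetical item/host name
  'address'    => 'lb1.example.com',           # hypothetical address
  'hostgroups' => ['loadbalancer-snowflake']   # host group(s) to join
}

item_json = JSON.pretty_generate(item)
puts item_json
```

The printed JSON could be saved to a file and loaded into the nagios_unmanagedhosts data bag with knife.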

Let me know if you have any other issues.

-Tim

Tim Smith - Systems Engineer
m: +1 707.738.8132

In reply to Graham Christensen (graham@grahamc.com), Sep 19, 2013, 3:13 PM.