Overriding Search

We had a long discussion today about a pain point we’ve had with Chef thus
far. We’d like to rely on the search functionality that Chef provides so
things will automatically wire up with each other in a given environment.
But during cookbook development we’d like to have more flexibility with the
search functionality.

We decided that we’d like to have a bit more control over the
search functionality and be able to provide overrides while we’re working
on something. Has anyone already written a library for doing something like
that?


I wrote a quick library (lives in elasticsearch/libraries) to wrap search for discovering nodes in our applications and haproxy cookbooks that rely on it.

    module ElasticSearch
      class Discovery
        attr_accessor :additional_conditions, :dc, :datacenter_restriction, :logger
        attr_reader :environment

        def initialize(recipe_context, opts = {})
          # We need to access chef search methods and the easiest way to get them is through
          # passing self from the chef recipe referencing this class
          @recipe_context = recipe_context
          @environment = opts.fetch(:environment)
          @dc = opts.fetch(:dc, 'default-datacenter-name')
          # @conditions are non-overridable attributes used in the search like environment
          # and es-cluster role
          @conditions = opts.fetch(:conditions, {
            chef_environment:           self.environment,
            role:                       opts.fetch(:backend_role, 'elasticsearch-cluster'),
            elasticsearch_cluster_name: opts.fetch(:cluster_name)
          })
          @additional_conditions = opts.fetch(:additional_conditions, Hash.new)
          @logger = opts.fetch(:logger, Chef::Log)
          @logger.info "Initialized discovery in #{self.environment}/#{self.dc} for #{@conditions[:role]}"
        end

        def conditions
          # Merge @conditions last so the additional conditions cannot override them
          merged_conditions = self.additional_conditions.merge(@conditions)
          @logger.info "Search conditions are #{merged_conditions.flatten}"
          merged_conditions
        end

        def query
          # Build from the merged conditions so additional_conditions take effect
          conditions.inject([]) do |query, condition|
            query << "(#{condition.first}:#{condition.last})"
          end.join(' AND ')
        end

        def search(opts = {})
          sorted = opts.fetch(:sorted, false)
          @logger.info "Searching with #{self.query}"
          @results = @recipe_context.search(:node, self.query)
          sorted ? sorted_results : @results
        end

        def restrict_to_datacenter!
          @logger.info "Restricting search to #{@dc}"
          @conditions[:dc] = @dc
        end

        def restricted_to_datacenter?
          @conditions[:dc] == @dc
        end

        def sorted_results
          return [] unless @results
          @results.sort { |node1, node2| node1[:fqdn] <=> node2[:fqdn] }
        end

        def to_a
          # Returns an empty array if search has not been run yet
          (@results || []).collect { |node| { :ip => node[:ipaddress], :fqdn => node[:fqdn], :datacenter => node[:dc] } }
        end
      end
    end

It's used like the following:

    es_discovery = ElasticSearch::Discovery.new(self,
      environment:  node.chef_environment,
      backend_role: node[:haproxy][:edge_backend_role],
      cluster_name: node[:elasticsearch][:cluster_name]
    )

    es_discovery.restrict_to_datacenter! if node[:haproxy][:edge_datacenter_restriction]
    es_discovery.additional_conditions = node[:haproxy][:additional_conditions] if node[:haproxy][:additional_conditions].is_a?(Hash)
    results = es_discovery.search(sorted: true)

This way it'll find all the nodes in a specific cluster, but I can also add extra conditions to restrict the search on a per-node/role basis if needed. It needs some work and should probably be generalized for discovering services other than ES, but it could be a start for you.
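
To make the query construction concrete, here is a minimal standalone sketch in plain Ruby (no Chef required) of how a hash of conditions could become a search string. The `build_query` helper and the sample values are hypothetical and not part of the library above; the sketch also assumes the base conditions should win over any extras with the same key.

```ruby
# Hypothetical helper: turns base and extra conditions into a Chef-style search string.
def build_query(base_conditions, extra_conditions = {})
  # Assumption: base conditions are merged last so extras cannot override them.
  merged = extra_conditions.merge(base_conditions)
  merged.map { |key, value| "(#{key}:#{value})" }.join(' AND ')
end

# Sample values, made up for illustration.
base = {
  chef_environment: 'staging',
  role: 'elasticsearch-cluster',
  elasticsearch_cluster_name: 'logs'
}

puts build_query(base)            # only the fixed conditions
puts build_query(base, dc: 'dc1') # an extra per-node condition is ANDed in
```

Each key/value pair becomes one parenthesized `key:value` term, so adding a condition only ever narrows the result set.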


On Tuesday, December 18, 2012 at 2:59 PM, Kevin Nuckolls wrote:

We had a long discussion today about a pain point we've had with Chef thus far. We'd like to rely on the search functionality that chef provides so things will automatically wire up with each other in a given environment. But during cookbook development we'd like to have more flexibility with the search functionality.

We decided that we'd like to have a bit more control over the search functionality and be able to provide overrides while we're working on something. Has anyone already written a library for doing something like that?


This is quite similar to my Discovery library cookbook of yore,
hw-cookbooks/discovery on GitHub. It implements Discovery#search, an
environment and non-environment aware search for roles with a few extra
checks. This one is not specific to ElasticSearch, but doesn't directly
support mocked results.

It surely wouldn't be too hard to add. How would the mock response data be


On 19 December 2012 09:22, Daniel Condomitti daniel@condomitti.com wrote:

I wrote a quick library (lives in elasticsearch/libraries) to wrap search
for discovering nodes in our applications and haproxy cookbooks that rely
on it.



Both of these are interesting. I'm not sure how we'd like to supply the
mock / override data to the system. I get the feeling that it might be
simplest if we just institutionalize using Vagrant and having some local
yaml file that Vagrant picks up and supplies to the nodes within it.
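
One way that idea could be sketched, assuming a hypothetical local YAML file that maps an index and a query string to canned results: a thin wrapper returns the canned entry when one exists and otherwise falls through to the real search. `OverridableSearch`, the file layout, and the sample node data are all assumptions for illustration, not an existing Chef or Vagrant feature.

```ruby
require 'yaml'
require 'tempfile'

# Hypothetical wrapper: consults a local YAML file before running the real search.
class OverridableSearch
  # overrides_path: YAML file shaped like { index => { query => [node-like hashes] } }
  # real_search:    any callable taking (index, query), e.g. a lambda around Chef's search
  def initialize(overrides_path, real_search)
    loaded = File.exist?(overrides_path) ? YAML.load_file(overrides_path) : nil
    @overrides = loaded || {}
    @real_search = real_search
  end

  def search(index, query)
    canned = @overrides.fetch(index.to_s, {})[query]
    return canned unless canned.nil?
    @real_search.call(index, query)
  end
end

# Demo with a throwaway override file; a real setup would keep it next to the Vagrantfile.
file = Tempfile.new(['search_overrides', '.yml'])
file.write(<<~YAML)
  node:
    "role:elasticsearch-cluster":
      - fqdn: es1.local
        ipaddress: 10.0.0.11
YAML
file.close

real_search = ->(_index, _query) { [] } # stand-in for the recipe's search method
searcher = OverridableSearch.new(file.path, real_search)

p searcher.search(:node, 'role:elasticsearch-cluster') # served from the YAML file
p searcher.search(:node, 'role:haproxy')               # falls through to real_search
```

Note that the canned results are plain hashes with string keys, so code that reads them with symbol keys (as real Chef nodes allow) would need a conversion step.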

Mostly I was just looking for any prior art as far as this is concerned, so
thanks guys.


On Tue, Dec 18, 2012 at 2:26 PM, AJ Christensen aj@junglist.gen.nz wrote:

This is quite similar to my Discovery library cookbook of yore,
hw-cookbooks/discovery on GitHub. It implements Discovery#search, an
environment and non-environment aware search for roles with a few extra
checks. This one is not specific to ElasticSearch, but doesn't directly
support mocked results.

It surely wouldn't be too hard to add. How would the mock response data be


