Biggest issue I've had with search is that a node in a failed state will not update its node attributes, so search doesn't reflect reality. I like the idea of an infra-wide mutex.
The use case here is clustering. The first node that comes up would search the data bag for an item named after the deployment id. If it doesn't find an item with that name, it would create the item and record within it that it is the master. The next node to come up with that deployment id would find that item and extract the master node info.
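As a rough sketch, the data bag item described above might carry a structure like the following (the keys mirror the recipe later in the thread; the IP value is purely illustrative):

```ruby
deployid   = "deployment_000010182"  # example deployment id from this thread
masternode = "10.0.0.5"              # hypothetical master IP for illustration

# Item named after the deployment id: the first node writes it,
# and later nodes read the "master" field from it to join the cluster.
item = {
  "id"      => deployid,
  "master"  => masternode,
  "members" => [masternode]
}
```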
A coworker suggested I use Chef's search capabilities instead: search Chef for a node with that deployment id, and if none is found, set an attribute claiming itself as the master. Subsequent nodes would find that node with search and extract what they need from the results. This is the method I'm going to use now, but I initially rejected his idea (and am not happy that we're having to go this route) because I have to rely on Solr properly indexing my node before the next node comes up. I don't like having to rely on timing when it comes to automation; it's sloppy in my eyes. With the data bag solution, I "know" the data is there because my call to write it succeeded.
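The search-based election described above can be sketched in plain Ruby. Here `search` results are stubbed as an array of hashes; in a real recipe this would be Chef's `search()` against the node index, and the attribute names are illustrative, not from the original recipe:

```ruby
# Decide whether this node becomes the cluster master based on what
# search returned. Empty results mean no node has claimed the role yet.
def elect_master(search_results, my_ip)
  if search_results.empty?
    # No master indexed yet: claim the role ourselves.
    { master: my_ip, is_master: true }
  else
    # A master is already indexed: extract its address from the results.
    { master: search_results.first[:ipaddress], is_master: false }
  end
end

# First node up: search finds nothing, so it becomes master.
first = elect_master([], "10.0.0.5")

# A later node: search finds the first node's record and joins it.
second = elect_master([{ ipaddress: "10.0.0.5" }], "10.0.0.6")
```

The timing concern in the email lives in the gap between the first node saving its attributes and Solr indexing them: a second node that searches during that window also sees empty results and wrongly claims the master role.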
A third solution I came up with is to use Amazon SQS, but I'd be happier with a pure Chef solution.
MG
On Fri, Dec 9, 2011 at 10:55 AM, Jay Feldblum y_feldblum@yahoo.com wrote:
What's a use-case for a node rewriting a data bag item during a converge, that can't be solved by some other method?
Note that it would have to scale to 100,000 nodes talking to the same server and converging every 5 minutes, just as it would have to scale to 2 nodes talking to the same server and converging only when you SSH in to run the chef-client.
On Fri, Dec 9, 2011 at 12:48 PM, Michael Glenney mike.glenney@gmail.com wrote:
I figured out the issue. First of all, the 403 was because I was trying to write to a data bag from within a recipe and didn't read the disclaimer in the wiki saying I'd have to give the node's API client admin privileges to do that. Of course we don't want to do that, so I'll have to come up with another way to solve the problem.
BTW, I wouldn't mind seeing Chef gain the capability to give a node permission to write to a particular data bag without giving the node admin rights, in case anyone is accepting Christmas wishes.
For the shef localhost:4000 error, that was just because I forgot that, when running shef from a client node, I need to launch it with 'shef -c /etc/chef/client.rb' so it picks up the server URL.
MG
On Wed, Dec 7, 2011 at 10:09 PM, Peter Donald peter@realityforge.org wrote:
Hi,
We had this exception when the chef-solr service died on the chef
server. Figuring out what killed it and restarting it was our approach.
In our case it was a result of some process updating the owner/permissions on
/var/log/chef and /var/run/chef so that solr failed during startup.
On Thu, Dec 8, 2011 at 4:02 PM, Michael Glenney mike.glenney@gmail.com wrote:
I'm having problems with a 403 error when trying a new cookbook, and I think
I've tracked it down to data bag access.
Chef Server 10.0
chef-client 10.4
The first several lines of the stacktrace:
Generated at 2011-12-08 04:12:07 +0000
Net::HTTPServerException: 403 "Forbidden"
/usr/lib/ruby/1.9.1/net/http.rb:2303:in `error!'
/usr/lib/ruby/gems/1.9.1/gems/chef-0.10.4/lib/chef/rest.rb:237:in `block in api_request'
/usr/lib/ruby/gems/1.9.1/gems/chef-0.10.4/lib/chef/rest.rb:288:in `retriable_rest_request'
/usr/lib/ruby/gems/1.9.1/gems/chef-0.10.4/lib/chef/rest.rb:218:in `api_request'
/usr/lib/ruby/gems/1.9.1/gems/chef-0.10.4/lib/chef/rest.rb:130:in `put_rest'
/usr/lib/ruby/gems/1.9.1/gems/chef-0.10.4/lib/chef/data_bag_item.rb:227:in `save'
/var/chef/cache/cookbooks/ejabberd/recipes/default.rb:45:in `from_file'
/usr/lib/ruby/gems/1.9.1/gems/chef-0.10.4/lib/chef/mixin/from_file.rb:30:in `instance_eval'
/usr/lib/ruby/gems/1.9.1/gems/chef-0.10.4/lib/chef/mixin/from_file.rb:30:in `from_file'
/usr/lib/ruby/gems/1.9.1/gems/chef-0.10.4/lib/chef/cookbook_version.rb:578:in `load_recipe'
The relevant part of that recipe is:
# Find cluster master, or create item and become master
if search(:ejabberd, "id:#{deployid}").count == 0
  masternode = "#{node[:ipaddress]}"
  h = {}
  h[deployid] = {"id" => deployid, "master" => masternode, "members" => [masternode]}

  # Create new data bag item for cluster
  databag_item = Chef::DataBagItem.new
  databag_item.data_bag("ejabberd")
  databag_item.raw_data = h[deployid]
  databag_item.save
else
This is for a first chef run on a new node. If I run shef, switch to recipe
context, and run 'search(:ejabberd, "id:deployment_000010182")' I get back:
chef:recipe > search(:ejabberd, "id:deployment_000010182")
[Thu, 08 Dec 2011 04:51:05 +0000] ERROR: Connection refused connecting to
localhost:4000 for /search/ejabberd, retry 1/5
[Thu, 08 Dec 2011 04:51:10 +0000] ERROR: Connection refused connecting to
localhost:4000 for /search/ejabberd, retry 2/5
[Thu, 08 Dec 2011 04:51:15 +0000] ERROR: Connection refused connecting to
localhost:4000 for /search/ejabberd, retry 3/5
[Thu, 08 Dec 2011 04:51:20 +0000] ERROR: Connection refused connecting to
localhost:4000 for /search/ejabberd, retry 4/5
[Thu, 08 Dec 2011 04:51:25 +0000] ERROR: Connection refused connecting to
localhost:4000 for /search/ejabberd, retry 5/5
Errno::ECONNREFUSED: Connection refused - Connection refused connecting to
localhost:4000 for /search/ejabberd, giving up
but my /etc/chef/client.rb has the proper url:port for my chef server. If I
run the same command from my local box I get back:
chef:recipe > search(:ejabberd, "id:deployment_000010182")
=>
which is what I expect. Any ideas where I should be looking?
Thanks,
MG
--
Cheers,
Peter Donald