Disk running out of space

Hi

I’ve been running chef-server since about March of this year on our infrastructure for 81 nodes. After running out of space twice due to CouchDB compaction being incorrectly configured, I finally got it working right, but I’m still losing disk space at a rate of around 1.5GB per week.

I began poking around in the CouchDB web UI and we’ve got over 11,000 documents stored in the ‘chef’ database taking up 42GB. Looking at the docs, they’re mostly ‘sandbox’ objects from what I can tell, with the occasional ‘data_bag’ object. Also, the vast majority of the sandbox objects I’ve looked at have a ‘create_time’ of March or April of this year.

So my questions are:

  1. Is there any way to clean up these old documents?
  2. Is there any way to prevent CouchDB from getting larger? I have a 60GB disk allocated exclusively for CouchDB data and that’s a lot larger than I’d really like.
  3. Could it be something else that’s misconfigured?

Some details about the installation:

The server is Ubuntu and chef-server was installed via the Opscode PPA. We’re running Chef 0.10.8.

I’ve got a couple of recipes that set node attributes when they run; primarily datestamps for when some recipes are run so they don’t get run on a regular basis and things like that, but I’m not aware of anything else I could be doing that would cause such growth in the database.

Hopefully you guys can shed some light on this.

Thanks!

…spike

btw, this is my chef compaction script:

Chef compaction script run via cron · GitHub

forgot to include that before.

...spike

On Nov 16, 2012, at 7:23 PM, Ranjib Dey wrote:

Do you bump up the cookbook versions too often? Check whether you have older
versions of cookbooks that you don't use.

Hi Ranjib,

I may have had 20 version bumps ever. I only just recently started versioning as I needed to keep separate versions for prod and staging environments.

I can't imagine version bumps causing 40GB+ of disk usage, though. I may have uploaded my cookbooks a few thousand times, but I've only made 2 or 3 changes in the last month.

could there be something else?

...spike

On Nov 16, 2012, at 7:36 PM, Mark Anderson wrote:

Check that you’ve set the _revs_limit for the db, otherwise compaction
will retain more versions than you need.

Something like:
% curl -X PUT -d '1' 'localhost:5984/chef/_revs_limit'

Sets it to 1, which is fine for chef.
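
The same request can be sketched in Ruby, for anyone who would rather drive it from a script than from curl. This is only an illustration: it assumes CouchDB is listening on localhost:5984 and that the database is named chef, as in the command above. The helper names `revs_limit_request` and `send_request` are hypothetical. The first helper only builds the request so it can be inspected before anything is sent.

```ruby
require 'net/http'
require 'uri'

# Build (but do not send) the PUT request that sets _revs_limit on a database.
# The base URL and database name default to the values used in the thread.
def revs_limit_request(limit, base_url: 'http://localhost:5984', db: 'chef')
  uri = URI.join(base_url, "/#{db}/_revs_limit")
  req = Net::HTTP::Put.new(uri)
  req['Content-Type'] = 'application/json'
  req.body = Integer(limit).to_s # the body is a bare integer, as in the curl call
  [uri, req]
end

# Send a prepared request and return the response body as a string.
def send_request(uri, req)
  Net::HTTP.start(uri.host, uri.port) { |http| http.request(req).body }
end
```

Calling `send_request(*revs_limit_request(1))` would perform the same PUT as the curl command.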

Hi Mark,

I did that and ran my compaction script again...

In the CouchDB web UI, the size of the DB went from 41.9GB to 41.3GB and doesn't seem to be getting any smaller. Additionally, I ran:

$ curl localhost:5984/chef
{"db_name":"chef","doc_count":11178,"doc_del_count":204,"update_seq":497447,"purge_seq":0,"compact_running":false,"disk_size":44302975077,"instance_start_time":"1353113098236259","disk_format_version":5,"committed_update_seq":497447}

and it reports compact_running as false, so no compaction is in progress.

Is there anything else I can do?

Is there any chance that this will reduce the amount of space used on-disk? I know some database systems will keep the space on-disk requisitioned, but I'm not super experienced with couchdb.

Thanks.

...spike
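
The numbers in that response can be put in perspective with a little arithmetic. The sketch below is an illustration only: the helper name `summarize_db_info` and the constant `THREAD_DB_INFO` are made up for this example, and the field names are the ones visible in the output above. It converts disk_size to GiB (which appears to be the unit the web UI labels as GB), and works out the average size per document and the average number of updates per document.

```ruby
require 'json'

# Summarize the JSON returned by `GET /<db>`.
def summarize_db_info(json_text)
  info = JSON.parse(json_text)
  docs = info.fetch('doc_count')
  size = info.fetch('disk_size')
  {
    disk_gib: (size / 1024.0**3).round(1),            # total file size in GiB
    mib_per_doc: (size / 1024.0**2 / docs).round(2),  # average footprint per document
    updates_per_doc: (info.fetch('update_seq').to_f / docs).round(1),
    compacting: info.fetch('compact_running')
  }
end

# The response quoted in the thread, used here as sample input.
THREAD_DB_INFO = '{"db_name":"chef","doc_count":11178,"doc_del_count":204,"update_seq":497447,"purge_seq":0,"compact_running":false,"disk_size":44302975077,"instance_start_time":"1353113098236259","disk_format_version":5,"committed_update_seq":497447}'
```

`summarize_db_info(THREAD_DB_INFO)` gives several MiB per document on average, which is far more than a document like a sandbox record should need and suggests the space is not being held by live document bodies alone.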

also, regarding the sandbox documents, I have a TON of documents that look like this:

{
  "_id": "ffe7561e-0065-4396-9638-bb1a23cae511",
  "_rev": "2-cfde619b3a2cd62afb8767d838b00712",
  "create_time": "2012-03-26T21:05:42+00:00",
  "json_class": "Chef::Sandbox",
  "is_completed": true,
  "name": "8311466ba0b845788c33bc7ef2fcdacf",
  "checksums": [
  ],
  "chef_type": "sandbox",
  "guid": "8311466ba0b845788c33bc7ef2fcdacf"
}

is there any reason for these? Is there an easy way to safely delete them?

...spike
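
One possible way to approach the cleanup, sketched in Ruby under loudly stated assumptions: nothing in this thread confirms that removing completed sandbox documents is safe for Chef, so this should be tried against a backup copy first. The helper names `sandbox_deletions` and `bulk_delete_body` are hypothetical. The sketch selects completed sandbox documents older than a cutoff and builds the body for CouchDB's `_bulk_docs` endpoint, where a document is deleted by posting its `_id` and `_rev` with `_deleted` set to true. It does not send anything.

```ruby
require 'json'
require 'time'

# Pick completed sandbox documents created before `cutoff` (a Time) and turn
# them into deletion stubs. Field names match the document shown above.
def sandbox_deletions(docs, cutoff)
  old = docs.select do |d|
    d['chef_type'] == 'sandbox' &&
      d['is_completed'] == true &&
      Time.iso8601(d['create_time']) < cutoff
  end
  old.map { |d| { '_id' => d['_id'], '_rev' => d['_rev'], '_deleted' => true } }
end

# Wrap the stubs in the body shape that POST /<db>/_bulk_docs accepts.
def bulk_delete_body(stubs)
  JSON.generate('docs' => stubs)
end
```

The input documents could come from `GET /chef/_all_docs?include_docs=true` (the `doc` field of each row). Deleted documents leave small tombstones behind, and the file only shrinks once compaction has run again.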

On Nov 16, 2012, at 10:05 PM, Peter Struijk wrote:

It's been a while since I looked at this, but here's my couchdb compact
recipe. It's similar in functionality to yours except for the last 5 lines:

http_request "cleanup chef couchDB" do
  action :post
  url "#{Chef::Config[:couchdb_url]}/chef/_view_cleanup"
end

Full recipe: 4092905’s gists · GitHub

Our chef couchdb uses around 100mb.
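
Outside of a Chef recipe, the same kind of maintenance pass could be sketched in plain Ruby. This is an illustration rather than the contents of either script linked in this thread: the helper names are hypothetical, the design document names must be supplied by the caller, and the ordering (database compaction, then view compaction per design document, then view cleanup) is an assumption. The endpoints themselves (`_compact`, `_compact/<design doc>`, `_view_cleanup`) are part of CouchDB's HTTP API. The helpers only build the requests.

```ruby
require 'net/http'

# Paths for one maintenance pass over a database. Design document names are
# given without the "_design/" prefix.
def maintenance_paths(db, design_docs)
  ["/#{db}/_compact"] +
    design_docs.map { |name| "/#{db}/_compact/#{name}" } +
    ["/#{db}/_view_cleanup"]
end

# Build (but do not send) an empty-bodied POST for each path.
def maintenance_requests(db, design_docs)
  maintenance_paths(db, design_docs).map do |path|
    req = Net::HTTP::Post.new(path)
    req['Content-Type'] = 'application/json' # these endpoints expect a JSON content type
    req
  end
end
```

Each request could then be sent with `Net::HTTP.start('localhost', 5984) { |http| http.request(req) }`, assuming the same host and port as the curl examples in this thread.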

I've added that line to my script and run it, but it doesn't appear to have done anything.

I'll keep an eye on my disk usage and see if that stifles the growth. I've got around 20GB free on the partition, so I should be safe for the next week while I observe.

...spike

On Nov 16, 2012, at 11:31 PM, Chris wrote:

I've run into this as well and came up with the same script. I've ended up resorting to nightly restarts to keep the disk usage in line.
Without the restarts the view files stick around, which sounds like what's happening here.

when you say restart, do you mean restarting the daemon or restarting the whole server?

...spike


Just couchdb. It's certainly not ideal, but it's kept the monster at bay.

Sent from a phone


Hmm, that doesn't seem to be helping either.

Is there anything else that I can do?

...spike


I'd run into a similar problem with the size of couchdb bloating up to a
few GB. Performing regular compaction as part of a chef recipe that
performs maintenance has eased the problem, and we're now down to a few
hundred megs at max.

https://gist.github.com/3ae18ca9113b27a3f54a

You might need to tweak the disk_size value for your setup. We've set it to 1000000.

Ketan
studios.thoughtworks.com | twitter.com/ketanpkr
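
Ketan's gist is not reproduced in the thread, so what its disk_size setting controls is not known here. One plausible reading, and it is only an assumption, is that it acts as a byte threshold: compaction is triggered when the database file is larger than that value. The sketch below shows such a check against the database info that CouchDB returns from `GET /chef`; the function name `needs_compaction` and the threshold semantics are hypothetical.

```python
import json


def needs_compaction(db_info, threshold_bytes):
    """Return True when the database file exceeds the threshold
    and no compaction is already running (assumed semantics)."""
    if db_info.get("compact_running"):
        return False
    return db_info["disk_size"] > threshold_bytes


# Trimmed copy of the database info Spike posted earlier in the thread.
info = json.loads(
    '{"db_name":"chef","doc_count":11178,"compact_running":false,'
    '"disk_size":44302975077}'
)

print(needs_compaction(info, 1000000))  # True
```

With a threshold of 1000000 bytes (about 1MB), a check like this would fire on nearly every run for a database of any real size, which would amount to compacting on every maintenance pass.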


Hi Ketan,

Yeah, I've got something similar set up:

Chef compaction script run via cron · GitHub

I have that running once a day, but it's not affecting our free space and, at the rate it's going, I'll be out of space in about a month. The couchdb size has already passed 40GB.

I really want to be able to just delete old data. I've got tons of Sandbox data going back as far as March.

...spike



Sandboxes should be safe to delete, but I'm not sure it will fix your issue. Would be interested to hear your results.

--
Daniel DeLeo

Hi Daniel,

I don't have much experience with couchdb. What's the best way to do this?

I'd do it through the couchdb admin interface, but it's like 10,000 documents.

...spike
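
One way to avoid clicking through 10,000 documents is CouchDB's `_bulk_docs` endpoint, which accepts many document updates in a single POST and deletes a document when the update carries `"_deleted": true` alongside its current `_id` and `_rev`. The sketch below only builds that payload; it assumes the documents were fetched beforehand (for example from `/chef/_all_docs?include_docs=true`) and that only sandboxes with `is_completed` set should go, which is a cautious choice made for this example and not something confirmed in the thread. The helper name `sandbox_deletions` is hypothetical.

```python
import json


def sandbox_deletions(docs):
    """Build a _bulk_docs payload marking completed sandbox documents as deleted."""
    stubs = [
        {"_id": d["_id"], "_rev": d["_rev"], "_deleted": True}
        for d in docs
        if d.get("chef_type") == "sandbox" and d.get("is_completed")
    ]
    return {"docs": stubs}


docs = [
    # Sandbox document quoted earlier in the thread (fields trimmed).
    {"_id": "ffe7561e-0065-4396-9638-bb1a23cae511",
     "_rev": "2-cfde619b3a2cd62afb8767d838b00712",
     "chef_type": "sandbox", "is_completed": True},
    # Hypothetical non-sandbox document; it must be left alone.
    {"_id": "example-node", "_rev": "1-abc", "chef_type": "node"},
]

payload = sandbox_deletions(docs)
print(json.dumps(payload, indent=2))
```

The resulting JSON would be POSTed to `localhost:5984/chef/_bulk_docs` with a `Content-Type: application/json` header. Deleted documents leave small tombstones behind, and the space is only reclaimed by a later compaction, so taking a backup first and compacting afterwards seems prudent.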


Hi there,

Cleaning up CouchDB actually requires 3 different compaction commands, but
some of the responses in this thread don't mention all 3:
View Compaction, View Cleanup, and overall Compaction. Without running all
3, not much space will be cleared.

I've written a small Python command-line tool that aids me in running
these commands by automatically seeking all the views and running the
compaction commands against them. Additionally, you can dump a GZ backup of
your database as well.

I hope this helps: GitHub - bmhatfield/couch-compaction: A simple script to automatically perform the various CouchDB compaction tasks.

Brian
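
The three steps Brian lists map onto three CouchDB endpoints: `POST /{db}/_compact/{design-doc}` for view compaction, `POST /{db}/_view_cleanup` for view cleanup, and `POST /{db}/_compact` for the database itself. The sketch below is not Brian's tool; it is a minimal illustration that builds the request list in the order he names them and, optionally, sends them. The base URL and the helper names `compaction_requests` and `run` are assumptions.

```python
import json
import urllib.request

COUCH_URL = "http://localhost:5984"  # assumed; matches the curl examples in this thread
DB_NAME = "chef"


def compaction_requests(base_url, db, design_docs):
    """Return (method, url) pairs for a full cleanup pass:
    one view compaction per design document, then view cleanup,
    then compaction of the database file itself."""
    steps = [("POST", f"{base_url}/{db}/_compact/{name}") for name in design_docs]
    steps.append(("POST", f"{base_url}/{db}/_view_cleanup"))
    steps.append(("POST", f"{base_url}/{db}/_compact"))
    return steps


def run(steps):
    """Send each request; CouchDB expects a JSON content type on these POSTs."""
    for method, url in steps:
        req = urllib.request.Request(
            url, data=b"", method=method,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            print(url, json.load(resp))
```

Calling `run(compaction_requests(COUCH_URL, DB_NAME, ["nodes", "roles"]))` would issue four requests. Compaction runs in the background, so each call returns quickly and the `compact_running` field of `GET /chef` shows whether the database compaction is still in progress.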


Hi Brian,

The end result of my daily cleanup cron is basically the same.

I didn't know about the way to list the views, but when I looked at my views (based on the URL you're hitting in your code) I get:

chef-server:~$ curl 'localhost:5984/chef/_all_docs?startkey="_design/"&endkey="_design0"'
{"total_rows":11243,"offset":7018,"rows":[
{"id":"_design/checksums","key":"_design/checksums","value":{"rev":"1-62a9807afbf0177330dc9989e99ae9a3"}},
{"id":"_design/clients","key":"_design/clients","value":{"rev":"1-f800279f9b454738aa1cf34eadaa5e9c"}},
{"id":"_design/cookbooks","key":"_design/cookbooks","value":{"rev":"1-12de613d3cfa8f0c0dde9a78b95346dc"}},
{"id":"_design/data_bags","key":"_design/data_bags","value":{"rev":"1-7f86059f3c4f9b81a9a0823de971efac"}},
{"id":"_design/environments","key":"_design/environments","value":{"rev":"1-0e657fdd2e990977704d0643adc3c7ed"}},
{"id":"_design/id_map","key":"_design/id_map","value":{"rev":"1-8f687bcb28aa335d35c93f6f2824cc46"}},
{"id":"_design/nodes","key":"_design/nodes","value":{"rev":"1-be0ed0a94ae8dbd9bd1caf2ac499d99e"}},
{"id":"_design/roles","key":"_design/roles","value":{"rev":"1-0f44c80a2244351cbd1f39ce72e37602"}},
{"id":"_design/sandboxes","key":"_design/sandboxes","value":{"rev":"1-9631200dcb18b22d313f41ee7a084a83"}},
{"id":"_design/users","key":"_design/users","value":{"rev":"1-3a045dbe31e7eb10588661d955c4455d"}}
]}

and I'm hitting all of those with my script:

Chef compaction script run via cron · GitHub

...spike
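
The `_all_docs` query above returns every document whose ID starts with `_design/`. As a small sketch, assuming the response body has the shape shown in Spike's output, the design document names needed for view compaction could be extracted like this (`design_doc_names` is an illustrative name, not part of either script):

```python
import json


def design_doc_names(all_docs_body):
    """Return design document names from an _all_docs response body.

    The "_design/" prefix is stripped, so "_design/nodes" becomes "nodes",
    which is the form the /<db>/_compact/<name> endpoint expects.
    """
    prefix = "_design/"
    rows = json.loads(all_docs_body)["rows"]
    return [row["id"][len(prefix):] for row in rows if row["id"].startswith(prefix)]
```

Each returned name maps to one view compaction request, so a missing design document in this list would mean its view index is never compacted.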


Ah, that looks correct enough to me.

Apologies if I've missed this, but what version of CouchDB are you
running?

If it's not 1.2, I strongly recommend you upgrade; that may help ensure
that the compaction and cleanup tasks actually work. I noticed a huge
improvement in compaction under 1.2 compared with previous versions.

Brian
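
To answer the version question without digging through packages: CouchDB reports its version in the JSON welcome document returned by `GET /` (for example `curl localhost:5984/`). The sketch below parses that body and compares it against 1.2. The function names are illustrative, and the 1.2 threshold is simply the version Brian recommends above.

```python
import json
import re


def couch_version(welcome_body):
    """Parse the "version" field of the welcome document into a tuple of ints."""
    version = json.loads(welcome_body)["version"]
    # Keep at most the first three numeric components, e.g. "1.0.1" -> (1, 0, 1).
    return tuple(int(n) for n in re.findall(r"\d+", version)[:3])


def needs_upgrade(welcome_body, minimum=(1, 2)):
    """Return True when the reported version is older than `minimum`."""
    return couch_version(welcome_body) < minimum
```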
