I’m building a recipe where I have to copy two 2 GB files. My VM has 4 GB of RAM, and it runs out of memory when I use remote_file or cookbook_file to copy the files to it.
Any idea on how to copy the files on some kind of streaming mode to avoid running out of memory?
This is one way to do it. Mind you, it doesn’t do things like back up the current version, retry on network failures, etc.
```ruby
ruby_block "httpclient - #{remote_path}" do
  block do
    require 'httpclient'
    Chef::Log.info("Downloading #{remote_path} to #{local_path}")
    # Open in binary mode ('wb') so the bytes aren't mangled on Windows
    ::File.open(local_path, 'wb') do |file|
      # get_content yields the response body in chunks, so the whole
      # file is never held in memory at once
      ::HTTPClient.new.get_content(remote_path) do |chunk|
        file.write(chunk)
      end
    end
  end
end
```
Cookbooks are really not the right tool for distributing large files like that, so you did the right thing by not using cookbook_file.
When your file is such a large percentage of total memory, I expect remote_file’s internal implementation is too inefficient. If I were to venture a guess, the problem may be that remote_file computes a hash of the file, and keeps the whole file in memory for that purpose.
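To be clear, hashing itself doesn’t have to buffer the whole file — a checksum can be computed incrementally in constant memory. A minimal sketch in plain Ruby (the `streaming_sha256` helper and the 1 MB chunk size are my own illustration, not remote_file’s actual code):

```ruby
require 'digest'

# Hypothetical helper: compute a SHA-256 digest without loading the whole
# file into memory, by feeding it to the digest in 1 MB chunks.
def streaming_sha256(path, chunk_size = 1024 * 1024)
  digest = Digest::SHA256.new
  ::File.open(path, 'rb') do |f|
    # File#read returns nil at EOF, ending the loop
    while (chunk = f.read(chunk_size))
      digest.update(chunk)
    end
  end
  digest.hexdigest
end
```

Peak memory stays at one chunk (1 MB here) regardless of file size, which is what you’d want for a 2 GB download.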
Try replacing remote_file with an execute resource that uses curl, or maybe a bash resource.
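A sketch of what that could look like (assuming `remote_path` and `local_path` are set elsewhere in the recipe — curl streams straight to disk, so memory use stays flat):

```ruby
# Sketch only: replace remote_file with curl via an execute resource.
execute "download #{local_path}" do
  # -f fails on HTTP errors, -L follows redirects, -o writes to disk
  command "curl -f -L -o #{local_path} #{remote_path}"
  # Skip the download if the file is already present
  not_if { ::File.exist?(local_path) }
end
```

You’d still want to add your own idempotence check (e.g. a checksum guard) since, unlike remote_file, this won’t detect a changed upstream file on its own.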
I agree with Noah, this should work. Possibly you’re hitting the issue when Chef generates the checksum or the diff (though I thought we had a maximum file size for attempting a diff). We’d need more detailed logs to figure it out.