How to copy big files from the cookbook to the node without running out of memory?

Hi all,

I’m building a recipe where I have to copy two 2 GB files. My VM has 4 GB of RAM and it runs out of memory when I use remote_file or cookbook_file to copy the files to it.

Any idea how to copy the files in some kind of streaming mode to avoid running out of memory?

Thanks!

This is one way to do it. Mind you, it doesn’t do things like back up the current version, retry on network failures, etc…

    ruby_block "httpclient - #{remote_path}" do
      block do
        require 'httpclient'
        Chef::Log.info("Downloading #{remote_path} to #{local_path}")
        ::File.open(local_path, 'w') do |file|
          ::HTTPClient.new.get_content(remote_path) do |chunk|
            file.write(chunk)
          end
        end
      end
    end
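If the httpclient gem isn’t already available on the node, something along these lines should install it before the ruby_block converges (a minimal sketch; whether you need compile_time true depends on where the require happens in your run):

    chef_gem 'httpclient' do
      # Installed at converge time is enough here, since the require
      # happens inside the ruby_block above, also at converge time.
      compile_time false
    end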

Hi Brian,

Thanks for the response!

Is there a way to access the cookbook host as an HTTP server when using
Chef Zero, or is the only option to set up a separate HTTP server for that?

Thanks

The cookbook_file resource should work fine AFAIK. If it doesn’t, that’s a bug.

Cookbooks are really not the right tool for distributing large files like that, so you did the right thing by not using cookbook_file.

When your file is such a large percentage of the total memory, I expect that remote_file’s internal implementation is too inefficient. If I were to venture a guess, the problem may be that remote_file computes a hash of the file, and may keep the whole file in memory for that purpose.

Try replacing remote_file with an execute resource that uses curl, or maybe a bash resource.
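For example, something along these lines (a rough sketch; remote_url and local_path are placeholders for your own values) streams the download through curl so the Chef process never holds the file in memory:

    execute "download #{local_path}" do
      # -f: fail on HTTP errors, -sS: quiet but show errors, -L: follow redirects
      command "curl -fsSL -o #{local_path} #{remote_url}"
      # Crude idempotency guard; a real recipe would also check the checksum.
      not_if { ::File.exist?(local_path) }
    end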

Kevin Keane
The NetTech
http://www.4nettech.com
Our values: Privacy, Liberty, Justice
See https://www.4nettech.com/corp/the-nettech-values.html

I agree with Noah, this should work. Possibly you’re hitting the issue when Chef generates the checksum or the diff (though I thought we had a maximum file size for attempting a diff). We’d need more detailed logs to figure it out.
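If the diff does turn out to be the culprit, one thing worth trying (check the docs for your Chef version, as these are standard client settings rather than anything specific to this thread) is disabling diffs or capping the diff file size in client.rb:

    # /etc/chef/client.rb
    diff_disabled true              # skip diff generation entirely
    # or limit the size of files Chef will attempt to diff, in bytes
    diff_filesize_threshold 10_000_000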