RE: Re: RE: Re: Docker_container dependency hell


#1

Thank you! I think your approach may be close enough to a solution.

First of all, it turned out that my immediate problem was elsewhere: my docker context was actually set up with a delayed notification (a bash script that untars something), so the :build_if_missing and :run actions never had a chance to succeed on the initial run. That was easily fixed, thankfully.

Still, I think there are problems with your approach:

  • The docker_container :run action seems to do exactly the same thing as :redeploy, meaning that the container is destroyed and recreated on each chef run (this might be related to issue #410; the issue is closed, but I think the problem still occurs).

  • Even if :run only ran the first time, deleting a container and then re-creating it is not the same thing as leaving it untouched. Among other things, it causes a service to go down and come back up, and it may break links and cause unexpected results on volumes, such as dangling volumes. In practical terms, it often may not matter since it would only happen with a brand-new container.

Kevin Keane

The NetTech

http://www.4nettech.com

Our values: Privacy, Liberty, Justice

See https://www.4nettech.com/corp/the-nettech-values.html

-----Original message-----
From: Sean OMeara someara@chef.io
Sent: Monday 7th September 2015 16:09
To: chef@lists.opscode.com
Subject: [chef] Re: RE: Re: Docker_container dependency hell

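I’ve added a working test recipe so you can play with it in test-kitchen.

https://github.com/bflad/chef-docker/blob/7a19c4937d60bd8ffb403f40b7dd70322ab685a0/test/cookbooks/docker_test/recipes/notifications.rb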

kitchen converge notifications-ubuntu-1504

During the first converge, it does indeed restart the containers at the end of the chef client run…

  • write some file
  • starting alice
  • write some file
  • starting bob
  • starting alice
  • starting bob

But that’s okay, the end state of the converge is what’s important, not the journey.

Run test-kitchen again
kitchen converge notifications-ubuntu-1504

During the second converge, nothing happens. The docker_containers are already in the “Up” state (docker ps -a), so there is nothing to do. Had the commands on the containers exited (ls -la / for example), :run would have taken action to converge them to their “Up” state.

The relevant line in the provider is here: https://github.com/bflad/chef-docker/blob/7a19c4937d60bd8ffb403f40b7dd70322ab685a0/libraries/provider_docker_container.rb#L183

Edit line 19 of test/cookbooks/docker_test/recipes/notifications.rb, and change “alice was here” to “alice was there”.

Run test-kitchen again.
kitchen converge notifications-ubuntu-1504

The only things that happen are:

‘file[/alice/file]’ is repaired
‘docker_image[alice]’ is rebuilt
‘docker_container[alice]’ is redeployed
‘docker_container[bob]’ is redeployed

The restarts during the initial chef-client run are the price of admission for getting subsequent updates chained together with the notification system.
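The three converges described above can be reproduced with a toy model. This is a rough sketch in plain Ruby, not Chef code: the class `ToyConverge`, the resource names, and the subscription graph are all assumptions made for illustration, and the real test recipe may wire its resources differently. It only models two ideas from the walkthrough: default actions run in recipe order, and delayed subscriptions are queued (once each) and run after every resource has been visited.

```ruby
# Toy model of a chef-client converge with :delayed subscriptions.
# This is NOT Chef code; class and resource names are hypothetical.
class ToyConverge
  # order:         resource names in recipe order
  # subscriptions: watcher name => names of the resources it subscribes to
  def initialize(order, subscriptions)
    @order = order
    @subscriptions = subscriptions
  end

  # out_of_date lists the resources whose default action has work to do.
  # Returns the actions in the order they would run.
  def run(out_of_date)
    log = []
    queue = [] # delayed actions; each watcher is queued at most once

    # Phase 1: default actions, in recipe order.
    @order.each do |name|
      next unless out_of_date.include?(name)

      log << "default: #{name}"
      enqueue_watchers(name, queue)
    end

    # Phase 2: delayed actions run only after every resource was visited.
    # A delayed action counts as an update, so it may queue more watchers.
    index = 0
    while index < queue.length
      log << "delayed: #{queue[index]}"
      enqueue_watchers(queue[index], queue)
      index += 1
    end
    log
  end

  private

  def enqueue_watchers(updated, queue)
    @subscriptions.each do |watcher, watched|
      queue << watcher if watched.include?(updated) && !queue.include?(watcher)
    end
  end
end

order = %w[file image alice bob]
subscriptions = {
  'image' => ['file'],  # image is rebuilt when the file changes
  'alice' => ['image'], # alice is redeployed when the image is rebuilt
  'bob'   => ['alice']  # bob is redeployed when alice is redeployed
}

model  = ToyConverge.new(order, subscriptions)
first  = model.run(order)    # fresh node: everything is out of date
second = model.run([])       # nothing changed since the last run
third  = model.run(['file']) # only the file content was edited
puts first, '--', second, '--', third
```

In this model the first run touches the containers twice (once by default action, once from the delayed queue), the second run does nothing, and the third run only replays the chain that starts at the edited file.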

-s

On Tue, Sep 8, 2015 at 12:07 AM, Kevin Keane Subscription <subscription@kkeane.com> wrote:

I think this gist is pretty much what I am doing right now, and it actually illustrates the problem:

The docker_images are built twice, first with the build_if_missing action, and then with the subscribes action. In the gist, this will happen in the correct sequence, but if the actions are spread over several cookbooks, that can’t be guaranteed, and the build_if_missing action may actually fire before the templates do.

The same problem exists with the docker container: the run action will fire, and then subscribes will stop and remove the just-created container, before re-creating it.

Also, the docker_container action run will fire on every chef run. Shouldn’t it at least be run_if_missing?

Agreed in principle on not spreading resources over multiple cookbooks, but that’s not always realistic. For instance, in my actual scenario, I have a database container, a Web server container, a PHP FPM container, and one container (data only) for each hosted Web site. The cookbook for the hosted Web sites shouldn’t have to be involved with the database container.

Kevin Keane

The NetTech

760-721-8339

http://www.4nettech.com

Our values: Privacy, Liberty, Justice

See https://www.4nettech.com/corp/the-nettech-values.html

-----Original message-----
From: Sean OMeara <someara@chef.io>
Sent: Monday 7th September 2015 14:13
To: chef@lists.opscode.com
Subject: [chef] Re: Docker_container dependency hell

I believe this gist is what you’re looking for:

https://gist.github.com/someara/24edc8b969207fb55d7b

A few things…

  1. You probably want :run instead of :run_if_missing if you’re launching a long-running service. It’s safe across multiple runs and will repair the container in the event of a crash or out-of-band administrative actions.

  2. Remember, :delayed notify/subscribes are queued until after the end of the chef-client converge.

  3. Avoid spreading resources with notify/subscribe relationships across multiple cookbooks. It creates a semantic dependency relationship between them that’s extremely difficult to reason about and test. The same recipe is best.
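Put together, the three points might look roughly like the fragment below. It is only a sketch under assumptions, not the contents of the gist: the image and container names come from the original question, while the build-context paths, the template resources, and the link alias are hypothetical. The property names (`source`, `repo`, `links`) are what I believe the docker cookbook used at the time and should be checked against the README of the version in use.

```ruby
# Sketch only. Paths, template resources and the link alias are hypothetical;
# the template resources are assumed to be declared earlier in this recipe.

docker_image 'image2' do
  source '/docker/image2' # assumed build-context directory
  action :build_if_missing
  subscribes :build, 'template[/docker/image2/Dockerfile]'
end

docker_container 'container2' do
  repo 'image2'
  action :run # repairs the container if it has exited
  subscribes :redeploy, 'docker_image[image2]' # :delayed is the default timing
end

docker_image 'image1' do
  source '/docker/image1' # assumed build-context directory
  action :build_if_missing
  subscribes :build, 'template[/docker/image1/Dockerfile]'
end

docker_container 'container1' do
  repo 'image1'
  links ['container2:target'] # assumed alias for the linked container
  action :run
  subscribes :redeploy, 'docker_image[image1]'
  subscribes :redeploy, 'docker_container[container2]'
end
```

Declaring container2 before container1 in the same recipe lets the recipe order itself express the link dependency, which is the reason for point 3.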

-s

On Mon, Sep 7, 2015 at 12:10 AM, Kevin Keane Subscription <subscription@kkeane.com> wrote:

I have two docker containers. Container 1 links to container 2 (target). Consequently, container2 needs to be created and started before container1.

What I am trying to accomplish is this:

  • If the containers do not exist, they will be run in the correct sequence, even if the images already exist.

  • If an image for container1 is updated, container 1 is redeployed.

  • If an image for container2 is updated, container2 is redeployed, and then container1 is redeployed.
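The ordering these goals imply can be sketched in plain Ruby with the standard-library `TSort` module. This is only an illustration of the requirement, not something the docker cookbook does for you; the class `LinkGraph` and its method `affected_by` are hypothetical names.

```ruby
require 'tsort'

# Hypothetical helper: models "container A links to container B" and
# derives the start order and the redeploy cascade from it.
class LinkGraph
  include TSort

  # links: container name => names of the containers it links to
  def initialize(links)
    @links = links
  end

  def tsort_each_node(&block)
    @links.each_key(&block)
  end

  def tsort_each_child(node, &block)
    @links.fetch(node, []).each(&block)
  end

  # Containers to redeploy when `container` changes, link targets first.
  def affected_by(container)
    tsort.select { |c| c == container || depends_on?(c, container) }
  end

  private

  def depends_on?(node, target)
    @links.fetch(node, []).any? { |dep| dep == target || depends_on?(dep, target) }
  end
end

graph = LinkGraph.new('container1' => ['container2'], 'container2' => [])

start_order    = graph.tsort                     # link targets come first
after_image2   = graph.affected_by('container2') # container2, then container1
after_image1   = graph.affected_by('container1') # container1 only
puts start_order.inspect, after_image2.inspect, after_image1.inspect
```

In a Chef recipe the same order is normally expressed by declaring the resources in that sequence and chaining subscriptions, rather than by computing it at run time.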

Here is what I am currently doing (simplified for readability)

docker_image 'image1' do
  action :nothing
  # built only based on notifications
end

docker_container 'container2' do
  action :run_if_missing
  subscribes :redeploy, 'docker_image[image2]'
end

And in another cookbook:

docker_image 'image2' do
  action :nothing
  # built only based on notifications
end

docker_container 'container1' do
  action :run_if_missing
  subscribes :redeploy, 'docker_image[image1]'
  subscribes :redeploy, 'docker_container[container2]'
end

Problems with this approach:

  • The containers are often built twice, first by the run_if_missing action, and then again by the redeploy action

  • The run_if_missing action does not observe dependencies, so container1’s run_if_missing action is actually invoked before container2 even exists, causing a build failure.

Obviously, I could circumvent this by changing the default action to :nothing, but then the containers aren’t built at all if they are missing.

Also, using :immediate for the subscriptions is not an option (because then the redeploy action may get invoked too early and multiple times, and various other problems).

What is the best way to resolve this?

Kevin Keane

The NetTech

http://www.4nettech.com

Our values: Privacy, Liberty, Justice

See https://www.4nettech.com/corp/the-nettech-values.html


#2

Delayed is the default notification timing, so unless you specified
:immediately, that’s expected.

If :run is behaving like :redeploy, you’re definitely hitting #410.

Let’s move this conversation over to that Github issue.

-s

On Tue, Sep 8, 2015 at 8:14 AM, Kevin Keane Subscription <
subscription@kkeane.com> wrote:

Thank you! I think your approach may be close enough to a solution.

First of all, it turned out that my immediate problem was elsewhere: my
docker context was actually set up with a delayed notification (a bash
script that untars something), so the :build_if_missing and :run actions
never had a chance to succeed on the initial run. That was easily fixed,
thankfully.

Still, I think there are problems with your approach:

  • The docker_container :run action seems to do exactly the same thing as
    :redeploy, meaning that the container is destroyed and recreated on each
    chef run (this might be related to issue #410; the issue is closed, but I
    think the problem still occurs).

  • Even if :run only ran the first time, deleting a container and then
    re-creating it is not the same thing as leaving it untouched. Among other
    things, it causes a service to go down and come back up, and it may break
    links and cause unexpected results on volumes, such as dangling volumes. In
    practical terms, it often may not matter since it would only happen with a
    brand-new container.

Kevin Keane

The NetTech

http://www.4nettech.com

Our values: Privacy, Liberty, Justice

See https://www.4nettech.com/corp/the-nettech-values.html

-----Original message-----
From: Sean OMeara someara@chef.io
Sent: Monday 7th September 2015 16:09
To: chef@lists.opscode.com
Subject: [chef] Re: RE: Re: Docker_container dependency hell

I’ve added a working test recipe so you can play with it in test-kitchen.

https://github.com/bflad/chef-docker/blob/7a19c4937d60bd8ffb403f40b7dd70322ab685a0/test/cookbooks/docker_test/recipes/notifications.rb

kitchen converge notifications-ubuntu-1504

During the first converge, it does indeed restart the containers at the
end of the chef client run…

  • write some file
  • starting alice
  • write some file
  • starting bob
  • starting alice
  • starting bob

But that’s okay, the end state of the converge is what’s important, not
the journey.

Run test-kitchen again
kitchen converge notifications-ubuntu-1504

During the second converge, nothing happens. The docker_containers are
already in the “Up” state (docker ps -a), so there is nothing to do. Had
the commands on the containers exited (ls -la / for example), :run would
have taken action to converge them to their “Up” state.

The relevant line in the provider is here:
https://github.com/bflad/chef-docker/blob/7a19c4937d60bd8ffb403f40b7dd70322ab685a0/libraries/provider_docker_container.rb#L183

Edit line 19 of test/cookbooks/docker_test/recipes/notifications.rb, and
change “alice was here” to “alice was there”.

Run test-kitchen again.
kitchen converge notifications-ubuntu-1504

The only things that happen are:

‘file[/alice/file]’ is repaired
‘docker_image[alice]’ is rebuilt
‘docker_container[alice]’ is redeployed
‘docker_container[bob]’ is redeployed

The restarts during the initial chef-client run are the price of admission
for getting subsequent updates chained together with the notification
system.

-s

On Tue, Sep 8, 2015 at 12:07 AM, Kevin Keane Subscription <
subscription@kkeane.com> wrote:

I think this gist is pretty much what I am doing right now, and it
actually illustrates the problem:

The docker_images are built twice, first with the build_if_missing
action, and then with the subscribes action. In the gist, this will
happen in the correct sequence, but if the actions are spread over several
cookbooks, that can’t be guaranteed, and the build_if_missing action may
actually fire before the templates do.

The same problem exists with the docker container: the run action will
fire, and then subscribes will stop and remove the just-created container,
before re-creating it.

Also, the docker_container action run will fire on every chef run.
Shouldn’t it at least be run_if_missing?

Agreed in principle on not spreading resources over multiple cookbooks, but
that’s not always realistic. For instance, in my actual scenario, I have a database
container, a Web server container, a PHP FPM container, and one container
(data only) for each hosted Web site. The cookbook for the hosted Web sites
shouldn’t have to be involved with the database container.

Kevin Keane

The NetTech

760-721-8339

http://www.4nettech.com

Our values: Privacy, Liberty, Justice

See https://www.4nettech.com/corp/the-nettech-values.html

-----Original message-----
From: Sean OMeara someara@chef.io
Sent: Monday 7th September 2015 14:13
To: chef@lists.opscode.com
Subject: [chef] Re: Docker_container dependency hell

I believe this gist is what you’re looking for:

https://gist.github.com/someara/24edc8b969207fb55d7b

A few things…

  1. You probably want :run instead of :run_if_missing if you’re launching
    a long running service. It’s safe across multiple runs and will repair the
    container in the event of a crash or out-of-band administrative actions.

  2. Remember, :delayed notify/subscribes are queued until after the end of
    the chef-client converge.

  3. Avoid spreading resources with notify/subscribe relationships across
    multiple cookbooks. It creates a semantic dependency relationship between
    them that’s extremely difficult to reason about and test. The same recipe
    is best.

-s

On Mon, Sep 7, 2015 at 12:10 AM, Kevin Keane Subscription <
subscription@kkeane.com> wrote:

I have two docker containers. Container 1 links to container 2 (target).
Consequently, container2 needs to be created and started before container1.

What I am trying to accomplish is this:

  • If the containers do not exist, they will be run in the correct
    sequence, even if the images already exist.

  • If an image for container1 is updated, container 1 is redeployed.

  • If an image for container2 is updated, container2 is redeployed, and
    then container1 is redeployed.

Here is what I am currently doing (simplified for readability)

docker_image 'image1' do
  action :nothing
  # built only based on notifications
end

docker_container 'container2' do
  action :run_if_missing
  subscribes :redeploy, 'docker_image[image2]'
end

And in another cookbook:

docker_image 'image2' do
  action :nothing
  # built only based on notifications
end

docker_container 'container1' do
  action :run_if_missing
  subscribes :redeploy, 'docker_image[image1]'
  subscribes :redeploy, 'docker_container[container2]'
end

Problems with this approach:

  • The containers are often built twice, first by the run_if_missing
    action, and then again by the redeploy action

  • The run_if_missing action does not observe dependencies, so
    container1’s run_if_missing action is actually invoked before container2
    even exists, causing a build failure.

Obviously, I could circumvent this by changing the default action to
:nothing, but then the containers aren’t built at all if they are missing.

Also, using :immediate for the subscriptions is not an option (because
then the redeploy action may get invoked too early and multiple times, and
various other problems).

What is the best way to resolve this?

Kevin Keane

The NetTech

http://www.4nettech.com

Our values: Privacy, Liberty, Justice

See https://www.4nettech.com/corp/the-nettech-values.html