Application Debugging and Feedback

I think we may be missing a key piece of functionality that would make Habitat more useful when trying to diagnose package start failures. Consider the following common scenario:

I'm running macOS and I want to load a package that I didn't create into the supervisor and use it. Let's use core/mongodb as an example. Here's the output I get:

hab svc load core/mongodb
Installing core/mongodb
Determining latest version of core/mongodb in the 'stable' channel
Verifying core/mongodb/3.6.4/20180717024452
[…]
Install of core/mongodb/3.6.4/20180717024452 complete with 12 new packages installed.
The core/mongodb service was successfully loaded

When I check the status of the supervisor, I get this:

hab sup status
package                            type        desired  state  elapsed (s)  pid     group
core/mongodb/3.6.4/20180717024452  standalone  up       down   86           <none>  mongodb.default

What happened? Why didn't this start? I set some suggested debug flags like DEBUG=1 and/or RUST_LOG=debug and still don't end up with any indication of what went wrong. I looked in a few directories like /hab/pkgs/core/mongodb/3.6.4/20180717024452 and there's no log of what happened. I suspect it has something to do with the TOML host_ip being set to 127.0.0.1; macOS may be looking for 0.0.0.0 and the app is getting confused.
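For reference, here is roughly how I set those flags. As I understand it, DEBUG and RUST_LOG are read by the Supervisor process itself, so they have to be set where hab sup run is launched; setting them only on the hab svc load client call presumably has no effect on what the Supervisor prints:

# Launch the Supervisor with the suggested debug environment variables set.
# DEBUG=1 enables extra Habitat debug output and RUST_LOG=debug raises the
# Rust log level; neither surfaced any application output in my case.
env DEBUG=1 RUST_LOG=debug hab sup run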

In essence, Habitat is suppressing valuable application debug information. This is not an isolated example; I see people asking about failed loads all the time. What we need is to see the application's output through loading and starting, and on into running.

If I look at the core/mongodb run hook:

#!/bin/sh
export HOME=/hab/svc/mongodb/data
cd /hab/svc/mongodb

if [ "$(whoami)" = "root" ]; then
  exec chpst \
    -U hab:hab \
    -u hab:hab \
    mongod --config /hab/svc/mongodb/config/mongod.conf 2>&1
else
  exec mongod --config /hab/svc/mongodb/config/mongod.conf 2>&1
fi

The start information is being redirected properly; Habitat is just not displaying it.
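One crude way to confirm this is to bypass the Supervisor and execute the compiled hook directly. This is a debugging tactic rather than a supported workflow, and it assumes the hooks have already been compiled under /hab/svc/mongodb/hooks (the same location the Supervisor reports when compiling them):

# Run the compiled run hook by hand; since the hook redirects stderr into
# stdout (2>&1), whatever mongod prints on failure appears in the terminal.
/hab/svc/mongodb/hooks/run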

In addition, there seems to be no process or mechanism in place to let a user know about package-specific usage notes. In this case, it would have been helpful to tell users that if you are running macOS, you should create an alternate TOML with host_ip=0.0.0.0. I would suggest that this information be displayed alongside the package on Builder for the consumer.
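For anyone hitting this today, here is a sketch of what that alternate TOML could look like, assuming the conventional per-service override location /hab/user/&lt;service&gt;/config/user.toml and assuming host_ip is the exact key this package exposes in its default.toml:

# Create a user.toml override so the service binds to all interfaces.
# The key name and its placement must match this package's default.toml;
# host_ip here follows the setting mentioned above.
mkdir -p /hab/user/mongodb/config
cat <<'EOF' > /hab/user/mongodb/config/user.toml
host_ip = "0.0.0.0"
EOF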

My concern is that as Habitat continues to gain momentum, people are going to clog up the issues process and get frustrated with the product without this type of functionality in place.

Yeah, this is a bit strange. I was able to reproduce this as well. I am on macOS 10.13.6 running hab 0.63.0/20180913153335.

Enter the studio, check that the supervisor is running, then load core/mongodb:

[1][default:/src:0]# hab svc status
No services loaded.
[2][default:/src:0]# hab svc load core/mongodb
» Installing core/mongodb
☁ Determining latest version of core/mongodb in the 'stable' channel
☛ Verifying core/mongodb/3.6.4/20180717024452
☛ Verifying core/acl/2.2.52/20180608091922
☛ Verifying core/attr/2.4.47/20180608091914
☛ Verifying core/cacerts/2018.03.07/20180608102212
☛ Verifying core/coreutils/8.29/20180608092141
☛ Verifying core/gcc-libs/7.3.0/20180608091701
☛ Verifying core/glibc/2.27/20180608041157
☛ Verifying core/gmp/6.1.2/20180608051426
☛ Verifying core/libcap/2.25/20180608091936
☛ Verifying core/linux-headers/4.15.9/20180608041107
☛ Verifying core/openssl/1.0.2n/20180608102213
☛ Verifying core/zlib/1.2.11/20180608050617
✓ Installed core/acl/2.2.52/20180608091922
✓ Installed core/attr/2.4.47/20180608091914
✓ Installed core/cacerts/2018.03.07/20180608102212
✓ Installed core/coreutils/8.29/20180608092141
✓ Installed core/gcc-libs/7.3.0/20180608091701
✓ Installed core/glibc/2.27/20180608041157
✓ Installed core/gmp/6.1.2/20180608051426
✓ Installed core/libcap/2.25/20180608091936
✓ Installed core/linux-headers/4.15.9/20180608041107
✓ Installed core/openssl/1.0.2n/20180608102213
✓ Installed core/zlib/1.2.11/20180608050617
✓ Installed core/mongodb/3.6.4/20180717024452
★ Install of core/mongodb/3.6.4/20180717024452 complete with 12 new packages installed.
The core/mongodb service was successfully loaded

At this point I would expect mongodb to be running, but the supervisor shows mongodb's state as down even though the supervisor logs show the package loaded successfully:

[3][default:/src:0]#  hab svc status
package                            type        desired  state  elapsed (s)  pid     group
core/mongodb/3.6.4/20180717024452  standalone  up       down   542          <none>  mongodb.default
[4][default:/src:0]# sup-log
--> Tailing the Habitat Supervisor's output (use 'Ctrl+c' to stop)
hab-sup(AG): ✓ Installed core/gcc-libs/7.3.0/20180608091701
hab-sup(AG): ✓ Installed core/glibc/2.27/20180608041157
hab-sup(AG): ✓ Installed core/gmp/6.1.2/20180608051426
hab-sup(AG): ✓ Installed core/libcap/2.25/20180608091936
hab-sup(AG): ✓ Installed core/linux-headers/4.15.9/20180608041107
hab-sup(AG): ✓ Installed core/openssl/1.0.2n/20180608102213
hab-sup(AG): ✓ Installed core/zlib/1.2.11/20180608050617
hab-sup(AG): ✓ Installed core/mongodb/3.6.4/20180717024452
hab-sup(AG): ★ Install of core/mongodb/3.6.4/20180717024452 complete with 12 new packages installed.
hab-sup(AG): The core/mongodb service was successfully loaded

If I run hab sup term and then restart the supervisor, mongodb comes up…

[5][default:/src:130]# hab sup term
[6][default:/src:0]# hab sup run
hab-sup(MR): Supervisor Member-ID 3b034b20115846e8ba25d1da5f81b372
hab-sup(MR): Starting core/mongodb
mongodb.default(UCW): Watching user.toml
hab-sup(MR): Starting gossip-listener on 0.0.0.0:9638
hab-sup(MR): Starting ctl-gateway on 127.0.0.1:9632
hab-sup(MR): Starting http-gateway on 0.0.0.0:9631
mongodb.default(HK): init, compiled to /hab/svc/mongodb/hooks/init
mongodb.default(HK): Hooks compiled
mongodb.default(SR): Hooks recompiled
default(CF): Updated mongod.conf 3aecd7ca0928810301b3f230de3a3e6a72779907be0c7f9d4c726ea55c6adea7
default(CF): Updated mongos.conf 8c3e4cd2f3cdd7cd922ce3b529f93672f3af8274e5f6d462a5f8824c25bbbe18
mongodb.default(SR): Configuration recompiled
mongodb.default(SR): Initializing
mongodb.default(SV): Starting service as user=hab, group=hab
mongodb.default(O): 2018-09-25T23:01:35.248+0000 I CONTROL  [main] note: noprealloc may hurt performance in many applications
^Chab-sup(MR): Gracefully departing from butterfly network.
mongodb.default(SV): Stopping...
mongodb.default(SV): Shutdown OK: Graceful Termination
hab-launch(SV): Hasta la vista, services.
ERROR 2018-09-25T23:01:40Z: hab_launch: Launcher exiting with code 0
[7][default:/src:0]# hab sup run > /hab/sup/default/sup.log &
[1] 262
[8][default:/src:0]# sup-log
--> Tailing the Habitat Supervisor's output (use 'Ctrl+c' to stop)
hab-sup(MR): Supervisor Member-ID 3b034b20115846e8ba25d1da5f81b372
hab-sup(MR): Starting core/mongodb
mongodb.default(UCW): Watching user.toml
hab-sup(MR): Starting gossip-listener on 0.0.0.0:9638
hab-sup(MR): Starting ctl-gateway on 127.0.0.1:9632
hab-sup(MR): Starting http-gateway on 0.0.0.0:9631
mongodb.default(HK): Hooks compiled
mongodb.default(SR): Initializing
mongodb.default(SV): Starting service as user=hab, group=hab
mongodb.default(O): 2018-09-25T23:01:52.838+0000 I CONTROL  [main] note: noprealloc may hurt performance in many applications
^C
[9][default:/src:130]# hab svc status
package                            type        desired  state  elapsed (s)  pid  group
core/mongodb/3.6.4/20180717024452  standalone  up       up     10           338  mongodb.default

Above you can see that mongodb is now running without issue.

We've done a bit of poking around and are still not quite certain what is going on, but we have some leads we'll track down.

In the meantime, it appears this is memory-related in some regard. Running a macOS Docker Studio with the default Docker settings means that 2GB of RAM is allocated to the VM being used under the hood (this is a Docker-on-macOS thing, not anything Habitat-specific). We have observed that boosting this allocation allows core/mongodb to load and run the first time in a studio. I've set it to 4GB, for example, and it works fine. A lower allocation may still work, but I haven't tried.
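For anyone checking their own setup, the VM's current allocation can be read from the host with the standard Docker CLI (nothing Habitat-specific here):

# Print the total memory available to the Docker VM, in bytes; with the
# default Docker-for-Mac settings this reports roughly 2 GiB.
docker info --format '{{.MemTotal}}'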


For what it's worth, we have only been able to reproduce this in a macOS Docker Studio, and with a relatively low RAM allocation as described above.

The lack of feedback is certainly a bad experience, and as we investigate further, we'll be sure to add better feedback mechanisms so it's clearer what is going on.