On-prem builder upstream_depot seems to not be working

Whenever I install a package not present in our local depot, it always fails. It doesn’t appear that the upstream packages are ever downloaded into our depot and made available, even after multiple tries and waiting. bldr.habitat.sh is definitely reachable from the VM running our depot.

For now, I’m working around this by manually installing the packages from public builder and then uploading the harts to our depot.

If you uploaded them to the depot, did you make sure to promote them to the stable channel? I believe uploads go to the unstable channel by default.

I am promoting manually mirrored packages to stable (I just left that out). Here’s my gross workaround script:

hab studio run "hab pkg install $1 --url https://bldr.habitat.sh -z '' && hab pkg upload /hab/cache/artifacts/$(echo $1 | sed 's/\//-/g')-x86_64-linux.hart && hab pkg promote $1 stable"

Where $1 is a fully-qualified package identifier. It also expects that your auth token and on-prem builder URL are set in the environment.
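For anyone who wants to reuse this, here is the same workaround split into two small shell functions. It is only a sketch: the names hart_path and mirror_pkg are made up for illustration, it drops the `hab studio run` wrapper (wrap the call in it if you need the studio), and it hard-codes the x86_64-linux target and cache directory from the one-liner above.

```shell
# Map a fully-qualified package ident (origin/name/version/release) to the
# path of the cached artifact, mirroring the sed expression in the one-liner.
# The cache directory and x86_64-linux target are copied from that one-liner.
hart_path() {
  ident="$1"
  printf '/hab/cache/artifacts/%s-x86_64-linux.hart\n' \
    "$(printf '%s' "$ident" | tr '/' '-')"
}

# Install from public builder, upload the hart to the on-prem depot, then
# promote it to stable. Assumes the on-prem builder URL and auth token are
# already set in the environment, as with the original script.
mirror_pkg() {
  ident="$1"
  hab pkg install "$ident" --url https://bldr.habitat.sh -z '' &&
    hab pkg upload "$(hart_path "$ident")" &&
    hab pkg promote "$ident" stable
}
```

Usage would be something like `mirror_pkg core/hab/0.57.0/20180614230004`, with the ident fully qualified so the hart file name can be derived from it.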

I have to do this because builder refuses to pull from the upstream on its own.

@raskchanky you have any insight here?

Out of curiosity, does it fail to pull from upstream for all of its packages?

It’s failing for all packages I don’t have cached locally or already present on our depot. The only packages I’m pulling from upstream are core packages. I haven’t noticed any instance of success and I’m not sure when this stopped working. This may have actually started weeks ago and only manifested after the base plan refresh.

I actually just noticed a case where my workaround failed: core/hab/0.57.0/20180614230004

There are a few things you can check that might help narrow down the problem.

If you look at /hab/svc/builder-api/config/config.toml there should be something that looks similar to this:

upstream_depot = "https://bldr.habitat.sh"

If that’s not there, you can add it by putting that setting into a user.toml file and running:

cat user.toml | hab config apply builder-api.default $(date +%s)
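To check whether the setting actually made it into the rendered config, a quick grep is probably enough. This is a rough sketch: the function name upstream_depot_setting is hypothetical, and the default path is just the config.toml location mentioned above.

```shell
# Print the upstream_depot line from a builder-api config file, if present.
# Returns non-zero when the setting is absent, so it can be used in an `if`.
# The default path is the rendered config location mentioned in this thread.
upstream_depot_setting() {
  config_file="${1:-/hab/svc/builder-api/config/config.toml}"
  grep -E '^[[:space:]]*upstream_depot[[:space:]]*=' "$config_file"
}
```

Running `upstream_depot_setting` with no argument checks the default path; pass a different file (for example a user.toml) to check that instead.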

Additionally, the upstream syncing process writes a log file to /hab/svc/builder-api/var/builder-upstream.log. That should contain details about what the syncing process is actually doing.

Hmmm. So it looks like it is actually working now. The config for the upstream depot was definitely applied previously (I have it in the same /hab/user/builder-api/config/user.toml I used to configure the s3 backend and I verified via the supervisor REST API).

Looking further back in the log, I see a lot of this:

2018-06-15 00:24:52,FAILURE: core/hab-sup (DepotClientError(APIError(ServiceUnavailable, "")))
2018-06-15 00:24:52,FAILURE: core/redis (DepotClientError(APIError(ServiceUnavailable, "")))
2018-06-15 00:25:55,FAILURE: core/hab-sup (DepotClientError(APIError(ServiceUnavailable, "")))
2018-06-15 00:25:55,FAILURE: core/redis (DepotClientError(APIError(ServiceUnavailable, "")))

And some of this:

2018-06-13 01:46:54,FAILURE: core/erlang (NetError(NetError(code: DATA_STORE msg: "vt:origin-package-create:1")))
2018-06-13 01:54:07,FAILURE: core/nginx (NetError(NetError(code: DATA_STORE msg: "vt:origin-package-create:1")))
2018-06-13 02:08:42,FAILURE: core/jre8 (NetError(NetError(code: DATA_STORE msg: "vt:origin-package-create:1")))
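In case it helps anyone else digging through this log, here is a small sketch for counting failures per package. It assumes every failure line has the "<timestamp>,FAILURE: <origin/name> (<error>)" layout seen in the excerpts above; failure_counts is just an illustrative name.

```shell
# Count FAILURE entries per package, reading log lines on stdin.
# Assumes the "<timestamp>,FAILURE: <origin/name> (<error>)" layout from the
# excerpts above; lines that do not match are ignored.
failure_counts() {
  awk -F'FAILURE: ' '
    /,FAILURE: / { split($2, parts, " "); count[parts[1]]++ }
    END { for (pkg in count) print count[pkg], pkg }
  ' | sort -k2
}
```

For example: `failure_counts < /hab/svc/builder-api/var/builder-upstream.log` prints one "count package" line per package, sorted by package name.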

The current logs seem fine and the symptoms seem gone. No clue what changed.

I did notice a very long delay between when I make a request and when a relevant log entry appears in builder-upstream.log. On the order of 20-30 seconds.

Yeah, right now there’s a background thread that ticks every 60 seconds and processes whatever upstream packages need to be synced. The initial implementation did it on demand, but that was changed. We’ve gotten feedback from a few sources, though, that this is sub-optimal, so I’m hoping to have some fixes for this in soon.
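Given the 60-second tick, one client-side way to cope in the meantime is to retry the install after waiting for the next tick. The helper below is only a sketch with a made-up name (retry_with_wait), and it assumes the package becomes installable from the local depot once the sync thread has processed it, which may not hold if the sync itself is failing.

```shell
# Run a command until it succeeds, sleeping between attempts.
# Usage: retry_with_wait <max_attempts> <delay_seconds> <command> [args...]
# Returns 0 as soon as the command succeeds, 1 once attempts are exhausted.
retry_with_wait() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

Something like `retry_with_wait 3 60 hab pkg install core/redis` would try up to three times, waiting 60 seconds between attempts to line up with the sync interval.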

Glad to hear it’s working now!