Options for self-hosting harts

Out of the box, Habitat expects packages to come from builder. This is understandable and sane, but in some circumstances it may not be possible. I have two scenarios which will arise imminently:

  1. An environment in which outbound http requests to builder will not be permitted
  2. An environment with no internet connectivity

What are my options for hosting harts for internal use?

If I simply keep the harts and serve them over http from a local webserver, is that sufficient? Is there some metadata that is needed?

What is involved in running one’s own depot, and what advantages would that confer over simply hosting the packages on a webserver?

Assuming I use the Chef habitat cookbook, how do I tell Chef/Habitat, when installing a hab package, to use my own source, rather than builder?

@sns you are probably going to want to keep a cache of packages somewhere; it doesn’t really matter where, you just need access to them in your env. The process to get a .hart is a little convoluted at the moment: you have to install a package locally, which caches the actual .hart file in /hab/cache/artifacts. I think @raskchanky might be actively working on a solution for this.
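The caching step described above might look roughly like this on a connected build machine (the /hab/cache/artifacts path comes from the post; the destination directory name is a made-up example):

```shell
# Sketch: collect cached .hart files into a portable directory. On a
# connected box, running `hab pkg install <origin>/<pkg>` first would leave
# the .hart for the package (and its deps) in the cache directory below.
set -u
CACHE=/hab/cache/artifacts
DEST=./hart-cache          # hypothetical name; put it wherever suits you
mkdir -p "$DEST"
# Copy out whatever is cached so it can be carried into the restricted env.
cp "$CACHE"/*.hart "$DEST"/ 2>/dev/null || echo "no cached harts found in $CACHE"
ls "$DEST"
```

From there the directory can be shipped (rsync, artifact store, etc.) into the environment that has no route to builder.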

If you are in a Chef shop already you can use it to drop off the hart files and use the hab cookbook to install the Habitat package. @jtimberman maintains the habitat cookbook and may be able to give you guidance on using it for local packages.

WRT running a local depot, we don’t recommend it at the moment, but an on-prem solution is coming :soon:.

@sns On-prem depot is going to be in alpha state soon! You can check out progress here: https://github.com/habitat-sh/builder/milestone/1

I’m really interested in experimenting with an IPFS-based depot. How could one go about configuring their local habitat client to use a custom routine for resolving and fetching .harts?

You would first need a way to resolve package identifiers to content hashes, then a way to fetch a file from a content hash. The latter is easy, but the former is hard to do, at least in a properly decentralized way, and it aligns with the overall IPFS challenge of naming content.

With build artifacts hosted in IPFS, policies for how to replicate and distribute artifacts could be managed at the cluster level, and all participating users could serve as fully-qualified global mirrors for the packages they use.

I suppose instead of providing resolve/fetch hooks, you might also just create an alternative implementation of the proposed hab pkg download command that works totally independently and populates /hab/cache/artifacts. How might you use an upstream process to resolve all the transitive dependencies in that scenario, though? Does the depot currently handle that when you ask it for a package, or does the client resolve dependencies after it downloads each package?

One way we have dealt with self-hosting harts is to keep any private harts checked into our source code. We have a folder called localdepot, and when the user pops into the studio we automate installation of every hart from that folder.

Caveat: this works well for small teams, and where the harts are not that large and don’t have many hart dependencies. Of course we do have network connectivity, but most of the time when we build and iterate on our apps in the studio, we use the NO_INSTALL_DEPS flag so we don’t reach out to the net for updated packages. Also, public harts don’t change much or that fast. :wink:
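The studio-entry automation described above could be as simple as a loop over the checked-in harts. The localdepot folder name comes from the post; everything else is a sketch:

```shell
# Install every checked-in hart on studio entry; a no-op if the folder is
# missing/empty or hab is unavailable on this machine.
count=0
for hart in ./localdepot/*.hart; do
  [ -e "$hart" ] || continue          # glob matched nothing
  command -v hab >/dev/null || break  # hab not installed here
  hab pkg install "$hart" && count=$((count + 1))  # hab accepts a local .hart path
done
echo "installed $count local hart(s)"
```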

For something a bit larger scale you really do need an on-prem builder/depot. I would also say it’s needed because of how much the Habitat studio requires connectivity to a depot to install its own dependencies when setting up a studio and Habitat Supervisor.

Another option is to use the tar exporter, which will tar up your app and all of its deps. You can then carry that around instead of managing multiple harts.
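The mechanics are roughly as follows. The real archive would come from hab pkg export tar, which bundles the package tree; the origin/package names below are placeholders, and the tar commands just simulate carrying the result across an air gap:

```shell
set -eu
# Simulate an exported package tree (in reality `hab pkg export tar` builds
# this tarball for you, including the app's dependencies).
mkdir -p stage/hab/pkgs/myorigin/myapp/1.0.0/20180101000000
tar -czf myapp.tar.gz -C stage hab
# On the air-gapped host, unpack at / so the tree lands under /hab/pkgs/...
# (shown here against ./target instead of / to keep the sketch harmless):
mkdir -p target
tar -xzf myapp.tar.gz -C target
ls target/hab/pkgs/myorigin
```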

This is exactly what we do: we do development on the outside, then we tar up our app and sneakernet it to our testing environments.

However, it seems to me that the problem the OP is describing is that they don’t have internet access at all, or that it would be difficult to punch a hole to the public depot.

The immediate challenge we have is that our GitLab test runners are isolated from the public internet, so testing a cookbook which installs a Habitat package is difficult.

I’ve had an initial look at the hab_package resource, and it seems that the provider simply runs hab pkg install with the --url option.

The :bldr_url property isn’t required.

If omitted, my assumption is that hab will just try to use the default URL. I don’t see a way to tell it to use the local filesystem. If that were possible, we could fetch the harts with remote_file resources, but that would still require us to move the harts into the test environment. We could put them in S3, I suppose?

What does hab look for when given a builder URL? Does it make a series of API calls, or could we simply point it at a webserver serving the artifacts?

OK, I’ve dug into this more, and in order to support air-gap installs, I’m going to add functionality to the hab_package resource/provider to support ‘local’ harts. I’ll update when done.