Help Wanted: Vendoring Python Modules

Hi Inhabitants,
I've started a project to vendor bigger Python modules, which at smartB has really improved our build times. PRs are wanted/welcome!

Some time ago we talked about having an origin just for Python modules like this. My concern is: what’s the benefit of submitting here versus just pushing up to core?

@bdangit, the core origin would be a great long-term home for these. I wanted to start here to see if the work is useful and to get some stability and good patterns in place. I would be more than happy to see this all go further upstream. @eeyun, thoughts?

Also, @georgemarshall, I’d love to get feedback from you on this work if vendoring is something you are still interested in.

Sorry for the delayed response on this!

I think having larger Python modules packaged as py-FOO or python-FOO is actually probably a fairly good idea. A year ago I likely wouldn’t have thought so, but considering that other major distros do this (well, at least Arch and Ubuntu), it probably makes sense. The difficult bit will be determining where to draw the line on what is reasonable to package separately and what’s not. Any ideas?

I’d leave this up to operator pain. Make it easy to vendor new modules and describe how to test them, and that process can be used for community-contributed PRs.

If I had to do this quantitatively, I’d vendor any module whose install time in a Habitat build exceeds some threshold, measured in seconds.
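To make the threshold idea concrete, here is a minimal sketch in Python. It assumes you have already measured per-module install times in your own builds; the function name, the 30-second default, and the timing numbers are all made up for illustration and do not come from any Habitat tooling.

```python
def modules_to_vendor(install_seconds, threshold_seconds=30.0):
    """Return the modules whose install time exceeds the threshold, slowest first.

    install_seconds: dict mapping module name -> measured install time in seconds.
    threshold_seconds: hypothetical cutoff; tune it to your own build pain.
    """
    slow = {
        name: secs
        for name, secs in install_seconds.items()
        if secs > threshold_seconds
    }
    # Sort by measured time, largest first, so the worst offenders lead the list.
    return sorted(slow, key=slow.get, reverse=True)


# Illustrative numbers only, not real measurements.
timings = {"scipy": 412.0, "requests": 3.1, "tensorflow": 95.5}
print(modules_to_vendor(timings))
```

With these example timings, only scipy and tensorflow cross the cutoff, so they would be the candidates to package separately.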

The other major reason to vendor would be problems with compiled extensions that depend on external libraries, but I’d let developer pain drive that too.

So, @eeyun & @bdangit, would you like to see these in a PR to core?

If so, I think we need a naming convention that makes sense. It does indeed look like Ubuntu/Debian do it like <language_name>-<vendored_module_name>. Until hab pkg search is fixed I think it makes sense to use names that people can at least guess, and apt still seems widespread, so it makes sense to me to follow that naming convention (plus the convention itself seems pretty sane).

Yeah I think I’m good with that personally. It’s possible other maintainers will disagree but I’m willing to advocate for it.

But does this mean you want to see a PR? :wink:

Yes. The reason is that the core-plans team is a somewhat larger community, and while not everyone there is a pythonista, they can at least vet semantics/style/best practices/etc.

A while back we (@georgemarshall and a few others) had consensus around py3-<vendored_module_name> or py2-<vendored_module_name>.
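For anyone scripting plan generation, the py3-/py2- convention could be sketched like this. The helper name is hypothetical, and lowercasing plus collapsing runs of `-`, `_`, and `.` into a single hyphen is my assumption (borrowed from how PyPI normalizes project names), not something agreed in this thread.

```python
import re


def vendored_package_name(module_name, python_major=3):
    """Build a package name of the form py<major>-<module>.

    Hypothetical helper: the normalization step is an assumption,
    not an agreed Habitat convention.
    """
    if python_major not in (2, 3):
        raise ValueError("python_major must be 2 or 3")
    # Collapse runs of '-', '_' and '.' into one hyphen, then lowercase.
    normalized = re.sub(r"[-_.]+", "-", module_name).lower()
    return f"py{python_major}-{normalized}"


print(vendored_package_name("scipy"))
print(vendored_package_name("Flask_SQLAlchemy", python_major=2))
```

Keeping the rule in one function means the prefix can be swapped later (say, to python-) without touching every plan.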

Shore up the PR! Give us py3-scipy and py3-tensorflow :smile:
