[How To] So you want to boot with core/grub, or how I learned to stop worrying and love hab (aka: changing your bootloader to core/grub)

So you want to replace your bootloader with core/grub? Or maybe you don’t know you do yet…

WHY? I hear you ask. Why not? I thought it would be funny. At this point, I’m still very much trying to learn Habitat and what it can and cannot do. VMs are disposable, so worst case I have to delete a non-booting VM; best case I delete a booting VM, having proven it works…

In this post, we’ll explore how I did just that!

I spun up a brand new CentOS 7 instance in our VMware/OpenStack environment. (This is only relevant because of cloud-init.) Nothing really remarkable about the image; it has been hardened following CIS specs.

# install hab; all the obvious places are mounted noexec, so use `/root` as our TMPDIR
# tsk tsk, curling straight to a root shell
curl https://raw.githubusercontent.com/habitat-sh/habitat/master/components/hab/install.sh | sudo TMPDIR=/root bash

# remove grub rpms
# grub2* would work too, but it doesn't look like a few of the more esoteric packages are included in core/grub
yum remove grub2 grub2-tools 

# install grub with habitat
hab pkg install core/grub

# binlink so our system knows where to find our binaries
hab pkg binlink core/grub

# recreate our grub config, _might_ be unnecessary...
# note on RHEL/CentOS systems this was `grub2-mkconfig` before
grub-mkconfig

# reboot
reboot
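
A cautious variation on the steps above: before typing `reboot`, it may be worth confirming that the binlinked GRUB tools actually resolve on your PATH, since the RPM-provided ones are gone at this point. This is only a sketch; `check_cmds` is a hypothetical helper name, not something Habitat or GRUB ships.

```shell
#!/bin/sh
# Hypothetical pre-reboot check: confirm each named command resolves on PATH.
# Reports every missing command on stderr and returns non-zero if any is absent.
check_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf 'missing: %s\n' "$cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: grub-mkconfig is the tool this walkthrough calls after binlinking.
# check_cmds grub-mkconfig || echo "hold off on the reboot"
```

Add whichever other binaries you rely on to the argument list.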

Interestingly it seems cloud-init didn’t pick up the new name as the instance name changed when it rebooted, but the system seemed to come up straight away with no complaints. So I’m calling it a win!

I’ve created a gist of my terminal output during the ordeal for your light reading. There are some interesting differences in the grub.conf. (Should have run a diff!)
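
For anyone who wants that diff next time, here is a minimal sketch of keeping a copy of the old config before regenerating it. `backup_cfg` is a hypothetical helper, and the `/boot/grub2/grub.cfg` path in the usage comments is an assumption; substitute wherever your config actually lives.

```shell
#!/bin/sh
# Hypothetical helper: keep a timestamped copy of a config file before
# overwriting it, and print the path of the copy so it can be diffed later.
backup_cfg() {
    src="$1"
    bak="${src}.bak.$(date +%Y%m%d%H%M%S)"
    cp -p "$src" "$bak" && printf '%s\n' "$bak"
}

# Usage sketch (as root; the path is an assumption):
# bak="$(backup_cfg /boot/grub2/grub.cfg)"
# grub-mkconfig -o /boot/grub2/grub.cfg
# diff -u "$bak" /boot/grub2/grub.cfg
```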

EDIT: So it seems my grub-mkconfig step only prints to STDOUT… what you actually need is grub-mkconfig -o /boot/grub.cfg (or wherever your actual grub.cfg lives; it seems to vary depending on OS and initial GRUB version. On a stock CentOS 7 BIOS install it is typically /boot/grub2/grub.cfg).
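
Since the location varies, a small lookup can save some guessing. This is a sketch only: `find_grub_cfg` is a hypothetical helper, and the candidate list is an assumption covering common CentOS/RHEL and Debian-style layouts, so extend it for your system.

```shell
#!/bin/sh
# Hypothetical helper: print the first existing grub.cfg among some common
# locations. An optional first argument is prepended to each candidate,
# which is handy for checking a mounted image or a chroot.
find_grub_cfg() {
    root="${1:-}"
    for candidate in \
        /boot/grub2/grub.cfg \
        /boot/grub/grub.cfg \
        /boot/efi/EFI/centos/grub.cfg
    do
        if [ -f "${root}${candidate}" ]; then
            printf '%s\n' "${root}${candidate}"
            return 0
        fi
    done
    return 1
}

# Usage sketch (as root): only write the config if a location was found.
# cfg="$(find_grub_cfg)" && grub-mkconfig -o "$cfg"
```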

Funny, in making sure all packages would build fine in the new base, I was wondering “Who is gonna use grub from Habitat anyway?” And there is my answer :wink:

Do you see any useful feature in Habitat itself that would make this really neat? Or is Habitat just a package installer here?

There is a cool feature hiding in here. Eventually, I want to hab pkg export qemu core/postgresql, and get a working bootable VM that can run directly. That would mean booting from grub into a hab-packaged Linux kernel, with the launcher as pid 1. We have some working prototypes from a while back, but it’s definitely a feature we’re going to build, because seriously: how dope is that.

:slight_smile:

It’s like my brain is that tree and you’re those little cookie elves :slight_smile:

Also, Uh oh… @adam is replying to MY post :eyes:

So, uhh… I guess it’s not as simple as just copying the bzImage from /hab/pkgs/core/linux/4.11.1/20170807154945/boot/bzImage to /boot and creating a custom menu entry:

This did NOT work… (it just prints the message “bzImage…” and hangs, VM doesn’t respond to keyboard interrupts, had to hard power cycle it):

cat /hab/pkgs/core/grub/2.02/20180416191836/etc/grub.d/40_custom
#!/bin/sh
exec tail -n +3 $0
# This file provides an easy way to add custom menu entries.  Simply type the
# menu entries you want to add after this comment.  Be careful not to change
# the 'exec tail' line above.
# menuentry 'HabLinux' --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-92d5200a-94b3-45c5-8b1e-2db2717a1bd8' {
menuentry 'HabLinux' {
	load_video
	set gfxpayload=keep
	insmod gzio
	insmod part_msdos
	insmod ext2
	set root='hd0,msdos1'
	if [ x$feature_platform_search_hint = xy ]; then
	  search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1  940ae1ee-f435-4b54-9df7-0390a553a103
	else
	  search --no-floppy --fs-uuid --set=root 940ae1ee-f435-4b54-9df7-0390a553a103
	fi
	echo	'bzImage ...'
	linux	/bzImage root=/dev/mapper/vg_root-lv_root ro
}

(admittedly, I just copied an existing menuentry and made my changes…)
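
One thing that stands out in that entry is that it has a `linux` line but no `initrd` line, while `root=` points at an LVM volume. A kernel generally needs an initramfs (or equivalent userspace) to activate LVM before it can mount such a root, so that may be a contributing factor; nothing in this thread confirms it is what caused the hang. The sketch below is a rough lint for spotting such entries. `entries_without_initrd` is a hypothetical helper, and it assumes each entry starts with `menuentry '<name>'` and closes with a `}` in the first column, as in the file above.

```shell
#!/bin/sh
# Hypothetical lint: print the name of every menuentry in a GRUB config
# snippet that loads a kernel (linux/linux16/linuxefi) but never loads an
# initrd. Commented-out lines are ignored because they start with '#'.
entries_without_initrd() {
    awk '
        /^[ \t]*menuentry[ \t]/ {
            split($0, parts, "\047")   # \047 is a single quote
            name = parts[2]
            has_linux = 0; has_initrd = 0; in_entry = 1
            next
        }
        in_entry && /^[ \t]*linux(16|efi)?[ \t]/  { has_linux = 1 }
        in_entry && /^[ \t]*initrd(16|efi)?[ \t]/ { has_initrd = 1 }
        in_entry && /^}/ {
            if (has_linux && !has_initrd) print name
            in_entry = 0
        }
    ' "$1"
}

# Usage sketch, with the path taken from the cat command above:
# entries_without_initrd /hab/pkgs/core/grub/2.02/20180416191836/etc/grub.d/40_custom
```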

@rsertelon well, where I was headed was trying to boot from a Linux kernel installed with Habitat, but I think Adam hit the nail on the head with running the habsup as PID 1. What I think would be really cool is being able to leverage the gossip ring for timing boots, i.e. “we need the DB to come up before the web server”, or having VMs in a “standby” state where the habsup holds the boot until it’s told “hey, complete your boot now” and that instance starts taking on workloads… There are a few ideas bouncing around in my head, but I think there’s a lot of potential for “a bootloader that talks to other bootloaders”. Well, on second thought… that would require a running habsup… HOWEVER, I still think there’s some utility there, like “hey, trigger a rebuild of your grub.cfg”.

Very. very dope! Got anything I can poke at?! Now you got me all excited!

This is making me want to replace init with hab sup

@qubitrenegade This is awesome! It’s been a while since I’ve worked on this so my memory may be a bit fuzzy, but I think you’re really, really close.

The 4.11.1 kernel package is compiled with a minimal set of options and no modules, so I’m fairly confident that it doesn’t have the necessary bits to boot your system. However, in the unstable channel there is a 4.13.1 kernel package that has a whole pile of modules available; you just need to make sure they get copied into the right location. You can see an example of that here. There may be some additional work to be done for your use case, as I’ve only booted with a bunch of hand-rolled scripts.
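
As a rough illustration of the "copy the modules into the right location" step, here is a sketch that copies a module tree into `/lib/modules`, which is where the kernel's module tooling conventionally looks. `install_modules` is a hypothetical helper, and the `lib/modules/4.13.1` layout inside the package is an assumption, so list the package contents first to check.

```shell
#!/bin/sh
# Hypothetical helper: copy a kernel's module tree into a target modules
# root. SRC is expected to be a directory named after the kernel release,
# so it ends up as <dest_root>/<release>.
install_modules() {
    src="$1"
    dest_root="${2:-/lib/modules}"
    if [ ! -d "$src" ]; then
        printf 'no module directory at %s\n' "$src" >&2
        return 1
    fi
    mkdir -p "$dest_root" && cp -a "$src" "$dest_root/"
}

# Usage sketch (as root; the in-package path is an assumption, verify it
# with: ls "$(hab pkg path core/linux)"):
# hab pkg install core/linux --channel unstable
# install_modules "$(hab pkg path core/linux)/lib/modules/4.13.1"
# depmod -a 4.13.1
```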

As for starting with the Launcher as pid 1, it’s doable but last time I was playing with this it didn’t understand what it meant to shut down a system (unmount fs, swapoff, etc), so you could kill it (kernel panic!) or kill the vm it was running on (probably a corrupted fs).

@smacfarlane something I’ve been struggling with is order of operations. Like, take booting a system out of the equation… how do I ensure I start the DB before I start the webapp?

Also, I see that 4.11.1 is pretty minimal. I was trying to compile something else (maybe it was keepalived) and was scratching my head over “what do you mean blahblah.h not found!” Then I realized 4.11 vs 4.13 was quite the difference.

Hmm… I wonder if I could replace openssh package with core/ssh… O.O