Creating an initramfs on gentoo

Abstract

This article describes how to create an initramfs for a manually compiled kernel on a Gentoo system. The resulting ram filesystem mounts a root filesystem from an LVM volume, which is located on a hard disk encrypted with dm-crypt and LUKS. It sets the keymap and console font, and provides support for uvesafb and fbcondecor. It also provides resume support using userspace software suspend.

Introduction

Users of the source-based distribution Gentoo have to build their kernels on their own. This is easy as long as a static kernel suffices, but it becomes more complicated if you need an initramfs to provide the kernel with some userspace programs before the actual root filesystem can be mounted. Common use cases for an initramfs are root filesystems located on encrypted disks or logical volumes. In this article, I will illustrate the process of building such an initramfs.

This article’s scope is limited to the creation of the initial ram filesystem itself. It is assumed that you are familiar with installing packages, setting USE flags and building the kernel.

The complete code of this initramfs and the corresponding build script is available in a Mercurial repository.

Building the kernel

The kernel needs to support initramfs, thus the CONFIG_BLK_DEV_INITRD and CONFIG_BLK_DEV_RAM options must be enabled in the kernel configuration.

Depending on the features to be included, the following options also need to be enabled:

LVM support
CONFIG_BLK_DEV_DM
Cryptsetup/LUKS
CONFIG_DM_CRYPT and the required algorithms in “Cryptographic API”
uvesafb
CONFIG_FB_UVESA
fbcondecor
CONFIG_FRAMEBUFFER_CONSOLE and CONFIG_FB_CON_DECOR
Userspace software suspend
CONFIG_PM and CONFIG_HIBERNATION

Note

You must have these drivers compiled into the kernel, because the ram filesystem will not support kernel modules. Such modules could not be unloaded later anyway, so kernel module loading would only add needless complexity to the initramfs.
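
Whether an option is built in rather than compiled as a module can be checked against the kernel configuration before building. The following is a minimal sketch and not part of the original build process; the function name check_builtin is hypothetical, and /usr/src/linux/.config is assumed to be the configuration of the kernel you are about to build:

```shell
# check_builtin CONFIG_FILE OPTION...
# Prints every OPTION that is not set to "y" in CONFIG_FILE (options set to
# "m" or missing entirely are reported) and returns non-zero if any was found.
check_builtin () {
    config="$1"
    shift
    missing=0
    for opt in "$@"; do
        if ! grep -q "^${opt}=y\$" "${config}"; then
            echo "not built in: ${opt}"
            missing=1
        fi
    done
    return "${missing}"
}
```

It could be called as check_builtin /usr/src/linux/.config CONFIG_BLK_DEV_INITRD CONFIG_BLK_DEV_RAM CONFIG_BLK_DEV_DM CONFIG_DM_CRYPT. A non-zero exit status means that at least one option is missing or only built as a module.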

Once the kernel is configured, make all install modules_install will build and install the kernel on your system.

General layout of the build process

The initramfs is a (gzipped) cpio archive. This makes it possible to include special files like device files in the initramfs without root privileges during the build process.

To create such an archive, the kernel provides a binary called gen_init_cpio located at usr/gen_init_cpio in the kernel source tree. This program reads a text file describing the contents of the initial ram filesystem and creates a cpio archive from this description.

On top of this sits a small Python script that expands some placeholders in the initramfs description, feeds the generated description to gen_init_cpio, gzips the output and installs the resulting file in /boot.
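
The core of this pipeline can be sketched in a few lines of shell. This is only an illustration of the steps, not the Python script used in this article; the function name build_image and the GEN_INIT_CPIO variable are assumptions, the default path assumes a kernel tree at /usr/src/linux, and the description file is expected to have its placeholders expanded already:

```shell
# Location of gen_init_cpio; override the variable if your kernel tree
# lives somewhere else.
GEN_INIT_CPIO="${GEN_INIT_CPIO:-/usr/src/linux/usr/gen_init_cpio}"

# build_image DESCRIPTION IMAGE
# Turns the description file into a cpio archive and gzips it into IMAGE.
# The intermediate archive is written to a file first, so that a failing
# gen_init_cpio is not hidden behind the exit status of gzip.
build_image () {
    description="$1"
    image="$2"
    "${GEN_INIT_CPIO}" "${description}" > "${image}.cpio" || return 1
    gzip -9 -c "${image}.cpio" > "${image}" || return 1
    rm -f "${image}.cpio"
}
```

Installing the result in /boot is left out here, because that is the only step that needs root privileges.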

Preparing the build process

Create a new directory somewhere in your home directory. Though only four files will be created throughout this article, placing them in a separate directory helps you keep track of them and avoids cluttering your home directory.

Basic initramfs layout

The initramfs works like a preliminary root filesystem. Thus it needs a basic directory layout and some basic set of userspace programs.

The basic directory layout is simple. Create a new file makefile.tmpl in your build directory and add the following content:

# directory structure
dir /proc 755 0 0
dir /sys 755 0 0
dir /dev 755 0 0
dir /lib 755 0 0
slink /lib64 /lib 755 0 0
dir /bin 755 0 0
dir /sbin 755 0 0
dir /mnt 755 0 0
dir /etc 755 0 0
dir /usr 755 0 0
dir /usr/share 755 0 0

After being processed by a Python script, this file will serve as input to gen_init_cpio. The syntax of this file is beyond the scope of this article; run /usr/src/linux/usr/gen_init_cpio without any options to get documentation about the file format.

For this article busybox was chosen to provide the basic set of userspace programs. Compared to other options like uclibc and klibc, busybox is somewhat larger, but it is also the most complete and the easiest to install.

Install sys-apps/busybox with the static USE flag set (there is no dynamic linker in early userspace). This installs a statically compiled busybox at /bin/busybox. Add this file to the ram filesystem by appending the following lines to makefile.tmpl:

# basic tools
file /bin/busybox /bin/busybox 755 0 0

Busybox is a “multi-call binary”, which means that all userspace programs reside as “applets” within one single binary. For statically compiled programs, this saves a lot of memory.

Aside from basic tools, some fundamental device files and directories are required in makefile.tmpl:

# important device files
nod /dev/mem 600 0 0 c 1 1
nod /dev/null 666 0 0 c 1 3
nod /dev/zero 600 0 0 c 1 5
nod /dev/console 600 0 0 c 5 1
nod /dev/tty0 620 0 0 c 4 0
nod /dev/tty1 600 0 0 c 4 1

# device directories
dir /dev/fb 755 0 0
dir /dev/misc 755 0 0
dir /dev/vc 755 0 0

The init script

If the kernel finds an initramfs while booting the system, it starts /init to perform the initialization until the real root filesystem is mounted.

In this article /init will be provided by a simple shell script. Thus, create a new file called init.sh and add the following line to your makefile.tmpl:

file /init ${package}/init.sh 755 0 0

${package} serves as a placeholder here and points to the build directory. This placeholder will later be expanded by the Python script that runs the actual build process. Don’t worry about this yet.
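
To illustrate what this expansion amounts to, here is a rough shell equivalent using sed. It is a sketch only: the actual build script is written in Python, and the function name expand_template is hypothetical.

```shell
# expand_template TEMPLATE PACKAGE_DIR
# Prints TEMPLATE with every literal ${package} replaced by PACKAGE_DIR.
# '|' serves as the sed delimiter, so slashes in the path need no escaping;
# paths containing '|', '&' or backslashes are not handled by this sketch.
expand_template () {
    template="$1"
    package_dir="$2"
    sed 's|\${package}|'"${package_dir}"'|g' "${template}"
}
```

For example, expand_template makefile.tmpl "$(pwd)" prints a description in which ${package}/init.sh has become the absolute path of init.sh in the build directory.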

Because the shell is provided by busybox, the shebang looks a bit unusual:

#!/bin/busybox ash

Note

Like all busybox applets, ash sticks closely to POSIX and does not provide all the features of common user shells like bash or zsh. Thus you must refrain from using “bashisms” if you extend this shell script.

Now some simple functions to report errors and print messages are defined:

err () { echo "ERROR: $*"; }
msg () { [ "${quiet}" != "y" ] && echo "$@"; }

After these functions a very important variable needs to be declared. It holds the path of the device file that contains the root filesystem. This device will be mounted read-only right before the initramfs is left and the normal init process begins. In my case, it is an LVM volume, so my declaration looks as follows:

export ROOT="/dev/mapper/linux-gentoo"

Of course, you have to set this variable according to your environment. Some features also require variables for configuration, so this place in the init script will be referred to as the variable section throughout this article.

At the very beginning of its actual operation, the init script needs to mount the essential pseudo-filesystems to communicate with the kernel:

mount -t sysfs none /sys
mount -t proc  none /proc

In case you wonder why mount can be called though it was not added to the initramfs: the script runs within busybox, thus its applets can be used just like any external command.

Next the script deals with the kernel messages. If these are not silenced, they clutter the screen and are likely to obscure any important prompts (like the cryptsetup password prompt):

msg 'Silencing procfs ...'
echo 0 > /proc/sys/kernel/printk

Moreover, the kernel command line is read and parsed to let the user silence the messages from the initramfs by passing quiet on the kernel command line:

read CMDLINE </proc/cmdline
export CMDLINE
for x in ${CMDLINE}
do
    case "${x}" in
        quiet)
            quiet='y'
            export quiet
            ;;
    esac
done
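
Because a mistake in the init script only shows up at boot time, it can help to try such logic on a running system first. As a sketch, the same loop can be wrapped in a function that takes the command line as an argument instead of reading /proc/cmdline; the function name parse_cmdline is hypothetical:

```shell
# parse_cmdline "COMMAND LINE"
# Sets and exports quiet='y' if the word "quiet" occurs in the given command
# line, and leaves quiet empty otherwise. The argument is deliberately
# expanded unquoted, so that it is split into words just like in the loop
# of the init script.
parse_cmdline () {
    quiet=''
    for x in $1; do
        case "${x}" in
            quiet)
                quiet='y'
                ;;
        esac
    done
    export quiet
}
```

Calling parse_cmdline "$(cat /proc/cmdline)" and then echo "${quiet}" shows how the initramfs would interpret the command line of the running system.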

The next step will create the necessary device files. Normally this is done by udev. However, including udev in the initramfs is a rather complicated task, as it would require a static udev to be compiled during the build process. Therefore the mdev applet from busybox serves as a replacement, as it does much the same thing (though in a much simpler way). The init script calls this applet to create a rudimentary device structure under /dev and afterwards makes sure that the most important nodes exist:

echo "/sbin/modprobe" > /proc/sys/kernel/modprobe

msg "Creating device nodes ..."
mdev -s
echo mdev > /proc/sys/kernel/hotplug

# ensure that important nodes exist
[ ! -c "/dev/console" ] && mknod "/dev/console" c 5 1
[ ! -c "/dev/null" ] && mknod "/dev/null" c 1 3

Once this is done, the initramfs environment is ready for all sorts of userspace programs. These will be covered in the next section.

Features

Keymap and console font

Busybox uses a different keymap format, thus it is not possible to use the existing keymaps under /usr/share/keymaps/ directly. Instead, a busybox-compatible keymap has to be created using busybox dumpkmap.

Change to a text console (using Ctrl+Alt+F1 for instance), change to your initramfs directory and execute the following command:

busybox dumpkmap > keymap

This will create a busybox keymap from the current console keymap (make sure that the console has a proper keymap set up!). Now you can include this keymap in makefile.tmpl:

# keymap
file /etc/keymap ${package}/keymap 644 0 0

To load the keymap in init.sh, the following code must be added:

# set keymap
msg "Enabling unicode ..."
kbd_mode -u -C /dev/tty1
printf "\033%%G" >> /dev/console
msg "Loading keymap ..."
loadkmap < /etc/keymap

Note

kbd_mode -u -C /dev/tty1 enables unicode. Don’t include this line if your keymap doesn’t support unicode.

To change the default console font, an alternative font must be added to the initramfs. This article uses the excellent Terminus font (media-fonts/terminus-font):

dir /usr/share/consolefonts 755 0 0
file /usr/share/consolefonts/ter-v16n.psf.gz /usr/share/consolefonts/ter-v16n.psf.gz 644 0 0

To enable the font, the following code is used in init.sh:

# set console font
msg "Setting font ..."
setfont /usr/share/consolefonts/${CONSOLEFONT}.psf.gz -C /dev/tty1
printf "\033(K" >> /dev/console

${CONSOLEFONT} must be set in the variable section. In this case the following setting is appropriate:

export CONSOLEFONT='ter-v16n'

Of course, any font from /usr/share/consolefonts/ can be used, but remember to adjust the code in this section to use your favourite font.

uvesafb

Uvesafb is a modern framebuffer driver, which also works on non-x86 architectures. It needs a little userspace daemon called v86d, which must be provided by the initramfs. Install sys-apps/v86d. This will pull in dev-libs/klibc. For v86d to work, klibc must be built against a kernel that has CONFIG_FB_UVESA set. It is therefore important that you compile your kernel before installing sys-apps/v86d.

Including v86d is rather simple. Just add the following line to makefile.tmpl:

# framebuffer support
file /sbin/v86d /sbin/v86d 755 0 0

No modification of the init script is required. The kernel does all the work and calls this helper automatically when uvesafb is loaded.

fbcondecor

fbcondecor is a patch included in sys-kernel/gentoo-sources, which makes it possible to display background images on framebuffer consoles.

Like uvesafb, fbcondecor needs a userspace helper application, called fbcondecor_helper. It is contained in media-gfx/splashutils. This package also provides a shell file to be sourced by the init script to set up fbcondecor. These files need to be included in the initramfs, so add the following lines to makefile.tmpl:

# splash support
file /sbin/fbcondecor_helper /sbin/fbcondecor_helper 755 0 0
slink /sbin/splash_helper /sbin/fbcondecor_helper 755 0 0
file /etc/initrd.splash /usr/share/splashutils/initrd.splash 644 0 0

The userspace helper needs some directories in order to work correctly [1]:

dir /lib/splash 755 0 0
dir /lib/splash/sys 755 0 0
dir /lib/splash/proc 755 0 0

Moreover you will need a theme, like the nice natural_gentoo from media-gfx/splash-themes-gentoo:

dir /etc/splash 755 0 0
dir /etc/splash/natural_gentoo 755 0 0
file /etc/splash/natural_gentoo/1280x800.cfg /etc/splash/natural_gentoo/1280x800.cfg 644 0 0
dir /etc/splash/natural_gentoo/images 755 0 0
file /etc/splash/natural_gentoo/images/silent-1280x800.jpg /etc/splash/natural_gentoo/images/silent-1280x800.jpg 644 0 0
file /etc/splash/natural_gentoo/images/verbose-1280x800.jpg /etc/splash/natural_gentoo/images/verbose-1280x800.jpg 644 0 0
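
Listing every theme file by hand gets tedious for larger themes. As a sketch, such entries could also be generated with find; the helper name emit_tree is hypothetical, and unlike the hand-written list above it includes every file of the theme, not just those for a single resolution:

```shell
# emit_tree SRC_DIR
# Prints a "dir" line for SRC_DIR and every directory below it, and a "file"
# line for every regular file, using the same absolute path inside the
# initramfs as on the host. Sorting in the C locale guarantees that a
# directory is listed before its contents. Paths containing whitespace are
# not supported.
emit_tree () {
    find "$1" \( -type d -o -type f \) | LC_ALL=C sort | while read -r path; do
        if [ -d "${path}" ]; then
            echo "dir ${path} 755 0 0"
        else
            echo "file ${path} ${path} 644 0 0"
        fi
    done
}
```

The output of emit_tree /etc/splash/natural_gentoo can be pasted into makefile.tmpl. The parent directory /etc/splash is not part of the output and still has to be declared by hand.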

The init script part for fbcondecor is rather simple:

# setup splash screen
. /etc/initrd.splash
msg "Setting up splashscreen ..."
splash init

This sources the splash function provided by splashutils and initializes the splash screen.

Device mapper

This section describes the setup of device-mapper software such as lvm and dm-crypt/cryptsetup.

All this software has in common that the device mapper infrastructure needs to be set up. This infrastructure consists of the directory for the device files and the control device. Therefore the following line must be added to makefile.tmpl:

dir /dev/mapper 755 0 0

And the following snippet must be added to init.sh:

# create device mapper control file
if [ -e "/sys/class/misc/device-mapper" ]; then
    if [ ! -c "/dev/mapper/control" ]; then
        # create device, if required
        msg 'Creating device mapper control node ...'
        read dev_t < /sys/class/misc/device-mapper/dev
        mknod "/dev/mapper/control" c `echo "$dev_t" | tr ':' ' '`
    fi
else
    err "Device mapper unavailable, aborting ..."
    exit
fi

This reads the major and minor number of the control device from /sys/class/misc/device-mapper and creates the control device at /dev/mapper/control.
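
The conversion of the major:minor pair can also be done with POSIX parameter expansion, which avoids the pipe to tr and is easy to try on a running system. A minimal sketch; the function name split_dev_t is hypothetical:

```shell
# split_dev_t "MAJOR:MINOR"
# Prints the two numbers separated by a space, which is the form mknod
# expects for its last two arguments.
split_dev_t () {
    major="${1%%:*}"   # strip everything from the first ':' onwards
    minor="${1#*:}"    # strip everything up to and including the first ':'
    echo "${major} ${minor}"
}
```

On a system with device mapper support, split_dev_t "$(cat /sys/class/misc/device-mapper/dev)" prints the numbers the init script will pass to mknod.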

Cryptsetup/LUKS

To enable cryptsetup support, only a single file is needed, which must be added to makefile.tmpl using the following line:

# cryptsetup support
file /sbin/cryptsetup /sbin/cryptsetup 755 0 0

In the init script, cryptsetup is called to open the container:

# unlock crypto device
msg 'Opening luks container ...'
while ! cryptsetup luksOpen "${CRYPTROOT}" "${CRYPTNAME}" > /dev/null; do
    sleep 2;
done
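
The loop above asks for the passphrase until the container has been opened. If you would rather give up after a few attempts, a bounded variant could look like the following sketch; the helper name retry and its arguments are assumptions, not part of the original init script:

```shell
# retry MAX DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds, but at most MAX times, sleeping DELAY
# seconds between attempts. Returns 0 on success and 1 if every attempt
# failed.
retry () {
    max="$1"
    delay="$2"
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "${attempt}" -ge "${max}" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "${delay}"
    done
    return 0
}
```

In init.sh this could be used as retry 3 2 cryptsetup luksOpen "${CRYPTROOT}" "${CRYPTNAME}" || err "Could not open luks container". What should happen after the last failed attempt is up to you.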

The variables ${CRYPTROOT} and ${CRYPTNAME} must be assigned proper values in the variable section (see The init script). The former is the path of the raw device file on which the encrypted partition is located; the latter is the name used to create a device file for the opened container within /dev/mapper/ (you can choose it freely). In my case, these declarations look as follows:

export CRYPTROOT="/dev/sda4"
export CRYPTNAME="root"

With these values /dev/sda4 is opened and made available at /dev/mapper/root.

LVM

To use LVM, the lvm executable and lvm.conf are required. These files must be added to makefile.tmpl:

file /sbin/lvm /sbin/lvm.static 755 0 0
dir /etc/lvm 755 0 0
file /etc/lvm/lvm.conf /etc/lvm/lvm.conf 644 0 0

This adds the statically linked lvm executable and the lvm configuration to the initramfs. Make sure that the lvm configuration is correct and working before you build the initramfs.

Two calls must be added to init.sh:

# discover and enable lvm volumes
msg 'Scanning lvm volumes ...'
lvm vgscan --ignorelockingfailure > /dev/null
msg 'Enabling lvm volumes ...'
lvm vgchange --ignorelockingfailure -ay > /dev/null

This code first scans all devices for LVM volumes and creates device nodes for them. Then all LVM volumes are enabled.

Userspace software suspend

Unlike TuxOnIce, userspace software suspend works without a huge kernel patch, but requires a userspace application to resume. Install sys-power/suspend and add the following lines to makefile.tmpl:

# uswsusp support
file /sbin/resume /usr/lib/suspend/resume 755 0 0
file /etc/suspend.conf /etc/suspend.conf 644 0 0

This adds the resume helper and the suspend configuration. Make absolutely sure that the configuration is correct before building the initramfs.

The resume helper must be called from init.sh in order to resume:

# try to resume
resume

If an image is found and resuming succeeds, this call does not return, so it should be the very last call right before the initramfs is left as described in Leaving the initramfs. Moreover, the device that holds the image data must be available at the time of invocation, or resume will fail.

Leaving the initramfs

Now the device containing the root filesystem is available and everything has been set up. Thus it is time to leave the initial ram disk and let the real init process take over. This is done in init.sh. First, the root filesystem needs to be mounted:

msg 'Mounting root filesystem ...'
mount -o ro "${ROOT}" "/mnt"

To let fsck do its work during the normal boot process, the root filesystem must be mounted read-only.
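
If the wrong device was mounted, switch_root will fail later on, the init script will exit and the kernel will panic. As a sketch, a small check right after mounting can catch this case while a shell is still available; the function name new_root_ok and the rescue shell are assumptions, not part of the original init script:

```shell
# new_root_ok DIR [INIT]
# Succeeds if INIT (default /sbin/init) exists below DIR as an executable
# file or as a symlink. A symlink is accepted unchecked, because an absolute
# link target cannot be resolved correctly before the root has been switched.
new_root_ok () {
    candidate="$1${2:-/sbin/init}"
    [ -x "${candidate}" ] || [ -L "${candidate}" ]
}

# Possible use in init.sh, directly after mounting the root filesystem:
#
#   if ! new_root_ok /mnt; then
#       err "No init found on ${ROOT}, starting rescue shell ..."
#       exec /bin/busybox ash
#   fi
```

The rescue shell replaces the init script, so the system has to be rebooted after investigating the problem.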

Now some cleanup must be performed:

msg 'Cleaning up ...'
echo > /proc/sys/kernel/hotplug
umount /proc
umount /sys

The path to the kernel’s hotplug agent is reset so that udev can step in, and the pseudo-filesystems are unmounted. Now switch_root is executed, which switches the root filesystem and starts the init process:

msg 'Switching root ...'
exec switch_root /mnt /sbin/init

This is the very last action of the initial ram disk, now the real boot process starts.

The build script

The source directory contains makeinitramfs.py, a Python script to ease the build process (requires Python 2.6). The script expands the placeholders in makefile.tmpl, runs gen_init_cpio and writes the generated file to initramfs in the current directory.

If the option -z or --gzip is specified, the generated image is gzipped. With -i or --install the generated image is installed at /boot/initramfs. This breaks the complete build process down to this simple command:

python makeinitramfs.py -z -i

Warning

Don’t run this command as root. If the script detects that it is not running with uid 0, it uses sudo, or su if the former isn’t found, to install the created initramfs image. Moreover, it doesn’t restart itself with root privileges, but instead uses the popular install utility. Thus using the -i option is equivalent to the following command:

sudo install -m 644 initramfs /boot

This keeps the amount of privileged code within reasonable limits.
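
The selection between a direct call, sudo and su can be illustrated in shell as follows. This is a sketch of the logic described above, not the code of makeinitramfs.py; the function name install_command is hypothetical, and it only prints the command instead of running it:

```shell
# install_command IMAGE DEST [UID]
# Prints the command line that would install IMAGE into DEST.
# UID defaults to the id of the current user; passing it explicitly is only
# meant for trying out the different branches.
install_command () {
    cmd="install -m 644 $1 $2"
    uid="${3:-$(id -u)}"
    if [ "${uid}" -eq 0 ]; then
        # already root, no wrapper needed
        echo "${cmd}"
    elif command -v sudo > /dev/null 2>&1; then
        echo "sudo ${cmd}"
    else
        # fall back to su if sudo is not installed
        echo "su -c '${cmd}'"
    fi
}
```

Only the install command itself ever runs with root privileges; everything that processes the template and builds the archive stays unprivileged.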

Booting the initramfs

To use the created initramfs, an initrd command must be added to the kernel entry in grub.conf:

initrd               /initramfs

An example entry is shown below:

title                Gentoo x86_64 2.6.27-gentoo-r7
kernel               /vmlinuz-2.6.27-gentoo-r7 quiet video=uvesafb:1280x800-32@60,mtrr:3,ywrap splash=verbose,theme:natural_gentoo
initrd               /initramfs

quiet suppresses the messages from the initial ram filesystem. video=uvesafb:1280x800-32@60,mtrr:3,ywrap configures the framebuffer driver, using a 1280x800 resolution with 32 bit color depth and 60 Hz refresh rate. splash=verbose,theme:natural_gentoo sets up fbcondecor. verbose disables the silent splash screen and enables only the console decorations; theme:natural_gentoo defines the splash theme to use.

Footnotes

[1] I don’t know why; any hints are appreciated.