Linux initramfs architecture explained through early userspace design, cpio archive structure, the mkinitcpio and dracut workflows, and root filesystem mounting


The Linux kernel is a capable piece of software, but there is one thing it cannot reliably do by itself: find and mount the root filesystem on a modern, complex system. Storage may be encrypted. The root partition may live on a logical volume, a software RAID array, or a network share that requires DHCP negotiation before any disk is even visible. Device drivers for the storage controller may not be compiled directly into the kernel, sitting instead as loadable modules on the very filesystem that has not been mounted yet. The circularity of that problem is not theoretical. It is a practical constraint that every Linux installation must resolve before the system can do anything useful.

initramfs is the resolution. It is a temporary root filesystem loaded entirely into RAM, handed to the kernel by the boot loader before control passes to the real system. What happens inside that small, self-contained environment determines whether the machine boots cleanly or stops with a kernel panic at the worst possible moment.

What initramfs Actually Is at the Filesystem Level

The name contains the answer. initramfs combines three concepts: it is an initial (early-boot) RAM (memory-resident) filesystem. Technically, it is a cpio archive, optionally compressed, that the kernel extracts at startup into rootfs, a RAM-backed filesystem that is an instance of ramfs, or of tmpfs when the kernel is built with tmpfs support. The choice of cpio over tar is deliberate and documented in the kernel source: the cpio format is simpler, easier to generate and parse, and the kernel's implementation of it is entirely self-contained, requiring no external tooling at boot time.

The kernel handles two variants of this archive. The first is an internal initramfs, linked directly into the kernel binary itself during the build process. By default this internal archive is essentially empty, consuming roughly 134 bytes on x86. The second, and far more common, variant is an external initramfs image stored in /boot and referenced by the boot loader configuration. The boot loader reads that image into memory and passes its address and size to the kernel along with the other boot parameters. When the kernel detects a valid cpio archive at that address, it extracts the contents into its RAM-backed root filesystem and uses that as the root filesystem for the duration of early userspace.

The distinction from the older initrd mechanism is worth understanding. The original initrd was a block device image of a real filesystem, typically ext2, that the kernel had to mount using a driver compiled into itself. initramfs requires no block device and no on-disk filesystem driver: the RAM-backed filesystem it is unpacked into (ramfs, or tmpfs when that is enabled) is always built into the kernel. The kernel simply unpacks the cpio stream into memory and proceeds. That difference eliminates a dependency cycle and makes the boot environment genuinely self-sufficient.

After extraction, the kernel looks for a single file at the root of the archive: /init. If that file exists and is executable, the kernel runs it as PID 1. From that point forward, userspace code is in control of the boot process. It is responsible for everything that needs to happen before the real root filesystem can be mounted and switch_root can hand control to the final init system.
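
The /init requirement is easy to check on an unpacked image before it is ever booted. The following is a minimal sketch, not part of any distribution tool: check_init is a hypothetical helper name, and the tree it inspects is assumed to be a directory holding an extracted initramfs.

```shell
#!/bin/sh
# check_init TREE
# Succeeds only if TREE/init is a regular file (or a symlink that resolves
# to one) with an execute bit set. TREE is a hypothetical directory holding
# an unpacked initramfs; check_init is an illustrative name.
check_init() {
    tree=$1
    if [ ! -f "$tree/init" ]; then
        echo "missing: $tree/init" >&2
        return 1
    fi
    if [ ! -x "$tree/init" ]; then
        echo "not executable: $tree/init" >&2
        return 1
    fi
    echo "ok: $tree/init"
}
```

One caveat: if /init inside the image is an absolute symlink, this check resolves it against the host filesystem rather than the image, so the result should be read with that in mind.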

Why Early Userspace Is the Right Place for Complexity

Many operations that seem like they should happen in the kernel are better handled in userspace, and not only because userspace code is easier to write and debug. Userspace can use a full C library, can spawn helper processes, can interact with the kernel through ordinary system calls, and can be updated or replaced without recompiling the kernel itself.

Consider full-disk encryption with LUKS. Decrypting a partition requires prompting the user for a passphrase, optionally querying a TPM or a remote key server, running the cryptographic operations, and assembling the resulting block device. That sequence is tractable in a shell script or a compiled binary. Doing it inside the kernel would mean either embedding complex policy into kernel space or exposing far more surface area to privilege escalation. The initramfs model keeps that complexity exactly where it belongs: in userspace, isolated, auditable, and replaceable.

The same logic applies to LVM volume groups, software RAID arrays assembled with mdadm, Btrfs RAID configurations, NFS root mounts, and even systems that simply use a separate /usr partition. All of them need setup work before the real root is accessible. The initramfs /init script performs that work and then calls switch_root to hand off to the real system.

Inspecting and Building a Minimal initramfs by Hand

Before reaching for automated tools, building a minimal initramfs manually illuminates exactly what these tools produce and why. The archive structure is simple: a directory tree packed into a cpio stream and compressed.

Starting from a staging directory:

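# Brace expansion below requires bash
mkdir -p /tmp/initramfs/{bin,dev,proc,sys,mnt/root,lib,lib64}

# Copy BusyBox to serve as /init's interpreter. It must be statically
# linked, because no shared libraries are copied into the image.
cp /bin/busybox /tmp/initramfs/bin/busybox
chmod +x /tmp/initramfs/bin/busybox

# Link the applets that /init calls by name (needed unless BusyBox was
# built to run applets without symlinks)
for applet in sh mount umount switch_root; do
    ln -s busybox "/tmp/initramfs/bin/$applet"
done

# Create required device nodes (requires root)
mknod /tmp/initramfs/dev/console c 5 1
mknod /tmp/initramfs/dev/null    c 1 3
mknod /tmp/initramfs/dev/tty     c 5 0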

The /init script is the critical piece. A minimal but functional example:

#!/bin/sh
# Mount essential virtual filesystems
mount -t proc  none /proc
mount -t sysfs none /sys
mount -t devtmpfs devtmpfs /dev

echo "Early userspace active. Attempting to mount root..."

# Mount the real root filesystem (adjust the device, and add -t <fstype>
# if autodetection fails). If PID 1 exits, the kernel panics, so fall back
# to an interactive shell when the mount fails.
mount -o ro /dev/sda1 /mnt/root || exec sh

# Clean up virtual mounts
umount /proc /sys /dev

# Hand control to the real init
exec switch_root /mnt/root /sbin/init
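
The /init script above hard-codes /dev/sda1. Real generators usually take the device from the root= parameter on the kernel command line. The sketch below shows one way that lookup could work in plain sh. parse_root_param is a hypothetical helper name, and using the last root= argument when several are present is an assumption of this sketch.

```shell
#!/bin/sh
# parse_root_param CMDLINE
# Prints the value of the last root= argument found in CMDLINE and fails
# if there is none. The body runs in a subshell so that disabling globbing
# and the loop variables do not leak into the caller.
parse_root_param() (
    set -f                      # no pathname expansion while splitting
    root=
    for arg in $1; do
        case $arg in
            root=*) root=${arg#root=} ;;
        esac
    done
    [ -n "$root" ] || exit 1
    printf '%s\n' "$root"
)

# Example with a literal command line; inside /init the argument would
# typically be "$(cat /proc/cmdline)".
parse_root_param 'BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet'
```

A value such as UUID=... or LABEL=... still has to be resolved to a device node before mounting, for example with the BusyBox findfs applet if it is compiled in.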

Packing and compressing the staging directory into an image:

cd /tmp/initramfs
# -print0 / --null keep unusual file names intact (GNU find and cpio)
find . -print0 | cpio --null --create --format=newc | gzip -9 > /boot/initramfs-custom.img

The resulting file is a gzip-compressed newc-format cpio archive. The kernel recognizes the gzip magic bytes, decompresses the stream, unpacks it into its RAM-backed root filesystem, and runs /init. That is the entire mechanism. Everything else built on top of it is tooling, automation, and policy.

To inspect an existing initramfs image without modifying it, the archive can be extracted to a temporary directory:

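mkdir /tmp/inspect-initramfs && cd /tmp/inspect-initramfs
# For gzip-compressed images. The path uses the initramfs-<version>.img
# naming; Debian and Ubuntu name the file /boot/initrd.img-<version>
# and may use a different compressor.
zcat /boot/initramfs-$(uname -r).img | cpio -idmv

# For zstd-compressed images (Arch Linux mkinitcpio default for kernel 5.9+):
zstd -dc /boot/initramfs-linux.img | cpio -idmv

# Distribution tools (lsinitcpio, lsinitrd, unmkinitramfs) can list or
# unpack an image without manual decompression.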

Once extracted, the full tree of binaries, libraries, modules, and hook scripts becomes visible and navigable with ordinary filesystem tools.

mkinitcpio: The Arch Linux Approach to Controlled Composition

mkinitcpio is the initramfs generator developed by and for Arch Linux, though it has seen adoption beyond that distribution. Its central design principle is explicit control: the administrator specifies exactly which kernel modules and which hooks run during initramfs construction and early boot.

The configuration file /etc/mkinitcpio.conf exposes a small set of variables, among them MODULES, BINARIES, FILES, HOOKS, COMPRESSION, and COMPRESSION_OPTIONS. The most important are MODULES, HOOKS, and COMPRESSION:

# /etc/mkinitcpio.conf

# Kernel modules to load unconditionally before hooks run
MODULES=(vfio_pci vfio vfio_iommu_type1)

# Hooks executed in order during image build and early boot
HOOKS=(base udev autodetect modconf kms keyboard keymap consolefont
       block encrypt lvm2 filesystems fsck)

# Compression algorithm. Default is zstd for kernels 5.9 and newer
COMPRESSION="zstd"
COMPRESSION_OPTIONS=(-19 -T0)

Hook order in the HOOKS array is not cosmetic. It is sequential and consequential. The encrypt hook must appear before lvm2 if the LVM volume group lives on an encrypted partition. The block hook must appear before any filesystem-related hooks. Getting this order wrong produces a system that does not boot, often with an error message obscure enough to send an unprepared administrator toward recovery media.
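
The ordering rules named above can be checked mechanically before rebuilding an image. This is a rough sketch, not part of mkinitcpio: hook_index and hook_before are hypothetical helper names, and the hook list is passed as a plain space-separated string rather than parsed from /etc/mkinitcpio.conf.

```shell
#!/bin/sh
# hook_index LIST NAME
# Prints the zero-based position of NAME in the space-separated LIST,
# or fails if NAME is absent.
hook_index() {
    i=0
    for h in $1; do
        if [ "$h" = "$2" ]; then
            echo "$i"
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}

# hook_before LIST A B
# Fails only when both hooks are present and A comes after B.
hook_before() {
    a=$(hook_index "$1" "$2") || return 0
    b=$(hook_index "$1" "$3") || return 0
    [ "$a" -lt "$b" ]
}

hooks="base udev autodetect modconf kms keyboard keymap consolefont block encrypt lvm2 filesystems fsck"
hook_before "$hooks" encrypt lvm2      || echo "encrypt must precede lvm2"
hook_before "$hooks" block filesystems || echo "block must precede filesystems"
```

With the example list both checks pass silently. A list in which lvm2 appears ahead of encrypt would trigger the first message.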

After editing the configuration, the initramfs is rebuilt with:

# Rebuild using the preset for the currently installed linux kernel
sudo mkinitcpio -p linux

# Rebuild all presets for all installed kernels
sudo mkinitcpio -P

# Dry run: without -g or a preset, mkinitcpio writes no image;
# -v lists what would be included
sudo mkinitcpio -v

mkinitcpio by default generates images using zstd compression with multi-threaded encoding, producing compact images that decompress quickly on boot. Reproducible output is a design goal: timestamps inside the archive are set to the Unix epoch, so two successive runs on an unchanged system with the same configuration are expected to produce binary-identical images.

Dracut: Automatic Detection for Fedora, RHEL, and Beyond

Dracut takes a philosophically different approach. Rather than asking the administrator to enumerate what the initramfs needs, it inspects the running system and determines that automatically. It examines which modules are loaded, which filesystems are in use, what the root device is, and assembles an image that contains precisely the components required to reproduce the current boot on the current hardware.
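
The kind of information such an inspection relies on is visible from ordinary procfs files. The sketch below is only an illustration of that idea and does not reproduce dracut's actual detection logic. loaded_modules and root_mount_info are hypothetical helper names, and each takes an optional file argument so it can be pointed at a saved copy instead of the live system.

```shell
#!/bin/sh
# loaded_modules [FILE]
# Prints the names of loaded kernel modules, sorted, from a file in
# /proc/modules format (module name is the first field).
loaded_modules() {
    awk '{ print $1 }' "${1:-/proc/modules}" | sort
}

# root_mount_info [FILE]
# Prints "<source> <fstype>" for the last entry mounted on / in a file in
# /proc/mounts format. Later entries shadow earlier ones, so the last
# match describes the root filesystem currently in effect.
root_mount_info() {
    awk '$2 == "/" { src = $1; fs = $3 }
         END { if (src != "") print src, fs }' "${1:-/proc/mounts}"
}
```

Called without arguments on a running system, the two functions give a first approximation of which drivers and which filesystem type a host-specific image would need.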

On Red Hat family distributions (Fedora, RHEL, Rocky Linux, and their derivatives), dracut is the default initramfs generator. Its basic invocation requires no positional arguments:

# Generate initramfs for the currently running kernel
sudo dracut --force

# Generate for a specific kernel version (image name matches the version)
sudo dracut --force /boot/initramfs-6.6.8-200.fc39.x86_64.img 6.6.8-200.fc39.x86_64

# Build host-only image (excludes modules unnecessary on this hardware)
sudo dracut --hostonly --force

# List all available dracut modules
dracut --list-modules

The --hostonly flag is significant in practice. A generic dracut image includes broad hardware support and can boot on many different systems. A host-only image includes only what the current hardware needs, producing a smaller archive with faster decompression. For production servers that never change their hardware, host-only mode is the sensible default.

Dracut's configuration lives in /etc/dracut.conf and drop-in files under /etc/dracut.conf.d/. Adding a module, setting compression, or including extra binaries follows a declarative syntax:

# /etc/dracut.conf.d/custom.conf

# Force inclusion of specific dracut modules
add_dracutmodules+=" crypt lvm "

# Exclude dracut modules that are not needed
omit_dracutmodules+=" nfs brltty "

# Compress with zstd
compress="zstd"

# Include extra files verbatim
install_items+=" /etc/crypttab /usr/bin/custom-tool "

One characteristic that surprises administrators migrating from mkinitcpio is that dracut images tend to be larger in their default configuration. The difference comes from inclusion breadth and compression settings: dracut has historically defaulted to gzip at a moderate compression level (newer releases may pick a different compressor when one is available), while mkinitcpio on recent kernels defaults to zstd. With equivalent content and equivalent compression, the sizes converge.

Compression Methods and Their Trade-offs

The kernel supports multiple compression formats for the initramfs archive, and the choice affects both image size and boot time in ways that are not always intuitive.

gzip is the universal fallback: fast to compress, widely supported, moderate compression ratio. LZ4 compresses far less aggressively but decompresses at speeds approaching memory bandwidth, which makes it attractive on systems where boot time matters more than image size; that was the rationale behind Ubuntu's switch to LZ4 in 2018. zstd, part of the kernel since 4.14 and supported for initramfs decompression since 5.9, offers compression ratios approaching those of xz at decompression speeds closer to LZ4. It is the most balanced option for general use, which is why mkinitcpio made it the default for kernels 5.9 and newer.

xz achieves the smallest possible image sizes but at a steep decompression cost. On embedded systems with constrained flash storage, that trade-off may be acceptable. On a workstation that boots daily, the seconds spent decompressing an xz image accumulate into a real annoyance.

The microcode prepending behavior is a subtlety worth knowing: on most modern Linux systems, the initramfs image in /boot is actually two concatenated cpio archives. The first is an uncompressed cpio containing CPU microcode updates from the intel-ucode or amd-ucode packages. The second is the compressed initramfs proper. The kernel processes them sequentially: it extracts the uncompressed microcode archive first, loads the microcode into the CPU, then extracts and runs the main initramfs. This is why examining an initramfs with a naive zcat | cpio -t sometimes yields confusing output: the tool is encountering the uncompressed prefix before reaching the compressed section.
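
One way to see what an image actually starts with is to look at its leading magic bytes. The sketch below assumes an od that accepts -An, -tx1, and -N (GNU coreutils does). image_format is a hypothetical helper name, and the table covers only a handful of formats.

```shell
#!/bin/sh
# image_format FILE
# Guesses the format of FILE from its first bytes and prints one of:
# cpio, gzip, zstd, xz, lz4, bzip2, unknown.
image_format() {
    # First six bytes as hex digits without separators, e.g. "1f8b08000000"
    magic=$(od -An -tx1 -N6 "$1" | tr -d ' \n')
    case $magic in
        303730373031*|303730373032*) echo cpio ;;   # ASCII "070701" / "070702" (newc)
        1f8b*)         echo gzip ;;
        28b52ffd*)     echo zstd ;;
        fd377a585a00*) echo xz ;;
        02214c18*)     echo lz4 ;;                  # legacy LZ4 framing
        425a68*)       echo bzip2 ;;
        *)             echo unknown ;;
    esac
}
```

A result of cpio for a file under /boot suggests that an uncompressed archive, such as early microcode, comes first, which would explain why piping the whole file through zcat fails.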

What Happens When the initramfs Cannot Mount Root

The failure mode of an initramfs problem is one that system administrators encounter at the least convenient times: the kernel boots, the initramfs starts, and then the system drops into an emergency shell or simply hangs with a message about being unable to find the root device.

The most common causes are a mismatch between the root= kernel parameter and the actual device path after the initramfs runs, a missing kernel module for the storage controller, a hook missing from or mispositioned in the HOOKS array, or a corrupt initramfs image caused by interrupted regeneration during a package update.

When the initramfs drops to its emergency shell, a few commands show what the environment can and cannot see:

# Inside the initramfs emergency shell:

# Check which block devices are visible
ls /dev/disk/by-uuid/

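# Attempt manual mount (the mount point varies by generator:
# /new_root under mkinitcpio, /sysroot under dracut)
mount -o ro /dev/nvme0n1p2 /mnt/root

# Check available modules (the pattern also matches compressed
# modules such as .ko.zst or .ko.xz)
find /lib/modules -name "*.ko*" | grep nvme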

# Load a missing module manually and retry
modprobe nvme

From that shell, an administrator can diagnose what the automated init script could not do, then rebuild a corrected image after chrooting into the system from a rescue environment.
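
The same module check can be run from the rescue environment against an extracted image, before rebooting into it. This is a minimal sketch under stated assumptions: find_module and has_module are hypothetical helper names, and the tree argument is assumed to be a directory holding an unpacked initramfs with a lib/modules hierarchy.

```shell
#!/bin/sh
# find_module TREE NAME
# Lists module files for NAME under TREE/lib/modules, matching plain and
# compressed files (.ko, .ko.zst, .ko.xz, ...). File names may use '-'
# where the loaded module name uses '_', so both spellings are tried.
find_module() {
    alt=$(printf '%s' "$2" | tr '_' '-')
    find "$1/lib/modules" -type f \
        \( -name "$2.ko*" -o -name "$alt.ko*" \) 2>/dev/null
}

# has_module TREE NAME
# Succeeds if at least one matching module file exists in the tree.
has_module() {
    [ -n "$(find_module "$1" "$2")" ]
}
```

For example, `has_module /tmp/inspect-initramfs nvme || echo "nvme driver missing from image"` would flag an image built without the storage driver, assuming the image was unpacked to that directory.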

The initramfs is a small environment with an outsized influence on boot reliability. Administrators who understand its structure, the tools that build it, and the sequence it executes are administrators who fix boot failures quickly rather than reinstalling out of frustration. That understanding begins with recognizing what the initramfs actually is: not a black box conjured by a package manager, but a cpio archive containing a shell script and a set of binaries that together solve one clearly defined problem before handing the system off to everything that comes next.