Jacek's Blog

Software Engineering Consultant

Single-Command Server Bootstrapping

April 5, 2023 nix

When you spin up a new VM or bare metal server at some cloud provider, what is the fastest and easiest way to get the server to run a certain configuration? In this article, I show how to partition and format the disks and install a fully configured NixOS, starting from a random rescue system, in 5 minutes and with literally a single command: nixos-anywhere.

Whether the machine runs some old preinstalled Linux distro or a rescue live image, nixos-anywhere helps us iron a new system over it. This approach works on practically any VM or bare metal machine, regardless of whether it is hosted by a cloud provider or sits in your own flat. We are going to demonstrate it on Hetzner machines: The first example deploys a simple, minimal NixOS configuration on a cheap Hetzner Cloud VM. Afterward, we will go into more detail about how the whole thing works. Finally, we will walk through a more complex example with RAID schemes on multiple hard disks.

A Simple Single-Command Deployment Example

This example works with any cloud-hosted VM or any server or laptop that happens to be standing around: Our only bootstrap requirement is that the machine has booted some Linux that we can SSH into and that allows us to perform the kexec system call. Systems configured like this are typically the VMs and bare metal servers of cloud providers, which run either a minimal rescue system after you have ordered them or a completely preconfigured Ubuntu/Debian/Fedora.
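Whether kexec is still permitted can be checked on the target before starting. The following is a minimal sketch and not part of nixos-anywhere: the function name kexec_allowed is made up for this article, and treating a missing sysctl file as "not usable" is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper, not part of nixos-anywhere.
# Reports whether the kexec_load system call is still permitted.
# The sysctl file path is a parameter so the function can be tried on a copy.
kexec_allowed() {
  local flag_file="${1:-/proc/sys/kernel/kexec_load_disabled}"
  # Assumption: if the sysctl file is missing, we treat kexec as not usable.
  [ -r "$flag_file" ] || return 1
  # "0" means kexec_load is permitted, "1" means it has been disabled.
  [ "$(cat "$flag_file")" = "0" ]
}

# Example, to be run on the target machine:
#   kexec_allowed && echo "kexec looks usable" || echo "kexec is not usable"
```

If the check fails, booting the provider's rescue system is usually the simpler way forward than trying to re-enable kexec on the installed distribution.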

For the minimal example, we choose a Hetzner Cloud VM, which costs ~5€ per month and is available a few minutes after ordering it:

We just pick one of the cheap machines and buy it

None of the available distros is the one we would like to have: NixOS. After you have ordered a VM, the Hetzner Cloud service does provide the possibility to boot a NixOS live system, but as it turns out, we don’t need that because we have an even quicker approach.

After going through the ordering process, it takes a few minutes until the new VM pops up. Once that has happened, we open the control center to get its IP:

This is what the control center of our new VM looks like

So, what needs to happen now? Someone needs to:

  1. Find out what device the hard disk is: /dev/sda? /dev/vda? /dev/nvme1n1?
  2. Partition, format, and mount it
  3. Install the new GNU/Linux distribution and activate SSH with our public key
  4. Reboot
  5. Perform all steps to reach the final configuration

The NixOS installer allows us to combine steps 3 and 5.
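For step 1, we can simply ask the rescue system which block devices it sees. The helper below is a small sketch, with first_disk being a name invented for this article: it reads the output of lsblk -dn -o NAME,TYPE and prints the path of the first whole disk.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `lsblk -dn -o NAME,TYPE` output on stdin and
# prints the device path of the first entry whose type is "disk".
first_disk() {
  awk '$2 == "disk" { print "/dev/" $1; exit }'
}

# Example, run from the local machine against the rescue system:
#   ssh root@65.109.232.132 lsblk -dn -o NAME,TYPE | first_disk
```

On machines with several disks, this only reports the first one, so it is worth looking at the full lsblk output before writing the disk configuration.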

Before we can run nixos-anywhere, we need to create the system configuration that contains all the information for steps 1-5 in the form of a flake.nix:

# file: flake.nix
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-22.11";
    disko.url = "github:nix-community/disko";
    disko.inputs.nixpkgs.follows = "nixpkgs";
  };

  outputs = { self, nixpkgs, disko, ... }: {
    nixosConfigurations.hetzner-cloud = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        ({modulesPath, ... }: {
          imports = [
            "${modulesPath}/installer/scan/not-detected.nix"
            "${modulesPath}/profiles/qemu-guest.nix"
            disko.nixosModules.disko
          ];
          disko.devices = import ./single-gpt-disk-fullsize-ext4.nix "/dev/sda";
          boot.loader.grub = {
            devices = [ "/dev/sda" ];
            efiSupport = true;
            efiInstallAsRemovable = true;
          };
          services.openssh.enable = true;

          users.users.root.openssh.authorizedKeys.keys = [
            "<...your SSH public key here...>"
          ];
        })
      ];
    };
  };
}

The lines inside the modules = [ ... ]; assignment represent the actual NixOS system configuration, which sets up this machine to be a QEMU guest (that works for Hetzner VMs), uses a partitioning scheme that we explain in the next paragraph, describes the GRUB configuration, sets up SSH, and finally adds our SSH key for the root user.

All lines with disko in them are for partitioning and formatting. As we can see, the configuration calls another file for that purpose:

# file: single-gpt-disk-fullsize-ext4.nix
diskDevice:

{
  disk.${diskDevice} = {
    device = diskDevice;
    type = "disk";
    content = {
      type = "table";
      format = "gpt";
      partitions = [
        {
          name = "boot";
          start = "0";
          end = "1M";
          part-type = "primary";
          flags = [ "bios_grub" ];
        }
        {
          name = "ESP";
          start = "1MiB";
          end = "100MiB";
          bootable = true;
          content = {
            type = "filesystem";
            format = "vfat";
            mountpoint = "/boot";
          };
        }
        {
          name = "root";
          start = "100MiB";
          end = "100%";
          part-type = "primary";
          bootable = true;
          content = {
            type = "filesystem";
            format = "ext4";
            mountpoint = "/";
          };
        }
      ];
    };
  };
}

This looks like a JSON document that explains that we take the one disk the system has, give it a boot, ESP, and root partition with specific flags and sizes, format them with specific file systems, and want them at specific mount points later. The NixOS config is loaded in a way that it uses the disko tooling to partition and format everything correctly at setup time and mount everything at every boot.
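The start and end values translate directly into partition sizes. As a tiny worked example (span_mib is a name made up for this sketch), the ESP that spans 1MiB to 100MiB should end up 99 MiB large, which matches the 99M that lsblk reports for sda2 after the installation:

```shell
#!/usr/bin/env bash
# Size in MiB of a partition that starts at <start> MiB and ends at <end> MiB.
span_mib() {
  local start="$1" end="$2"
  echo $(( end - start ))
}

# Example: the ESP from the disk configuration above
#   span_mib 1 100    # prints 99
```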

That was a lot of configuration, but it is all we need to get started:

nix run github:numtide/nixos-anywhere -- \
  root@65.109.232.132 \
  --flake .#hetzner-cloud

Note that .#hetzner-cloud references “this flake” via .# and then the NixOS config as we named it in the flake.nix file.
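To make the two halves of such a flake reference visible, here is a small illustration using plain shell parameter expansion. The function names are invented for this sketch, and this is not how nixos-anywhere itself necessarily parses the argument:

```shell
#!/usr/bin/env bash
# Splits a flake reference such as ".#hetzner-cloud" at the first '#'.
# Hypothetical helpers for illustration only.
flake_path() { printf '%s\n' "${1%%#*}"; }   # part before the first '#'
flake_attr() { printf '%s\n' "${1#*#}"; }    # part after the first '#'

# Example:
#   flake_path ".#hetzner-cloud"   # prints .
#   flake_attr ".#hetzner-cloud"   # prints hetzner-cloud
```

The same split applies to remote references, where the part before the # would name a repository instead of the current directory.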

The command takes a few minutes. Afterward, the system goes offline for a final reboot and then we can ssh into it and see that everything works:

[tfc@jongepad:~]$ ssh root@65.109.232.132

[root@nixos:~]# uname -a
Linux nixos 5.15.105 #1-NixOS SMP Thu Mar 30 10:48:01 UTC 2023 x86_64 GNU/Linux

[root@nixos:~]# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sda      8:0    0 38.1G  0 disk
├─sda1   8:1    0  960K  0 part
├─sda2   8:2    0   99M  0 part /boot
└─sda3   8:3    0   38G  0 part /nix/store
                                /
sr0     11:0    1 1024M  0 rom

Wow, that “just worked” and it was easy and quick!

If we want to evolve the configuration step by step now, we can change anything we want in the NixOS config in the flake.nix file. For this system, we don’t need nixos-anywhere any longer (as long as we don’t want to change its partitioning scheme). To deploy the change, we can run the nixos-rebuild command:

nixos-rebuild switch \
  --flake .#hetzner-cloud \
  --target-host root@65.109.232.132

Now we can simply commit the configuration to a repository and keep it up to date there, together with some nice documentation. We could set up a CI that automatically pushes the latest config (if it builds), or configure the server in a way that it automatically pulls the latest config every night (which I do with some of my machines).

How Does This All Work?

Multiple things happened during installation:

Build the System Configuration

Our flake.nix file describes a full NixOS system. A so-called “top-level closure” can be built from such system descriptions. The top-level derivation contains an install script that, when launched, can install itself onto a disk and prepare it for boot. That is part of the smart NixOS installer design, and nixos-anywhere reuses it.
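It can be handy to build this closure locally before touching any server, because evaluation and build errors then show up early. The sketch below only composes the attribute path: config.system.build.toplevel is the standard NixOS attribute for the top-level closure, while the helper name toplevel_attr is made up for this article.

```shell
#!/usr/bin/env bash
# Prints the flake attribute that holds the top-level closure of a
# NixOS configuration. Hypothetical helper for illustration.
toplevel_attr() {
  local flake="$1" name="$2"
  printf '%s#nixosConfigurations.%s.config.system.build.toplevel\n' \
    "$flake" "$name"
}

# Example (builds locally, does not contact the target machine):
#   nix build "$(toplevel_attr . hetzner-cloud)"
```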

Generate Partitioning Script

The tooling that comes with disko consumes our disk configuration file and generates two things:

  1. A bash script that partitions a given disk
  2. A NixOS configuration snippet that educates our NixOS top-level derivation about how and where to mount things at boot.

Number 1 is especially astonishing because disko understands complex configurations: Do you want different RAID configurations on multiple disks that are encrypted with LUKS, managed by LVM, and then provide ZFS pools that are mounted at arbitrary positions in your file system tree? No problem, disko generates that for you.

Switch the Running Server from Random $Linux to NixOS Live Installer

This is the phase where nixos-anywhere shines. Starting with some random GNU/Linux distribution provided by our cloud hosting provider (it does not matter if the target machine runs some live image in RAM or if it runs an old GNU/Linux installation from disk), it just needs root rights to download a minimal NixOS installer image and switch kernel and filesystem to it by running the kexec system call, which only takes a few seconds. You can see how quickly this happens from reading the log that floods over the shell after you have entered the initial command.

The NixOS live installer that is then executed has all the kernel modules and tools to build all we need. This is especially useful if you want to deploy some complex ZFS setup to disk: on Ubuntu/Debian, you would first have to install ZFS tooling and build kernel modules that are typically not distributed with them, and manage all of it yourself. Not with nixos-anywhere!
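If we are curious which system the target is currently running, before or after the switch, the ID field of its os-release file tells us. This is a hedged sketch of such a check, not something nixos-anywhere requires; os_id is a name invented here, and a NixOS system is expected to report nixos:

```shell
#!/usr/bin/env bash
# Hypothetical check: prints the ID field of an os-release file.
os_id() {
  local file="${1:-/etc/os-release}"
  # Source the file in a subshell so its variables do not leak into ours.
  ( . "$file" && printf '%s\n' "$ID" )
}

# Example: fetch the file from the target and inspect it locally
#   ssh root@65.109.232.132 cat /etc/os-release > /tmp/target-os-release
#   os_id /tmp/target-os-release
```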

Iron the New System Over the Old One

As the last step, nixos-anywhere transfers the top-level system closure to the target system and calls its installation script. When the installation is done, it reboots the system. It assumes that you configured the system in a way that you can reach it without passwords after installation and reboot, which we did in the example.
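While the machine reboots, we can poll until SSH answers again instead of guessing when it is back. The retry function below is a generic sketch and not part of nixos-anywhere; the attempt count and delay in the example are arbitrary choices.

```shell
#!/usr/bin/env bash
# Runs a command until it succeeds, up to a maximum number of attempts.
# Usage: retry <max_attempts> <delay_seconds> <command...>
retry() {
  local max="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    # Give up once the last allowed attempt has failed.
    if [ "$attempt" -ge "$max" ]; then
      return 1
    fi
    attempt=$(( attempt + 1 ))
    sleep "$delay"
  done
}

# Example: wait up to roughly 5 minutes for the rebooted server.
#   retry 30 10 ssh -o ConnectTimeout=5 root@65.109.232.132 true
```

Keep in mind that the freshly installed system usually presents a new SSH host key, so the old entry for this IP may have to be removed from known_hosts first, for example with ssh-keygen -R.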

If anything goes wrong in this stage, the system is of course in a broken state. What we need to do then is open the control center of the cloud provider, activate some rescue disk, and start the whole process all over.

For this example article, I needed only one attempt. In more complex scenarios, I had to cycle through multiple attempts, but each attempt cost me only a few minutes.

The Complex Example: Setting Up a Bare Metal Server With RAID on Multiple Disks

For a customer project, I needed to get and set up a physical, bare metal Hetzner server. Hetzner provides these via their Hetzner Robot interface, which also provides a nice shopping experience. For this project, I ordered a machine with fast NVMe disks for the system root and two 20TB SSDs for databases and other data, of which there will be a lot.

This is what the partitioning scheme looks like after installation:

# lsblk
NAME        MAJ:MIN RM    SIZE RO TYPE  MOUNTPOINTS

nvme0n1     259:0    0  476.9G  0 disk
├─nvme0n1p1 259:3    0    960K  0 part
├─nvme0n1p2 259:5    0   1023M  0 part
│ └─md127     9:127  0 1022.9M  0 raid1 /boot
└─nvme0n1p3 259:7    0  475.9G  0 part
  └─md126     9:126  0  475.8G  0 raid1 /nix/store
                                        /
nvme1n1     259:1    0  476.9G  0 disk
├─nvme1n1p1 259:2    0    960K  0 part
├─nvme1n1p2 259:4    0   1023M  0 part
│ └─md127     9:127  0 1022.9M  0 raid1 /boot
└─nvme1n1p3 259:6    0  475.9G  0 part
  └─md126     9:126  0  475.8G  0 raid1 /nix/store
                                        /
sda           8:0    0   10.9T  0 disk
└─sda1        8:1    0   10.9T  0 part
  └─md125     9:125  0   10.9T  0 raid1 /var/lib
sdb           8:16   0   10.9T  0 disk
└─sdb1        8:17   0   10.9T  0 part
  └─md125     9:125  0   10.9T  0 raid1 /var/lib

The disko file that describes this layout is of course a bit larger, so let me just link to it.

Hetzner bare metal servers are a little more complex to set up than Hetzner VMs because they don’t simply get an IP address assigned via DHCP. We have to note down the addresses and details from the command center and then put them into our NixOS config. The network part of the config is also available in this part of the NixOS configuration.

I took most of this system config from existing scripts, which did not do exactly what I needed but provided many useful snippets.

The overall configuration is much more complicated, but bootstrapping and redeploying configuration changes work with the same two commands that we used in the first example, and with the same ease and speed!

Summary

When I faced the requirements for the disk partitioning scheme, I first thought “Oh no, that will be a long chain of commands to automate…”. Turns out, this is the easy part with the right tooling: disko is an amazing tool, although its documentation is lacking a bit. The quickstart guide and the very rich collection of example partitioning, RAID, and formatting schemes were, however, helpful in getting it running.

The resulting Disko partition & format declarations can look very complex, but that is due to the domain of partitioning and formatting disks having its own complexity.

I am looking forward to using this magic tool even more. It helps provide and maintain clients’ server configs with ease and speed, no matter how complex their requirements end up being.

Having met the author of disko in real life in the past, I chatted him up to discuss a few things about the combination of disko and nixos-anywhere. On this occasion, he told me that there have been some interesting thoughts on how to improve the size and speed of nixos-anywhere even further.

If you happened to like this article or need some help with Nix/NixOS, also have a look at my corporate Nix & NixOS Trainings and Consulting Services.