How do Nix builds work? from Julia Evans RSS feed.

How do Nix builds work?

Hello! For some reason after the last nix post I got nerdsniped by trying to understand how Nix builds work under the hood, so here’s a quick exploration I did today. There are probably some mistakes in here.

I started by complaining on Mastodon:

are there any guides to nix that start from the bottom up (for example starting with this bash script and then working up the layers of abstraction) instead of from the top down?

all of the guides I’ve seen start by describing the nix programming language or other abstractions, and I’d love to see a guide that starts with concepts I already understand like compiler flags, linker flags, Makefiles, environment variables, and bash scripts

Ross Light wrote a great blog post in response called Connecting Bash to Nix, which shows how to compile a basic C program without using most of Nix’s standard machinery.

I wanted to take this a tiny bit further and compile a slightly more complicated C program.

the goal: compile a C program, without using Nix’s standard machinery

Our goal is to compile a C program called paperjam. This is a real C program that wasn’t in the Nix repository already. I already figured out how to compile it in this post by copying and pasting a bunch of stuff I didn’t understand, but this time I wanted to do it in a more principled way where I actually understand more of the steps.

We’re going to avoid using most of Nix’s helpers for compiling C programs.

The plan is to start with an almost empty build script, and then resolve errors until we have a working build.

first: what’s a derivation?

I said that we weren’t going to talk about too many Nix abstractions (and we won’t!), but understanding what a derivation is really helped me.

Everything I read about Nix talks about derivations all the time, but I was really struggling to figure out what a derivation is. It turns out that derivation is a function in the Nix language. But not just any function! The whole point of the Nix language seems to be to call this function. The official documentation for the derivation function is actually extremely clear. Here’s what I took away:

derivation takes a bunch of keys and values as input. There are 3 required keys:

  1. system: the system, for example x86_64-darwin
  2. name: the name of the package you’re building
  3. builder: a program (usually a bash script) that runs the build

Every other key is an arbitrary string that gets passed as an environment variable to the builder shell script.
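To make the "every other key becomes an environment variable" idea concrete, here's a rough simulation in plain shell, with no Nix involved. It uses env -i to start from an empty environment and passes a few made-up attributes to a tiny stand-in builder. The GREETING attribute and the ./fake-out path are my own placeholders, not anything Nix defines.

```shell
#!/usr/bin/env bash
# Simulate how a builder sees derivation attributes: start from an empty
# environment (env -i) and pass each attribute as an environment variable.
# GREETING and ./fake-out are hypothetical values, not real Nix output.
builder_output=$(
  env -i name="paperjam-fake" system="aarch64-darwin" GREETING="hello" out="./fake-out" \
    /bin/sh -c 'echo "name=$name system=$system GREETING=$GREETING out=$out HOME=${HOME:-unset}"'
)
echo "$builder_output"
```

The stand-in builder only sees the variables it was handed. Nothing from my normal shell (like HOME) leaks in.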

derivations automatically build all their inputs

A derivation doesn’t just call a shell script though! Let’s say I reference another derivation called pkgs.qpdf in my script.

Nix will:

  • automatically build the qpdf package
  • put the resulting output directory somewhere like /nix/store/4garxzr1rpdfahf374i9p9fbxnx56519-qpdf-11.1.0
  • expand pkgs.qpdf into that output directory (as a string), so that I can reference it in my build script

The derivation function does some other things (described in the documentation), but “it builds all of its inputs” is all we really need to know for now.

step 1: write a derivation file

Let’s write a very simple build script and call the derivation function. These don’t work yet, but I found it pretty fun to go through all the errors, fix them one at a time, and learn a little more about how Nix works by fixing them.

Here’s the build script (build-paperjam.sh). This just unpacks the tarball and runs make install.

#!/bin/bash

tar -xf "$SOURCE"
cd paperjam-1.2 
make install

And here’s the Nix code calling the derivation function (in paperjam.nix). This calls the core derivation function, without too much magic.

let pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/4d2b37a84fad1091b9de401eb450aae66f1a741e.tar.gz") {};
in

builtins.derivation {
  name = "paperjam-fake";
  builder = ./build-paperjam.sh;
  system = builtins.currentSystem;

  SOURCE = pkgs.fetchurl {
    url = "https://mj.ucw.cz/download/linux/paperjam-1.2.tar.gz";
    hash = "sha256-0AziT7ROICTEPKaA4Ub1B8NtIfLmxRXriW7coRxDpQ0";
  };

}

The main things here are:

  • fetchurl (which downloads a URL and puts the path into the SOURCE environment variable)
  • pkgs (which lets us depend on other Nix packages from the central repository). I don’t totally understand this but I’m already in a pretty deep rabbit hole so we’re going to leave that for now.

SOURCE evaluates to a string – it’s the path to the downloaded source tarball.

problem 1: tar: command not found

Nix needs you to declare all the dependencies for your builds. It enforces this by removing your PATH environment variable so that you have no binaries in your PATH at all.

This is pretty easy to fix: we just need to edit our PATH.

I added this to paperjam.nix to get tar, gzip, and make:

  PATH = "${pkgs.gzip}/bin:${pkgs.gnutar}/bin:${pkgs.gnumake}/bin";
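Here's a small sketch of both halves of this, without Nix. The first part shows a command lookup failing when PATH points nowhere useful (I'm using a nonexistent directory as a stand-in for the stripped-down PATH). The second part joins a list of package directories with colons, which is what the PATH line above does by hand. The store paths are placeholders with fake hashes.

```shell
#!/usr/bin/env bash
# Part 1: with a PATH that points at a nonexistent directory, tar can't be found.
# (command -v is a shell builtin, so it still runs.)
lookup_result=$(env -i PATH="/path-not-set" /bin/sh -c 'command -v tar || echo "tar: not found in PATH"')
echo "$lookup_result"

# Part 2: build a PATH by joining "<package>/bin" entries with colons.
# These store paths are placeholders, not real hashes.
build_path=""
for pkg in /nix/store/aaaa-gzip-1.12 /nix/store/bbbb-gnutar-1.34 /nix/store/cccc-gnumake-4.4; do
  build_path="${build_path:+$build_path:}$pkg/bin"
done
echo "$build_path"
```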

problem 2: we need a compiler

Next, we had this error:

g++ -O2 -Wall -Wextra -Wno-parentheses -std=gnu++11 -g -DVERSION='"1.2"' -DYEAR='"2022"' -DBUILD_DATE='""' -DBUILD_COMMIT='""'   -c -o paperjam.o paperjam.cc
make: g++: No such file or directory

So we need to put a compiler in our PATH. For some reason I felt like using clang++ to compile, not g++. To do that I need to make 2 changes to paperjam.nix:

  1. Add the line CXX="clang++";
  2. Add ${pkgs.clang}/bin to my PATH

problem 3: missing header files

The next error was:

 > ./pdf-tools.h:13:10: fatal error: 'qpdf/QPDF.hh' file not found
 > #include <qpdf/QPDF.hh>

Makes sense: everything is isolated, so it can’t access my system header files. Figuring out how to handle this was a little more confusing though.

It turns out that the way Nix handles header files is that it has a shell script wrapper around clang. So when you run clang++, you’re actually running a shell script.

On my system, the clang++ wrapper script was at /nix/store/d929v59l9a3iakvjccqpfqckqa0vflyc-clang-wrapper-11.1.0/bin/clang++. I searched that file for LDFLAGS and found that it uses 2 environment variables:

  1. NIX_LDFLAGS_aarch64_apple_darwin
  2. NIX_CFLAGS_COMPILE_aarch64_apple_darwin

So I figured I needed to put all the compiler arguments in the NIX_CFLAGS_COMPILE variable and all the linker arguments in the NIX_LDFLAGS variable (each with the _aarch64_apple_darwin suffix). Great! Let’s do that.

I added these 2 lines to my paperjam.nix, to link the libpaper and qpdf libraries:

NIX_LDFLAGS_aarch64_apple_darwin = "-L ${pkgs.qpdf}/lib   -L ${pkgs.libpaper}/lib";
NIX_CFLAGS_COMPILE_aarch64_apple_darwin = "-isystem ${pkgs.qpdf}/include   -isystem ${pkgs.libpaper}/include";
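To check my mental model of the wrapper, here's a toy version of the idea. This is not the real clang wrapper script, just a sketch: it assumes the variable suffix is the platform triple with dashes turned into underscores, looks up the matching NIX_CFLAGS_COMPILE variable, and appends its contents to the compiler command. The store path is a placeholder, and the command is printed instead of run.

```shell
#!/usr/bin/env bash
# Toy compiler wrapper: NOT the real Nix wrapper, just the general idea.

# Assumption: the suffix is the platform triple with '-' replaced by '_'.
triple="aarch64-apple-darwin"
suffix=$(printf '%s' "$triple" | tr '-' '_')

# Placeholder store path for illustration only.
NIX_CFLAGS_COMPILE_aarch64_apple_darwin="-isystem /nix/store/aaaa-qpdf/include"

# Indirect lookup of the variable named NIX_CFLAGS_COMPILE_<suffix>.
eval "extra_flags=\${NIX_CFLAGS_COMPILE_${suffix}:-}"

# A real wrapper would exec the actual compiler; here we only print the command.
wrapped_command="clang++ -c paperjam.cc $extra_flags"
echo "$wrapped_command"
```

So from the Makefile's point of view it's just calling clang++, and the extra -isystem flags show up without the Makefile knowing about them.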

And that worked!

problem 4: missing c++abi

The next error was:

> ld: library not found for -lc++abi

Not sure what this means, but I searched for “abi” in the Nix packages and fixed it by adding -L ${pkgs.libcxxabi}/lib to my NIX_LDFLAGS environment variable.

problem 5: missing iconv

Here’s the next error:

> Undefined symbols for architecture arm64:
>   "_iconv", referenced from: ...

I started by adding -L ${pkgs.libiconv}/lib to my NIX_LDFLAGS environment variable, but that didn’t fix it. Then I spent a while going around in circles and being confused.

I eventually figured out how to fix this by taking a working version of the paperjam build that I’d made before and editing my clang++ wrapper file to print out all of its environment variables. The LDFLAGS environment variable in the working version was different from mine: it had -liconv in it.

So I added -liconv to NIX_LDFLAGS as well and that fixed it.

why doesn’t the original Makefile have -liconv?

I was a bit puzzled by this -liconv thing though: the original Makefile links in libqpdf and libpaper by passing -lqpdf -lpaper. So why doesn’t it link in iconv, if it requires the iconv library?

I think the reason for this is that the original Makefile assumed that you were running on Linux and using glibc, and glibc includes these iconv functions by default. But I guess the macOS libc doesn’t include iconv, so we need to explicitly set the linker flag -liconv to add the iconv library.

problem 6: missing codesign_allocate

Time for the next error:

libc++abi: terminating with uncaught exception of type std::runtime_error: Failed to spawn codesign_allocate: No such file or directory

I guess this is some kind of Mac code signing thing. I used find /nix/store -name codesign_allocate to find codesign_allocate on my system. It’s at /nix/store/a17dwfwqj5ry734zfv3k1f5n37s4wxns-cctools-binutils-darwin-973.0.1/bin/codesign_allocate.

But this doesn’t tell us what the package is called – we need to be able to refer to it as ${pkgs.XXXXXXX} and ${pkgs.cctools-binutils-darwin} doesn’t work.

I couldn’t figure out a way to go from a Nix folder to the name of the package, but I ended up poking around and finding out that it was called pkgs.darwin.cctools.

So I added ${pkgs.darwin.cctools}/bin to the PATH.
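For what it's worth, the hash and the name part of a store path are easy to pull apart with shell parameter expansion, as in the sketch below. The catch is that this only gives you the name baked into the store path (cctools-binutils-darwin-973.0.1), which is not necessarily the attribute name you write after pkgs., so it doesn't solve the lookup problem by itself.

```shell
#!/usr/bin/env bash
# Split a store path into its hash and name parts.
store_path="/nix/store/a17dwfwqj5ry734zfv3k1f5n37s4wxns-cctools-binutils-darwin-973.0.1/bin/codesign_allocate"

entry=${store_path#/nix/store/}   # drop the /nix/store/ prefix
entry=${entry%%/*}                # keep only the top-level store entry
hash=${entry%%-*}                 # everything before the first dash
name=${entry#*-}                  # everything after the first dash

echo "hash: $hash"
echo "name: $name"
```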

problem 7: missing a2x

Easy, just add ${pkgs.asciidoc}/bin to the PATH.

problem 8: missing install

make: install: No such file or directory

Apparently install is a program? This turns out to be in coreutils, so we add ${pkgs.coreutils}/bin to the PATH. Adding coreutils also fixes some other warnings I was seeing about missing commands like date.

problem 9: can’t create /usr/local/bin/paperjam

This took me a little while to figure out because I’m not very familiar with make. The Makefile has a PREFIX of /usr/local, but we want it to be the program’s output directory in /nix/store/

I edited the build-paperjam.sh shell script to say:

make install PREFIX="$out"

and everything worked! Hooray!

our final configuration

Here’s the final paperjam.nix. It’s not so different from what we started with – we just added 4 environment variables.

let pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/ae8bdd2de4c23b239b5a771501641d2ef5e027d0.tar.gz") {};
in

builtins.derivation {
  name = "paperjam-fake";
  builder = ./build-paperjam.sh;
  system = builtins.currentSystem;

  SOURCE = pkgs.fetchurl {
    url = "https://mj.ucw.cz/download/linux/paperjam-1.2.tar.gz";
    hash = "sha256-0AziT7ROICTEPKaA4Ub1B8NtIfLmxRXriW7coRxDpQ0";
  };

  CXX="clang++";
  PATH = "${pkgs.gzip}/bin:${pkgs.gnutar}/bin:${pkgs.gnumake}/bin:${pkgs.clang}/bin:${pkgs.darwin.cctools}/bin:${pkgs.asciidoc}/bin:${pkgs.coreutils}/bin:${pkgs.bash}/bin";
  NIX_LDFLAGS_aarch64_apple_darwin = "-L ${pkgs.qpdf}/lib   -L ${pkgs.libpaper}/lib -L ${pkgs.libcxxabi}/lib -liconv -L ${pkgs.libiconv}/lib ";
  NIX_CFLAGS_COMPILE_aarch64_apple_darwin = "-isystem ${pkgs.qpdf}/include   -isystem ${pkgs.libpaper}/include";
}

And here’s the final build-paperjam.sh build script. Here we just needed to edit the make install line to set the PREFIX.

#!/bin/bash

tar -xf "$SOURCE"
cd paperjam-1.2
make install PREFIX="$out"

let’s look at our compiled derivation!

Now that we understand this configuration a little better, let’s talk about what nix-build is doing a little more.

Behind the scenes, nix-build paperjam.nix actually runs nix-instantiate and nix-store --realize:

$ nix-instantiate paperjam.nix
/nix/store/xp8kibpll55s0bm40wlpip51y7wnpfs0-paperjam-fake.drv
$ nix-store --realize /nix/store/xp8kibpll55s0bm40wlpip51y7wnpfs0-paperjam-fake.drv

I think what this means is that paperjam.nix gets compiled to some intermediate representation (also called a derivation?), and then the Nix runtime takes over and is in charge of actually running the build scripts.

We can look at this .drv intermediate representation with nix show-derivation:

{
  "/nix/store/xp8kibpll55s0bm40wlpip51y7wnpfs0-paperjam-fake.drv": {
    "outputs": { "out": { "path": "/nix/store/bcnyqizvcysqc1vy382wfx015mmwn3bd-paperjam-fake" }
    },
    "inputSrcs": [ "/nix/store/pbjj91f0qr8g14k58m744wdl9yvr2f5k-build-paperjam.sh" ],
    "inputDrvs": {
      "/nix/store/38sikqcggyishxbgi2xnyrdsnq928gqx-asciidoc-10.2.0.drv": [ "out" ],
      "/nix/store/3llc749f9pn0amlb9vgwsi22hin7kmz4-libcxxabi-11.1.0.drv": [ "out" ],
      "/nix/store/a8ny8lrbpyn15wdxk3v89f4bdr08a38a-libpaper-1.1.28.drv": [ "out" ],
      "/nix/store/d888pj9lll12s5qx11v850g1vd4h3vxq-cctools-port-973.0.1.drv": [ "out" ],
      "/nix/store/gkpdv7xl39x9yxch0wjarq19mmv7j1pm-bash-5.2-p15.drv": [ "out" ],
      "/nix/store/hwx16m7hmkp2rcik8h67nnyjp52zj849-gnutar-1.34.drv": [ "out" ],
      "/nix/store/kqqwffajj24fmagxqps3bjcbrglbdryg-gzip-1.12.drv": [ "out" ],
      "/nix/store/lnrxa45bza18dk8qgqjayqb65ilfvq2n-qpdf-11.2.0.drv": [ "out" ],
      "/nix/store/rx7a5401h44dqsasl5g80fl25jqqih8r-gnumake-4.4.drv": [ "out" ],
      "/nix/store/sx8blaza5822y51abdp3353xkdcbkpkb-coreutils-9.1.drv": [ "out" ],
      "/nix/store/v3b7r7a8ipbyg9wifcqisf5vpy0c66cs-clang-wrapper-11.1.0.drv": [ "out" ],
      "/nix/store/wglagz34w1jnhr4xrfdk0g2jghbk104z-paperjam-1.2.tar.gz.drv": [ "out" ],
      "/nix/store/y9mb7lgqiy38fbi53m5564bx8pl1arkj-libiconv-50.drv": [ "out" ]
    },
    "system": "aarch64-darwin",
    "builder": "/nix/store/pbjj91f0qr8g14k58m744wdl9yvr2f5k-build-paperjam.sh",
    "args": [],
    "env": {
      "CXX": "clang++",
      "NIX_CFLAGS_COMPILE_aarch64_apple_darwin": "-isystem /nix/store/h25d99pd3zln95viaybdfynfq82r2dqy-qpdf-11.2.0/include   -isystem /nix/store/agxp1hx267qk1x79dl4jk1l5cg79izv1-libpaper-1.1.28/include",
      "NIX_LDFLAGS_aarch64_apple_darwin": "-L /nix/store/h25d99pd3zln95viaybdfynfq82r2dqy-qpdf-11.2.0/lib   -L /nix/store/agxp1hx267qk1x79dl4jk1l5cg79izv1-libpaper-1.1.28/lib -L /nix/store/awkb9g93ci2qy8yg5jl0zxw46f3xnvgv-libcxxabi-11.1.0/lib -liconv -L /nix/store/nmphpbjn8hhq7brwi9bw41m7l05i636h-libiconv-50/lib ",
      "PATH": "/nix/store/90cqrp3nxbcihkx4vswj5wh85x5klaga-gzip-1.12/bin:/nix/store/siv9312sgiqwsjrdvj8lx0mr3dsj3nf5-gnutar-1.34/bin:/nix/store/yy3fdgrshcblwx0cfp76nmmi24szw89q-gnumake-4.4/bin:/nix/store/cqag9fv2gia03nzcsaygan8fw1ggdf4g-clang-wrapper-11.1.0/bin:/nix/store/f16id36r9xxi50mgra55p7cf7ra0x96k-cctools-port-973.0.1/bin:/nix/store/x873pgpwqxkmyn35jvvfj48ccqav7fip-asciidoc-10.2.0/bin:/nix/store/vhivi799z583h2kf1b8lrr72h4h3vfcx-coreutils-9.1/bin:/nix/store/0q1jfjlwr4vig9cz7lnb5il9rg0y1n84-bash-5.2-p15/bin",
      "SOURCE": "/nix/store/6d2fcw88d9by4fz5xa9gdpbln73dlhdk-paperjam-1.2.tar.gz",
      "builder": "/nix/store/pbjj91f0qr8g14k58m744wdl9yvr2f5k-build-paperjam.sh",
      "name": "paperjam-fake",
      "out": "/nix/store/bcnyqizvcysqc1vy382wfx015mmwn3bd-paperjam-fake",
      "system": "aarch64-darwin"
    }
  }
}

This feels surprisingly easy to understand – you can see that there are a bunch of environment variables, our bash script, and the paths to our inputs.
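Since the output is just JSON, it's also easy to poke at with ordinary tools. Here's a crude text-based sketch that lists the names of the input derivations from a trimmed copy of the JSON above. A real JSON tool like jq would be more robust than sed. I'm only using sed so the example has no dependencies.

```shell
#!/usr/bin/env bash
# A trimmed copy of the show-derivation output (two inputs only).
drv_json='{
  "inputDrvs": {
    "/nix/store/kqqwffajj24fmagxqps3bjcbrglbdryg-gzip-1.12.drv": [ "out" ],
    "/nix/store/hwx16m7hmkp2rcik8h67nnyjp52zj849-gnutar-1.34.drv": [ "out" ]
  }
}'

# For each line containing a .drv store path, strip the hash and the
# .drv extension and print what's left.
input_names=$(printf '%s\n' "$drv_json" | sed -n 's|.*"/nix/store/[a-z0-9]*-\(.*\)\.drv".*|\1|p')
echo "$input_names"
```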

the compilation helpers we’re not using: stdenv

Normally when you build a package with Nix, you don’t do all of this stuff yourself. Instead, you use a helper called stdenv, which seems to have two parts:

  1. a function called stdenv.mkDerivation which takes some arguments and generates a bunch of environment variables (it seems to be documented here)
  2. a 1600-line bash build script (setup.sh) that consumes those environment variables. This is like our build-paperjam.sh, but much more generalized.

Together, these two tools:

  • add LDFLAGS automatically for each C library you depend on
  • add CFLAGS automatically so that you can get your header files
  • run make
  • depend on clang and coreutils and bash and other core utilities so that you don’t need to add them yourself
  • set system to your current system
  • let you easily add custom bash code to run at various phases of your build
  • maybe also manage versions somehow? Not sure about this one.

and probably lots more useful things I don’t know about yet

let’s look at the derivation for jq

Let’s look at one more compiled derivation, for jq. This is quite long but there are some interesting things in here. I wanted to look at this because I wanted to see what a more typical derivation generated by stdenv.mkDerivation looked like.

$ nix show-derivation /nix/store/q9cw5rp0ibpl6h4i2qaq0vdjn4pyms3p-jq-1.6.drv
{
  "/nix/store/q9cw5rp0ibpl6h4i2qaq0vdjn4pyms3p-jq-1.6.drv": {
    "outputs": {
      "bin": { "path": "/nix/store/vabn35a2m2qmfi9cbym4z50bwq94fdzm-jq-1.6-bin" },
      "dev": { "path": "/nix/store/akda158i8gr0v0w397lwanxns8yrqldy-jq-1.6-dev" },
      "doc": { "path": "/nix/store/6qimafz8q88l90jwrzciwc27zhjwawcl-jq-1.6-doc" },
      "lib": { "path": "/nix/store/3wzlsin34l1cs70ljdy69q9296jnvnas-jq-1.6-lib" },
      "man": { "path": "/nix/store/dl1xf9w928jai5hvm5s9ds35l0m26m0k-jq-1.6-man" },
      "out": { "path": "/nix/store/ivzm5rrr7riwvgy2xcjhss6lz55qylnb-jq-1.6" }
    },
    "inputSrcs": [
      "/nix/store/6xg259477c90a229xwmb53pdfkn6ig3g-default-builder.sh",
      "/nix/store/jd98q1h1rxz5iqd5xs8k8gw9zw941lj6-fix-tests-when-building-without-regex-supports.patch"
    ],
    "inputDrvs": {
      "/nix/store/0lbzkxz56yhn4gv5z0sskzzdlwzkcff8-autoreconf-hook.drv": [ "out" ],
      "/nix/store/6wh5w7hkarfcx6fxsdclmlx097xsimmg-jq-1.6.tar.gz.drv": [ "out" ],
      "/nix/store/87a32xgqw85rxr1fx3c5j86y177hr9sr-oniguruma-6.9.8.drv": [ "dev" ],
      "/nix/store/gkpdv7xl39x9yxch0wjarq19mmv7j1pm-bash-5.2-p15.drv": [ "out" ],
      "/nix/store/xn1mjk78ly9wia23yvnsyw35q1mz4jqh-stdenv-darwin.drv": [ "out" ]
    },
    "system": "aarch64-darwin",
    "builder": "/nix/store/0q1jfjlwr4vig9cz7lnb5il9rg0y1n84-bash-5.2-p15/bin/bash",
    "args": [
      "-e",
      "/nix/store/6xg259477c90a229xwmb53pdfkn6ig3g-default-builder.sh"
    ],
    "env": {
      "__darwinAllowLocalNetworking": "",
      "__impureHostDeps": "/bin/sh /usr/lib/libSystem.B.dylib /usr/lib/system/libunc.dylib /dev/zero /dev/random /dev/urandom /bin/sh",
      "__propagatedImpureHostDeps": "",
      "__propagatedSandboxProfile": "",
      "__sandboxProfile": "",
      "__structuredAttrs": "",
      "bin": "/nix/store/vabn35a2m2qmfi9cbym4z50bwq94fdzm-jq-1.6-bin",
      "buildInputs": "/nix/store/xfnl6xqbvnpacx8hw9d99ca4mly9kp0h-oniguruma-6.9.8-dev",
      "builder": "/nix/store/0q1jfjlwr4vig9cz7lnb5il9rg0y1n84-bash-5.2-p15/bin/bash",
      "cmakeFlags": "",
      "configureFlags": "--bindir=${bin}/bin --sbindir=${bin}/bin --datadir=${doc}/share --mandir=${man}/share/man",
      "depsBuildBuild": "",
      "depsBuildBuildPropagated": "",
      "depsBuildTarget": "",
      "depsBuildTargetPropagated": "",
      "depsHostHost": "",
      "depsHostHostPropagated": "",
      "depsTargetTarget": "",
      "depsTargetTargetPropagated": "",
      "dev": "/nix/store/akda158i8gr0v0w397lwanxns8yrqldy-jq-1.6-dev",
      "doCheck": "",
      "doInstallCheck": "1",
      "doc": "/nix/store/6qimafz8q88l90jwrzciwc27zhjwawcl-jq-1.6-doc",
      "installCheckTarget": "check",
      "lib": "/nix/store/3wzlsin34l1cs70ljdy69q9296jnvnas-jq-1.6-lib",
      "man": "/nix/store/dl1xf9w928jai5hvm5s9ds35l0m26m0k-jq-1.6-man",
      "mesonFlags": "",
      "name": "jq-1.6",
      "nativeBuildInputs": "/nix/store/ni9k35b9llfc3hys8nv5qsipw8pfy1ln-autoreconf-hook",
      "out": "/nix/store/ivzm5rrr7riwvgy2xcjhss6lz55qylnb-jq-1.6",
      "outputs": "bin doc man dev lib out",
      "patches": "/nix/store/jd98q1h1rxz5iqd5xs8k8gw9zw941lj6-fix-tests-when-building-without-regex-supports.patch",
      "pname": "jq",
      "postInstallCheck": "$bin/bin/jq --help >/dev/null\n$bin/bin/jq -r '.values[1]' <<< '{\"values\":[\"hello\",\"world\"]}' | grep '^world$' > /dev/null\n",
      "preBuild": "rm -r ./modules/oniguruma\n",
      "preConfigure": "echo \"#!/bin/sh\" > scripts/version\necho \"echo 1.6\" >> scripts/version\npatchShebangs scripts/version\n",
      "propagatedBuildInputs": "",
      "propagatedNativeBuildInputs": "",
      "src": "/nix/store/ggjlgjx2fw29lngbnvwaqr6hiz1qhy8g-jq-1.6.tar.gz",
      "stdenv": "/nix/store/qrz2mnb2gsnzmw2pqax693daxh5hsgap-stdenv-darwin",
      "strictDeps": "",
      "system": "aarch64-darwin",
      "version": "1.6"
    }
  }
}

I thought it was interesting that some of the environment variables in here are actually bash scripts themselves – for example the postInstallCheck environment variable is a bash script. Those bash script environment variables are evaled in the main bash script (you can see that happening in setup.sh here).

The postInstallCheck environment variable in this particular derivation looks like this:

$bin/bin/jq --help >/dev/null
$bin/bin/jq -r '.values[1]' <<< '{"values":["hello","world"]}' | grep '^world$' > /dev/null

I guess this is a test to make sure that jq installed correctly.
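Here's a toy version of that "environment variables containing shell code" mechanism, to show how little is needed to make it work. The run_hook function is a hypothetical helper I made up for this sketch, not the actual code from setup.sh, and the hook bodies are simplified stand-ins for the real ones.

```shell
#!/usr/bin/env bash
# run_hook is a made-up helper: it looks up the variable with the given
# name and, if it's non-empty, evals its contents as shell code.
run_hook() {
  hook_name=$1
  eval "hook_body=\${$hook_name:-}"
  if [ -n "$hook_body" ]; then
    eval "$hook_body"
  fi
}

# Hook bodies are ordinary strings, like the env values in the .drv file.
preBuild='echo "running preBuild"'
postInstallCheck='echo "{\"values\":[\"hello\",\"world\"]}" | grep -c world'

# Unset hooks (like doesNotExist here) are silently skipped.
hook_output=$(run_hook preBuild; run_hook postInstallCheck; run_hook doesNotExist)
echo "$hook_output"
```

The nice thing about this design is that the build script doesn't need to know in advance which hooks a package defines. It just checks for a variable with the right name at each phase.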

finally: clean up

All of my compiler experiments used about 3GB of disk space, but nix-collect-garbage cleaned up all of it.

let’s recap the process!

I feel like I understand Nix a bit better after going through this. I still don’t feel very motivated to learn the Nix language, but now I have some idea of what Nix programs are actually doing under the hood!

My understanding is:

  1. First, .nix files get compiled into a .drv file, which is mostly a bunch of inputs and outputs and environment variables. This is where the Nix language stops being relevant.
  2. Then all the environment variables get passed to a build script, which is in charge of doing the actual build
  3. In the Nix standard environment (stdenv), some of those environment variables are themselves bash code, which gets evaled by the big build script setup.sh

That’s all! I probably made some mistakes in here, but this was kind of a fun rabbit hole.