Building on the Nix package manager and its declarative builds with pinned dependencies, NixOS is a Linux distribution that inherently supports side-by-side installation of package and system versions, deterministic deployment, and atomic updates. In principle, this makes it an excellent choice for many use cases, including embedded Linux systems, where reliability and reproducibility are crucial. Unfortunately, this currently comes at a cost: among other things, NixOS installations are large, and updates tend to incur significant storage overhead, which is especially problematic for embedded Linux systems.
nixpkgs, NixOS's package collection, is generally aimed at usability. That is, package definitions are written to include optional features and pull in optional dependencies, either by default or unconditionally, depending on packaging quality. Limited (but not negligible) manual effort has shown that removing some of those optional features and dependencies can shrink a NixOS installation from about 1.3 GiB to 150 MiB.
However, given the sheer size of the nixpkgs collection, the largest among all Linux distributions, an automatic approach is preferable. Given the definition of a nixpkgs application or a whole NixOS system and a set of functionality requirements (tests), it should determine which existing package options and which new transformations of package definitions can be applied to minimize the number and size of the required packages, while still allowing the system to function as intended.
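To make the search problem concrete, here is a minimal sketch in Python of a greedy minimization loop. Everything in it is an assumption made for illustration: the feature names, the sizes (invented, not measured), and the `passes_tests` stand-in for the functionality requirements. It also assumes that features can be disabled independently of each other, which real package definitions do not guarantee.

```python
from typing import Callable, FrozenSet, Iterable

# Hypothetical optional features and the extra closure size (MiB) each pulls in.
# These numbers are made up for illustration, not measurements.
FEATURE_COST_MIB = {"gui": 400, "docs": 120, "locales": 200, "tls": 30}
BASE_SIZE_MIB = 100


def closure_size(enabled: FrozenSet[str]) -> int:
    """Toy size model: base system plus the cost of every enabled feature."""
    return BASE_SIZE_MIB + sum(FEATURE_COST_MIB[f] for f in enabled)


def passes_tests(enabled: FrozenSet[str]) -> bool:
    """Stand-in for the functionality tests; here the system only needs TLS."""
    return "tls" in enabled


def minimize(
    features: Iterable[str],
    size: Callable[[FrozenSet[str]], int],
    test: Callable[[FrozenSet[str]], bool],
) -> FrozenSet[str]:
    """Greedily drop features, most expensive first, while the tests still pass."""
    enabled = frozenset(features)
    # Rank by how much each feature saves when removed from the full set.
    ranked = sorted(
        enabled,
        key=lambda f: size(enabled) - size(enabled - {f}),
        reverse=True,
    )
    for feature in ranked:
        candidate = enabled - {feature}
        if test(candidate):
            enabled = candidate
    return enabled


minimal = minimize(FEATURE_COST_MIB, closure_size, passes_tests)
print(sorted(minimal), closure_size(minimal))  # prints ['tls'] 130
```

In a real setting, each call to `test` would correspond to building and testing a system variant, so the number of candidates tried matters far more than in this toy model; a single greedy pass is only one of several possible search strategies.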
One of Nix's key strengths, yet within the scope of this project also a source of problems, is that it forces dependencies between packages to be expressed explicitly. If a piece of software isn't told, by absolute paths, where to find its dependencies, it won't be able to discover them. At some level, the declarative system definition has to define precisely which dependencies are to be used. The common way this is done in nixpkgs is that dependencies are passed into the build process, which takes their paths and embeds them in the build output (e.g., binaries). As a result, any update to those dependencies causes the package to be rebuilt with a different output. Minimizing package dependencies and output sizes helps with this, but more can be done.
An alternative approach, already used in some places in nixpkgs, is late binding. Instead of embedding the dependency paths in the heavy output of the actual package, a lightweight wrapper package is added that injects the dependencies at runtime (for example, using environment variables). This way, the actual package output remains unchanged between different dependency versions, and only the lightweight wrapper package needs to be rebuilt and updated. Here too, automatic mechanisms should be devised to transform package definitions to use late binding where possible, partially replacing the current mechanisms that automate early binding.
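The difference between the two binding styles can be illustrated with a toy model in Python. This is a sketch under stated assumptions, not how Nix computes store paths: `store_path` is a hypothetical helper that simply hashes a name together with all build inputs, and `app`, `app-unwrapped`, and `libfoo` are invented package names.

```python
import hashlib
from typing import Tuple


def store_path(name: str, *inputs: str) -> str:
    """Toy stand-in for a store path: a hash over everything fed into the build."""
    digest = hashlib.sha256("\0".join((name, *inputs)).encode()).hexdigest()[:12]
    return f"/nix/store/{digest}-{name}"


def build_early(lib_path: str) -> str:
    """Early binding: the heavy output embeds the dependency path."""
    return store_path("app", "app-source", lib_path)


def build_late(lib_path: str) -> Tuple[str, str]:
    """Late binding: the heavy output ignores the dependency; a wrapper injects it."""
    unwrapped = store_path("app-unwrapped", "app-source")
    wrapper = store_path("app", unwrapped, lib_path)
    return unwrapped, wrapper


# Two versions of a hypothetical dependency.
lib_v1 = store_path("libfoo-1.0", "libfoo-source-1.0")
lib_v2 = store_path("libfoo-1.1", "libfoo-source-1.1")

# Early binding: a dependency update changes the heavy output.
early_changed = build_early(lib_v1) != build_early(lib_v2)

# Late binding: the heavy output stays put, only the small wrapper changes.
unwrapped_v1, wrapper_v1 = build_late(lib_v1)
unwrapped_v2, wrapper_v2 = build_late(lib_v2)

print(early_changed, unwrapped_v1 == unwrapped_v2, wrapper_v1 != wrapper_v2)
```

In this model, an update from `libfoo-1.0` to `libfoo-1.1` replaces the whole early-bound package, whereas with late binding only the wrapper path changes, so an update would need to transfer and store just the wrapper.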
This work may be quite challenging, as the point is not to manually edit individual package definitions, but to find general, generic transformations, which requires a fairly deep understanding of how nixpkgs is structured and how software is built and executed. Targeted use of LLMs to improve existing package definitions (finding out which features and dependencies should have been made optional) may be a viable approach to prepare nixpkgs for the actual minimization process -- but this is not a "throw AI at it" project. Where a package's internal build can't be convinced to omit dependencies that remain unused at runtime, stubbing them may be an option.
It remains to be seen which existing or even new approaches of dependency trimming/debloating (those seem to be search terms) and late binding can be applied in this context.
This project is an extension of this prior work:
- Niklas Gollenstede, Ulf Kulau, Christian Dietrich. reUpNix: Reconfigurable and Updateable Embedded Systems. In Proceedings of the 24th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES). ACM Press, New York, NY, USA, June 2023. DOI: 10.1145/3589610.3596273