manual: More on stores, building and mounting in the file system #14699
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -1,101 +1,187 @@ | ||||||
| # Building | ||||||
|
|
||||||
| ## Normalizing derivation inputs | ||||||
| As discussed in the [main page on derivations](./derivation/index.md): | ||||||
|
|
||||||
| - Each input must be [realised] prior to building the derivation in question. | ||||||
| > A derivation is a specification for running an executable on precisely defined input to produce one or more [store objects][store object]. | ||||||
| [realised]: @docroot@/glossary.md#gloss-realise | ||||||
| This page describes *building* a derivation, which is to say following the instructions in the derivation to actually run the executable. | ||||||
| In some cases the derivation is self-explanatory. | ||||||
| For example, the arguments specified in the derivation really are the arguments passed to the executable. | ||||||
| In other cases, however, there is an additional procedure that applies to all derivations and is therefore *not* specified in the derivation. | ||||||
|
Member
I don't really know what "procedure" refers to.
||||||
| This page also specifies this invariant procedure, which is the same for all derivations. | ||||||
|
Member
Don't understand this sentence either. What's the "invariant procedure"?
||||||
|
|
||||||
| The chief design consideration for the building process is *determinism*. | ||||||
| Conventional operating systems are typically not designed with determinism in mind. | ||||||
| But determinism is needed to make Nix's caching a transparent abstraction. | ||||||
|
Member
Suggested change
or maybe "substitution" |
||||||
|
|
||||||
| > **Explanation** | ||||||
| > | ||||||
| > For example, no one wants to slightly modify a derivation, and then find that it no longer builds for an unrelated reason, because the original derivation *also* doesn't build anymore, but the cache hit on the original derivation was hiding this. | ||||||
| > We want builds that succeeded once to continue succeeding, to encourage fearless modification of old build recipes. | ||||||
|
Member
Suggested change
Member
More importantly, it ensures that something that builds on one machine also builds on another machine, e.g. two developers will get the same result.
||||||
| > Determinism is what enables things that once worked to keep working. | ||||||
| The life cycle of a build can be broken down into 3 parts: | ||||||
|
Member
There is also step 0: build/substitute the dependencies.
||||||
|
|
||||||
| 1. Spawn the builder process with the proper environment, including the correct process arguments, environment variables, and file system state. | ||||||
|
|
||||||
| 2. Wait for the standard output and error of the process to be closed and/or the process to exit. | ||||||
| (If the standard streams are closed but the process hasn't exited, Nix will kill the process.) | ||||||
|
|
||||||
| Nix also logs the standard output and error of the process, but this is just for human convenience and does not influence the behavior of the system. | ||||||
| (Builder processes have no idea what the consumer of their standard output and error does with those streams, only that they are indeed consumed so buffers do not fill up and writes to them will continue to succeed.) | ||||||
|
|
||||||
| 3. Processing the outputs after the builder has exited. | ||||||
|
|
||||||
| - Once this is done, the derivation is *normalized*, replacing each input deriving path with its store path, which we now know from realising the input. | ||||||
| The builder process on exit should have left behind files for each output the derivation is supposed to produce. | ||||||
|
Member
Suggested change
Also, "files" is vague, why not say "it should have created each store path the derivation is supposed to produce". |
||||||
| The files must be processed to turn them into bona fide store objects. | ||||||
| If the processing succeeds, those store objects are associated with the derivation as (the results of) a successful build. | ||||||
|
Member
Suggested change
We don't really associate them though. (There's the deriver field in the database but it's kind of useless.) This should probably say something like "the store paths are registered as valid in the Nix database". |
||||||
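The three-step life cycle described above can be sketched roughly as follows. This is a minimal illustration in Python, not Nix's implementation: `run_builder` and its parameters are hypothetical names, and the sketch simply waits for the process to exit rather than separately tracking when the standard streams are closed.

```python
import subprocess
import tempfile


def run_builder(builder, args, env):
    """Spawn a builder, collect its combined output, and report success.

    Hypothetical helper for illustration only.
    """
    # Step 1: spawn with exactly the given arguments and environment,
    # in a fresh, initially empty working directory.
    with tempfile.TemporaryDirectory() as build_dir:
        proc = subprocess.Popen(
            [builder, *args],
            env=env,                   # nothing is inherited from the parent
            cwd=build_dir,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # combine standard output and error
        )
        # Step 2: consume the stream until it is closed, then wait for exit.
        log, _ = proc.communicate()
    # Step 3 (processing the outputs) would start here, on inert data.
    return proc.returncode == 0, log
```

The log is returned to the caller only for convenience; as the text says, the builder cannot observe what happens to it.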
|
|
||||||
| ## Builder Execution {#builder-execution} | ||||||
| Step (3) happens externally, with just inert data since the process has exited or been killed by then. | ||||||
|
Member
"externally" is vague. External to what?
||||||
| Step (1) however is best described not from Nix's perspective, but from the process's perspective. | ||||||
|
Member
"Process" is also vague. The "build process" in general, or is that "process" in the OS sense?
||||||
|
|
||||||
| > **Explanation** | ||||||
| > | ||||||
| > Ultimately, what matters for determinism is the behavior of I/O operations that the process attempts (whether these are successes or failures), because of how they affect the output files, and how they affect the further execution of the builder process. | ||||||
| > From Nix (and the operating system)'s perspective, there are many, many different ways --- different implementation strategies --- of effecting the same I/O behavior. | ||||||
|
Member
Suggested change
Not sure if markdown |
||||||
| > But from the process's perspective, there is only one correct behavior. | ||||||
|
Member
TBH I didn't understand this paragraph.
||||||
| ## What derivations can be built | ||||||
|
|
||||||
| Actually only some derivations are ready to be built. | ||||||
| In particular, only [*resolved*](./resolution.md) derivations can be built. | ||||||
|
Comment on lines +50 to +51
Member
Should probably read "Before derivations can be built, they need to be resolved".
||||||
| That is to say, a derivation that depends on other derivations is not ready yet to be built, because those other derivations might not be built. | ||||||
| If the other derivations are indeed built, we can witness this fact by resolving the derivation, and converting all the derivation's input references into plain store paths. | ||||||
|
|
||||||
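As an illustration of the resolution step described above, here is a minimal sketch in Python. The dictionary layout (`input_drvs`, `input_srcs`) and the `realised` mapping are hypothetical and chosen only for this example; they are not Nix's actual data structures.

```python
def resolve(drv, realised):
    """Return a resolved copy of `drv`, or None if an input is not built yet.

    `drv` is a dict with hypothetical keys:
      "input_drvs": {drv_path: [output_name, ...]}
      "input_srcs": set of store paths
    `realised` maps (drv_path, output_name) to a store path.
    """
    srcs = set(drv["input_srcs"])
    for drv_path, output_names in drv["input_drvs"].items():
        for name in output_names:
            path = realised.get((drv_path, name))
            if path is None:
                return None  # a dependency is not built: cannot resolve yet
            srcs.add(path)
    # A resolved derivation depends only on plain store paths.
    return {**drv, "input_drvs": {}, "input_srcs": srcs}
```

Returning `None` models "not ready to be built"; a successful result is the witness that every input was already built.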
| > **Note** | ||||||
| > | ||||||
| > Note that [input-addressing](derivation/outputs/input-address.md) derivations are improperly resolved. | ||||||
| > As discussed on the linked page, the current input-addressing algorithm does not respect resolution-equivalence of derivations (\\(\\sim_\mathrm{Drv}\\)). | ||||||
| > That means that if Nix properly resolved an input-addressed derivation, the resolved derivation would have different input addresses, violating expectations. | ||||||
| > Nix therefore improperly resolves the derivation, keeping its original input address output paths, creating an invalid derivation that is both resolved and instructed to create the outputs at the originally expected paths. | ||||||
|
Comment on lines +57 to +60
Member
"Improper" and "violating expectations" is a questionable choice of words, since this is how Nix has always behaved. By definition, how Nix does it, is the proper way of doing it. Also I don't really know what "resolution-equivalence of derivations" means or what "an invalid derivation that is both resolved and instructed to create the outputs at the originally expected paths" refers to. (Also, "invalid" in the context of store paths has a specific meaning, i.e. not registered as a valid path in the Nix database. Is that what it refers to?)
||||||
| ## Environment of the builder process | ||||||
|
|
||||||
| The [`builder`](./derivation/index.md#builder) is executed as follows: | ||||||
|
|
||||||
| - A temporary directory is created where the build will take place. The | ||||||
| current directory is changed to this directory. | ||||||
| ### File system | ||||||
|
|
||||||
| The builder should have access to a limited file system where only certain objects are available. | ||||||
|
Member
Suggested change
or more specifically, "only the store paths that it depends on". "may" because sandboxing is not required. |
||||||
| The most important exposed files are the inputs (other store objects) of the (resolved) derivation. | ||||||
| Additionally, some other files are exposed. | ||||||
|
|
||||||
| #### Store inputs | ||||||
|
|
||||||
| The builder will be run against a file system in which the [closure] of the inputs is mounted inside the [store directory][store directory path]. | ||||||
|
Member
The word "mounted" should be avoided because that's just one possible implementation. E.g. we don't do that on macOS or when sandboxing is disabled on Linux.
||||||
| In particular, consider a store that just contains this closure. | ||||||
| That store may be exposed to the file system according to the rules specified in the [Exposing Store Objects in OS File Systems](./store-path.md#exposing) documentation. | ||||||
| This precisely defines the file system layout of the store that should be visible to the builder process. | ||||||
|
|
||||||
| > **Note** | ||||||
| > | ||||||
| > Historically, Nix exposed *at least* the following store contents to the builder, but also arbitrary other store objects, due to limitations around operating systems' file system virtualization capabilities, and wanting to avoid copying or moving files. | ||||||
|
Member
We shouldn't use the word "Historically" if it's still the case today. We're not documenting platonic ideal Nix :-)
||||||
| > It still can do this in so-called *unsandboxed* builds. | ||||||
| > | ||||||
| > Such builds should be considered an unsafe extension, but one that works less badly against non-malicious derivations than might be expected. | ||||||
|
Member
"Unsafe" is not the right word here. It's just worse for determinism/reproducibility, but then we can never guarantee those entirely anyway.
||||||
| > This is because store paths are relatively unpredictable, so a well-behaved program is unlikely to stumble upon a store object it wasn't supposed to know about. | ||||||
| > | ||||||
| > As operating systems developed better file system primitives, the need for disabling sandboxing has lessened greatly over the years, and this trend should continue into the future. | ||||||
| [realised]: @docroot@/glossary.md#gloss-realise | ||||||
| [closure]: @docroot@/glossary.md#gloss-closure | ||||||
| [store directory path]: ./store-path.md#store-directory-path | ||||||
|
|
||||||
| ### Other file system state | ||||||
|
|
||||||
| - The current working directory of the builder process will be a fresh temporary directory that is initially empty. | ||||||
|
Member
... except for a couple of files.
||||||
|
|
||||||
| See the per-store [`build-dir`](@docroot@/store/types/local-store.md#store-local-store-build-dir) setting for more information. | ||||||
|
|
||||||
| - The environment is cleared and set to the derivation attributes, as | ||||||
| specified above. | ||||||
| - Basic device nodes for essential operations (null device, random number generation, standard streams as a pseudo terminal) | ||||||
|
|
||||||
| (A pseudo terminal would not be strictly necessary since the standard streams are passively logging, not there to facilitate interaction. | ||||||
| But it is still useful to entice programs to do nicer logging with e.g. colors etc.) | ||||||
|
|
||||||
| - On Linux: Process information via `/proc` | ||||||
|
|
||||||
| - Minimal user and group identity information | ||||||
|
|
||||||
| - In addition, the following variables are set: | ||||||
| - A loopback-only network configuration with hostname set to `localhost` | ||||||
|
|
||||||
| - `NIX_BUILD_TOP` contains the path of the temporary directory for | ||||||
| this build. | ||||||
| > **Note** | ||||||
| > | ||||||
| > Fixed-output derivations have access to additional operating system state to facilitate communication with the outside world, such as network name resolution and TLS certificate verification. | ||||||
| > This is necessary because these derivations are allowed to access the network, unlike regular derivations which are fully sandboxed. | ||||||
| - Also, `TMPDIR`, `TEMPDIR`, `TMP`, `TEMP` are set to point to the | ||||||
| temporary directory. This is to prevent the builder from | ||||||
| accidentally writing temporary files anywhere else. Doing so | ||||||
| might cause interference by other processes. | ||||||
| ### Environment variables {#env-vars} | ||||||
|
|
||||||
| - `PATH` is set to `/path-not-set` to prevent shells from | ||||||
| initialising it to their built-in default value. | ||||||
| The environment is cleared and set to the derivation attributes, as | ||||||
| specified above. | ||||||
|
|
||||||
| - `HOME` is set to `/homeless-shelter` to prevent programs from | ||||||
| using `/etc/passwd` or the like to find the user's home | ||||||
| directory, which could cause impurity. Usually, when `HOME` is | ||||||
| set, it is used as the location of the home directory, even if | ||||||
| it points to a non-existent path. | ||||||
| For most derivation types this must contain at least: | ||||||
|
|
||||||
| - `NIX_STORE` is set to the path of the top-level Nix store | ||||||
| directory (typically, `/nix/store`). | ||||||
| - For each output declared in `outputs`, the corresponding environment variable is set to point to the intended path in the Nix store for that output. | ||||||
| Each output path is a concatenation of the cryptographic hash of all build inputs, the `name` attribute and the output name. | ||||||
|
Member
It's probably better to say that each output path has the form |
||||||
| (The output name is omitted if it’s `out`.) | ||||||
|
Member
Suggested change
|
||||||
|
|
||||||
| - `NIX_ATTRS_JSON_FILE` & `NIX_ATTRS_SH_FILE` if `__structuredAttrs` | ||||||
| is set to `true` for the derivation. A detailed explanation of this | ||||||
| behavior can be found in the | ||||||
| [section about structured attrs](@docroot@/language/advanced-attributes.md#adv-attr-structuredAttrs). | ||||||
| In addition, the following variables are set: | ||||||
|
|
||||||
| - For each output declared in `outputs`, the corresponding | ||||||
| environment variable is set to point to the intended path in the | ||||||
| Nix store for that output. Each output path is a concatenation | ||||||
| of the cryptographic hash of all build inputs, the `name` | ||||||
| attribute and the output name. (The output name is omitted if | ||||||
| it’s `out`.) | ||||||
| - `NIX_BUILD_TOP` contains the path of the temporary directory for this build. | ||||||
|
|
||||||
| - If an output path already exists, it is removed. Also, locks are | ||||||
| acquired to prevent multiple [Nix instances][Nix instance] from performing the same | ||||||
| build at the same time. | ||||||
| - Also, `TMPDIR`, `TEMPDIR`, `TMP`, `TEMP` are set to point to the temporary directory. | ||||||
| This is to prevent the builder from accidentally writing temporary files anywhere else. | ||||||
| Doing so might cause interference by other processes. | ||||||
|
|
||||||
| - A log of the combined standard output and error is written to | ||||||
| `/nix/var/log/nix`. | ||||||
| - `PATH` is set to `/path-not-set` to prevent shells from initialising it to their built-in default value. | ||||||
|
|
||||||
| - The builder is executed with the arguments specified by the | ||||||
| attribute `args`. If it exits with exit code 0, it is considered to | ||||||
| have succeeded. | ||||||
| - `HOME` is set to `/homeless-shelter`. | ||||||
| (Without sandboxing, this serves as "soft sandboxing" --- it discourages programs from using `/etc/passwd` or the like to find the user's home directory, which could cause impurity.) | ||||||
| Usually, when `HOME` is set, it is used as the location of the home directory, even if it points to a non-existent path. | ||||||
|
|
||||||
| - The temporary directory is removed (unless the `-K` option was | ||||||
| specified). | ||||||
| - `NIX_STORE` is set to the path of the top-level Nix [store directory path] (typically, `/nix/store`). | ||||||
|
|
||||||
| - `NIX_ATTRS_JSON_FILE` & `NIX_ATTRS_SH_FILE` if `__structuredAttrs` is set to `true` for the derivation. | ||||||
| A detailed explanation of this behavior can be found in the [section about structured attrs](@docroot@/language/advanced-attributes.md#adv-attr-structuredAttrs). | ||||||
|
|
||||||
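The variables listed above can be assembled as in the following sketch. It is a minimal illustration in Python: `builder_environment` and its parameters are hypothetical names, and the order in which values override one another is an assumption of this sketch, since the text does not specify it.

```python
def builder_environment(drv_env, output_paths, build_top, store_dir="/nix/store"):
    """Build the environment for a builder from scratch (nothing inherited).

    drv_env:      derivation attributes, as a dict of strings
    output_paths: maps each output name (e.g. "out") to its intended store path
    build_top:    path of the temporary build directory
    """
    env = {
        "PATH": "/path-not-set",      # stops shells from using a built-in default
        "HOME": "/homeless-shelter",  # discourages looking up a real home directory
        "NIX_STORE": store_dir,
    }
    # Assumption: derivation attributes may override the defaults above.
    env.update(drv_env)
    # One variable per declared output, pointing at its intended store path.
    env.update(output_paths)
    # Assumption: the temporary-directory variables are set last.
    env["NIX_BUILD_TOP"] = build_top
    for name in ("TMPDIR", "TEMPDIR", "TMP", "TEMP"):
        env[name] = build_top
    return env
```

Because the dictionary is built from scratch, nothing from the calling process's environment can leak into the build.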
| ## Builder Execution | ||||||
|
|
||||||
| - If an output path already exists, it is removed. | ||||||
| Also, locks are acquired to prevent multiple [Nix instances][Nix instance] from performing the same build at the same time. | ||||||
|
|
||||||
| - A log of the combined standard output and error is written to `/nix/var/log/nix`. | ||||||
|
|
||||||
| - The builder is executed with the arguments specified by the attribute `args`. | ||||||
| If it exits with exit code 0, it is considered to have succeeded. | ||||||
|
|
||||||
| - The temporary directory is removed (unless the [`--keep-failed`](@docroot@/command-ref/opt-common.md#opt-keep-failed) option was specified). | ||||||
|
Member
If the build succeeds it's always removed, regardless of |
||||||
|
|
||||||
| ## Processing outputs | ||||||
|
|
||||||
| If the builder exited successfully, the following steps happen in order to turn the output directories left behind by the builder into proper store objects: | ||||||
|
|
||||||
| - **Normalize the file permissions** | ||||||
|
|
||||||
| Nix sets the last-modified timestamp on all files | ||||||
| in the build result to 1 (00:00:01 1/1/1970 UTC), sets the group to | ||||||
| the default group, and sets the mode of the file to 0444 or 0555 | ||||||
| (i.e., read-only, with execute permission enabled if the file was | ||||||
| originally executable). Any possible `setuid` and `setgid` | ||||||
| bits are cleared. | ||||||
|
|
||||||
| > **Note** | ||||||
| > | ||||||
| > Setuid and setgid programs are not currently supported by Nix. | ||||||
| > This is because the Nix archives used in deployment have no concept of ownership information, | ||||||
| > and because it makes the build result dependent on the user performing the build. | ||||||
| The files must conform to the model described in the [Exposing in OS file systems](./file-system-object/os-file-system.md) section. | ||||||
| For example, timestamps and permissions must be forced to sentinel values. | ||||||
|
Member
Not sure if "sentinel value" is the right term. More like "canonical value".
||||||
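The timestamp and permission normalization described above can be sketched as follows. This is a minimal illustration in Python under stated assumptions: `canonicalise_tree` and `canonicalise_entry` are hypothetical names, symbolic links are skipped, and ownership and group are left untouched.

```python
import os
import stat


def canonicalise_entry(path):
    """Force one file or directory to a fixed timestamp and read-only mode."""
    st = os.lstat(path)
    if stat.S_ISLNK(st.st_mode):
        return  # symlinks are skipped in this sketch
    if stat.S_ISDIR(st.st_mode) or st.st_mode & stat.S_IXUSR:
        mode = 0o555  # directories and executables keep execute permission
    else:
        mode = 0o444
    # Setting the full mode also clears any setuid and setgid bits.
    os.chmod(path, mode)
    # Last-modified (and access) time forced to 1 second after the epoch.
    os.utime(path, (1, 1))


def canonicalise_tree(root):
    """Apply canonicalise_entry to every file and directory under root."""
    # Bottom-up, so each directory is handled after its contents.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            canonicalise_entry(os.path.join(dirpath, name))
        canonicalise_entry(dirpath)
```

After this pass, two builds that produced the same file contents no longer differ in modification times or permission bits.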
|
|
||||||
| - **Calculate the references** | ||||||
|
|
||||||
| Nix scans each output path for | ||||||
| references to input paths by looking for the hash parts of the input | ||||||
| paths. Since these are potential runtime dependencies, Nix registers | ||||||
| them as dependencies of the output paths. | ||||||
| Nix scans each output path for references to input store objects by looking for the store path digests of each input. | ||||||
|
Member
I think "hash part" is more common in our docs than "store path digest". At least I don't remember ever seeing that terminology.
||||||
| (The name part is ignored when scanning; an input's hash part that is not followed by a `-` and the correct name part still scans as a reference. | ||||||
| Likewise, a digest not preceded by the [store directory path] also still scans as a reference.) | ||||||
| Since these are potential runtime dependencies, Nix will register them as references of the output store object they occur in. | ||||||
|
|
||||||
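The scanning rule above (match on the hash part alone, regardless of the store directory prefix or the name part) can be sketched as follows. This is a minimal illustration in Python, not Nix's scanner: the function names are hypothetical, the 32-character hash length is stated here as an assumption about the store path format, and only regular file contents and symlink targets are searched.

```python
import os

HASH_PART_LEN = 32  # assumed length of the hash part of a store path's base name


def hash_part(store_path):
    """Return the hash part of a path like <store dir>/<hash>-<name>."""
    return os.path.basename(store_path)[:HASH_PART_LEN]


def scan_references(output_root, candidate_paths):
    """Return the candidates whose hash part occurs anywhere under output_root."""
    wanted = {hash_part(p).encode(): p for p in candidate_paths}
    found = set()
    for dirpath, _dirnames, filenames in os.walk(output_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                data = os.readlink(full).encode()  # scan the link target
            else:
                with open(full, "rb") as f:
                    data = f.read()
            for digest, path in wanted.items():
                if digest in data:
                    found.add(path)
    return found
```

Passing both the inputs and the derivation's own outputs as `candidate_paths` covers references between outputs as well.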
| Nix also scans for references to other outputs' paths in the same way, because outputs are allowed to refer to each other. | ||||||
| Nix also scans for references from one output to another in the same way, because outputs are allowed to refer to each other. | ||||||
| If the outputs' references to each other form a cycle, this is an error, because the references of store objects must be acyclic. | ||||||
|
|
||||||
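The acyclicity check on references between outputs can be sketched with a depth-first search. This is a minimal illustration in Python; `has_cycle` is a hypothetical name, and treating an output's reference to itself as harmless is an assumption of this sketch rather than something the text states.

```python
def has_cycle(references):
    """Return True if the references between outputs contain a cycle.

    `references` maps each output name to the set of outputs it refers to.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {name: WHITE for name in references}

    def visit(node):
        colour[node] = GREY
        for target in references.get(node, ()):
            if target == node:
                continue  # assumption: a self-reference is not a cycle
            state = colour.get(target, WHITE)
            if state == GREY:
                return True  # reached an output already on the current path
            if state == WHITE and visit(target):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(references))
```

For example, `out` referring to `dev` while `dev` refers back to `out` is reported as a cycle, whereas `out` referring to `lib` alone is not.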
| In the case of derivations whose output paths are fixed in advance (i.e. [input-addressing] derivations, or [fixed content-addressing] derivations), the actual final store path to each output is used during the build. | ||||||
|
Member
Suggested change
|
||||||
| For [floating content-addressing] derivations, however, the final store path is not known in advance by definition. | ||||||
| Scratch store paths must therefore be used instead. | ||||||
| Scanning will use those scratch paths, but then any output-to-be that contains such a scanned scratch path must be rewritten to instead use the final (content-addressed) path of the output in question. | ||||||
|
Member
Better to avoid the passive voice here, i.e. "Nix rewrites the outputs..." instead of "... must be rewritten".
||||||
|
|
||||||
| At this point, the file system data is in the proper form, and the valid acyclic reference data for each output is also calculated, so the outputs can be registered as proper store objects, and associated with the derivation in the [build trace] in the record for a successful build. | ||||||
|
Member
Suggested change
(dropping the "build trace" since I don't know what that is) |
||||||
|
|
||||||
| [Nix instance]: @docroot@/glossary.md#gloss-nix-instance | ||||||
| [input-addressing]: ./derivation/outputs/input-address.md | ||||||
| [fixed content-addressing]: ./derivation/outputs/content-address.md#fixed | ||||||
| [floating content-addressing]: ./derivation/outputs/content-address.md#floating | ||||||
| [build trace]: ./build-trace.md | ||||||