buildDenoPackage: init #419255

Conversation
In general we want to do as little processing as possible in FOD derivations, so that we have little reason to change them. The fundamental thing only a FOD can do is download resources from the internet. It’s possible to adapt things to whatever layout a new version of Deno is expecting in the non‐FOD build, but it’s not possible to change existing FODs. It’s of course acceptable to change the format for new lock file versions, since there are never any existing FODs using those.

So, for instance, my suggestion would be to do the bare minimum of computing the downloads that need to be done and placing them in the output directory without any further processing unless absolutely necessary. I’m not sure exactly what the format looks like for JSR packages, but for NPM maybe we could adapt the existing fetcher codebase so that there’s a lower layer that can take the relevant information for a single package directly rather than requiring it to be in

As long as the downloads are unprocessed and placed in a naming scheme that we don’t expect to need to change (e.g. one consideration that has been required with Cargo is ensuring that we can support multiple versions of the same package name in a dependency tree), it should be pretty safe.
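To make the "bare minimum of processing" idea concrete, here is a hedged Nix sketch of such an unprocessed per-package download (all names here are hypothetical illustrations, not this PR's actual API):

```nix
# Hypothetical sketch: one fixed-output download per npm tarball, stored
# unprocessed. Embedding the version in the store name ensures multiple
# versions of the same package name can coexist in one dependency tree.
{ fetchurl }:
{ pname, version, url, hash }:
fetchurl {
  inherit url hash;
  # note: scoped names like "@scope/pkg" would need sanitizing for store names
  name = "deno-npm-dep-${baseNameOf pname}-${version}.tgz";
}
```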
Deno maintainer dsherret pointed me to some Rust crates which the Deno CLI uses, like https://github.com/denoland/deno_cache_dir and https://crates.io/crates/deno_npm_cache. Here comes the idea:

What do you think about that? Am I missing something?
If we don’t want to break FODs, we would have to keep the fixed Deno version and all its dependencies building forever, likely multiple versions as we add more to support newer lock file formats, and run into the fact that the versions will inevitably become EOL and develop security vulnerabilities (that may even be relevant to FOD fetching, e.g. TLS issues). Breaking FODs loudly is an option, but a pretty frustrating one for downstream users, depending on how often it happens. I am not sure that
And that is not the case for a custom fetcher, because we can bump the dependencies of the custom fetcher (like the compiler) without changing the hashes of the FODs it produces? Anyway, the perk of breaking hashes way less often, by decoupling the fetcher and transformer, convinced me of the solution. Thank you for your patience; I needed a little bit of time to fully understand the implications. I'm going to write the custom fetcher in Go, since I'm not really acquainted with Rust yet. I also looked into a way to have an equivalent to

Edit: should be possible
Is there a good reason to have the option to build the dependencies with a hash, or would it be sufficient to just have a fetcher that imports all hashes from the lockfile? And another question: in the case of deno and jsr, it seems that individual files are downloaded and hashed, instead of whole tarballs. So if I implement an
So my plan currently is to just build an

And it works like this:

NOTE: since there are 3 kinds of dependencies with Deno (npm, jsr and direct URL), there are usually 3 cases to consider.
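Since the plan distinguishes npm, jsr and direct-URL dependencies, the case split could be as simple as a prefix check on the specifier. A hedged Nix sketch (assuming nixpkgs' `lib` is in scope; not this PR's actual code):

```nix
{ lib }:
# Hypothetical sketch: classify a deno.lock specifier into one of the three
# dependency kinds (npm, jsr, direct URL) mentioned above.
specifier:
if lib.hasPrefix "npm:" specifier then "npm"
else if lib.hasPrefix "jsr:" specifier then "jsr"
else "url" # anything else is treated as a direct https import
```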
With preprocessing and postprocessing, the FODs won't change often, if ever, since they are decoupled from any breaking changes that Deno could introduce. Still, it's possible that something in step 2 will have to be changed, so I am unsure about whether 2.3. is a good idea, since it introduces a dependency on the

You said
But I'm not sold on that yet either. There are multiple individual Rust crates that contain the logic for the things we need to do. For deno.lock: https://crates.io/crates/deno_lockfile, and some more. From what I can tell they are not really doing what we need. On paper it seems like a good idea to use them, but somehow they don't really fit into the architecture I outlined now. Ideas? Objections?
Update: implementing the jsr side of things like this went fine. The URL dependencies proved to be more convoluted than expected, since they require custom transformation logic per domain and I haven't fully understood the details yet.
Update: I've hit a snag with the importLock implementation. Deno requires knowing about some response headers of fetched files. While you can use the curl flag

This seems to be inevitable. It's imaginable to do this with a hack, by hard-coding those headers in certain situations, since they are only required in rare cases, but there would probably be problems with that. So I think it's not possible to do the importLock cleanly, which means I'm going to stick to the approach which requires an explicit hash in the Nix derivation.
@emilazy While talking about internals with Deno maintainers on their Discord, a Flatpak contributor told me they also did a Deno build helper for Flatpak recently. I looked at the code and it got me thinking. At Flatpak they have a standardized interface for the whole custom-fetcher problem. I thought it would be cool if nixpkgs could also have a standardized interface. Surely it won't work in every case, but unifying most of the language build helpers could improve maintainability and at the same time encourage doing it the right way, the way you told me to do it now: decoupling the FODs from any upstream changes.

Single FOD custom fetcher

A non-importLock custom fetcher could look something like this:
{
  hash = "";
  packages = {
    "<unique package or file name>" = {
      url = "<url>";
      path = null; # in step 2., the url is transformed into a unique path within the derivation, maybe by creating a hash of the url
      curlOpts = ""; # per-package curl opts
      meta = { }; # attribute set of arbitrary shape that is passed through
    };
    # ...
  };
  curlOpts = ""; # global curl opts
  derivation = null; # filled in, in step 2.
}
importLock custom fetcher

An importLock interface could look something like this:
{
  packages = {
    "<unique package or file name>" = {
      hash = "<hash>";
      url = "<url>";
      derivation = null; # filled in, in step 2.
      fetchurlArgs = { }; # per-package args sent to fetchurl
      meta = { }; # attribute set of arbitrary shape that is passed through
    };
    # ...
  };
  fetchurlArgs = { }; # global args sent to fetchurl
}
Probably there could be some useful helper functions for problems that commonly occur. What do you think about this?
If we come up with a generic fetcher interface for npm and cdn/jsr dependencies, there is still the gap of the custom files and metadata both package managers expect. Did you manage to figure out some way of doing that? Because from my understanding that's equally complex. As an idea: I imagine Deno could have a CLI command to pre-populate its cache with offline packages; then running deno install or similar would produce its own structure as needed (in a non-FOD). Since you seem to be quite active with the Deno maintainers, it could be worth asking if they offer something like that or endorse the idea. I imagine Flatpak could need something similar.
I think you misunderstood me there. I proposed a generic interface for all custom fetchers in nixpkgs. But the only thing that is generalizable is the fetching itself. So there would be 3 steps to creating a custom fetcher:
So this is somewhat similar to writing 2 adapters. The point of this would be to encourage a common architecture across custom fetchers, which is endorsed by the nixpkgs maintainers. The format I have in mind has enough room to do things like metadata. I iterated on it last week, and its use will become apparent in the code and documentation when I push it here this week.
I'm not sure I understand your idea. The problem we have to solve is that we have to do the fetching with our custom code, and then we need an interface provided by Deno where we can feed in our Nix store paths with the fetched data, and it will produce the directory structure it needs. Last week I finally managed to do exactly that for the jsr and http packages using Deno's provided library. It took a while to find it and figure out how to use it, though.
Force-pushed from 8516e5b to 983c0a6
Yes, I assumed this.

That's what I meant with the original comment. How do you plan on solving this? My proposal was to try using Deno itself, to avoid complexity; unsure if Deno has some interfaces to support it.
@@ -204,7 +204,7 @@ rustPlatform.buildRustPackage (finalAttrs: {
   postInstall = ''
     # Remove non-essential binaries like denort and test_server
-    find $out/bin/* -not -name "deno" -delete
+    find $out/bin/* -not -name "deno" -not -name "denort" -delete
Deno already builds the denort binary but it's removed. The correct solution for us would probably have been to add an extra output & keep denort.
@06kellyjac is this what you meant in the previous PR?
As I said in my comment before:
This interface only works for https and jsr imports, though. Yesterday @rorosen helped me write a small Rust wrapper around that interface, which you can look at here: https://github.com/NixOS/nixpkgs/pull/419255/files#diff-657a96d5ebec4ff9ca7cf304c40ec07d4eb6f05b12b910577aa4719377f14e10

I had a wrapper written in Deno before, but it proved difficult packaging that wrapper cleanly with Nix without this very PR (chicken-and-egg problem). Currently, I handle npm imports manually, which means there is also only limited support. I outlined all the features that I am aware of, both missing and implemented, in the readme I wrote for the fetcher.
Hi, I haven’t had time to read the whole thread (sorry), but just to belatedly reply to
We unfortunately can’t import locks from fetched sources because that would be IFD (import‐from‐derivation – a misleading term for “evaluation depending on the contents of derivation outputs”), and in general people aren’t too happy about having to vendor lock files in Nixpkgs to work around this (#327064). So I’m not sure what the current state of things is, but I would suggest prioritizing a hash‐based fetcher, as it’s likely what we’d want to use for most packages that have lock files checked in upstream.
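For illustration, this is the kind of expression that would trigger IFD (a hypothetical example, not code from this PR):

```nix
# Reading deno.lock out of a fetched source forces the `src` derivation to
# be built at evaluation time. That is import-from-derivation (IFD), and it
# fails when evaluated with `--option allow-import-from-derivation false`.
let
  src = fetchFromGitHub { /* owner, repo, rev, hash */ };
  lock = builtins.fromJSON (builtins.readFile "${src}/deno.lock");
in
  lock.version
```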
@emilazy thanks for checking in
Hm. I'm not sure I understand. What are
Does this not inevitably happen? I need to parse the lockfile at least for the list of packages. Depending on that, I construct FODs. Why does it matter whether I use the hashes from the lockfile or not?
Update on the progress: The build for the fresh-init-cli works now, which is certainly cool, since it is a large project with many dependencies. Currently binary builds for packages with imports from

When building the fresh-init-cli, I noticed that the way this builder currently works is extremely slow, since it will create separate derivations per file of a jsr dependency. This means a lot of disk I/O for many imports of large jsr packages.
Co-authored-by: rorosen
It's used by the buildDenoPackage build helper to compile Deno projects into a binary.
Force-pushed from 8fd2775 to bd54b5e
@hsjobeki This is a reopen of the reverted PR #407434.

This PR is ready for review. You don't need to read the PR comments above, as those were mainly about me understanding nixpkgs' requirements and emilazy explaining them to me.

You can build the tests locally with `nix build -f . --option allow-import-from-derivation false tests.build-deno-package`, but mind that it will take a long time, since this PR adapts the `pkgs.deno` package, which means you have to fully rebuild it.

Compared to the previous PR, mainly the custom fetcher changed. To understand what is going on in the fetcher, you can read this documentation I wrote about the features the fetcher needs to have. And to understand the general architecture of the fetcher, you can read this documentation, which is also a collection of general problems that language build helpers need to solve, along with a proposal for a standardized architecture, which could be improved upon and added to the nixpkgs documentation.
@aMOPel Were issues raised with Deno about the missing info in the lockfile? |
I made this issue. Maybe I should have made it in the main Deno repo. If you're referring to how this is handled in this build helper, then the fact is that there is no "import from lock" functionality implemented, among other reasons because of the missing files in the lockfile. So a Nix build with this build helper will always need a hash specified in Nix.
First of all, thank you for your work @aMOPel.

Running a `deno install --vendor`:

$ deno install --vendor --entrypoint ./silverbullet.ts
# . . .
$ ls -lrt vendor/deno.land/x/fuse@v6.4.1/dist/
total 24
-rw-r--r-- 1 aorith aorith 15426 ago 9 07:59 fuse.esm.min.js
-rw-r--r-- 1 aorith aorith  6894 ago 9 07:59 fuse.d.ts

Whereas if I add the same debug
@aorith Thank you for your testing. I am going to fix that on Monday.
@aorith This is indeed an edge case I have not handled in my fetcher so far. I asked in the Deno Discord now what the logic behind it is, since the
It is imported in a
This is very frustrating to add as a feature, because these types are only in the source files, not in the lock, but the fetcher already needs to know about them. I could pass the source files along with the lock file to the fetcher, but this creates problems with the updating of the hash. Right now there is a check that tells you that the lock file has changed and you need to update the hash of the denoDeps, which will then result in a refetch. Doing the same thing for the whole source code would mean refetching on every code change while developing. Instead, what I need is another derivation, before the fetching step, that takes the src code as input and returns a JSON with all the types missing from the lockfile. As far as I understand, this would not create an IFD, so this should be fine.
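Such a pre-fetch scanning step could be an ordinary (non-FOD) derivation; a rough sketch, where the `@deno-types` grep pattern and all names are my own assumptions rather than this PR's implementation:

```nix
# Hypothetical sketch: scan the source tree for type-only specifiers and
# write them out for the fetcher. It depends only on `src`, not on the
# fetched dependencies, so it is neither an FOD nor IFD.
typeSpecifiers = runCommand "deno-type-specifiers" { } ''
  (grep -rhoE '@deno-types="[^"]+"' ${src} || true) | sort -u > $out
'';
```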
I added the feature now. With this:

silver-bullet = buildDenoPackage {
pname = "silver-bullet";
version = "";
denoDepsHash = "sha256-o/maeRY67ddiuErolu3Qq13ZX1s+HQdv2LrQANz0VEY=";
src = fetchFromGitHub {
owner = "silverbulletmd";
repo = "silverbullet";
rev = "d77f0286d8eb43c241aba8597d29d7e09d51d7b3";
hash = "sha256-h0OEp6oDSqo2pFGlth7piR521o+K/Jl5Ap65qpHBYfI=";
};
binaryEntrypointPath = "silverbullet.ts";
targetSystem = "x86_64-linux";
};

The build continues past the type check, which you got the error from. The silverbullet build also needs to do some other pre-build steps; those are out of scope for me, though. Mind that you could also have achieved that by just adding

Please tell me if the

@yacinehmito I guess this is interesting for you too, since this now implements fetching the

EDIT: issue
I get the same

I did a quick test, and I'm not sure if this is required for a proper build without the type-checking error, but after adding the following to

It complains again about a dependency:

Which indeed is not present:

(The build requires
Thanks for the quick feedback. That is of course another edge case 😄 there seems to be no end. I'm going to look into it.
@aorith Is there a very good reason why you wouldn't use "--no-check"? IMO the type checking does not have to happen in the nix-build. The user can do it on their machine if they so desire. I asked the Deno maintainers, and they don't plan to add the type files to the lockfile. There are many different complicated edge cases associated with this, and the required implementation is still error-prone sometimes. Just dropping support for type checking in this build helper would greatly reduce the fetching logic and reduce the maintenance burden.
@aMOPel No reason at all! I actually thought that it was something that you would like to add to the builder. I still cannot build the package, even with

We are using the builder successfully here

@wanderer That is great to hear! Thanks for your testing.
Since this PR is already at 5k lines of changes, we should discuss a review and merge strategy. I'd prefer giving reviewers smaller batches, if possible. Since the PR description says "this is in ready to merge state", could you give another simple overview of the final architecture? Maybe we decide to split this into sub-PRs so it's easier for us to review.
It says "ready for review", not merge. In fact, I plan to take out a little bit of code in the fetcher again, as I mentioned in a previous comment, probably next week.

The two markdown files that I linked in the description try to do that. One explains the architecture and its reasoning. One explains the features that the fetcher has and doesn't have, and the technical details of the Deno dependency system. I can try to rewrite them if they are not up to your standard.

I don't see how, but I'm open to being surprised. You can split out the documentation, sure. But it doesn't really make sense to split the code into pieces, since the build helper simply needs all the pieces to function. Also, then you would need to write new tests for the pieces individually. Currently there are only e2e tests.
First of all, thank you for the huge amount of work you’ve put into this PR — it’s clear you’ve invested a lot of thought and effort. What I’m saying below comes from a place of support for this PR. That said, a PR of this size is very challenging for us to review effectively. Even with its good documentation, 5k lines of code in one go is almost unreviewable, and it increases the risk of bugs slipping in unnoticed. For us to maintain quality, we really need changes to be structured in smaller chunks. You mentioned that the build helper needs all the pieces to function and has only e2e tests — but from our perspective, this suggests the code could benefit from more modularity. Smaller, well-defined units with corresponding unit tests would not only make it easier for us to review but also make the system more maintainable in the long term. One possible path would be:
If we can break this into smaller pieces, it’ll make it much easier for the community to give it the review it deserves and get it merged faster.
Looking at the architecture graph, I have a few immediate questions:
I believe 1.2k lines of that are lockfiles. The one Rust crate has a very big lockfile, and the other lockfiles are from the various e2e tests.
Can do. I don't see how it makes the review easier, since the amount of code doesn't change, but can do.
Yes. That would make the review easier. Unit tests would be good in a couple of places. How many separate PRs were you thinking of? Too much splitting would probably create a bunch of additional effort; I would like to find a good balance. I would like to split the builder and the fetcher and build tests for the multiple scripts, so the "lockfile transformer", the "fetcher" and the two "file transformers". The tests would test each of those as a black box and just look at input and output. That way we also get a sort of spec for the formats. Then I would create the fetcher PR first, then the builder PR and lastly the docs PR. Does that sound acceptable?

No, I tried to implement an "import from lockfile" functionality, but ultimately it was infeasible.
It is. The links I posted lead to the markdown files from the very branch this pr stems from. I just used permalinks so they wouldn't break. That doesn't mean the files are not on this branch anymore.
I talk about this differentiation, because the build helper can not do any IFD. So I try to visualize what steps happen during the evaluation time of nix code (the nix realm) vs what happens inside a nix build container.
I believe I did not do a good job making that clear in the markdown files. Yes, it is a custom format. It stemmed from my effort generalizing build-helper architecture. One point of it is that we need to decouple the format that the fetcher consumes from Deno's lockfile format, since the latter is subject to change and we only want to break the fetcher as few times as possible, to avoid refetching.
Fair. Yes. I believe so
Yes. I believe what is missing in the documentation is how this particular build helper's architecture relates to the general one. I can add that.
Yes that's right. I am aware that it creates another additional effort from your side. But ultimately reviewability has to win, if we don't want this PR to sit around forever.
Yes, agreed - I also think the right order would be:
In every PR include a reference to this PR and vice versa.
I think the design documents need to be updated to strictly show what's part of the PR, to avoid any possible confusion.
We are getting a build error now in https://github.com/masslbs/safewallet-to-beancount

Is the underscore in string_decoder causing a problem?
Oh wow. Apparently I messed up the template string of the error message. You can try going one commit back; I think the last commit might be the culprit. In fact, I am going to put this PR back to draft, since I have multiple plans for changes, as mentioned in the previous comments.
This is a reopen of the reverted PR #407434
EDIT:
This PR is ready for review.
You don't need to read the older PR comments below, as those were mainly about me understanding nixpkgs' requirements and emilazy explaining them to me.
You can build the tests locally with `nix build -f . --option allow-import-from-derivation false tests.build-deno-package`, but mind that it will take a long time, since this PR adapts the `pkgs.deno` package, which means you have to fully rebuild it.

Compared to the previous PR, mainly the custom fetcher changed.
To understand what is going on in the fetcher, you can read this documentation I wrote about the features, the fetcher needs to have:
https://github.com/NixOS/nixpkgs/blob/bd54b5e47f0a7858336b38fa30aaea7541841eb5/pkgs/build-support/deno/fetch-deno-deps/readme.md
And to understand the general architecture of the fetcher, you can read this documentation, which is also a collection of general problems that language build helpers need to solve, along with a proposal for a standardized architecture, which could be improved upon and added to the nixpkgs documentation:
https://github.com/NixOS/nixpkgs/blob/bd54b5e47f0a7858336b38fa30aaea7541841eb5/pkgs/build-support/deno/fetch-deno-deps/generic-fetcher-architecture-docs.md
EDITEND
Original text
Discussion
We need to discuss specifics.
@emanueljg
@06kellyjac
@emilazy
@hsjobeki
Previous Problem
@emilazy brought up concerns about the previous PR.
Namely it's an issue that FODs are built using the Deno CLI as a dependency.
Since the Deno CLI considers the format of the dependency cache an implementation detail, when we bump the Deno CLI version in nixpkgs, the Deno CLI could suddenly produce a FOD with a different hash. However, Nix caches FODs, and until somebody manually bumps the output hash of the FOD, the cached version is used, even if a rebuild would produce a different result. This can produce a mass breakage with a large delay, which can be very hard to debug.
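To illustrate the trap with a hedged sketch (not code from this PR): Nix validates only the output hash of an FOD, so the FOD's other inputs do not participate in cache invalidation.

```nix
# Hypothetical sketch of the previous PR's approach: the Deno CLI runs inside
# a fixed-output derivation. Bumping `deno` later does NOT invalidate an
# already-realised output with the same name and outputHash.
denoDeps = stdenv.mkDerivation {
  name = "myproject-deno-deps";
  nativeBuildInputs = [ deno ]; # changing this does not force a refetch
  buildCommand = "deno install ..."; # cache layout is an implementation detail
  outputHashMode = "recursive";
  outputHashAlgo = "sha256";
  outputHash = "sha256-..."; # only this is checked against the output
};
```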
The issue is fundamental to nix's FODs. The inputs (in this case the deno cli) are intentionally not checked when it comes to cache invalidation.
This is why build helpers in nixpkgs implement custom fetchers. This way nixpkgs has full control over when the FOD will change. If the dependency cache format expected by the Deno CLI changed, we would still produce the same FOD; however, the Deno CLI would tell us that the format is invalid and throw an error. Then the custom fetcher would have to be adapted.
Dependency cache format viewed as an interface
I see 3 parts to this interface:

1. The lock file has a version, currently `version: 5`. So the lock file is "versioned".
2. The format of the directory tree for the dependencies that Deno expects is not stable or documented.
3. On top of that, Deno injects many stateful files into that directory tree, to cache work.
   3.1. Some files can simply be deleted and will be regenerated by the Deno CLI.
   3.2. Some files can't be deleted (`registry.json` and `meta.json` files), since Deno won't recognize the dependencies if they are missing.

Statement from Deno maintainers
I asked the maintainers in the Deno discord about the (in)stability of the interface.
Breaking changes survey
I surveyed previous versions of Deno and compared the interface pairwise:
Versions:
deno 1.46.3 ab7b6889ae9d484eed2876868209e33eb262511d 2024-09-26
deno 2.1.1 882842d2a908700540d206baa79efb922ac1c33d 2024-11-25
deno 2.2.4 c5dd43934613ae0f8ff37c59f61c507c2e8f980d 2025-03-25
deno 2.2.12 4684fd6b0c01e4b7d99027a34c93c2e09ecafee2 2025-05-24
deno 2.3.6 3078b9a 2025-06-23
Diffs:
The changes were all not monumental, so it could be feasible to keep a custom fetcher up to date.
Before v2.0.0, the Deno CLI worked quite differently. For example, `deno install` would not generate a lock file. This highlights an aspect that I didn't think about before: when using the Deno CLI in the FOD, the CLI itself is also something that could experience breaking changes.

Building a custom fetcher
It would be possible to circumvent building the custom fetcher if we make the Deno version part of the `denoDeps` build's `name`. That way, when the Deno CLI version changes, the FOD has to be rebuilt and we don't run into the cache issue. However, we will have to do more rebuilds of the `denoDeps` FODs than necessary, since it appears that not every version bump breaks the interface.

Implementing the fetcher could prove difficult due to the complexity of Deno's package system. Deno supports multiple package sources:

- `jsr:` from the JSR registry
- `npm:` from the npm registry

Each of these sources has its own distinct format.
NPM has a special role here, since NPM packages can generally execute scripts after being installed, to do some post-install work, like downloading some big asset. This also has to happen inside the FOD.
Unfortunately, we can't just reuse the `buildNpmPackage` build helper for that part without extra effort, since it expects a `package-lock.json` file and we just have the `deno.lock` file, and conversion with transitive dependencies could prove difficult from what I have glimpsed so far.

What do you think about this?
Should I implement the custom fetcher or is it not worth it?
The way I see it, the maintenance burden is probably gonna be larger with the custom fetcher, since it will rely on more implementation details from Deno. And the upfront cost to implement the custom fetcher could be large.
Things done

- Built with sandboxing enabled in `nix.conf`? (See Nix manual)
  - `sandbox = relaxed`
  - `sandbox = true`
- Tested, as applicable: `nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"`. Note: all changes have to be committed, also see nixpkgs-review usage.
- Tested basic functionality of all binary files (usually in `./result/bin/`)