buildDenoPackage: init #419255


Draft: wants to merge 5 commits into master

Conversation

@aMOPel (Contributor) commented Jun 23, 2025

This is a reopen of the reverted PR #407434

EDIT:

This PR is ready for review.

You don't need to read the older PR comments below, as they were mainly about me understanding nixpkgs' requirements and emilazy explaining them to me.

You can build the tests locally with

nix build -f . --option allow-import-from-derivation false tests.build-deno-package

but mind that it will take a long time, since this PR adapts the pkgs.deno package, which means you have to fully rebuild it.

Compared to the previous PR, mainly the custom fetcher changed.

To understand what is going on in the fetcher, you can read this documentation I wrote about the features the fetcher needs to have:

https://github.com/NixOS/nixpkgs/blob/bd54b5e47f0a7858336b38fa30aaea7541841eb5/pkgs/build-support/deno/fetch-deno-deps/readme.md

And to understand the general architecture of the fetcher, you can read this documentation, which is also a collection of general problems that language build helpers need to solve, along with a proposal for a standardized architecture, which could be improved upon and added to the nixpkgs documentation:

https://github.com/NixOS/nixpkgs/blob/bd54b5e47f0a7858336b38fa30aaea7541841eb5/pkgs/build-support/deno/fetch-deno-deps/generic-fetcher-architecture-docs.md

EDITEND

Original text

Discussion

We need to discuss specifics.

@emanueljg
@06kellyjac
@emilazy
@hsjobeki

Previous Problem

@emilazy brought up concerns about the previous PR.

Namely, it's an issue that FODs are built using the Deno CLI as a dependency.
Since the Deno CLI considers the format of the dependency cache an implementation detail, bumping the Deno CLI version in nixpkgs could suddenly make it produce a FOD with a different hash. However, Nix caches FODs, so until somebody manually bumps the output hash of the FOD, the cached version is used, even if a rebuild would produce a different result. This can produce a mass breakage with a large delay, which can be very hard to debug.

The issue is fundamental to nix's FODs. The inputs (in this case the deno cli) are intentionally not checked when it comes to cache invalidation.

This is why build helpers in nixpkgs implement custom fetchers. This way, nixpkgs has full control over when the FOD will change. If the dependency cache format expected by the Deno CLI changed, we would still produce the same FOD;
however, the Deno CLI would tell us that the format is invalid and throw an error. Then the custom fetcher would have to be adapted.

Dependency cache format viewed as an interface

I see 3 parts to this interface:

  1. The lock file has a version, currently version: 5. So the lock file is "versioned".

  2. The format of the dependency directory tree that Deno expects is neither stable nor documented.

  3. On top of that, Deno injects many stateful files into that directory tree to cache work.

    3.1. Some files can simply be deleted and will be regenerated by the Deno CLI.
    3.2. Some files can't be deleted (registry.json and meta.json files), since Deno won't recognize the dependencies if they are missing.

Statement from Deno maintainers

I asked the maintainers in the Deno discord about the (in)stability of the interface.

That's correct, it's a moving target and an implementation detail, we don't give any stability guarantees around it, because we need to be able to change that installation structure if needed.
--bartlomieju

there's no documentation. It's somewhat stable but kind of unstable. I'd recommend looking into using the Rust crates like https://github.com/denoland/deno_cache_dir and https://crates.io/crates/deno_npm_cache
--dsherret

Breaking changes survey

I surveyed previous versions of Deno and compared the interface pairwise:

Versions:

Diffs:

  • v1.46.3_v2.1.1: point 1. changed, point 2. changed
  • v2.1.1_v2.2.4: no changes
  • v2.2.4_v2.2.12: no changes
  • v2.2.12_v2.3.6: point 1. changed, point 3.2 changed

None of the changes were monumental, so it should be feasible to keep a custom fetcher up to date.

Before v2.0.0, the deno cli worked quite differently. For example, deno install would not generate a lock file. This highlights an aspect I didn't think about before: when using the Deno CLI in the FOD, the CLI itself is also something that could experience breaking changes.

Building a custom fetcher

It would be possible to avoid building a custom fetcher if we made the deno version part of the denoDeps build's name. That way, when the deno cli version changes, the FOD has to be rebuilt and we don't run into the cache issue. However, we would have to rebuild the denoDeps FODs more often than necessary, since it appears that not every version bump breaks the interface.
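As a sketch of that alternative (all names here are hypothetical, not code from this PR), the version could be baked into the FOD name like this:

```nix
# Hypothetical sketch: putting the Deno version into the FOD name means
# any pkgs.deno bump yields a new store path, forcing a refetch instead
# of silently reusing a cached FOD built with an older CLI.
{ deno, stdenvNoCC }:
{ pname, denoLock, hash }:
stdenvNoCC.mkDerivation {
  name = "${pname}-deno-deps-${deno.version}";
  nativeBuildInputs = [ deno ];
  outputHashMode = "recursive";
  outputHashAlgo = "sha256";
  outputHash = hash; # has to be re-specified after every Deno bump
  buildCommand = ''
    # run the Deno CLI here to populate $out with the dependency cache
  '';
}
```

The cost, as noted above, is a mandatory hash update (and refetch) on every Deno bump, even when the cache format didn't actually change.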

Implementing the fetcher could prove difficult due to the complexity of Deno's package system. Deno supports multiple package sources.

  1. jsr: from the jsr registry
  2. npm: from the npm registry
  3. A URL, directly fetching from a url, e.g. github or a cdn.

Each of these sources has its own distinct format.
NPM has a special role here, since NPM packages can generally execute scripts after being installed, to do post-install work like downloading some big asset. This also has to happen inside the FOD.
Unfortunately, we can't just reuse the buildNpmPackage build helper for that part without extra effort, since it expects a package-lock.json file while we only have the deno.lock file, and conversion with transitive dependencies could prove difficult from what I have glimpsed so far.

What do you think about this?

Should I implement the custom fetcher or is it not worth it?

The way I see it, the maintenance burden is probably gonna be larger with the custom fetcher, since it will rely on more implementation details from Deno. And the upfront cost to implement the custom fetcher could be large.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • Nixpkgs 25.11 Release Notes (or backporting 24.11 and 25.05 Nixpkgs Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
  • NixOS 25.11 Release Notes (or backporting 24.11 and 25.05 NixOS Release notes)
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md, pkgs/README.md, maintainers/README.md and other contributing documentation in corresponding paths.

@github-actions github-actions bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux. labels Jun 23, 2025
@emilazy (Member) commented Jun 23, 2025

In general we want to do as little processing as possible in FOD derivations, so that we have little reason to change them. The fundamental thing only a FOD can do is download resources from the internet. It’s possible to adapt things to whatever layout a new version of Deno is expecting in the non‐FOD build, but it’s not possible to change existing FODs. It’s of course acceptable to change the format for new lock file versions, since there’s never any existing FODs using those.

So, for instance, fetchCargoVendor just downloads tarballs straight from crates.io without unpacking, and uses nix-prefetch-git for Git dependencies. Unpacking things and assembling the actual Cargo vendor directory is postponed to a later stage which isn’t a FOD and can be adapted as needed.

My suggestion would be to do the bare minimum of computing the downloads that need to be done and placing them in the output directory without any further processing unless absolutely necessary. I’m not sure exactly what the format looks like for JSR packages, but for NPM maybe we could adapt the existing fetcher codebase so that there’s a lower layer that can take the relevant information for a single package directly rather than requiring it to be in package-lock.json format?

As long as the downloads are unprocessed and placed in a naming scheme that we don’t expect to need to change (e.g. one consideration that has been required with Cargo is ensuring that we can support multiple versions of the same package name in a dependency tree), it should be pretty safe.
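As one possible illustration of such a naming scheme (hypothetical, not from this PR), hashing the source URL into the file name keeps multiple versions of the same package from colliding:

```nix
# Hypothetical helper: a stable, collision-free output path per download.
# Two versions of the same package have distinct URLs, so their hashes
# differ and both files can coexist in the FOD output.
url: "${builtins.hashString "sha256" url}-${baseNameOf url}"
# e.g. applied to two npm tarball URLs for left-pad 1.3.0 and 0.0.9,
# this yields two distinct file names in $out.
```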

@aMOPel (Contributor, author) commented Jun 24, 2025

Deno maintainer dsherret pointed me to some rust crates, which the deno cli uses, like https://github.com/denoland/deno_cache_dir and https://crates.io/crates/deno_npm_cache
So I thought it seems very reasonable to just write the fetcher in rust and use those crates, to get the logic straight from the source. Then I thought: since the deno cli is using those crates, I might as well look into deno's codebase and copy the relevant code for the fetcher. At that point I would just be pointlessly duplicating the deno cli, though.

Here comes the idea:
What if our custom fetcher is the deno cli, BUT not the version pinned in nixpkgs: a different (older) one, used just for the fetcher.
This way we have decoupled the deno versions, and the problem you were worried about, @emilazy, could not occur anymore.
When pkgs.deno breaks compatibility with the current FODs' dependency cache format, we have to update the hashes.
We could enforce having to update the hashes by making the deno fetcher version part of the FOD's name. Either way, there will be an error, since the old incompatible FODs were created with the old incompatible deno cli.

What do you think about that? Am I missing something?

@emilazy (Member) commented Jun 24, 2025

If we don’t want to break FODs, we would have to keep the fixed Deno version and all its dependencies building forever, likely multiple versions as we add more to support newer lock file formats, and run into the fact that the versions will inevitably become EOL and develop security vulnerabilities (that may even be relevant to FOD fetching – e.g. TLS issues). Breaking FODs loudly is an option, but a pretty frustrating one for downstream users depending on how often it happens.

I am not sure that deno_cache_dir would be necessary for a custom fetcher, though? As I said, we don’t have to match Deno’s format in the FOD output. We only need to download things. Processing it into the format expected by Deno can happen in a non‐FOD derivation without the same compatibility constraints.

@aMOPel (Contributor, author) commented Jun 24, 2025

we would have to keep the fixed Deno version and all its dependencies building forever, likely multiple versions as we add more to support newer lock file formats, and run into the fact that the versions will inevitably become EOL and develop security vulnerabilities (that may even be relevant to FOD fetching – e.g. TLS issues)

And that is not the case for a custom fetcher, because we can bump the dependencies of the custom fetcher (like the compiler) without changing the hashes of the FODs it produces?


Anyway, the perk of breaking hashes way less often, by decoupling the fetcher and transformer, convinced me of the solution.

Thank you for your patience, I needed a little bit of time to fully understand the implications.

I'm gonna write the custom fetcher with go, since I'm not really acquainted with rust yet.


I also looked into a way to have an equivalent to importNpmLock, but so far it looks like it's not possible, due to how jsr and deno work together. Will keep you updated.

Edit: should be possible

@aMOPel (Contributor, author) commented Jun 25, 2025

@emilazy

Is there a good reason to have the option to build the dependencies with a hash, or would it be sufficient to just have a fetcher that imports all hashes from the lockfile?

And another question: in the case of deno and jsr, it seems that individual files are downloaded and hashed, instead of whole tarballs. So if I implement an importLock functionality, it would create a lot of derivations, one for each file of a dependency. Is that a problem?

@aMOPel (Contributor, author) commented Jun 25, 2025

So my plan currently is to just build an importLock functionality, no custom fetcher using a hash.

And it works like this:

NOTE: since there are 3 kinds of dependencies with deno (npm, jsr and direct url), there are usually 3 cases to consider

  1. parse the lockfile into a common format (preprocessing step)
    1. get a deno.lock file as an argument.
    2. there is a parser (written in nix) for the lockfile that
      a. reads the version of the lockfile
      b. chooses the appropriate converter for the version
      c. converts the packages information from the lockfile into our custom common lockfile format
    3. there is an npm lockfile converter that converts the npm section of the common lockfile format into a format that npm accepts
  2. fetch deps (FOD step)
    1. there is a custom fetcher for jsr dependencies. it receives the common lockfile format as an argument and
      a. fetches the files (using nixpkgs fetchurl), where each file needs its own derivation, since that is what we have the hashes for
      b. the derivations are added to the lockfile attrset
    2. there is a custom fetcher for url dependencies. it receives the common lockfile format as an argument and
      a. fetches the files (using nixpkgs fetchurl), where each file needs its own derivation, since that is what we have the hashes for
      b. the derivations are added to the lockfile attrset
    3. the generated npm lock file is fed to the fetchNpmDeps fetcher and the resulting node_modules folder is our common npm deps folder structure
  3. convert folder structure and metadata (postprocessing step)
    1. there are 3 converters, they run inside a mkDerivation buildPhase
      a. there is a common jsr deps folder structure converter
      b. there is a common url deps folder structure converter
      c. there is a common npm deps folder structure converter
    2. they convert our common formats into the formats that the current deno version expects, and they also create any necessary metadata files etc.
      This could involve more complex logic using information from the lockfile, which might be difficult to do well in just shell script, so it should probably be done in a stronger language.
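A rough Nix sketch of how these three phases could fit together (every helper name here is hypothetical; this is not the code in this PR):

```nix
{ lib, stdenvNoCC, fetchurl, fetchNpmDeps }:
{ denoLock, npmHash }:
let
  # 1. preprocessing: parse deno.lock into the common format (pure Nix)
  lock = builtins.fromJSON (builtins.readFile denoLock);
  common = convertLock lock.version lock; # hypothetical version-aware converter

  # 2. FOD step: one fetchurl per jsr/url file (hashes come from the lock);
  #    npm is handled via the generated lock file and fetchNpmDeps
  jsrFiles = map (f: fetchurl { inherit (f) url hash; }) common.jsr;
  urlFiles = map (f: fetchurl { inherit (f) url hash; }) common.urls;
  npmDeps = fetchNpmDeps { src = common.npmLockDir; hash = npmHash; };
in
# 3. postprocessing: assemble the layout the current Deno version expects.
# This derivation is NOT fixed-output, so it can change freely with new
# Deno versions without invalidating any FOD hashes.
stdenvNoCC.mkDerivation {
  name = "deno-deps";
  buildCommand = ''
    mkdir -p $out
    # jsr, url and npm converters run here, using jsrFiles/urlFiles/npmDeps
  '';
}
```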

With preprocessing and postprocessing, the FODs won't change often, if ever, since they are decoupled from any breaking changes that deno could introduce.
If the lockfile changes, then we write a new preprocessing converter.
If the folder structure changes, then we write a new postprocessing converter.

Still it's possible that something in step 2 will have to be changed, so
the version of that fetcher should be part of the name of the FODs.


I am unsure about whether 2.3. is a good idea, since it introduces a dependency on the fetchNpmDeps helper.

You said

but for NPM maybe we could adapt the existing fetcher codebase so that there’s a lower layer that can take the relevant information for a single package directly rather than requiring it to be in package-lock.json format

But I'm not sold on that yet either.


There are multiple individual rust crates that contain the logic for the things we need to do

For deno.lock: https://crates.io/crates/deno_lockfile
For npm cache: https://crates.io/crates/deno_npm_installer https://crates.io/crates/deno_npm_cache

and some more.

From what I can tell they are not really doing what we need.

On paper it seems like a good idea to use them, but somehow they don't really fit into the architecture I laid out now.


Ideas? Objections?

@aMOPel (Contributor, author) commented Jun 26, 2025

Update:

implementing the jsr side of things like this went fine.

the url dependencies proved to be more convoluted than expected, since they require custom transformation logic per domain and I haven't fully understood the details yet.

@aMOPel (Contributor, author) commented Jun 30, 2025

Update:

I've hit a snag with the importLock implementation.

Deno requires to know about some response headers of fetched files.

While you can use the curl flag -D to get the headers, to access the headers, they need to be added to $out, at which point the hash of the FOD changes and the hash of the lockfile can't be used anymore. This defeats the whole point of the importLock implementation. After all it's supposed to build everything just using the hashes in the lockfile, no need to specify a hash in the nix derivation.

This seems to be unavoidable. It's imaginable to do this with a hack, by hard-coding those headers in certain situations, since they are only required in rare cases, but there would probably be problems with that.

So I think it's not possible to do importLock cleanly, which means I'm going to stick to the approach that requires an explicit hash in the nix derivation.

@aMOPel (Contributor, author) commented Jun 30, 2025

@emilazy while talking about internals with deno maintainers on their discord, a flatpak contributor told me they recently made a deno build helper for flatpak as well. I looked at the code and it got me thinking.

At flatpak they have a standardized interface for the whole custom fetcher problem.

I thought it would be cool if nixpkgs could also have a standardized interface. Surely it won't work in every case, but unifying most of the language build helpers could improve maintainability, and at the same time encourage doing it the right way (the way you told me to do it now): decoupling the FODs from any upstream changes.

Single FOD custom fetcher

A non-importLock custom fetcher could look something like this:

  1. The custom fetcher needs to parse the lock and transform it into our standard interface:
{
  hash = "";
  packages = {
    "<unique package or file name>" = {
      url = "<url>";
      path = null; # in step 2, the url is transformed into a unique path within the derivation, maybe by hashing the url
      curlOpts = ""; # per-package curl opts
      meta = { }; # attrset of arbitrary shape that is passed through
    };
    ...
  };
  curlOpts = ""; # global curl opts
  derivation = null; # filled in in step 2
}
  2. That data is given to a nixpkgs lib function which creates a FOD using curl to download the packages from the urls and put them where the paths point. All those paths are within $out.
  3. The custom fetcher then restructures the folder structure as necessary in a new derivation, using the FOD from step 2 and the paths and the metadata.

importLock custom fetcher

An importLock interface could look something like this:

  1. The build helper needs to parse the lock and transform it into our standard interface.
{
  packages = {
    "<unique package or file name>" = {
      hash = "<hash>";
      url = "<url>";
      derivation = null; # filled in in step 2
      fetchurlArgs = { }; # per-package args passed to fetchurl
      meta = { }; # attrset of arbitrary shape that is passed through
    };
    ...
  };
  fetchurlArgs = { }; # global args passed to fetchurl
}
  2. That data is given to a nixpkgs lib function which realizes all those FODs individually using fetchurl and fills in the derivations.
  3. The custom fetcher then assembles the desired folder structure in a new derivation, using all those derivations and the metadata.

Probably there could be some useful helper functions for problems that commonly occur.

What do you think about this?

@hsjobeki (Contributor) commented Jul 4, 2025

If we come up with a generic fetcher interface for npm and cdn/jsr dependencies, there is still the gap of the custom files and metadata both package managers expect. Did you manage to figure out some way of doing that? Because from my understanding that's equally complex.

As an idea, I imagine deno could have a cli command to pre-populate its cache with offline packages; then we'd run deno install or similar and it would produce its own structure as needed (in a non-FOD).

Since you seem to be quite active with the deno maintainers, it could be worth asking if they offer something like that or endorse the idea. I imagine flatpak could need something similar.

@aMOPel (Contributor, author) commented Jul 7, 2025

If we come up with a generic fetcher interface for npm and cdn/jsr dependencies there is still the gap of the custom files and metadata both package managers expect. Did you manage to figure out some way of doing that? Because from my understanding thats equally complex.

I think you misunderstood me there.

I proposed a generic interface for all custom fetchers in nixpkgs. But the only thing that is generalizable is the fetching itself.

So there would be 3 steps to creating a custom fetcher:

  1. write custom logic to transform a package lock file into the generalized format
  2. feed the data in the generalized format into the fetcher, which produces a generalized output format
  3. write custom logic to transform the file paths in the generalized output format into the file structure required by the language's package manager.

So this is somewhat similar to writing 2 adapters. The point of this would be to encourage a common architecture across custom fetchers, which is endorsed by the nixpkgs maintainers.

The format I have in mind has enough room for things like metadata. I iterated on it last week, and its use will become apparent in the code and documentation when I push it here this week.

I imagine deno could have a cli command to pre-populate its cache with offline packages, then run deno install or similar and it would produce its own structure as needed (In a non fod) as an idea.

I'm not sure I understand your idea. The problem we have to solve is that we have to do the fetching with our custom code and then we need an interface provided by deno where we can feed in our nix store paths with the fetched data and it will produce the directory structure it needs.

Last week I finally managed to do exactly that for the jsr and http packages, using deno's provided library. It took a while to find it and figure out how to use it, though.

@aMOPel aMOPel force-pushed the feat/buildDenoPackage-second branch from 8516e5b to 983c0a6 Compare July 10, 2025 10:59
@nixpkgs-ci nixpkgs-ci bot added 8.has: changelog This PR adds or changes release notes 6.topic: nodejs Node.js is a free, open-source, cross-platform JavaScript runtime environment 8.has: documentation This PR adds or changes documentation 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. and removed 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux. labels Jul 10, 2025
@hsjobeki (Contributor) commented:

  • write custom logic to transform a package lock file into the generalized format

  • feed the data in the generalized format into the fetcher, which produces a generalized output format

  • write custom logic to transform the file paths in the generalized output format into the file structure required by the language's package manager.

Yes, I assumed this.

need an interface provided by deno where we can feed in our nix store paths with the fetched data and it will produce the directory structure it needs.

That's what I meant with the original comment. How do you plan to solve this? My proposal was to try using deno itself, to avoid complexity; I'm unsure if deno has some interfaces to support it.

@@ -204,7 +204,7 @@ rustPlatform.buildRustPackage (finalAttrs: {

 postInstall = ''
 # Remove non-essential binaries like denort and test_server
-find $out/bin/* -not -name "deno" -delete
+find $out/bin/* -not -name "deno" -not -name "denort" -delete
@aMOPel (Contributor, author) commented Jul 10, 2025

Deno already builds the denort binary but it's removed. The correct solution for us would probably have been to add an extra output & keep denort.

@06kellyjac is this what you meant in the previous PR?

@nixpkgs-ci nixpkgs-ci bot added 10.rebuild-linux: 11-100 This PR causes between 11 and 100 packages to rebuild on Linux. and removed 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. labels Jul 10, 2025
@aMOPel (Contributor, author) commented Jul 11, 2025

@hsjobeki

That's what I meant with the original comment. How do you plan to solve this? My proposal was to try using deno itself, to avoid complexity; I'm unsure if deno has some interfaces to support it.

As I said in my comment before:

Last week I finally managed to do exactly that for the jsr and http packages, using deno's provided library. It took a while to find it and figure out how to use it, though.

This interface only works for https and jsr imports, though. Yesterday @rorosen helped me to write a small rust wrapper around that interface, which you can look at here: https://github.com/NixOS/nixpkgs/pull/419255/files#diff-657a96d5ebec4ff9ca7cf304c40ec07d4eb6f05b12b910577aa4719377f14e10

I had a wrapper written in deno before, but it proved difficult to package that wrapper cleanly with nix without this very PR (a chicken-and-egg problem).

Currently, I handle npm imports manually. Which means there is also only limited support.

I outlined all the features that I am aware of, both missing and implemented, in the readme I wrote for the fetcher:

https://github.com/aMOPel/nixpkgs/blob/feat/buildDenoPackage-second/pkgs/build-support/deno/fetch-deno-deps/readme.md

@emilazy (Member) commented Jul 11, 2025

Hi, I haven’t had time to read the whole thread (sorry), but just to belatedly reply to

So my plan currently is to just build an importLock functionality, no custom fetcher using a hash.

We unfortunately can’t import locks from fetched sources because that would be IFD (import‐from‐derivation – a misleading term for “evaluation depending on the contents of derivation outputs”), and in general people aren’t too happy about having to vendor lock files in Nixpkgs to work around this (#327064). So I’m not sure what the current state of things is, but I would suggest prioritizing a hash‐based fetcher, as it’s likely what we’d want to use for most packages that have lock files checked in upstream.
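To make the IFD constraint concrete (a sketch, not code from this PR; the `buildDenoPackage` argument names are hypothetical):

```nix
# Import-from-derivation: evaluation depends on a derivation's *output*.
# With --option allow-import-from-derivation false this fails to evaluate,
# because Nix would have to build `src` just to finish evaluating.
let
  src = fetchFromGitHub {
    owner = "...";
    repo = "...";
    rev = "...";
    hash = "...";
  };
  # Reading deno.lock out of `src` during evaluation is IFD:
  lock = builtins.fromJSON (builtins.readFile "${src}/deno.lock");
in
buildDenoPackage { inherit src; denoLock = lock; }

# Reading a lock file checked into nixpkgs itself is fine, since no
# derivation needs to be built during evaluation:
#   lock = builtins.fromJSON (builtins.readFile ./deno.lock);
```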

@aMOPel (Contributor, author) commented Jul 11, 2025

@emilazy thanks for checking in

We unfortunately can’t import locks from fetched sources because that would be IFD (import‐from‐derivation – a misleading term for “evaluation depending on the contents of derivation outputs”), and in general people aren’t too happy about having to vendor lock files in Nixpkgs to work around this (#327064). So I’m not sure what the current state of things is, but I would suggest prioritizing a hash‐based fetcher, as it’s likely what we’d want to use for most packages that have lock files checked in upstream.

Hm. I'm not sure I understand. What are buildRustPackage and buildNpmPackage doing then, if not importing locks from fetched sources?

“evaluation depending on the contents of derivation outputs”

Does this not inevitably happen? I need to parse the lockfile at least for the list of packages. "Depending" on that, I construct FODs. Why does it matter whether I use the hashes from the lockfile or not?

@aMOPel (Contributor, author) commented Jul 11, 2025

Update on the progress:

The build for the fresh-init-cli works now, which is certainly cool, since it is a large project with many dependencies.

Currently binary builds for packages with imports from esm.sh don't work. I'm planning to fix that.

When building the fresh-init-cli, I noticed that the way this builder currently works is extremely slow, since it creates separate derivations per file of a jsr dependency. This means a lot of disk IO for many imports of large jsr packages.
I chose this method until now since it allows completely inferring the hashes from the deno.lock file for jsr packages. It also makes the caching extremely granular, which results in less refetching. But the price might be too steep. Also, the way emilazy makes it sound, importing the hashes from the lockfile does not seem to be a viable solution anyway.

aMOPel added 4 commits August 4, 2025 16:06
Co-authored-by: rorosen
It's used by the denoBuildPackage build helper to compile deno projects
into a binary.
@aMOPel aMOPel force-pushed the feat/buildDenoPackage-second branch from 8fd2775 to bd54b5e Compare August 4, 2025 14:07
@aMOPel (Contributor, author) commented Aug 4, 2025

@hsjobeki
@philiptaron
@winterqt
@GetPsyched
@fricklerhandwerk
@ofalvai
@06kellyjac
@emilazy

This is a reopen of the reverted PR #407434

This PR is ready for review.

You don't need to read the PR comments above, as they were mainly about me understanding nixpkgs' requirements and emilazy explaining them to me.

You can build the tests locally with

nix build -f . --option allow-import-from-derivation false tests.build-deno-package

but mind that it will take a long time, since this PR adapts the pkgs.deno package, which means you have to fully rebuild it.

Compared to the previous PR, mainly the custom fetcher changed.

To understand what is going on in the fetcher, you can read this documentation I wrote about the features the fetcher needs to have:

https://github.com/NixOS/nixpkgs/blob/bd54b5e47f0a7858336b38fa30aaea7541841eb5/pkgs/build-support/deno/fetch-deno-deps/readme.md

And to understand the general architecture of the fetcher, you can read this documentation, which is also a collection of general problems that language build helpers need to solve, along with a proposal for a standardized architecture, which could be improved upon and added to the nixpkgs documentation:

https://github.com/NixOS/nixpkgs/blob/bd54b5e47f0a7858336b38fa30aaea7541841eb5/pkgs/build-support/deno/fetch-deno-deps/generic-fetcher-architecture-docs.md

@yacinehmito (Contributor) commented:

@aMOPel Were issues raised with Deno about the missing info in the lockfile?
(e.g. TypeScript declaration files provided in comments or in headers from esm.sh)

@aMOPel (Contributor, author) commented Aug 5, 2025

@aMOPel Were issues raised with Deno about the missing info in the lockfile? (e.g. TypeScript declaration files provided in comments or in headers from esm.sh)

I made this issue. Maybe I should have made it in the main deno repo.

If you're referring to how this is handled in this build helper, then the fact is that there is no "import from lock" functionality implemented, among other reasons because of the missing files in the lockfile.

So a nix build with this build helper will always need a hash specified in nix.

@aorith
Copy link
Member

aorith commented Aug 9, 2025

First of all, thank you for your work @aMOPel
I was trying to build silverbullet on this commit silverbulletmd/silverbullet@d77f028 using buildDenoPackage from this PR, but it seems that it's not fetching all the dependencies:

 > error: Import 'https://deno.land/x/fuse@v6.4.1/dist/fuse.d.ts' failed

Running a deno install locally produces the following:

$ deno install --vendor --entrypoint ./silverbullet.ts
# . . .
$ ls -lrt vendor/deno.land/x/fuse@v6.4.1/dist/
total 24
-rw-r--r-- 1 aorith aorith 15426 ago  9 07:59 fuse.esm.min.js
-rw-r--r-- 1 aorith aorith  6894 ago  9 07:59 fuse.d.ts

Whereas if I add the same debug ls on deno-build-hook.sh the content is:

       > total 16
       > -rw-r--r-- 1 nixbld nixbld 15426 Aug  9 06:04 fuse.esm.min.js

@aMOPel
Copy link
Contributor Author

aMOPel commented Aug 9, 2025

@aorith thank you for your testing. I am going to fix that on Monday

@aMOPel
Copy link
Contributor Author

aMOPel commented Aug 11, 2025

@aorith this is indeed an edge case I have not handled in my fetcher so far. I asked in the Deno Discord what the logic behind it is, since the .d.ts file is neither referenced in an x-typescript-types response header nor imported in the fuse.esm.min.js file.

@aMOPel
Copy link
Contributor Author

aMOPel commented Aug 11, 2025

It is imported in a @deno-types declaration (now called @ts-types)

web/fuse_search.ts
1:// @deno-types="https://deno.land/x/fuse@v6.4.1/dist/fuse.d.ts"

This is very frustrating to add as a feature, because these types are only referenced in the source files, not in the lockfile, but the fetcher already needs to know about them.

I could pass the source files along with the lockfile to the fetcher, but this creates problems with updating the hash.

Right now there is a check that tells you that the lock file has changed and you need to update the hash of the denoDeps, which will then result in a refetch.

Doing the same thing for the whole source code would mean refetching on every code change while developing.

Instead, what I need is another derivation before the fetching step that takes the source code as input and returns a JSON file with all the @ts-types URLs in the source code (found with a simple grep). This JSON file is passed to the fetcher, and the URLs in it are fetched as well. The JSON file can then be checked on every build, and if it changes, you need to update the deps hash.

As far as I understand, this would not create an IFD, so this should be fine.
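The extraction step described above could be sketched roughly like this. This is a hypothetical illustration with made-up file names, emitting a plain URL list rather than JSON for brevity; it is not the PR's actual script:

```shell
# Set up a tiny source tree with one @deno-types comment (illustrative only)
mkdir -p src
cat > src/fuse_search.ts <<'EOF'
// @deno-types="https://deno.land/x/fuse@v6.4.1/dist/fuse.d.ts"
import Fuse from "https://deno.land/x/fuse@v6.4.1/dist/fuse.esm.min.js";
EOF

# -r: recurse over the tree, -h: suppress filenames, -o: print only the match
grep -rho -e '@deno-types="[^"]*"' -e '@ts-types="[^"]*"' src \
  | sed 's/.*="\(.*\)"/\1/' \
  | sort -u > ts-types-urls.txt

cat ts-types-urls.txt
```

Because the output only contains the referenced URLs, it changes when a type-reference comment changes, not on every source edit, which is what makes the hash-check workable.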

@aMOPel
Copy link
Contributor Author

aMOPel commented Aug 11, 2025

@aorith

I added the feature now.

with this:

  silver-bullet = buildDenoPackage {
    pname = "silver-bullet";
    version = "";
    denoDepsHash = "sha256-o/maeRY67ddiuErolu3Qq13ZX1s+HQdv2LrQANz0VEY=";
    src = fetchFromGitHub {
      owner = "silverbulletmd";
      repo = "silverbullet";
      rev = "d77f0286d8eb43c241aba8597d29d7e09d51d7b3";
      hash = "sha256-h0OEp6oDSqo2pFGlth7piR521o+K/Jl5Ap65qpHBYfI=";
    };
    binaryEntrypointPath = "silverbullet.ts";
    targetSystem = "x86_64-linux";
  };

The build continues past the type check, which your error came from. The silverbullet build also needs to do some other pre-build steps; those are out of scope for me, though.

Mind that you could also have achieved that by just adding denoCompileFlags = ["--no-check"].

Please tell me if the denoDepsHash should be different for you, since that would be a bug.

@yacinehmito I guess this is interesting for you too, since this now implements fetching the @ts-types files that would not appear in lockfiles (same for <reference types=... and deno.json compilerOptions.types). I am going to make an issue for that at the deno repo.

EDIT: issue

@aorith
Copy link
Member

aorith commented Aug 11, 2025

I get the same denoDepsHash as you.

I did a quick test and I'm not sure if this is required for a proper build without the type checking error, but after adding the following to denoCompileFlags:

    "--include=./build_plugs.ts"
    "--include=./build_bundle.ts"
    "--include=./build_web.ts"
    "--include=./silverbullet.ts"

It complains again about a dependency:

error: Import 'https://deno.land/x/esbuild@v0.25.1/mod.d.ts' failed.

Which indeed is not present:

-rw-r--r-- 1 nixbld nixbld 74643 Aug 11 17:58 mod.js
$ deno install --vendor --entrypoint ./silverbullet.ts ./build_web.ts ./build_bundle.ts ./build_plugs.ts
# . . .

$ rg mod.d
vendor/deno.land/x/esbuild@v0.25.1/mod.js
1:/// <reference types="./mod.d.ts" />

(The build requires git on nativeBuildInputs)

@aMOPel
Copy link
Contributor Author

aMOPel commented Aug 11, 2025

Thanks for the quick feedback.

That is of course another edge case 😄 there seems to be no end. I'm gonna look into it.

@aMOPel
Copy link
Contributor Author

aMOPel commented Aug 14, 2025

@aorith is there a very good reason why you wouldn't use "--no-check"?

IMO the type checking does not have to happen in the nix-build. The user can do it on their machine if they so desire.

I asked the deno maintainers, and they don't plan to add the type files to the lockfile.

There are many different complicated edge cases associated with this and the required implementation is still error prone sometimes.

Just dropping support for type checking in this build helper would greatly simplify the fetching logic and reduce the maintenance burden.

@philiptaron philiptaron changed the title Feat/build deno package (second attempt) buildDenoPackage: init Aug 14, 2025
@aorith
Copy link
Member

aorith commented Aug 14, 2025

@aMOPel No reason at all! I actually thought that it was something that you would like to add to the builder. I still cannot build the package, even with --no-check, but that's probably a mix of my limited knowledge and the project's somewhat unfriendly structure towards nix.

@wanderer
Copy link
Contributor

we are using the builder successfully here

@aMOPel
Copy link
Contributor Author

aMOPel commented Aug 15, 2025

@wanderer that is great to hear! Thanks for your testing.

@hsjobeki
Copy link
Contributor

hsjobeki commented Aug 16, 2025

Since this PR is already 5k lines of changes, we should discuss a review and merge strategy.

I'd prefer giving reviewers smaller batches, if possible. Since the PR description says "this is in ready to merge state", could you give another simple overview of the final architecture? Maybe we decide to split this into sub-PRs so it's easier for us to review.

@aMOPel
Copy link
Contributor Author

aMOPel commented Aug 16, 2025

Since the PR descriptions says "this is in ready to merge state"

It says "ready for review", not merge.

In fact I plan to take out a little bit of code in the fetcher again, as I mentioned in a previous comment, probably next week.

could you give another simple overview of the final architecture

The two markdown files that I linked in the description try to do that. One explains the architecture and its reasoning. One explains the features that the fetcher has and doesn't have, the technical details of the deno dependency system.

I can try to rewrite them if they are not up to your standard.

maybe we decide to split this into sub-prs so its easier for us to review?

I don't see how, but I'm open to being surprised.

You can split out the documentation, sure.

But it doesn't really make sense to split the code into pieces, since the build helper simply needs all the pieces to function. Also then you would need to write new tests for the pieces individually. Currently there are only e2e tests.

@hsjobeki
Copy link
Contributor

hsjobeki commented Aug 17, 2025

First of all, thank you for the huge amount of work you’ve put into this PR — it’s clear you’ve invested a lot of thought and effort. What I’m saying below comes from a place of support for this PR.

That said, a PR of this size is very challenging for us to review effectively. Even with its good documentation, 5k lines of code in one go is almost unreviewable, and it increases the risk of bugs slipping in unnoticed. For us to maintain quality, we really need changes to be structured in smaller chunks.

You mentioned that the build helper needs all the pieces to function and has only e2e tests — but from our perspective, this suggests the code could benefit from more modularity. Smaller, well-defined units with corresponding unit tests would not only make it easier for us to review but also make the system more maintainable in the long term.

One possible path would be:

  • Extract the documentation as its own PR (as you suggested).
  • Break the implementation into logical modules that can be tested and merged independently. (seems the architecture was planned in a modular manner after all)
  • Start with the core building block(s), with unit tests, and then layer additional functionality in follow-up PRs.

If we can break this into smaller pieces, it’ll make it much easier for the community to give it the review it deserves and get it merged faster.

@hsjobeki
Copy link
Contributor

hsjobeki commented Aug 17, 2025

Looking at the architecture graph i have a few immediate questions:

  • Does this PR introduce two different lockfile transformers and two different fetchers?

  • Why is the architecture document not part of this PR? Including it here would make it much easier for us (and future maintainers) to understand and maintain the builder, rather than having the plan only in a separate commit on your fork.

  • I find the build/eval-time line a bit confusing — though this may just be a matter of taste. Could you clarify how you see that boundary?

  • What do you mean by “Nix realm”?

  • What is the "common lockfile"? Is that a custom format just for this PR? Sounds like the transformer(s) could have a spec and unit tests.

  • Some terms would benefit from clarity. E.g. dependency cache -> is this the deno cache?

  • You could make all terms specific to your problem, e.g. language-package-manager -> deno. Finding a generic abstraction for all problems of the same kind is not necessarily a concern for this PR.

Also, since we’re actively discussing the design doc, it feels like it really should be added to this PR so the implementation and its rationale live together. EDIT: just realized the file was there; I just happened to scroll over it in the long list.

@aMOPel
Copy link
Contributor Author

aMOPel commented Aug 17, 2025

5k lines

I believe 1.2k of that is lockfiles. One Rust crate has a very big lockfile, and the other lockfiles are from the various e2e tests.

Extract the documentation as its own PR (as you suggested).

Can do. I don't see how it makes the review easier, since the amount of code doesn't change, but can do.

Break the implementation into logical modules that can be tested and merged independently. (seems the architecture was planned in a modular manner after all)

Yes. That would make the review easier. Unit tests would be good in a couple of places.

How many separate PRs were you thinking of?

Too much splitting would probably create a bunch of additional effort. I would like to find a good balance.

I would like to split the builder and the fetcher, and build tests for the multiple scripts, i.e. the "lockfile transformer", the "fetcher", and the two "file transformers". The tests would treat each of those as a black box and just look at input and output. That way we also get a sort of spec for the formats.

Then I would create the fetcher PR first, then the builder PR and lastly the docs PR.

Does that sound acceptable?

Does this PR introduce two different lockfile transformers and two different fetchers?

No, I tried to implement an "import from lockfile" functionality, but ultimately it was infeasible.

Why is the architecture document not part of this PR?

It is. The links I posted lead to the markdown files from the very branch this pr stems from.

I just used permalinks so they wouldn't break. That doesn't mean the files are not on this branch anymore.

I find the build/eval-time line a bit confusing — though this may just be a matter of taste. Could you clarify how you see that boundary?

What do you mean by “Nix realm”?

I talk about this differentiation because the build helper cannot do any IFD. So I try to visualize which steps happen during the evaluation time of Nix code (the Nix realm) vs. what happens inside a Nix build container.

What is common lockfile is that a custom format just for this pr

I believe I did not do a good job making that clear in the markdown files.

Yes, it is a custom format. It stemmed from my effort to generalize build helper architecture. One point of it is that we need to decouple the format the fetcher consumes from Deno's lockfile format, since the latter is subject to change and we want to break the fetcher as few times as possible, to avoid refetching.
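As a purely hypothetical illustration of that decoupling (these field names are invented for this sketch, not taken from the PR), such a stable intermediate format could look like:

```json
{
  "formatVersion": 1,
  "dependencies": [
    {
      "url": "https://deno.land/x/fuse@v6.4.1/dist/fuse.esm.min.js",
      "integrity": "sha256-...",
      "kind": "remote"
    }
  ]
}
```

A lockfile transformer would then map whatever Deno's current lockfile version looks like onto this shape, so a lockfile format bump only touches the transformer, not the FOD fetcher.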

Some terms would benefit from clarity. I.e. dependency cache -> is this the deno cache

Fair.

Yes, I believe so.

You could make all term specific to your problem, language-package-manager -> deno. finding a generic abstraction for all problems of the same kind is not necessarily a concern for this pr.

Yes. I believe what is missing in the documentation is how this particular build helper's architecture relates to the general one. I can add that.

@hsjobeki
Copy link
Contributor

hsjobeki commented Aug 18, 2025

Too much splitting would probably create a bunch additional effort. I would like to find a good balance.

Yes, that's right. I am aware that it creates additional effort on your side. But ultimately reviewability has to win if we don't want this PR to sit around forever.

Then I would create the fetcher PR first, then the builder PR and lastly the docs PR.

Yes, agreed - I also think the right order would be:

  1. fetcher PR, but include the architecture design document for it, in order for people to see how it all comes together.
    This is probably the most important one and should get extra care, since it introduces a new FOD fetcher.
  2. builder PR
  3. docs PR

In every PR include a reference to this PR and vice versa.

No, i tried to implement an "import from lockfile" functionality, but ultimately it was infeasible.

I think the design documents need to be updated to strictly show what's part of the PR, to avoid any possible confusion.
It's good to have a plan for the future, but I would leave that out, or make it a separate picture.

@wanderer
Copy link
Contributor

we are getting a build error now in https://github.com/masslbs/safewallet-to-beancount

      > Running phase: unpackPhase
       > Running phase: patchPhase
       > Running phase: updateAutotoolsGnuConfigScriptsPhase
       > Running phase: configurePhase
       > no configure script, doing nothing
       > Running phase: buildPhase
       > error: Uncaught (in promise) Error: Invalid name@version format (version is required) in: "string" in string "string_decoder@1.3.0"
       >     throw new Error(`Invalid name@version format (version is required) in: "${split[0]}" in string "${fullString}"`);
       >           ^

is the underscore in string_decoder causing a problem?
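For context, a parsing sketch that survives underscores and scoped package names by splitting on the last `@` might look like this. This is a hypothetical illustration (function name and error text are made up), not the fetcher's actual code:

```typescript
// Splitting naively on "@" breaks for scoped packages like
// "@scope/pkg@1.0.0"; splitting on anything other than the last "@"
// can also mangle names like "string_decoder@1.3.0".
function parseNameVersion(fullString: string): { name: string; version: string } {
  const at = fullString.lastIndexOf("@");
  // at <= 0 covers both a missing "@" and a bare scoped name like "@scope/pkg";
  // at === length - 1 covers a trailing "@" with no version.
  if (at <= 0 || at === fullString.length - 1) {
    throw new Error(
      `Invalid name@version format (version is required) in: "${fullString}"`,
    );
  }
  return { name: fullString.slice(0, at), version: fullString.slice(at + 1) };
}

console.log(parseNameVersion("string_decoder@1.3.0"));
console.log(parseNameVersion("@std/path@1.0.8"));
```

With this approach an underscore in the name is harmless, since only the position of the final `@` matters.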

@aMOPel
Copy link
Contributor Author

aMOPel commented Aug 20, 2025

Oh wow. Apparently I messed up the template string of the error message.

You can try going one commit back. I think the last commit might be the culprit.

In fact I am going to put this PR back to draft, since I have multiple plans for changes, as mentioned in the previous comments.

@aMOPel aMOPel marked this pull request as draft August 20, 2025 15:28
Labels
  • 2.status: merge conflict (this PR has merge conflicts with the target branch)
  • 6.topic: nodejs
  • 8.has: changelog (this PR adds or changes release notes)
  • 8.has: documentation (this PR adds or changes documentation)
  • 10.rebuild-darwin: 1-10
  • 10.rebuild-linux: 11-100