Add support for defining builtin host functions at compile-time #43

Open
wants to merge 3 commits into
base: main
from

Conversation

fitzgen
Member

@fitzgen fitzgen commented May 28, 2025

Add support for defining builtin host functions at compile-time. Because these functions are early-bound at compile-time -- rather than late-bound at instantiation-time, like regular imports -- the Wasm's compilation can be specialized for these exact imports, enabling inlining without just-in-time compilation, for example. This is the rough equivalent of the js-string-builtins proposal but for Wasmtime's API and Wasmtime embedding environments rather than JavaScript's WebAssembly API and JavaScript execution environments.

Rendered

@cfallin
Member

cfallin commented May 29, 2025

Thanks for writing this up -- I think this will be a really powerful ability once we have it, and will be extremely important for certain kinds of applications!

When I was originally mulling over this design space for the zero-copy buffer use-case, I had been imagining something like a raw CLIF interface, but I agree that that's got a lot of downsides and is pretty much fully subsumed by the other options. I'm happy to see we're moving toward the "just define the logic in Wasm" idea (and also not the "special sublanguage" idea, though I liked that when we talked about it too) -- this cleans up a lot of duplication.

I think my input here comes in two major lines:

  1. I suspect that the ability to make slow-path calls is going to be essential to many of the real use-cases we have imagined for this feature (certainly for zero-copy buffers: they have the "append and maybe grow" behavior that Vec does, so a mutable API will need this, and even read-only accessors will have more complex paths for e.g. moving to the next rope segment that may or may not be practical to write inline). So I think this

    Should we allow self-hosted Wasm to import and call non-intrinsic functions?

    might not be punt-able as

    I think this is something we will want eventually, but I'd like to get something basic working first before tackling this more-complicated use case.

    if we want to use the thing in our embedding.

  2. I find myself going back and forth on whether the compile-time-builtin Wasm module should be "special" (with constraints as this RFC describes) or more general and which is actually more in the spirit of the standards -- I'll address this one first as I think it addresses the first question if we resolve it another way.
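To make the first point above concrete, here is a minimal sketch of the "append and maybe grow" shape in plain Rust. The `Buffer` type and its methods are hypothetical and only illustrate the split: a small fast path that is cheap to inline into callers, plus a rarely taken slow path that stays out of line. In the RFC's setting the slow path would presumably be an imported host function rather than a local method.

```rust
/// Hypothetical growable byte buffer, standing in for a host-owned buffer.
pub struct Buffer {
    data: Vec<u8>,
    len: usize,
    slow_path_calls: usize,
}

impl Buffer {
    pub fn with_capacity(cap: usize) -> Self {
        Buffer { data: vec![0; cap], len: 0, slow_path_calls: 0 }
    }

    /// Fast path: a capacity check and a store, small enough to inline.
    #[inline]
    pub fn push(&mut self, byte: u8) {
        if self.len == self.data.len() {
            // Rare case: hand off to the out-of-line slow path.
            self.grow();
        }
        self.data[self.len] = byte;
        self.len += 1;
    }

    /// Slow path: deliberately kept out of line so callers stay small.
    #[cold]
    #[inline(never)]
    fn grow(&mut self) {
        let new_cap = (self.data.len() * 2).max(4);
        self.data.resize(new_cap, 0);
        self.slow_path_calls += 1;
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.data[..self.len]
    }

    pub fn slow_path_calls(&self) -> usize {
        self.slow_path_calls
    }
}

fn main() {
    let mut buf = Buffer::with_capacity(4);
    for b in 0..10u8 {
        buf.push(b);
    }
    println!("{:?}, slow-path calls: {}", buf.as_slice(), buf.slow_path_calls());
}
```

Only `push` needs to be visible to an inliner; `grow` can remain an ordinary call, which is the part the quoted open question would have to allow.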

Preliminarily: yes 100% to

We must not deviate from the WebAssembly language semantics.

and also to the more subjective "let's not encourage people to use a nonstandard extension" (and to clarify to any readers: yes that concern is still in scope even if we are literally standards-compliant by presenting function imports, because a sufficiently tempting set of function imports can become a de-facto standard).

However I find myself wondering whether we shouldn't do something a little more general and provide

  1. The load/store imports this RFC defines, and
  2. The ability to fix a module import (of any general Wasm module) while compiling another Wasm module, and implement some (simple one-level?) inlining from that module, with none of the other restrictions this RFC names.

I'll call this the "privileged adapter module" approach.


On the "negative motivation" side first (against special Wasm subset for "self-hosted Wasm"): defining restrictions on which subset of Wasm modules is acceptable for a compile-time-builtin module could be seen as restricting/subsetting the standard somewhat. I can absolutely see the logic for it (as the RFC says, discouraging general use, but also in particular: it's simpler if we don't have a VMContext for the special module that is separate from the module that is using it) but isn't it also defining a "Wasm-prime" in the other direction?

On the "positive side" now (for fully general Wasm for "self-hosted Wasm"): the spirit of virtualization and precedent elsewhere (e.g., WASI virtualization, and also places that we talk about adapter modules, such as the debug RFC) suggests to me at least that there isn't too much danger in defining a more privileged interface (here: "load/store host memory") and then allowing any general Wasm module to be an adapter module that provides a higher-level "safe" service on top of it. (In fact we know of folks doing this in production with components.) It is still standard Wasm -- it just has a particular API available to it, and one that the embedder must enable for a specific module. If someone wanted to implement this "peek/poke API" today as ordinary host functions, they could; we are saying that we recognize the need for it because we want to move more logic into Wasm for inlinability. Philosophically, I think the notion that we can never have privileged interfaces imported into a Wasm module (because someone might write Wasm that always requires the privileged interfaces and misuse the specialized environment as a general environment) sits less well with me: it says that Wasm is somehow not universal, and can't be used to implement some parts of the system.

Said another way: "root privilege" (arbitrary load/store) then virtualization is more or less directly aligned, I think, with how capability systems are supposed to work. The idea is that the danger is in plumbing the wrong capabilities to the wrong modules, and safety requires us to put the right access-filtering or -subsetting modules inline with powerful capabilities. But this is already true, and we already trust our embedders to "wire things up right" because one can write arbitrary hostcall implementations or grant the wrong pre-opens or whatnot. Right now in Wasmtime I think we haven't seen this situation much because we have host-native filtering of most of the privileges we grant (e.g. WASI APIs) but I think there's nothing fundamental about that.
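As a rough illustration of "root privilege, then virtualization", the sketch below models the privileged load/store capability as a trait and the adapter as a wrapper that only exposes a bounds-checked window onto it. All names here (`RawMemory`, `HostMemory`, `Window`, `OutOfBounds`) are invented for the example and are not part of the RFC or of Wasmtime's API.

```rust
/// Hypothetical privileged capability: raw access to "host memory".
pub trait RawMemory {
    fn load_u8(&self, addr: usize) -> u8;
    fn store_u8(&mut self, addr: usize, value: u8);
}

/// Toy stand-in for host memory.
pub struct HostMemory {
    bytes: Vec<u8>,
}

impl HostMemory {
    pub fn new(size: usize) -> Self {
        HostMemory { bytes: vec![0; size] }
    }
}

impl RawMemory for HostMemory {
    fn load_u8(&self, addr: usize) -> u8 {
        self.bytes[addr]
    }
    fn store_u8(&mut self, addr: usize, value: u8) {
        self.bytes[addr] = value;
    }
}

#[derive(Debug, PartialEq)]
pub struct OutOfBounds;

/// The "adapter": it holds the powerful capability but hands its users
/// only index-based access to the range [base, base + len).
pub struct Window<M: RawMemory> {
    mem: M,
    base: usize,
    len: usize,
}

impl<M: RawMemory> Window<M> {
    pub fn new(mem: M, base: usize, len: usize) -> Self {
        Window { mem, base, len }
    }

    pub fn get(&self, index: usize) -> Option<u8> {
        (index < self.len).then(|| self.mem.load_u8(self.base + index))
    }

    pub fn set(&mut self, index: usize, value: u8) -> Result<(), OutOfBounds> {
        if index >= self.len {
            return Err(OutOfBounds);
        }
        self.mem.store_u8(self.base + index, value);
        Ok(())
    }
}

fn main() {
    let mut window = Window::new(HostMemory::new(16), 4, 4);
    println!("{:?}", window.set(0, 7));
    println!("{:?}", window.set(4, 9)); // outside the window
    println!("{:?}", window.get(0));
}
```

The main module would only ever see something like `Window`; whether the raw capability reaches the wrong module is a wiring decision made by the embedder, which is the point made above.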


And finally, if we restrict ourselves to (i) these load/store intrinsics, provided when configured to a privileged adapter module, and (ii) early binding of this adapter module in a way that enables inlining, then all of the open design questions are addressed, as far as I can tell:

  • It addresses the "should we allow self-hosted Wasm to {call non-intrinsic functions, have tables for indirect calls, have memories, ...}" questions definitively: yes, it's Wasm, so it has function calls, tables, memories, etc. (As an optimization, if it has none of these, perhaps it doesn't need its own VMContext?)

  • It addresses the question of how to encapsulate "slow-path functions" and which functions are available: they can be ordinary imports to the adapter module, and not provided to the main module.

  • It potentially addresses the Winch question: Winch can compile the adapter module normally (no inlining needed); we only need to implement the intrinsics or provide polyfills as imported hostcalls.


Anyway -- all strong opinions, relatively weakly held -- happy to discuss further!

@abrown
Member

abrown commented Jun 5, 2025

I don't have a strong opinion on whether we need two-level inlining (intrinsics -> self-hosted -> component) or just a single level (intrinsics -> component) — @cfallin is already saying some of the things I was thinking. But I do want to point out how much inlining we're doing and make a plug for making that easier. In either case, we want to inline some CLIF instructions for each intrinsic, right? I'm not sure I caught where the intrinsics were to be specified (are they proposed additions to the component model?) but, in any case, I was imagining there would be some compiler code that converted a call to these special imports into some CLIF, like we currently do for CM built-ins and trampolines (?). And this is what I was hoping could become easier: I know you were kind of discarding the first idea, "exposing CLIF", but it seems helpful for this kind of problem: we tell the compiler "here are the CLIF instructions for calls to this import". I understood from your RFC the danger of misuse, so perhaps it should not be a public, embedder-accessible API, but just having an easy way to inline intrinsics could make it easier to pursue the self-hosted functions?

@fitzgen
Member Author

fitzgen commented Jun 13, 2025

@abrown

I don't have a strong opinion on whether we need two-level inlining (intrinsics -> self-hosted -> component) or just a single level (intrinsics -> component) — @cfallin is already saying some of the things I was thinking. But I do want to point out how much inlining we're doing and make a plug for making that easier. In either case, we want to inline some CLIF instructions for each intrinsic, right? I'm not sure I caught where the intrinsics were to be specified (are they proposed additions to the component model?) but, in any case, I was imagining there would be some compiler code that converted a call to these special imports into some CLIF, like we currently do for CM built-ins and trampolines (?). And this is what I was hoping could become easier: I know you were kind of discarding the first idea, "exposing CLIF", but it seems helpful for this kind of problem: we tell the compiler "here are the CLIF instructions for calls to this import". I understood from your RFC the danger of misuse, so perhaps it should not be a public, embedder-accessible API, but just having an easy way to inline intrinsics could make it easier to pursue the self-hosted functions?

Yes, there will be two kinds of inlining:

  1. The compile-time builtins need access to intrinsics for reading/writing native memory, and these intrinsics need to be inlined to meet our performance goals. This will happen with a very ham-fisted approach during Wasm-to-CLIF translation where we immediately turn calls to these intrinsics into the relevant CLIF instructions.
  2. We need to inline the compile-time builtins into the Wasm application that imports them. Depending on the approach we take, and the constraints we put on the shape of compile-time builtins, this could be done with the same ham-fisted approach used for (1). My thinking recently, reflected in the last Wasmtime meeting's discussion but not yet in this RFC, has been to instead make a more general inliner-as-a-library kind of thing for Cranelift that still allows Cranelift embedders to drive overall compilation like they do today but provides hooks for them to do inlining and use their own inlining heuristics. This would allow us to also do things like inlining calls across components and their fused adapters.
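The first kind of inlining above can be pictured with a toy translator. This is only a sketch over an invented mini-IR: `WasmOp`, `IrInst`, and the import names `intrinsic.load_u32` and `intrinsic.store_u32` are assumptions made for the example, not Cranelift's or Wasmtime's real types or the RFC's actual intrinsic names. The idea is simply that a call to a recognized intrinsic is replaced by the corresponding instruction at translation time, while every other call stays a call.

```rust
/// Toy input: a tiny subset of Wasm-like operations.
#[derive(Debug, Clone, PartialEq)]
pub enum WasmOp {
    /// Call to an imported function, identified here by its import name.
    Call(String),
    I32Add,
}

/// Toy output: a tiny subset of CLIF-like instructions.
#[derive(Debug, Clone, PartialEq)]
pub enum IrInst {
    Load { bytes: u8 },
    Store { bytes: u8 },
    Iadd,
    Call(String),
}

/// Translate ops one by one, expanding known intrinsics in place.
pub fn translate(ops: &[WasmOp]) -> Vec<IrInst> {
    ops.iter()
        .map(|op| match op {
            WasmOp::I32Add => IrInst::Iadd,
            WasmOp::Call(name) => match name.as_str() {
                // Hypothetical intrinsic names: emit the instruction directly.
                "intrinsic.load_u32" => IrInst::Load { bytes: 4 },
                "intrinsic.store_u32" => IrInst::Store { bytes: 4 },
                // Anything else remains an ordinary call.
                _ => IrInst::Call(name.clone()),
            },
        })
        .collect()
}

fn main() {
    let ops = vec![
        WasmOp::Call("intrinsic.load_u32".to_string()),
        WasmOp::I32Add,
        WasmOp::Call("host.grow".to_string()),
    ];
    for inst in translate(&ops) {
        println!("{:?}", inst);
    }
}
```

The second kind of inlining is different in nature: it moves whole function bodies into their callers and is driven by heuristics, so it does not reduce to a name lookup like this.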

fitzgen added a commit to fitzgen/wasmtime that referenced this pull request Jul 3, 2025
This allows you to map some functions, described by the given
`InstructionMapper`, over each of the entities in an instruction, producing a
new `InstructionData`.

I intend to use this as part of an inliner API for Cranelift that I am
developing as part of prototyping [Wasmtime's compile-time
builtins](bytecodealliance/rfcs#43).
github-merge-queue bot pushed a commit to bytecodealliance/wasmtime that referenced this pull request Jul 9, 2025
* Cranelift: Generate an `InstructionData::map` method

This allows you to map some functions, described by the given
`InstructionMapper`, over each of the entities in an instruction, producing a
new `InstructionData`.

I intend to use this as part of an inliner API for Cranelift that I am
developing as part of prototyping [Wasmtime's compile-time
builtins](bytecodealliance/rfcs#43).

* cargo fmt

* fix clippy
@fitzgen
Member Author

fitzgen commented Jul 29, 2025

Heads up: I've updated this RFC with a more-concrete proposal after the discussion in this issue and at various Cranelift and Wasmtime meetings.

I've also implemented general function inlining for Wasmtime in bytecodealliance/wasmtime#11283. Right now, that is useful for inlining calls between components (including their generated adapters), but with compile-time builtins it can also be reused to inline the definitions of compile-time builtins into their callers. And we shouldn't really have to do anything special to get that inlining; it should just happen For Free when inlining is enabled. (I do expect we will need to tweak the inlining heuristics as time goes on, but that is a separate discussion.)

I think there is just one last open question we need to resolve before moving forward: exactly how to implement the resource.address intrinsic (or something similar / equally powerful).

Please take a look at the updated RFC and let me know what you think and any ideas you have for that last open question!

array indirection: index into the array of resource tables, then index into
the array of table elements?

Anyone have any other ideas?
Contributor

you could use LLVM intrinsics as inspiration and have the import's name tell you which type and resources to use, e.g. have resource.address.t1 for a resource in table 1 or something.
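A small sketch of what that naming scheme could look like on the consuming side, assuming the `resource.address.t<N>` spelling suggested above (the function name and the exact suffix format are hypothetical): the table index is recovered from the import name itself.

```rust
/// Parse a hypothetical import name of the form `resource.address.t<N>`
/// and return the table index `N`, or `None` if the name does not match.
pub fn parse_resource_address(name: &str) -> Option<u32> {
    let suffix = name.strip_prefix("resource.address.t")?;
    // Require at least one digit and nothing but digits, so inputs such
    // as "t+1" (which `str::parse` would accept) are rejected.
    if suffix.is_empty() || !suffix.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    suffix.parse().ok()
}

fn main() {
    for name in ["resource.address.t1", "resource.address.t42", "resource.address"] {
        println!("{name} -> {:?}", parse_resource_address(name));
    }
}
```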

@fitzgen
Member Author

fitzgen commented Jul 30, 2025

Okay so I actually had flawed assumptions about the way that resources work in the component model, and this nicely nullifies that last open question.

The resource-definer gets to use an arbitrary u32 as their internal representation of a resource. Resource tables are only involved for other components, and their handles to that resource. The resource-definer always gets access to their u32 representation directly.

So given all that, the u32 resource representation can be an index into some embedder-defined table in the T in a Store<T> and compile-time builtins can inline accesses to those embedder-defined tables if we give them an intrinsic like store.data_address that returns a *mut T (and the embedder's T is repr(C)).
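A minimal sketch of that idea in plain Rust, without any Wasmtime dependency. `HostData`, `Widget`, and `widget_size` are made up for the example, and the raw pointer passed in stands in for whatever a `store.data_address`-style intrinsic would return. The lookup is written as explicit byte-offset arithmetic to mimic the loads a compile-time builtin would perform once inlined.

```rust
use std::mem::{offset_of, size_of};

/// Hypothetical element of the embedder-defined table.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Widget {
    pub id: u32,
    pub size: u32,
}

/// Hypothetical `T` in `Store<T>`. A fixed-size array is used because,
/// unlike `Vec`, its layout under `#[repr(C)]` is predictable.
#[repr(C)]
pub struct HostData {
    pub widgets: [Widget; 4],
    pub widget_count: u32,
}

/// Read the `size` field of the widget whose u32 resource representation
/// is `rep`, using only the base pointer and fixed byte offsets.
///
/// # Safety
/// `data` must point to a valid, live `HostData`.
pub unsafe fn widget_size(data: *mut HostData, rep: u32) -> Option<u32> {
    let base = data as *const u8;
    unsafe {
        // Bounds-check the representation against the table length.
        let count = (base.add(offset_of!(HostData, widget_count)) as *const u32).read();
        if rep >= count {
            return None;
        }
        let offset = offset_of!(HostData, widgets)
            + rep as usize * size_of::<Widget>()
            + offset_of!(Widget, size);
        Some((base.add(offset) as *const u32).read())
    }
}

fn main() {
    let empty = Widget { id: 0, size: 0 };
    let mut data = HostData {
        widgets: [Widget { id: 1, size: 10 }, Widget { id: 2, size: 20 }, empty, empty],
        widget_count: 2,
    };
    // Stand-in for the pointer an intrinsic would hand to the builtin.
    let ptr = &mut data as *mut HostData;
    println!("{:?}", unsafe { widget_size(ptr, 1) });
    println!("{:?}", unsafe { widget_size(ptr, 3) });
}
```

In the real feature the offsets would be baked into the builtin's Wasm, which is why the layout of `T` has to be stable in the first place.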

Will update the RFC shortly.

@fitzgen
Member Author

fitzgen commented Jul 30, 2025

Will update the RFC shortly.

RFC updated accordingly.

I think that resolves all open questions for this RFC. I will give it a little bit of time for some more discussion before officially starting the motion to merge, just to give people an extra chance to read the updated RFC first and provide feedback.

invalid resources and out-of-bounds resource table accesses.
The final intrinsic, `store.data_address`, gives a `*mut T` pointer containing
the address of the `T` host data in the `wasmtime::Store<T>`. As long as
embedders ensure that their `T` type is `#[repr(C)]`, then their compile-time
Contributor

you also have to watch out for primitive type alignment changing offsets, e.g. in #[repr(C)] struct S(u8, f64) the f64 field is at offset 4 on i686-unknown-linux-gnu but at offset 8 on x86_64-unknown-linux-gnu
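One way to see this on whatever target you compile for is the small check below. It uses a named-field equivalent of the `S(u8, f64)` struct mentioned above (the field names `tag` and `value` are only for the example). The program prints the target's values rather than assuming them, since the numbers are exactly what differs between platforms.

```rust
use std::mem::{align_of, offset_of, size_of};

/// Named-field equivalent of `#[repr(C)] struct S(u8, f64)`.
#[repr(C)]
pub struct S {
    pub tag: u8,
    pub value: f64,
}

/// Byte offset of the `f64` field on the current compilation target.
pub fn value_offset() -> usize {
    offset_of!(S, value)
}

fn main() {
    // With `#[repr(C)]`, the f64 field is placed at the first offset after
    // the u8 that satisfies the target's alignment for f64.
    println!("align_of::<f64>() = {}", align_of::<f64>());
    println!("offset of value   = {}", value_offset());
    println!("size_of::<S>()    = {}", size_of::<S>());
}
```

A compile-time builtin that hardcodes one of these offsets is therefore only correct for targets that agree on it, so the offsets either need to be computed per target or the host type needs explicit padding.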

Member Author

That's true and I expect we will need to add some wording to our documentation for this feature around what exactly the safety conditions will need to be, such that the compile-time builtins can portably access the host data. My plan right now is to document these things in detail as I prototype and dog-food the feature.

@fitzgen
Member Author

fitzgen commented Aug 5, 2025

Okay, I think this is ready and we've had a bit of time for folks to look it over, so let's officially start the process.

Motion to Finalize

Disposition: Merge


As always, details on the RFC process can be found here: https://github.com/bytecodealliance/rfcs/blob/main/accepted/rfc-process.md#making-a-decision-merge-or-close

@alexcrichton
Member

I second! (don't think I can re-approve)

@fitzgen
Member Author

fitzgen commented Aug 6, 2025

As there has been sign-off from representatives of two different BA stakeholder organizations, this RFC is now entering its 10-day

Final Comment Period

and the last day to raise concerns before this RFC merges is 2025-08-16.

Thanks everyone!

6 participants