Skip to content

x64: generate rules to pick between SSE or AVX instructions #10942

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 9, 2025

Conversation

abrown
Copy link
Member

@abrown abrown commented Jun 5, 2025

The new assembler gives us the opportunity to generate some extra ISLE rules to eliminate boilerplate like the following, hand-written around 180 times throughout inst.isle:

(decl x64_... (Xmm XmmMem) Xmm)
(rule 1 (x64_... src1 src2)
        (if-let true (use_avx))
        (<emit avx version>))
(rule 0 (x64_... src1 src2) (<emit sse version>))

This PR only attempts to solve that exact pattern (i.e., Xmm XmmMem) but the approach could scale to more patterns in the future.

abrown added 2 commits June 5, 2025 15:20
This adds the ability for an instruction to track an alternate version
of itself. This is helpful for connecting SSE instructions to their
upgraded AVX version. The logic here compares the mnemonic name minus
the `v` prefix and checks that the opcodes are the same. This
auto-assignment cautiously limits us to SSE instructions like `rw(xmm),
r(xmm_m*)`, which have a most regular SSE-AVX relationship.

To see the list of instructions that have been assigned alternates, from
the `cranelift/assembler-x64/meta` directory, run:

```console
$ cargo run | grep alternate
```
This change updates `gen_asm.rs` to emit an extra rule for instructions
that have an AVX alternate: `x64_<inst>_or_avx`. The rule logic has the
same shape as what we currently hand-write ~180 times in `inst.isle`
(!).
@abrown abrown requested a review from a team as a code owner June 5, 2025 22:21
@abrown abrown requested review from cfallin and removed request for a team June 5, 2025 22:21
// Automatically assign AVX alternates to SSE instructions (see
// `Inst::alternate`). This allows later code generation to
// instruction-select between the AVX and SSE versions of an instruction.
assign_avx_alternates(&mut all);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation (below) could all go live in a separate module (?).

@@ -504,6 +504,38 @@ pub fn generate_isle_inst_decls(f: &mut Formatter, inst: &Inst) {
f,
"(rule ({rule_name} {param_names}) ({convert} ({raw_name} {implicit_params} {param_names})))"
);

if let Some(avx_name) = &inst.alternate {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not terribly happy about dropping this here; it will work fine but at some point we may need to refactor the function this is in. It is just quite handy that all the parameter types and names have been pre-calculated for us.

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks -- neat that this was made to work! Some thoughts below. If this PR turns out to be the best way I'm fine with it going in but I'm curious what you think of the below...

/// opcode if one is available and the primary otherwise.
pub fn opcode(&self) -> u8 {
match self {
Encoding::Rex(rex) => match rex.opcodes.secondary {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put this nested match in an opcode() method on rex.opcodes?

Also, it's not totally clear to me what "primary opcode" means in the doc-comment above and why we return the secondary for the "primary" if it exists -- I read the doc-comment for Opcodes below too and I'm still a little fuzzy. Could we say what we mean here by "primary" and why the secondary is the primary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The manual uses "primary" and "secondary" to explain all this but, yes, it kind of feels like the "primary" opcode should always be the last one. I'll add Opcodes::opcode and try to explain things there a bit better.

all
}

/// Assigns AVX alternates to SSE instructions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's neat that this was made to work automagically -- all the same, I worry a tiny bit that it may be brittle (unexpectedly determine a match when some subtle condition means they're not quite equivalent; or on the other side, miss a match because of something). The note below about rounding and the fact this limits things to just one format for now as a result seems to bear this out a little.

In the spirit of the explicit DSL, I wonder if it would be better to have a clause on the instruction builder that sets the alternate? Something like

  inst("vpadd", ...).avx_for("padd"),
  inst("padd", ...),
  ...

(avx_for, upgrades, equiv, pick a verb phrase...)

Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I did consider adding an explicit version: inst(...).alt("vpadd"). I can't remember what dissuaded me, but probably laziness (there will be a lot of places to update!). But it is more clear so I'm in favor of that — less magic. I'll rework this to use a DSL function and use the current "check the SSE-to-AVX link" logic to verify that the user has specified .alt(...) correctly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I'd agree with @cfallin as well in that I'm in favor of explicitly saying "this is the AVX thing for that other instruction" which then generates a panic/assertion/etc if something doesn't match. (I realize y'all've already reached this conclusion but wanted to echo my opinion here too to voice support)

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. labels Jun 6, 2025
abrown added 3 commits June 9, 2025 09:41
This change uses `Inst::alt()` to explicitly mark which instructions
have an AVX alternate. This retains the strict checks preventing us from
manually making a mistake. It also passes along the `Feature` in an
`Alternate` structure to make this easier to extend later.
Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👍

@abrown abrown added this pull request to the merge queue Jun 9, 2025
Merged via the queue into bytecodealliance:main with commit 09cfb90 Jun 9, 2025
53 checks passed
@abrown abrown deleted the asm-auto-alt branch June 9, 2025 18:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. cranelift Issues related to the Cranelift code generator
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants