Skip to content

Conversation

taiki-e
Copy link
Owner

@taiki-e taiki-e commented Sep 1, 2024

This supports 128-bit atomics for riscv64 and 64-bit atomics for riscv32 with Zacas extension (ratified more than half a year ago) enabled.

Note:

  • Currently Zacas extension support is marked as experimental in LLVM: so -C target-feature=+experimental-zacas instead of -C target-feature=+zacas is needed for now.
    • As the name indicates, support for this extension on our part at this time will also be somewhat experimental.
    • Perhaps we cannot merge this PR until LLVM marks this as non-experimental.
      • Now marked as experimental on our end too. And now only enabled for LLVM 19 because it is more experimental in LLVM 17/18 and "experimental-zacas" feature may no longer exist when it is marked as non-experimental in LLVM 20.
    • UPDATE: marked as non-experimental in LLVM 20 riscv: Zacas extension is no longer experimental in LLVM #206
  • The core part of runtime CPU feature detection has already been implemented for Linux/Android and tested for Linux on QEMU, but is test-only until zacas is marked as non-experimental by LLVM.
  • Zacas extension provides DWCAS, but there is no DW load/store instruction in the ratified atomic-related extensions AFAIK, so a load/store implementation at this time would not be very efficient.

FYI @ibraheemdev

@taiki-e taiki-e added the O-riscv Target: RISC-V architecture label Sep 1, 2024
@taiki-e taiki-e force-pushed the riscv64-zacas branch 3 times, most recently from aef1268 to cb69baa Compare September 1, 2024 05:49
@taiki-e taiki-e force-pushed the main branch 5 times, most recently from 2ea2814 to 0818c81 Compare September 1, 2024 14:06
@taiki-e taiki-e force-pushed the riscv64-zacas branch 2 times, most recently from 6f63126 to 4e95659 Compare September 7, 2024 07:49
taiki-e added a commit that referenced this pull request Sep 7, 2024
@taiki-e taiki-e force-pushed the main branch 3 times, most recently from 3076fb9 to 79adf66 Compare September 8, 2024 07:03
@taiki-e taiki-e force-pushed the main branch 5 times, most recently from ba6e035 to 0fad63a Compare September 16, 2024 19:01
@taiki-e taiki-e force-pushed the riscv64-zacas branch 3 times, most recently from bb3edf4 to 00722a4 Compare September 18, 2024 17:18
@taiki-e taiki-e changed the title riscv64: Support 128-bit atomics (Zacas extension) riscv: Support 128-bit atomics on RV64 and 64-bit atomics on RV32 (Zacas extension) Sep 18, 2024
@taiki-e taiki-e changed the title riscv: Support 128-bit atomics on RV64 and 64-bit atomics on RV32 (Zacas extension) riscv: Support 128-bit atomics for RV64 and 64-bit atomics for RV32 (Zacas extension) Sep 18, 2024
@taiki-e taiki-e force-pushed the riscv64-zacas branch 2 times, most recently from e0ec885 to bcb83bf Compare September 18, 2024 17:50
@taiki-e taiki-e force-pushed the riscv64-zacas branch 8 times, most recently from 4769824 to 910f2a9 Compare September 18, 2024 19:14
@taiki-e taiki-e merged commit 343d10c into main Sep 18, 2024
105 checks passed
@taiki-e taiki-e deleted the riscv64-zacas branch September 18, 2024 21:15
compiler-errors added a commit to compiler-errors/rust that referenced this pull request Feb 25, 2025
rustc_target: Add more RISC-V atomic-related features

This is a continuation of rust-lang#130877 and adds a few target features, including `zacas`, which was experimental in LLVM 19 and marked non-experimental in LLVM 20.

This adds the following target features to unstable riscv_target_feature:

- `za64rs` (Za64rs Extension 1.0): Reservation Set Size of at Most 64 Bytes
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L227-L228), [available since LLVM 18](llvm/llvm-project@8649328))
- `za128rs` (Za128rs Extension 1.0): Reservation Set Size of at Most 128 Bytes
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L230-L231), [available since LLVM 18](llvm/llvm-project@8649328))
  - IIUC, `za*rs` can be referenced when implementing helpers to reduce contention in synchronization primitives, like [`crossbeam_utils::CachePadded`](https://docs.rs/crossbeam-utils/latest/crossbeam_utils/struct.CachePadded.html). (relevant discussion: riscv/riscv-profiles#79)
- `zacas` (Zacas Extension 1.0): Atomic Compare-And-Swap Instructions (`amocas.{w,d,q}{,.aq,.rl,.aqrl}` and `amocas.{b,h}{,.aq,.rl,.aqrl}` when `zabha` is also enabled)
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L240-L243), [available as non-experimental since LLVM 20](llvm/llvm-project@614aeda))
  - This implies `zaamo`.
  - This is used to optimize CAS in existing atomics and/or implement 64-bit/128-bit atomics on riscv32/riscv64 (e.g., taiki-e/portable-atomic#173).
  - Note that [LLVM does not automatically use this instruction for 64-bit/128-bit atomics on riscv32/riscv64 even if this feature is enabled, because doing it changes the ABI](https://github.com/llvm/llvm-project/blob/876174ffd7533dc220f94721173bb767b659fa7f/llvm/docs/RISCVUsage.rst#riscv-zacas-note). (If the ability to do that is provided by LLVM in the future, it should probably be controlled by another ABI feature similar to `forced-atomics`.)
- `zama16b` (Zama16b Extension 1.0): Atomic 16-byte misaligned loads, stores and AMOs
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L255-L256), [available since LLVM 19](llvm/llvm-project@b090569))
  - IIUC, unlike AArch64 FEAT_LSE2 which also makes 16-byte aligned ldp ({i,u}128 load) atomic, this extension only affects instructions that already considered atomic if they were naturally aligned. i.e., fld (f64 load) on riscv32 would not be atomic with or without this extension ([relevant QEMU code](https://github.com/qemu/qemu/blob/b69801dd6b1eb4d107f7c2f643adf0a4e3ec9124/target/riscv/insn_trans/trans_rvd.c.inc#L50-L62)).
- `zawrs` (Zawrs Extension 1.0): Wait on Reservation Set (`wrs.nto` and `wrs.sto`)
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L258), [available as non-experimental since LLVM 17](llvm/llvm-project@d41a73a))
  - This is used to optimize synchronization primitives (e.g., Linux uses this for spinlocks (torvalds/linux@b8ddb0d)).

Btw, the question of whether `zaamo` is implied by `zabha` or not, which was discussed in rust-lang#130877, has been resolved in LLVM 20, since LLVM now treats `zaamo` as implied by `zabha`/`zacas` (llvm/llvm-project#115694), just like GCC and rustc.

r? `@Amanieu`

`@rustbot` label +O-riscv +A-target-feature
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Feb 25, 2025
Rollup merge of rust-lang#137417 - taiki-e:riscv-atomic, r=Amanieu

rustc_target: Add more RISC-V atomic-related features

This is a continuation of rust-lang#130877 and adds a few target features, including `zacas`, which was experimental in LLVM 19 and marked non-experimental in LLVM 20.

This adds the following target features to unstable riscv_target_feature:

- `za64rs` (Za64rs Extension 1.0): Reservation Set Size of at Most 64 Bytes
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L227-L228), [available since LLVM 18](llvm/llvm-project@8649328))
- `za128rs` (Za128rs Extension 1.0): Reservation Set Size of at Most 128 Bytes
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L230-L231), [available since LLVM 18](llvm/llvm-project@8649328))
  - IIUC, `za*rs` can be referenced when implementing helpers to reduce contention in synchronization primitives, like [`crossbeam_utils::CachePadded`](https://docs.rs/crossbeam-utils/latest/crossbeam_utils/struct.CachePadded.html). (relevant discussion: riscv/riscv-profiles#79)
- `zacas` (Zacas Extension 1.0): Atomic Compare-And-Swap Instructions (`amocas.{w,d,q}{,.aq,.rl,.aqrl}` and `amocas.{b,h}{,.aq,.rl,.aqrl}` when `zabha` is also enabled)
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L240-L243), [available as non-experimental since LLVM 20](llvm/llvm-project@614aeda))
  - This implies `zaamo`.
  - This is used to optimize CAS in existing atomics and/or implement 64-bit/128-bit atomics on riscv32/riscv64 (e.g., taiki-e/portable-atomic#173).
  - Note that [LLVM does not automatically use this instruction for 64-bit/128-bit atomics on riscv32/riscv64 even if this feature is enabled, because doing it changes the ABI](https://github.com/llvm/llvm-project/blob/876174ffd7533dc220f94721173bb767b659fa7f/llvm/docs/RISCVUsage.rst#riscv-zacas-note). (If the ability to do that is provided by LLVM in the future, it should probably be controlled by another ABI feature similar to `forced-atomics`.)
- `zama16b` (Zama16b Extension 1.0): Atomic 16-byte misaligned loads, stores and AMOs
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L255-L256), [available since LLVM 19](llvm/llvm-project@b090569))
  - IIUC, unlike AArch64 FEAT_LSE2 which also makes 16-byte aligned ldp ({i,u}128 load) atomic, this extension only affects instructions that already considered atomic if they were naturally aligned. i.e., fld (f64 load) on riscv32 would not be atomic with or without this extension ([relevant QEMU code](https://github.com/qemu/qemu/blob/b69801dd6b1eb4d107f7c2f643adf0a4e3ec9124/target/riscv/insn_trans/trans_rvd.c.inc#L50-L62)).
- `zawrs` (Zawrs Extension 1.0): Wait on Reservation Set (`wrs.nto` and `wrs.sto`)
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L258), [available as non-experimental since LLVM 17](llvm/llvm-project@d41a73a))
  - This is used to optimize synchronization primitives (e.g., Linux uses this for spinlocks (torvalds/linux@b8ddb0d)).

Btw, the question of whether `zaamo` is implied by `zabha` or not, which was discussed in rust-lang#130877, has been resolved in LLVM 20, since LLVM now treats `zaamo` as implied by `zabha`/`zacas` (llvm/llvm-project#115694), just like GCC and rustc.

r? `@Amanieu`

`@rustbot` label +O-riscv +A-target-feature
RalfJung pushed a commit to RalfJung/miri that referenced this pull request Feb 25, 2025
rustc_target: Add more RISC-V atomic-related features

This is a continuation of rust-lang/rust#130877 and adds a few target features, including `zacas`, which was experimental in LLVM 19 and marked non-experimental in LLVM 20.

This adds the following target features to unstable riscv_target_feature:

- `za64rs` (Za64rs Extension 1.0): Reservation Set Size of at Most 64 Bytes
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L227-L228), [available since LLVM 18](llvm/llvm-project@8649328))
- `za128rs` (Za128rs Extension 1.0): Reservation Set Size of at Most 128 Bytes
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L230-L231), [available since LLVM 18](llvm/llvm-project@8649328))
  - IIUC, `za*rs` can be referenced when implementing helpers to reduce contention in synchronization primitives, like [`crossbeam_utils::CachePadded`](https://docs.rs/crossbeam-utils/latest/crossbeam_utils/struct.CachePadded.html). (relevant discussion: riscv/riscv-profiles#79)
- `zacas` (Zacas Extension 1.0): Atomic Compare-And-Swap Instructions (`amocas.{w,d,q}{,.aq,.rl,.aqrl}` and `amocas.{b,h}{,.aq,.rl,.aqrl}` when `zabha` is also enabled)
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L240-L243), [available as non-experimental since LLVM 20](llvm/llvm-project@614aeda))
  - This implies `zaamo`.
  - This is used to optimize CAS in existing atomics and/or implement 64-bit/128-bit atomics on riscv32/riscv64 (e.g., taiki-e/portable-atomic#173).
  - Note that [LLVM does not automatically use this instruction for 64-bit/128-bit atomics on riscv32/riscv64 even if this feature is enabled, because doing it changes the ABI](https://github.com/llvm/llvm-project/blob/876174ffd7533dc220f94721173bb767b659fa7f/llvm/docs/RISCVUsage.rst#riscv-zacas-note). (If the ability to do that is provided by LLVM in the future, it should probably be controlled by another ABI feature similar to `forced-atomics`.)
- `zama16b` (Zama16b Extension 1.0): Atomic 16-byte misaligned loads, stores and AMOs
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L255-L256), [available since LLVM 19](llvm/llvm-project@b090569))
  - IIUC, unlike AArch64 FEAT_LSE2 which also makes 16-byte aligned ldp ({i,u}128 load) atomic, this extension only affects instructions that already considered atomic if they were naturally aligned. i.e., fld (f64 load) on riscv32 would not be atomic with or without this extension ([relevant QEMU code](https://github.com/qemu/qemu/blob/b69801dd6b1eb4d107f7c2f643adf0a4e3ec9124/target/riscv/insn_trans/trans_rvd.c.inc#L50-L62)).
- `zawrs` (Zawrs Extension 1.0): Wait on Reservation Set (`wrs.nto` and `wrs.sto`)
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L258), [available as non-experimental since LLVM 17](llvm/llvm-project@d41a73a))
  - This is used to optimize synchronization primitives (e.g., Linux uses this for spinlocks (torvalds/linux@b8ddb0d)).

Btw, the question of whether `zaamo` is implied by `zabha` or not, which was discussed in rust-lang/rust#130877, has been resolved in LLVM 20, since LLVM now treats `zaamo` as implied by `zabha`/`zacas` (llvm/llvm-project#115694), just like GCC and rustc.

r? `@Amanieu`

`@rustbot` label +O-riscv +A-target-feature
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-riscv Target: RISC-V architecture
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant