-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Improve 2q block collection via 1q quaternion-based collection #13649
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
1q gates are members of the group U(2), which we can represent as a scalar phase term and a member of SU(2). The members of SU(2) can be represented by versors (also called unit quaternions, but I got tired of typing that all the time...). This adds a representation of versors and the group action to the Rust code, and ways to convert from matrix-based forms to the them. This commit introduces `nalgebra` as a dependency, to use its quaternion logic. This is a relatively heavy dependency, especially for something as simple as quaternions, but some of this is in anticipation of moving more matrix code to the static matrices of `nalgebra`, rather than the too-dynamic-for-our-needs ones of `ndarray`; `faer` also offers static matrices, but its APIs continue to heavily fluctuate between versions, and it requires ever higher MSRVs.
Switch the inner algorithm of `ConsolidateBlocks` to use the quaternion form for single-qubit matrix multiplications. This offered a few percentage-points speedup for the block collection for typical `rz-sx-rz-sx-rz`-type runs in quantum-volume-like collections.
Switch the lookup logic for the qargs in the block collector to determine the qargs ordering without additional heap allocations. This offers another modest (~4%) improvement in collection performance for large runs.
Producing the Kronecker product of the two single-qubit matrices from the versor representation is trivially calculable, and can be written into an existing allocation. Similarly, switching the qubit order of a 2q matrix involves only six swaps, and does not need to allocate a new matrix if one is already available. There are lots of places remaining in this code where more matrix allocations could be avoided. If nothing else, it should be possible to allocate only three 2q matrices in total, and keep shuffling the labelling of them when doing `A.B -> C`. `ndarray` does not make this easy, though; `nalgebra` and `faer` both have better interfaces for doing this, but currently all our matrix code in the `Operation` trait is in terms of `ndarray`.
One or more of the following people are relevant to this code:
|
There's also more places we could make use of the versor representation, like in 1q gate optimisation, but that pass currently involves passing matrices through many different parts of its interface with itself, so it'll be more complicated to modify. |
... which were necessary because I'd borked the matrix calculations and forgotten to write any tests of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some quick high level comments. I missed the update and don't want these lost by github weirdness. I'll review in more depth later.
crates/accelerate/src/qi/mod.rs
Outdated
// copyright notice, and modified files need to carry a notice indicating | ||
// that they have been altered from the originals. | ||
|
||
//! Quantum-information and linear-algebra related functionality, typically used as drivers for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to move over some linear algebra functionality like: https://github.com/Qiskit/qiskit/blob/main/crates/accelerate/src/utils.rs and https://github.com/Qiskit/qiskit/blob/main/crates/accelerate/src/synthesis/linear/utils.rs (although that first one might not be needed anymore).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm happy in a follow-up to move a few other bits over. I think there's other loose files and bits and bobs that could probably move into it too, just to keep things a bit more localised.
Pull Request Test Coverage Report for Build 14660431723Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
It's useful to be able to represent elements both of $SU(2)$ and $U(2)$; even the initial version of this PR used representations of $SU(2)$ with a shared phase in the only location the new types were used. This adds a `VersorSU2` struct to make this more easily possible, and splits the interface in the right places to make it possible to use both from other modules.
I fixed the merge conflicts, and also I pivoted the structs a little bit to expose both |
This commit builds off of the native rust representation of a UnitaryGate added in Qiskit#13759 and uses the native representation everywhere we were using UnitaryGate in rust via python previously: the quantum_volume() function and consolidate blocks. One future item is consolidate blocks can be updated to use nalgebra types internally instead of ndarray as for the 1 and 2q cases we know the fixed size of the array ahead of time. However the block consolidation code is built using ndarray currently and later synthesis code also works in ndarray so there isn't any real benefit yet, and we'd just add unecessary conversions and allocations. However, once Qiskit#13649 merges this will change and it would make more sense to add the unitary gate with a Matrix4. But this can be handled separately after this merges.
This commit builds off of the native rust representation of a UnitaryGate added in Qiskit#13759 and uses the native representation everywhere we were using UnitaryGate in rust via python previously: the quantum_volume() function and consolidate blocks. One future item is consolidate blocks can be updated to use nalgebra types internally instead of ndarray as for the 1 and 2q cases we know the fixed size of the array ahead of time. However the block consolidation code is built using ndarray currently and later synthesis code also works in ndarray so there isn't any real benefit yet, and we'd just add unecessary conversions and allocations. However, once Qiskit#13649 merges this will change and it would make more sense to add the unitary gate with a Matrix4. But this can be handled separately after this merges.
This commit builds off of the native rust representation of a UnitaryGate added in Qiskit#13759 and uses the native representation everywhere we were using UnitaryGate in rust via python previously: the quantum_volume() function, consolidate blocks, split2qunitaries, and unitary synthesis. One future item is consolidate blocks can be updated to use nalgebra types internally instead of ndarray as for the 1 and 2q cases we know the fixed size of the array ahead of time. However the block consolidation code is built using ndarray currently and later synthesis code also works in ndarray so there isn't any real benefit yet, and we'd just add unecessary conversions and allocations. However, once Qiskit#13649 merges this will change and it would make more sense to add the unitary gate with a Matrix4. But this can be handled separately after this merges.
This commit builds off of the native rust representation of a UnitaryGate added in #13759 and uses the native representation everywhere we were using UnitaryGate in rust via python previously: the quantum_volume() function, consolidate blocks, split2qunitaries, and unitary synthesis. One future item is consolidate blocks can be updated to use nalgebra types internally instead of ndarray as for the 1 and 2q cases we know the fixed size of the array ahead of time. However the block consolidation code is built using ndarray currently and later synthesis code also works in ndarray so there isn't any real benefit yet, and we'd just add unecessary conversions and allocations. However, once #13649 merges this will change and it would make more sense to add the unitary gate with a Matrix4. But this can be handled separately after this merges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only reviewed the quantum info piece so far. But it looks good to me. I'll follow up tomorrow reviewing the changes to the block converter.
Ok(VersorSU2::identity().with_phase(phase)) | ||
} | ||
StandardGate::HGate => { | ||
debug_assert!(params.is_empty()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably should get in the habit of doing this more for underlying assumptions like this.. Not that we build a ton in debug mode, but we do in some places so this would be good to sanity check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following up with the convert_2q_block_matrix.rs
file this looks good. Just a couple more small comments.
match ( | ||
qubits_1q.map(|sep| sep.matrix_into(&mut work)), | ||
output_matrix, | ||
) { | ||
(Some(sep), Some(state)) => Ok(sep.dot(&state)), | ||
(None, Some(state)) => Ok(state), | ||
(Some(sep), None) => Ok(sep.to_owned()), | ||
// This shouldn't actually ever trigger, because we expect blocks to be non-empty, but it's | ||
// trivial to handle anyway. | ||
(None, None) => Ok(arr2(&TWO_QUBIT_IDENTITY)), | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this PR, but it'll be great to have this all done via Matrix4
after this merges, so we can just consolidate to a UnitaryGate
with ArrayType::TwoQ
and avoid allocating a matrix altogether.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah absolutely - I stopped in this PR at the point that I started interacting with the 2q matrices because I wasn't prepared to put too much more time into things I knew were both in flux and needed more major changes.
Should be updated and comments addressed now. There's still quite a lot of inefficiency in how this PR handles 2q matrices, but I thought it's better to leave that til a follow-up and do it more completely throughout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM now thanks for doing this, I'm looking forward to using this and also the better usage of UnitaryGate
in the pass in a follow-up. I left two questions inline one isn't relevant for this PR itself but more about a separate follow-up. The other is about the code, but feel free to enqueue this to merge if you don't want to change it.
fn all_1q_gates() -> Vec<StandardGate> { | ||
(0..STANDARD_GATE_SIZE as u8) | ||
.filter_map(|x| { | ||
::bytemuck::checked::try_cast::<_, StandardGate>(x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually have code somewhere that I added a impl TryFrom<u8> for StandardGate
and just made a static [StandardGate; STANDARD_GATE_SIZE]
array to do the conversion. It was part of some code I didn't end up finish or using for anything. But I can dig up the branch and push that as a standalone if you think it would be useful (I added it for a similar use case to this).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TryFrom<u8> for StandardGate
feels wrong to me, but not super strongly. It feels nicer somehow using bytemuck
to say "I know I'm doing something a little bit odd", since it feels like we're kind of conjuring semantics out of nowhere - the fact that StandardGate
is backed by integers is (imo) an implementation detail, albeit a fixed one.
If we wanted to expose an iterator over all gates, I'd just expose your [StandardGate; Standard_GATE_SIZE]
as a static
publicly, then you can iterate through it directly. That's what I did with BitTerm
in the tests of SparseObservable
's lookup tables.
Summary
This is a small series of patches, which could be split into 2-3 separate PRs if preferred. There are two main goals:
ConsolidateBlocks
This doesn't do everything that could be done for
ConsolidateBlocks
, but I've stopped at the point where the most natural changes to me now are quite a bit larger.Using this script:
I see a modest (~10%) improvement in the pass time, going from ~117ms to ~106ms.
Details and comments
Individual commit notes:
Add versor-based representation of 1q gates
1q gates are members of the group U(2), which we can represent as a scalar phase term and a member of SU(2). The members of SU(2) can be represented by versors (also called unit quaternions, but I got tired of typing that all the time...).
This adds a representation of versors and the group action to the Rust code, and ways to convert from matrix-based forms to the them.
This commit introduces
nalgebra
as a dependency, to use its quaternion logic. This is a relatively heavy dependency, especially for something as simple as quaternions, but some of this is in anticipation of moving more matrix code to the static matrices ofnalgebra
, rather than the too-dynamic-for-our-needs ones ofndarray
;faer
also offers static matrices, but its APIs continue to heavily fluctuate between versions, and it requires ever higher MSRVs.Use quaternions in 1q block collection
Switch the inner algorithm of
ConsolidateBlocks
to use the quaternion form for single-qubit matrix multiplications. This offered a few percentage-points speedup for the block collection for typicalrz-sx-rz-sx-rz
-type runs in quantum-volume-like collections.Avoid unnecessary allocations in qargs lookup
Switch the lookup logic for the qargs in the block collector to determine the qargs ordering without additional heap allocations. This offers another modest (~4%) improvement in collection performance for large runs.
Avoid allocations in simple matrix operations
Producing the Kronecker product of the two single-qubit matrices from the versor representation is trivially calculable, and can be written into an existing allocation. Similarly, switching the qubit order of a 2q matrix involves only six swaps, and does not need to allocate a new matrix if one is already available.
There are lots of places remaining in this code where more matrix allocations could be avoided. If nothing else, it should be possible to allocate only three 2q matrices in total, and keep shuffling the labelling of them when doing
A.B -> C
.ndarray
does not make this easy, though;nalgebra
andfaer
both have better interfaces for doing this, but currently all our matrix code in theOperation
trait is in terms ofndarray
.