Skip to content

Conversation

mtreinish
Copy link
Member

Summary

This commit migrates the inner loop for apply_operation_back and
apply_operation_front to leverage retworkx's
insert_node_between_multiple PyDiGraph method. This retworkx method
performs essentially the same loop internally but executes faster.

Details and comments

Depends on Qiskit/rustworkx#181 being included in
a release.

This commit migrates the inner loop for apply_operation_back and
apply_operation_front to leverage retworkx's
insert_node_between_multiple PyDiGraph method. This retworkx method
performs essentially the same loop internally but executes faster.

Depends on Qiskit/rustworkx#181 being included in
a release.
@mtreinish mtreinish requested a review from a team as a code owner October 20, 2020 22:07
@mtreinish mtreinish added on hold Can not fix yet performance labels Oct 20, 2020
@mtreinish
Copy link
Member Author

Just to show some initial numbers on this. It's showing a ~2X speed improvement for apply_operation_back() when running circuit_to_dag() on a random_circuit() with 65 qubits and a depth of 4096, For that test it goes from this on master:

Screenshot_2020-10-21_15-06-32

To this with this PR applied (and retworkx patched):

Screenshot_2020-10-21_15-06-06

The interesting things are after this PR the bottlenecks for apply operation back are a 3 way tie between calling __repr__ on Bit objects which is ~15% of the time spent on this cumulatively (I address this in #5272), the list comprehension to get the node_ids of the output nodes (at L391) which is ~16% of the function time, and calling _check_bitswhich is ~15% of the run time (which is spending most of it's time doing Bit.__eq__ and Register.__eq__).

I'm thinking that after this and #5272 we might be able to speed up the other 2 bottlenecks by changing the Bit.__eq__ and Register.__eq__ to do a single repr comparison since they'll be pre-generated and it's a single comparison. But, we'll have to benchmark and compare to see if that makes a big difference or not.

@mtreinish
Copy link
Member Author

mtreinish commented Oct 21, 2020

I added the __eq__ change to use _repr in #5272 as I mentioned in the last comment and this continues the process of speeding up apply_operation_back further. With this PR and #5272 together with the previous example apply_operation_back() takes cumulatively ~1.7 seconds (down from ~4.6 sec on master):

Screenshot_2020-10-21_16-39-53

@mtreinish mtreinish removed the on hold Can not fix yet label Nov 12, 2020
@mtreinish
Copy link
Member Author

retworkx 0.6.0 was released this morning, so this is no longer blocked.

@mtreinish mtreinish added this to the 0.17 milestone Nov 12, 2020
mtreinish added a commit to mtreinish/qiskit-core that referenced this pull request Nov 16, 2020
In sabre_swap the comprehension in _score_heuristic() with the basic
heuristic starts to become a bottleneck as the size of the coupling map
increases. This probably becomes the top bottleneck in a transpilation
after Qiskit#5316, Qiskit#5183, Qiskit#5294, Qiskit#5267, and Qiskit#5272 are merged. This commit is
an attempt to try and mitigate that a bit by using numpy native
operations instead of a python comprehension in sum(). If this is
insufficient we'll likely have to either avoid doing a sum like this or
drop down to a lower level with cython or rust and operate on the array
directly to remove this bottleneck.
@mergify mergify bot merged commit 4e03ad1 into Qiskit:master Nov 25, 2020
mergify bot added a commit that referenced this pull request Nov 25, 2020
* Use numpy sum instead of comprehension in _score_heuristic

In sabre_swap the comprehension in _score_heuristic() with the basic
heuristic starts to become a bottleneck as the size of the coupling map
increases. This probably becomes the top bottleneck in a transpilation
after #5316, #5183, #5294, #5267, and #5272 are merged. This commit is
an attempt to try and mitigate that a bit by using numpy native
operations instead of a python comprehension in sum(). If this is
insufficient we'll likely have to either avoid doing a sum like this or
drop down to a lower level with cython or rust and operate on the array
directly to remove this bottleneck.

* Make distance matrix a coupling map property

* Fix tests

* Fix lint

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
@mtreinish mtreinish deleted the retworkx-apply-operation-dagcircuit branch November 26, 2020 15:52
@kdk kdk added the Changelog: None Do not include in changelog label Mar 31, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: None Do not include in changelog performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants