-
Notifications
You must be signed in to change notification settings - Fork 2.6k
[unitaryhack] DAGDependency performance improvement #8081
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[unitaryhack] DAGDependency performance improvement #8081
Conversation
Thank you for opening a new pull request. Before your PR can be merged it will first need to pass continuous integration tests and be reviewed. Sometimes the review process can be slow, so please be patient. While you're waiting, please feel free to review other open PRs. While only a subset of people are authorized to approve pull requests for merging, everyone is encouraged to review open pull requests. Doing reviews helps reduce the burden on the core team and helps make the project's code better for everyone. One or more of the the following people are requested to review this:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking this on @Sebastian-Brandhofer ! It looks like there were some test failures, though all within the Operator
tests, which I wouldn't expect to be impacted by the changes here: https://dev.azure.com/qiskit-ci/qiskit-terra/_build/results?buildId=36448&view=logs&j=e4ea27c2-3a16-5b32-7be8-0bf30fe9e828&t=25d46f44-c831-5e21-7716-9054a5afcf2d&l=17434
Would it be possible to compare performance between this PR and main over some random circuits, similar to what was done in #8078 ?
Sure, the runtime is roughly halved:
|
The code seems to have some issues with the test runner though. E.g. the tests
|
Thanks @Sebastian-Brandhofer. I don't fully understand the reason for test failures above, but calling
copies the array, while setting
modifies the array in place. Which means that your code may inadvertently modify matrices for already existing gates. Which might explain why the test failures are only exhibited when running multiple tests and not just the last problematic test. It's also interesting that executing the following code:
gives the output:
That is, the first call to And a bit of an off-topic question. What did you use to produce such wonderful graphics above? I have recently started using SnakeViz, but the graphics above look better. Additionally, are your runtimes with profiler enabled or disabled? |
Thanks @alexanderivrii, that is probably the reason for the test fails. I expected to_matrix to return a copy. The runtimes are with a profiler disabled (and also without your code) and the graphs are called flame graphs ( https://github.com/baverman/flameprof ) which you can generate out of cProfile's output. :-) |
New runtime using np.reshape instead of .shape
|
qiskit/dagcircuit/dagdependency.py
Outdated
return np.allclose(op12, op21) | ||
maybe_zero = np.abs(op12 - op21) | ||
maybe_zero[np.abs(maybe_zero) < 1e-7] = 0 | ||
return not maybe_zero.any() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change require?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.allclose seems to be much slower in this use case than lines 644-646.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and the double np.abs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! Yeah, thanks for spotting this! :)
Pull Request Test Coverage Report for Build 2415450922
💛 - Coveralls |
@@ -567,6 +568,12 @@ def merge_no_duplicates(*iterables): | |||
yield val | |||
|
|||
|
|||
@lru_cache(maxsize=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe with a maxsize?
name refactoring Co-authored-by: Luciano Bello <bel@zurich.ibm.com>
Summary
A few quick fixes for the general case of checking the commutation of two gates. This reduces the runtime to roughly a third on my machine. np.allclose and np.reshape seem to be particularly slow with the used tensor structures. I also think the matrix multiplication should be refactored -- it is certainly beneficial to write them up like this for large matrices but it seems for smaller matrices, as typical for quantum gates, normal tensor and matrix products are faster.
See also #8020.
Details and comments
Original runtime:

Updated:
