-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Fix MergeAdjacentBarriers
to scale with number of barriers
#10900
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
One or more of the the following people are requested to review this:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in here look sensible to me, one copy
aside.
Fundamentally, unless we can "jump" to only the barriers, I don't see how we could implement this as linear in the number of barriers rather than the number of all operations, because we have to inspect each operation to find the barriers. The current setup is the construction-space improvement we could have expected, though.
There's a larger-scale change we could consider making in a more wide-ranging PR: add an analysis pass that adds name: set[DAGOpNode]
data to the property set for given names, and then update relevant passes to say whether they preserve that information or not. That doesn't solve the problem of the analysis step being linear, but it's possible that it could ameliorate the cost if we know we might need to run the same sort of transformation pass multiple times. It's not clear to me that that would be helpful in this case, but maybe it's worth thinking about (in the context of an entire transpilation pipeline).
Pull Request Test Coverage Report for Build 6312989890
💛 - Coveralls |
releasenotes/notes/improve-performance-merge-adjacent-barriers-pass-c0cd064bdd059811.yaml
Outdated
Show resolved
Hide resolved
This is true, I'm also thinking shorter term we might be able to rely on |
|
Yeah exactly, it's still linear time overhead just hopefully a bit faster by doing more work in rust. |
I can give |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me now. I was concerned at first about the determinism of the output, since we appear to be removing the sorting, but DAGCircuit.replace_block_with_op
is required to impose a canonical order on the qargs/cargs (which it does via sorting using wire_pos_map
) in order to have general-case non-symmetric replacements work. We also have an explicit test for non-determinism of the pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM, just one small inline performance tweak we can make to slightly improve it further.
Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
This is somewhat off-topic, but we could reimplement
Under the hood this probably does a bit more work than |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sasha's point is interesting to me, and as we scale up, I think we might want to start trying to consolidate more of our passes into general utilities than having specific individual passes, if not only to lower the maintenance burden. This pass shouldn't be performance critical, so a minor regression on speed in favour of less to maintain sounds like it could make sense to me.
That said, this PR is good to go following Matt's last review and Elena's changes, so let's merge and see if we want to pull on the other thread later.
…10900) * Modify scaling * Remove copy, add reno * Fix lint * Fix reno * Update qiskit/transpiler/passes/utils/merge_adjacent_barriers.py Co-authored-by: Matthew Treinish <mtreinish@kortar.org> * Fix black --------- Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
Summary
Proposed fix for #10860. I found that the easiest path would be to return the nodes to be replaced in the
_collect_potential_merges
method, which I also modified to match the documented return type, as it can now return barriers directly (it was returning the barrier qubits before).Details and comments
I tried profiling the before/after with this circuit (modifying the number of hadamards):
And the new version is faster in general but still scales linearly with the number of gates, I guess this is still somewhat expected?