-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Improve efficiency of vf2 pass search with free nodes #9148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit improves the efficiency of the VF2Layout and VF2PostLayout pass when there are qubits that only contain single qubit operations. Previously these passes would model these qubits as free nodes in the interaction graph. This would cause an explosion in the number of possible isomorphic mappings because vf2_mapping() will return a potential subgraph for every permutation of free nodes on the coupling graph. This means we end up spending a potentially huge amount of time scoring these permutations when we can more effectively place these free nodes as part of a dedicated step after placing the subgraph. This is only true for strict_direction=False because for strict_direction=True with a Target we need to rely on the subgraph isomorphism to ensure we're fully evaluating subgraphs comply with local operation availablility. When there are free nodes in an interaction graph this changes the interaction graph to strip those out of the graph before checking for isomorphic subgraphs. This greatly reduces the search space and should speed up that iterative scoring. After we've selected the best subgraph of nodes with 2q interactions before returning the final layout we evaluate the free nodes and pick an available qubit with lowest 1q error for each free node.
Thank you for opening a new pull request. Before your PR can be merged it will first need to pass continuous integration tests and be reviewed. Sometimes the review process can be slow, so please be patient. While you're waiting, please feel free to review other open PRs. While only a subset of people are authorized to approve pull requests for merging, everyone is encouraged to review open pull requests. Doing reviews helps reduce the burden on the core team and helps make the project's code better for everyone. One or more of the the following people are requested to review this:
|
Pull Request Test Coverage Report for Build 3952363540
💛 - Coveralls |
Since I saw this pop up again, it might be nice to pick the free qubits based on the error that is worst. Namely, the 1Q errors are much smaller on IBM hardware than the measurement errors. So the latter as a score would make more sense. But that is not true in general, so one might have to query the device for which is worse. As most sane use of the transpiler would collapse multiple 1Q gates down, basing things on the meas error makes a lot of sense in most cases. |
Paul: 1q are much smaller, but a given circuit is also likely to have many 1q gates per measurement, so even in the situation you're describing, it's not necessarily a done deal that measurement will dominate, I think? For efficiency reasons in scoring, I think (but could be wrong) that it's impractical for us to produce the completely accurate score that One way is we could potentially look at treating "measure" and "1q gate" separately in the scoring, although as you say, measures aren't necessarily much worse on non-superconducting hardware. Alternatively, we could weight the average error per qubit proportional to the number of each instruction applied in the whole circuit. |
So, unless I am misunderstanding, this PR addresses individual qubits with nothing other than 1Q gates on them, and measurements (without the measurement I could care less that the qubit is there). The thing is that in 99% of cases, the transpiler is going to take your 1Q gates and collapse them to a U3 and spit it back out in terms of at most 2 sx gates (on IBM hardware). sx gate error is on the order of 1e-4, where as measurements are ~1e-2. So if one worried about errors, in most cases measurements are the thing to worry about in this case. |
Oh yeah, that is true - I wasn't actually thinking about the context of this PR requiring each of those qubits to be non-interacting. We do include the measurement in the weight (arithmetic mean of each of the 1q operations inc measure available on that qubit), though, and if the measurements dominate the gates as you expect, then the heuristic we use will largely decay to what you're suggesting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code-wise this seems fine to me, minor question about scope aside. Do you want to add a performance "feature" release note?
for node_index in bit_map.values(): | ||
bit_list[node_index] = sum(im_graph[node_index].values()) | ||
if strict_direction: | ||
for node_index in bit_map.values(): | ||
bit_list[node_index] = sum(im_graph[node_index].values()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This (and the bit in build_average_error_map
) look like a slightly separate bugfix. Is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, although it's semi related I think (it's been a while since I looked at this patch). The strict direction case doesn't use the common base between the passes in the same way (since vf2post has to do a lot more checking to ensure direcitonality constraints and target basis are preserved). Splitting out the 1q piece from things broke things IIRC (only vaguely remember) so we needed the if condition to branch the logic on that otherwise it failed.
If your taking the product of the 1Q and meas errors then all good. It looks like I was just interpreting this statement to mean only the latter was evaluated:
|
I didn't really feel it was necessary. I was treating this more of a small bugfix than a performance improvement TBH. But I can add one if you think it's necessary. |
Yeah this was me just being a bit too precise and not explaining the context, I meant the 1q terms in the average error map that gets built internally. As Jake said by default that's the arithmetic mean of all the 1q operations (including measurement errors) reported by the backend. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine without a release note too, no worries.
In Qiskit#9148 a bug was introduced into the vf2 scoring code. In that PR the vf2 passes were changed to treat standalone qubits as a special case to improve the algorithmic efficiency of the pass. However, that PR broke the scoring algorithm so that it was no longer factoring in the 1q component of the interaction graph when scoring a layout. This meant that when scoring a potential layout only the 2q error rates were been factored into the score and the pass was potentially selecting worse performing qubits. This commit fixes this error and ensures the scoring is looking at all the error rates.
* Fix interaction graph vf2 scoring to include 1q component In #9148 a bug was introduced into the vf2 scoring code. In that PR the vf2 passes were changed to treat standalone qubits as a special case to improve the algorithmic efficiency of the pass. However, that PR broke the scoring algorithm so that it was no longer factoring in the 1q component of the interaction graph when scoring a layout. This meant that when scoring a potential layout only the 2q error rates were been factored into the score and the pass was potentially selecting worse performing qubits. This commit fixes this error and ensures the scoring is looking at all the error rates. * Update qiskit/transpiler/passes/layout/vf2_utils.py
* Fix interaction graph vf2 scoring to include 1q component In #9148 a bug was introduced into the vf2 scoring code. In that PR the vf2 passes were changed to treat standalone qubits as a special case to improve the algorithmic efficiency of the pass. However, that PR broke the scoring algorithm so that it was no longer factoring in the 1q component of the interaction graph when scoring a layout. This meant that when scoring a potential layout only the 2q error rates were been factored into the score and the pass was potentially selecting worse performing qubits. This commit fixes this error and ensures the scoring is looking at all the error rates. * Update qiskit/transpiler/passes/layout/vf2_utils.py (cherry picked from commit 0e44c5e)
…10086) * Fix interaction graph vf2 scoring to include 1q component In #9148 a bug was introduced into the vf2 scoring code. In that PR the vf2 passes were changed to treat standalone qubits as a special case to improve the algorithmic efficiency of the pass. However, that PR broke the scoring algorithm so that it was no longer factoring in the 1q component of the interaction graph when scoring a layout. This meant that when scoring a potential layout only the 2q error rates were been factored into the score and the pass was potentially selecting worse performing qubits. This commit fixes this error and ensures the scoring is looking at all the error rates. * Update qiskit/transpiler/passes/layout/vf2_utils.py (cherry picked from commit 0e44c5e) Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
* Fix interaction graph vf2 scoring to include 1q component In Qiskit#9148 a bug was introduced into the vf2 scoring code. In that PR the vf2 passes were changed to treat standalone qubits as a special case to improve the algorithmic efficiency of the pass. However, that PR broke the scoring algorithm so that it was no longer factoring in the 1q component of the interaction graph when scoring a layout. This meant that when scoring a potential layout only the 2q error rates were been factored into the score and the pass was potentially selecting worse performing qubits. This commit fixes this error and ensures the scoring is looking at all the error rates. * Update qiskit/transpiler/passes/layout/vf2_utils.py
This commit address some of the panics and failures seen in scheduling tests. The source of the errors was two fold, the first was the layout definition coming from the mapping is sparse but we were using nlayout which doesn't handle that well so we ended up with several sentinel values persiting to scoring when scoring the layout which caused errors. The second issue was we were missing the special handling of free qubits that was added as a performance optimization in Qiskit#9148 which was causing other failures since some of the code was expecting that.
* Port the full vf2layout pass to Rust This commit provides a full path Rust implementation of the VF2Layout pass that performs the full pass logic in Rust. We had most of the pass in Rust already by leveraging rustworkx for its vf2 implementation and the scoring logic in Rust. So all that we did in Python was create the graphs as inputs to vf2, building the error map and other rust view objects (since the scoring in Rust predates our rust data model in Qiskit) and then process each found layout for final output. This means the performance improvement expected by this commit is not large since most of the heavy lifting was already in Rust. Instead what this really enables is running the pass in a Python-free context, mainly in a C API for executing this pass which will come in a follow up. This rust implementation does not cover the full feature set of what the Python implementation offers. Mainly in the case of no target, coupling graph randomization (which is a shame we forgot to remove in 2.0, since it almost always produces worse results), or a user provided error map we rely on the existing Python implementation. This is a tradeoff because these use cases are non-standard and having a Rust interface for running in that mode provides little benefit for the interfaces we're building. Fixes #12277 * Fix idle qubit handling in layout and free qubit mapping This commit address some of the panics and failures seen in scheduling tests. The source of the errors was two fold, the first was the layout definition coming from the mapping is sparse but we were using nlayout which doesn't handle that well so we ended up with several sentinel values persiting to scoring when scoring the layout which caused errors. The second issue was we were missing the special handling of free qubits that was added as a performance optimization in #9148 which was causing other failures since some of the code was expecting that. * Fix circuit->dag conversion to preserve vars * Normalize internal scoring on typed mapping This commit fixes several issues all stemming from the same root cause. There are multiple mappings of indices to qubits to keep track of and the previous commits in this PR branch were not leveraging the type system to keep these sane. This resulted in getting the mapping between representations incorrect at various points that was causing failures and/or panics when the mappings were not resolved correctly. This commit fixes the issue by moving to a typed representation of the layout as soon as posible. This forces all the places that reference a potential layout to reason about the various mappings correctly. * Remove log assertions for rust code path Now that the pass runs in rust we don't emit python logs for the running of the pass. This commit removes the test log assertions since they're no longer present. * Fix free qubit mapping This commit fixes the logic in the free qubit mapping and also skips running vf2 if there are no qubits in the interaction graph and there are free qubits. * Make Target public * Use average error map instead of repeated target querying The earlier approaches were querying the target for each gate while iterating over all the trials. This ends up doing far more hashmap lookups (2x for each gate on a node or edge for each layout) this ends up dominating the runtime. This also fixes an inconsistency in how the scoring worked because we weren't correctly accounting for the number of interactions in the circuit. * Don't panic in free qubits if there are no error rates * Reverse free qubit sort order to use lowest error first This commit fixes a bug that slipped in during an earlier fix where the sort order for the free qubits mapping was accidentally flipped. The intent is to put the lowest error free qubit at the end of the vec so it's the first popped from the stack when mapping it to a qubit with solely single qubit operations. However the comparison got reversed accidentally and that was causing the highest error rate qubits to be used instead. This commit fixes this oversight and fixes the one fialing visualization test that was using the wrong qubit compared to the expected one. * Rename internal scoring function since it doesn't use the target directly anymore * Stop caching sort key in free qubit mapping * Fix typ in comment
Summary
This commit improves the efficiency of the VF2Layout and VF2PostLayout pass when there are qubits that only contain single qubit operations. Previously these passes would model these qubits as free nodes in the interaction graph. This would cause an explosion in the number of possible isomorphic mappings because vf2_mapping() will return a potential subgraph for every permutation of free nodes on the coupling graph. This means we end up spending a potentially huge amount of time scoring these permutations when we can more effectively place these free nodes as part of a dedicated step after placing the subgraph. This is only true for strict_direction=False because for strict_direction=True with a Target we need to rely on the subgraph isomorphism to ensure we're fully evaluating subgraphs comply with local operation availablility.
When there are free nodes in an interaction graph this changes the interaction graph to strip those out of the graph before checking for isomorphic subgraphs. This greatly reduces the search space and should speed up that iterative scoring. After we've selected the best subgraph of nodes with 2q interactions before returning the final layout we evaluate the free nodes and pick an available qubit with lowest 1q error for each free node.
Details and comments
This will have merge conflicts with #9026, I'll rebase either or depending on which merges first.