Skip to content

Conversation

mtreinish
Copy link
Member

Summary

This commit improves the efficiency of the VF2Layout and VF2PostLayout pass when there are qubits that only contain single qubit operations. Previously these passes would model these qubits as free nodes in the interaction graph. This would cause an explosion in the number of possible isomorphic mappings because vf2_mapping() will return a potential subgraph for every permutation of free nodes on the coupling graph. This means we end up spending a potentially huge amount of time scoring these permutations when we can more effectively place these free nodes as part of a dedicated step after placing the subgraph. This is only true for strict_direction=False because for strict_direction=True with a Target we need to rely on the subgraph isomorphism to ensure we're fully evaluating subgraphs comply with local operation availablility.

When there are free nodes in an interaction graph this changes the interaction graph to strip those out of the graph before checking for isomorphic subgraphs. This greatly reduces the search space and should speed up that iterative scoring. After we've selected the best subgraph of nodes with 2q interactions before returning the final layout we evaluate the free nodes and pick an available qubit with lowest 1q error for each free node.

Details and comments

This will have merge conflicts with #9026, I'll rebase either or depending on which merges first.

This commit improves the efficiency of the VF2Layout and VF2PostLayout
pass when there are qubits that only contain single qubit operations.
Previously these passes would model these qubits as free nodes in the
interaction graph. This would cause an explosion in the number of
possible isomorphic mappings because vf2_mapping() will return a
potential subgraph for every permutation of free nodes on the coupling
graph. This means we end up spending a potentially huge amount of time
scoring these permutations when we can more effectively place these free
nodes as part of a dedicated step after placing the subgraph. This is
only true for strict_direction=False because for strict_direction=True
with a Target we need to rely on the subgraph isomorphism to ensure
we're fully evaluating subgraphs comply with local operation
availablility.

When there are free nodes in an interaction graph this changes the
interaction graph to strip those out of the graph before checking for
isomorphic subgraphs. This greatly reduces the search space and should
speed up that iterative scoring. After we've selected the best subgraph
of nodes with 2q interactions before returning the final layout we
evaluate the free nodes and pick an available qubit with lowest 1q
error for each free node.
@mtreinish mtreinish added performance Changelog: None Do not include in changelog mod: transpiler Issues and PRs related to Transpiler labels Nov 16, 2022
@mtreinish mtreinish requested a review from a team as a code owner November 16, 2022 22:38
@qiskit-bot
Copy link
Collaborator

Thank you for opening a new pull request.

Before your PR can be merged it will first need to pass continuous integration tests and be reviewed. Sometimes the review process can be slow, so please be patient.

While you're waiting, please feel free to review other open PRs. While only a subset of people are authorized to approve pull requests for merging, everyone is encouraged to review open pull requests. Doing reviews helps reduce the burden on the core team and helps make the project's code better for everyone.

One or more of the the following people are requested to review this:

  • @Qiskit/terra-core

@coveralls
Copy link

coveralls commented Nov 16, 2022

Pull Request Test Coverage Report for Build 3952363540

  • 41 of 50 (82.0%) changed or added relevant lines in 5 files are covered.
  • 3 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.04%) to 84.837%

Changes Missing Coverage Covered Lines Changed/Added Lines %
qiskit/transpiler/passes/layout/vf2_utils.py 25 26 96.15%
src/error_map.rs 4 7 57.14%
src/vf2_layout.rs 8 13 61.54%
Files with Coverage Reduction New Missed Lines %
qiskit/pulse/library/waveform.py 3 91.67%
Totals Coverage Status
Change from base Build 3952362369: -0.04%
Covered Lines: 65932
Relevant Lines: 77716

💛 - Coveralls

@mtreinish mtreinish added this to the 0.23.0 milestone Jan 12, 2023
@mtreinish mtreinish linked an issue Jan 12, 2023 that may be closed by this pull request
@jakelishman jakelishman self-assigned this Jan 12, 2023
@mtreinish mtreinish added the Rust This PR or issue is related to Rust code in the repository label Jan 12, 2023
@nonhermitian
Copy link
Contributor

Since I saw this pop up again, it might be nice to pick the free qubits based on the error that is worst. Namely, the 1Q errors are much smaller on IBM hardware than the measurement errors. So the latter as a score would make more sense. But that is not true in general, so one might have to query the device for which is worse. As most sane use of the transpiler would collapse multiple 1Q gates down, basing things on the meas error makes a lot of sense in most cases.

@jakelishman
Copy link
Member

Paul: 1q are much smaller, but a given circuit is also likely to have many 1q gates per measurement, so even in the situation you're describing, it's not necessarily a done deal that measurement will dominate, I think? For efficiency reasons in scoring, I think (but could be wrong) that it's impractical for us to produce the completely accurate score that Target does make available, but as long as the calculation of the (qargs -> estimated error) map that we score off is done only once, I think there's plenty of scope for tweaking the heuristics that go into the "estimated error". It'd be best to keep that outside this PR, though - this doesn't change the actual scoring numbers, it just removes isolated 1q islands from the connectivity graph to be laid out, and assigns them in n lg n time based on the same "estimated error".

One way is we could potentially look at treating "measure" and "1q gate" separately in the scoring, although as you say, measures aren't necessarily much worse on non-superconducting hardware. Alternatively, we could weight the average error per qubit proportional to the number of each instruction applied in the whole circuit.

@nonhermitian
Copy link
Contributor

So, unless I am misunderstanding, this PR addresses individual qubits with nothing other than 1Q gates on them, and measurements (without the measurement I could care less that the qubit is there). The thing is that in 99% of cases, the transpiler is going to take your 1Q gates and collapse them to a U3 and spit it back out in terms of at most 2 sx gates (on IBM hardware). sx gate error is on the order of 1e-4, where as measurements are ~1e-2. So if one worried about errors, in most cases measurements are the thing to worry about in this case.

@jakelishman
Copy link
Member

Oh yeah, that is true - I wasn't actually thinking about the context of this PR requiring each of those qubits to be non-interacting. We do include the measurement in the weight (arithmetic mean of each of the 1q operations inc measure available on that qubit), though, and if the measurements dominate the gates as you expect, then the heuristic we use will largely decay to what you're suggesting.

Copy link
Member

@jakelishman jakelishman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code-wise this seems fine to me, minor question about scope aside. Do you want to add a performance "feature" release note?

Comment on lines -101 to +116
for node_index in bit_map.values():
bit_list[node_index] = sum(im_graph[node_index].values())
if strict_direction:
for node_index in bit_map.values():
bit_list[node_index] = sum(im_graph[node_index].values())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (and the bit in build_average_error_map) look like a slightly separate bugfix. Is that right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, although it's semi related I think (it's been a while since I looked at this patch). The strict direction case doesn't use the common base between the passes in the same way (since vf2post has to do a lot more checking to ensure direcitonality constraints and target basis are preserved). Splitting out the 1q piece from things broke things IIRC (only vaguely remember) so we needed the if condition to branch the logic on that otherwise it failed.

@nonhermitian
Copy link
Contributor

@jakelishman

If your taking the product of the 1Q and meas errors then all good. It looks like I was just interpreting this statement to mean only the latter was evaluated:

we evaluate the free nodes and pick an available qubit with lowest 1q error for each free node

@mtreinish
Copy link
Member Author

Code-wise this seems fine to me, minor question about scope aside. Do you want to add a performance "feature" release note?

I didn't really feel it was necessary. I was treating this more of a small bugfix than a performance improvement TBH. But I can add one if you think it's necessary.

@mtreinish
Copy link
Member Author

If your taking the product of the 1Q and meas errors then all good. It looks like I was just interpreting this statement to mean only the latter was evaluated:

we evaluate the free nodes and pick an available qubit with lowest 1q error for each free node

Yeah this was me just being a bit too precise and not explaining the context, I meant the 1q terms in the average error map that gets built internally. As Jake said by default that's the arithmetic mean of all the 1q operations (including measurement errors) reported by the backend.

Copy link
Member

@jakelishman jakelishman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine without a release note too, no worries.

@mergify mergify bot merged commit d4a63d9 into Qiskit:main Jan 18, 2023
@mtreinish mtreinish deleted the vf2-free-node-optimization branch March 21, 2023 19:08
mtreinish added a commit to mtreinish/qiskit-core that referenced this pull request May 5, 2023
In Qiskit#9148 a bug was introduced into the vf2 scoring code. In that PR the
vf2 passes were changed to treat standalone qubits as a special case to
improve the algorithmic efficiency of the pass. However, that PR
broke the scoring algorithm so that it was no longer factoring in the 1q
component of the interaction graph when scoring a layout. This meant
that when scoring a potential layout only the 2q error rates were been
factored into the score and the pass was potentially selecting worse
performing qubits. This commit fixes this error and ensures the scoring
is looking at all the error rates.
kevinhartman pushed a commit that referenced this pull request May 8, 2023
* Fix interaction graph vf2 scoring to include 1q component

In #9148 a bug was introduced into the vf2 scoring code. In that PR the
vf2 passes were changed to treat standalone qubits as a special case to
improve the algorithmic efficiency of the pass. However, that PR
broke the scoring algorithm so that it was no longer factoring in the 1q
component of the interaction graph when scoring a layout. This meant
that when scoring a potential layout only the 2q error rates were been
factored into the score and the pass was potentially selecting worse
performing qubits. This commit fixes this error and ensures the scoring
is looking at all the error rates.

* Update qiskit/transpiler/passes/layout/vf2_utils.py
mergify bot pushed a commit that referenced this pull request May 8, 2023
* Fix interaction graph vf2 scoring to include 1q component

In #9148 a bug was introduced into the vf2 scoring code. In that PR the
vf2 passes were changed to treat standalone qubits as a special case to
improve the algorithmic efficiency of the pass. However, that PR
broke the scoring algorithm so that it was no longer factoring in the 1q
component of the interaction graph when scoring a layout. This meant
that when scoring a potential layout only the 2q error rates were been
factored into the score and the pass was potentially selecting worse
performing qubits. This commit fixes this error and ensures the scoring
is looking at all the error rates.

* Update qiskit/transpiler/passes/layout/vf2_utils.py

(cherry picked from commit 0e44c5e)
mtreinish added a commit that referenced this pull request May 8, 2023
…10086)

* Fix interaction graph vf2 scoring to include 1q component

In #9148 a bug was introduced into the vf2 scoring code. In that PR the
vf2 passes were changed to treat standalone qubits as a special case to
improve the algorithmic efficiency of the pass. However, that PR
broke the scoring algorithm so that it was no longer factoring in the 1q
component of the interaction graph when scoring a layout. This meant
that when scoring a potential layout only the 2q error rates were been
factored into the score and the pass was potentially selecting worse
performing qubits. This commit fixes this error and ensures the scoring
is looking at all the error rates.

* Update qiskit/transpiler/passes/layout/vf2_utils.py

(cherry picked from commit 0e44c5e)

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
king-p3nguin pushed a commit to king-p3nguin/qiskit-terra that referenced this pull request May 22, 2023
* Fix interaction graph vf2 scoring to include 1q component

In Qiskit#9148 a bug was introduced into the vf2 scoring code. In that PR the
vf2 passes were changed to treat standalone qubits as a special case to
improve the algorithmic efficiency of the pass. However, that PR
broke the scoring algorithm so that it was no longer factoring in the 1q
component of the interaction graph when scoring a layout. This meant
that when scoring a potential layout only the 2q error rates were been
factored into the score and the pass was potentially selecting worse
performing qubits. This commit fixes this error and ensures the scoring
is looking at all the error rates.

* Update qiskit/transpiler/passes/layout/vf2_utils.py
mtreinish added a commit to mtreinish/qiskit-core that referenced this pull request Mar 20, 2025
This commit address some of the panics and failures seen in scheduling
tests. The source of the errors was two fold, the first was the layout
definition coming from the mapping is sparse but we were using nlayout
which doesn't handle that well so we ended up with several sentinel
values persiting to scoring when scoring the layout which caused errors.
The second issue was we were missing the special handling of free qubits
that was added as a performance optimization in Qiskit#9148 which was causing
other failures since some of the code was expecting that.
github-merge-queue bot pushed a commit that referenced this pull request Apr 7, 2025
* Port the full vf2layout pass to Rust

This commit provides a full path Rust implementation of the VF2Layout
pass that performs the full pass logic in Rust. We had most of the
pass in Rust already by leveraging rustworkx for its vf2
implementation and the scoring logic in Rust. So all that we did in
Python was create the graphs as inputs to vf2, building the error map
and other rust view objects (since the scoring in Rust predates our
rust data model in Qiskit) and then process each found layout for final
output. This means the performance improvement expected by this commit
is not large since most of the heavy lifting was already in Rust.
Instead what this really enables is running the pass in a Python-free
context, mainly in a C API for executing this pass which will come in a
follow up.

This rust implementation does not cover the full feature set of what the
Python implementation offers. Mainly in the case of no target, coupling
graph randomization (which is a shame we forgot to remove in 2.0, since it
almost always produces worse results), or a user provided error map we
rely on the existing Python implementation. This is a tradeoff because
these use cases are non-standard and having a Rust interface for running
in that mode provides little benefit for the interfaces we're building.

Fixes #12277

* Fix idle qubit handling in layout and free qubit mapping

This commit address some of the panics and failures seen in scheduling
tests. The source of the errors was two fold, the first was the layout
definition coming from the mapping is sparse but we were using nlayout
which doesn't handle that well so we ended up with several sentinel
values persiting to scoring when scoring the layout which caused errors.
The second issue was we were missing the special handling of free qubits
that was added as a performance optimization in #9148 which was causing
other failures since some of the code was expecting that.

* Fix circuit->dag conversion to preserve vars

* Normalize internal scoring on typed mapping

This commit fixes several issues all stemming from the same root cause.
There are multiple mappings of indices to qubits to keep track of and
the previous commits in this PR branch were not leveraging the type
system to keep these sane. This resulted in getting the mapping between
representations incorrect at various points that was causing failures
and/or panics when the mappings were not resolved correctly. This commit
fixes the issue by moving to a typed representation of the layout as
soon as posible. This forces all the places that reference a potential
layout to reason about the various mappings correctly.

* Remove log assertions for rust code path

Now that the pass runs in rust we don't emit python logs for the running
of the pass. This commit removes the test log assertions since they're
no longer present.

* Fix free qubit mapping

This commit fixes the logic in the free qubit mapping and also skips
running vf2 if there are no qubits in the interaction graph and there
are free qubits.

* Make Target public

* Use average error map instead of repeated target querying

The earlier approaches were querying the target for each gate while
iterating over all the trials. This ends up doing far more hashmap
lookups (2x for each gate on a node or edge for each layout) this ends
up dominating the runtime. This also fixes an inconsistency in how the
scoring worked because we weren't correctly accounting for the number of
interactions in the circuit.

* Don't panic in free qubits if there are no error rates

* Reverse free qubit sort order to use lowest error first

This commit fixes a bug that slipped in during an earlier fix where the
sort order for the free qubits mapping was accidentally flipped. The
intent is to put the lowest error free qubit at the end of the vec so
it's the first popped from the stack when mapping it to a qubit with
solely single qubit operations. However the comparison got reversed
accidentally and that was causing the highest error rate qubits to be
used instead. This commit fixes this oversight and fixes the one fialing
visualization test that was using the wrong qubit compared to the
expected one.

* Rename internal scoring function since it doesn't use the target directly anymore

* Stop caching sort key in free qubit mapping

* Fix typ in comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: None Do not include in changelog mod: transpiler Issues and PRs related to Transpiler performance priority: medium Rust This PR or issue is related to Rust code in the repository
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Long transpile time for simple circuits
5 participants