-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Improve Sabre heuristic defaults #14458
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit removes the scaling of the basic heuristic of Sabre inversely proportional to the front-layer population in all of Qiskit's defaults. Previously, the "basic" heuristic of Sabre had a scaling factor applied to it such that the _total_ contribution of the front layer to the full score of a swap remained approximately the same magnitude (scaled by the sums of the distances), no matter how many gates were in the layer. However, since the implementation of relative scoring[^1], we realise that only the score delta caused by a swap matters. A swap can affect at most two gates in the front layer, so scaling down the impact of this based on the population of the front layer only has the effect of allowing other heuristics to entirely dominate the "basic" component, when really the basic heuristic is the most critical for forwards progress in the algorithm. In the worst-case scenario of a front-layer populated with more than 40 gates, the extended set with a fixed size of 20 gates, a relative weight of 0.5 and similar population-based scaling would entirely override the front-layer heuristic, which would only be used as a tie-break. For such circuits, this caused the release-valve mechanism to trigger on almost every gate in the worst cases, completely destroying the performance and quality of the lookahead methods. This was observed in currently unrealistic circuits like a 1081-width, 10-depth volume circuit, and 400+ qubit QFTs, but is likely to have had some effect at smaller circuits. [^1]: 02a1939: Calculate relative, not absolute, scores in SabreSwap
One or more of the following people are relevant to this code:
|
Pull Request Test Coverage Report for Build 15309357816Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM. the change makes sense to me. I'll spin up a benchpress run with this and compare it against the 2.0.2 data I generated for testing it fixed #14402
heuristic = ( | ||
Heuristic(attempt_limit=10 * coupling_map.size()) | ||
.with_basic(1.0, SetScaling.Size) | ||
.with_basic(1.0, SetScaling.Constant) | ||
.with_lookahead(0.5, 20, SetScaling.Size) | ||
.with_decay(0.001, 5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're still using the decay here? I'm surprised you didn't want to remove it because I thought it didn't help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't done the research legwork to properly make that claim, so I'd rather leave it and do it in a separate PR once I can back it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a benchpress run locally and compared it to my 2.0.2 run from last week and compared to the 2q gate count:
For the most part this looks like things are behaving the same, it's hard to say for certain though without a statistical sample over multiple seeds which benchpress doesn't provide. But, I think the analysis in the commit message and PR summary is convincing that this is a good change to make even if it doesn't impact the swap count substantially in some benchmarks. There isn't a clear regression highlighted and that's more important.
The most outlying improvement dot, in line ~1000 2q gates in this PR and ~2600 gates in 2.0.2, is |
The other notable outlier (at ~75k 2q gates in this PR, ~150k 2q gates in 2.0.2) is |
* Improve Sabre heuristic defaults This commit removes the scaling of the basic heuristic of Sabre inversely proportional to the front-layer population in all of Qiskit's defaults. Previously, the "basic" heuristic of Sabre had a scaling factor applied to it such that the _total_ contribution of the front layer to the full score of a swap remained approximately the same magnitude (scaled by the sums of the distances), no matter how many gates were in the layer. However, since the implementation of relative scoring[^1], we realise that only the score delta caused by a swap matters. A swap can affect at most two gates in the front layer, so scaling down the impact of this based on the population of the front layer only has the effect of allowing other heuristics to entirely dominate the "basic" component, when really the basic heuristic is the most critical for forwards progress in the algorithm. In the worst-case scenario of a front-layer populated with more than 40 gates, the extended set with a fixed size of 20 gates, a relative weight of 0.5 and similar population-based scaling would entirely override the front-layer heuristic, which would only be used as a tie-break. For such circuits, this caused the release-valve mechanism to trigger on almost every gate in the worst cases, completely destroying the performance and quality of the lookahead methods. This was observed in currently unrealistic circuits like a 1081-width, 10-depth volume circuit, and 400+ qubit QFTs, but is likely to have had some effect at smaller circuits. [^1]: 02a1939: Calculate relative, not absolute, scores in SabreSwap * Restore test coverage and comments * Change release note category
This commit removes the scaling of the basic heuristic of Sabre inversely proportional to the front-layer population in all of Qiskit's defaults.
Previously, the "basic" heuristic of Sabre had a scaling factor applied to it such that the total contribution of the front layer to the full score of a swap remained approximately the same magnitude (scaled by the sums of the distances), no matter how many gates were in the layer. However, since the implementation of relative scoring1, we realise that only the score delta caused by a swap matters. A swap can affect at most two gates in the front layer, so scaling down the impact of this based on the population of the front layer only has the effect of allowing other heuristics to entirely dominate the "basic" component, when really the basic heuristic is the most critical for forwards progress in the algorithm.
In the worst-case scenario of a front-layer populated with more than 40 gates, the extended set with a fixed size of 20 gates, a relative weight of 0.5 and similar population-based scaling would entirely override the front-layer heuristic, which would only be used as a tie-break. For such circuits, this caused the release-valve mechanism to trigger on almost every gate in the worst cases, completely destroying the performance and quality of the lookahead methods. This was observed in currently unrealistic circuits like a 1081-width, 10-depth volume circuit, and 400+ qubit QFTs, but is likely to have had some effect at smaller circuits.
Summary
Details and comments
Benchmarking a small selection of indicative quality trackers:
The ~15% regression in a coupling of the utility-scale benchmarks are not concerning; this is single-seed runs, so there's some variation. What's we care about is the improvement of
lookahead
(and to a lesser degreedecay
, whose flaws are now more exposed) on the large-scale benchmarks.This graph is showing where the improvement is coming from. for something approximately equivalent to the
LargeQuantumVolumeMapping
suite. Top graph is this PR, bottom is its parent. The x-axis is "iterations", which is the number of times the swap-scorer is invoked (so it's a measure of progress through the algorithm). Solid traces are the number of candidate swaps (that line was me tracing information for another change I want to make). The dashed traces are the number of gates remaining to route. Vertical dotted lines mark points where the release valve triggers.The takeaway is that the previous
lookahead
anddecay
defaults were completely broken at this scale; they stymie the algorithm from actually making forwards progress. The new form fixes them.I have several more takeaways and improvements to follow on from this preliminary analysis, but this is a first step that fixes our super-scaling.
Footnotes
02a1939: Calculate relative, not absolute, scores in SabreSwap ↩