
Conversation


@mtreinish mtreinish commented Sep 17, 2021

Summary

This commit updates the Layout and CouplingMap classes to tune them for
performance when used by transpiler passes. Right now the user-facing
API does a bunch of extra validation to provide helpful error messages,
and the internal data structures of both classes are hidden behind that
API. This can add a lot of overhead to transpiler passes (specifically
routing passes) that use these APIs in a loop. To address this, this
commit changes the internal properties to public attributes and adds
__slots__ to the classes to speed up direct attribute access. This will
let us tune transpiler passes to access these attributes more easily and
quickly. The follow-on step is to update any transpiler passes that use
the user-facing API to access the attributes directly.
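As background on why `__slots__` matters here, the pattern can be sketched in isolation. This is a hedged, self-contained illustration with made-up class names, not Qiskit's actual `Layout` code: a validating wrapper costs a Python function call per lookup, while a slotted attribute supports cheap direct access:

```python
import timeit


class PlainLayout:
    """Attribute access goes through a per-instance __dict__."""

    def __init__(self):
        self._v2p = {0: 1}

    def __getitem__(self, item):
        # Validating wrapper: convenient, but adds a function call per lookup.
        return self._v2p[item]


class SlottedLayout:
    """__slots__ removes the __dict__ and pins the attribute layout."""

    __slots__ = ("_v2p",)

    def __init__(self):
        self._v2p = {0: 1}


plain, slotted = PlainLayout(), SlottedLayout()
wrapped = timeit.timeit(lambda: plain[0], number=100_000)
direct = timeit.timeit(lambda: slotted._v2p[0], number=100_000)
# Direct slotted access typically wins; exact ratios vary by interpreter.
print(f"wrapped={wrapped:.4f}s direct={direct:.4f}s")
```

In CPython the direct slotted access avoids both the method-call overhead and the instance `__dict__` lookup; the win per access is small, but routing passes perform these lookups millions of times.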

Details and comments

Fixes #7035

TODO:

  • Benchmark passes using the new attributes (namely stochastic swap)
  • Update other passes to use new attributes

@mtreinish mtreinish requested a review from a team as a code owner September 17, 2021 14:43

@kdk kdk left a comment


I think we should do some more investigation before we go down this particular path. Directly renaming the internal attributes leaves us in a state where the objects are less well documented and, to some degree, harder to understand. (I always found ._p2v and ._v2p confusing as attribute names.)

e.g. Layout already has Layout.get_virtual_bits for direct access to ._v2p. Might changing the access pattern in sabre swap to use that instead of the __getitem__ resolve the performance issue? (__getitem__ is also not great for this use-case because it checks ._p2v first on every call, prior to checking ._v2p.)
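The suggested access-pattern change amounts to hoisting the mapping fetch out of the hot loop. A minimal sketch, where the toy `Layout` below only mimics the relevant shape of the real class (the `front` list and mapping contents are illustrative):

```python
class Layout:
    """Toy stand-in mimicking the relevant shape of Qiskit's Layout."""

    def __init__(self, v2p):
        self._v2p = v2p
        self._p2v = {p: v for v, p in v2p.items()}

    def __getitem__(self, item):
        # Checks one mapping before the other on every single call.
        if item in self._p2v:
            return self._p2v[item]
        return self._v2p[item]

    def get_virtual_bits(self):
        return self._v2p


layout = Layout({"q0": 3, "q1": 7})
front = ["q0", "q1", "q0"]

# Slow pattern: __getitem__ inside the loop, two membership checks per call.
slow = [layout[q] for q in front]

# Faster pattern: fetch the dict once, then do plain lookups in the loop.
v2p = layout.get_virtual_bits()
fast = [v2p[q] for q in front]

print(slow == fast)  # both give [3, 7, 3]
```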

@mtreinish

I think we should do some more investigation before we go down this particular path. Directly renaming the internal attributes leaves us in a state where the objects are less well documented and, to some degree, harder to understand. (I always found ._p2v and ._v2p confusing as attribute names.)

e.g. Layout already has Layout.get_virtual_bits for direct access to ._v2p. Might changing the access pattern in sabre swap to use that instead of the __getitem__ resolve the performance issue? (__getitem__ is also not great for this use-case because it checks ._p2v first on every call, prior to checking ._v2p.)

I still need to benchmark things here to see the difference this causes. But I expect it to have a modest improvement in most of the passes I've updated so far. I think the profiling from @georgios-ts in #7035 showed pretty clearly the cumulative overhead of wrapping an attribute with a pyfunction in an inner loop like in sabre (this also lines up with what we looked at in #6493 and the performance improvements that resulted from #6567).

I think what it comes down to for me is more how we name things. I don't mind keeping things private (I was just trying to avoid the _append/append split like in QuantumCircuit), but in passes where we know the lookup is valid and/or the structure is correctly formed, we should be accessing slotted attributes directly, especially as things scale up. So for me it was more a matter of either having a bunch of things like coupling_map._dist_matrix[layout._v2p[q_0]][layout._v2p[q_1]] all over our passes, or moving them to something better named (which I agree this change isn't doing) and public. I'm honestly fine either way.


@jakelishman jakelishman left a comment


I don't really have anything to add on the discussion of naming things (personally I've always disliked the a2b form in any library), but I'm always in favour of defining __slots__ for objects that are used a lot, and having an "I know what I'm doing, skip the checks" access pattern available for internal use, provided it's well documented that it's intended to be read-only.
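The "skip the checks" pattern described here is the same split QuantumCircuit uses with append/_append. A hypothetical sketch of that design, not the real QuantumCircuit API:

```python
class Circuit:
    """Illustrative sketch of the append/_append split, not Qiskit's code."""

    __slots__ = ("_data",)

    def __init__(self):
        self._data = []

    def append(self, op):
        # Public path: validate inputs to give friendly error messages.
        if not isinstance(op, str):
            raise TypeError(f"expected an op name, got {type(op).__name__}")
        self._append(op)

    def _append(self, op):
        # Fast path: the caller guarantees `op` is already valid.
        self._data.append(op)


circ = Circuit()
circ.append("h")
circ._append("cx")  # internal "I know what I'm doing" path, no checks
print(circ._data)  # ['h', 'cx']
```

The trade-off is exactly the documentation concern raised above: the fast path is only safe when its precondition is written down somewhere discoverable.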

@kdk

kdk commented Sep 20, 2021

I think the profiling from @georgios-ts in #7035 showed pretty clearly the cumulative overhead of wrapping an attribute with a pyfunction in an inner loop like in sabre (this also lines up with what we looked at in #6493 and the performance improvements that resulted from #6567).

I think what it comes down to for me is more how we name things, I don't mind keeping things private (I was just trying to avoid the _append/append thing like in QuantumCircuit) but I think in passes where we know that the lookup is valid and/or the structure is correctly formed we should be accessing slotted attributes directly, especially as things scale up.

Agree that @georgios-ts's numbers show there is room for improvement in the sabre case (though not whether that's the fault of our sabre implementation or of our CouplingMap/Layout objects). e.g. some rough numbers I found for @georgios-ts's test case (on python 3.6, without cProfile):

  • main: 300.69 s
  • Avoid layout.__getitem__ in favor of one call to layout.get_virtual_bits() in _compute_cost: 192.82 s
  • Changing CouplingMap.distance to try/except instead of first range-checking physical_qubit{1,2}: 153.51 s
  • Fetching self.coupling_map.distance_matrix once in SabreSwap.__init__ and re-using it in _compute_cost: 132.10 s
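Two of those wins are generic Python patterns. The sketch below uses a toy stand-in for CouplingMap (the class body and sizes are illustrative): EAFP (try/except) on the hot path instead of up-front range checks, and hoisting the matrix fetch out of the loop, in the spirit of the SabreSwap.__init__ change:

```python
class CouplingMap:
    """Toy stand-in; the real class caches a lazily built distance matrix."""

    def __init__(self, size):
        # Linear-chain distances, just to have something concrete to index.
        self._dist_matrix = [
            [abs(i - j) for j in range(size)] for i in range(size)
        ]

    def distance(self, p1, p2):
        # EAFP: attempt the lookup and translate the failure, rather than
        # range-checking both qubits on every call.
        try:
            return self._dist_matrix[p1][p2]
        except IndexError as exc:
            raise ValueError(f"qubit out of range: {p1}, {p2}") from exc


cmap = CouplingMap(5)
# Hoist the matrix once, then index it directly inside the hot loop.
dist = cmap._dist_matrix
total = sum(dist[a][b] for a, b in [(0, 4), (1, 3), (2, 2)])
print(total)  # 4 + 2 + 0 = 6
```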

My point was more that we should try to build APIs that are both fast and safe by design (so that we don't end up pushing these mutability and safety concerns out of our code and into our users' code), and that we document these safety and performance characteristics. That will help both us and our users develop best practices that eliminate some of the low-hanging fruit like the above.

So for me it was more a matter of having a bunch of things like coupling_map._dist_matrix[layout._v2p[q_0]][layout._v2p[q_1]] all over our passes or move it something better named (which I agree this isn't making the naming better) and public. I'm honestly fine either way

and having an "I know what I'm doing, skip the checks" access pattern available for internal use, provided it's well documented that it's intended to be read-only.

Agree, and I think the QuantumCircuit._append pattern has worked well in this way (though it, too, is insufficiently documented and discoverable). Also, it's worth keeping in mind that our passes are templates for other pass developers, who won't necessarily have the context to know when a call to e.g. coupling_map._dist_matrix or layout._p2v is "safe" (unless we document it somewhere).

@kdk kdk added this to the 0.19 milestone Oct 26, 2021
@mtreinish mtreinish changed the title Make internal layout and coupling map attributes public and slotted Make internal layout and coupling map slotted and adjust passes for fast access Oct 26, 2021
@mtreinish

I've updated this to not rename the attributes and leave them as private, just add the slotting and adjust the access pattern for speed in the passes where it was relevant (I might have missed some though).

@mtreinish mtreinish requested review from kdk and jakelishman October 26, 2021 16:45
@mtreinish mtreinish changed the title Make internal layout and coupling map slotted and adjust passes for fast access Make internal Layout and CouplingMap attrs slotted and adjust passes for fast access Oct 26, 2021
@mtreinish

mtreinish commented Oct 26, 2021

I ran some benchmarks with this PR:

Benchmarks that have improved:

       before           after         ratio
     [5652057a]       [adfe5cab]
     <main>       <tune-layout-coupling>
-         771±1ms          700±2ms     0.91  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'sabre', 'sabre')
-         423±1ms          383±1ms     0.91  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'stochastic', 'sabre')
-         471±1ms          425±4ms     0.90  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'sabre', 'sabre')
-       403±0.6ms        362±0.8ms     0.90  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'sabre', 'sabre')
-         1.86±0s       1.66±0.01s     0.89  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'sabre', 'sabre')
-         556±2ms          496±2ms     0.89  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'stochastic', 'sabre')
-         902±4ms          804±6ms     0.89  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'stochastic', 'sabre')
-         636±3ms          565±3ms     0.89  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'sabre', 'sabre')
-         798±2ms          703±5ms     0.88  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'stochastic', 'sabre')
-       503±0.8ms          439±1ms     0.87  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'sabre', 'sabre')
-         793±3ms          674±4ms     0.85  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'sabre', 'sabre')
-         731±1ms          616±1ms     0.84  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'sabre', 'sabre')

Benchmarks that have stayed the same:

       before           after         ratio
     [5652057a]       [adfe5cab]
     <main>       <tune-layout-coupling>
        291±0.6ms        296±0.4ms     1.02  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'stochastic', 'dense')
          726±3ms          737±3ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'stochastic', 'noise_adaptive')
          265±2ms          268±2ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'stochastic', 'noise_adaptive')
          264±1ms          267±2ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'stochastic', 'dense')
          357±2ms          360±1ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'stochastic', 'noise_adaptive')
          547±1ms        551±0.9ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'stochastic', 'dense')
          2.01±0s       2.03±0.01s     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'stochastic', 'dense')
        577±0.7ms          581±4ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'stochastic', 'noise_adaptive')
       1.01±0.01s       1.02±0.01s     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'stochastic', 'dense')
        242±0.8ms        243±0.9ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'stochastic', 'noise_adaptive')
        330±0.5ms          332±1ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'stochastic', 'noise_adaptive')
          509±2ms          511±2ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'stochastic', 'noise_adaptive')
          595±2ms          597±3ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'stochastic', 'dense')
          420±1ms          421±1ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'stochastic', 'noise_adaptive')
          437±1ms          438±3ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'stochastic', 'dense')
        367±0.3ms        368±0.9ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'stochastic', 'dense')
          698±4ms          700±1ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'sabre', 'noise_adaptive')
          373±1ms          374±1ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'stochastic', 'dense')
        223±0.5ms          223±2ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'sabre', 'noise_adaptive')
        400±0.7ms          401±1ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'stochastic', 'noise_adaptive')
        287±0.7ms          288±2ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'sabre', 'dense')
        279±0.6ms          280±2ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'sabre', 'noise_adaptive')
          332±4ms          332±1ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'stochastic', 'noise_adaptive')
       2.43±0.01s       2.43±0.02s     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'stochastic', 'dense')
       1.91±0.01s       1.91±0.01s     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'stochastic', 'noise_adaptive')
          1.58±0s       1.58±0.01s     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'sabre', 'noise_adaptive')
          1.70±0s       1.70±0.02s     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'stochastic', 'noise_adaptive')
          441±1ms          440±1ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'stochastic', 'dense')
        243±0.9ms        242±0.5ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'sabre', 'dense')
       1.58±0.02s          1.58±0s     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'sabre', 'dense')
          269±1ms          267±2ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'sabre', 'noise_adaptive')
          1.20±0s       1.19±0.01s     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'sabre', 'dense')
          684±2ms          678±1ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'sabre', 'dense')
          386±1ms        382±0.9ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'stochastic', 'dense')
          301±1ms        297±0.8ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'sabre', 'dense')
          1.10±0s       1.09±0.01s     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'sabre', 'noise_adaptive')
          270±1ms          266±1ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'sabre', 'dense')
          332±1ms          324±1ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'sabre', 'dense')
          473±1ms          459±1ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'sabre', 'dense')
        475±0.9ms        461±0.8ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'sabre', 'noise_adaptive')
          356±1ms          344±2ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'sabre', 'dense')
          264±2ms          255±2ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'sabre', 'noise_adaptive')
          429±1ms          413±4ms     0.96  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'sabre', 'dense')
          350±1ms        336±0.7ms     0.96  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'sabre', 'noise_adaptive')
       2.53±0.01s       2.43±0.01s     0.96  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'stochastic', 'sabre')
          402±1ms        385±0.3ms     0.96  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'sabre', 'noise_adaptive')
          324±2ms          310±2ms     0.96  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'sabre', 'noise_adaptive')
        346±0.7ms        329±0.7ms     0.95  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'sabre', 'noise_adaptive')
        376±0.9ms          356±3ms     0.95  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'sabre', 'dense')
          521±2ms          490±2ms     0.94  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'stochastic', 'sabre')
          975±3ms          915±4ms     0.94  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'stochastic', 'sabre')
          2.55±0s       2.38±0.04s     0.94  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'stochastic', 'sabre')
          869±4ms          809±4ms     0.93  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'stochastic', 'sabre')
          1.87±0s       1.74±0.01s     0.93  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'sabre', 'sabre')
          331±1ms        308±0.7ms     0.93  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'stochastic', 'sabre')
          896±2ms          828±3ms     0.92  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'sabre', 'sabre')
          428±2ms          394±1ms     0.92  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'sabre', 'sabre')
        311±0.5ms          287±2ms     0.92  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'sabre', 'sabre')
          493±9ms          451±1ms     0.91  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'stochastic', 'sabre')
          722±1ms          659±2ms     0.91  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'stochastic', 'sabre')

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

I ran the qualitative transpiler benchmarks mostly because they're the only ones that run with sabre and stochastic swap, but I don't think they're large enough to show the real benefit for sabre here.


@jakelishman jakelishman left a comment


This generally looks fine to me, and the speedups are good. Two-way dicts really feel like they should be a part of the standard library, or at least have a rock-solid external dependency, but that's not really related to this PR.
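For reference, the two-way dict idea is small enough to sketch; this is a minimal illustration, not a drop-in for Layout's _v2p/_p2v pair:

```python
class BidirDict:
    """Minimal two-way mapping sketch (not a drop-in for Layout)."""

    __slots__ = ("_fwd", "_bwd")

    def __init__(self):
        self._fwd = {}
        self._bwd = {}

    def __setitem__(self, key, value):
        # Drop any stale pairings so the two dicts stay mirror images.
        if key in self._fwd:
            del self._bwd[self._fwd[key]]
        if value in self._bwd:
            del self._fwd[self._bwd[value]]
        self._fwd[key] = value
        self._bwd[value] = key

    def forward(self, key):
        return self._fwd[key]

    def backward(self, value):
        return self._bwd[value]


mapping = BidirDict()
mapping["q0"] = 2
assert mapping.forward("q0") == 2
assert mapping.backward(2) == "q0"
```

The tricky part, as always with these structures, is keeping the two dicts consistent under re-assignment, which is part of why a well-tested external dependency would be nice.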

@@ -134,7 +134,6 @@ def _trivial_not_perfect(property_set):
# layout so we need to clear it before continuing.
if property_set["trivial_layout_score"] is not None:
if property_set["trivial_layout_score"] != 0:
property_set["layout"]._wrapped = None


These things are (presumably) part of the old FencedObject stuff that appeared at the start of the transpiler. I'm guessing it's being removed because the Layout became slotted and you didn't want to add the extra slot? Do we need the FencedPropertySet stuff at all really - is it useful for error checking?


@mtreinish mtreinish Oct 27, 2021


Yeah, I removed this because __slots__ meant I couldn't add an arbitrary attribute to the Layout object, and it didn't seem worth adding a slot for, especially since it wasn't universally used.

I think we can look into deprecating the FencedPropertySet. At one time it was used to ensure that a TransformationPass could not modify the property set, but that restriction was removed in #4387, so I'm not sure anything needs it anymore.

(edit: looking at the running passmanager object the flow controller gets a fenced property set to prevent a condition callable from modifying it. I'm not sure how important that is in practice, I've never seen an error like that come up before)
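For anyone following along, the reason _wrapped had to go follows directly from how __slots__ works: a slotted class has no per-instance __dict__, so assigning any attribute not named in __slots__ raises AttributeError. A minimal illustration with a toy class, not the real Layout:

```python
class SlottedLayout:
    """Toy slotted class; only the declared attributes can exist."""

    __slots__ = ("_v2p", "_p2v")

    def __init__(self):
        self._v2p = {}
        self._p2v = {}


layout = SlottedLayout()
layout._v2p["q0"] = 0  # declared slot: fine
try:
    layout._wrapped = None  # not in __slots__: rejected
except AttributeError as exc:
    print(f"rejected: {exc}")
```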



Sounds sensible to me!

jakelishman
jakelishman previously approved these changes Oct 27, 2021
mtreinish added a commit to mtreinish/qiskit-core that referenced this pull request Nov 1, 2021
This commit switches the default routing and layout methods for
optimization level 3 to the SabreSwap and SabreLayout passes. The
quality of results is typically better with sabre than with the
stochastic swap and dense layout defaults we're using now. For
optimization level 3, where we try to produce the best quality result
and runtime is a secondary concern, using sabre makes the most sense
(especially after Qiskit#7036, which improves its runtime performance).
Also, at optimization level 3 sabre is currently often faster in
practice, because we increase the number of trials for stochastic swap
(which is generally significantly faster per trial), slowing it down as
it does more work. This should improve both the quality and the speed
of results when running with optimization level 3. In the future we can
do more work to improve the runtime of the sabre passes and hopefully
make them fast enough to use for all the optimization levels, which
have tighter constraints on runtime than level 3.

Related to Qiskit#7112 and Qiskit#7200

@kdk kdk left a comment


Thanks for the updates here. Out of curiosity, do you have benchmark numbers for accessing Layout._{v2p,p2v} via their public methods (get_{virtual,physical}_bits), perhaps once outside the hot loops? I'd be interested to see exactly how much direct attribute access is gaining us, and whether or not a broader refactor of the Layout object for performance should be considered.

(I'm not a fan of updating all of our passes to access internal attributes because then our passes, and the passes our users develop based on them, will depend on something other than the stable API, and arguably would make a future refactor of Layout harder.)

jakelishman
jakelishman previously approved these changes Nov 18, 2021
@mtreinish mtreinish requested a review from kdk November 19, 2021 13:02

@kdk kdk left a comment


Thanks for the updates. Few more comments.


@kdk kdk left a comment


Thanks for the updates.

@kdk kdk added the automerge label Nov 19, 2021
@mergify mergify bot merged commit 1567e4e into Qiskit:main Nov 19, 2021
@mtreinish mtreinish deleted the tune-layout-coupling branch November 19, 2021 23:11
mergify bot added a commit that referenced this pull request Nov 20, 2021
* Switch default routing/layout method to sabre for opt level 3


* Clarify punctuation in release note

Co-authored-by: Jake Lishman <jake@binhbar.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
@kdk kdk added the Changelog: None Do not include in changelog label Dec 6, 2021