Skip to content

Conversation

Sebastian-Brandhofer
Copy link

Summary

A few quick fixes for the general case of checking the commutation of two gates. This reduces the runtime to roughly a third on my machine. np.allclose and np.reshape seem to be particularly slow with the used tensor structures. I also think the matrix multiplication should be refactored -- it is certainly beneficial to write them up like this for large matrices but it seems for smaller matrices, as typical for quantum gates, normal tensor and matrix products are faster.

See also #8020.

Details and comments

Original runtime:
original

Updated:
update

@Sebastian-Brandhofer Sebastian-Brandhofer requested a review from a team as a code owner May 17, 2022 21:48
@qiskit-bot
Copy link
Collaborator

Thank you for opening a new pull request.

Before your PR can be merged it will first need to pass continuous integration tests and be reviewed. Sometimes the review process can be slow, so please be patient.

While you're waiting, please feel free to review other open PRs. While only a subset of people are authorized to approve pull requests for merging, everyone is encouraged to review open pull requests. Doing reviews helps reduce the burden on the core team and helps make the project's code better for everyone.

One or more of the the following people are requested to review this:

  • @Qiskit/terra-core

@CLAassistant
Copy link

CLAassistant commented May 17, 2022

CLA assistant check
All committers have signed the CLA.

Copy link
Member

@kdk kdk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on @Sebastian-Brandhofer ! It looks like there were some test failures, though all within the Operator tests, which I wouldn't expect to be impacted by the changes here: https://dev.azure.com/qiskit-ci/qiskit-terra/_build/results?buildId=36448&view=logs&j=e4ea27c2-3a16-5b32-7be8-0bf30fe9e828&t=25d46f44-c831-5e21-7716-9054a5afcf2d&l=17434

Would it be possible to compare performance between this PR and main over some random circuits, similar to what was done in #8078 ?

@Sebastian-Brandhofer
Copy link
Author

Sure, the runtime is roughly halved:

#qubits #gates depth #edges previous time new time
5 100 41 138 0.26172518730163574 0.13296890258789062
5 500 216 722 3.178704023361206 1.8769340515136719
5 1000 411 1500 9.984090805053711 5.910406112670898
10 100 24 131 0.3885350227355957 0.22033214569091797
10 500 126 741 3.987920045852661 2.370765209197998
10 1000 253 1535 12.320502042770386 7.292937994003296
20 100 15 100 0.5815088748931885 0.33023500442504883
20 500 68 714 6.411149024963379 3.793527126312256
20 1000 136 1564 18.472851991653442 10.613489389419556
50 100 9 55 0.533311128616333 0.30852603912353516
50 500 35 659 11.17283296585083 6.6091039180755615
50 1000 61 1464 34.09161925315857 19.603893041610718

@Sebastian-Brandhofer
Copy link
Author

The code seems to have some issues with the test runner though. E.g. the tests

  • test.python.quantum_info.operators.test_operator.TestOperator.test_from_circuit_constructor_reverse_embedded_layout
  • test.python.quantum_info.operators.test_operator.TestOperator.test_circuit_init
    fail on my local machine if I start the tests with
    tox -epy37
    however, they pass if the tests are run individually e.g.
    tox -epy37 -- -n test.python.quantum_info.operators.test_operator.TestOperator.test_circuit_init
    Can you have a look @mtreinish, @chriseclectic ?

@alexanderivrii
Copy link
Member

Thanks @Sebastian-Brandhofer. I don't fully understand the reason for test failures above, but calling np.reshape as in

op1 = np.reshape(node1.op.to_matrix(), (2, 2) * len(qarg1))

copies the array, while setting .shape directly as in

op1 = node1.op.to_matrix()
op1.shape = (2, 2) * len(qarg1)

modifies the array in place. Which means that your code may inadvertently modify matrices for already existing gates. Which might explain why the test failures are only exhibited when running multiple tests and not just the last problematic test.

It's also interesting that executing the following code:

a = _get_identity(4)
print(f"{a.shape = }")

b = a
b.shape = (2, 2, 2, 2)
print(f"{b.shape = }")

c = _get_identity(4)
print(f"{c.shape = }")

gives the output:

a.shape = (4, 4)
b.shape = (2, 2, 2, 2)
c.shape = (2, 2, 2, 2)

That is, the first call to _get_identity(4) returns the matrix with shape (4, 4), but the next call to _get_identity(4) returns the matrix with shape (2, 2, 2, 2) because of lrucache and b.shape = ..... Though this is something that you may want to have here.

And a bit of an off-topic question. What did you use to produce such wonderful graphics above? I have recently started using SnakeViz, but the graphics above look better. Additionally, are your runtimes with profiler enabled or disabled?

@Sebastian-Brandhofer
Copy link
Author

Sebastian-Brandhofer commented May 19, 2022

Thanks @alexanderivrii, that is probably the reason for the test fails. I expected to_matrix to return a copy. The runtimes are with a profiler disabled (and also without your code) and the graphs are called flame graphs ( https://github.com/baverman/flameprof ) which you can generate out of cProfile's output. :-)

@Sebastian-Brandhofer
Copy link
Author

New runtime using np.reshape instead of .shape

#qubits #gates depth #edges previous time new time
5 100 41 138 0.34026026725769043 0.1481037139892578
5 500 216 722 3.0843498706817627 1.9479050636291504
5 1000 411 1500 10.775328874588013 8.38437795639038
5 5000 2081 7144 271.8974268436432 157.30398511886597
10 100 24 131 0.41545796394348145 0.3056662082672119
10 500 126 741 3.928831100463867 2.4933621883392334
10 1000 253 1535 11.923341751098633 7.786161184310913
10 5000 1201 7358 253.9152159690857 163.85867476463318
20 100 15 100 0.636815071105957 0.43134188652038574
20 500 68 714 5.8873608112335205 3.763284206390381
20 1000 136 1564 16.975779056549072 10.914414882659912
20 5000 653 7617 285.21893310546875 183.0057098865509
50 100 9 55 0.5619361400604248 0.39586400985717773
50 500 35 659 11.628947973251343 6.560704946517944
50 1000 61 1464 37.69327211380005 20.084483861923218
50 5000 300 7741 388.1853427886963 249.665668964386

Comment on lines 637 to 646
return np.allclose(op12, op21)
maybe_zero = np.abs(op12 - op21)
maybe_zero[np.abs(maybe_zero) < 1e-7] = 0
return not maybe_zero.any()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change require?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

np.allclose seems to be much slower in this use case than lines 644-646.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and the double np.abs?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! Yeah, thanks for spotting this! :)

@coveralls
Copy link

coveralls commented May 31, 2022

Pull Request Test Coverage Report for Build 2415450922

  • 8 of 8 (100.0%) changed or added relevant lines in 1 file are covered.
  • 31 unchanged lines in 5 files lost coverage.
  • Overall coverage increased (+0.02%) to 84.393%

Files with Coverage Reduction New Missed Lines %
qiskit/transpiler/passes/layout/csp_layout.py 1 98.04%
qiskit/transpiler/passmanager_config.py 2 94.12%
qiskit/utils/optionals.py 2 94.44%
qiskit/quantum_info/operators/symplectic/base_pauli.py 11 87.43%
qiskit/version.py 15 76.85%
Totals Coverage Status
Change from base Build 2349198842: 0.02%
Covered Lines: 54565
Relevant Lines: 64656

💛 - Coveralls

@1ucian0 1ucian0 changed the title quick DAGDependency performance fixes [unitaryhack] DAGDependency performance improvement May 31, 2022
@HuangJunye HuangJunye added the Community PR PRs from contributors that are not 'members' of the Qiskit repo label Jun 21, 2022
@@ -567,6 +568,12 @@ def merge_no_duplicates(*iterables):
yield val


@lru_cache(maxsize=None)
Copy link
Member

@1ucian0 1ucian0 Jul 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe with a maxsize?

name refactoring

Co-authored-by: Luciano Bello <bel@zurich.ibm.com>
@1ucian0 1ucian0 self-assigned this Apr 17, 2023
@1ucian0 1ucian0 added the unitaryhack Issues/PR participating (now or in the past) in the UnitaryHack event see https://unitaryhack.dev/ label May 4, 2023
@1ucian0 1ucian0 removed the unitaryhack Issues/PR participating (now or in the past) in the UnitaryHack event see https://unitaryhack.dev/ label Sep 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: None Do not include in changelog Community PR PRs from contributors that are not 'members' of the Qiskit repo performance
Projects
Status: unitary hack
Development

Successfully merging this pull request may close these issues.

9 participants