Tune performance of OneQubitEulerDecomposer #5915
Conversation
This commit slightly improves the performance of the OneQubitEulerDecomposer by avoiding the use of append(), which does extra validation of the arguments when constructing the output circuits. Instead, the _append() method is used, which does not have the validation overhead. The validation is not necessary here since we know how the circuit is constructed and the arguments when adding the gate are being set correctly (there is only 1 qubit anyway).
The np.isclose() function quickly becomes a bottleneck in the OneQubitEulerDecomposer class, as the majority of the operations are comparing angles to decide when to insert gates. However, we can avoid this overhead by switching from numpy's isclose function, which is optimized for arrays rather than floats [1][2], to the standard library math module's isclose() function. This commit makes this change and also uses intermediate variables to avoid duplicate calculations, which speeds up the decomposer.
To see the difference, here are profiles from running the decomposer as part of the 1q optimization pass during a transpile of a qv 7x7 model circuit. Without this PR: With this PR: The thing I'm not sure of right now is why |
The other thing I think we should look at doing (although not in this PR) is adding support to the decomposer for working in a |
According to the np.mod() docs [1], np.mod() is equivalent to the stdlib '%' operator in Python, but it is designed to work with numpy arrays instead of single values. This adds about an order of magnitude of overhead to the mod operations (on the order of 1 us vs 100 ns), which can add up because _mod2pi is called multiple times. However, Python's '%' operator is not ideal for working with floats [2] and can produce different results than expected. To avoid this but still improve performance, this commit switches to the stdlib math module's fmod() function, which produces the expected result and is only marginally slower than the '%' operator, while still avoiding the overhead of the numpy function.
[1] https://numpy.org/doc/stable/reference/generated/numpy.mod.html
[2] https://docs.python.org/3/library/math.html#math.fmod
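As a rough illustration of the difference between '%' and math.fmod() described above, the sketch below compares the two on a tiny negative dividend and wraps an angle into [-pi, pi]. The helper name wrap_angle is hypothetical and is not the decomposer's actual _mod2pi implementation; it only shows one way fmod() could be used for this kind of reduction.

```python
import math

TWO_PI = 2.0 * math.pi


def wrap_angle(angle: float) -> float:
    """Hypothetical helper: wrap an angle in radians into [-pi, pi]."""
    # math.fmod keeps the sign of the dividend, so shift in the direction
    # of the sign before reducing and shift back afterwards.
    if angle >= 0:
        return math.fmod(angle + math.pi, TWO_PI) - math.pi
    return math.fmod(angle - math.pi, TWO_PI) + math.pi


# '%' takes the sign of the divisor, fmod the sign of the dividend,
# so the two disagree for a tiny negative dividend.
percent_result = -1e-100 % 1e100
fmod_result = math.fmod(-1e-100, 1e100)
print(percent_result, fmod_result)

# Angles outside [-pi, pi] are folded back into that interval.
print(wrap_angle(1.5 * math.pi), wrap_angle(-1.5 * math.pi))
```

The first print shows the case mentioned in the math.fmod() documentation: the '%' result rounds to the divisor itself, while fmod() returns the dividend unchanged.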
Summary
This commit slightly improves the performance of the OneQubitEulerDecomposer by making two changes: avoiding the use of append() and avoiding the use of np.isclose().

The first change avoids append(), which does extra validation of the arguments when constructing the output circuits. Instead, the _append() method is used, which does not have the validation overhead. The validation is not necessary here since we know how the circuit is constructed and the arguments when adding the gate are being set correctly (there is only 1 qubit anyway).
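The trade-off between a validating and a non-validating append can be sketched with a toy class. Everything below (TinyCircuit, its checks, and build_fast) is hypothetical and is not Qiskit's QuantumCircuit; it only illustrates why skipping validation is safe when the caller fully controls the arguments.

```python
class TinyCircuit:
    """Toy single-register circuit; not Qiskit's QuantumCircuit."""

    def __init__(self, num_qubits: int) -> None:
        self.num_qubits = num_qubits
        self.data = []

    def append(self, gate, qubits) -> None:
        """Public path: validate the arguments, then store the gate."""
        if not isinstance(gate, str):
            raise TypeError("gate must be a string name")
        for q in qubits:
            if not isinstance(q, int) or not 0 <= q < self.num_qubits:
                raise ValueError(f"invalid qubit index: {q!r}")
        if len(set(qubits)) != len(qubits):
            raise ValueError("duplicate qubit arguments")
        self._append(gate, qubits)

    def _append(self, gate, qubits) -> None:
        """Fast path: store the gate, trusting the caller."""
        self.data.append((gate, tuple(qubits)))


def build_fast(gates):
    """Build a 1-qubit circuit where every gate acts on qubit 0."""
    circuit = TinyCircuit(1)
    for gate in gates:
        # The qubit argument is always [0], so the checks add nothing here.
        circuit._append(gate, [0])
    return circuit


fast = build_fast(["rz", "ry", "rz"])
print(fast.data)
```

In this sketch the fast path is only appropriate inside code that constructs the arguments itself; external callers would still go through the validating method.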
The np.isclose() function quickly becomes a bottleneck in the OneQubitEulerDecomposer class, as the majority of the operations are comparing angles to decide when to insert gates. However, we can avoid this overhead by switching from numpy's isclose function, which is optimized for arrays rather than floats [1][2], to the standard library math module's isclose() function. This commit makes this change and also uses intermediate variables to avoid duplicate calculations, which speeds up the decomposer.
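A minimal sketch of scalar angle comparison with math.isclose() follows. The helper names (needs_gate, gates_for), the tolerance of 1e-12, and the rz/ry/rz ordering are assumptions made for illustration, not the decomposer's actual code. One detail worth noting: math.isclose() defaults to rel_tol=1e-9 and abs_tol=0.0, whereas np.isclose() defaults to rtol=1e-5 and atol=1e-8, so an explicit abs_tol is needed when comparing against zero.

```python
import math
import timeit


def needs_gate(angle: float, atol: float = 1e-12) -> bool:
    """Hypothetical helper: True when the angle is not within atol of zero."""
    # A relative tolerance is meaningless against 0.0, so pass abs_tol.
    return not math.isclose(angle, 0.0, abs_tol=atol)


def gates_for(theta: float, phi: float, lam: float, atol: float = 1e-12):
    """Return the (name, angle) rotations that would be emitted."""
    emitted = []
    for name, angle in (("rz", lam), ("ry", theta), ("rz", phi)):
        if needs_gate(angle, atol):
            emitted.append((name, angle))
    return emitted


print(gates_for(0.0, 0.3, 1e-15))

# Rough timing of the scalar comparison; the number varies by machine.
elapsed = timeit.timeit(
    lambda: math.isclose(1e-13, 0.0, abs_tol=1e-12), number=10_000
)
print(f"10,000 math.isclose calls took {elapsed:.6f} s")
```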
Details and comments
[1] numpy/numpy#16160
[2] numpy/numpy#10161