This repository was archived by the owner on Aug 19, 2023. It is now read-only.

nkanazawa1989
Contributor

Summary

This PR renews the pulse benchmarks using the modern pulse grammar. This invalidates the old performance history.

The current pulse benchmarks target pulse programs written as Schedule, but this approach will soon be discouraged. Pulse programs are usually built with the pulse builder, which outputs ScheduleBlock. Such programs are then lowered to Schedule at execution time. Measuring performance on Schedule therefore doesn't give us a practical measure of our pulse SDK.

Details and comments

The new benchmark adds the following code.

  1. load_pulse_defaults.py

This benchmark measures the speed of loading calibration data from JSON, which a backend usually provides as command definitions. Recently, owing to the increase in qubit counts, the loading speed of calibration data has become critical. This test is newly added to track improvements to the loading logic.

To prevent artifacts caused by fake provider updates (especially to calibration data) in terra, dedicated fake data is introduced in the file. The generator assumes a 2Q device, but can add an arbitrary number of random_gate entries, each consisting of a single waveform with frame changes.

PulseDefaultsBench measures loading speed while varying the number of random gates, and CircuitSchedulingBench measures circuit -> schedule conversion speed on top of the new fake data. The latter test replaces ScheduleToInstructionBench, which depended on the fake 2Q pulse backend in terra.
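
As a rough illustration of the structure described above, the sketch below shows an asv-style parameterized benchmark built on a self-contained data generator. The names gen_source and PulseDefaultsBench come from this PR, but the body of gen_source, the dictionary layout, the parameter values, and the load_defaults stand-in are assumptions made for this example; the actual benchmark exercises terra's pulse defaults loading rather than this stand-in.

```python
import random


def gen_source(num_random_gate, seed=0):
    """Build fake calibration data for a 2Q device (hypothetical layout)."""
    rng = random.Random(seed)  # fixed seed keeps the fixture stable across runs
    cmd_def = []
    for i in range(num_random_gate):
        qubit = i % 2
        channel = f"d{qubit}"
        cmd_def.append(
            {
                "name": f"random_gate_{i}",
                "qubits": [qubit],
                # A single waveform surrounded by frame changes.
                "sequence": [
                    {"name": "fc", "ch": channel, "t0": 0, "phase": rng.random()},
                    {"name": f"waveform_{i}", "ch": channel, "t0": 0},
                    {"name": "fc", "ch": channel, "t0": 160, "phase": rng.random()},
                ],
            }
        )
    return {"cmd_def": cmd_def}


def load_defaults(source):
    """Stand-in loader (assumption): index command definitions by (name, qubits)."""
    return {
        (cmd["name"], tuple(cmd["qubits"])): cmd["sequence"]
        for cmd in source["cmd_def"]
    }


class PulseDefaultsBench:
    # asv calls setup() and each time_* method once per parameter value.
    params = [0, 10, 100, 1000]
    param_names = ["number of random gates"]

    def setup(self, num_random_gate):
        # Fixture generation lives in setup() so it is excluded from the timing.
        self.source = gen_source(num_random_gate)

    def time_building_defaults(self, _):
        load_defaults(self.source)
```

Generating the data inside the benchmark file, rather than importing a fake backend, is what keeps the measured history independent of calibration snapshot updates.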

  2. schedule_construction.py

The tests inside this file are renewed. The new tests consist of EchoedCrossResonanceConstructionBench and ParameterizedScheduleBench. These tests aim to benchmark ScheduleBlock performance rather than Schedule. The reference mechanism (a block can manage an external reference or subroutine as if managing parameters) is also tested. ParameterizedScheduleBench assumes a calibration experiment, where we scan a particular pulse parameter in inplace=False mode and the pulse schedule is usually fully parameterized. Parameters are assigned to a flat schedule, a referenced schedule, and a pulse gate to cover various situations.
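
The parameter scan that ParameterizedScheduleBench targets can be pictured with the toy model below. Param, ToySchedule, and this assign_parameters method are hypothetical stand-ins written for this sketch, not terra classes; they only mimic the inplace=False behavior, where each assignment returns a new bound copy and leaves the parameterized template untouched.

```python
import copy
from dataclasses import dataclass


@dataclass(frozen=True)
class Param:
    """Hypothetical unbound parameter, identified by its name."""
    name: str


class ToySchedule:
    """Hypothetical stand-in for a fully parameterized pulse program."""

    def __init__(self, instructions):
        # Each instruction is a dict whose values may be Param placeholders.
        self.instructions = instructions

    def assign_parameters(self, value_dict, inplace=True):
        # With inplace=False the template is copied first, so it can be reused
        # for the next point of the scan.
        target = self if inplace else copy.deepcopy(self)
        for inst in target.instructions:
            for key, val in inst.items():
                if isinstance(val, Param) and val in value_dict:
                    inst[key] = value_dict[val]
        return target


amp = Param("amp")
template = ToySchedule([{"pulse": "gaussian", "duration": 160, "amp": amp}])

# One bound copy per scan point; the copy cost is part of what gets measured.
scan = [template.assign_parameters({amp: a}, inplace=False) for a in (0.1, 0.2, 0.3)]
```

In this model the per-point cost is dominated by the copy, which is why a scan with inplace=False is a useful stress case for a fully parameterized program.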

  3. schedule_lowering.py

Executing a program requires conversion from ScheduleBlock to Schedule, and this file includes a test for it. A random and sufficiently complicated pulse program is prepared and processed by target_qobj_transform. Note that although this is the standard transformer function, it is not well designed and the logic itself could change in the future.
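
To make the block -> schedule conversion concrete, here is a toy lowering pass. It is a sketch under simplifying assumptions (only two alignment kinds, no parameters or references), and the Play, Block, and lower names are invented for this example; it does not reproduce target_qobj_transform.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Play:
    """Hypothetical instruction occupying one channel for a fixed duration."""
    channel: str
    duration: int


@dataclass
class Block:
    """Hypothetical nested block; alignment is "sequential" or "left"."""
    alignment: str
    children: list = field(default_factory=list)


def lower(node):
    """Flatten a node into ((t0, Play) pairs relative to its start, duration)."""
    if isinstance(node, Play):
        return [(0, node)], node.duration
    if node.alignment not in ("sequential", "left"):
        raise ValueError(f"unknown alignment: {node.alignment}")
    placed = []
    free_at = {}  # channel -> first free time inside this block
    end = 0
    for child in node.children:
        insts, duration = lower(child)
        channels = {inst.channel for _, inst in insts}
        if node.alignment == "sequential":
            t0 = end  # start after everything placed so far
        else:
            # "left": start as early as the child's own channels allow
            t0 = max((free_at.get(ch, 0) for ch in channels), default=0)
        placed.extend((t0 + t, inst) for t, inst in insts)
        for ch in channels:
            free_at[ch] = t0 + duration
        end = max(end, t0 + duration)
    return placed, end


program = Block(
    "sequential",
    [
        Play("d0", 160),
        Block("left", [Play("d0", 100), Play("d1", 50), Play("d1", 30)]),
    ],
)
schedule, total = lower(program)
```

The block form stores only relative ordering, so absolute start times have to be resolved recursively at lowering time; that resolution is the kind of work the lowering benchmark is meant to track.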

- Add pulse defaults loading test. This mainly measures instmap construction.
- Add lowering test. This measures conversion of block -> schedule.
- Add ECR building tests with three major approaches.
- Add parameterized block test with parameter scan, assuming calibration
Member

@mtreinish mtreinish left a comment

These benchmarks look great to me, thanks for doing this. I have one question inline about the pulse defaults benchmarks, but it's not really a blocker; we can always add it in a follow-up too.

Comment on lines +541 to +542
def setup(self, num_random_gate):
    self.source = gen_source(num_random_gate)
Member

So the slowest of these takes ~500 ms for me running benchmarks locally. Since this is a critical concern for performance, I'm wondering if this is large enough to showcase the issues we're hitting. Does the conversion scale with the number of gate definitions in absolute terms, or is it also a function of qubits?

I'm wondering if we should just use FakeWashington too. I saw in the commit message that you mentioned keeping a stable base, and I normally agree with that (which is why some fake backends are vendored in the code here). But in practice I don't think we're likely to ever change the snapshot of FakeWashington unless a big backwards-incompatible change was made to the device configuration that we wanted to ensure we tested (the only time that happened historically was the move from u1, u2, u3 to sx, x, rz).

Contributor Author

@nkanazawa1989 nkanazawa1989 Sep 25, 2022

I found that these backends still report u* gates, and parsing string parameters is the heavy overhead. I also included U3 gates and random_gate entries with string parameters (this also checks the efficiency of the pulse library) in this benchmark, so I think it is sensitive enough to improvements in the parser (currently we are doing a sort of overengineering). I don't know why backends still report u* gate calibrations, but they could be removed in the future, which would produce a drastic improvement in the benchmark numbers without any actual performance change. This is why I hesitate to use fake backends in the benchmark. The test time increases with the number of qubits and instructions, but I think the scaling is linear unless the machine runs out of memory.

Member

That's fine with me; then I'm all good with these benchmarks (we can also expand things in the future if needed).

@mtreinish mtreinish added the automerge label (This PR will automatically merge once its CI has passed) Sep 26, 2022
@mergify mergify bot merged commit ff3c249 into Qiskit:master Sep 26, 2022
jakelishman pushed a commit to jakelishman/qiskit-terra that referenced this pull request Aug 1, 2023
jakelishman pushed a commit to jakelishman/qiskit-terra that referenced this pull request Aug 11, 2023