Update pulse benchmarking #1607
Conversation
- Add pulse defaults loading test. This mainly measures instmap construction.
- Add lowering test. This measures conversion of block -> schedule.
- Add ECR building tests with three major approaches.
- Add parameterized block test with parameter scan, assuming calibration experiments.
These benchmarks look great to me, thanks for doing this. I have one question inline about the pulse defaults benchmarks, but it's not really a blocker; we can always add it as a follow-up too.
def setup(self, num_random_gate):
    self.source = gen_source(num_random_gate)
So the slowest of these takes ~500 ms for me running the benchmarks locally. Since this is a critical concern for performance, I'm wondering if this is large enough to showcase the issues we're hitting. Does the conversion scale with the number of gate definitions in absolute terms, or is it also a function of the number of qubits?

I'm wondering if we should just use FakeWashington too. I saw in the commit message where you mentioned keeping a stable base, and I normally agree with that (which is why some fake backends are vendored in the code here). But in practice I don't think we're likely to ever change the snapshot of FakeWashington unless there was a big backwards-incompatible change made to the configuration of the device that we wanted to ensure we tested (the only time that happened historically was the move from u1, u2, u3 to sx, x, rz).
I found that these backends still report u* gates, and parsing of string parameters is the heavy overhead. I also included U3 gates and random_gates with string parameters (this also checks the efficiency of the pulse library) in this benchmark, so I think it is sensitive enough to improvements in the parser (currently we are doing a sort of over-engineering there). I don't know why the backends still report u* gate calibrations, but they could be removed at some point, which would drastically improve the measured numbers without any actual performance change in our code. This is why I hesitate to use the fake backends in the benchmark. The test time increases with the number of qubits and instructions, but I think the scaling is linear unless the machine runs out of memory.
That's fine with me; then I'm all good with these benchmarks. (We can also expand things in the future if needed.)
Summary
This PR renews the pulse benchmarks with modern grammar. This invalidates the old performance history.
The current pulse benchmarks target pulse programs written in the form of Schedule, but this approach will shortly be discouraged. Pulse programs are usually built with the pulse builder, which outputs a ScheduleBlock. Such programs are then lowered to a Schedule at execution time, so measuring performance on Schedule alone doesn't give us a practical measure of our pulse SDK.
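For illustration, a minimal sketch of this builder workflow, assuming the current qiskit.pulse API (the channel and pulse parameters are arbitrary):

from qiskit import pulse

# Building with the pulse builder yields a ScheduleBlock, not a Schedule.
d0 = pulse.DriveChannel(0)
with pulse.build(name="x90") as block:
    pulse.play(pulse.Gaussian(duration=160, amp=0.1, sigma=40), d0)

print(type(block))  # qiskit.pulse.schedule.ScheduleBlock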
Details and comments
In the new benchmark, the following tests are added.
This benchmark measures the speed of loading calibration data from JSON, which is usually provided by a backend as command definitions. With the recent increase in qubit numbers, the loading speed of the calibration data is becoming critical, so this test is newly added to track improvements to that logic.
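A rough sketch of the pattern being measured, assuming the PulseDefaults model from qiskit.providers.models (the JSON path is a placeholder):

import json
from qiskit.providers.models import PulseDefaults

# Deserialize the command definitions; accessing instruction_schedule_map
# triggers construction of the InstructionScheduleMap, which is the
# dominant cost this benchmark tracks.
with open("defaults.json") as fd:  # placeholder path
    defaults = PulseDefaults.from_dict(json.load(fd))
instmap = defaults.instruction_schedule_map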
To prevent artifacts due to fake provider updates (especially to calibration data) in Terra, dedicated fake data is introduced in the file. The generator assumes a 2Q device, but can add an arbitrary number of random_gate instructions, each consisting of a single waveform with frame changes. PulseDefaultsBench measures loading speed while varying the number of random gates, and CircuitSchedulingBench measures circuit -> schedule conversion speed on top of the new fake data. The latter test replaces ScheduleToInstructionBench, which depended on the fake 2Q pulse backend in Terra.
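As an illustration, an asv benchmark of this shape might look like the following skeleton; the class name and gen_source appear in this PR, while the parameter values and the timing method are hypothetical:

class PulseDefaultsBench:
    # asv runs setup and each time_* method once per value in params.
    params = [10, 100, 1000]  # hypothetical numbers of random gates
    param_names = ["num_random_gate"]

    def setup(self, num_random_gate):
        # gen_source builds the fake JSON payload defined in the benchmark file.
        self.source = gen_source(num_random_gate)

    def time_building_defaults(self, num_random_gate):
        # Measure deserialization and instmap construction.
        PulseDefaults.from_dict(self.source)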
Tests inside this file are renewed. The new tests consist of EchoedCrossResonanceConstructionBench and ParameterizedScheduleBench. These tests aim at benchmarking ScheduleBlock performance rather than Schedule. The reference mechanism (a block can manage an external reference, i.e. a subroutine, as if it were managing parameters) is also tested; see the sketch below.
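A minimal sketch of that reference mechanism, assuming the current qiskit.pulse builder API (the subroutine contents are arbitrary):

from qiskit import pulse

# The main program declares a named reference instead of inlining the subroutine.
with pulse.build() as main:
    pulse.reference("x_gate", "q0")

# The subroutine is built separately and attached later, much like
# assigning a value to a parameter.
with pulse.build() as x_gate:
    pulse.play(pulse.Gaussian(duration=160, amp=0.1, sigma=40), pulse.DriveChannel(0))

main.assign_references({("x_gate", "q0"): x_gate}, inplace=True)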
ParameterizedScheduleBench assumes the situation of calibration experiments, where we scan a particular parameter of a pulse in inplace=False mode, and usually the pulse schedule is fully parameterized. Parameters are assigned to a flat schedule, a referenced schedule, and a pulse gate to cover the various situations.
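For example, the parameter-scan pattern this benchmark assumes could be sketched as follows (the amplitude parameter and scan values are illustrative):

from qiskit import pulse
from qiskit.circuit import Parameter

amp = Parameter("amp")
with pulse.build(name="rabi") as template:
    pulse.play(pulse.Gaussian(duration=160, amp=amp, sigma=40), pulse.DriveChannel(0))

# inplace=False returns a fresh ScheduleBlock per value, leaving the
# fully parameterized template untouched (the typical calibration scan).
scanned = [template.assign_parameters({amp: v}, inplace=False) for v in (0.05, 0.10, 0.15)]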
Execution of a program requires conversion from ScheduleBlock to Schedule, and this file includes such a test. A random and sufficiently complicated pulse program is prepared and digested by target_qobj_transform. Note that while this is the standard transformer function, it is not well designed and the logic itself could change in the future.
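The lowering being measured is roughly the following, using the public target_qobj_transform from qiskit.pulse.transforms (the toy program stands in for the random one generated by the benchmark):

from qiskit import pulse
from qiskit.pulse.transforms import target_qobj_transform

with pulse.build() as block:
    with pulse.align_sequential():
        pulse.play(pulse.Constant(duration=100, amp=0.1), pulse.DriveChannel(0))
        pulse.shift_phase(1.57, pulse.DriveChannel(0))

# Lower the ScheduleBlock into the flat Schedule representation used at execution time.
schedule = target_qobj_transform(block)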