Description
Information
- Qiskit Terra version:
  qiskit 0.29.0
  qiskit-aer 0.8.2
  qiskit-aqua 0.9.4
  qiskit-ibmq-provider 0.16.0
  qiskit-ignis 0.6.0
  qiskit-machine-learning 0.2.1
  qiskit-terra 0.18.1
- Python version: Python 3.7.11 :: Intel Corporation
- Operating system: CentOS Linux 7
What is the current behavior?
The memory requirements grow unexpectedly (unreasonably?) quickly with circuit depth. Only the circuit depth is varied; the qubit count stays fixed. I assume this is due to some inefficiency in the transpiler, since according to the traceback the function that runs out of memory is QuantumInstance.transpile. This is undesirable, as many use cases (e.g. the QuantumKernel class) require evaluating very large numbers of circuits.
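For a sense of scale, the reproduction script below uses m = 800 data points. The following back-of-the-envelope count is only a sketch: it assumes one circuit per pair of data points, which the report does not confirm, and shows both the symmetric case (unordered pairs of distinct points) and the full matrix.

```python
m = 800  # number of data points in the reproduction script

# Assumption (not confirmed by the report): one circuit per pair of points.
# Symmetric kernel: only unordered pairs of distinct points are needed.
pairs_symmetric = m * (m - 1) // 2
# Upper bound: every entry of the m x m kernel matrix gets its own circuit.
pairs_full = m * m

print(pairs_symmetric)  # 319600
print(pairs_full)       # 640000
```

Either way, hundreds of thousands of circuits are handed to the transpiler in a single call, so any per-circuit memory overhead that grows with depth is multiplied accordingly.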
Steps to reproduce the problem
import numpy as np
import time
import sys
from qiskit import BasicAer
from qiskit.providers.aer import AerSimulator
from qiskit.utils import QuantumInstance
from qiskit.circuit.library import ZZFeatureMap
from qiskit_machine_learning.kernels import QuantumKernel
reps = int(sys.argv[1])
n = 20
m = 800
X = np.random.uniform(0,1,n*m).reshape(m,n)
FeatureMap = ZZFeatureMap(n, reps=reps)
t1 = time.time()
quantum_instance = QuantumInstance(AerSimulator(method="statevector"))
quantum_kernel = QuantumKernel(feature_map=FeatureMap, quantum_instance=quantum_instance)
kernel_matrix = quantum_kernel.evaluate(x_vec = X)
t2 = time.time()
print(f'Finished in {t2-t1} sec')
To run:
$ for i in $(seq 2 4 12); do mprof run --include-children python test_QuantumInstance.py $i; done
Output:
mprof: Sampling memory every 0.1s
running new process
Finished in 389.05642342567444 sec
mprof: Sampling memory every 0.1s
running new process
Finished in 866.7115132808685 sec
mprof: Sampling memory every 0.1s
running new process
Traceback (most recent call last):
  File "test_QuantumInstance.py", line 27, in <module>
    kernel_matrix = quantum_kernel.evaluate(x_vec = X)
  File "/home/rshaydulin/soft/anaconda3/envs/qiskit_latest/lib/python3.7/site-packages/qiskit_machine_learning/kernels/quantum_kernel.py", line 290, in evaluate
    results = self._quantum_instance.execute(circuits)
  File "/home/rshaydulin/soft/anaconda3/envs/qiskit_latest/lib/python3.7/site-packages/qiskit/utils/quantum_instance.py", line 398, in execute
    circuits = self.transpile(circuits)
  File "/home/rshaydulin/soft/anaconda3/envs/qiskit_latest/lib/python3.7/site-packages/qiskit/utils/quantum_instance.py", line 346, in transpile
    circuits, self._backend, **self._backend_config, **self._compile_config
  File "/home/rshaydulin/soft/anaconda3/envs/qiskit_latest/lib/python3.7/site-packages/qiskit/compiler/transpiler.py", line 293, in transpile
    circuits = parallel_map(_transpile_circuit, list(zip(circuits, transpile_args)))
  File "/home/rshaydulin/soft/anaconda3/envs/qiskit_latest/lib/python3.7/site-packages/qiskit/tools/parallel.py", line 164, in parallel_map
    raise error
  File "/home/rshaydulin/soft/anaconda3/envs/qiskit_latest/lib/python3.7/site-packages/qiskit/tools/parallel.py", line 154, in parallel_map
    results = list(future)
  File "/home/rshaydulin/soft/anaconda3/envs/qiskit_latest/lib/python3.7/concurrent/futures/process.py", line 483, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/home/rshaydulin/soft/anaconda3/envs/qiskit_latest/lib/python3.7/concurrent/futures/_base.py", line 598, in result_iterator
    yield fs.pop().result()
  File "/home/rshaydulin/soft/anaconda3/envs/qiskit_latest/lib/python3.7/concurrent/futures/_base.py", line 428, in result
    return self.__get_result()
  File "/home/rshaydulin/soft/anaconda3/envs/qiskit_latest/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
In the mprof memory plot, the black line corresponds to reps=2, the blue line to reps=6 and the red line to reps=10. The reps=10 run crashed, presumably from running out of memory: I confirmed that memory was the issue by watching htop, though for some reason the end of the run right before the crash does not show up on the trace.
What is the expected behavior?
Memory requirements stay reasonable.
Suggested solutions
A potential solution could be batching of circuits. For example, if the simulator receives 10^6 circuits, they are transpiled and evaluated in chunks of 10^3, with the intermediate results saved and the processed circuits discarded after each chunk.
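The batching idea above can be sketched in plain Python. This is only an illustration, not how Qiskit implements or would implement it: `chunked`, `evaluate_in_batches` and `run_batch` are hypothetical names, and `run_batch` stands in for whatever transpiles and executes one chunk (for example, a wrapper around QuantumInstance.execute). Passing the circuits as a generator means at most one chunk is held in memory at a time.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def evaluate_in_batches(
    items: Iterable[T],
    run_batch: Callable[[List[T]], List[R]],  # hypothetical: transpile + execute one chunk
    batch_size: int = 1000,
) -> List[R]:
    """Process `items` chunk by chunk, keeping only the per-item results."""
    results: List[R] = []
    for batch in chunked(items, batch_size):
        results.extend(run_batch(batch))
        # `batch` is rebound on the next iteration, so the processed
        # chunk becomes eligible for garbage collection.
    return results


if __name__ == "__main__":
    # Stand-in workload: integers instead of circuits, squaring instead of simulation.
    fake_circuits = (i for i in range(10))
    print(evaluate_in_batches(fake_circuits, lambda b: [x * x for x in b], batch_size=4))
```

With this structure, peak memory would scale with the chunk size rather than with the total number of circuits, at the cost of one transpile and execute call per chunk.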