Description
What should we add?
Just putting this at the top: link to qrusty: https://github.com/chetmurthy/qrusty
When constructing sparse matrices from SparsePauliOp, the current code constructs a sparse matrix from each BasePauli (which is a string of the form 'IIIXYZIZYIII') and then combines them in a linear combination with complex coefficients. This is inefficient and space-wasting on many levels:
- the BasePauli construction is Python code, and it's not the fastest.
- for large problems, we can have tens of thousands of BasePaulis, each of which takes tens of megabytes, so adding them up entails considerable construction of intermediate data structures that are almost immediately discarded.
Concretely, I'm working with some people who have 24-qubit problems with >20k BasePaulis -- the construction problem alone takes effectively interminable time (heck, the 20-qubit problem runs longer than a weekend). Their largest problem (so far) has 92B nonzero complex entries, and another person I'm working with has 27-qubit problems.
There's another way to solve this problem, but let's review how the current solution works:
```
sumMatrix := empty
for each (basePauli, coeff) in sparsePauliOp:
    spmat := convert basePauli to sparse matrix
    sumMatrix += coeff * spmat
```
And that convert line can be written as:
```
basemat := empty
for each row in 0..2^#qubits:
    basemat = basemat <append> (row, col, value for the nonzero value in this basePauli)
```
Viewed this way, the "sumMatrix +=" (when the matrices are represented as sparse matrices) is really an insertion sort: each row of "basemat" is insertion-sorted into "sumMatrix" and if there's a nonzero value there, then it's added.
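To make the cost concrete, here is a minimal Python sketch of the current term-by-term approach. This is my own reconstruction, not Qiskit's actual code, and `pauli_to_sparse` and `sparse_pauli_sum` are hypothetical names. It relies on a standard fact: an n-qubit Pauli string has exactly one nonzero per row, at column `row XOR x_mask`, with a sign determined by the Z/Y bits and a global factor of `(-i)^#Y`.

```python
import numpy as np
from scipy.sparse import csr_matrix

def pauli_to_sparse(label, coeff=1.0):
    """Sparse matrix for an n-qubit Pauli string like 'IXYZ'.
    Qiskit convention: the rightmost character acts on qubit 0."""
    n = len(label)
    dim = 1 << n
    x_mask = z_mask = n_y = 0
    for k, ch in enumerate(reversed(label)):  # qubit k <-> bit k
        if ch in 'XY':
            x_mask |= 1 << k
        if ch in 'ZY':
            z_mask |= 1 << k
        if ch == 'Y':
            n_y += 1
    rows = np.arange(dim, dtype=np.int64)
    cols = rows ^ x_mask                      # one nonzero per row
    # parity of (row AND z_mask) gives the sign; (-i)^#Y is a global factor
    signs = np.array([(-1) ** bin(r & z_mask).count('1') for r in rows])
    vals = coeff * ((-1j) ** n_y) * signs
    return csr_matrix((vals, (rows, cols)), shape=(dim, dim))

def sparse_pauli_sum(terms):
    """Current approach: build each term's full matrix, then accumulate.
    Every '+=' merges both operands' nonzeros -- the insertion-sort cost."""
    acc = None
    for label, coeff in terms:
        m = pauli_to_sparse(label, coeff)
        acc = m if acc is None else acc + m
    return acc
```

Each intermediate `pauli_to_sparse` result here has 2^n nonzeros and is discarded right after one merge, which is exactly the waste described above.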
But we can swap the order of the iterations, yielding:
```
sumMatrix := empty
for each row in 0..2^#qubits:
    baseRow := empty
    for each (basePauli, coeff) in sparsePauliOp:
        baseRow = baseRow <append> (col, value for nonzero value in this basePauli row)
    sort baseRow by col#, adding values for columns with more than one entry
    add baseRow to sumMatrix
```
This code has a natural parallelized implementation by taking the "for each row" loop and running it in parallel.
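Here is a sketch of the swapped loop in Python — a hypothetical rendering rather than qrusty's actual Rust, with made-up names `masks` and `rowwise_sum`. The point is that the three CSR arrays (`data`, `indices`, `indptr`) are emitted directly in final row order, so no intermediate per-term matrices ever exist, and each row's work is independent of every other row's, which is what makes the outer loop trivially parallelizable.

```python
import numpy as np
from scipy.sparse import csr_matrix

def masks(label):
    """Symplectic masks for one Pauli string (rightmost char = qubit 0)."""
    x = z = n_y = 0
    for k, ch in enumerate(reversed(label)):
        if ch in 'XY':
            x |= 1 << k
        if ch in 'ZY':
            z |= 1 << k
        if ch == 'Y':
            n_y += 1
    return x, z, n_y

def rowwise_sum(terms, n_qubits):
    """Row-major construction: one pass per row over all terms."""
    pre = [(masks(label), coeff) for label, coeff in terms]
    dim = 1 << n_qubits
    data, indices, indptr = [], [], [0]
    for r in range(dim):
        row = {}  # col -> accumulated value; replaces the sort-and-add step
        for (x, z, n_y), coeff in pre:
            c = r ^ x
            v = coeff * ((-1j) ** n_y) * ((-1) ** bin(r & z).count('1'))
            row[c] = row.get(c, 0) + v
        for c in sorted(row):
            indices.append(c)
            data.append(row[c])
        indptr.append(len(indices))
    return csr_matrix((np.array(data), np.array(indices), np.array(indptr)),
                      shape=(dim, dim))
```

In this Python form the outer `for r in range(dim)` loop is still serial; the Rust version linked below can run it across threads because the per-row state is completely private.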
I have code that can do this in this repo: https://github.com/chetmurthy/qrusty
and specifically this function: https://github.com/chetmurthy/qrusty/blob/ea11318ff4c92d33f2141841b6e28eb4cd1df8b3/qrusty/src/accel.rs#L267
which constructs the three vectors required to build a CSR matrix.
I've built code around this, with classes that match "BasePauli" and "SparsePauliOp", so that this Rust code can be used from Python outside of Qiskit, along with a Python wrapper of the sprs sparse-matrix type. The wrapper exists because the people I'm working with also find that SciPy is not parallelized for many operations (e.g. matrix-vector multiply, "ax+by", evaluating "cMx + dMx" for complex "c", vector "x") that need to be parallelized once the matrix dimension reaches 2^20 and above.
There are several ways one might import the basic function of "fast construction of matrices from SparsePauliOp" into Qiskit:
1. just pull in this function (link above) and write enough wrapper to use it from Qiskit
2. refer to Qrusty's implementation (== "construct the qrusty SparsePauliOp, then call the function to construct a SpMat, then convert to a Scipy CSR matrix") but leave Qrusty as-is, so that people who need to do more parallelized matrix operations can do so with Qrusty
3. integrate Qrusty into Qiskit, slowly making qrusty's versions of BasePauli and SparsePauliOp into the implementation of Qiskit's versions of those objects.
It seems clear that #3 is a lot of work, maybe even undesirable. But #1 has the problem that anybody who actually wants to work with a high number of qubits can't do much with the matrix once it's built, without converting it back into a form where Rust parallelism can be applied to it — because the matrix is so bloody big.
My belief is that, for this reason, #3 is the way to go.