Skip to content

Conversation

zhaog6
Copy link
Contributor

@zhaog6 zhaog6 commented Mar 4, 2022

Reference issue

Closes gh-15149 because my forked scipy repository is removed by mistake

What does this implement/fix?

The PR reimplemented restarted GMRES algorihm, the code in the PR is more efficient than the original version. The comments and performance is shown in gh-15149.

Additional information

@zhaog6
Copy link
Contributor Author

zhaog6 commented Mar 4, 2022

Dear Andrew @lobpcg , Could you review it again? The references have been added according to your suggestions and the rest of the code remains unchanged. Thanks

@swallan swallan added enhancement A new feature or improvement scipy.sparse.linalg labels Mar 7, 2022
@pescap
Copy link

pescap commented Mar 24, 2022

Dear @zhaog6, I take this opportunity to share some results and discuss a few points.

I plan to add the weighted GMRES(m) to scipy, as this variant can lead to mesh independent convergence results for FEM/BEM simulations (see e.g. this article). The weighted version requires only a few changes to the code as compared to the classic Euclidean version.

What I did so far:

  • I was having difficulties with the Fortran core in scipy.sparse.linalg.gmres, and found out that the pyamg.krylov.gmres code was much easier to understand, and to modify. I implemented the weighted GMRES for mgs orthogonalization (I shall propose a PR soon to pyamg).
  • I came across your issue, and found it very interesting as it didn't use the Fortran core. I decided to wait for the commit to be done, in order to start implementing a weighted scipy.sparse.linalg.gmresk function.
  • Today, I wondered whether the classic pyamg.krylov.gmres (with both 'mgs' and 'householder' orthogonalizations) was more efficient that scipy.sparse.linalg.gmres, so I implemented a simple benchmark.
  • pyamg.krylov.gmres appeared to be twice as slow as scipy.sparse.linalg.gmres for my BEM simulations (performed with bempp-cl).
  • I decided to compare those results to your scipy.linalg.gmresk implementation. scipy.sparse.linalg.gmresk and scipy.linalg.gmres appear to have similar performances (see the exhaustive results above).

A few questions:

  • Which are the differences between pyamg.krylov.gmres('mgs') and scipy.sparse.linalg.gmresk? Did you use pyamg.krylov.gmres('mgs') as a reference?
  • Is your scipy.sparse.linalg.gmresk implementation supposed to replace the original scipy.sparse.linalg.gmres in scipy?

Paul

@pescap
Copy link

pescap commented Mar 24, 2022

Example:
3D EM scattering problem by a unit sphere solved by BEM on a graded mesh. CFIE formulation in strong form with BC function space + FMM.
N = 711 dofs
k = 4 wavenumber
tol = 1e-5
maxiter = 1000

For restart = maxiter, convergence is reached in 20 iterations:

Execution time (in s.):

  • 15.713715314865112 sc gmres
  • 15.42734169960022 sc gmresk
  • 30.51805353164673 pyamg gmres_householder
  • 31.24692392349243 pyamg gmres_mgs

Final residuals are:
0.00019359173984727306 sc gmres
0.00019765435974601611 sc gmresk
0.0001935917398478821 pyamg gmres_householder
0.00019359173984632332 pyamg gmres_mgs

I tried other configurations, and obtained the similar relative performance. I hope this helps!
Paul

@pescap
Copy link

pescap commented Mar 24, 2022

While performing the simulations, I detected two possible issues. I defined this callback:

class sc_gmres_counter(object):
    def __init__(self, disp=True):
        self._disp = disp
        self.niter = 0
        self.residual = []
    def __call__(self, rk=None):
        self.niter += 1
        self.residual.append(rk)
        if self._disp:
            print('iter %3i\trk = %s' % (self.niter, str(rk)))
            #print(rk.dtype)

With scipy.sparse.linalg.gmres:

ta = time.time()
sc_gmres_callback = sc_gmres_counter(disp = True)
sc_gmres_x, conv = sc_gmres(MFIE, b0, tol = tol, maxiter = maxiter, restart = restart, callback = sc_gmres_callback)
tsc_gmres = time.time() - ta
print(tsc_gmres, 'tsc_gmres')

I get:

iter   1	rk = 0.1837916301090236
iter   2	rk = 0.11138931939225478
iter   3	rk = 0.08211903375560312
iter   4	rk = 0.07004378432798104
iter   5	rk = 0.06380985251775857
iter   6	rk = 0.05715909562403769
iter   7	rk = 0.04642969181489359
iter   8	rk = 0.03214239224867582
iter   9	rk = 0.01107110975966527
iter  10	rk = 0.0029690925151010243
iter  11	rk = 0.0014425678231152971
iter  12	rk = 0.0010554942264707778
iter  13	rk = 0.0009143820403326146
iter  14	rk = 0.0006545110058378569
iter  15	rk = 0.0004261595451156276
iter  16	rk = 0.00013309019662648385
iter  17	rk = 6.889628500893324e-05
iter  18	rk = 4.214960307118879e-05
iter  19	rk = 1.6825761832743438e-05
iter  20	rk = 8.99929372085533e-06
15.713715314865112 tsc_gmres

With scipy.sparse.linalg.gmresk:

ta = time.time()
sc_gmresk_callback = sc_gmres_counter(disp = True)
sc_gmresk_x, conv = sc_gmresk(MFIE, b0, tol = tol, maxiter = maxiter, restart = restart, callback = sc_gmresk_callback)
tsc_gmresk = time.time() - ta
print(tsc_gmresk, 'tsc_gmresk')

I get:

iter   1	rk = (21.511881471209872+0j)
iter   2	rk = 3.9537038393927517
iter   3	rk = 2.396696066591928
iter   4	rk = 1.7695582180118858
iter   5	rk = 1.5127003077945789
iter   6	rk = 1.4133875853610975
iter   7	rk = 1.3805802960304863
iter   8	rk = 1.2685742747255486
iter   9	rk = 0.7814670758076535
iter  10	rk = 0.24208381988871075
iter  11	rk = 0.06396454310698772
iter  12	rk = 0.03104710829792781
iter  13	rk = 0.022771018219659676
iter  14	rk = 0.023974382488567897
iter  15	rk = 0.023093319894124793
iter  16	rk = 0.010817520444123376
iter  17	rk = 0.002906011109539687
iter  18	rk = 0.001493678744490108
iter  19	rk = 0.0009188626403153703
iter  20	rk = 0.00037376595656475045
iter  21	rk = (0.00019765435974601611+0j)
15.42734169960022 tsc_gmresk

Comments:

  • It sounds like the gmres callback is with relative residual rk/r0 while gmresk sounds to be giving me rk.
  • My simulations are in complex128. With gmresk, some residuals are float, as expected, but I get complex numbers at iterations 1 and 21 (i.e. first and last iteration).

Paul

@zhaog6
Copy link
Contributor Author

zhaog6 commented Mar 25, 2022

Dear @pescap thanks for your attention!

I plan to add the weighted GMRES(m) to scipy, as this variant can lead to mesh independent convergence results for FEM/BEM simulations (see e.g. this article).

Yes, it is interesting and meaningful.

Which are the differences between pyamg.krylov.gmres('mgs') and scipy.sparse.linalg.gmresk?

I downloaded it and take a look. There are some differences. The algorithm steps of the algorithm are almost the same because we all refer to the textbook of Yousef Saad, but some optimizations have been made for data storage way based on Python and some mathematical operations. Therefore it will be expected to bring performance improvements.

Did you use pyamg.krylov.gmres('mgs') as a reference?

No. General steps of the algorithm are basically the same, but the implementation details and optimizations are different.

Is your scipy.sparse.linalg.gmresk implementation supposed to replace the original scipy.sparse.linalg.gmres in scipy?

Yes

@zhaog6
Copy link
Contributor Author

zhaog6 commented Mar 25, 2022

Example:
3D EM scattering problem by a unit sphere solved by BEM on a graded mesh. CFIE formulation in strong form with BC function space + FMM.
N = 711 dofs
k = 4 wavenumber
tol = 1e-5
maxiter = 1000

Thanks. Increase the number of DOFs (mesh: 64x64/128x128/256x256/...), gmresk will be significantly faster than the original implementation gmres in SciPy, at least for the 2-D convection-diffusion equations (asymmetric) and Poisson equations (symmetric).

@zhaog6
Copy link
Contributor Author

zhaog6 commented Mar 25, 2022

It sounds like the gmres callback is with relative residual rk/r0 while gmresk sounds to be giving me rk.

Yes, gmresk gives absolute resdiual rk, if you want to get relative residual, you only need to change it as follows

self.residual.append(rk / np.linalg.norm(b))

@pescap
Copy link

pescap commented Mar 25, 2022

Example:
3D EM scattering problem by a unit sphere solved by BEM on a graded mesh. CFIE formulation in strong form with BC function space + FMM.
N = 711 dofs
k = 4 wavenumber
tol = 1e-5
maxiter = 1000

Thanks. Increase the number of DOFs (mesh: 64x64/128x128/256x256/...), gmresk will be significantly faster than the original implementation gmres in SciPy, at least for the 2-D convection-diffusion equations (asymmetric) and Poisson equations (symmetric).

You are right. For a 3D Helmholtz scattering problem with 24960 dofs.

iter   1	rk = 0.21239499055185496
iter   2	rk = 0.047501227412432526
iter   3	rk = 0.018352544841195494
iter   4	rk = 0.009056496233647232
iter   5	rk = 0.0035722844518443855
iter   6	rk = 0.0018383231116009518
iter   7	rk = 0.0007350773996823222
iter   8	rk = 0.0003416683629293291
iter   9	rk = 0.00015138548500454288
iter  10	rk = 7.977638945187258e-05
iter  11	rk = 3.520489360069971e-05
iter  12	rk = 1.55545350706511e-05
iter  13	rk = 8.360237315363883e-06
636.2255909442902 tsc_gmres
iter   1	rk = (128.84483327949104+0j)
iter   2	rk = (27.412184944688544+0j)
iter   3	rk = (6.4727620531216825+0j)
iter   4	rk = (2.4495065775979064+0j)
iter   5	rk = (1.3297945331107937+0j)
iter   6	rk = (0.5018230268422232+0j)
iter   7	rk = (0.22928138169014464+0j)
iter   8	rk = (0.10494432051197332+0j)
iter   9	rk = (0.05199837364367407+0j)
iter  10	rk = (0.024557429870131414+0j)
iter  11	rk = (0.010634252492458552+0j)
iter  12	rk = (0.005078965019093442+0j)
iter  13	rk = (0.0026998541348253924+0j)
iter  14	rk = (0.0013364458882280097+0j)
iter  15	rk = (0.000630652708956495+0j)
484.6333725452423 tsc_gmresk
iter   1	rk = 0.2123949905518553
iter   2	rk = 0.047501227412431936
iter   3	rk = 0.018352544841195136
iter   4	rk = 0.009056496233647054
iter   5	rk = 0.0035722844518446847
iter   6	rk = 0.001838323111601299
iter   7	rk = 0.0007350773996816898
iter   8	rk = 0.0003416683629294242
iter   9	rk = 0.00015138548500468371
iter  10	rk = 7.977638945181058e-05
iter  11	rk = 3.5204893600427436e-05
iter  12	rk = 1.5554535070677968e-05
iter  13	rk = 8.360237315876192e-06
636.6359264850616 tpyamg_gmres_householder
iter   1	rk = 0.21239499055185512
iter   2	rk = 0.04750122741243243
iter   3	rk = 0.018352544841194574
iter   4	rk = 0.009056496233647382
iter   5	rk = 0.003572284451844334
iter   6	rk = 0.001838323111601512
iter   7	rk = 0.0007350773996820497
iter   8	rk = 0.0003416683629294184
iter   9	rk = 0.00015138548500480726
iter  10	rk = 7.977638945268354e-05
iter  11	rk = 3.520489360080139e-05
iter  12	rk = 1.5554535070783054e-05
iter  13	rk = 8.360237315680992e-06
615.2137970924377 tpyamg_gmres_mgs

Final residuals:

0.0010771733830902388 sc_gmres
0.000630652708956495 sc_gmresk
0.0010771733831410476 pyamg_gmres_householder
0.0010771733831158971 pyamg_gmres_mgs

Remarks:

  • gmresk does not converge in the same number of iterations as other functions
  • gmresk is indeed faster than other functions
  • pyamg (householder) and scipy formulations are with almost the same execution time

@pescap
Copy link

pescap commented Mar 25, 2022

It sounds like the gmres callback is with relative residual rk/r0 while gmresk sounds to be giving me rk.

Yes, gmresk gives absolute resdiual rk, if you want to get relative residual, you only need to change it as follows

self.residual.append(rk / np.linalg.norm(b))

I understand. Yet, for compatibility issues, don't you think that gmres and gmresk should behave the exact same way, and be with the same default behavior?

@pescap
Copy link

pescap commented Mar 25, 2022

I plan to add the weighted GMRES(m) to scipy, as this variant can lead to mesh independent convergence results for FEM/BEM simulations (see e.g. this article).

Yes, it is interesting and meaningful.

I'll create an issue, and start working on its implementation as soon as gmresk is added to the library.

@zhaog6
Copy link
Contributor Author

zhaog6 commented Mar 26, 2022

My simulations are in complex128. With gmresk, some residuals are float, as expected, but I get complex numbers at iterations 1 and 21 (i.e. first and last iteration).

I see, I'll fix the bug.

@zhaog6
Copy link
Contributor Author

zhaog6 commented Mar 26, 2022

Yet, for compatibility issues, don't you think that gmres and gmresk should behave the exact same way, and be with the same default behavior?

In fact, I prefer callback of absolute residual norm because it can be compared with the results of many math library. If users want to get relative residual norm, which can be defined in callback function. If you think it is really inconvenient or affects compatibility, I can modify it.

@ilayn
Copy link
Member

ilayn commented May 29, 2023

Hi @zhaog6 I found this a little too late to take in. Apologies for that but now we have #18488 in case you want to add the extras on top of the existing solver.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A new feature or improvement scipy.sparse.linalg
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants